
Re: on linux virtual memory manager (was: Re: disable akonadi)




René J.V. Bertin - 03.09.18, 13:17:
> On Monday September 03 2018 11:29:27 Martin Steigerwald wrote:
> >memory and swap pressure. Until the kernel started to kick out
> >processes
> >with SIGKILL:
> Ah, the OOM killer. I had a whole exchange about this kind of memory
> management a couple of years back. I don't remember the details but
> had a good reason to turn off the feature and just let *alloc calls
> fail. Until I came across an app or two (possibly KWin) that don't
> handle allocation failures. And that's the worse thing about this
> kind of memory management IMHO: people start relying on it and stop
> accounting for the possibility that allocations might fail.
> 
> So you're saying the OOM killer didn't first kill the application that
> made impossible request? How is that not wrong?

How would it know which application this would be when multiple processes 
allocate a chunk of memory each? If you have a bank and all people want 
their money back at the same time, how would you decide who will not get 
their money back?

Developers changed the OOM killer fundamentally in Linux kernel 2.6.36. 
Before that it tried to guess, but in a totally broken way. For example 
it looked at virtual memory size. Now it does not guess. It's RSS + swap 
size, root processes receive a 3% bonus, and that is it for the OOM score 
calculation. Oh, and you can adjust the score for processes manually.
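
The scoring described above can be sketched roughly as follows. This is a 
simplified illustration only, not the kernel's code: the function name and 
parameters are made up here, the real calculation has more terms (page 
tables, for instance), and details differ between kernel versions. The 
manual adjustment is assumed to be the value in /proc/<pid>/oom_score_adj.

```python
def oom_score_sketch(rss_kib, swap_kib, total_kib, is_root=False, oom_score_adj=0):
    """Simplified badness score on a 0..1000 scale (illustrative only).

    total_kib is assumed to be physical RAM plus swap.
    """
    # Share of RAM + swap the process occupies, in thousandths.
    points = (rss_kib + swap_kib) * 1000 // total_kib
    # Root-owned processes get a 3% bonus, i.e. 30 points off.
    if is_root:
        points -= 30
    # Manual adjustment, in the range -1000..1000.
    points += oom_score_adj
    # Clamp to the scale used here.
    return max(0, min(1000, points))


# A process using 2,000,000 KiB on a machine with 16,000,000 KiB RAM + swap:
print(oom_score_sketch(2_000_000, 0, 16_000_000))                # 125
print(oom_score_sketch(2_000_000, 0, 16_000_000, is_root=True))  # 95
```

The process with the highest score is the one that gets killed first, so 
a large negative adjustment effectively protects a process.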

The first time I learned about this behavior of the Linux kernel I 
thought: WTF? AFAIR the Solaris kernel does not do that. I am not sure 
what BSD kernels like the one of FreeBSD or DragonFly BSD do.

They all have virtual memory managers. They all have similar issues to 
deal with: applications allocate (way) more virtual address space than 
they use later on.

If you disable OOM, it can happen that you cannot start applications 
although you have way more physical memory and swap space free than what 
would be required to run them. atop shows this nicely in the "SWP" line:

| vmcom  12.5G | vmlim  27.7G |

currently on this ThinkPad T520.

The first is what the kernel has promised to applications, the second is 
the configured limit: by default half of the physical RAM plus all of the 
swap space.
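
The same two figures can be read without atop. A minimal sketch, assuming 
that vmcom and vmlim correspond to the Committed_AS and CommitLimit fields 
of /proc/meminfo; the function name and the sample values are made up for 
illustration and are not taken from the machine above.

```python
# Illustrative excerpt in /proc/meminfo format (made-up values).
SAMPLE = """\
MemTotal:       16303428 kB
SwapTotal:      20971516 kB
CommitLimit:    29045556 kB
Committed_AS:   13107200 kB
"""


def commit_figures(meminfo_text):
    """Return (Committed_AS, CommitLimit) in KiB from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            # The first token after the colon is the numeric value.
            fields[name.strip()] = int(parts[0])
    return fields["Committed_AS"], fields["CommitLimit"]


committed, limit = commit_figures(SAMPLE)
print(f"committed {committed / 1024**2:.1f} GiB of {limit / 1024**2:.1f} GiB")
```

On a real system, pass the contents of /proc/meminfo instead of SAMPLE.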

What you can do to allow more memory allocations with the same amount of 
physical memory while still disabling OOM is:

1) Increase swap size.

2) Increase /proc/sys/vm/overcommit_ratio to maybe about 80 or 90 so 
that the kernel allows allocating 80 or 90% of the physical memory even 
with overcommitting disabled completely.
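
How both options move the limit can be sketched with a little arithmetic. 
This assumes the limit is simply swap plus overcommit_ratio percent of 
RAM, as described above, and ignores refinements such as huge pages. The 
function name and the 8 GiB swap size are hypothetical.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50):
    """Approximate commit limit with overcommit disabled (simplified)."""
    return swap_kib + ram_kib * overcommit_ratio // 100


GIB = 1024 * 1024  # KiB per GiB
ram = 8 * GIB      # 8 GiB of RAM
swap = 8 * GIB     # hypothetical swap size

for ratio in (50, 80, 90):
    limit = commit_limit_kib(ram, swap, ratio)
    print(f"ratio {ratio}: {limit / GIB:.1f} GiB")
```

Doubling the swap in this example raises the limit by 8 GiB, while going 
from a ratio of 50 to 90 raises it by 3.2 GiB.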

Even with that, back when the ThinkPad T520 still had 8 GiB of RAM, it 
happened that in a second Plasma session I could not start another 
Firefox anymore, although the machine still had more than enough free 
physical memory and swap space.

Just as an example how crazy this is:

% ps --sort -vsz -axo pid,cmd,pmem,rss,vsz | head
  PID CMD                         %MEM   RSS    VSZ
 2717 /usr/lib/x86_64-linux-gnu/l  0.9 161632 268808920

^^^^ this is QtWebEngine: /usr/lib/x86_64-linux-gnu/qt5/libexec/QtWebEngineProcess

26677 /usr/bin/baloo_file          0.0  9172 268803636
 2337 /usr/bin/baloo_file         10.7 1742544 268754296
 2891 /usr/bin/kmail -qwindowtitl  3.4 564420 6300680
14405 /usr/bin/amarok              0.8 130332 4659040
 2343 /usr/bin/plasmashell         1.3 212184 4323664
 2469 /usr/bin/akregator -session  1.5 256184 3624412
 2335 /usr/bin/kwin_x11 -session   0.3 58536 3077096
19944 /usr/lib/firefox/firefox     3.8 618412 2780652

Do you see baloo_file? It allocated 268754296 KiB of virtual address 
space, that is about 256 GiB, but "just" 1742544 KiB of physical memory 
(some of that shared with other processes!), that is about 1701 MiB. 
Still a lot if you ask me. I am not sure why it allocated that much 
physical memory on this machine.
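
For reference, the conversion of the two baloo_file figures from the ps 
output (ps reports both RSS and VSZ in KiB); the variable names are just 
for this example:

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 * 1024

vsz_kib = 268754296  # VSZ column of baloo_file (PID 2337)
rss_kib = 1742544    # RSS column of the same process

vsz_gib = vsz_kib / KIB_PER_GIB
rss_mib = rss_kib / KIB_PER_MIB
ratio = vsz_kib / rss_kib  # how much more is mapped than resident

print(f"VSZ {vsz_gib:.1f} GiB, RSS {rss_mib:.1f} MiB")  # VSZ 256.3 GiB, RSS 1701.7 MiB
print(f"mapped/resident ratio: {ratio:.0f}x")
```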

I have no idea how it was able to do that, as with stress I was not 
able to allocate 30 GiB of virtual address space in one go, but it 
appears that 256 GiB of those are even one contiguous mapping, 
according to:

% pmap -x 2337
2337:   /usr/bin/baloo_file
Address           Kbytes     RSS   Dirty Mode  Mapping
[…]
00007f5eb0000000 268435456 1643580       0 r--s- index
[…]

I bet it may have allocated these in steps, but anyway, I never really 
understood the default heuristic of the Linux kernel regarding 
overcommit. And baloo is not the only one doing such crazy things. 
QtWebEngine did too. I have also seen the Java virtual machine / Java 
applications do that.
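
One plausible reading of the "r--s-" mode in the pmap line: the 256 GiB 
are a read-only, shared mapping of a file called "index", not an 
anonymous allocation like the ones stress makes. Such a mapping is backed 
by the file itself rather than by RAM + swap, and the kernel's overcommit 
accounting does not charge for it. That baloo does exactly this is an 
assumption; the sketch below only demonstrates the mechanism with a 
sparse temporary file, and the function name is made up.

```python
import mmap
import tempfile


def map_sparse_file(size_bytes):
    """Map a sparse file read-only and touch only its first and last byte."""
    with tempfile.TemporaryFile() as f:
        # Extend the file without writing data; on most filesystems this
        # creates a sparse file that occupies almost no disk space.
        f.truncate(size_bytes)
        m = mmap.mmap(f.fileno(), size_bytes, access=mmap.ACCESS_READ)
        try:
            # Only the pages actually touched become resident.
            first, last = m[0], m[size_bytes - 1]
            return len(m), first, last
        finally:
            m.close()


mapped, first, last = map_sparse_file(64 * 1024 * 1024)
print(f"mapped {mapped // (1024 * 1024)} MiB, bytes read: {first}, {last}")
```

While such a mapping exists, it counts fully towards VSZ, but only the 
touched pages show up in RSS.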

Enough of that. Just in case you would like to get rid of the baloo file 
indexer, you may want to enable strict overcommit. :)
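
To check which policy a machine currently uses, the value of 
/proc/sys/vm/overcommit_memory can be decoded like this. A small sketch; 
the function names and the description strings are made up for this 
example, the three values are the ones the kernel defines.

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Translate the content of overcommit_memory into a description."""
    return OVERCOMMIT_MODES.get(int(raw.strip()), "unknown")


def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read and describe the current policy (Linux only)."""
    with open(path) as f:
        return describe_overcommit(f.read())


print(describe_overcommit("2\n"))  # strict accounting, no overcommit
```

Switching to strict mode means writing 2 to that file as root, for 
example with sysctl vm.overcommit_memory=2.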

Ciao,
-- 
Martin