
Re: If Linux Is About Choice, Why Then ...




On quintidi 25 Germinal, year CCXXV, tomas@xxxxxxxxxx wrote:
> You keep repeating this misconception. "Could be" "nobody would". By your
> logic, Apache and PostgreSQL (among many following this model) wouldn't
> work. They do. Pretty reliably, at that.

I am sorry, but you are mistaken here, possibly because you have only a
vague idea of what a "monitoring system" is really about.

You see, when people talk about "monitoring systems", they are not after
"pretty" reliable, they are after PERFECTLY reliable. They want
reliability even against million-to-one coincidences.

(With the default kernel configuration, where pid_max is 32768, "being
killed due to a stale PID file" is roughly a 1-in-32768 coincidence, far
more likely than million-to-one, except in Discworld logic.)
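
To make the stale PID file hazard concrete, here is a minimal sketch in
Python, assuming a POSIX system. The helper names (pid_from_file,
pid_is_alive, stale_pid_file_demo) are made up for illustration and do
not come from any init system. It shows that probing a PID with signal 0
only says that *some* process has that number, not that it is the daemon
that wrote the file:

```python
import os
import tempfile


def pid_from_file(path):
    """Return the integer PID stored in a PID file."""
    with open(path) as f:
        return int(f.read().strip())


def pid_is_alive(pid):
    """True if some process currently has this PID.

    Signal 0 delivers nothing; it only checks that the PID exists.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the PID exists but belongs to another user
    return True


def stale_pid_file_demo():
    # Hypothetical scenario: a daemon wrote this file, died without
    # removing it, and its PID was later reused. We simulate the reuse
    # by writing our own PID into the file.
    with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as f:
        f.write(f"{os.getpid()}\n")
        path = f.name
    try:
        pid = pid_from_file(path)
        return pid, pid_is_alive(pid)
    finally:
        os.unlink(path)
```

A stop script that trusts this check would send its signal to whatever
process now owns the number, which is exactly the coincidence described
above.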

Since perfect is not possible, they settle for as close as possible.
And SysV init is very far from achieving the optimum.

Look at the process hierarchy of your SysV-init-based system: Apache and
PostgreSQL are direct children of PID 1, but PID 1 does not know about
them. If they exit, PID 1 will reap them, but nothing more. Many things
can cause that: the OOM killer, a bug in the program, hardware problems,
a stale PID file, an admin mistake, etc. Some of them will leave more or
less discreet traces in the logs, but not all of them. And
you may find these reasons unlikely, but when someone interested in
"monitoring systems" hears "unlikely", they understand "possible".
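
By contrast, a supervisor that is the direct parent of the daemon learns
immediately that it exited, and how. The following is only a sketch of
that idea in Python, assuming a POSIX system; run_and_report is a
hypothetical name and this is not how any particular init system is
implemented:

```python
import signal
import subprocess


def run_and_report(argv):
    """Start argv as a child process, wait for it, and describe its end.

    On POSIX, Popen.wait() returns a negative number -N when the child
    was terminated by signal N.
    """
    child = subprocess.Popen(argv)
    status = child.wait()
    if status < 0:
        return f"killed by signal {signal.Signals(-status).name}"
    return f"exited with status {status}"
```

A real supervisor would log that result and decide whether to restart
the service; the point is only that the parent gets the information
without relying on a PID file.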

And I can say that it happened to me: I have, not often but not just
once either, found that Apache or another daemon was not running, and
could not find the reason easily.

If you are still not convinced, look at the other serious monitoring
systems: all of them have at least a provision to run as PID 1.

Regards,

-- 
  Nicolas George
