Difference between revisions of "SME Server:Documentation:Developers Manual:Chapter9"
Latest revision as of 14:25, 5 February 2010
Process startup, supervision and shutdown
Process startup
In typical Linux systems, services (processes) are started at boot time through a mechanism such as System V init. When the system administrator needs to change the settings, they modify the configuration files and then restart the service or notify the process that it needs to re-read the configuration.
It is usually assumed that processes which have been started will continue to run, and only require intervention during configuration changes. There are a number of problems with this model, which are addressed by the SME Server:
- Processes do occasionally fail through software errors, memory exhaustion and accidental finger poking by the system administrator.
- Some startup scripts and processes do not gracefully handle server crashes, such as power outages. The startup scripts and processes often use process identifier (PID) files to determine whether the process is running. Reliable handling of PID files is impossible to achieve under all failure cases.
- Many processes do not deal properly with rapid invocation of stop and start requests. This is often, but not always, due to "PID file race" conditions.
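To make the PID file problem concrete, here is a minimal sketch of the kind of check a startup script typically performs. The function name pidfile_state and any file paths passed to it are hypothetical and are not taken from an SME Server script.

```shell
#!/bin/sh
# Report the state suggested by a PID file: "missing", "running" or "stale".
# pidfile_state is a hypothetical helper written for this illustration.
pidfile_state() {
    pidfile=$1
    [ -f "$pidfile" ] || { echo missing; return; }
    pid=$(cat "$pidfile")
    # kill -0 delivers no signal; it only tests whether the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        # Some process has this PID, but not necessarily the one that wrote the file.
        echo running
    else
        echo stale
    fi
}
```

If the server crashed and the recorded PID has since been reused by an unrelated process, this check still reports "running". The file alone cannot distinguish the two cases, which is why a supervisor that holds a direct connection to its child process is more reliable.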
Process supervision: runit (and supervise)
The SME Server addresses these issues by running processes under the runit process supervision environment, which:
- runs each process under control of its own supervisor process
- imposes process limits
- restarts the process if it fails
- provides a consistent mechanism for controlling the underlying process
The runit process tree
When a Linux system boots, it starts the init process, which then starts all other processes. When init enters "run-level 7", it starts /etc/runit/2 from an entry in /etc/inittab.
/etc/runit/2 starts the runsvdir master supervision process, which scans the /service/ directory for work to do. If the runsvdir command happened to fail, it would be restarted by init.
The runsvdir command looks for subdirectories under the /service/ directory, and starts one runsv process to manage each of them. If any of the runsv processes fail, they will be restarted by runsvdir.
Each runsv process looks for a run script under the directory it is managing. runsv runs the run script and keeps a connection to the process started by that script. If the process dies, it is restarted.
If the directory also has a log subdirectory, runsv runs the run script in that subdirectory and connects the output of the main program to the input of the "logger" process.
This produces a process tree which looks something like this:
[root@gsxdev1 events]# pstree 1
init-+-acpid
     |-md1_raid1
     |-md2_raid1
     | ...
     |-runsvdir-+-runsv-+-multilog
     |          |       `-ulogd
     |          |-6*[runsv---multilog]
     |          |-runsv-+-multilog
     |          |       `-ntpd
     |          |-runsv-+-multilog
     |          |       `-tinydns
     |          |-runsv-+-cvm-unix
     |          |       `-multilog
     |          |-runsv-+-multilog
     |          |       `-mysqld
     |          |-5*[runsv-+-multilog]
     |          |          `-tcpsvd]
     |          |-runsv-+-multilog
     |          |       `-oidentd
     |          |-runsv-+-multilog
     |          |       `-smtp-auth-proxy
     |          |-runsv-+-multilog
     |          |       `-smbd---smbd
     |          |-runsv---httpd---10*[httpd]
This looks like a complex process tree, but is a critical part of the SME Server's design for reliability. Each process is independent, has a consistent management interface, has process limits imposed on it, and will restart if it happens to fail.
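The restart behaviour can be pictured with a small shell loop. This is only an illustration of the idea, not how runsv is implemented: the function name supervise_sketch and its max_runs limit are assumptions made so that the example terminates, whereas a real supervisor keeps restarting the service indefinitely and pauses briefly between restarts.

```shell
#!/bin/sh
# Run a command in the foreground and start it again each time it exits.
# supervise_sketch and max_runs are hypothetical; runsv has no such run limit.
supervise_sketch() {
    max_runs=$1
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        runs=$((runs + 1))
        # The supervisor is the direct parent, so it sees the exit immediately
        # and needs no PID file to know that the service has stopped.
        "$@"
        echo "run $runs exited with status $?"
    done
}
```

For example, supervise_sketch 3 false starts the false command three times and reports its exit status after each run.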
For further documentation on runit, refer to the runit manual page.
Run-level 7 and the e-smith-service wrapper
The SME Server runs in the normally unused run-level 7. This ensures that the only software running on the SME Server is software that we have chosen to run, and it is started and stopped in a consistent way. If we need to replace a standard startup script with one which runs the process under supervise, we can do so without modifying the original package.
In order to run a process under run-level 7, all you need to do is provide a link in the /etc/rc.d/rc7.d/ directory to your startup script. However, in most cases your process should only start if it is enabled in the configuration database.
If you look at the /etc/rc.d/rc7.d/ directory, you will see that it contains a large number of links to the /etc/rc.d/init.d/e-smith-service script.
S00microcode_ctl -> /etc/rc.d/init.d/e-smith-service
S05syslog -> /etc/rc.d/init.d/e-smith-service
S06cpuspeed -> /etc/rc.d/init.d/e-smith-service
S15nut -> ../init.d/e-smith-service
S15raidmonitor -> /etc/rc.d/init.d/e-smith-service
S26apmd -> /etc/rc.d/init.d/e-smith-service
S35bootstrap-console -> /etc/rc.d/init.d/e-smith-service
[...]
This script is key to ensuring that services start when they are enabled and do not start when they are disabled, as it:
- Checks the name of the link, e.g. S05syslog
- Removes the S05 prefix, leaving syslog
- Checks to see whether syslog is defined in the configuration database, and whether it has its status set to enabled.
- If so, it runs the /etc/init.d/syslog script with the argument start.
- If the service is not enabled, it exits without starting the service.
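The decision made by the wrapper can be sketched in a few lines of shell. This is an illustration of the steps listed above and not the actual e-smith-service script. The function names service_from_link and decide_action are hypothetical, and the status value is passed in as an argument rather than read from the configuration database.

```shell
#!/bin/sh
# Derive the service name from an rc link name such as S05syslog.
# service_from_link is a hypothetical helper for this sketch.
service_from_link() {
    # ${1##*/} drops any leading directory; sed removes the S or K prefix
    # and the two-digit sequence number.
    printf '%s\n' "${1##*/}" | sed 's/^[SK][0-9][0-9]//'
}

# Print the action the wrapper would take for a link, given the status
# property that the caller has looked up for that service.
decide_action() {
    service=$(service_from_link "$1")
    if [ "$2" = "enabled" ]; then
        printf 'start %s\n' "$service"
    else
        # Disabled or undefined services are left alone.
        printf 'skip %s\n' "$service"
    fi
}
```

With these definitions, decide_action S05syslog enabled prints "start syslog", while decide_action S15nut disabled prints "skip nut".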
Adding a supervised service
See http://cr.yp.to/daemontools.html
Check your application has a -d option or similar which means that it stays in the foreground, and logs to standard output rather than syslog. That makes it suitable for running as a supervised service.
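Create a /var/service/XXX directory, containing an executable 'run' script something like:
#!/bin/sh
# Send standard error to the same place as standard output
exec 2>&1
# Replace /path/to/XXX with the full path of your program (not the service directory)
exec /path/to/XXX -d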
and a /var/service/XXX/log directory, containing an executable 'run' script something like:
#!/bin/sh
exec setuidgid smelog \
    /usr/local/bin/multilog t s500000 \
    /var/log/XXX
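You would then do:
mkdir /var/log/XXX
chown smelog:smelog /var/log/XXX
touch /var/service/XXX/down
ln -s /var/service/XXX /service
The down file is created before the link so that runsvdir does not start the service as soon as it notices the new directory.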
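The steps in this section can be collected into one script. The following is a minimal sketch under stated assumptions: the function name stage_service, its root argument (a staging directory, so the layout can be inspected before touching the live system) and the program path passed to it are all hypothetical. The chown step is attempted only when running as root on a system that has the smelog user.

```shell
#!/bin/sh
# Lay out a supervised service under a root directory.
# Usage: stage_service ROOT NAME PROGRAM   (an empty ROOT targets the live system)
# stage_service is a hypothetical helper written for this illustration.
stage_service() {
    root=$1 name=$2 prog=$3
    svc="$root/var/service/$name"
    mkdir -p "$svc/log" "$root/var/log/$name" "$root/service" || return 1

    # Main run script: merge stderr into stdout and stay in the foreground.
    cat > "$svc/run" <<EOF
#!/bin/sh
exec 2>&1
exec $prog -d
EOF

    # Logger run script: multilog writes timestamped logs as the smelog user.
    cat > "$svc/log/run" <<EOF
#!/bin/sh
exec setuidgid smelog \\
    /usr/local/bin/multilog t s500000 \\
    /var/log/$name
EOF
    chmod 755 "$svc/run" "$svc/log/run"

    # Only change ownership where that is possible.
    if [ "$(id -u)" -eq 0 ] && id smelog >/dev/null 2>&1; then
        chown smelog:smelog "$root/var/log/$name"
    fi

    # Mark the service down first, then make it visible to runsvdir.
    touch "$svc/down"
    ln -s "$svc" "$root/service/$name"
}
```

Running stage_service /tmp/stage XXX /path/to/XXX would build the whole tree under /tmp/stage, where the generated run scripts can be reviewed before the same steps are repeated on the live system.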