Booting with runit

2005-03-19 23:59

I wrote, some time ago, an article explaining what the usual Linux boot process, usually called SYSVinit, does. Surprisingly, that article is one of the most popular on my site. I suppose that's because it's a common technology, and the existing docs had some issues, like being too technical.

In any case, that page serves a purpose, and that's good. I hope this new article serves another purpose: showing that there can be better/faster/nicer/more manageable ways to boot a Linux-like system.

To do that, I will show a tool called runit and the concepts behind it.

The tool

Runit is a rewrite of D.J. Bernstein's daemontools, and the basic concept is service supervision.

A supervised service has a small process (the supervisor), which knows how to start the service, how to clean up if it goes down, and whether the admin wants the service up or down.

This has some obvious advantages. If the application happens to crash, the supervisor will restart it. The supervisor is small enough that the odds of it crashing are very small, and if it happened, there are ways to manage that, too.

The interface to the services is more uniform, too. There is a standard mechanism to see the process status, get its PID and do many other things that, in the SYSV world, require handling at the level of the service's script. This makes the task of writing good service scripts much easier. Consider the following two scripts for the GPM service, first for runit and then for SYSVinit. They are functionally equivalent.

#!/bin/dash
# GPM in runit

exec 2>&1

. /etc/sysconfig/mouse
exec gpm -D -m $DEVICE -t $MOUSETYPE >/dev/null

#!/bin/bash
# GPM in SYSVinit, copied from Fedora, simplified to make it functionally
# equivalent to the runit version.

# source function library
. /etc/init.d/functions

RETVAL=0
start() {
    daemon gpm -m $DEVICE -t $MOUSETYPE $OPTIONS
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/gpm
}

stop() {
    echo -n "Shutting down console mouse services: "
    killproc gpm
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/gpm
}
case "$1" in
start)
        start
        ;;
stop)
        stop
        ;;
restart|reload)
        stop
        start
        ;;
condrestart)
        if [ -f /var/lock/subsys/gpm ]; then
            stop
            start
        fi
        ;;
status)
        status gpm
        RETVAL=$?
        ;;
*)
        echo $"Usage: $0 {start|stop|restart|condrestart|status}"
        exit 1
esac

Keep in mind that in the SYSVinit version, killproc and daemon are actually shell functions, defined in /etc/init.d/functions, and they are roughly 70 lines each. Oh, and if gpm goes down, on Fedora it stays down.

So, I think we can agree runit can make some things simpler. Let's see how it works.

Anatomy of a service

A service in runit is not a script, but a directory containing some specific files in it.

run

In the most basic setup, all you need is a run script. It's a regular shell script, nothing strange about it at all. I normally use dash as the shell for these things, because it's smaller than bash, and so a little bit faster.

This script has to do two simple things:

Perform any initialization the service requires (like sourcing config files for variables, setting up paths, whatever).
Start the service.

The service should not go into daemon mode; it should run "in the foreground". If you want to keep logs for this service, it is best if you can make it simply log to standard output or standard error.

It's easy to check which options make a service behave like this: start the service in an interactive shell. It should not give you back your prompt, and it should print its logs on the screen.

The exec 2>&1 in the GPM example simply redirects the standard error stream to standard output, so it unifies both.

So, create that script, make it executable, and put it into a directory.
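
As a minimal sketch of those steps, the following builds a service directory with an executable run script. The location under a temporary directory, the name myservice, and the sleep command standing in for a real daemon are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: build a throwaway service directory (hypothetical name and location).
svdir="${TMPDIR:-/tmp}/runit-demo/myservice"
mkdir -p "$svdir"

# The run script: merge stderr into stdout, then exec the daemon in the foreground.
cat > "$svdir/run" <<'EOF'
#!/bin/sh
exec 2>&1
# 'sleep' is only a stand-in for a real foreground daemon.
exec sleep 3600
EOF

# Make it executable, or the supervisor cannot start it.
chmod 755 "$svdir/run"
```

Using exec in the last line of run matters: the daemon replaces the shell, so the supervisor watches the daemon itself and not a wrapper.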

finish

If there is a specific cleanup that is required whenever the service exits, create a finish script in the same directory. For example, suppose you want to send email to root whenever the service exits, because it's super-important.

#!/bin/dash

# Finish script for important service!

echo `date` 'The important service crashed!' | mail -s 'crash!' admin@bigcorp.com
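
Recent versions of runsv also pass two arguments to finish: the exit code of run (or -1 if it did not exit normally) and the low byte of its wait status. Assuming a runit version that does this, a finish script can tell a clean exit from a crash. The describe_exit helper below is a hypothetical name, and the messages are only examples.

```shell
#!/bin/sh
# Sketch of a finish script body; describe_exit is a hypothetical helper.
# $1: exit code of ./run, or -1 if it did not exit normally
# $2: low byte of the wait status (normally the signal number when $1 is -1)
describe_exit() {
    code=${1:-unknown}
    status=${2:-unknown}
    if [ "$code" = "-1" ]; then
        echo "service was killed by signal $status"
    elif [ "$code" = "0" ]; then
        echo "service exited cleanly"
    else
        echo "service exited with code $code"
    fi
}

# In a real finish script you would call it with the script's own arguments:
#   describe_exit "$@" | mail -s 'crash!' admin@bigcorp.com
```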

log

Logs are a very important thing for an admin. They are your window into the system. Runit includes an excellent mechanism for logging, which is much simpler than syslog. If there is a log subdirectory in your service, runit will treat it as a service, too, and will feed the standard output of your service to the log's run script. That should be a script that logs its input somewhere.

So, it looks like this:

service/run ==> service/log/run ==> logfile

Here's my universal log script:

#!/bin/dash

svname=servicename

if [ ! -d "/var/log/$svname" ]
then
        mkdir -p "/var/log/$svname"
        # user:group is the portable spelling; user.group is deprecated
        chown root:root "/var/log/$svname"
        chmod 700 "/var/log/$svname"
fi

exec /sbin/svlogd -tt /var/log/$svname

Replace servicename with whatever you want, save that as service/log/run, and that's it. (BTW, if someone knows how to get the name of the parent folder in shell scripts, we can make that really universal ;-)
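
One possible answer to that question: runsv starts log/run with the log directory as its working directory, so the service name is the name of the parent directory. The service_name helper below is a hypothetical name; this is only a sketch, and it assumes the service directory is named after the service.

```shell
#!/bin/sh
# Sketch: derive the service name from the log directory's parent.
# service_name is a hypothetical helper, not part of runit.
service_name() {
    # $1: path of the log directory; defaults to the current directory
    logdir=${1:-$(pwd)}
    basename "$(dirname "$logdir")"
}

# In service/log/run this would replace the hard-coded name:
#   svname=$(service_name)
service_name /var/service/dropbear/log   # prints "dropbear"
```

If the service directory is a symlink to a directory with a different name, the shell may report the physical path, so the result can differ from the link's name.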

Of course, the important command in the script is svlogd. It's part of runit. It's a tool that logs whatever it gets from standard input. The -tt option makes it timestamp in a human-readable format.

Svlogd also manages log rotation automatically, which makes tools like logrotate unnecessary. For that, read the docs, but here's what I do. I add a config file to the log directory (the one svlogd writes to, /var/log/$svname in the script above), like this:

# Log a MB per-file, keep last ten files

s1000000
n10

Notice that svlogd rotates logs by size, not by date. That's a matter of argument in sysadmin circles. You can also do stuff like bzip2-compress your old logs, using the processor feature, if you want.
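
As a sketch of the processor feature: a config line starting with ! names a program that svlogd feeds each rotated file through, reading the old log on standard input and writing the processed version to standard output. The directory below is a throwaway stand-in for the real log directory, and bzip2 is simply the compressor mentioned above.

```shell
#!/bin/sh
# Sketch: write an svlogd config with rotation limits and a compressor.
# The directory below is a throwaway stand-in for the real log directory.
logdir="${TMPDIR:-/tmp}/runit-demo/log-example"
mkdir -p "$logdir"

cat > "$logdir/config" <<'EOF'
# Rotate at about 1 MB, keep the last ten files
s1000000
n10
# Feed every rotated file through bzip2 (stdin to stdout)
!bzip2
EOF

cat "$logdir/config"
```

Svlogd reads the file on startup and again when it receives a HUP signal, so a changed config does not require restarting the service being logged.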

You could, if you want, use other same-style logging tools, like flog, multilog or many others. You could even use the regular syslog.

Service control

When you have a service, you have to be able to start it, stop it, etc.

The first thing you have to do is start the service supervisor, runsv. This will normally not be done manually, but by something like runsvdir, which starts supervisors for all services in a given folder (see below for an example). Once the supervisor is running, here's what you can do.

In the supervised service paradigm, the service has a state (running/stopped) and a configured state (how it wants to be, up/down).

So, a service can be:

Running and down: it will not be restarted if it finishes.
Running and up: it will be restarted if needed.
Stopped and up: it will be restarted (probably very soon).
Stopped and down: it will not start unless you do it manually.
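
The four combinations can be written down as a small lookup. This is only an illustration of the table above, not something runit provides; next_action and its wording are hypothetical.

```shell
#!/bin/sh
# Sketch: what the supervisor does for each (actual, wanted) pair.
# next_action is a hypothetical helper used only to illustrate the table.
next_action() {
    actual=$1   # running | stopped
    wanted=$2   # up | down
    case "$actual/$wanted" in
        running/up)   echo "restart it if it exits" ;;
        running/down) echo "leave it stopped once it exits" ;;
        stopped/up)   echo "start it" ;;
        stopped/down) echo "do nothing" ;;
        *)            echo "unknown state" >&2; return 1 ;;
    esac
}

next_action stopped up   # prints "start it"
```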

The tool to see the status is runsvstat:

bash-2.05b# runsvstat /var/service/dropbear/
/var/service/dropbear/: run (pid 22756) 6 seconds

Notice how it also gives service uptime, which is a useful thing, usually hard to get in SYSVinit.
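
Because the status line is uniform, it is easy to use from scripts. The sketch below pulls the PID and uptime out of a line in the format shown above; parse_status is a hypothetical helper, and the line format is assumed from that single example, so other versions of the tools may print something different.

```shell
#!/bin/sh
# Sketch: extract "<pid> <seconds>" from a status line like the one above.
# parse_status is a hypothetical helper; the format is assumed from the example.
parse_status() {
    # Prints "<pid> <seconds>", or fails if the line does not match.
    printf '%s\n' "$1" |
        sed -n 's/^.*: run (pid \([0-9][0-9]*\)) \([0-9][0-9]*\) seconds*$/\1 \2/p' |
        grep .
}

parse_status '/var/service/dropbear/: run (pid 22756) 6 seconds'   # prints "22756 6"
```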

The tool to control a service is runsvctrl:

#Switch it to up
runsvctrl u /var/service/dropbear

#Switch it to down
runsvctrl d /var/service/dropbear

#Make it run, but not switch to up (run once)
runsvctrl o /var/service/dropbear

It also has other commands, like 'send KILL signal' or 'send STOP signal', so read the docs.
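
If the one-letter commands are hard to remember, a tiny wrapper can spell them out. This is a sketch only: svc and the DRY_RUN variable are hypothetical names, and by default the wrapper just prints the command it would run.

```shell
#!/bin/sh
# Sketch: readable names for the three runsvctrl commands shown above.
# svc and DRY_RUN are hypothetical; with DRY_RUN unset the command is only printed.
svc() {
    case "$1" in
        up)   letter=u ;;
        down) letter=d ;;
        once) letter=o ;;
        *)    echo "usage: svc {up|down|once} service-dir" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "runsvctrl $letter $2"
    else
        runsvctrl "$letter" "$2"
    fi
}

svc down /var/service/dropbear   # prints "runsvctrl d /var/service/dropbear"
```

Setting DRY_RUN=0 makes the wrapper actually call runsvctrl.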

The runit-init boot

WARNING: DO NOT TRY TO SWITCH TO RUNIT ON AN IMPORTANT BOX UNLESS YOU REALLY KNOW WHAT YOU ARE DOING

The main docs about booting with runit can be found here but let me give you a short tour.

getties

Make sure you create a set of decent services for getties (example) or else you can't log in.

Basic startup

When the boot starts, runit runs the /etc/runit/1 script.

This should be the equivalent of Red Hat/Fedora's rc.sysinit, and you can figure out how to do it by carefully following your system's boot up to the point where it goes into runlevels (see my article if you must). The main goal of this script is to run everything that has to run at boot and then never again.

Here's an example for a ucrux based distro I am hacking for myself:

#!/bin/dash

PATH=/command:/sbin:/bin:/usr/sbin:/usr/bin
D=`date`

echo "The system is coming up.  Please wait."

echo  Load configuration
. /etc/rc.conf

echo  Start device management daemon
/sbin/devfsd /dev

echo  Activate swap
/sbin/swapon -a

echo  Mount root read-only
/bin/mount -n -o remount,ro /

echo  Check filesystems
/sbin/fsck -A -T -C -a
if [ $? -gt 1 ]; then
        echo
        echo "***************  FILESYSTEM CHECK FAILED  ******************"
        echo "*                                                          *"
        echo "*  Please repair manually and reboot. Note that the root   *"
        echo "*  file system is currently mounted read-only. To remount  *"
        echo "*  it read-write type: mount -n -o remount,rw /            *"
        echo "*  When you exit the maintenance shell the system will     *"
        echo "*  reboot automatically.                                   *"
        echo "*                                                          *"
        echo "************************************************************"
        echo
        /sbin/sulogin -p
        echo "Automatic reboot in progress..."
        /bin/umount -a
        /bin/mount -n -o remount,ro /
        /sbin/reboot -f
        exit 0
fi
echo  Mount local filesystems
/bin/mount -n -o remount,rw /
/bin/rm -f /etc/mtab*
/bin/mount -a -O no_netdev

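echo  Clean up misc files
: > /var/run/utmp
# dash has neither brace expansion nor '&>', so the globs and the
# redirection are spelled out
/bin/rm -rf /forcefsck /fastboot /etc/nologin /etc/shutdownpid \
            /var/run/*.pid /var/lock/* /tmp/* /tmp/.[!.]* > /dev/null 2>&1

# Record the boot start time only after /tmp has been cleaned
echo "$D" > /tmp/start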

rm -Rf /tmp/.ICE-unix
/bin/mkdir -m 1777 /tmp/.ICE-unix

echo  Set kernel variables
/sbin/sysctl -p > /dev/null

echo  Update shared library links
/sbin/ldconfig

echo Updating module deps
/sbin/depmod -a &

echo  Configure host name
if [ "$HOSTNAME" ]; then
        echo "hostname: $HOSTNAME"
        /bin/hostname $HOSTNAME
fi

echo  Load random seed
if [ -f /var/tmp/random-seed ]; then
        /bin/cat /var/tmp/random-seed > /dev/urandom
fi

echo  Configure system clock
if [ ! -f /etc/adjtime ]; then
        echo "0.0 0 0.0" > /etc/adjtime
fi
if [ "$TIMEZONE" ]; then
        /bin/ln -sf /usr/share/zoneinfo/$TIMEZONE /etc/localtime
fi
/sbin/hwclock --hctosys

echo  Start log daemons
/usr/sbin/syslogd
/usr/sbin/klogd -c 4

echo  Load console font
if [ "$FONT" ]; then
        echo "font: $FONT"
        /usr/bin/setfont $FONT
fi

echo  Load console keymap
if [ "$KEYMAP" ]; then
        echo "keyboard: $KEYMAP"
        /bin/loadkeys -q $KEYMAP
fi

echo  Screen blanks after 15 minutes idle time
/usr/bin/setterm -blank 15

echo Config hardware
/etc/rc.d/coldplug start

# End of file

The bad news is that you will almost certainly have to hack this yourself, until someone decides to start making runit-based distros ;-)

Then, after that's done, you should have the filesystems mounted, the locale set, the font loaded, etc, etc.

So, now it runs /etc/runit/2. This is a much simpler script:

#!/bin/sh

PATH=/command:/usr/local/bin:/usr/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin

date >/tmp/end
exec env - PATH=$PATH \
runsvdir /var/service 'log: ...................................................................................\
...............................................................................................................\
...............................................................................................................\
..........................................................................................'

This one only assumes you create all your services in /var/service, so except for that, you can probably use it as-is.

What runsvdir does is to start all the services in /var/service. At the same time. This parallelism will do wonders for your boot time! It can also render the system completely unbootable, so...
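
Since runsvdir supervises every directory, or symlink to a directory, that it finds in the directory it scans, a common way to control what starts is to keep service definitions elsewhere and symlink only the ones you want. The sketch below does this inside a temporary directory; the sv and service paths are stand-ins (assumptions) for something like /etc/sv and /var/service.

```shell
#!/bin/sh
# Sketch: enable a service by symlinking it into the scanned directory.
# All paths are throwaway stand-ins for /etc/sv and /var/service.
base="${TMPDIR:-/tmp}/runit-demo"
mkdir -p "$base/sv/gpm" "$base/service"

# Enable: runsvdir would pick the link up on its next scan.
rm -f "$base/service/gpm"
ln -s "$base/sv/gpm" "$base/service/gpm"

# Optional: a file named 'down' keeps runsv from starting the service
# until you explicitly bring it up.
touch "$base/sv/gpm/down"

ls -l "$base/service"
```

Removing the symlink is the reverse operation: runsvdir notices that the entry is gone and tells the corresponding supervisor to stop.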

Service dependencies

Suppose you want to make sure gpm is running before starting X, because you are using its repeater mode. If you have no idea what I am talking about, don't worry. Just pretend that I gave you a valid reason for having a service depend on another ;-)

Then, that means X's run script should make sure that gpm's service is up before actually starting X.

Here's how you do it:

#!/bin/dash
sv -w 3 start /var/service/gpm /var/service/getty* || exit 1
exec xdm -nodaemon

This script asks for gpm and every getty service I defined to be brought up, and waits up to 3 seconds for them to be running. If one of those services is still not up when the timeout expires, then sv fails and the script exits; the supervisor will then simply try X's run script again.

Even when sv reports success, nothing guarantees that the services don't crash between the second and third lines in our script. Sadly, that is a race condition that can lead to trouble and is unavoidable. I haven't found a way to do it right, though, and in any case, SYSVinit does it worse.
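
The wait-then-give-up pattern behind that sv call can be approximated with a plain polling loop, which is also handy when the dependency is not a supervised service. This is a sketch only: wait_for is a hypothetical helper, and the socket path in the commented example is an assumption about where gpm puts its control socket.

```shell
#!/bin/sh
# Sketch: wait up to N seconds for a check command to succeed.
# wait_for is a hypothetical helper, not a runit tool.
wait_for() {
    timeout=$1
    shift
    elapsed=0
    # Retry the check once per second until it succeeds or time runs out.
    while ! "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example for a run script (path is an assumption):
#   wait_for 3 test -S /dev/gpmctl || exit 1
```

Like the sv version, this narrows the race but does not remove it: the check can succeed and the dependency can still die a moment later.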

The other side of dependencies is that, in our example, if gpm exits, X should exit too:

#!/bin/dash
sv stop -v -k -w 30 /var/service/X

That would be gpm's finish script. So, when gpm goes down, it takes X with it. Usually, this is not a good idea, I am only doing it for example's sake.

In the real world, most services will survive gracefully, at least long enough for the lower-level service to restart. Dependencies on startup are good. Dependencies on finish are usually dangerous and annoying.

I mean, killing X? What about all the stuff you are doing???? If GPM dies, at worst you are without a mouse. Well, at least you can save stuff using shortcuts, instead of being dumped into a shell without warning :-)

Not booting via runit

You can easily boot using SYSVinit yet use runit to manage your services. Here's a silly SYSVinit service script for runit:

#!/bin/sh

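case "$1" in
start)
        # runsvdir stays in the foreground, so put it in the background
        runsvdir /var/service &
        ;;
stop)
        # On HUP, runsvdir sends TERM to every runsv it monitors, then exits
        killall -HUP runsvdir
        ;;
restart|reload)
        killall -HUP runsvdir
        runsvdir /var/service &
        ;;
esac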

So, use that as your only SYSVinit service and be happy :-)

Runit, runit good!

So, it takes a certain amount of work to switch to runit. You end up writing or retouching many scripts (although there are a bunch of them on runit's site). What do you get for your efforts?

Well, you get a better system, I think. The main things I find are:

Better management: Runit services are more uniform and better behaved. You get more meaningful status information.

Better dependencies: If a service depends on another, you get to actually check whether it works, instead of trusting luck, which is pretty much what SYSVinit does.

Simpler service scripts: If you add your own services, SYSVinit is much harder. Since I do that often, runit makes my life simpler.

Faster boots: The startup parallelization runit does is good. Usually your CPU is quite idle on startup, while SYSV painfully starts one service at a time. I have cut my boot time in half by switching to runit on a Fedora 3 test system, with no loss of functionality.

Less hackish: The concept of a runit service is well defined. The concept of logging is well defined. The concepts of startup, and of whether a service should start on boot or not, are well defined in simple ways, and runit includes the tools to manage the services effectively. That means you don't have bolted-on stuff like chkconfig, ntsysv, rcconf and so on. SYSVinit is used everywhere, but the tools are different almost everywhere.

And last but not least:

No runlevels: The concept of runlevels is evil. It serves no useful purpose, and makes SYSVinit much harder than it needs to be.

But what the heck, if you want runlevels, you can have them with runit as well.

If you try this and are happy with the results (or even if you are unhappy ;-), please let me know!

If you are using RHEL4 or CentOS, then you can probably avoid much trouble using my runit RPM. It requires /bin/dash which you can get from my dash RPM (or just link some shell to /bin/dash).

After you install it, try booting with the init=/sbin/runit-init parameter passed to your kernel and it should boot just fine.
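
To confirm that the parameter actually reached the kernel, you can look for it in the kernel command line. The sketch below does that; init_from_cmdline is a hypothetical helper, and the sample command line is made up for illustration.

```shell
#!/bin/sh
# Sketch: report which init the kernel was asked to run.
# init_from_cmdline is a hypothetical helper.
init_from_cmdline() {
    # $1: a kernel command line; defaults to the running kernel's
    cmdline=${1:-$(cat /proc/cmdline)}
    for word in $cmdline; do
        case "$word" in
            init=*) echo "${word#init=}"; return 0 ;;
        esac
    done
    # No init= parameter: the kernel falls back to its default init
    return 1
}

init_from_cmdline 'ro root=/dev/sda1 init=/sbin/runit-init quiet'   # prints "/sbin/runit-init"
```

Called without arguments on a running system, it reads /proc/cmdline instead of the sample string.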

Added May 19 2006: I posted this about converting SysV scripts to runit.

Ralsina.Me — El sitio web de Roberto Alsina