Copyright © 2002 Paul Sheer.



20. Advanced Shell Scripting

This chapter completes our discussion of sh shell scripting begun in Chapter 7 and expanded on in Chapter 9. These three chapters represent almost everything you can do with the bash shell.

20.1 Lists of Commands

The special operators && and || can be used to execute commands in sequence. For instance:

grep '^harry:' /etc/passwd || useradd harry

The || means to only execute the second command if the first command returns an error. In the above case, grep will return an exit code of 1 if harry is not in the /etc/passwd file, causing useradd to be executed.

An alternate representation is

! grep -q '^harry:' /etc/passwd && useradd harry

where ! inverts the exit code of grep and the -q option suppresses its output. (The -v option is no substitute here: it inverts the sense of matching line by line, so grep -v succeeds whenever any line of /etc/passwd fails to match, which is almost always.) && has the opposite meaning to ||, that is, to execute the second command only if the first succeeds.

Adept script writers often string together many commands to create the most succinct representation of an operation:

grep -q '^harry:' /etc/passwd || useradd harry || \
       echo "`date`: useradd failed" >> /var/log/my_special_log
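To see the short-circuit behavior without touching /etc/passwd, here is a minimal sketch that uses the true and false commands as stand-ins for a command that succeeds and one that fails. The function name run_chain is made up for this illustration.

```shell
# run_chain is a hypothetical helper: it runs the command given as its
# arguments, then reports which branch of the && / || list was taken.
run_chain ()
{
    "$@" && echo "succeeded" || echo "failed"
}

run_chain true      # the command succeeds, so && runs the first echo
run_chain false     # the command fails, so && is skipped and || runs
```

Bear in mind that a && b || c is not quite an if-then-else: c also runs when a succeeds but b then fails.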

20.2 Special Parameters: $?, $*,...

An ordinary variable can be expanded with $VARNAME. Commonly used variables like PATH and special variables like PWD and RANDOM were covered in Chapter 9. Further special expansions are documented in the following section, quoted verbatim from the bash man page (the footnotes are mine). [Thanks to Brian Fox and Chet Ramey for this material.]

Special Parameters

The shell treats several parameters specially. These parameters may only be referenced; assignment to them is not allowed.
$*
Expands to the positional parameters (i.e., the command-line arguments passed to the shell script, with $1 being the first argument, $2 the second etc.), starting from one. When the expansion occurs within double quotes, it expands to a single word with the value of each parameter separated by the first character of the IFS special variable. That is, "$*" is equivalent to "$1c$2c...", where c is the first character of the value of the IFS variable. If IFS is unset, the parameters are separated by spaces. If IFS is null, the parameters are joined without intervening separators.
$@
Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" ... When there are no positional parameters, "$@" and $@ expand to nothing (i.e., they are removed). [Hint: this is very useful for writing wrapper shell scripts that just add one argument.]
$#
Expands to the number of positional parameters in decimal (i.e. the number of command-line arguments).
$?
Expands to the status of the most recently executed foreground pipeline. [I.e., the exit code of the last command.]
$-
Expands to the current option flags as specified upon invocation, by the set builtin command, or those set by the shell itself (such as the -i option).
$$
Expands to the process ID of the shell. In a () subshell, it expands to the process ID of the current shell, not the subshell.
$!
Expands to the process ID of the most recently executed background (asynchronous) command. [I.e., after executing a background command with command &, the variable $! will give its process ID.]
$0
Expands to the name of the shell or shell script. This is set at shell initialization. If bash is invoked with a file of commands, $0 is set to the name of that file. If bash is started with the -c option, then $0 is set to the first argument after the string to be executed, if one is present. Otherwise, it is set to the file name used to invoke bash, as given by argument zero. [Note that basename $0 is a useful way to get the name of the current command without the leading path.]
$_
At shell startup, set to the absolute file name of the shell or shell script being executed as passed in the argument list. Subsequently, expands to the last argument to the previous command, after expansion. Also set to the full file name of each command executed and placed in the environment exported to that command. When checking mail, this parameter holds the name of the mail file currently being checked.
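The difference between "$*" and "$@" is easiest to see by counting words. The following is only a sketch: count_words and join_with are names I have invented for it, and the positional parameters are set by hand with set -- so that it can be pasted into any script.

```shell
# count_words and join_with are hypothetical helpers for this sketch.
count_words ()
{
    echo $#
}

join_with ()
{
    # Run in a subshell so that changing IFS does not affect the caller.
    (
        IFS="$1"
        shift
        echo "$*"
    )
}

set -- one "two three" four     # set the positional parameters by hand
count_words "$*"                # prints 1: "$*" is a single word
count_words "$@"                # prints 3: one word per parameter
join_with : "$@"                # prints one:two three:four
false || echo "false returned $?"
```

The last line shows $? holding the exit code of the command that has just run, here 1.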

20.3 Expansion

Expansion refers to the way bash modifies the command-line before executing it. bash performs several textual modifications to the command-line, proceeding in the following order:

Brace expansion
We have already shown how you can use, for example, the shorthand touch file_{one,two,three}.txt to create multiple files file_one.txt, file_two.txt, and file_three.txt. This is known as brace expansion and occurs before any other kind of modification to the command-line.
Tilde expansion
The special character ~ is replaced with the full path contained in the HOME environment variable or the home directory of the user's login (if HOME is unset). ~+ is replaced with the current working directory and ~- is replaced with the most recent previous working directory. The last two are rarely used.
Parameter expansion
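This refers to expanding anything that begins with a $. Note that $VAR and ${VAR} do exactly the same thing, except that in the latter case the braces mark exactly where the variable name ends, so the expansion can be followed immediately by characters that bash would otherwise take to be part of the name (as in ${VAR}_old).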

There are several parameter expansion tricks that you can use to do string manipulation. Most shell programmers never bother with these, probably because they are not well supported by other UNIX systems.

${VAR:-default}
This will result in $VAR unless VAR is unset or null, in which case it will result in default.
${VAR:=default}
Same as previous except that default is also assigned to VAR if it is empty.
${VAR:+default}
This will result in an empty string if VAR is unset or null; otherwise it will result in default. This is the opposite behavior of ${VAR:-default}.
${VAR:?message}
This will result in $VAR unless VAR is unset or null, in which case an error message containing message is displayed.
${VAR:offset} or ${VAR:n:l}
This produces l characters of $VAR, starting at offset n, where the first character has offset 0. If l is not present, then all characters from offset n onward are produced. This is useful for splitting up strings. Try:

TEXT=scripting_for_phun
echo ${TEXT:10:3}
echo ${TEXT:10}

${#VAR}
Gives the length of $VAR.
${!PRE*}
Gives a list of all variables whose names begin with PRE.
${VAR#pattern}
$VAR is returned with the glob expression pattern removed from the leading part of the string. For instance, ${TEXT#scr} in the above example will return ripting_for_phun.
${VAR##pattern}
This is the same as the previous expansion except that if pattern contains wild cards, then it will try to match the maximum length of characters.
${VAR%pattern}
The same as ${VAR#pattern} except that characters are removed from the trailing part of the string.
${VAR%%pattern}
The same as ${VAR##pattern} except that characters are removed from the trailing part of the string.
${VAR/search/replace}
$VAR is returned with the first occurrence of the string search replaced with replace.
${VAR/#search/replace}
Same as ${VAR/search/replace} except that the match is attempted from the leading part of $VAR.
${VAR/%search/replace}
Same as ${VAR/search/replace} except that the match is attempted at the trailing part of $VAR.
${VAR//search/replace}
Same as ${VAR/search/replace} except that all instances of search are replaced.
Backquote expansion
We have already shown backquote expansion in Section 7.12. Note that the additional notation $(command) is equivalent to `command` except that escapes (i.e., \) are not required for special characters.
Arithmetic expansion
We have already shown arithmetic expansion on page [*]. Note that the additional notation $((expression)) is equivalent to $[expression].
The last modifications to the command-line are the splitting of the command-line into words according to the white space between them. The IFS (Internal Field Separator) environment variable determines what characters delimit command-line words (usually whitespace). With the command-line divided into words, path names are expanded according to glob wild cards. Consult bash(1) for a comprehensive description of the pattern matching options that most people don't know about.
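The parameter expansions listed above are easier to remember once you have watched them work on a real string. This sketch restricts itself to the forms that also exist in plain POSIX sh; the substring and search-and-replace forms are bash extensions. The variable COLOR is a made-up name and is unset first so that the results are predictable.

```shell
TEXT=scripting_for_phun

echo "${TEXT#scr}"        # ripting_for_phun   (literal prefix removed)
echo "${TEXT#*_}"         # for_phun           (shortest leading match of *_)
echo "${TEXT##*_}"        # phun               (longest leading match of *_)
echo "${TEXT%_*}"         # scripting_for      (shortest trailing match of _*)
echo "${TEXT%%_*}"        # scripting          (longest trailing match of _*)
echo "${#TEXT}"           # 18

unset COLOR
echo "${COLOR:-green}"    # green   (COLOR itself stays unset)
echo "${COLOR:+set}"      # an empty line, because COLOR is unset
echo "${COLOR:=blue}"     # blue    (and COLOR is now assigned)
echo "${COLOR:+set}"      # set
```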

20.4 Built-in Commands

Many commands invoke functionality that is built into bash or are interpreted specially by it. These do not invoke an executable off the file system. Some of these were described in Chapter 7, and a few more are discussed here. For an exhaustive description, consult bash(1).

:
A single colon by itself does nothing. It is useful for a ``no operation'' line such as:

if <command> ; then
    :
else
    echo "<command> was unsuccessful"
fi

. filename args ...
A single dot is the same as the source command. See below.
alias command=value
Creates a pseudonym for a command. Try:

alias necho="echo -n"
necho "hello"

Some distributions alias the mv, cp, and rm commands to the same pseudonym with the -i (interactive) option set. This prevents files from being deleted without prompting, but can be irritating for the administrator. See your ~/.bashrc file for these settings. See also unalias.
unalias command
Removes an alias created with alias.
alias -p
Prints list of aliases.
eval arg ...
Executes args as a line of shell script.
exec command arg ...
Begins executing command under the same process ID as the current script. This is most often used for shell scripts that are mere ``wrapper'' scripts for real programs. The wrapper script sets any environment variables and then execs the real program binary as its last line. exec should never return.
local var=value
Assigns a value to a variable. The resulting variable is visible only within the current function.
pushd directory and popd
These two commands are useful for jumping around directories. pushd can be used instead of cd, but unlike cd, the directory is saved onto a list of directories. At any time, entering popd returns you to the previous directory. This is nice for navigation since it keeps a history of wherever you have been.
printf format args ...
This is like the C printf function. It outputs to the terminal like echo but is useful for more complex formatting of output. See printf(3) for details and try printf "%10.3e\n" 12 as an example.
pwd
Prints the present working directory.
set
Prints the value of all environment variables. See also Section 20.6 on the set command.
source filename args ...
Reads filename into the current shell environment. This is useful for executing a shell script when environment variables set by that script must be preserved.
times
Prints the accumulated user and system times for the shell and for processes run from the shell.
type command
Tells whether command is an alias, a built-in or a system executable.
ulimit
Prints and sets various user resource limits like memory usage limits and CPU limits. See bash(1) for details.
umask
See Section 14.2.
unset VAR
Deletes a variable or environment variable.
unset -f func
Deletes a function.
wait
Pauses until all background jobs have completed.
wait PID
Pauses until background process with process ID of PID has exited, then returns the exit code of the background process.
wait %job
Same with respect to a job spec.
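Two of these built-ins, wait and eval, are worth trying straight away. In the sketch below, bg_status is a name I have made up; it starts a background subshell that exits with a chosen code and shows that wait hands that same code back. The variable CMD is likewise only for illustration.

```shell
# bg_status is a hypothetical helper: it starts a background subshell
# that exits with the given code, waits for it, and prints the exit
# code that wait returned.
bg_status ()
{
    ( exit "$1" ) &
    STATUS=0
    wait $! || STATUS=$?
    echo $STATUS
}

bg_status 0     # prints 0
bg_status 3     # prints 3

# eval re-reads its arguments as a line of shell script.
CMD='echo hello from eval'
eval "$CMD"
```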

20.5 Trapping Signals -- the trap Command

You will often want to make your script perform certain actions in response to a signal. A list of signals can be found on page [*]. To trap a signal, create a function and then use the trap command to bind the function to the signal.

function on_hangup ()
{
    echo 'Hangup (SIGHUP) signal received'
}
trap on_hangup SIGHUP
while true ; do
    sleep 1
done
exit 0

Run the above script and then send the process ID the -HUP signal to test it. (See Section 9.5.)

An important function of a program is to clean up after itself on exit. The special signal EXIT (not really a signal) executes code on exit of the script:

function on_exit ()
{
    echo 'I should remove temp files now'
}
trap on_exit EXIT
while true ; do
    sleep 1
done
exit 0

Breaking the above program will cause it to print its own epitaph.

If - is given instead of a function name, then the signal is unbound (i.e., set to its default value).
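The usual job for an EXIT trap is removing a temporary file. Here is a small sketch of that idea, assuming the mktemp command is available on your system; cleanup_demo and the variable names are invented for the example.

```shell
# cleanup_demo is a hypothetical function. The work is done inside a
# ( ) subshell so that the EXIT trap fires when the subshell ends
# rather than when the calling script ends.
cleanup_demo ()
{
    (
        TMPFILE=`mktemp` || exit 1
        trap 'rm -f "$TMPFILE" ; echo "removed temp file"' EXIT
        echo "working" > "$TMPFILE"
        read LINE < "$TMPFILE"
        echo "read back: $LINE"
    )
}

cleanup_demo
```

The trap text is given in single quotes so that $TMPFILE is expanded when the trap runs, not when it is set. Inside the subshell, trap - EXIT would cancel it again.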

20.6 Internal Settings -- the set Command

The set command can modify certain behavioral settings of the shell. Your current options can be displayed with echo $-. Various set commands are usually entered at the top of a script or given as command-line options to bash. Using set +option instead of set -option disables the option. Here are a few examples:

set -e
Exit immediately if any simple command gives an error.
set -h
Cache the location of commands in your PATH. The shell will become confused if binaries are suddenly inserted into the directories of your PATH, perhaps causing a No such file or directory error. In this case, disable this option or restart your shell. This option is enabled by default.
set -n
Read commands without executing them. This command is useful for syntax checking.
set -o posix
Comply exactly with the POSIX 1003.2 standard.
set -u
Report an error when trying to reference a variable that is unset. Usually bash just fills in an empty string.
set -v
Print each line of script as it is executed.
set -x
Display each command expansion as it is executed.
set -C
Do not overwrite existing files when using >. You can use >| to force overwriting.
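A convenient way to experiment with these options is to switch them on inside a ( ) subshell, so that they cannot upset the rest of your script. The sketch below tries set -u and set -C in this way. The function name show_set_options and the variable names are made up, and mktemp is assumed to be installed.

```shell
# show_set_options is a hypothetical demonstration function.
show_set_options ()
{
    unset NO_SUCH_VARIABLE

    # With -u, referencing an unset variable aborts the subshell.
    ( set -u ; echo "value: $NO_SUCH_VARIABLE" ) 2>/dev/null \
        || echo "set -u: unset variable reported as an error"

    SCRATCH=`mktemp`     # an existing, empty file
    # With -C, > refuses to overwrite the existing file...
    ( set -C ; echo new > "$SCRATCH" ) 2>/dev/null \
        || echo "set -C: refused to overwrite"
    # ...but >| forces the overwrite.
    ( set -C ; echo new >| "$SCRATCH" ) \
        && echo "set -C: overwrite forced with >|"
    rm -f "$SCRATCH"
}

show_set_options
```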

20.7 Useful Scripts and Commands

Here is a collection of useful utility scripts that people are always asking for on the mailing lists. See page [*] for several security check scripts.

20.7.1 chroot

The chroot command makes a process think that its root file system is not actually /. For example, on one system I have a complete Debian installation residing under a directory, say, /mnt/debian. I can issue the command

chroot /mnt/debian bash -i

to run the bash shell interactively, under the root file system /mnt/debian. This command will hence run the command /mnt/debian/bin/bash -i. All further commands processed under this shell will have no knowledge of the real root directory, so I can use my Debian installation without having to reboot. All further commands will effectively behave as though they are inside a separate UNIX machine. One caveat: you may have to remount your /proc file system inside your chroot'd file system--see page [*].

This is useful for improving security. Insecure network services can change to a different root directory--any corruption will not affect the real system.

Most rescue disks have a chroot command. After booting the disk, you can manually mount the file systems on your hard drive, and then issue a chroot to begin using your machine as usual. Note that the command chroot <new-root> without arguments invokes a shell by default.

20.7.2 if conditionals

The if test ... construct was used to control program flow in Chapter 7. Bash, however, has a built-in alias for the test command: the left square bracket, [.

Using [ instead of test adds only elegance:

if [ 5 -le 3 ] ; then
    echo '5 <= 3'
fi

It is important at this point to realize that the if command understands nothing of arithmetic. It merely executes a command test (or in this case [) and tests the exit code. If the exit code is zero, then the command is considered to be successful and if proceeds with the body of the if statement block. The onus is on the test command to properly evaluate the expression given to it.

if can equally well be used with any command:

if echo "$PATH" | grep -qwv /usr/local/bin ; then
    export PATH="$PATH:/usr/local/bin"
fi

conditionally adds /usr/local/bin if grep does not find it in your PATH.
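That if looks only at an exit code can be demonstrated by handing it arbitrary commands. In this sketch, describe is a hypothetical helper that runs whatever command it is given and reports which branch if chose.

```shell
# describe is a hypothetical helper: it runs its arguments as a command
# and lets if branch on the exit code, just as if does with test or [.
describe ()
{
    if "$@" ; then
        echo "exit code zero: then-branch taken"
    else
        echo "exit code non-zero: else-branch taken"
    fi
}

describe [ 5 -le 3 ]                    # [ is just a command; here it fails
describe test 3 -le 5                   # test succeeds
describe grep -q '^root:' /etc/passwd   # result depends on your system
```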

20.7.3 patching and diffing

You may often want to find the differences between two files, for example to see what changes have been made to a file between versions. Or, when a large batch of source code may have been updated, it is silly to download the entire directory tree if there have been only a few small changes. You would want a list of alterations instead.

The diff utility dumps the lines that differ between two files. It can be used as follows:

diff -u <old-file> <new-file>

You can also use diff to see the differences between two directory trees. diff recursively compares all corresponding files:

diff -u --recursive --new-file <old-dir> <new-dir> > <patch-file>.diff

The output is known as a patch file against a directory tree, that can be used both to see changes, and to bring <old-dir> up to date with <new-dir>.

Patch files may also end in .patch and are often gzipped. The patch file can be applied to <old-dir> with

cd <old-dir>
patch -p1 -s < <patch-file>.diff

which makes <old-dir> identical to <new-dir>. The -p1 option strips the leading directory name from the patch file. The presence of a leading directory name in the patch file often confuses the patch command.
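If you would like to watch the whole cycle on throwaway data first, the following sketch builds two tiny directory trees, creates a patch, applies it, and compares the trees again. The name patch_roundtrip and the file names are invented, and it assumes that mktemp, the GNU versions of diff and patch are installed.

```shell
# patch_roundtrip is a hypothetical helper. It builds two small trees in
# a scratch directory, makes a patch, applies it, and compares again.
patch_roundtrip ()
{
    WORK=`mktemp -d` || return 1
    mkdir "$WORK/old" "$WORK/new"
    echo "version 1" > "$WORK/old/README"
    echo "version 2" > "$WORK/new/README"
    echo "brand new" > "$WORK/new/NEWS"
    (
        cd "$WORK" || exit 1
        # diff exits with 1 when it finds differences; that is expected.
        diff -u --recursive --new-file old new > changes.diff || true
        cd old || exit 1
        patch -p1 -s < ../changes.diff || exit 1
        cd ..
        diff --recursive old new > /dev/null && echo "trees are identical"
    )
    STATUS=$?
    rm -rf "$WORK"
    return $STATUS
}

patch_roundtrip
```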

20.7.4 Internet connectivity test

You may want to leave this example until you have covered more networking theory.

The acid test for an Internet connection is a successful DNS query. You can use ping to test whether a server is up, but some networks filter ICMP messages and ping does not check that your DNS is working. dig sends a single UDP packet similar to ping. Unfortunately, it takes rather long to time out, so we fudge in a kill after 2 seconds.

This script blocks until it successfully queries a remote name server. Typically, the lines following this loop would run fetchmail and a mail server queue flush, or possibly uucico. Do set the name server IP to something appropriate like that of your local ISP; and increase the 2 second time out if your name server typically takes longer to respond.

# Set MY_DNS_SERVER to the IP address of your name server first.
while true ; do
    (
        dig @$MY_DNS_SERVER netscape.com IN A &
        DIG_PID=$!
        { sleep 2 ; kill $DIG_PID ; } &
        sleep 1
        wait $DIG_PID
    ) 2>/dev/null | grep -q '^[^;]*netscape.com' && break
done
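The trick of starting a command in the background and killing it after a fixed delay can be pulled out into a general helper. This is only a sketch under my own naming: run_with_timeout and its variables do not appear in the script above.

```shell
# run_with_timeout is a hypothetical helper:
#     run_with_timeout <seconds> <command> [args ...]
run_with_timeout ()
{
    SECS=$1 ; shift
    "$@" &
    CMD_PID=$!
    # The watchdog: sleeps, then kills the command if it is still there.
    { sleep "$SECS" ; kill $CMD_PID ; } >/dev/null 2>&1 &
    WATCHDOG_PID=$!
    STATUS=0
    wait $CMD_PID || STATUS=$?
    # The command has finished one way or the other; stop the watchdog.
    kill $WATCHDOG_PID 2>/dev/null
    return $STATUS
}

run_with_timeout 1 sleep 5 || echo "sleep 5 was killed after 1 second"
run_with_timeout 2 true    && echo "true finished well within 2 seconds"
```

A command that is killed by the watchdog makes wait return a non-zero exit code, which is what the first line relies on.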

20.7.5 Recursive grep (search)

Recursively searching through a directory tree can be done easily with the find and xargs commands. You should consult both these man pages. The following command pipe searches through the kernel source for anything about the ``pcnet'' Ethernet card, printing also the line number:

find /usr/src/linux -follow -type f | xargs grep -iHn pcnet

(You will notice how this command returns rather a lot of data. However, going through it carefully can be quite instructive.)

Limiting a search to a certain file extension is just another common use of this pipe sequence.

find /usr/src/linux -follow -type f -name '*.[ch]' | xargs grep -iHn pcnet

Note that new versions of grep also have a -r option to recursively search through directories.
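One weakness of the plain find | xargs pipe is that file names containing spaces are split into several arguments. The GNU versions of find and xargs offer -print0 and -0 to get around this by separating names with a null character. A sketch, with search_tree and the sample file name being my own inventions:

```shell
# search_tree is a hypothetical wrapper: search_tree <dir> <pattern>
search_tree ()
{
    find "$1" -type f -print0 | xargs -0 grep -iHn -- "$2"
}

# Try it on a scratch directory holding a file with a space in its name.
DEMO=`mktemp -d`
echo "the PCnet32 driver" > "$DEMO/net notes.txt"
search_tree "$DEMO" pcnet
rm -rf "$DEMO"
```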

20.7.6 Recursive search and replace

Often you will want to perform a search-and-replace throughout all the files in an entire source tree. A typical example is the changing of a function call name throughout lots of C source. The following script is a must for any /usr/local/bin/. Notice the way it recursively calls itself.

N=`basename $0`
if [ "$1" = "-v" ] ; then
    VERBOSE="-v"
    shift
fi
if [ "$3" = "" -o "$1" = "-h" -o "$1" = "--help" ] ; then
    echo "$N: Usage"
    echo "        $N [-h|--help] [-v] <regexp-search> \
<regexp-replace> <glob-file>"
    exit 0
fi
S="$1" ; shift ; R="$1" ; shift
# Temporary file that holds the edited copy of each file
T=$$replc
if echo "$1" | grep -q / ; then
    # Called with path names (by xargs below): edit each file
    for i in "$@" ; do
        SEARCH=`echo "$S" | sed 's,/,\\\\/,g'`
        REPLACE=`echo "$R" | sed 's,/,\\\\/,g'`
        cat $i | sed "s/$SEARCH/$REPLACE/g" > $T
        D="$?"
        if [ "$D" = "0" ] ; then
            if diff -q $T $i >/dev/null ; then
                :
            else
                if [ "$VERBOSE" = "-v" ] ; then
                    echo $i
                fi
                cat $T > $i
            fi
            rm -f $T
        fi
    done
else
    find . -type f -name "$1" | xargs $0 $VERBOSE "$S" "$R"
fi

20.7.7 cut and awk -- manipulating text file fields

The cut command is useful for slicing files into fields; try

cut -d: -f1 /etc/passwd
cat /etc/passwd | cut -d: -f1

The awk program is an interpreter for a complete programming language called AWK. A common use for awk is in field stripping. It is slightly more flexible than cut--

cat /etc/passwd | awk -F : '{print $1}'

--especially where whitespace gets in the way,

ls -al | awk '{print $6 " " $7 " " $8}'
ls -al | awk '{print $5 " bytes"}'

which isolates the time and size of the file respectively.

Get your nonlocal IP addresses with:

ifconfig | grep 'inet addr:' | fgrep -v '127.0.0.' | \
                                  cut -d: -f2 | cut -d' ' -f1

Reverse an IP address with:

echo 192.168.3.2 | awk -F . '{print $4 "." $3 "." $2 "." $1 }'

Print all common user names (i.e., users with UID values greater than 499 on RedHat and greater than 999 on Debian):

awk -F: '$3 >= 500 {print $1}' /etc/passwd
( awk -F: '$3 >= 1000 {print $1}' /etc/passwd )
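Because the contents of /etc/passwd differ from machine to machine, it can help to try these awk one-liners on a made-up sample first. Everything in this sketch is invented for the purpose: the sample entries, the helper names users_from and reverse_ip, and the use of mktemp for the scratch file.

```shell
# A made-up sample in /etc/passwd format, so the result is predictable.
SAMPLE=`mktemp`
cat > "$SAMPLE" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
mary:x:500:500:Mary:/home/mary:/bin/bash
freddy:x:1001:1001:Freddy:/home/freddy:/bin/bash
EOF

# users_from is a hypothetical helper: users_from <min-uid> <passwd-file>
users_from ()
{
    awk -F: -v min="$1" '$3 >= min {print $1}' "$2"
}

# reverse_ip is a hypothetical helper: reverse_ip <dotted-quad>
reverse_ip ()
{
    echo "$1" | awk -F . '{print $4 "." $3 "." $2 "." $1}'
}

users_from 500 "$SAMPLE"      # mary and freddy
users_from 1000 "$SAMPLE"     # freddy only
reverse_ip 192.168.3.2        # 2.3.168.192
rm -f "$SAMPLE"
```

Passing the threshold with -v keeps the awk program itself in single quotes, so the shell never has to expand anything inside it.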

20.7.8 Calculations with bc

Scripts can easily use bc to do calculations that expr can't handle. For example, convert to decimal with

echo -e 'ibase=16;FFFF' | bc

to binary with

echo -e 'obase=2;12345' | bc

or work out the sine of 45 degrees with

pi=`echo "scale=10; 4*a(1)" | bc -l`
echo "scale=10; s(45*$pi/180)" | bc -l

20.7.9 Conversion of graphics formats of many files

The convert program of the ImageMagick package is a command many Windows users would love. It can easily be used to convert multiple files from one format to another. Changing a file's extension can be done with `echo filename | sed -e 's/\.old$/.new/'`. The convert command does the rest:

for i in *.pcx ; do
    CMD="convert -quality 625 $i `echo $i | sed -e 's/\.pcx$/.png/'`"
# Show the command-line to the user:
    echo $CMD
# Execute the command-line:
    eval $CMD
done

Note that the search-and-replace expansion mechanism could also be used to replace the extensions: ${i/%.pcx/.png} produces the desired result.

Incidentally, the above nicely compresses high-resolution pcx files--possibly the output of a scanning operation, or a LaTeX compilation into PostScript rendered with GhostScript (i.e. gs -sDEVICE=pcx256 -sOutputFile='page%d.pcx' file.ps).
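The extension can also be swapped with the % expansion from Section 20.3, which works in any POSIX shell and needs no sed process. A minimal sketch, where swap_ext is a made-up helper name:

```shell
# swap_ext is a hypothetical helper: swap_ext <file> <old-ext> <new-ext>
swap_ext ()
{
    echo "${1%."$2"}.$3"
}

swap_ext page1.pcx pcx png          # page1.png
swap_ext my.pcx.scan.pcx pcx png    # my.pcx.scan.png
```

Only a trailing .pcx is removed. A file name that does not end in the old extension simply gets the new one appended.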

20.7.10 Securely erasing files

Removing a file with rm only unlinks the file name from the data. The file blocks may still be on disk, and will only be reclaimed when the file system reuses them. Erasing a file properly requires writing random bytes into the disk blocks occupied by the file. The following overwrites all the files in the current directory:

for i in * ; do
    dd if=/dev/urandom    \
       of="$i"            \
       bs=1024            \
       count=`expr 1 +    \
          \`stat "$i" | grep 'Size:' | awk '{print $2}'\`  \
             / 1024`
done

You can then remove the files normally with rm.
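The same idea can be wrapped up for a single file. This sketch is a variation rather than a copy of the loop above: it gets the size from wc -c instead of stat, and it adds conv=notrunc so that dd writes into the existing file instead of truncating it first. Whether the old blocks are really overwritten in place still depends on the file system. The names scrub_file and VICTIM are made up.

```shell
# scrub_file is a hypothetical helper: scrub_file <file>
scrub_file ()
{
    SIZE=`wc -c < "$1"`
    # Round up to whole 1024-byte blocks, as the loop above does.
    BLOCKS=`expr 1 + $SIZE / 1024`
    dd if=/dev/urandom of="$1" bs=1024 count=$BLOCKS conv=notrunc 2>/dev/null
}

VICTIM=`mktemp`
echo "my secret" > "$VICTIM"
scrub_file "$VICTIM"
grep -q "my secret" "$VICTIM" || echo "original contents are gone"
rm -f "$VICTIM"
```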

20.7.11 Persistent background processes

Consider trying to run a process, say, the rxvt terminal, in the background. This can be done simply with:

rxvt &

However, rxvt still has its output connected to the shell and is a child process of the shell. When a login shell exits, it may take its child processes with it. rxvt may also die of its own accord from trying to read or write to a terminal that does not exist without the parent shell. Now try:

{ rxvt >/dev/null 2>&1 </dev/null & } &

This technique is known as forking twice, and redirecting the terminal to /dev/null. The shell can know about its child processes but not about its ``grand child'' processes. We have hence created a daemon process proper with the above command.

Now, it is easy to create a daemon process that restarts itself if it happens to die. Although such functionality is best accomplished within C (which you will get a taste of in Chapter 22), you can make do with:

{ { while true ; do rxvt ; done ; } >/dev/null 2>&1 </dev/null & } &

You will notice the effects of all these tricks with:

ps awwwxf

20.7.12 Processing the process list

The following command uses the custom format option of ps to print every conceivable attribute of a process:

ps -awwwxo %cpu,%mem,alarm,args,blocked,bsdstart,bsdtime,c,caught,cmd,comm,\

The output is best piped to a file and viewed with a nonwrapping text editor. More interestingly, the awk command can print the process ID of a process with

ps awwx | grep -w 'htt[p]d' | awk '{print $1}'

which prints all the processes having httpd in the command name or command-line. This filter is useful for killing netscape as follows:

kill -9 `ps awx | grep 'netsc[a]pe' | awk '{print $1}'`

(Note that the [a] in the regular expression prevents grep from finding itself in the process list.)

Other useful ps variations are:

ps awwxf
ps awwxl
ps awwxv
ps awwxu
ps awwxs

The f option is most useful for showing parent-child relationships. It stands for forest, and shows the full process tree. For example, here I am running an X desktop with two windows:

    1 ?        S      0:05 init [5]
    2 ?        SW     0:02 [kflushd]
    3 ?        SW     0:02 [kupdate]
    4 ?        SW     0:00 [kpiod]
    5 ?        SW     0:01 [kswapd]
    6 ?        SW<    0:00 [mdrecoveryd]
  262 ?        S      0:02 syslogd -m 0
  272 ?        S      0:00 klogd
  341 ?        S      0:00 xinetd -reuse -pidfile /var/run/xinetd.pid
  447 ?        S      0:00 crond
  480 ?        S      0:02 xfs -droppriv -daemon
  506 tty1     S      0:00 /sbin/mingetty tty1
  507 tty2     S      0:00 /sbin/mingetty tty2
  508 tty3     S      0:00 /sbin/mingetty tty3
  509 ?        S      0:00 /usr/bin/gdm -nodaemon
  514 ?        S      7:04  \_ /etc/X11/X -auth /var/gdm/:0.Xauth :0
  515 ?        S      0:00  \_ /usr/bin/gdm -nodaemon
  524 ?        S      0:18      \_ /opt/icewm/bin/icewm
  748 ?        S      0:08          \_ rxvt -bg black -cr green -fg whi
  749 pts/0    S      0:00          |   \_ bash
 5643 pts/0    S      0:09          |       \_ mc
 5645 pts/6    S      0:02          |           \_ bash -rcfile .bashrc
25292 pts/6    R      0:00          |               \_ ps awwxf
11780 ?        S      0:16          \_ /usr/lib/netscape/netscape-commu
11814 ?        S      0:00              \_ (dns helper)
15534 pts/6    S      3:12 cooledit -I /root/.cedit/projects/Rute
15535 pts/6    S      6:03  \_ aspell -a -a

The u option shows the useful user format, and the others show virtual memory, signal and long format.

20.8 Shell Initialization

Here I will briefly discuss what initialization takes place after logging in and how to modify it.

The interactive shell invoked after login will be the shell specified in the last field of the user's entry in the /etc/passwd file. The login program will invoke the shell after authenticating the user, placing a - in front of the command name, which indicates to the shell that it is a login shell, meaning that it reads and executes several scripts to initialize the environment. In the case of bash, it first reads /etc/profile and then the first one it finds of ~/.bash_profile, ~/.bash_login, and ~/.profile, searching in that order. An interactive shell that is not a login shell reads ~/.bashrc instead. Note that traditional sh shells only read /etc/profile and ~/.profile.

20.8.1 Customizing the PATH and LD_LIBRARY_PATH

Administrators can customize things like the environment variables by modifying these startup scripts. Consider the classic case of an installation tree under /opt/. Often, a package like /opt/staroffice/ or /opt/oracle/ will require the PATH and LD_LIBRARY_PATH variables to be adjusted accordingly. In the case of RedHat, a script,

for i in /opt/*/bin /usr/local/bin ; do
    test -d $i || continue
    echo $PATH | grep -wq "$i" && continue
    PATH=$PATH:$i
    export PATH
done
if test `id -u` -eq 0 ; then
    for i in /opt/*/sbin /usr/local/sbin ; do
        test -d $i || continue
        echo $PATH | grep -wq "$i" && continue
        PATH=$PATH:$i
        export PATH
    done
fi
for i in /opt/*/lib /usr/local/lib ; do
    test -d $i || continue
    echo $LD_LIBRARY_PATH | grep -wq "$i" && continue
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$i
    export LD_LIBRARY_PATH
done

can be placed as /etc/profile.d/my_local.sh with execute permissions. This will take care of anything installed under /opt/ or /usr/local/. For Debian, the script can be inserted directly into /etc/profile.
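One subtlety of the grep -wq test is that grep counts / as a word boundary, so a directory such as /usr/bin is also considered present when the list contains only /usr/bin/X11. If that matters to you, a case statement that compares whole colon-separated entries is a possible alternative. This is a sketch of my own, not part of the RedHat script, and append_dir is a made-up name.

```shell
# append_dir is a hypothetical helper: append_dir <path-list> <dir>
# It prints the list with <dir> appended unless it is already an entry.
append_dir ()
{
    case ":$1:" in
        *":$2:"*) echo "$1" ;;          # already present as a whole entry
        *)        echo "$1:$2" ;;
    esac
}

append_dir /bin:/usr/bin /usr/local/bin     # /bin:/usr/bin:/usr/local/bin
append_dir /bin:/usr/bin /usr/bin           # /bin:/usr/bin
append_dir /bin:/usr/bin/X11 /usr/bin       # /bin:/usr/bin/X11:/usr/bin
```

In a startup script you would use it as PATH=`append_dir "$PATH" "$i"`. The sketch assumes that the list is not empty to begin with.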

Page [*] of Section 23.3 contains details of exactly what LD_LIBRARY_PATH is.

(Unrelated, but you should also edit your /etc/man.config to add man page paths that appear under all installation trees under /opt/.)

20.9 File Locking

Often, one would like a process to have exclusive access to a file. By this we mean that only one process can access the file at any one time. Consider a mail folder: if two processes were to write to the folder simultaneously, it could become corrupted. We also sometimes want to ensure that a program can never be run twice at the same time; this insurance is another use for ``locking.''

In the case of a mail folder, if the file is being written to, then no other process should try to read it or write to it: and we would like to create a write lock on the file. However if the file is being read from, no other process should try to write to it: and we would like to create a read lock on the file. Write locks are sometimes called exclusive locks; read locks are sometimes called shared locks. Often, exclusive locks are preferred for simplicity.

Locking can be implemented by simply creating a temporary file to indicate to other processes to wait before trying some kind of access. UNIX also has some more sophisticated builtin functions.

20.9.1 Locking a mailbox file

There are currently four methods of file locking. [The exim sources seem to indicate thorough research in this area, so this is what I am going on.]

1. ``dot lock'' file locking. Here, a temporary file is created with the same name as the mail folder and the extension .lock added. So long as this file exists, no program should try to access the folder. This is an exclusive lock only. It is easy to write a shell script to do this kind of file locking.
2. ``MBX'' file locking. Similar to 1, but a temporary file is created in /tmp. This is also an exclusive lock.
3. fcntl locking. Databases require areas of a file to be locked. fcntl is a system call to be used inside C programs.
4. flock file locking. Same as fcntl, but locks whole files.

The following shell function does proper mailbox file locking.

function my_lockfile ()
{
        TEMPFILE="$1.$$"
        LOCKFILE="$1.lock"
        echo $$ > $TEMPFILE 2>/dev/null || {
                echo "You don't have permission to access `dirname $TEMPFILE`"
                return 1
        }
        ln $TEMPFILE $LOCKFILE 2>/dev/null && {
                rm -f $TEMPFILE
                return 0
        }
        STALE_PID=`< $LOCKFILE`
        test "$STALE_PID" -gt "0" >/dev/null || {
                return 1
        }
        kill -0 $STALE_PID 2>/dev/null && {
                rm -f $TEMPFILE
                return 1
        }
        rm $LOCKFILE 2>/dev/null && {
            echo "Removed stale lock file of process $STALE_PID"
        }
        ln $TEMPFILE $LOCKFILE 2>/dev/null && {
                rm -f $TEMPFILE
                return 0
        }
        rm -f $TEMPFILE
        return 1
}

(Note how instead of `cat $LOCKFILE`, we use `< $LOCKFILE`, which is faster.)

You can include the above function in scripts that need to lock any kind of file. Use the function as follows:

# wait for a lock
until my_lockfile /etc/passwd ; do
        sleep 1
done
# The body of the program might go here
# [...]
# Then to remove the lock,
rm -f /etc/passwd.lock

This script is of academic interest only but has a couple of interesting features. Note how the ln function is used to ensure ``exclusivity.'' ln is one of the few UNIX functions that is atomic, meaning that only one link of the same name can exist, and its creation excludes the possibility that another program would think that it had successfully created the same link. One might naively expect that the program

function my_lockfile ()
{
        LOCKFILE="$1.lock"
        test -e $LOCKFILE && return 1
        touch $LOCKFILE
        return 0
}

is sufficient for file locking. However, consider if two programs, running simultaneously, executed line 4 (the test -e check) at the same time. Both would think that the lock did not exist and proceed to line 5 (the touch). Then both would successfully create the lock file--not what you wanted.

The kill command is then useful for checking whether a process is running. Sending the 0 signal does nothing to the process, but the signal fails if the process does not exist. This technique can be used to remove a lock of a process that died before removing the lock itself: that is, a stale lock.
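The kill -0 test can be tried in isolation. In this sketch, pid_alive and DEAD_PID are names of my own choosing; the background process exits at once and is reaped with wait, so its process ID should no longer refer to anything.

```shell
# pid_alive is a hypothetical helper: it succeeds if a signal could be
# sent to the given process ID.
pid_alive ()
{
    kill -0 "$1" 2>/dev/null
}

true &                  # a background process that exits at once
DEAD_PID=$!
wait $DEAD_PID          # reap it, so the PID no longer refers to anything

pid_alive $$        && echo "this shell ($$) is running"
pid_alive $DEAD_PID || echo "process $DEAD_PID is gone: its lock would be stale"
```

Note that kill -0 also fails when the process exists but belongs to another user, so for an ordinary user a failure does not always mean that the lock is stale.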

20.9.2 Locking over NFS

The preceding script does not work if your file system is mounted over NFS (network file system--see Chapter 28). This is obvious because the script relies on the PID of the process, which is not visible across different machines. Not so obvious is that the ln function does not work exactly right over NFS--you need to stat the file and actually check that the link count has increased to 2.

The commands lockfile (from the procmail package) and mutt_dotlock (from the mutt email reader but perhaps not distributed) do similar file locking. These commands, however, do not store the PID in the lock file. Hence it is not possible to detect a stale lock file. For example, to search your mailbox, you can run:

lockfile /var/spool/mail/mary.lock
grep freddy /var/spool/mail/mary
rm -f /var/spool/mail/mary.lock

This sequence ensures that you are searching a clean mailbox even if /var is a remote NFS share.

20.9.3 Directory versus file locking

File locking is a headache for the developer. The problem with UNIX is that whereas we are intuitively thinking about locking a file, what we really mean is locking a file name within a directory. File locking per se should only be used on perpetual files, such as database files. For mailbox and passwd files we need directory locking [My own term.], meaning the exclusive access of one process to a particular directory entry. In my opinion, lack of such a feature is a serious deficiency in UNIX, but because it will require kernel, NFS, and (possibly) C library extensions, will probably not come into being any time soon.

20.9.4 Locking inside C programs

This topic is certainly outside of the scope of this text, except to say that you should consult the source code of reputable packages rather than invent your own locking scheme.
