
3.10. External Filters, Programs and Commands

This is a descriptive listing of standard UNIX commands useful in shell scripts. The power of scripts comes from coupling system commands and shell directives with simple programming constructs.

3.10.1. Basic Commands

echo

prints (to stdout) an expression or variable (see Example 3-5).

echo Hello
echo $a

Normally, each echo command prints a terminal newline, but the -n option suppresses this.

ls

The basic file "list" command. It is all too easy to underestimate the power of this humble command. For example, using the -R, recursive option, ls provides a tree-like listing of a directory structure.

Example 3-46. Using ls to create a table of contents for burning a CDR disk

#!/bin/bash

# Script to automate burning a CDR.

# Uses Joerg Schilling's "cdrecord" package
# (http://www.fokus.gmd.de/nthp/employees/schilling/cdrecord.html)

# If this script is invoked by an ordinary user, cdrecord needs the suid bit set
# (chmod u+s /usr/bin/cdrecord, as root).

if [ -z "$1" ]   # Quoting prevents an error if $1 contains whitespace.
then
  IMAGE_DIRECTORY=/opt
# Default directory, if not specified on command line.
else
    IMAGE_DIRECTORY=$1
fi
    
ls -lRF $IMAGE_DIRECTORY > $IMAGE_DIRECTORY/contents
# The "l" option gives a "long" file listing.
# The "R" option makes the listing recursive.
# The "F" option marks the file types (directories suffixed by a /).
echo "Creating table of contents."

mkisofs -r -o cdimage.iso $IMAGE_DIRECTORY
echo "Creating ISO9660 file system image (cdimage.iso)."

cdrecord -v -isosize speed=2 dev=0,0 cdimage.iso
# Change speed parameter to speed of your burner.
echo "Burning the disk."
echo "Please be patient, this will take a while."

exit 0

cat, tac

cat, an acronym for concatenate, lists a file to stdout. When combined with redirection (> or >>), it is commonly used to concatenate files.

cat filename
cat file.1 file.2 file.3 > file.123
The -n option to cat inserts consecutive numbers before each line of the target file(s).

tac is the inverse of cat, listing a file backwards from its end.

rev

reverses each line of a file, and outputs to stdout. This is not the same effect as tac, as it preserves the order of the lines, but flips each one around.

bash$ cat file1.txt
This is line 1.
This is line 2.


bash$ tac file1.txt
This is line 2.
This is line 1.


bash$ rev file1.txt
.1 enil si sihT
.2 enil si sihT

cd

The familiar cd change directory command finds use in scripts where execution of a command requires being in a specified directory.

(cd /source/directory && tar cf - . ) | (cd /dest/directory && tar xvfp -)
[from the previously cited example by Alan Cox]

cp

This is the file copy command. cp file1 file2 copies file1 to file2, overwriting file2 if it already exists (see Example 3-49).

mv

This is the file move command. It is equivalent to a combination of cp and rm. It may be used to move multiple files to a directory. For some examples of using mv in a script, see Example 3-7 and Example A-2.

rm

Delete (remove) a file or files. The -f option forces removal of even read-only files.

Warning

When used with the recursive flag -r, this command removes files all the way down the directory tree.

rmdir

Remove directory. The directory must be empty of all files, including dotfiles, for this command to succeed.

mkdir

Make directory, creates a new directory. mkdir -p project/programs/December creates the named directory. The -p option automatically creates any necessary parent directories.

chmod

Changes the attributes of an existing file (see Example 3-51).

chmod +x filename
# Makes "filename" executable for all users.

chmod 644 filename
# Makes "filename" readable/writable to owner, readable to
# others
# (octal mode).

chmod 1777 directory-name
# Gives everyone read, write, and execute permission in the directory,
# and also sets the "sticky bit", so that only a file's owner
# (or the directory owner, or root) may delete or rename that file.

chattr

Change file attributes. This is analogous to chmod above, but with a different invocation syntax, and it works only on an ext2-type filesystem.

ln

Creates links to pre-existing files. Most often used with the -s, symbolic or "soft" link flag. This permits referencing the linked file by more than one name and is a superior alternative to aliasing (see Example 3-21).

ln -s oldfile newfile links the previously existing oldfile to the newly created link, newfile.

3.10.2. Complex Commands

find

-exec COMMAND \;

Carries out COMMAND on each file that find matches. COMMAND terminates with \; (the ; is escaped so the shell passes it to find literally, since it concludes the command sequence). If COMMAND contains {}, then find substitutes the full path name of the selected file for "{}".

Example 3-47. Badname, eliminate file names in current directory containing bad characters and white space.

#!/bin/bash

# Delete filenames in current directory containing bad characters.

for filename in *
do
badname=`echo "$filename" | sed -n /[\+\{\;\"\\\=\?~\(\)\<\>\&\*\|\$]/p`
# Files containing those nasties:   + { ; " \ = ? ~ ( ) < > & * | $
rm $badname 2>/dev/null
#           So error messages deep-sixed.
done

# Now, take care of files containing all manner of whitespace.
find . -name "* *" -exec rm -f {} \;
# The path name of the file that "find" finds replaces the "{}".
# The '\' ensures that the ';' is interpreted literally, as end of command.

exit 0

See the man page for find for more detail.

xargs

A filter for feeding arguments to a command, and also a tool for assembling the commands themselves. It breaks a data stream into small enough chunks for filters and commands to process. Consider it a powerful replacement for backquotes. In situations where backquotes fail with a too many arguments error, substituting xargs often works. Normally, xargs reads from stdin or from a pipe, but it can also be given the contents of a file.

ls | xargs -p -l gzip gzips every file in current directory, one at a time, prompting before each operation.

One of the more interesting xargs options is -n XX, which limits the number of arguments passed per command invocation to XX.

ls | xargs -n 8 echo lists the files in the current directory in 8 columns.

Note: The default command for xargs is echo.

Example 3-48. Log file using xargs to monitor system log

#!/bin/bash

# Generates a log file in current directory
# from the tail end of /var/log messages.

# Note: /var/log/messages must be readable by ordinary users
#       if this script is invoked by one (as root: chmod 644 /var/log/messages).

( date; uname -a ) >>logfile
# Time and machine name
echo --------------------------------------------------------------------- >>logfile
tail -5 /var/log/messages | xargs |  fmt -s >>logfile
echo >>logfile
echo >>logfile

exit 0

Example 3-49. copydir, copying files in current directory to another, using xargs

#!/bin/bash

# Copy (verbose) all files in current directory
# to directory specified on command line.

if [ -z "$1" ]
# Exit if no argument given.
then
  echo "Usage: `basename $0` directory-to-copy-to"
  exit 1
fi  

ls . | xargs -i -t cp ./{} $1
# This is the exact equivalent of
# cp * $1

exit 0

eval arg1 [arg2] ... [argN]

Combines the arguments in a list into a single command and executes it (useful for code generation within a script).

Example 3-50. Showing the effect of eval

#!/bin/bash

y=`eval ls -l`
echo $y

y=`eval df`
echo $y
# Note that the linefeeds are not preserved, since $y is unquoted in the echo.

exit 0

Example 3-51. Forcing a log-off

#!/bin/bash

y=`eval ps ax | sed -n '/ppp/p' | awk '{ print $1 }'`
# Finding the process number of 'ppp'

kill -9 $y
# Killing it


# Restore to previous state...

chmod 666 /dev/ttyS3
# Doing a SIGKILL on ppp changes the permissions
# on the serial port. Must be restored.

rm /var/lock/LCK..ttyS3
# Remove the serial port lock file.

exit 0

expr arg1 operation arg2 ...

All-purpose expression evaluator: Concatenates and evaluates the arguments according to the operation given (arguments must be separated by spaces). Operations may be arithmetic, comparison, string, or logical.

expr 3 + 5

returns 8

expr 5 % 3

returns 2

y=`expr $y + 1`

incrementing variable, same as let y=y+1 and y=$(($y+1)), as discussed elsewhere

z=`expr substr $string $position $length`

extracts a substring of $length characters from $string, starting at $position

Note that external programs, such as sed and Perl, have far superior string parsing facilities, and it might well be advisable to use them instead of the built-in bash ones.

Example 3-52. Using expr

#!/bin/bash

# Demonstrating some of the uses of 'expr'
# +++++++++++++++++++++++++++++++++++++++

echo

# Arithmetic Operators

echo Arithmetic Operators
echo
a=`expr 5 + 3`
echo 5 + 3 = $a

a=`expr $a + 1`
echo
echo a + 1 = $a
echo \(incrementing a variable\)

a=`expr 5 % 3`
# modulo
echo
echo 5 mod 3 = $a

echo
echo

# Logical Operators

echo Logical Operators
echo

a=3
echo a = $a
b=`expr $a \> 10`
echo 'b=`expr $a \> 10`, therefore...'
echo "If a > 10, b = 0 (false)"
echo b = $b

b=`expr $a \< 10`
echo "If a < 10, b = 1 (true)"
echo b = $b


echo
echo

# Comparison Operators

echo Comparison Operators
echo
a=zipper
echo a is $a
if [ `expr $a = snap` -eq 0 ]
# "expr" prints 1 if the two strings are equal, 0 otherwise.
then
   echo "a is not snap"
fi

echo
echo



# String Operators

echo String Operators
echo

a=1234zipper43231
echo The string being operated upon is $a.

# index: position of substring
b=`expr index $a 23`
echo Numerical position of first 23 in $a is $b.

# substr: print substring, starting position & length specified
b=`expr substr $a 2 6`
echo Substring of $a, starting at position 2 and 6 chars long is $b.

# length: length of string
b=`expr length $a`
echo Length of $a is $b.

# 'match' operates similarly to 'grep'
b=`expr match $a [0-9]*`
echo Number of digits at the beginning of $a is $b.
b=`expr match $a '\([0-9]*\)'`
echo The digits at the beginning of $a are $b.

echo

exit 0

Note that : can substitute for match. b=`expr $a : [0-9]*` is an exact equivalent of b=`expr match $a [0-9]*` in the above example.

let

The let command carries out arithmetic operations on variables. In many cases, it functions as a less complex version of expr.

Example 3-53. Letting let do some arithmetic.

#!/bin/bash

echo

let a=11
# Same as 'a=11'
let a=a+5
# Equivalent to let "a = a + 5"
# (the double quotes and spaces make it more readable)
echo "a = $a"
let "a <<= 3"
# Equivalent of let "a = a << 3"
echo "a left-shifted 3 places = $a"

let "a /= 4"
# Equivalent to let "a = a / 4"
echo $a
let "a -= 5"
# Equivalent to let "a = a - 5"
echo $a
let "a = a * 10"
echo $a
let "a %= 8"
echo $a

exit 0

3.10.3. Time / Date Commands

date

Simply invoked, date prints the date and time to stdout. Where this command gets interesting is in its formatting and parsing options.

Example 3-54. Using date

#!/bin/bash

#Using the 'date' command

# Needs a leading '+' to invoke formatting.

echo "The number of days since the year's beginning is `date +%j`."
# %j gives day of year.


echo "The number of seconds elapsed since 01/01/1970 is `date +%s`."
# %s yields number of seconds since "UNIX epoch" began,
# but how is this useful?

prefix=temp
suffix=`eval date +%s`
filename=$prefix.$suffix
echo $filename
# It's great for creating "unique" temp filenames,
# even better than using $$.

# Read the 'date' man page for more formatting options.

exit 0

time

Outputs very verbose timing statistics for executing a command.

time ls -l / gives something like this:

0.00user 0.01system 0:00.05elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (149major+27minor)pagefaults 0swaps

See also the very similar times command in the previous section.

touch

Utility for updating access/modification times of a file to current system time or other specified time, but also useful for creating a new file. The command touch zzz will create a new file of zero length, named zzz, assuming that zzz did not previously exist. Time-stamping empty files in this way is useful for storing date information, for example in keeping track of modification times on a project. See Example 3-11.
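
As a quick illustration (a sketch, not taken from the original text; "timestamp-marker" and "datafile" are placeholder names), touch can maintain a marker file against which other files are compared:

touch timestamp-marker
# Creates "timestamp-marker" as a zero-length file if it does not exist,
# otherwise updates its modification time to the present.

[ datafile -nt timestamp-marker ] && echo "datafile modified since the marker was last touched."
# The -nt test succeeds if the first file is newer than the second.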

at

The at job control command executes a given set of commands at a specified time. This is a user version of cron.

at 2pm January 15 prompts for a set of commands to execute at that time. These commands may include executable shell scripts.

Using either the -f option or input redirection (<), at reads a command list from a file. This file can include shell scripts, though they should, of course, be noninteractive.

bash$ at 2:30 am Friday < at-jobs.list
job 2 at 2000-10-27 02:30
	      

batch

The batch job control command is similar to at, but it runs a command list when the system load drops below .8. Like at, it can read commands from a file with the -f option.

cal

Prints a neatly formatted monthly calendar to stdout. Will do current year or a large range of past and future years.

sleep

This is the shell equivalent of a wait loop. It pauses for a specified number of seconds, doing nothing. This can be useful for timing or in processes running in the background, checking for a specific event every so often (see Example 3-101).

sleep 3
# Pauses 3 seconds.

3.10.4. Text Processing Commands

sort

File sorter, often used as a filter in a pipe. See the man page for options.
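
A couple of typical invocations (a brief sketch; "wordlist" is a placeholder file name):

sort -u wordlist
# Sorts "wordlist" and discards duplicate lines (-u).

sort -t: -k3 -n /etc/passwd
# Sorts the password file numerically (-n) on its third
# colon-separated field (-t: -k3), the numeric user ID.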

diff

Simple file comparison utility. This compares the target files line-by-line sequentially. In some applications, such as comparing word dictionaries, it may be helpful to filter the files through sort and uniq before piping them to diff. diff file-1 file-2 outputs the lines in the files that differ, with angle brackets (< and >) showing which file each particular line belongs to. A common use for diff is generating difference files to be used with patch (see below). The -e option outputs files suitable for ed or ex scripts.

patch -p1 <patch-file
# Takes all the changes listed in 'patch-file' and applies them
# to the files referenced therein.

cd /usr/src
gzip -cd patchXX.gz | patch -p0
# Upgrading kernel source using 'patch'.
# From the Linux kernel docs "README",
# by anonymous author (Alan Cox?).

Various fancy frontends for diff are available, such as spiff, wdiff, xdiff, and mgdiff.

comm

Versatile file comparison utility. The files must be sorted for this to be useful.

comm -options first-file second-file

comm file-1 file-2 outputs three columns:

  • column 1 = lines unique to file-1

  • column 2 = lines unique to file-2

  • column 3 = lines common to both.

The options allow suppressing output of one or more columns.

  • -1 suppresses column 1

  • -2 suppresses column 2

  • -3 suppresses column 3

  • -12 suppresses both columns 1 and 2, etc.
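
For instance, to extract only the lines common to two sorted files (a brief sketch, using the file names above):

comm -12 file-1 file-2
# Suppresses columns 1 and 2, leaving only column 3,
# the lines present in both files.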

uniq

This filter removes duplicate lines from a sorted file. It is often seen in a pipe coupled with sort.

cat list-1 list-2 list-3 | sort | uniq > final.list
# Concatenates the list files,
# sorts them,
# removes duplicate lines,
# and finally writes the result to an output file.
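
The -c option prefixes each output line with a count of its occurrences, which makes for a quick frequency table (a sketch; "wordlist" is a placeholder file):

sort wordlist | uniq -c | sort -nr
# Sorts the file, counts duplicate lines (-c),
# then re-sorts numerically in reverse (-nr),
# so the most frequent lines come first.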

expand

A filter that converts tabs to spaces, often seen in a pipe.

cut

A tool for extracting fields from files. It is similar to the print $N command set in awk, but more limited. It may be simpler to use cut in a script than awk. Particularly important are the -d (delimiter) and -f (field specifier) options.

Using cut to obtain a listing of the mounted filesystems:

cat /etc/mtab | cut -d ' ' -f1,2

Using cut to list the OS and kernel version:

uname -a | cut -d" " -f1,3,11,12

cut -d ' ' -f2,3 filename is equivalent to awk '{ print $2, $3 }' filename

colrm

Column removal filter. This removes columns (characters) from a file and writes them, lacking the range of specified columns, back to stdout. colrm 2 4 <filename removes the second through fourth characters from each line of the text file filename.

Warning

If the file contains tabs or nonprintable characters, this may cause unpredictable behavior.

paste

Tool for merging together different files into a single, multi-column file. In combination with cut, useful for creating system log files.
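
For example (a sketch; the file names are placeholders):

paste names.list phone.list > directory.list
# Merges the two files line by line,
# separating corresponding lines with a tab.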

join

Consider this a more flexible version of paste. It works on exactly two files, but permits specifying which fields to paste together, and in which order.
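
A minimal sketch (the data files are placeholders, assumed sorted on their first field):

join 1.data 2.data
# Prints each key that appears in both files once,
# followed by the remaining fields from each file.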

head

lists the first 10 lines of a file to stdout (see Example 3-66).

tail

lists the end of a file to stdout (the default is 10 lines, but this can be changed). Commonly used to keep track of changes to a system logfile, using the -f option, which outputs lines appended to the file.

Example 3-48, Example 3-66, and Example 3-101 show tail in action.

grep

A multi-purpose file search tool that uses regular expressions. Originally a command/filter in the ancient ed line editor, g/re/p, or global - regular expression - print.

grep pattern [file...]

searches the specified file(s) for occurrences of pattern.

ls -l | grep '\.txt' has much the same effect as ls -l *.txt.

The -i option to grep causes a case-insensitive search.

Example 3-101 demonstrates how to use grep to search for a keyword in a system log file.

Example 3-55. Emulating "grep" in a script

#!/bin/bash

# Very crude reimplementation of 'grep'.

if [ -z "$1" ]  # Check for argument to script.
then
  echo "Usage: `basename $0` pattern"
  exit 1
fi  

echo

for file in *     # Traverse all files in $PWD.
do
  output=$(sed -n /"$1"/p $file)  # Command substitution.

  if [ ! -z "$output" ]  # Variable $output must be quoted; otherwise a multi-line result causes an error.
  then
    echo -n "$file: "
    echo $output
  fi

  echo
done  

echo

exit 0

# Exercises for reader:
# -------------------
# 1) Add newlines to output, if more than one match in any given file.
# 2) Add features.

Note: egrep is the same as grep -E. This uses a somewhat different, extended set of regular expressions, which may make the search somewhat more flexible.

Note: fgrep is the same as grep -F. It does a literal string search (no regular expressions), which generally speeds things up quite a bit.

Note: To search compressed files, use zgrep. It also works on non-compressed files, though slower than plain grep. This is handy for searching through a mixed set of files, some of them compressed, some not.

look

The command look works like grep, but does a lookup on a "dictionary", a sorted word list. By default, look searches for a match in /usr/dict/words, but a different dictionary file may be specified.

Example 3-56. Checking words in a list for validity

#!/bin/bash
# lookup:
# Does a dictionary lookup on each word in a data file.

file=words.data  # Data file to read words to test from.

echo

while [ "$word" != end ]  # Last word in data file.
do
  read word   # From data file, because of redirection at end of loop.
  look $word > /dev/null  # Don't want to display lines in dictionary file.
  lookup=$?   # Exit value of 'look'.

  if [ "$lookup" -eq 0 ]
  then
    echo "\"$word\" is valid."
  else
    echo "\"$word\" is invalid."
  fi  

done <$file  # Redirects stdin to $file, so "reads" come from there.

echo

exit 0

sed, awk

Scripting languages especially suited for parsing text files and command output. May be embedded singly or in combination in pipes and shell scripts.

sed

Non-interactive "stream editor", permits using many ex commands in batch mode. It finds many uses in shell scripts. See Appendix B.

awk

Programmable file extractor and formatter, good for manipulating and/or extracting fields (columns) in structured text files. Its syntax is similar to C. See Section B.2.
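
As a brief taste of each (sketches only; "filename", "old", and "new" are placeholders):

sed 's/old/new/g' filename
# Replaces every occurrence of "old" with "new",
# writing the edited text to stdout.

awk -F: '{ print $1 }' /etc/passwd
# Prints the first colon-separated field of each line,
# i.e., the login names.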

groff, gs, TeX

Text markup languages. Used for preparing copy for printing or formatted video display.

Man pages use groff (see Example A-1). Ghostscript (gs) is a GPL-ed PostScript interpreter. TeX is Donald Knuth's elaborate typesetting system. It is often convenient to write a shell script encapsulating all the options and arguments passed to one of these markup languages.

wc

wc gives a "word count" on a file or I/O stream:

$ wc /usr/doc/sed-3.02/README
20     127     838 /usr/doc/sed-3.02/README
[20 lines  127 words  838 characters]

wc -w gives only the word count.

wc -l gives only the line count.

wc -c gives only the character count.

wc -L gives only the length of the longest line.

Using wc to count how many .txt files are in current working directory:

$ ls *.txt | wc -l

See Example 3-66 and Example 3-75.

tr

Character translation filter.

Caution

Must use quoting and/or brackets, as appropriate.

tr "A-Z" "*" <filename changes all the uppercase letters in filename to asterisks (writes to stdout).

tr -d '0-9' <filename deletes all digits from the file filename.

Example 3-57. toupper: Transforms a file to all uppercase.

#!/bin/bash

# Changes a file to all uppercase.

if [ -z "$1" ]
# Standard check whether command line arg is present.
then
  echo "Usage: `basename $0` filename"
  exit 1
fi  

tr 'a-z' 'A-Z' < "$1"   # Quoting protects the ranges from filename globbing.

exit 0

Example 3-58. lowercase: Changes all filenames in working directory to lowercase.

#! /bin/bash
#
# Changes every filename in working directory to all lowercase.
#
# Inspired by a script by John DuBois,
# which was translated into bash by Chet Ramey,
# and considerably simplified by Mendel Cooper,
# author of this HOWTO.


for filename in *  #Traverse all files in directory.
do
   fname=`basename "$filename"`
   n=`echo "$fname" | tr A-Z a-z`  #Change name to lowercase.
   if [ "$fname" != "$n" ]  # Rename only files not already lowercase.
   then
     mv "$fname" "$n"
   fi  
done   

exit 0

Example 3-59. rot13: rot13, ultra-weak encryption.

#!/bin/bash
# Classic rot13 algorithm, encryption that might fool a 3-year old.
# Usage: ./rot13.sh filename
# or     ./rot13.sh <filename
# or     ./rot13.sh and supply keyboard input (stdin)

cat "$@" | tr 'a-zA-Z' 'n-za-mN-ZA-M'   # "a" goes to "n", "b" to "o", etc.
# The 'cat "$@"' construction permits getting input either from stdin or from a file.

exit 0

fold

A filter that wraps input lines to a specified width (see Example 3-62).

fmt

Simple-minded file formatter, used as a filter in a pipe to "wrap" long lines of text output (see Example 3-48 and Example 3-62).

ptx

The ptx [targetfile] command outputs a permuted index (cross-reference list) of the targetfile. This may be further filtered and formatted in a pipe, if necessary.

column

Column formatter. This filter transforms list-type text output into a "pretty-printed" table by inserting tabs at appropriate places.

Example 3-60. Using column to format a directory listing

#!/bin/bash
# This is a slight modification of the example file in the "column" man page.


(printf "PERMISSIONS LINKS OWNER GROUP SIZE MONTH DAY HH:MM PROG-NAME\n" \
; ls -l | sed 1d) | column -t

# The "sed 1d" in the pipe deletes the first line of output,
# which would be "total        N",
# where "N" is the total number of files found by "ls -l".

# The -t option to "column" pretty-prints a table.

exit 0

nl

Line numbering filter. nl filename lists filename to stdout, but inserts consecutive numbers at the beginning of each non-blank line. If filename is omitted, it operates on stdin.

Example 3-61. nl: A self-numbering script.

#!/bin/bash

# This file echoes itself twice to stdout with its lines numbered.

# 'nl' sees this as line 3 since it does not number blank lines.
# 'cat -n' sees the above line as number 5.

nl `basename $0`

echo; echo  # Now, let's try it with 'cat -n'

cat -n `basename $0`
# The difference is that 'cat -n' numbers the blank lines.

exit 0

pr

Print formatting filter. This will paginate a file (or stdout) into sections suitable for hard copy printing. A particularly useful option is -d, forcing double-spacing.
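
For instance (a sketch; "filename" is a placeholder):

pr -d -l 66 filename | lp
# Paginates "filename" into 66-line pages, double-spaced (-d),
# and sends the result to the print queue.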

Example 3-62. Formatted file listing.

#!/bin/bash

# Get a file listing...

b=`ls /usr/local/bin`

# ...40 columns wide.
echo $b | fmt -w 40

# Could also have been done by
# echo $b | fold - -s -w 40
 
exit 0

printf

The printf, formatted print, command is an enhanced echo. It is a limited variant of the C language printf, and the syntax is somewhat different.

printf format-string... parameter...

See the printf man page for in-depth coverage.

Caution

Older versions of bash may not support printf.

Example 3-63. printf in action

#!/bin/bash

# printf demo

PI=3.14159265358979
DecimalConstant=31373
Message1="Greetings,"
Message2="Earthling."

echo

printf "Pi to 2 decimal places = %1.2f" $PI
echo
printf "Pi to 9 decimal places = %1.9f" $PI
# Note correct round off.

printf "\n"
# Prints a line feed, equivalent to 'echo'.

printf "Constant = \t%d\n" $DecimalConstant
# Insert tab (\t)

printf "%s %s \n" $Message1 $Message2

echo

exit 0

3.10.5. File and Archiving Commands

tar

The standard UNIX archiving utility. Originally a Tape ARchiving program, from whence it derived its name, it has developed into a general purpose package that can handle all manner of archiving with all types of destination devices, ranging from tape drives to regular files to even stdout (see Example 3-4). GNU tar has long since been patched to accept gzip compression options, such as tar czvf archive-name.tar.gz *, which recursively archives and compresses all files (except "dotfiles") in a directory tree.
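
The everyday invocations look something like this ("archive-name.tar.gz" and "project-directory" are placeholder names):

tar czvf archive-name.tar.gz project-directory
# Creates (c) a gzipped (z), verbose (v) archive in the file (f) "archive-name.tar.gz".

tar tzvf archive-name.tar.gz
# Lists the archive contents without extracting.

tar xzvf archive-name.tar.gz
# Extracts the archive into the current directory.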

cpio

This specialized archiving copy command is rarely used any more, having been supplanted by tar/gzip. It still has its uses, such as moving a directory tree.

Example 3-64. Using cpio to move a directory tree

#!/bin/bash

# Copying a directory tree using cpio.

if [ $# -ne 2 ]
then
  echo Usage: `basename $0` source destination
  exit 1
fi  

source=$1
destination=$2

find "$source" -depth | cpio -admvp "$destination"

exit 0

gzip

The standard GNU/UNIX compression utility, replacing the inferior and proprietary compress. The corresponding decompression command is gunzip, which is the equivalent of gzip -d.

The filter zcat decompresses a gzipped file to stdout, as possible input to a pipe or redirection. This is, in effect, a cat command that works on compressed files (including files processed with the older compress utility). See Example 3-14.
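
Typical usage (a sketch; "bigfile" and "pattern" are placeholders):

gzip bigfile
# Compresses "bigfile", replacing it with "bigfile.gz".

zcat bigfile.gz | grep pattern
# Searches the compressed file without unpacking it on disk.

gunzip bigfile.gz
# Restores the original "bigfile".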

bzip2

An alternate compression utility, usually more efficient than gzip, especially on large files. The corresponding decompression command is bunzip2.

sq

Yet another compression utility, a filter that works only on sorted ASCII word lists. It uses the standard invocation syntax for a filter, sq < input-file > output-file. Fast, but not nearly as efficient as gzip. The corresponding uncompression filter is unsq, invoked like sq.

Note: The output of sq may be piped to gzip for further compression.

shar

Shell archiving utility. The files in a shell archive are concatenated without compression, and the resultant archive is essentially a shell script, complete with #!/bin/sh header, and containing all the necessary unarchiving commands. Shar archives still show up in Internet newsgroups, but otherwise shar has been pretty well replaced by tar/gzip. The unshar command unpacks shar archives.

split

Utility for splitting a file into smaller chunks. Usually used for splitting up large files in order to back them up on floppies or preparatory to e-mailing or uploading them.
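
A sketch of splitting a file into floppy-sized pieces and putting it back together (the file and prefix names are placeholders):

split -b 1440k backup.tar.gz backup-part.
# Produces backup-part.aa, backup-part.ab, and so on,
# each at most 1440 kilobytes.

cat backup-part.* > backup.tar.gz
# Reassembles the original file.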

file

A utility for identifying file types. The command file file-name will return a file specification for file-name, such as ASCII text or data. It references the magic numbers found in /usr/share/magic, /etc/magic, or /usr/lib/magic, depending on the Linux/UNIX distribution.

Example 3-65. stripping comments from C program files

#!/bin/bash

# Strips out the comments (/* comment */) in a C program.

NOARGS=1
WRONG_FILE_TYPE=2

if [ $# = 0 ]
then
  echo "Usage: `basename $0` C-program-file" >&2 # Error message to stderr.
  exit $NOARGS
fi  

# Test for correct file type.
type=`eval file $1 | awk '{ print $2, $3, $4, $5 }'`
# "file $1" echoes file type...
# then awk removes the first field of this, the filename...
# then the result is fed into the variable "type".
correct_type="ASCII C program text"

if [ "$type" != "$correct_type" ]
then
  echo
  echo "This script works on C program files only."
  echo
  exit $WRONG_FILE_TYPE
fi  


# Rather cryptic sed script:
#--------
sed '
/^\/\*/d
/.*\/\*/d
' $1
#--------
# Easy to understand if you take several hours to learn sed fundamentals.


# Need to add one more line to the sed script to deal with
# case where line of code has a comment following it on same line.
# This is left as a non-trivial exercise for the reader.

exit 0

uuencode

This utility encodes binary files into ASCII characters, making them suitable for transmission in the body of an e-mail message or in a newsgroup posting.

uudecode

This reverses the encoding, decoding uuencoded files back into the original binaries.
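
A minimal encode/decode round trip (a sketch; "picture.jpg" is a placeholder file):

uuencode picture.jpg picture.jpg > picture.uu
# The second argument is the file name recorded in the header,
# which uudecode uses when recreating the file.

uudecode picture.uu
# Recreates "picture.jpg" from the encoded text.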

Example 3-66. uudecoding encoded files

#!/bin/bash

lines=35
# Allow 35 lines for the header (very generous).

for File in *
# Test all the files in the current working directory...
do
  search1=`head -$lines $File | grep begin | wc -w`
  search2=`tail -$lines $File | grep end | wc -w`
  # Uuencoded files have a "begin" near the beginning, and an "end" near the end.
  if [ $search1 -gt 0 ]
  then
    if [ $search2 -gt 0 ]
    then
      echo "uudecoding - $File -"
      uudecode $File
    fi  
  fi
done  

exit 0

sum, cksum, md5sum

These are utilities for generating checksums. A checksum is a number mathematically calculated from the contents of a file, for the purpose of checking its integrity. A script might refer to a list of checksums for security purposes, such as ensuring that the contents of key system files have not been altered or corrupted.
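
A sketch of recording and later verifying a checksum (the paths are examples only):

md5sum /boot/vmlinuz > vmlinuz.md5
# Saves the checksum, together with the file name, to "vmlinuz.md5".

md5sum -c vmlinuz.md5
# Reports "OK" if the file still matches the saved checksum,
# "FAILED" otherwise.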

strings

Use the strings command to find printable strings in a binary or data file. It will list sequences of printable characters found in the target file. This might be handy for a quick 'n dirty examination of a core dump or for looking at an unknown graphic image file (strings image-file | more might show something like JFIF, which would identify the file as a jpeg graphic). In a script, you would probably parse the output of strings with grep or sed.

more, less

Pagers that display a text file or stream to stdout, one screenful at a time. These may be used to filter the output of a script.

3.10.6. Communications Commands

host

Searches for information about an Internet host by name or IP address, using DNS.

vrfy

Verify an Internet e-mail address.

nslookup

Do an Internet "name server lookup" on a host, by name or by IP address. This may be run either interactively or noninteractively, i.e., from within a script.

dig

Similar to nslookup, do an Internet "name server lookup" on a host. May be run either interactively or noninteractively, i.e., from within a script.

traceroute

Trace the route taken by packets sent to a remote host. This command works within a LAN, WAN, or over the Internet. The remote host may be specified by an IP address. The output of this command may be filtered by grep or sed in a pipe.

rcp

"Remote copy", copies files between two different networked machines. Using rcp and similar utilities with security implications in a shell script may not be advisable. Consider instead, using an expect script.

sx, rx

The sx and rx command set serves to transfer files to and from a remote host using the xmodem protocol. These are generally part of a communications package, such as minicom.

sz, rz

The sz and rz command set serves to transfer files to and from a remote host using the zmodem protocol. Zmodem has certain advantages over xmodem, such as greater transmission rate and resumption of interrupted file transfers. Like sx and rx, these are generally part of a communications package.

uucp

UNIX to UNIX copy. This is a communications package for transferring files between UNIX servers. A shell script is an effective way to handle a uucp command sequence.

Since the advent of the Internet and e-mail, uucp seems to have faded into obscurity, but it still exists and remains perfectly workable in situations where an Internet connection is not available or appropriate.

3.10.7. Miscellaneous Commands

jot, seq

These utilities emit a sequence of integers, with a user-selected increment. This can be used to advantage in a for loop.

Example 3-67. Using seq to generate loop arguments

#!/bin/bash

for a in `seq 80`
# Same as   for a in 1 2 3 4 5 ... 80   (saves much typing!).
# May also use 'jot' (if present on system).
do
  echo -n "$a "
done

echo

exit 0

which

which <command-xxx> gives the full path to "command-xxx". This is useful for finding out whether a particular command or utility is installed on the system.

bash$ which pgp
/usr/bin/pgp

script

This utility records (saves to a file) all the user keystrokes at the command line in a console or an xterm window. This, in effect, creates a record of a session.
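
For example (a sketch; "session.log" is a placeholder file name):

script session.log
# Starts recording the session to "session.log"
# (if no file is given, the default name is "typescript").
# Typing "exit" or Ctrl-D ends the recording.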

lp

The lp and lpr commands send file(s) to the print queue, to be printed as hard copy. [1] These commands trace the origin of their names to the line printers of another era.

bash$ cat file1.txt | lp

It is often useful to pipe the formatted output from pr to lp.

bash$ pr -options file1.txt | lp

Formatting packages, such as groff and Ghostscript, may send their output directly to lp.

bash$ groff -Tascii file.tr | lp

bash$ gs -options | lp file.ps

Related commands are lpq, for viewing the print queue, and lprm, for removing jobs from the print queue.

tee

[UNIX borrows an idea here from the plumbing trade.]

This is a redirection operator, but with a difference. Like the plumber's tee, it permits "siphoning off" the output of a command or commands within a pipe, but without affecting the result. This is useful for saving the output of an ongoing process to a file, perhaps to keep track of it for debugging purposes.

                   tee
                 |------> to file
                 |
  ===============|===============
  command--->----|-operator-->---> result of command(s)
  ===============================
	      

cat listfile* | sort | tee check.file | uniq > result.file
(The file check.file contains the concatenated sorted "listfiles", before the duplicate lines are removed by uniq.)

clear

The clear command simply clears the text screen at the console or in an xterm. The prompt and cursor reappear at the upper lefthand corner of the screen or xterm window. This command may be used either at the command line or in a script. See Example 3-36.

yes

In its default behavior the yes command feeds a continuous string of the character y followed by a line feed to stdout. A control-c terminates the run. A different output string may be specified, as in yes different string, which would continually output different string to stdout. One might well ask the purpose of this. From the command line or in a script, the output of yes can be redirected or piped into a program expecting user input. In effect, this becomes a sort of poor man's version of expect.
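
A sketch of this idea (here "dirname" is a placeholder directory):

yes | rm -r dirname
# Answers "y" to every confirmation prompt,
# giving roughly the same effect as rm -rf dirname.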

mkfifo

This obscure command creates a named pipe, a temporary First-In-First-Out buffer for transferring data between processes. Typically, one process writes to the FIFO, and the other reads from it. See Example A-7.
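
A minimal demonstration (a sketch; "mypipe" is a placeholder name):

mkfifo mypipe
# Creates the named pipe.

ls -l > mypipe &
# One process writes into the pipe in the background...

cat < mypipe
# ...while another reads from it.

rm mypipe
# A named pipe persists as a filesystem entry until removed.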

pathchk

This command checks the validity of a filename. If the filename exceeds the maximum allowable length (255 characters) or one or more of the directories in its path is not searchable, then an error message results. Unfortunately, pathchk does not return a recognizable error code, and it is therefore pretty much useless in a script.

dd

This is the somewhat obscure and much feared "data duplicator" command. It simply copies a file (or stdin/stdout), but with conversions. Possible conversions are ASCII/EBCDIC, upper/lower case, swapping of byte pairs between input and output, and skipping and/or truncating the head or tail of the input file. A dd --help lists the conversion and other options that this powerful utility takes.

The dd command can copy raw data and disk images to and from devices, such as floppies and tape drives. It can even be used to create boot floppies.

dd if=kernel-image of=/dev/fd0H1440

One important use for dd is initializing temporary swap files (see Example 3-97).
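
A rough sketch of that use (the path and size are examples only; mkswap and swapon must be run as root):

dd if=/dev/zero of=/swapfile bs=1024 count=8192
# Creates an 8-megabyte file of zeroes.

mkswap /swapfile
swapon /swapfile
# Formats the file as swap space and enables it.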

Notes

[1]

The print queue is the group of jobs "waiting in line" to be printed.