25 February 2006

Unix Utilities

(This is a subheading of Unix-3)

In my post on Unix shells, I mentioned that Unix is an extremely powerful and versatile operating system. I say this because Unix has historically come with an unusually rich set of utilities. This is mainly because the various flavors of Unix have been used heavily for software development, where there are a great many files, directories, and data streams to manage.

Many of these are the programs listed under the Tools and Utilities category of important Unix shell commands. A large number of them are search tools, such as grep (which searches file contents for a pattern), look (which finds lines in a sorted file that begin with a given string), and find (which searches a directory tree for files). Then there are the shell utilities that interpret a small language of their own: awk, sed, dc, and bc. (Echo, covered further on, is simpler: it just prints its arguments.) The point of an interpreted language is that your script is not compiled into a standalone executable; the interpreter is a program that reads your script as input and carries out its instructions one at a time. When the script contains an error, the interpreter can usually report the problem and stop, which makes the error easier to find.
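
To make the search tools a little more concrete, here is a minimal sketch of grep and find at work. The directory, file names, and patterns are made up for illustration; only the two commands themselves come from the list above.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
mkdir "$workdir/notes"
printf 'alpha\nbeta\ngamma\n' > "$workdir/notes/words.txt"
printf 'delta\n' > "$workdir/notes/other.log"

# grep prints the lines of a file that match a pattern.
grep_hits=$(grep 'eta' "$workdir/notes/words.txt")
echo "$grep_hits"

# find walks a directory tree and prints the paths that pass its tests.
find_hits=$(find "$workdir" -type f -name '*.txt')
echo "$find_hits"

rm -rf "$workdir"
```

Here grep looks inside a file, while find looks at names and attributes of the files themselves; the two are often combined.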

Sed (the "stream editor") is used for editing text. Basically it works like this: you invoke sed on the command line and give it one or more sed commands, such as s (substitute), along with the command's parameters (e.g., "address" is replaced with "10 Downing Street") and the location of the input and output (e.g., the file names).
           $ sed 's/address/10 Downing Street/' <old >new

(The quotes are good practice in general; here they are essential, because the replacement text contains spaces.) Sed allows you to use regular expressions and to combine multiple commands (-e):
           $ sed -e 's/c/C/' -e 's/y/Y/' <old >new
It's interesting to note that sed is regarded as a utility, and yet it (like all the other shell utilities) is used entirely within the shell from which it is invoked. Of course, to a Unix user this is a trivial remark. But to someone accustomed to programs that open and run in their own window, this is a fairly novel concept. Moreover, sed is likely to be invoked from a script, so the programs one writes can use sed seamlessly.
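
As a rough sketch of sed being used from inside a script, the following runs both kinds of command against a small sample file. The file contents and the names old and new are placeholders of my own, not anything sed requires.

```shell
workdir=$(mktemp -d)
printf 'Send mail to address\ncity and country\n' > "$workdir/old"

# One substitution; the quotes keep the spaces inside a single argument.
sed 's/address/10 Downing Street/' < "$workdir/old" > "$workdir/new"
replaced=$(cat "$workdir/new")
echo "$replaced"

# Two commands chained with -e; each s changes only the first match on a line.
chained=$(sed -e 's/c/C/' -e 's/y/Y/' < "$workdir/old")
echo "$chained"

rm -rf "$workdir"
```

Note that sed never alters its input file here; the edited text goes to standard output, which the script redirects or captures as it likes.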

Echo is a very simple utility; it writes its arguments to the standard output. The trivial example is
           $ echo good morning
           good morning
Understandably, this might seem a bit silly. But echo is also a handy way to display the value of a variable, which the shell substitutes before echo runs:
           $ echo $PATH
This is useful in scripts, especially when echo is combined with "escape sequences" that format the output (e.g., insert a tab or a backspace). Scripts use echo for interactive prompts, in which the program asks the user for the data it needs to do its job. Also, instructors frequently pipe echo into some other command to demonstrate what the other command does; for example, the author linked types "echo "abc 123" | sed [...]" and the output shows what sed does to "abc 123."
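
The uses of echo described above can be sketched as follows. Two things here are my own assumptions: the sed expression is a made-up stand-in for the elided one, and printf is swapped in for the tab because the way echo treats escape sequences differs from shell to shell.

```shell
# echo writes its arguments, separated by single spaces, plus a newline.
greeting=$(echo good morning)
echo "$greeting"

# The shell expands $HOME before echo runs; echo only ever sees the value.
echo "Home directory: $HOME"

# Escape handling in echo varies between shells, so printf stands in
# here for the tab (an assumption of this sketch, not the original text).
printf 'name:\tvalue\n'

# echo feeding another command, in the style of a tutorial demonstration.
# The substitution is hypothetical; any sed command could go here.
demo=$(echo "abc 123" | sed 's/abc/xyz/')
echo "$demo"
```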

The utilities dc and bc are both calculators; dc stands for "desk calculator," bc for "basic calculator." The big difference between the two is notation: bc uses conventional infix notation (e.g., "1+2" returns 3), which makes it somewhat simpler to use, while dc uses Reverse Polish Notation (e.g., "1 2 + p" prints 3). The advantage of RPN is that complex formulae can be entered without resorting to parentheses; the order of the operands and operators fixes the order in which the math operations are performed.
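
Here is a small sketch comparing the two notations on the same sums. It assumes dc and bc are both installed, which is not guaranteed on a minimal system, so the script checks first; the variable names are my own.

```shell
if command -v dc >/dev/null 2>&1 && command -v bc >/dev/null 2>&1; then
  # dc: operands first, then the operator; p prints the top of the stack.
  dc_sum=$(echo '1 2 + p' | dc)
  # bc: the same sum in infix notation.
  bc_sum=$(echo '1+2' | bc)

  # (1+2)*4 needs parentheses in bc; in dc the ordering alone does the job.
  bc_expr=$(echo '(1+2)*4' | bc)
  dc_expr=$(echo '1 2 + 4 * p' | dc)

  echo "dc: $dc_sum and $dc_expr"
  echo "bc: $bc_sum and $bc_expr"
else
  echo "dc and bc are both needed for this sketch" >&2
fi
```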

While this list is woefully incomplete, no list would be complete without mentioning make. This is a tool that keeps track of dependencies among the constituent files of a program, so that a lot of the routine steps of program development are automated. First, the programmer must create a makefile that records the dependencies among the program's files (sources, object modules, assembler files, and so on) and the commands that build each one. When the make command is issued, only the files that are out of date with respect to their dependencies are rebuilt. The utility is very flexible; for example, one can define a variable such as SOURCES to hold a whole list of files.
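
To show the dependency idea in miniature, here is a sketch that writes a one-rule makefile and asks make whether the target is up to date before and after building it. The file names in.txt and out.txt and the tr command are placeholders of mine; a real makefile would list compiler commands instead. The rule is written on a single line with a semicolon, which avoids the tab that multi-line rules require.

```shell
if command -v make >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  printf 'hello\n' > "$workdir/in.txt"

  # One rule: out.txt depends on in.txt and is built by the command
  # that follows the semicolon.
  cat > "$workdir/Makefile" <<'EOF'
out.txt: in.txt ; tr a-z A-Z < in.txt > out.txt
EOF

  # make -q only asks the question: exit status 0 means "up to date".
  if (cd "$workdir" && make -q); then before=current; else before=stale; fi
  (cd "$workdir" && make)
  if (cd "$workdir" && make -q); then after=current; else after=stale; fi

  result=$(cat "$workdir/out.txt")
  echo "before: $before, after: $after, out.txt: $result"
  rm -rf "$workdir"
else
  echo "make is not installed here" >&2
fi
```

The first query reports the target as stale because out.txt does not exist yet; once make has run the command, the second query finds nothing left to do.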

Further automation came with makedepend, which scans source files and generates the dependency lines for a makefile.
RESOURCES: The Art of Unix Programming, Eric Raymond; esp. "make: Automating Your Recipes"; "Sed - An Introduction and Tutorial," Bruce Barnett (2007); "Using Unix 'desk calculator'"; "bc - The shell maestro’s calculator," Pandu Rao (2007) & "GNU bc," CyrekSoft; "Unix Manual Page for awk," "bc," "dc," "look," "sed," "ufsrestore," "vi," & "yacc"; "UNIX Shell Scripting Universe," Richard Reepe.

ONLINE TUTORIALS: Unix tutorial, Stanford University School of Earth Sciences, USA; "Unix for Web developers," eXtropia tutorials; "UNIX Tutorial for Beginners," University of Surrey, Guildford, UK; "Unix System Administration Independent Learning," USAIL.

BOOKS: UNIX: the Complete Reference, by Kenneth Rosen, Douglas Host, James Farber, & Richard Rosinski—Tata McGraw-Hill edition 2002
