This post is meant to cover some of the subtleties of learning shell programming. TLDP's Bash Beginner's Guide and TLDP's Advanced Bash Scripting Guide are really worth a read, but I'll point out some highlights that every data-munging NLP practitioner should know.
Variable Evaluation
Options
Options change how bash behaves when executing a script. You can set bash options either when invoking bash or during script execution with "set".
My favorite set of flags is:
set -e # Stop on non-zero exit codes
set -o pipefail # Make a pipeline return non-zero if any program in it fails (so -e stops there too)
set -x # Print each command to stderr as it is being executed
Alternatively, "set -v" prints each line of the script as it is read, before executing it (as opposed to -x, which shows each command after variable substitution and with pipes decomposed into their individual commands).
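To make the pipefail behavior concrete, here is a minimal sketch (the variable names are mine, purely for illustration) that runs the same failing pipeline with the option off and then on:

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline's exit status is that of its LAST command,
# so the failure of `false` below goes unnoticed.
set +o pipefail
without_pipefail=0
false | true || without_pipefail=$?

# With pipefail, the pipeline reports the rightmost non-zero exit status.
set -o pipefail
with_pipefail=0
false | true || with_pipefail=$?

echo "without pipefail: $without_pipefail, with pipefail: $with_pipefail"

# Restore the default so later commands are unaffected.
set +o pipefail
```

The "|| var=\$?" idiom is only there so the sketch can record the status; in a real script running under "set -e", the second pipeline would simply stop the script.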
For a full rundown of options, see http://tldp.org/LDP/abs/html/options.html.
Traps
Now that you know how to make sure your scripts don't keep charging forward after a fatal error, you might be wondering "What if I need to do some cleanup before exiting? What if I want to print an error message before exiting?" Other languages handle this with exception handling; bash handles it with traps.
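As a rough sketch of the cleanup case, assuming a hypothetical scratch directory and a hypothetical cleanup function, an EXIT trap can remove temporary files no matter how the script (here, a subshell standing in for the script) ends:

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory, created only for this illustration.
workdir=$(mktemp -d)

cleanup() {
    # Runs whenever the shell that set the trap exits, on success or failure.
    rm -rf "$workdir"
    echo "cleaned up $workdir" >&2
}

status=0
(
    # Setting the trap inside the subshell means it fires when the subshell exits.
    trap cleanup EXIT
    echo "partial results" > "$workdir/output.txt"
    echo "something went wrong" >&2
    exit 3    # simulated fatal error
) || status=$?

echo "subshell exited with status $status"
```

The trap does not change the exit status: the subshell still reports 3, but the scratch directory is gone by the time the caller sees it.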
Have a look here: http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_12_02.html.
Heredocs
Ever wanted to generate a script from within a script? Send a long string of commands over ssh within a script? Hard-code an entire document within a script? Heredocs are the answer.
You can also use <<-EOF (notice the dash before the limit string) to indicate that leading tabs (tabs only, not spaces) should be stripped from each line of the heredoc, or <<"EOF" to indicate that variable substitution should not be performed inside the heredoc.
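Here is a small sketch contrasting the unquoted and quoted limit strings. The variable names are made up for the example:

```shell
#!/usr/bin/env bash
name="world"

# Unquoted limit string: $name is substituted inside the heredoc.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted limit string: the body is passed through literally.
literal=$(cat <<"EOF"
hello $name
EOF
)

echo "$expanded"
echo "$literal"
```

The first echo prints the substituted text, while the second prints the dollar sign and variable name untouched, which is what you want when the heredoc is itself a script to be run later or on another machine.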
The full story is here: http://tldp.org/LDP/abs/html/here-docs.html
Process Substitution and Named Pipes
Heredocs are for the simple case when you just want to write some data to a process's stdin. What if the tool you want to use takes a filename? Or multiple filenames? But you don't have a file. Process substitution to the rescue. It connects the stdout of a process to a file descriptor and hands the tool a path to that descriptor, like so: cat <(yes). Named pipes (or FIFOs) generalize this concept by letting you choose a filename on disk yourself instead of being handed a file descriptor path.
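A minimal sketch of both ideas, using made-up data and a hypothetical temporary path: comm insists on two filenames, so process substitution supplies them, and then a named pipe created with mkfifo carries data between a background writer and a reader.

```shell
#!/usr/bin/env bash
# Process substitution: comm wants two (sorted) files; <(...) provides them
# without anything being written to disk.
common=$(comm -12 <(printf 'a\nb\nc\n') <(printf 'b\nc\nd\n'))
echo "lines in both inputs:"
echo "$common"

# Named pipe: same idea, but we pick the filename ourselves.
fifo_dir=$(mktemp -d)          # hypothetical scratch location
fifo="$fifo_dir/pipe"
mkfifo "$fifo"

# The writer blocks until a reader opens the pipe, so run it in the background.
printf 'one\ntwo\nthree\n' > "$fifo" &
writer_pid=$!

# tr strips the padding some wc implementations add.
line_count=$(wc -l < "$fifo" | tr -d ' ')
wait "$writer_pid"
rm -rf "$fifo_dir"

echo "lines read through the named pipe: $line_count"
```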
For process substitution, see http://tldp.org/LDP/abs/html/process-sub.html and for named pipes: http://tldp.org/LDP/abs/html/extmisc.html#NAMEDPIPEREF.
Self-extracting scripts (A Fun Parlor Trick)
Some example use cases: 1) You have a tarball that you want to install, but you want to include an installer script AND you only want to distribute a single shell script. 2)
Note that you might want to extract the file to /dev/shm (shared memory) rather than putting it on disk, since writing the file to disk would add unnecessary startup time.
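One common way to build such a script, sketched here with hypothetical file names and a marker line of my own choosing, is to append the tarball to a small stub that finds the marker in its own file and streams everything after it into tar:

```shell
#!/usr/bin/env bash
# Build a tiny payload to ship (all names here are hypothetical).
build_dir=$(mktemp -d)
mkdir "$build_dir/payload"
echo "hello from the payload" > "$build_dir/payload/greeting.txt"
tar -czf "$build_dir/payload.tar.gz" -C "$build_dir" payload

# Write the stub. The quoted limit string keeps $0, $1, etc. literal.
cat > "$build_dir/installer.sh" <<'STUB'
#!/usr/bin/env bash
# Extract into the directory given as $1 (default: current directory).
dest=${1:-.}
# Find the line just after the marker, then stream the rest of this file to tar.
start=$(awk '/^__ARCHIVE_BELOW__$/ { print NR + 1; exit }' "$0")
tail -n +"$start" "$0" | tar -xzf - -C "$dest"
# Exit before bash tries to interpret the binary data below.
exit 0
__ARCHIVE_BELOW__
STUB

# Append the tarball after the marker line.
cat "$build_dir/payload.tar.gz" >> "$build_dir/installer.sh"

# Try it out: extract into a fresh directory and read the file back.
out_dir=$(mktemp -d)
bash "$build_dir/installer.sh" "$out_dir"
extracted=$(cat "$out_dir/payload/greeting.txt")
echo "$extracted"
```

To follow the shared-memory suggestion, you could pass a directory under /dev/shm as the destination argument on systems that provide it.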
For the full scoop, have a look at http://www.linuxjournal.com/node/1005818.
Other bash necessities
Passwordless SSH key forwarding, awk scripting, and