Skip to content

Instantly share code, notes, and snippets.

@siriniok
Created January 16, 2016 09:14
Show Gist options
  • Save siriniok/5fa92e69f1e6ccc95f1f to your computer and use it in GitHub Desktop.
Save siriniok/5fa92e69f1e6ccc95f1f to your computer and use it in GitHub Desktop.
Better Bash Scripting in 15 Minutes

Source

Better Bash Scripting in 15 Minutes

The tips and tricks below originally appeared as one of Google's "Testing on the Toilet" (TOTT) episodes.
This is a revised and augmented version.

Safer Scripting

I start every bash script with the following prolog:

#!/bin/bash
set -o nounset
set -o errexit

This will take care of two very common errors:

  1. Referencing undefined variables (which default to "")
  2. Ignoring failing commands The two settings also have shorthands ("-u" and "-e") but the longer versions are more readable.

If a failing command is to be tolerated use this idiom:

if ! ; then
echo "failure ignored"
fi

Note that some Linux commands have options which as a side-effect suppress some failures, e.g.
"mkdir -p" and "rm -f".

Also note, that the "errexit" mode, while a valuable first line of defense, does not catch all failures, i.e. under certain circumstances failing commands will go undetected.
(For more info, have a look at this thread.)

A reader suggested the additional use of "set -o pipefail"

Functions

Bash lets you define functions which behave like other commands -- use them liberally; it will give your bash scripts a much needed boost in readability:

ExtractBashComments() {
egrep "^#"
}

cat myscript.sh | ExtractBashComments | wc

comments=$(ExtractBashComments < myscript.sh)

Some more instructive examples:

SumLines() { # iterating over stdin - similar to awk
local sum=0
local line=""
while read line ; do
sum=$((${sum} + ${line}))
done
echo ${sum}
}

SumLines < data_one_number_per_line.txt

log() { # classic logger
local prefix="[$(date +%Y/%m/%d %H:%M:%S)]: "
echo "${prefix} $@" >&2
}

log "INFO" "a message"

Try moving all bash code into functions leaving only global variable/constant definitions and a call to "main" at the top-level.

Variable Annotations

Bash allows for a limited form of variable annotations. The most important ones are:

  • local (for local variables inside a function)
  • readonly (for read-only variables)

a useful idiom: DEFAULT_VAL can be overwritten

with an environment variable of the same name

readonly DEFAULT_VAL=${DEFAULT_VAL:-7}

myfunc() {

initialize a local variable with the global default

local some_var=${DEFAULT_VAL}
...
}

Note that it is possible to make a variable read-only that wasn't before:

x=5
x=6
readonly x
x=7 # failure

Strive to annotate almost all variables in a bash script with either local or readonly.

Favor $() over backticks (`)

Backticks are hard to read and in some fonts easily confused with single quotes.
$()also permits nesting without the quoting headaches.

both commands below print out: A-B-C-D

echo "A-echo B-echo C-`echo D```"
echo "A-$(echo B-$(echo C-$(echo D)))"

Favor [[]] (double brackets) over []

[[]] avoids problems like unexpected pathname expansion, offers some syntactical improvements,
and adds new functionality:

Operator ** ** Meaning
|| logical or (double brackets only)
&& logical and (double brackets only)
< string comparison (no escaping necessary within double brackets)
-lt numerical comparison
= string matching with globbing
== string matching with globbing (double brackets only, see below)
=~ string matching with regular expressions (double brackets only , see below)
-n string is non-empty
-z string is empty
-eq numerical equality

-ne numerical inequality

single bracket

[ "${name}" > "a" -o ${name} < "m" ]

double brackets

[[ "${name}" > "a" && "${name}" < "m" ]]

Regular Expressions/Globbing

These new capabilities within double brackets are best illustrated via examples:

t="abc123"
[[ "$t" == abc* ]] # true (globbing)
[[ "$t" == "abc*" ]] # false (literal matching)
[[ "$t" =~ [abc]+[123]+ ]] # true (regular expression)
[[ "$t" =~ "abc*" ]] # false (literal matching)

Note, that starting with bash version 3.2 the regular or globbing expression
must not be quoted. If your expression contains whitespace you can store it in a variable:

r="a b+"
[[ "a bbb" =~ $r ]] # true

Globbing based string matching is also available via the case statement:

case $t in
abc*) _ _ ;;
esac

String Manipulation

Bash has a number of (underappreciated) ways to manipulate strings.

Basics

f="path1/path2/file.ext"

len="${#f}" # = 20 (string length)

slicing: ${:} or ${::}

slice1="${f:6}" # = "path2/file.ext"
slice2="${f:6:5}" # = "path2"
slice3="${f: -8}" # = "file.ext"(Note: space before "-")
pos=6
len=5
slice4="${f:${pos}:${len}}" # = "path2"

Substitution (with globbing)

f="path1/path2/file.ext"

single_subst="${f/path?/x}" # = "x/path2/file.ext"
global_subst="${f//path?/x}" # = "x/x/file.ext"

string splitting

readonly DIR_SEP="/"
array=(${f//${DIR_SEP}/ })
second_dir="${array1}" # = path2

Deletion at beginning/end (with globbing)

f="path1/path2/file.ext"

deletion at string beginning extension="${f#*.}" # = "ext"

greedy deletion at string beginning

filename="${f##*/}" # = "file.ext"

deletion at string end

dirname="${f%/*}" # = "path1/path2"

greedy deletion at end

root="${f%%/*}" # = "path1"

Avoiding Temporary Files

Some commands expect filenames as parameters so straightforward pipelining does not work.
This is where <() operator comes in handy as it takes a command and transforms it into something
which can be used as a filename:

download and diff two webpages

diff <(wget -O - url1) <(wget -O - url2)

Also useful are "here documents" which allow arbitrary multi-line string to be passed
in on stdin. The two occurrences of 'MARKER' brackets the document.
'MARKER' can be any text.

DELIMITER is an arbitrary string

command << MARKER
...
${var}
$(cmd)
...
MARKER

If parameter substitution is undesirable simply put quotes around the first occurrence of MARKER:

command << 'MARKER'
...
no substitution is happening here.
$ (dollar sign) is passed through verbatim.
...
MARKER

Built-In Variables

For reference

$0 name of the script
$n positional parameters to script/function
$$ PID of the script
$! PID of the last command executed (and run in the background)
$? exit status of the last command (${PIPESTATUS} for pipelined commands)
$# number of parameters to script/function
$@ all parameters to script/function (sees arguments as separate word)
$* all parameters to script/function (sees arguments as single word)

Note

$* is rarely the right choice.
$@ handles empty parameter list and white-space within parameters correctly
$@ should usually be quoted like so "$@"

Debugging

To perform a syntax check/dry run of your bash script run:

bash -n myscript.sh

To produce a trace of every command executed run:

bash -v myscripts.sh

To produce a trace of the expanded command use:

bash -x myscript.sh

-v and -x can also be made permanent by adding
set -o verbose and set -o xtrace to the script prolog.
This might be useful if the script is run on a remote machine, e.g.
a build-bot and you are logging the output for remote inspection.

Signs you should not be using a bash script

  • your script is longer than a few hundred lines of code
  • you need data structures beyond simple arrays
  • you have a hard time working around quoting issues
  • you do a lot of string manipulation
  • you do not have much need for invoking other programs or pipe-lining them
  • you worry about performance

Instead consider scripting languages like Python or Ruby.

References

Thanks to Peter Brinkmann and Kim Hazelwood for their feedback on drafts of this post.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment