Bash scripting

Bash scripting#

Bash theory#

These notes will cover the basic programming concepts you need to know to get the most out of writing scripts in Bash.

Bash (and other shell languages) isn’t really a programming language – it is a command interpreter that’s best used for organizing the inputs and outputs of programs written in other languages. Historically this would be the C programming language. In our class, we will be writing “real” programs in Python.

But I digress. Despite the fact that Bash isn’t a programming language like C or Python, you can still program in it. What do you need to know to write Bash programs?

Variables#

This section was adapted from [@VariableSubstitution]

In Bash, there is no concept of “types” – this is not a compiled language where different amounts of memory need to be reserved for different types of data.

Rather, all variables in Bash are simply value-placeholders. That is:

The name of a variable is a placeholder for its value, which is the data that it holds.
Referencing/retrieving the value of a variable is called variable substitution.

For example, if variable1 is the name of a variable, then $variable1 is a reference to its value, the data it contains.

An equivalent syntax for variable substitution that is generally more robust is to add curly braces, as in: ${variable1}

Running a command such as echo $variable1 or echo ${variable1} will cause the value referenced by variable1 to be substituted and passed to the echo command.

Here’s an example of variable assignment and substitution:

# Create a variable with the name "a" and the value "375"
$ a=375

# Create a variable with the name "hello" and the value of the variable "a" (375)
$ hello=$a

# Not a variable reference, just the string "hello" ...
$ echo hello
hello

# This *is* a variable substitution
$ echo $hello
375

# This is another way to do a variable substitution
$ echo ${hello}
375

No spaces are permitted on either side of the = sign when assigning a value to a variable: a=375 works, whereas a = 375 makes the shell try to run a command named a.
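The curly-brace form matters most when a variable reference is followed directly by characters that could belong to a name. Here is a minimal sketch, in which the variable file and its value are made up for illustration and file_backup is assumed to be unset:

```shell
# "file" is a hypothetical variable used only for this illustration
file=report
unset file_backup

# Without braces, the shell looks for a variable named "file_backup",
# which is unset, so the expansion is empty
without_braces="$file_backup"

# With braces, the variable name ends at the closing brace
with_braces="${file}_backup"

echo "without braces: [$without_braces]"
echo "with braces: [$with_braces]"
```

Only the second form produces report_backup, which is why the braces are usually described as the more robust habit.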

Variable names#

What can you name variables? Here are the rules:

Variable names must start with a letter or an underscore (not a number or any other character)
Variable names may contain only letters, numbers, and underscores; whitespace and other punctuation are not allowed

Command substitution#

We can also assign the results of a command to a variable:

# The "%x %r %Z" string is an example of a date format string.
# See `man date` for examples of other date format strings
right_now="$(date +"%x %r %Z")"

The characters $( ) tell the shell: “substitute the results of the enclosed command”. This technique is known as command substitution.

In the example above, the variable right_now is assigned the output of the date command called with the format argument +"%x %r %Z", which prints the current date and time.

As with variable substitutions, it is a good idea to wrap command substitutions in double quotes to prevent unwanted word splitting in case the result of the expansion contains whitespace characters; see the Quoting section below.
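Command substitution works with whole pipelines, not only single commands. The following sketch captures the output of a small pipeline; the variable name shout and the sample text are illustrative choices, not part of the original notes:

```shell
# Capture the output of a pipeline in a variable ("shout" is an illustrative name)
shout="$(echo "hello world" | tr 'a-z' 'A-Z')"

# The double quotes keep the captured text together as a single word
echo "$shout"
```

The trailing newline printed by the pipeline is stripped by the substitution, so the variable holds just the text.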

Environment variables#

This section was adapted from [@WritingShellScripts] and [@ExportManPage]

Any time a shell session is initialized, some variables are already set by startup files.

To see all the variables that are in the environment, use the printenv command:

$ printenv
SHELL=/bin/bash
...

You can also view environment variables in a bash session using echo:

$ echo $SHELL
/bin/bash

You can create your own environment variables using the export command.

$ MYDEPT=Sales
$ echo $MYDEPT
Sales
$ export MYDEPT

# In one line:
$ export SOMEOTHERVAR=Value
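What export changes is visibility: an exported variable is passed on to programs started from the shell, while a plain shell variable is not. A rough sketch of the difference, assuming sh is available to act as the child process:

```shell
# Start from a clean slate so the demonstration is repeatable
unset MYDEPT

# A plain shell variable is not passed to child processes
MYDEPT=Sales
before="$(sh -c 'echo "child sees: [$MYDEPT]"')"

# After export, child processes inherit the variable
export MYDEPT
after="$(sh -c 'echo "child sees: [$MYDEPT]"')"

echo "$before"
echo "$after"
```

The first line printed shows empty brackets, and the second shows Sales.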

Quoting#

This section was adapted from [@Quoting]

Quoting means just that, bracketing a string in quotes. This has the effect of protecting special characters in the string from reinterpretation or expansion by the shell or shell script.

A character is special if it has an interpretation other than its literal meaning. For example, the asterisk * represents a wild card character in globbing and Regular Expressions.

$ ls -l [Vv]*
-rw-rw-r--    1 bozo  bozo       324 Apr  2 15:05 VIEWDATA.BAT
-rw-rw-r--    1 bozo  bozo       507 May  4 14:25 vartrace.sh
-rw-rw-r--    1 bozo  bozo       539 Apr 14 17:11 viewdata.sh

$ ls -l '[Vv]*'
ls: [Vv]*: No such file or directory

Quoting variables#

When referencing a variable, it is generally advisable to enclose its name in double quotes, like so:

$ var="string-with-special-characters#*;>,"
$ echo "$var"
string-with-special-characters#*;>,

This prevents reinterpretation of all special characters within the quoted string, with the following exceptions:

$ (used for variable dereferencing)
` (backquote, used for old-style command substitution)
\ (escape character)

Keeping $ as a special character within double quotes permits referencing a quoted variable ($variable), that is, replacing the variable with its value in the resulting string.

Using double quotes also prevents word splitting. An argument enclosed in double quotes presents itself as a single word, even if it contains whitespace separators.

List="one two three"

for a in $List     # Splits the variable in parts at whitespace.
do
  echo "$a"
done
# one
# two
# three

echo "---"

for a in "$List"   # Preserves whitespace in a single variable.
do #     ^     ^
  echo "$a"
done
# one two three
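The same effect can be seen by counting how many arguments a command receives. In this sketch, count_args is a hypothetical helper written only for the demonstration:

```shell
# count_args is a hypothetical helper that prints how many arguments it received
count_args()
{
  echo "$#"
}

title="The Linux Command Line"

# Unquoted: the value is split at whitespace into four arguments
count_args $title

# Quoted: the value is passed as one argument
count_args "$title"
```

This is the usual reason for quoting variables that may hold file names or other text containing spaces.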

Escaping#

Escaping is a method of quoting single characters. The escape \ preceding a character tells the shell to interpret that character literally.

$ echo "Hello world"
Hello world

$ echo "Hello \"world\""
Hello "world"

You can see more examples and information on escaping here.

Single quotes#

Single quotes ' operate similarly to double quotes, but do not permit referencing variables, since the special meaning of $ is turned off.

Within single quotes, every special character except ' gets interpreted literally. Consider single quotes (“full quoting”) to be a stricter method of quoting than double quotes (“partial quoting”).

Since even the escape character \ gets a literal interpretation within single quotes, it cannot be used to escape a single quote there: a single-quoted string cannot contain a single quote at all.
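A short sketch of the difference between the two quoting styles, along with one common workaround for getting a single quote into otherwise single-quoted text (the variable names are illustrative):

```shell
name=world

echo "Hello $name"     # double quotes: $name is expanded
echo 'Hello $name'     # single quotes: $name stays literal

# Workaround: close the single-quoted string, add an escaped quote (\'),
# then reopen the single-quoted string
message='It'\''s here'
echo "$message"
```

When the text contains no $ or other characters that need protecting, simply using double quotes around it is often the easier choice.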

Functions#

This section was adapted from [@WritingShellScriptsa]

As programs get longer and more complex, they become more difficult to design, code, and maintain. As with any large endeavor, it is often useful to break a single, large task into a series of smaller tasks. We can do this with functions.

Two important points about functions in bash:

They must be defined before they can be used.
The function body (the portion of the function between the { and } characters) must contain at least one valid command.

Here is an example:

# The function definition must come before any function calls
system_info()
{
  # At least one valid command is required in a function body
  echo "function system_info"
}

# No parentheses are included in the function call
system_info

Running the above lines defines the system_info function and then calls it, which simply results in the echo command running.

Positional parameters#

Functions in bash support positional parameters by default. This allows us to do things like specify the name of the output file on the command line, as well as set a default output file name if no name is specified.

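Positional parameters are a series of special variables ($0, $1, $2, and so on) that contain the contents of the command line. Parameters beyond $9 must be written with curly braces, as in ${10}. Inside a function, $1 onward hold the arguments passed to the function, while $0 still holds the name of the script or shell.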

For example, let’s change the earlier function definition above slightly:

system_info()
{
  echo "$1"
  echo "$2"
  echo "$3"
}

Then, let’s see what happens if we were to use this function:

$ system_info Hello world !
Hello
world
!

$ system_info "Hello world!" "How are you?" "I am good, thanks!"
Hello world!
How are you?
I am good, thanks!
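The default output file name mentioned at the start of this section can be sketched with the ${1:-default} form of substitution, which is not covered elsewhere in these notes. The function name write_report and the file names are hypothetical:

```shell
# write_report and the file names are hypothetical, for illustration only
write_report()
{
  # ${1:-report.txt} expands to $1, or to report.txt when $1 is unset or empty
  local output_file="${1:-report.txt}"
  echo "Writing to $output_file"
}

write_report               # no argument: falls back to the default
write_report summary.txt   # argument given: uses it
```

The function only prints the chosen name here; a real version would go on to write its output to that file.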

Detecting positional parameters#

Often, we will want to check to see if we have command line arguments on which to act. There are a couple of ways to do this. First, we could simply check to see if $1 contains anything like so:

if [ "$1" != "" ]; then
    echo "Positional parameter 1 contains something"
else
    echo "Positional parameter 1 is empty"
fi

Second, the shell maintains a variable called $# that contains the number of arguments on the command line, not counting the name of the command ($0).

# -gt means "greater than" in Bash, and -lt is "less than".
if [ $# -gt 0 ]; then
    echo "Your command line contains $# arguments"
else
    echo "Your command line contains no arguments"
fi
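The same check works inside a function, where $# counts the arguments passed to that function. A minimal sketch, assuming a hypothetical greet function that needs exactly one argument and uses the return builtin to report failure:

```shell
# greet is a hypothetical function that requires one argument
greet()
{
  if [ $# -lt 1 ]; then
    # Report the problem on standard error and signal failure to the caller
    echo "usage: greet NAME" >&2
    return 1
  fi
  echo "Hello, $1"
}

greet Ada
```

Calling greet with no arguments prints the usage message and leaves a non-zero exit status for the caller to inspect.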

Naming positional parameters with `local`#

In a function that has many positional parameters, it can be difficult to keep track of what each of $1, $2, etc. should mean.

A common practice in bash functions is to create named variables within a function using local:

system_info()
{
  local param1="$1"
  local param2="$2"
  local param3="$3"
}

local is a bash builtin that can only be used within a function; it makes the variable name have a visible scope restricted to that function.
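The scoping can be seen by giving a local variable the same name as one defined outside the function. The names in this sketch are illustrative:

```shell
name="global"

show_scope()
{
  # This "name" exists only while show_scope is running
  local name="local"
  echo "inside: $name"
}

show_scope
echo "outside: $name"
```

The variable outside the function keeps its original value because the assignment inside the function applied only to the local copy.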

Conditionals#

This section was adapted from [@WritingShellScriptsb]

Most programs need to make decisions and perform different actions depending on various conditions. In bash, there are two main things to know for achieving conditional logic:

How the shell evaluates the success or failure of a command (Exit status)
How the shell can control the flow of execution in our program.

These two things are elaborated in the following sections.

Exit status#

Commands (including the scripts and shell functions we write) issue a value to the system when they terminate, called an exit status. This value, which is an integer in the range of 0 to 255 ¹, indicates the success or failure of the command’s execution. By convention, a value of zero indicates success and any other value indicates failure.

¹ Most of these numbers aren’t used – 0 (success) and 1 (failure) are most common. You can see a useful discussion of where to find more information about exit codes on Stack Overflow.

The shell provides a parameter $? that we can use to examine the exit status of the previously run command. Here we see it in action:

$ ls -d /usr/bin
/usr/bin

# The last command terminated successfully, so we have a zero exit code when calling $?
$ echo $?
0

$ ls -d /bin/usr
ls: cannot access /bin/usr: No such file or directory

# The last command did not terminate successfully, so we have a non-zero exit code when calling $?
$ echo $?
2

Some commands use different exit status values to provide diagnostics for errors, while many commands simply exit with a value of one when they fail. man pages often include a section entitled “Exit Status,” describing what codes are used. However, a zero always indicates success.

The shell provides two extremely simple builtin commands that do nothing except terminate with either a zero or one exit status. The true command always executes successfully and the false command always executes unsuccessfully:

$ true
$ echo $?
0
$ false
$ echo $?
1

We can use these commands to see how the if statement works. What the if statement really does is evaluate the success or failure of commands:

$ if true; then echo "It's true."; fi
It's true.

$ if false; then echo "It's true."; fi
$

The command echo "It's true." is executed when the command following if executes successfully, and is not executed when the command following if does not execute successfully.

`exit`#

We can (and should!) set the exit status of our own scripts when they finish. To do this, use the exit command. The exit command causes the script to terminate immediately and sets the exit status to whatever value is given as an argument.

For example: exit 0 exits our script and sets the exit status to 0 (success), whereas exit 1 exits our script and sets the exit status to 1 (failure).
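To watch exit at work without closing the current session, a child bash process can stand in for a script. This is only a sketch; the status value 3 is an arbitrary choice:

```shell
# A child bash process stands in for a script here, so that exit
# does not end the current shell session
output="$(bash -c 'echo "before exit"; exit 3; echo "never printed"')"

# $? holds the exit status of the command substitution above
status=$?

echo "output: $output"
echo "exit status: $status"
```

The second echo inside the child never runs, because exit terminates it immediately.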

`if`#

The if command is fairly simple on the surface; it makes a decision based on the exit status of a command. The if command’s syntax looks like this:

if commands; then
    commands
[elif commands; then
    commands...]
[else
    commands]
fi
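Because if only looks at exit status, any command can serve as the condition. The sketch below uses grep -q, which prints nothing and exits with zero when it finds a match; the function name check_word and the messages are made up for illustration:

```shell
# check_word is a hypothetical function; grep -q succeeds when the pattern is found
check_word()
{
  if echo "$1" | grep -q "error"; then
    echo "found an error"
  elif echo "$1" | grep -q "warning"; then
    echo "found a warning"
  else
    echo "all clear"
  fi
}

check_word "disk error on sda"
check_word "low disk warning"
check_word "everything is fine"
```

Each branch is chosen purely from the exit status of the pipeline that follows if or elif.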

Typically, you will see the if command combined with the test command, seen below.

`test`#

The test command is used most often with the if command to perform true/false decisions.

The command is unusual in that it has two different syntactic forms:

# First form
test expression

# Second form, which is far more common
# Note: the word "test" does not appear, but this is in fact a "test" command!
[ expression ]

The test command works simply. If the given expression is true, test exits with a status of zero; otherwise it exits with a status of 1.

The neat feature of test is the variety of expressions we can create. Here is an example:

if [ -f .bash_profile ]; then
    echo "You have a .bash_profile. Things are fine."
else
    echo "Yikes! You have no .bash_profile!"
fi

In this example, we use the expression -f .bash_profile . This expression asks, “Is .bash_profile a file?” If the expression is true, then test exits with a zero (indicating true) and the if command executes the command(s) following the word then. If the expression is false, then test exits with a status of one and the if command executes the command(s) following the word else.

Here is a partial list of the conditions that test can evaluate. Since test is a shell builtin, use help test to see a complete list:

Expression	Description
`-d file`	True if `file` is a directory.
`-e file`	True if `file` exists.
`-f file`	True if `file` exists and is a regular file.
`-L file`	True if `file` is a symbolic link.
`-r file`	True if `file` is a file readable by you.
`-w file`	True if `file` is a file writable by you.
`-x file`	True if `file` is a file executable by you.
`file1 -nt file2`	True if `file1` is newer than `file2`.
`file1 -ot file2`	True if `file1` is older than `file2`.
`-z string`	True if `string` is empty.
`-n string`	True if `string` is not empty.
`string1 = string2`	True if `string1` equals `string2`.
`string1 != string2`	True if `string1` does not equal `string2`.
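A few of the expressions from the table can be combined into a small helper. This is a sketch only: describe_path is a hypothetical function, and the temporary directory created with mktemp exists purely for the demonstration:

```shell
# describe_path is a hypothetical helper built from the test expressions above
describe_path()
{
  if [ -d "$1" ]; then
    echo "directory"
  elif [ -f "$1" ]; then
    echo "regular file"
  elif [ -e "$1" ]; then
    echo "exists, but is neither a directory nor a regular file"
  else
    echo "missing"
  fi
}

# Try it on a throwaway directory
demo_dir="$(mktemp -d)"
touch "$demo_dir/notes.txt"

describe_path "$demo_dir"
describe_path "$demo_dir/notes.txt"
describe_path "$demo_dir/no-such-file"

rm -r "$demo_dir"
```

The quotes around "$1" keep the test working for paths that contain spaces.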

A quick note on syntax#

Note that the above example can be written in a few ways:

# Preferred form
if [ -f .bash_profile ]; then
    echo "You have a .bash_profile. Things are fine."
else
    echo "Yikes! You have no .bash_profile!"
fi

# Alternate form
if [ -f .bash_profile ]
then echo "You have a .bash_profile. Things are fine."
else echo "Yikes! You have no .bash_profile!"
fi

The semicolon ; is a command separator. Using it allows us to put more than one command on a line.

For example: $ clear; ls will clear the screen, then execute the ls command.

In the preferred form, the semicolon lets us put the word then on the same line as the if command, which is easier to read.

Code reuse#

The primary goal of writing a program in any language is to crystallize useful logic in a reusable form. The outcome is that you don’t need to repeat yourself once you’ve solved a problem.

In bash there are two main avenues we will take to achieve this goal:

create function libraries
create executable scripts

The following sections elaborate each technique.

Function library with `source`#

This section was adapted from [@LinuxCommandLine]

Most programming languages permit programmers to specify external files to be included within their programs. This is often used to add “boilerplate” code to programs for such things as defining standard constants and referencing external library function definitions.

Bash has a builtin command, source, that implements this feature. This section will cover the ways it can make our scripts more powerful and easier to maintain.

source reads a specified file and executes the commands within it using the current shell. It works both with the interactive command line and within a script. Using the command line for example, we can reload the .bashrc file by executing the following command:

$ source ~/.bashrc

Note that the source command can be abbreviated by a single dot character like so:

$ . ~/.bashrc

When source is used on the command line, the commands in the file are treated as if they are being typed directly at the keyboard. In a shell script, the commands are treated as though they are part of the script.

source is a natural way to share functions and variables across many bash programs. For example, it makes sense to have a shared function to display error messages:

error_msg() {
  printf "%s\n" "$1" >&2
}

To share these functions across other scripts, we could build a library of functions and source that library. As an example, we could put all the common code in a file called ~/bash-scripts.sh and add the following code to each script that needs it:

FUNCLIB=~/bash-scripts.sh

if [[ -r "$FUNCLIB" ]]; then
    source "$FUNCLIB"
else
    echo "Cannot read function library!" >&2
    exit 1
fi

If you put source statements like this in your ~/.bashrc, your functions will always be available each time you open a new terminal instance.
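The whole round trip can be tried without touching your home directory. The sketch below writes a throwaway library with mktemp (a stand-in for a file such as ~/bash-scripts.sh), loads it, and then calls the function it defines:

```shell
# Build a throwaway library file; in practice this would be a file
# such as ~/bash-scripts.sh
lib_file="$(mktemp)"
cat > "$lib_file" <<'EOF'
error_msg() {
  printf "%s\n" "$1" >&2
}
EOF

# "." is the abbreviated form of source
if [ -r "$lib_file" ]; then
  . "$lib_file"
fi

# The function defined in the library is now available in this shell
error_msg "Something went wrong"

rm -f "$lib_file"
```

The function stays defined even after the file is deleted, because its definition was read into the current shell.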

Scripts#

In the simplest terms, a shell script is a file containing a series of commands. The shell reads this file and carries out the commands as though they have been entered directly on the command line.

Say we have the following file, example.sh:

#!/bin/bash
echo "This is an example script"

We can use the bash program to run this script:

$ bash example.sh
This is an example script

This is fairly similar to the source command we saw earlier. The difference is a bit subtle:

source <library-name> will run within our current shell instance, so any variables and functions the file defines remain available afterwards
bash <script-name> will launch a new shell instance, so any variables and functions the script defines are NOT preserved once it finishes
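The difference can be observed directly with a one-line file that sets a variable. In this sketch the file comes from mktemp, MY_SETTING is a made-up variable name, and ${MY_SETTING:-unset} expands to the word unset when the variable has no value:

```shell
# A throwaway file that does nothing except set one variable
script_file="$(mktemp)"
echo 'MY_SETTING=enabled' > "$script_file"
unset MY_SETTING

# Run in a new shell instance: the assignment is lost when that shell exits
bash "$script_file"
after_bash="${MY_SETTING:-unset}"

# Run in the current shell ("." is the abbreviated form of source)
. "$script_file"
after_source="${MY_SETTING:-unset}"

echo "after bash: $after_bash"
echo "after source: $after_source"

rm -f "$script_file"
```

Only the sourced run leaves MY_SETTING set in the current shell.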

The impetus to use one method or the other varies by purpose:

write a library and use source to create reusable functions to be used on the command line or in other scripts
write a script to do a specific task

There is overlap between these two purposes, so don’t overthink it too much. You will often find yourself writing scripts that should be libraries, and vice versa – you can always make these changes to your programs whenever you like.

Shebangs#

The first line of any script, in ANY programming language (bash, sh, or, as we will soon see, python), should be a shebang:

#!/bin/bash

# the first line beginning with #! is a shebang!

Let’s break down the components of the shebang to better understand it:

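# – the comment character, so the interpreter itself ignores this line
! – in this context, a special character that, together with the leading #, tells the operating system to run this file with the program named on the rest of the line
/bin/bash – the path to the interpreter for the code written in this file. Note that /bin/bash is an absolute path.

We will soon see that the most portable method for specifying a shebang is to use the env program to locate the interpreter, like so (the trailing comments are annotations only and should not appear on a real shebang line):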

#!/usr/bin/env bash     # a bash script shebang
#!/usr/bin/env sh       # a sh script shebang
#!/usr/bin/env perl     # a perl script shebang
#!/usr/bin/env python   # a python script shebang

For now, all that matters is that you provide a path to the bash program on your system. You can see valid options by running whereis bash:

$ whereis bash
bash: /bin/bash

Permissions#

This section was adapted from [@HowToSetPermissions]

In order to run a script without specifying an interpreter, you need to make the script executable, which is a permission that you can set on the file itself.
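As a preview of the commands covered below, here is a rough sketch of the whole workflow: create a script, mark it executable, and run it by its path alone. The temporary directory from mktemp is an assumption made for the demonstration, and it must be on a filesystem that allows execution:

```shell
# Create the example script in a throwaway directory
script_dir="$(mktemp -d)"
cat > "$script_dir/example.sh" <<'EOF'
#!/usr/bin/env bash
echo "This is an example script"
EOF

# A freshly created file is normally not executable
if [ -x "$script_dir/example.sh" ]; then before=yes; else before=no; fi

# Give the owner execute permission
chmod u+x "$script_dir/example.sh"
if [ -x "$script_dir/example.sh" ]; then after=yes; else after=no; fi

echo "executable before chmod: $before"
echo "executable after chmod: $after"

# With the execute bit set, the script can be run by its path alone
"$script_dir/example.sh"

rm -r "$script_dir"
```

The shebang on the first line of the script is what tells the system to hand the file to bash when it is run this way.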

The sections below show how to view and set file permissions in Linux filesystems.

View permissions with `ls`#

The output of ls -l will show the current permissions for files and folders:

-rwxr--rw- 1 user user 0 Jan 19 12:59 file1.txt

The first character shows the file type (- for a regular file, d for a directory). The letters rwx that follow stand for Read/Write/Execute permission. These rights are shown three times: first for the Owner, then the Group, and lastly Others (world).

Edit permissions with `chmod`#

The command to modify permissions is chmod. There are two ways to modify permissions, with numbers or with letters.

Check out this chmod documentation for a really great interactive demo.

Numeric#

chmod 400 file - Read by owner
chmod 040 file - Read by group
chmod 004 file - Read by world
chmod 200 file - Write by owner
chmod 020 file - Write by group
chmod 002 file - Write by world
chmod 100 file - Execute by owner
chmod 010 file - Execute by group
chmod 001 file - Execute by world

To combine these, just add the numbers together:

chmod 444 file - Allow read permission to owner and group and world
chmod 777 file - Allow everyone to read, write, and execute file
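The addition can be checked by setting a mode and reading it back with ls -l. This sketch uses a throwaway file from mktemp, and cut keeps only the first ten characters of the listing (the file type plus the nine permission letters):

```shell
demo_file="$(mktemp)"

# 640: owner 4+2=6 (rw-), group 4 (r--), world 0 (---)
chmod 640 "$demo_file"
mode_640="$(ls -l "$demo_file" | cut -c1-10)"

# 755: owner 4+2+1=7 (rwx), group 4+1=5 (r-x), world 4+1=5 (r-x)
chmod 755 "$demo_file"
mode_755="$(ls -l "$demo_file" | cut -c1-10)"

echo "640 -> $mode_640"
echo "755 -> $mode_755"

rm -f "$demo_file"
```

Each digit is simply the sum of the read (4), write (2), and execute (1) values for one of the three groups of users.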

Symbolic#

chmod also accepts symbolic arguments for permission changes, where:

rwx: read/write/execute
ugo: user/group/world (a stands for all three)

Some examples:

Deny execute permission to everyone: $ chmod a-x file
Allow read permission to everyone: $ chmod a+r file
Make a file readable and writable by the group and others: $ chmod go+rw file
Make a shell script executable by the user/owner: $ chmod u+x myscript.sh
Allow everyone to read, write, and execute the file and turn on the set group-ID: $ chmod =rwx,g+s file

Some files are configured to have very restrictive permissions to prevent unauthorized access. Changing these permissions can create security problems.

To change or edit files that are owned by root, sudo chmod must be used. Note that changing permissions incorrectly can quickly make your system unusable! Please be careful when using sudo!

$ sudo chmod o+x /usr/local/bin/somefile

Recursive Permission Changes#

chmod -R will change all the permissions of each file and folder under a specified directory at once.

For example, $ chmod -R 777 /path/to/Dir will grant read/write/execute permissions to all users for ALL files and directories in /path/to/Dir.

To assign reasonably secure permissions to files and folders/directories, it’s common to give files a permission of 644 and directories a permission of 755. Using the find command and a pipe, we can target just files or just folders, as in the following examples.

$ sudo find /path/to/Dir -type f -print0 | xargs -0 sudo chmod 644
$ sudo find /path/to/Dir -type d -print0 | xargs -0 sudo chmod 755
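The same pair of commands can be tried safely on a throwaway directory tree. In this sketch sudo is left out, on the assumption that the tree is created with mktemp and therefore owned by the current user; the file and directory names are made up:

```shell
# Build a small throwaway tree to experiment on
tree="$(mktemp -d)"
mkdir -p "$tree/sub"
touch "$tree/a.txt" "$tree/sub/b.txt"

# Files get 644, directories get 755
find "$tree" -type f -print0 | xargs -0 chmod 644
find "$tree" -type d -print0 | xargs -0 chmod 755

# Read the resulting modes back (first ten characters of the listing)
file_mode="$(ls -l "$tree/sub/b.txt" | cut -c1-10)"
dir_mode="$(ls -ld "$tree/sub" | cut -c1-10)"

echo "file: $file_mode"
echo "directory: $dir_mode"

rm -r "$tree"
```

The -print0 and -0 options make find and xargs separate names with a null character, so paths containing spaces are handled correctly.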

Again, be careful when using sudo; in particular, watch for extra spaces in your command or path.

Changing Ownership and Group membership#

A file’s owner can be changed using the chown command.

$ sudo chown kate file1.txt

A file’s group can also be changed using the chown command.

$ sudo chown :mygroup file1.txt

chown can also change the owner and group in a single command:

$ sudo chown tux:mygroup file1.txt

Style#

You can see the following resources for style guides for bash coding:

“Unofficial Shell Scripting Stylesheet” by the Linux Documentation Project
“Shell Style Guide” by Google developers

Resources#

This section was adapted from https://linuxcommand.org/lc3_resources.php

Aside from everything covered in these notes, you can refer to the following resources:

Bash Builtins - commands built into the shell itself
The GNU Coreutils - the essential utilities included with most Linux distributions. These are divided into three groups:
Other Commands - other commonly-used Linux utilities
