Scripting¶
This section covers the following main concepts:
- What are shell scripts?
- Where can scripts and other programs be stored?
- Environment variables
- Login profiles
- Program exit status
What is a shell script?¶
A shell script is a text file containing the actions you want the computer to take. The first line, called the "shebang", defines which shell interpreter is used.
The lines after the shebang are effectively a series of commands.
Below is a simple script that we will use as an example in the following sections:
Notice the first line is the "shebang" that starts
with a #!, followed by the interpreter, which in
this case is the bash shell. We then have the echo
program, which just prints out the argument(s) to
the console.
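The listing itself is not reproduced above, so here is a minimal sketch that matches the description. It assumes bash lives at /bin/bash and reuses the Hello, World! message from the helpful programs table; the original script may have differed in its details.

```shell
#!/bin/bash
# hello.sh: print a greeting to the console
echo "Hello, World!"
```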
Please use your command-line editor to copy the script above into a file named hello.sh.
Quotes
Bash (and other shells) use whitespace to delimit arguments to programs. Quotation marks keep arguments with spaces together.
Double quotes and single quotes work differently in bash: double quotes allow expansion of variables and some other constructs, whereas single quotes keep their contents verbatim.
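To see the difference between the two kinds of quotes, here is a small sketch. The variable NAME and the sample words are made up for the illustration.

```shell
NAME="World"             # hypothetical variable used only for this illustration

double="Hello, $NAME"    # double quotes: $NAME is expanded
single='Hello, $NAME'    # single quotes: the text is kept verbatim

echo "$double"           # prints: Hello, World
echo "$single"           # prints: Hello, $NAME

# Quotes also keep an argument that contains spaces together.
set -- "two words" unquoted words
echo "$#"                # prints: 3
```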
A couple of helpful programs that you may want to use:
Helpful programs¶
| Program | Action | Example |
|---|---|---|
| echo | Print statements | echo "Hello, World!" |
| hostname | Name of the server you are logged into | hostname -f |
| date | Current date and time (many format options) | date |
Running Scripts¶
There are two ways to execute a shell script:
1) Invoke it with the appropriate shell
2) Refer to it directly as a program by its path
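Both ways can be tried with the sketch below. It assumes a scratch directory made with mktemp so that none of your own files are touched, and it recreates the example script there; freshly created files normally do not have the execute bit set.

```shell
# Scratch directory so the sketch does not touch your own files.
workdir=$(mktemp -d)
printf '%s\n' '#!/bin/bash' 'echo "Hello, World!"' > "$workdir/hello.sh"

# 1) Invoke it with the appropriate shell.
via_shell=$(bash "$workdir/hello.sh")
echo "$via_shell"

# 2) Refer to it directly by its path. Without the execute bit the
#    shell refuses to run it and reports a non-zero exit status.
if "$workdir/hello.sh" 2>/dev/null; then
    direct_status=0
else
    direct_status=$?
fi
echo "exit status of the direct call: $direct_status"
```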
Execution Permissions
Running the script directly by its path fails with a "Permission denied" error. Why did this happen? What can we do to check the permissions? Use the command ls -l hello.sh to see the permissions.
The file hello.sh doesn't have the execute bit set, so we can't run it as a program. You can change file and directory permissions using the chmod program.
The chmod program allows you to add and remove read(r)/write(w)/execute(x) permissions for the user (u), group (g), and others (o) with the following syntax:
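As a sketch of that syntax, the example below adds the execute permission for everyone. The scratch copy of hello.sh is an assumption made so the commands are safe to run anywhere; on your own file you would only need the chmod line.

```shell
# Scratch copy of the script so the sketch is safe to run anywhere.
script="$(mktemp -d)/hello.sh"
printf '%s\n' '#!/bin/bash' 'echo "Hello, World!"' > "$script"
chmod 644 "$script"          # start from -rw-r--r--

# General form: chmod [ugo][+-][rwx] file
chmod ugo+x "$script"        # add execute for user, group, and others

perms=$(ls -l "$script" | cut -c1-10)
echo "$perms"                # prints: -rwxr-xr-x
```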
This changes the execute permission from:
-rw-r--r-- username student 34 Jan 22 14:13 hello.sh
to
-rwxr-xr-x username student 34 Jan 22 14:13 hello.sh
which allows us to "execute" it as a program. For a quick refresher on what permissions are, take a look at Navigating Filesystems from week 1.
Now, we can run hello.sh directly as a program!
How would we remove read permissions on a file for both the file group and others?
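One possible answer, sketched on a scratch file created with mktemp (a stand-in for whatever file you want to protect):

```shell
target=$(mktemp)             # hypothetical file used for the demonstration
chmod 644 "$target"          # start from -rw-r--r--

chmod go-r "$target"         # remove read (r) for group (g) and others (o)

after=$(ls -l "$target" | cut -c1-10)
echo "$after"                # prints: -rw-------
```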
$PATH variable¶
You may have noticed that we needed to execute our file relative to our current directory (i.e., ./hello.sh instead of just hello.sh). If we try to run just hello.sh, we will get a "command not found" error.
For the shell to find this little program and use it as a command, we need to add the directory it is located in to our PATH variable. (We'll talk more about shell variables in a bit.)
Essentially, the PATH variable controls where the shell looks for executable programs to run as commands. You can see all the directories that are searched for commands with the following command:
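A minimal sketch for inspecting the search path follows. The tr step and the command -v lookup are additions for readability and are not necessarily what the original listing showed.

```shell
# Print the raw, colon-separated value.
echo "$PATH"

# The same list with one directory per line, which is easier to read.
path_entries=$(echo "$PATH" | tr ':' '\n')
echo "$path_entries"

# Ask the shell where it finds a given command.
ls_location=$(command -v ls)
echo "$ls_location"
```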
For example, the ls command is located in /usr/bin/, which is listed in our PATH variable!
It's common to make a bin directory in your home directory, and store any executable files you want to run as commands there. Let's make a bin directory, move our program there, and add the bin directory to our PATH variable!
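The three steps could look like the sketch below. It assumes a scratch directory (fake_home, a made-up name) standing in for your home directory so the commands can be run safely; on a real system you would use ~ in its place.

```shell
# Assumption: a scratch directory stands in for your home directory.
fake_home=$(mktemp -d)
printf '%s\n' '#!/bin/bash' 'echo "Hello, World!"' > "$fake_home/hello.sh"
chmod u+x "$fake_home/hello.sh"

mkdir -p "$fake_home/bin"                         # make the bin directory
mv "$fake_home/hello.sh" "$fake_home/bin/hello"   # move the program there
export PATH="$PATH:$fake_home/bin"                # append the directory to PATH

# The shell now searches the new directory as well.
command -v hello
```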
Tip
Also, remember that file extensions are optional in UNIX (you can remove the .sh from the file name).
Now our hello program is available from anywhere! We can change directories, and our shell will still be able to find the hello program, because we added it to our PATH variable.
What happens if we close the shell and reopen it?
The change to our PATH variable is not kept if we close the shell, so currently, we need to alter the PATH every time we start a new one.
There is a file called .bashrc to which you can add the export PATH=$PATH:~/bin command; it will be run whenever you open a terminal!
Environment variables¶
Environment variables are similar to variables in other programming languages. But, they also have some important distinctions that set them apart.
Variables can have a local scope (in a script) or a global scope (with child processes).
There are three types of variables:
- Simple
- Magic
- Special
You can also modify software (or shell) behavior through environment variables.
Simple variables¶
They work like variables in other programming languages, and you can define them yourself! There are no types (almost everything is text).
Simple assignment uses the form NAME=value, with no spaces around the equals sign.
You can access the variable with $var or ${var}:
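Assignment and access can be sketched as follows. GREETING and COUNT are hypothetical names chosen for the example.

```shell
# Assignment: no spaces around the equals sign, and no $ on the left.
GREETING="Hello"
COUNT=3                      # stored as text, even though it looks numeric

# Access with $var or ${var}; the braces mark where the name ends.
echo "$GREETING"             # prints: Hello
echo "${GREETING}_world"     # prints: Hello_world

joined="${GREETING}${COUNT}"
echo "$joined"               # prints: Hello3
```

The braces matter when text follows the variable directly: without them, the shell would look for a variable called GREETING_world instead.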
Variables are conventionally uppercase, but it's not necessary.
There is also a quirk where an exit status of 0 means true (success) and a non-zero status means false (failure), which we will discuss later.
Lastly, you can use command substitution to set a variable to the output of a command:
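A short sketch of command substitution, using the date and hostname programs mentioned earlier; the variable names TODAY and HOST are assumptions.

```shell
TODAY=$(date +%Y-%m-%d)      # output of date in year-month-day form
HOST=$(hostname)             # name of the machine you are logged into

echo "Running on $HOST on $TODAY"
```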
Variables also have a scope. If you define a variable, it is only visible in local scope (current script) by default.
To propagate the variable down to child
processes, you need to export it.
You can also declare the variable and export it on the same line.
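The effect of export can be checked by starting a child bash process, as in this sketch. LOCAL_ONLY and SHARED are hypothetical names, and the sketch assumes neither is already set in your environment.

```shell
LOCAL_ONLY="not exported"
export SHARED="exported"            # declare and export on the same line

# A child process (here a new bash) only sees exported variables.
child_view=$(bash -c 'echo "${LOCAL_ONLY:-unset} / ${SHARED:-unset}"')
echo "$child_view"                  # prints: unset / exported

# Exporting an existing variable afterwards works too.
export LOCAL_ONLY
child_view_after=$(bash -c 'echo "$LOCAL_ONLY"')
echo "$child_view_after"            # prints: not exported
```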
Magic variables¶
There are a couple of "magic" variables which are not like other programming languages.
Special variables¶
There are many different special variables you can use.
| Variable | Meaning |
|---|---|
| $1 ... $9, ${10} ... | x-th argument of script/function (braces are needed from the tenth onward) |
| $_ | Last argument of the previous command |
| $@ | All arguments of script/function (separated by whitespace) |
| $# | Count of arguments of script/function |
| $? | Exit status of last process (more on this later) |
| $! | Process ID of the last background process |
As an example, let's look at the following script:
| hellonames.sh | |
|---|---|
We can now pass arguments to this script and use them inside it!
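The contents of hellonames.sh are not shown above, so the sketch below is a hypothetical version that uses $#, $1, and $@. It writes the script to a scratch directory and then runs it with two made-up names.

```shell
# Write a hypothetical hellonames.sh into a scratch directory.
script="$(mktemp -d)/hellonames.sh"
cat > "$script" <<'EOF'
#!/bin/bash
echo "Script received $# name(s)"
echo "Hello, $1!"
echo "Hello to everyone: $@"
EOF

# Pass two arguments; the quotes keep "Ada Lovelace" together as $1.
output=$(bash "$script" "Ada Lovelace" Grace)
echo "$output"
```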
System Variables¶
There are also special environment variables that change the behavior of the shell and of other software.
| Variable | Meaning |
|---|---|
| PATH | Directories containing programs |
| MANPATH | Directories containing manual pages |
| LD_LIBRARY_PATH | Directories containing shared libraries |
Login profile¶
As we discussed earlier, the PATH variable is reset every time you log into the cluster or open a new terminal. What if we wanted it to be modified every time we started a new session? There's a solution for that: a login profile! The commands in your login profile will be run every time you open a new terminal.
Typically, on Linux you would use the ~/.bashrc
file, but on the UNIX system we have on the clusters
it's contained in the ~/.bash_profile hidden file.
| /home/username/.bashrc (or .bash_profile) | |
|---|---|
Let's go ahead and create that file now using your command-line editor.
There are many things you can put in the login profile to configure your personal session, but the two that we are going to talk about are:
- Variables
- Aliases
Variables¶
You can also add variables (like PATH) to your login profile. Adding an export line for PATH appends the bin folder in your home directory to the PATH variable every time you create a new session. Note that we have $PATH at the beginning of the assignment so that $HOME/bin does not become our entire PATH. Otherwise, the only place the computer would look for programs would be that one directory in your home directory, which would be bad.
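The line described above could look like the sketch below. The exact form in the original listing is not shown, so the quoting here is an assumption; the commented-out line shows the mistake to avoid.

```shell
# Line to place in ~/.bashrc or ~/.bash_profile (whichever your system reads).
# $PATH comes first, so the existing directories are kept.
export PATH="$PATH:$HOME/bin"

# What to avoid: this would make $HOME/bin the ONLY directory searched.
# export PATH="$HOME/bin"
```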
Aliases¶
An alias is a verbatim command substitution that happens on the command line when invoked like a program. Here's one example:
Instead of typing ssh username@negishi.rcac.purdue.edu every time you want to log into Negishi, this alias will allow you to just type Negishi.
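Such an alias could be defined as in the sketch below. The alias name is written in lowercase here, which is an assumption about the original spelling, and username is a placeholder for your own account name.

```shell
# Line to place in your login profile.
# Assumptions: the alias name "negishi" and the placeholder "username".
alias negishi="ssh username@negishi.rcac.purdue.edu"

# List the definition to confirm that it was registered.
alias negishi
```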
Warning
Something that you should NOT do is load modules (especially conda modules) in your login profile. This can mess up the rest of the start up process and can cause weird errors.
Exit status & Control Flow¶
Successful programs should exit with a zero (0).
And any non-zero exit status should be considered
an error condition. Often, programs will document
the meaning of their different exit status values
in their manual page.
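The exit status of the most recent command is available in $?, and an if statement branches on it. The sketch below assumes that /nonexistent-demo-path (a made-up name) does not exist on your system.

```shell
# A command that succeeds leaves 0 in $?.
true
status_ok=$?
echo "true exited with $status_ok"          # prints: true exited with 0

# if takes the then-branch on zero and the else-branch on non-zero.
# Assumption: /nonexistent-demo-path does not exist on your system.
if ls /nonexistent-demo-path > /dev/null 2>&1; then
    result="listing succeeded"
else
    status_fail=$?                          # non-zero status from ls
    result="listing failed"
fi
echo "$result (status $status_fail)"
```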
As shorthand, you may also see conditionals formatted like this:
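The shorthand most likely meant here is chaining commands with && and ||, sketched below on a scratch directory. The variable names are made up for the example.

```shell
scratch=$(mktemp -d)

# cmd && next : run next only if cmd succeeded (exit status 0).
mkdir "$scratch/results" && made="created"

# cmd || next : run next only if cmd failed (non-zero exit status).
mkdir "$scratch/results" 2>/dev/null || again="already exists"

echo "$made / $again"     # prints: created / already exists
```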
Bash also allows us to run several tests against files and variables with true/false outcomes:
File & Path Tests¶
| Test | Meaning | Example |
|---|---|---|
| -f file | Regular file exists | [[ -f input.txt ]] && echo "input.txt exists" |
| -d dir | Directory exists | [[ -d $SCRATCH ]] && echo "Scratch exists" |
| -e path | Path exists | [[ -e results.out ]] \|\| echo "Missing results" |
| -r file | Readable | [[ -r data.csv ]] && head data.csv |
| -w file | Writable | [[ -w output.log ]] && echo "test" >> output.log |
| -x file | Executable | [[ -x run.sh ]] && ./run.sh |
String Tests¶
String tests are commonly used in shell scripts for validating input and making decisions based on names, paths, or environment variables.
| Test | Meaning | Example |
|---|---|---|
| -z str | String is empty | [[ -z "$1" ]] && echo "No argument given" |
| -n str | String is not empty | [[ -n "$USER" ]] && echo "User is $USER" |
| str1 == str2 | Equal | [[ "$HOSTNAME" == "login01" ]] && echo "On login 1" |
| str1 != str2 | Not equal | [[ "$SHELL" != "/bin/bash" ]] && echo "Not using bash" |
Numeric Comparisons¶
Numeric comparisons can be useful when you want to compare values such as counts, limits, job sizes, or resource requests.
| Operator | Meaning | Example |
|---|---|---|
| -eq | equal | [[ "$N" -eq 16 ]] && echo "Using 16 cores" |
| -ne | not equal | [[ "$TASKS" -ne 1 ]] && echo "Parallel job" |
| -lt | less than | [[ "$N" -lt 4 ]] && echo "Small job" |
| -le | less or equal | [[ "$N" -le 32 ]] && echo "Within node limits" |
| -gt | greater than | [[ "$N" -gt 10 ]] && echo "Large job" |
| -ge | greater or equal | [[ "$MEM" -ge 128 ]] && echo "High-memory job" |
Logical Operators¶
| Operator | Meaning | Example |
|---|---|---|
| && | AND | [[ -f in.txt && -w out.txt ]] && ./process.sh |
| \|\| | OR | [[ -d "$SCRATCH" ]] \|\| mkdir -p "$SCRATCH" |
| ! | NOT | [[ ! -f config.yaml ]] && echo "Missing config" |
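The file, string, and numeric tests can be combined in ordinary if statements. The sketch below uses a hypothetical helper, check_job, and an arbitrary upper limit of 128 cores; neither comes from the material above.

```shell
# Hypothetical helper: validate an input file and a core count.
check_job() {
    local input=$1 cores=$2

    if [[ -z "$input" ]]; then
        echo "No input file given"; return 1
    fi
    if [[ ! -f "$input" || ! -r "$input" ]]; then
        echo "Cannot read $input"; return 1
    fi
    if [[ "$cores" -lt 1 || "$cores" -gt 128 ]]; then
        echo "Core count out of range"; return 1
    fi
    echo "OK: $input with $cores cores"
}

sample=$(mktemp)                            # scratch file standing in for real input
check_job "$sample" 16
check_job "" 16 || echo "(rejected)"
check_job "$sample" 500 || echo "(rejected)"
```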
Loops¶
Bash also implements loops, which can be particularly useful for looping over files or arguments.
You can use command substitution to loop through files:
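A sketch of that pattern, using a scratch directory with three made-up file names:

```shell
# Scratch directory with a few hypothetical data files.
data_dir=$(mktemp -d)
touch "$data_dir/a.txt" "$data_dir/b.txt" "$data_dir/c.txt"

count=0
# $(ls ...) is replaced by the command's output, split into words.
for file in $(ls "$data_dir"); do
    echo "Processing $file"
    count=$((count + 1))
done
echo "$count files"       # prints: 3 files
```

Because the output is split on whitespace, file names that contain spaces would be broken apart; looping over a glob such as "$data_dir"/*.txt avoids that.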
You can loop through file arguments with the $@ variable:
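The sketch below wraps the loop in a hypothetical function, check_files, so it can be tried without writing a separate script; inside a script the same loop runs over the script's own arguments.

```shell
# Hypothetical function; "$@" expands to every argument it receives.
check_files() {
    for file in "$@"; do           # quoted so names with spaces stay intact
        if [[ -f "$file" ]]; then
            echo "found: $file"
        else
            echo "missing: $file"
        fi
    done
}

sample=$(mktemp)                   # scratch file that does exist
# Assumption: /nonexistent-demo-path does not exist on your system.
check_files "$sample" /nonexistent-demo-path
```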
Lastly, you can loop through an array of integers:
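Two common ways to do this are brace expansion and an explicit array, sketched below. Which of the two the original listing used is not known, and the values are made up.

```shell
# Brace expansion produces the integers 1 through 5.
total=0
for i in {1..5}; do
    total=$((total + i))
done
echo "$total"             # prints: 15

# An explicit array of integers works the same way.
sizes=(2 4 8)
labels=""
for n in "${sizes[@]}"; do
    labels="$labels${n}-core "
done
echo "$labels"            # prints: 2-core 4-core 8-core
```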
There are many more aspects of bash that we're not going to talk about here, like while loops, functions, and variable substitution. Before we move on, it's important to note that if a command fails, bash will just continue on by default.
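The default behavior can be observed with the sketch below. The set -e option shown at the end is one way to change it; it is an addition here and is not covered in this section.

```shell
# By default a failing command does not stop the script.
default_run=$(bash -c 'false; echo "still running"')
echo "$default_run"                 # prints: still running

# With set -e, bash exits at the first failing command instead.
strict_status=0
strict_run=$(bash -c 'set -e; false; echo "never printed"') || strict_status=$?
echo "output: '$strict_run', exit status: $strict_status"
```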