Customizing bash to improve reproducibility

August 13, 2020 • PD Schloss • 9 min read

In past episodes of Code Club we haven’t spent much time talking about how to customize your bash environment. Perhaps you’ve already found that you can change the background and font colors using the themes that are built into your terminal software. If you’re going to spend a lot of time looking at the screen, it’s nice to be able to create a pleasant working environment. Besides these aesthetic customizations, we can also customize the behavior of our command line environment to make it easier to work with.

In today’s episode we’ll see how we can modify the hidden .bashrc file that lives in your home directory. First, we’ll customize our command line environment by creating aliases, which are words that stand in for other commands. For example, if there are arguments that you always use when running a command, you can put them into an alias. By default R will prompt you to save your session and will restore any workspace saved from a previous session. Both practices are problematic from a reproducibility standpoint because you may lose track of which variables were created previously. But we can create an alias that always tells R to turn off those features. Second, we’ll see how we can customize our prompt. I’ll share with you the code used to generate a prompt that will tell us the status of our project’s repository. In the exercises, I’ll encourage you to make your own customizations.

An important point to remember in customizing our environment is that we don’t want those customizations to impact our actual analysis. If you get my code and it depends on my customizations, but I don’t give you those customizations, then your version of the analysis will break. For example, imagine I create an alias for sed that is called sed, but runs sed -E. I might write my bash scripts in a way that assumes I always have access to extended regular expressions. Now, you get a hold of my code and try to run those scripts. But it doesn’t work for you. Why? Because you don’t have the same sed alias that I have. We always want to be mindful of how much our bash scripts depend on our aliases.
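One way to guard against this is to write the flag out in the script itself and call the program through the command builtin, which skips any alias or function of the same name. Here is a minimal sketch, assuming the script only needs sed's extended syntax. The mask_digits function and the sample text are made up for illustration; they are not part of the project.

```bash
#!/usr/bin/env bash

# Hypothetical helper: replace each run of digits with a single '#'.
# The -E flag is spelled out here, so the script does not rely on a
# personal alias that turns plain "sed" into "sed -E".
mask_digits() {
  command sed -E 's/[0-9]+/#/g'
}

result="$(printf 'sample12_run345\n' | mask_digits)"
echo "$result"
```

Because the flag lives in the script, anyone who runs it gets the same behavior whether or not they have a sed alias of their own.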

Today’s video won’t do anything with the data from the project that we’ve been working on over the recent episodes. So, even if you’re only watching this video to learn more about customizing your command line environment and don’t know what a 16S rRNA gene is, you should get a lot out of today’s video. Please take the time to follow along on your own computer and attempt the exercises. Don’t worry if you aren’t sure how to solve the exercises. At the end of the video I will provide solutions. If you haven’t been following along but would like to, please check out the notes below where you’ll find instructions on catching up, reference notes, and links to supplemental material. You can find my version of the project on GitHub.

Pat’s fancy command prompts

This code is originally from Karl Broman, who was kind enough to share it with Pat.

# color prompt to include branch information
function color_my_prompt {
  local __user_and_host="\[\033[01;32m\]\u@\h"
  local __cur_location="\[\033[01;34m\]\W"
  local __git_branch='`git branch 2> /dev/null | grep -e ^* | sed -E  s/^\\\\\*\ \(.+\)$/\(\\\\\1\)\ /`'
  local __prompt_tail="\[\033[35m\]$"
  local __last_color="\[\033[00m\]"

  RED="\[\033[0;31m\]"
  YELLOW="\[\033[0;33m\]"
  GREEN="\[\033[0;32m\]"

  # Capture the output of the "git status" command.
  git_status="$(git status 2> /dev/null)"

  # Set color based on clean/staged/dirty.
  if [[ ${git_status} =~ "working directory clean" ]]; then
      state="${GREEN}"
  elif [[ ${git_status} =~ "working tree clean" ]]; then
      state="${GREEN}"
  elif [[ ${git_status} =~ "Changes to be committed" ]]; then
      state="${YELLOW}"
  else
      state="${RED}"
  fi

  export PS1="$__user_and_host $__cur_location ${state}$__git_branch$__prompt_tail$__last_color "
}

# Tell bash to execute this function just before displaying its prompt.
PROMPT_COMMAND=color_my_prompt
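To see how the color gets picked without touching your real prompt, here is a small sketch that pulls the same three checks out into a stand-alone function. The classify_git_status name and the sample status strings are assumptions made for this illustration, and the sketch uses case patterns in place of the [[ =~ ]] tests, although it looks for the same phrases.

```bash
#!/usr/bin/env bash

# Hypothetical helper that mirrors the color choice in color_my_prompt.
# It takes the text of "git status" and prints a color name.
classify_git_status() {
  case "$1" in
    *"working directory clean"*|*"working tree clean"*) echo "green" ;;
    *"Changes to be committed"*) echo "yellow" ;;
    *) echo "red" ;;
  esac
}

clean_color="$(classify_git_status "nothing to commit, working tree clean")"
staged_color="$(classify_git_status "Changes to be committed:")"
no_repo_color="$(classify_git_status "")"

echo "$clean_color $staged_color $no_repo_color"
```

Two things follow from the order of the checks. An empty status, which is what you get outside of a repository because errors are sent to /dev/null, falls through to red. A mix of staged and unstaged changes shows as yellow because the staged message is checked before the final else.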

Important things to remember

source ~/.bash_profile

or

source ~/.bashrc

Alias

Syntax:

alias alias_name="commands"

Example:

alias R="R --no-save --no-restore"
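To check what an alias expands to, or to skip it for a single call, something like the following should work in an interactive bash session. This is only a sketch. The lines that would actually start R or remove the alias are commented out so the snippet can be pasted without side effects.

```bash
# Define the alias (this is the line that would go in ~/.bashrc).
alias R="R --no-save --no-restore"

# Print the current definition of the alias.
alias R

# Either of these runs the real program without the alias:
# \R
# command R

# Remove the alias for the rest of the session:
# unalias R
```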

Customizing the prompt

Specific notes for Mac OS X about .bash files

Specific notes for Linux and Windows 10 about .bash files

Installations

If you haven’t been following along, you can get caught up by doing the following:

Exercises

1. Create an alias for rm that prompts you whenever you are about to delete a file. Normally this would be achieved using rm -i <file_name>. Where could this cause problems with our bash scripts?

alias rm="rm -i"

There are places in our bash scripts where we do some “garbage collection” with rm. We likely don’t want to be asked about deleting stuff if we’re automating it. If we want to keep this alias, then we need to go back and add -f to our rm commands.

2. Create an alias called lsl that runs ls -lth whenever it is used.

alias lsl='ls -lth'

3. The customized prompt outputs my user and computer name (i.e. user and host). To free up some room in the prompt, I’d like to remove that information. How would you modify the color_my_prompt function to do this?

Modify the export PS1 line to be:

export PS1="$__cur_location ${state}$__git_branch$__prompt_tail$__last_color "
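Going back to the first exercise, here is a small sketch of why adding -f helps. With the alias in place, rm -f would expand to rm -i -f, and the later -f takes precedence over the earlier -i, so there is no prompt. The scratch directory and file name are made up for the demonstration.

```bash
#!/usr/bin/env bash

# Work in a throwaway directory so nothing real is deleted.
scratch_dir="$(mktemp -d)"
scratch_file="$scratch_dir/temporary_output.txt"
touch "$scratch_file"

# This is what "rm -f" turns into when rm is aliased to "rm -i".
# The -f that comes later overrides -i, so no question is asked.
rm -i -f "$scratch_file"

# Clean up the now-empty directory.
rmdir "$scratch_dir"
```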