Browsing files
The general syntax is: ls [options] [files]. Both the options and the files are optional, and files can be files or directories. Here are some of the options:
Option | Description |
---|---|
-a | Also show hidden files |
-l | Long format: one file per line, with size, owner, date… |
-h | Used with -l, displays file sizes in human-readable format (e.g. 2.2M instead of 2298011) |
-d | Show directories as files, without listing their content |
The options can be combined together, and the following two commands are identical:
ls -l -h -a
ls -lha
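To see what each option changes, here is a minimal sketch that works in a throwaway directory. The names used (the sandbox variable, visible.txt, .hidden, subdir) are made up for illustration, and mktemp -d is used only to get an empty directory to play in.

```shell
# Work in an empty temporary directory (mktemp -d creates one and prints its path)
sandbox=$(mktemp -d)
cd "$sandbox"
touch visible.txt .hidden
mkdir subdir
touch subdir/inside.txt

ls              # subdir and visible.txt only
ls -a           # also .hidden, plus the special entries . and ..
ls subdir       # the content of subdir: inside.txt
ls -d subdir    # the directory itself: subdir
ls -lh          # long format with human-readable sizes
```

Note that -d is what we want when we are interested in the directory itself (for example its owner and date with -ld) rather than in what it contains.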
If we want to list the files present at the root, we don't need to move there: we can simply tell ls which path to scan:
ls /
Here is another example:
ls /homes/qi/tutorial/
You can type as many paths (files or directories) as needed in a single ls command:
ls -l ~/.bashrc ~/.screenrc /homes/qi/tutorial/
As we saw, ls can receive more than one path. Usually, though, we don't type every single item to be listed: we use wildcards instead, and the shell expands them into a list of paths before running the command. Wildcards, ranges and lists are available:
Symbol | Meaning | Example |
---|---|---|
* | Any set of characters (any length) | *.fasta : all files ending with “.fasta” |
? | A single character | A???.txt : files starting with A, followed by exactly 3 characters, ending with “.txt” |
[a-z] | Range: any single lowercase letter | file1[a-c].txt : file1a.txt, file1b.txt and file1c.txt |
[0-9] | Range: any single digit | reads_R[1-2].fastq : reads_R1.fastq and reads_R2.fastq |
{a,b} | Comma-separated list of words | photo_{andrea,john}.jpg : photo_andrea.jpg and photo_john.jpg |
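A handy way to check what a pattern expands to is to pass it to echo, which simply prints the arguments it receives. The sketch below assumes bash and uses made-up file names in a temporary directory.

```shell
# Work in an empty temporary directory with a few test files (names are made up)
sandbox=$(mktemp -d)
cd "$sandbox"
touch genome.fasta A123.txt A12.txt file1a.txt file1d.txt
touch reads_R1.fastq reads_R2.fastq reads_R3.fastq

# echo prints the list of names the shell produced from each pattern
echo *.fasta               # genome.fasta
echo A???.txt              # A123.txt (A12.txt has only two characters after the A)
echo file1[a-c].txt        # file1a.txt (file1d.txt is outside the range)
echo reads_R[1-2].fastq    # reads_R1.fastq reads_R2.fastq
# A brace list is expanded by bash even if no such file exists
echo photo_{andrea,john}.jpg   # photo_andrea.jpg photo_john.jpg
```

The last line shows a difference worth remembering: *, ? and [ ] only match existing files, while a { } list always produces every word in the list.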
This course comes with a structure of directories and test files. To download it you will need an Internet connection on the machine you are logged into (so if you use a cluster you might need to go to a net-enabled node).
cd
git clone https://github.com/telatin/learn_bash
This command will download the latest version of "learn_bash". Since we first used the cd command to return to our home directory, we should now have a ~/learn_bash/ directory in our account.
We should be in our home directory. Check with pwd.
To enter the new directory, type (remember the TAB):
cd learn_bash
Now, using cd and ls try figuring out:
- How many directories are inside the examples directory
- The content of each directory
Create a directory called copies inside the examples directory (~/learn_bash). There are several ways. If you are already inside it:
mkdir copies
Otherwise, you have to supply the proper relative or absolute path (e.g. with the absolute path: mkdir ~/learn_bash/copies).
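The three ways of pointing at the same place can be sketched as follows. This is only an illustration in a temporary directory: the sandbox variable and the copies_a, copies_b and copies_c names are made up, and the examples directory here is an empty stand-in for the real one.

```shell
# Work in an empty temporary directory (all names below are made up)
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir examples

# 1. From inside the target directory, the bare name is enough
cd examples
mkdir copies_a
cd ..

# 2. From the parent directory, use a relative path
mkdir examples/copies_b

# 3. From anywhere, use the absolute path (pwd prints the current one)
pwd
mkdir "$sandbox/examples/copies_c"

ls examples    # copies_a copies_b copies_c
```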
Let's try again to copy some files. In particular, we want a selection of files inside the phage directory:
# If we are not inside the examples directory:
cd ~/learn_bash/
# Copy some files
cp -v phage/*.f?? copies/
In this case, we use a new switch, -v (verbose), that prints each file as it is copied (useful when we want to see the progress). Using both the * and ? wildcards we select all the files whose extension has three characters, the first being “f” (e.g. fna, faa).
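Before copying with a pattern, it can help to preview what the pattern selects. Here is a minimal sketch in a temporary directory; the vir.* file names are invented for the example and are not the real content of the phage directory.

```shell
# Work in an empty temporary directory with a mock phage directory
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir phage copies
touch phage/vir.fna phage/vir.faa phage/vir.gff phage/vir.fasta

# Preview the expansion first: echo shows what cp would receive
echo phage/*.f??

# Only vir.faa and vir.fna match: .gff does not start with f,
# and .fasta has more than three characters
cp -v phage/*.f?? copies/

ls copies
```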
In bash, any text typed after a # is ignored. I will use this feature to explain some commands, like:
# The following line will list the files in your home
ls -l ~
The find command can print all the files from a starting path, including directories and subdirectories.
Some examples:
# Print all files and directories in my home
find ~
# Print all files and directories in a specific path
find /usr/lib/ssl
# Print only directories / files
find ~ -type d
find ~ -type f
# Print files in my home with a specific extension
find ~ -name "*.txt"
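Since the home directory can contain thousands of files, the same commands are easier to follow on a tiny directory tree. The following sketch builds one in a temporary directory; the project, data and file names are hypothetical.

```shell
# Work in an empty temporary directory and build a small tree (names are made up)
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir project project/data
touch project/notes.txt project/data/table.csv project/data/readme.txt

find project              # 5 lines: 2 directories and 3 files
find project -type d      # project and project/data
find project -type f      # the three files
# Quote the pattern so that find, not the shell, interprets the wildcard
find project -name "*.txt"   # project/notes.txt and project/data/readme.txt
```

Unlike ls, find descends into every subdirectory on its own, which is why readme.txt is found even though it sits one level deeper.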
The simplest command to view a file is cat (concatenate), which prints the content of one or more files. Example:
cat ~/learn_bash/files/wine.csv
- Can you type it using a relative path?
When a file is huge, it's very convenient to have a look at a fraction of it. The commands head and tail print only the first (or last) lines of a file. By default they print 10 lines, but you can change this with -n:
head ~/learn_bash/files/wine.csv
head -n 3 ~/learn_bash/files/wine.csv
tail -n 5 ~/learn_bash/files/wine.csv
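To make it obvious which lines are selected, we can try the same commands on a file whose content is just the line numbers. In this sketch, seq and the > symbol are used only to build such a 20-line test file, and numbers.txt is a made-up name.

```shell
# Work in an empty temporary directory
sandbox=$(mktemp -d)
cd "$sandbox"
# Build a 20-line file containing the numbers 1 to 20, one per line
seq 1 20 > numbers.txt

head numbers.txt        # lines 1 to 10 (the default)
head -n 3 numbers.txt   # lines 1 to 3
tail -n 5 numbers.txt   # lines 16 to 20
```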
Do you remember man? Good, because we can now use a new command to view text files interactively, and it behaves like man:
# Run it, then press 'q' to exit:
less ~/learn_bash/phage/vir_genomic.gff
# To disable line wrapping and see the lines clearly:
less -S ~/learn_bash/phage/vir_genomic.gff
Counting the number of lines of a file is a common task. The wc (word count) command can do this, and something more.
# Count lines, words, characters of a file:
wc ~/learn_bash/files/introduction.txt
# Count only lines:
wc -l ~/learn_bash/files/introduction.txt
# Also on multiple files
wc -l ~/learn_bash/phage/*.*
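The numbers printed by wc are easier to read on files small enough to count by hand. A minimal sketch, using two made-up files written with printf in a temporary directory:

```shell
# Work in an empty temporary directory with two tiny files (names are made up)
sandbox=$(mktemp -d)
cd "$sandbox"
printf 'one two three\nfour five\n' > short.txt
printf 'a\nb\nc\n' > letters.txt

wc short.txt       # 2 lines, 5 words, 24 characters
wc -l short.txt    # 2
# With several files wc prints one line per file, then a "total" line
wc -l *.txt        # 3 for letters.txt, 2 for short.txt, 5 in total
```

The character count includes the invisible newline at the end of each line, which is why short.txt counts 24 and not 22.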
grep is a powerful command to extract lines containing a pattern. The simplest use is “grep wordtosearch file”:
grep ">" ~/learn_bash/phage/vir_protein.faa
In this case, the word we looked for is simply the > character, that is, we extracted all the lines containing it.
We are not going to expand this, but you can perform complex searches using a language called regular expressions.
Some switches: -c to count the number of matching lines, -i to perform a case insensitive search, -v to print the lines not containing the pattern.
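The three switches can be tried on a tiny FASTA-like file. The file below (demo.faa, with three invented records) is only a stand-in written with printf, not one of the course files.

```shell
# Work in an empty temporary directory
sandbox=$(mktemp -d)
cd "$sandbox"
# A tiny FASTA-like file: header lines start with ">"
printf '>seq1 Capsid protein\nMKTAYIAK\n>seq2 capsid fragment\nMKLV\n>seq3 Tail fiber\nMSDE\n' > demo.faa

grep ">" demo.faa           # the three header lines
grep -c ">" demo.faa        # 3: the number of matching lines
grep -i "capsid" demo.faa   # the seq1 and seq2 headers, ignoring case
grep -v ">" demo.faa        # the three sequence lines
```

The quotes around > matter: without them the shell would give the character its own special meaning instead of passing it to grep as the pattern.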
See Presentation on regular expressions for grep
So far, every command we issued gave us some text lines that we inspected, but we never saved them for long term storage. Consider the following command:
find ~/learn_bash -type d
If we want to save the output in a new file, the shell offers us a redirection symbol:
find ~/learn_bash -type d > ~/learn_bash/directories.txt
With this command, we created a new file called ~/learn_bash/directories.txt, where the output of find was stored.
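One detail to keep in mind is that > creates the file if needed, but silently replaces its content if the file already exists. A small sketch in a temporary directory (alpha, beta and directories.txt are made-up names):

```shell
# Work in an empty temporary directory with two subdirectories
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir alpha beta

# Nothing is printed on screen: the output goes into the file
find . -type d > directories.txt

# The file now holds what find would have printed
cat directories.txt

# Redirecting again to the same file replaces its previous content
ls -d alpha > directories.txt
cat directories.txt    # only: alpha
```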
Our commands print two types of text.
We explained the behaviour of most commands as a set of characters printed on our screen. This is a simplification: the characters printed can be either "real output" or user "messages" (technically called standard output and standard error). The '>' sign will redirect the standard output (or STDOUT), but sometimes we are interested in the standard error (or STDERR). Try:
ls -l ~/.bashrc ~/.404
What do you notice?
ls -l ~/.bashrc ~/.404 2> ~/ls_demo.err
Now you know how to redirect the standard error (i.e. using 2>).
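Both redirections can be used in the same command, sending each stream to its own file. The following is a sketch with made-up names (present.txt, missing.txt, ls.out, ls.err) in a temporary directory.

```shell
# Work in an empty temporary directory (all names below are made up)
sandbox=$(mktemp -d)
cd "$sandbox"
touch present.txt

# present.txt exists, missing.txt does not.
# The listing (STDOUT) goes to ls.out, the error message (STDERR) to ls.err.
# "|| true" only keeps a script going, since ls reports the failure
# with a non-zero exit status.
ls -l present.txt missing.txt > ls.out 2> ls.err || true

cat ls.out   # the line describing present.txt
cat ls.err   # the complaint about missing.txt
```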
Let's make a real-world example: when we align short reads against a reference we expect the output to be the alignments (in SAM/BAM format), but the program may also want to print some information for the user (e.g. alignment progress, how many unmapped reads…), and it will use the standard error for that.
Go to your home directory.
Try counting the lines of two files of your choice inside your home, plus /etc/passwd.
Now count the lines of /etc/passwd, but using a relative path!
Go to the ~/learn_bash/scripts/ directory and try to list the files included in ~/examples/scripts/files, using a relative path.
Finally, still from the ~/learn_bash/scripts/ directory, save into a file called phage_files_lines.txt, placed inside your home, the number of lines of each file inside the examples/phage directory. Use only relative paths.
On our training server I installed a library that colorizes the STDERR (in yellow), leaving the STDOUT stream unchanged. To enable it:
source /homes/lib/load_stderred
Try now:
ls -l ~/.bashrc ~/.404
· Bioinformatics at the Command Line - Andrea Telatin, 2017-2020