Short 2
The cat command (for concatenate) will print the content of one (or more) text file(s).
Let's start by locating some ".txt" files. First, we can move to the "course" directory we created in our home:
cd ~/course
From there, we can use find to locate files under a specific path (here the learn_bash directory, also in our home), filtering filenames with criteria such as wildcards:
find ../learn_bash/ -name "*.txt"
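To see how the wildcard filter behaves without touching the course files, here is a small sketch that builds a throwaway directory and searches it. The file names (notes.txt, data.csv, sub/readme.txt) are made up for illustration and are not part of learn_bash:

```shell
# Build a scratch directory with a few hypothetical files
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/notes.txt" "$demo_dir/data.csv" "$demo_dir/sub/readme.txt"

# -name compares each file name against the wildcard pattern.
# The quotes stop the shell from expanding *.txt before find sees it.
txt_files=$(find "$demo_dir" -type f -name "*.txt" | sort)
echo "$txt_files"

# Count the matches (tr removes the padding some wc versions add)
txt_count=$(find "$demo_dir" -type f -name "*.txt" | wc -l | tr -d ' ')
echo "Found $txt_count .txt files"
```

Note that find descends into subdirectories, so sub/readme.txt is listed as well, while data.csv is filtered out.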
Choose one of those files and try cat. For example:
cat ../learn_bash/files/introduction.txt
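Since cat accepts more than one file, a quick way to see the "concatenate" part in action is to pass it two files. This is only a sketch: a.txt and b.txt are hypothetical files created on the spot in a temporary directory:

```shell
# Two small hypothetical files in a scratch directory
tmp_dir=$(mktemp -d)
printf 'first file\n' > "$tmp_dir/a.txt"
printf 'second file\n' > "$tmp_dir/b.txt"

# cat prints the files one after the other, in the order given
combined=$(cat "$tmp_dir/a.txt" "$tmp_dir/b.txt")
echo "$combined"
```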
If the file is long, we don't want to flood our terminal with the whole thing. Often a preview of the first (or last) lines is enough. The head and tail commands do this: they print 10 lines by default, and -n NUMBER changes that:
head ~/learn_bash/files/introduction.txt
tail -n 3 ~/learn_bash/files/introduction.txt
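The effect of the default and of -n is easier to check on a file whose content is predictable. The sketch below assumes nothing about the course files: it generates a hypothetical 20-line file with seq, where each line holds its own line number:

```shell
# A hypothetical 20-line file: line 1 holds "1", line 20 holds "20"
numbers_file=$(mktemp)
seq 1 20 > "$numbers_file"

first_default=$(head "$numbers_file")      # no -n: first 10 lines
first_three=$(head -n 3 "$numbers_file")   # lines 1 to 3
last_two=$(tail -n 2 "$numbers_file")      # lines 19 and 20

echo "$first_three"
echo "$last_two"
```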
The grep command will only print lines matching a pattern. For example:
grep Darwin ~/learn_bash/files/introduction.txt
will print one line, while:
grep Newton ~/learn_bash/files/introduction.txt
will print nothing, because no line matches.
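The same behaviour can be reproduced on a tiny file of our own. In this sketch the three lines of text are invented for the example (they are not the content of introduction.txt); it also shows -c, which counts matching lines, and -q, which prints nothing and only sets the exit status:

```shell
# A hypothetical three-line file
text_file=$(mktemp)
printf 'Darwin wrote about finches\nMendel studied peas\nDarwin sailed on the Beagle\n' > "$text_file"

# Print only the lines containing the pattern
darwin_lines=$(grep Darwin "$text_file")
echo "$darwin_lines"

# -c prints the number of matching lines instead of the lines
darwin_count=$(grep -c Darwin "$text_file")

# With no match grep prints nothing and returns a non-zero exit status,
# which makes it usable as a condition
if grep -q Newton "$text_file"; then
    newton_found=yes
else
    newton_found=no
fi
echo "Darwin lines: $darwin_count, Newton found: $newton_found"
```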
The wc command reports how many lines, words and bytes a text file contains (for plain ASCII text, bytes and characters are the same):
wc ~/learn_bash/files/introduction.txt
To only print the number of lines:
wc -l ~/learn_bash/files/introduction.txt
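Each count has its own switch: -l for lines, -w for words and -c for bytes. A minimal sketch on a hypothetical two-line file, reading from standard input so that wc does not print the file name next to the number:

```shell
# A hypothetical two-line file
sample=$(mktemp)
printf 'one two three\nfour five\n' > "$sample"

# Redirecting with < makes wc omit the file name;
# tr removes the padding some wc versions add
line_count=$(wc -l < "$sample" | tr -d ' ')
word_count=$(wc -w < "$sample" | tr -d ' ')
byte_count=$(wc -c < "$sample" | tr -d ' ')

echo "$line_count lines, $word_count words, $byte_count bytes"
# prints: 2 lines, 5 words, 24 bytes
```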
Try this exercise alone:
- Locate the files in the learn_bash directory whose names end with "faa" (FASTA amino acid sequences)
- Use cat to print the content of one of those files
- Use wc to count its lines
Finally, to extract just the sequence headers of a FASTA file (the lines that start with ">"):
grep '>' ~/learn_bash/phage/vir_cds_from_genomic.fna
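Because every FASTA record has exactly one header line, counting the headers also counts the sequences. The sketch below uses a made-up two-record FASTA file (not the phage data) and anchors the pattern with ^ so that only a ">" at the start of a line matches:

```shell
# A hypothetical FASTA file with two records
fasta=$(mktemp)
printf '>seq1 hypothetical protein\nATGGCC\nTTGA\n>seq2 capsid protein\nATGAAATAG\n' > "$fasta"

# Single quotes keep the shell from treating > as a redirection
headers=$(grep '^>' "$fasta")
echo "$headers"

# -c counts the header lines, i.e. the number of sequences
seq_count=$(grep -c '^>' "$fasta")
echo "Sequences: $seq_count"
```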
The less command opens a text file in an interactive viewer and behaves like the man command we used before (scroll with the arrow keys, press q to quit). Try:
less ~/learn_bash/phage/vir_genomic.gff
If you want to avoid word-wrap and prefer to keep the lines intact:
less -S ~/learn_bash/phage/vir_genomic.gff
If you also want wider tab stops (here one every 20 columns), so that tab-separated columns line up:
less -S -x 20 ~/learn_bash/phage/vir_genomic.gff
· Bioinformatics at the Command Line - Andrea Telatin, 2017-2020