SDC Tools Module Sprint 2
illicitonion committed Dec 5, 2024
1 parent 0663f07 commit c0b34be
Showing 12 changed files with 265 additions and 30 deletions.
50 changes: 50 additions & 0 deletions common-content/en/module/tools/grep-in-pipelines/index.md
@@ -0,0 +1,50 @@
+++
title = "grep in pipelines"
headless = true
time = 20
facilitation = false
emoji= "💻"
[objectives]
1="List the files in a directory which contain an upper-case letter in their name with ls and grep."
2="Count the number of files in a directory which contain an upper-case letter in their name with ls, grep, and wc."
3="Explain why we don't need to pass -1 to ls when piping its output."
+++

We've already used `grep` to search for text in files using regular expressions.

We can also pipe other commands' output to `grep` to search the output the same way.

For example, we can write:

```console
% ls -1
report-draft
report-final
report-final-2
report-version-1
report-version-1.1
report-version-2
% ls -1 | grep -v '[0-9]'
report-draft
report-final
```

The original `ls -1` command showed us all the files in the current directory.

By piping this to `grep -v '[0-9]'`, we filter the output down to just the files whose names don't contain any digits.

`grep` operates on lines, and `ls -1` outputs one file per line, so `grep` tests each file one at a time.
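The same idea covers the upper-case objectives for this page. The sketch below is one possible approach, not the only one: the file names are made up for illustration, and `LC_ALL=C` is set as an assumption so that `[A-Z]` matches only ASCII upper-case letters regardless of the machine's locale.

```shell
# Make sorting and character ranges behave the same on every machine.
export LC_ALL=C

# Create a scratch directory with a few hypothetical file names.
workdir=$(mktemp -d)
touch "$workdir/README" "$workdir/notes.txt" "$workdir/Makefile" "$workdir/todo"

# Keep only the names that contain at least one upper-case letter.
upper_names=$(ls "$workdir" | grep '[A-Z]')
echo "$upper_names"

# wc -l counts the lines grep let through, which is one per matching file.
# tr -d ' ' strips the padding some versions of wc add around the number.
upper_count=$(ls "$workdir" | grep '[A-Z]' | wc -l | tr -d ' ')
echo "$upper_count"

# Tidy up the scratch directory.
rm -rf "$workdir"
```

Here `grep` keeps `Makefile` and `README`, and `wc -l` reports `2`.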

### `ls` vs `ls -1`

In our terminal, when we run `ls -1`, we get one file output per line. But if we run `ls` in our terminal, we get the files on one line, separated by spaces.

We know that `grep` operates on individual lines, so it may seem like `ls | grep` would have a problem - `ls` prints more than one file per line.

But `ls` behaves specially. It detects whether it's outputting to a terminal, or a pipe, and acts differently:
* If it's outputting to a pipe, it outputs one file per line.
* If it's outputting to a terminal, it tries to be useful and take up less space by putting several files on each line. Passing `-1` _forces_ `ls` to output one file per line.

So we can write `ls | grep -v '[0-9]'` - we don't need to pass `-1` to `ls`.

It's good to know that some programs behave differently depending on whether they are outputting to a terminal or a pipe.
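We can make a similar check ourselves in a shell script. This is a minimal sketch: `describe_stdout` is a hypothetical helper written for this example, and it only illustrates the kind of decision `ls` makes rather than how `ls` is implemented.

```shell
# describe_stdout is a hypothetical helper, not part of ls.
# [ -t 1 ] succeeds only when file descriptor 1 (stdout) is a terminal.
describe_stdout() {
  if [ -t 1 ]; then
    echo "terminal"
  else
    echo "not a terminal"
  fi
}

# Inside a pipeline, stdout is a pipe, so the helper takes the second branch.
piped=$(describe_stdout | cat)
echo "$piped"

# ls makes the same kind of decision: piped output is one name per line,
# so wc -l counts the files even though we never passed -1.
workdir=$(mktemp -d)
touch "$workdir/a" "$workdir/b" "$workdir/c"
file_lines=$(ls "$workdir" | wc -l | tr -d ' ')
echo "$file_lines"
rm -rf "$workdir"
```

Running `describe_stdout` directly in a terminal, with no pipe, would print `terminal` instead.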
35 changes: 35 additions & 0 deletions common-content/en/module/tools/head-and-tail/index.md
@@ -0,0 +1,35 @@
+++
title = "head and tail"
headless = true
time = 20
facilitation = false
emoji= "💻"
[objectives]
1="Output the first 5 lines of a file using head"
2="Output the last 5 lines of a file using tail"
3="Output the five lines starting 10 lines into a file using head and tail"
+++

`head` outputs lines (or bytes) from the start of a file. `tail` outputs lines (or bytes) from the end of a file.

![Julia Evans' comic about head and tail](https://wizardzines.com/images/uploads/head-tail.png)

(Source, including text-only transcript: https://wizardzines.com/comics/head-tail/)

Learn about `head` and `tail` from their man pages (and the backlog exercises).

Imagine we have an input file which has 100 lines.

{{<multiple-choice
delimiter="~"
question="What would the command `head -n 8 input` output?"
answers="The first 8 bytes of the file ~ The last 8 lines of the file. ~ The first 8 lines of the file."
feedback="Not quite - are you confusing -n and -c? ~ Not quite - are you confusing head and tail? ~ Right! -n takes a number of lines to output, and head goes from the start of the file."
correct="2" >}}

{{<multiple-choice
delimiter="~"
question="What command could we write to skip the first three lines of the file, and then output the next 2 lines?"
answers="head -n3 input | tail -n2 ~ tail -n+4 input | head -n2 ~ tail -n+3 input | head -n2"
feedback="No - remember each stage in a pipeline applies to the output of the previous stage, not the original file. ~ Right - tail skips the first few lines, then head takes just a few from the top of that output. ~ Not quite - how many lines does this skip?"
correct="1" >}}
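To experiment with these commands, it can help to use a small file where each line's content is its own line number. The following is a sketch under that assumption, using a made-up 10-line sample file rather than the 100-line file from the quizzes.

```shell
# Build a 10-line sample file: line N contains the number N.
sample=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$sample"

# The first three lines of the file.
first_three=$(head -n 3 "$sample")

# The last two lines of the file.
last_two=$(tail -n 2 "$sample")

# tail -n +6 starts its output at line 6 (skipping five lines),
# then head keeps just the first three lines of what is left.
middle=$(tail -n +6 "$sample" | head -n 3)

printf '%s\n' "$first_three" "$last_two" "$middle"
rm -f "$sample"
```

The three variables hold lines 1 to 3, lines 9 to 10, and lines 6 to 8 respectively.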
14 changes: 12 additions & 2 deletions common-content/en/module/tools/jq/index.md
@@ -1,11 +1,21 @@
+++
title = "jq"
headless = true
time = 30
time = 90
facilitation = false
emoji= "❓"
[objectives]
1="Use jq to retrieve information from a JSON file"
+++

### jq
All of the tools we've seen so far operate on lines, words, or characters.

Often these are the formats we have. And indeed, often we decide to have our programs output in these formats to make the output easy to process with tools.

But there are more complex formats it can be useful to process too. You have already used JSON in the course. It is a text format that allows us to represent arrays, objects, strings, numbers, and more.

`jq` is a tool for processing JSON without having to write a whole program. This can be really useful to quickly analyse some data.

Read [Earthly's introduction to `jq`](https://earthly.dev/blog/jq-select/).

Practice using `jq` with the relevant backlog exercises.
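As a first taste, here is a minimal sketch of two common `jq` filters. The JSON data is invented for this example, and the script assumes `jq` may not be installed, so it checks before using it.

```shell
# A small hypothetical JSON document: an array of objects.
json='[{"name":"pigs","count":10},{"name":"goats","count":3},{"name":"hamsters","count":300}]'

if command -v jq >/dev/null 2>&1; then
  # .[] iterates over the array and .name picks one field from each object.
  # -r prints raw strings, without the surrounding JSON quotes.
  names=$(printf '%s' "$json" | jq -r '.[].name')

  # select(...) keeps only the objects for which the condition is true.
  big=$(printf '%s' "$json" | jq -r '.[] | select(.count > 5) | .name')

  echo "$names"
  echo "$big"
else
  echo "jq is not installed" >&2
fi
```

Note that `jq` filters use `|` in the same spirit as shell pipelines: each stage receives the output of the stage before it.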
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
+++
title = "Programming language concepts"
headless = true
time = 120
facilitation = false
emoji= "📖"
[objectives]
1="Describe what a variable is."
2="Describe how a variable relates to a memory location."
3="Identify whether variables have fixed types in C, Python, and JavaScript."
4="Explain how the next memory location is found when declaring a local variable on the stack."
5="Explain why some variables are allocated on the heap not the stack."
6="Explain when memory used for a variable on the stack is released."
7="Explain when memory used for a variable on the heap is released."
8="Define an operator."
9="Give examples of common operators."
10="Explain the difference between integer division and floating point division."
11="Describe the meaning of the &, |, ^, and ~ bitwise operators."
12="Manually perform the function of the &, |, ^, and ~ bitwise operators on two integers."
13="Describe the meaning of the && (and), || (or), and ! (not) operators."
14="Explain when it's more appropriate to use a while loop or a for loop."
15="Identify and explain the differences between a function definition in C and Python."
16="Explain what happens when you call a function."
17="Explain what a class is."
18="Describe the relationship between an object and a class."
19="Compare compiled and interpreted languages."
20="Explain one advantage of compiled languages, and one advantage of interpreted languages."
+++

Read chapter 9 of How Computers Work.

Do every exercise listed in the chapter.

You can skip the projects (though you're welcome to try any of them if you have time!).

Check you have achieved each learning objective listed on this page.

This file was deleted.

37 changes: 31 additions & 6 deletions common-content/en/module/tools/shell-pipelines/index.md
@@ -5,11 +5,36 @@ time = 30
facilitation = false
emoji= "💻"
[objectives]
1="Count the occurrences of different lines within a file using sort and uniq"
2="List the files in a directory which contain an upper-case letter in their name with ls and grep"
3="Count the number of files in a directory which contain an upper-case letter in their name with ls, grep, and wc"
4="Replace all occurrences of one character with another using tr"
5="TODO: Pipe the output of jq to something interesting"
1="Describe what a shell pipeline is."
2="Explain why we use shell pipelines."
3="Explain the difference between stdout and stderr."
4="Explain what gets passed between two programs when they're combined with a |."
+++

### Shell tools
Read the learning objectives listed on this page. Bear in mind what you're trying to achieve while reading this text. If a topic isn't making much sense, and isn't in the objectives, you can probably skip over it. If a topic is listed in the objectives, you should keep studying it until you are confident you've met the objective.

Read [Thinking in Pipelines from Effective Shell](https://effective-shell.com/part-2-core-skills/thinking-in-pipelines/).

Key take-aways:
* Most programs take some input, and produce some output.
* Instead of writing one program to do exactly what we want, we can often combine existing programs.
* Programs read from stdin, and write to stdout and stderr.
* They write their main output to stdout.
* They write error messages, progress messages, and other information that isn't their main output to stderr.
* We can pass information between programs using a pipe (`|`).
* We can write the output of programs to a file using `>`, or append to a file using `>>`.
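The take-aways above can be sketched in a few lines of shell. This is an illustration only: `report` is a hypothetical stand-in for a real program, and the file names are made up.

```shell
# report is a hypothetical program: it writes its main output to stdout
# and a progress message to stderr.
report() {
  echo "result line 1"
  echo "result line 2"
  echo "processing finished" >&2
}

scratch=$(mktemp -d)

# The pipe carries only stdout to wc; 2> sends stderr to a file instead.
stdout_lines=$(report 2>"$scratch/errors.txt" | wc -l | tr -d ' ')
stderr_text=$(cat "$scratch/errors.txt")

# > overwrites the file and >> appends to it, so out.txt ends up with 4 lines.
report > "$scratch/out.txt" 2>/dev/null
report >> "$scratch/out.txt" 2>/dev/null
file_lines=$(wc -l < "$scratch/out.txt" | tr -d ' ')

echo "stdout lines through the pipe: $stdout_lines"
echo "captured stderr: $stderr_text"
echo "lines in out.txt: $file_lines"
rm -rf "$scratch"
```

Only the two result lines travel through the pipe to `wc`; the progress message never reaches it.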

{{<multiple-choice
question="If /doesnotexist doesn't exist, what will be output to stdout and stderr by the command `ls /doesnotexist`?"
answers="stdout: an error message. stderr: nothing. | stdout: /doesnotexist. stderr: an error message. | stdout: nothing. stderr: an error message."
feedback="Not quite - what are stdout and stderr for? | Not quite - ls only lists files that exist. | Right - ls doesn't have any files to list as output, but does have an error to display."
correct="2" >}}

{{<multiple-choice
delimiter="~"
question="If the working directory contains the files: 'primates', 'fish', and 'monotremes', what will `ls | sort | grep i | wc -l` output?"
answers="fish monotremes primates ~ 2 ~ fish primates ~ 3"
feedback="Not quite - `ls | sort` would output this, but there are more commands in the pipeline. ~ Right! We list three files, sort them, search for ones that contain an i (fish and primates), then count the number of output lines (one per file). ~ Not quite - `ls | sort | grep i` would output this, but there's one more command in the pipeline. ~ Not quite - check what the grep command in the pipeline does."
correct="1" >}}

Next we will learn about some programs commonly used in pipelines.
50 changes: 50 additions & 0 deletions common-content/en/module/tools/sort-and-uniq/index.md
@@ -0,0 +1,50 @@
+++
title = "sort and uniq"
headless = true
time = 20
facilitation = false
emoji= "💻"
[objectives]
1="Count the occurrences of different lines within a file using sort and uniq"
+++

`sort` sorts its input. `uniq` deduplicates adjacent matching lines.

![Julia Evans' comic about sort and uniq](https://wizardzines.com/images/uploads/sort-uniq.png)

(Source, including text-only transcript: https://wizardzines.com/comics/sort-uniq/)

Learn about `sort` and `uniq` from their man pages (and the backlog exercises).

We often pipe to `sort | uniq` rather than just `uniq`, so that duplicate lines are next to each other before they reach `uniq`.
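A short experiment shows the difference. This is a sketch using made-up input in which no two identical lines are adjacent; `awk` is used only to tidy the padding that `uniq -c` puts in front of each count.

```shell
# Six lines of hypothetical input. No duplicates are next to each other.
fruit=$(printf '%s\n' apple banana apple cherry banana apple)

# uniq alone only removes adjacent duplicates, so all 6 lines survive.
alone=$(printf '%s\n' "$fruit" | uniq | wc -l | tr -d ' ')

# Sorting first puts duplicates side by side, so uniq leaves 3 lines.
sorted=$(printf '%s\n' "$fruit" | sort | uniq | wc -l | tr -d ' ')

# uniq -c prefixes each line with the number of times it occurred.
counts=$(printf '%s\n' "$fruit" | sort | uniq -c | awk '{print $1, $2}')

echo "uniq alone: $alone lines"
echo "sort | uniq: $sorted lines"
echo "$counts"
```

The counts come out as `3 apple`, `2 banana`, and `1 cherry`.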

For the quizzes below, consider this input file:
```console
% cat input
pigs 10
chickens 2
pigs 10
goats 3
hamsters 300
```

{{<multiple-choice
delimiter="~"
question="What command would output the lines of the file sorted alphabetically?"
answers="sort input ~ sort -u input ~ sort input | uniq"
feedback="Right - sort sorts the file. ~ Not quite - what does -u do? ~ Not quite - what does piping to uniq do?"
correct="0" >}}

{{<multiple-choice
delimiter="~"
question="What command would output the lines of the file sorted by the number after the first space, starting with hamsters 300?"
answers="sort -k1 input ~ sort -k2 input ~ sort -k2 -r -n input ~ sort -k2 -n input"
feedback="Not quite - check what -k1 does. ~ Not quite - look at the difference between alphabetical sorting and numerical sorting. ~ Right! We need to select the right field, sort numerically, and reverse the order to go biggest to smallest. ~ Close, but what order will things be sorted?"
correct="2" >}}

{{<multiple-choice
delimiter="~"
question="What would the command `awk '{print $1}' input | sort | uniq -c | sort -rn` output?"
answers="The names of each animal, sorted by which has the biggest number in their line. ~ A list of each unique animal in the file, sorted by which is on the most lines. ~ A list of each animal, sorted alphabetically, adding together the numbers that came after them if there were duplicates."
feedback="Not quite - look at the order the commands are being run in the pipeline. ~ Right! We take just the animal names, then sort them so that uniq will work, then ask uniq to count how many of each it saw, and then sort by how many uniq counted. ~ Not quite - look at the order the commands are running in the pipeline."
correct="1" >}}
27 changes: 27 additions & 0 deletions common-content/en/module/tools/tr/index.md
@@ -0,0 +1,27 @@
+++
title = "tr"
headless = true
time = 20
facilitation = false
emoji= "💻"
[objectives]
1="Replace all occurrences of one character with another using tr"
+++

`tr` translates (replaces) characters.

Learn about `tr` from its man page (and the backlog exercises).

{{<multiple-choice
delimiter="~"
question="What would the command `echo 'hello' | tr 'eo' 'yz'` output?"
answers="hello ~ hyllz ~ hyzllyz"
feedback="Not quite - check how multiple characters in a string are interpreted. ~ Right! Multiple characters in the first argument means look for any of them. ~ Not quite - check how multiple characters in a string are interpreted."
correct="1" >}}

{{<multiple-choice
delimiter="~"
question="What command could we write to delete all of the vowels from the input?"
answers="tr -d 'aeiou' ~ tr 'aeiou' '' ~ tr -d '[aeiou]'"
feedback="Right - we list all of the vowels as things to delete. ~ No - the no-flag form of tr doesn't allow an empty second string. ~ No - this will remove all of the vowels, but also remove other characters. tr doesn't accept regular expressions."
correct="0" >}}
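A few more things to try with `tr`. This is a small sketch with made-up input strings; it shows character ranges, a single-character replacement, and deletion with `-d`.

```shell
# Ranges pair up character by character: a becomes A, b becomes B, and so on.
upper=$(echo 'hello world' | tr 'a-z' 'A-Z')

# Replace every space with an underscore.
snake=$(echo 'my file name' | tr ' ' '_')

# -d deletes every character in the set instead of replacing it.
letters_only=$(echo 'a1b2c3' | tr -d '0-9')

echo "$upper"
echo "$snake"
echo "$letters_only"
```

This prints `HELLO WORLD`, `my_file_name`, and `abc`.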
2 changes: 1 addition & 1 deletion org-cyf-sdc/content/tools/sprints/1/_index.md
@@ -5,5 +5,5 @@ layout = 'sprint'
emoji= '⏱️'
menu_level = ['module']
weight = 2
theme = "Shell tools, and how computers work"
theme = "Shell tools and how computers work"
+++
2 changes: 1 addition & 1 deletion org-cyf-sdc/content/tools/sprints/2/_index.md
@@ -5,5 +5,5 @@ layout = 'sprint'
emoji= '⏱️'
menu_level = ['module']
weight = 2
theme = "Shell pipelines, hardware, and programming language concepts"
theme = "Shell pipelines and programming language concepts"
+++
4 changes: 2 additions & 2 deletions org-cyf-sdc/content/tools/sprints/2/backlog/index.md
@@ -4,6 +4,6 @@ layout = 'backlog'
emoji= '🥞'
menu_level = ['sprint']
weight = 2
backlog= 'Module-Template'
backlog_filter='📅 Sprint 1'
backlog= 'Module-Tools'
backlog_filter= '📅 Sprint 2'
+++
25 changes: 20 additions & 5 deletions org-cyf-sdc/content/tools/sprints/2/prep/index.md
@@ -6,12 +6,27 @@ emoji= '🧑🏾‍💻'
menu_level = ['sprint']
weight = 1
[[blocks]]
name="Read about programming languages"
src="module/tools/read-about-programming-languages"
[[blocks]]
name="jq"
src="module/tools/jq"
name="Programming language concepts"
src="module/tools/programming-language-concepts"
[[blocks]]
name = "Shell pipelines"
src="module/tools/shell-pipelines"
[[blocks]]
name="Brian Kernighan on pipelines"
src="https://www.youtube.com/watch?v=bKzonnwoR2I"
[[blocks]]
name="grep"
src="module/tools/grep-in-pipelines"
[[blocks]]
name="sort and uniq"
src="module/tools/sort-and-uniq"
[[blocks]]
name="head and tail"
src="module/tools/head-and-tail"
[[blocks]]
name="tr"
src="module/tools/tr"
[[blocks]]
name="jq"
src="module/tools/jq"
+++
