From af23032e3020068be62f1b3bab97dfd756a488ba Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 13:41:12 +0200 Subject: [PATCH 01/16] Add exercise boilerplate --- exercises/concept/log-parser/.docs/hints.md | 21 ++++ .../concept/log-parser/.docs/instructions.md | 116 ++++++++++++++++++ .../concept/log-parser/.docs/introduction.md | 99 +++++++++++++++ exercises/concept/log-parser/.formatter.exs | 4 + exercises/concept/log-parser/.gitignore | 24 ++++ .../concept/log-parser/.meta/config.json | 19 +++ exercises/concept/log-parser/.meta/design.md | 24 ++++ .../concept/log-parser/.meta/exemplar.ex | 0 .../concept/log-parser/lib/log_parser.ex | 3 + exercises/concept/log-parser/mix.exs | 28 +++++ .../log-parser/test/log_parser_test.exs | 5 + .../concept/log-parser/test/test_helper.exs | 2 + 12 files changed, 345 insertions(+) create mode 100644 exercises/concept/log-parser/.docs/hints.md create mode 100644 exercises/concept/log-parser/.docs/instructions.md create mode 100644 exercises/concept/log-parser/.docs/introduction.md create mode 100644 exercises/concept/log-parser/.formatter.exs create mode 100644 exercises/concept/log-parser/.gitignore create mode 100644 exercises/concept/log-parser/.meta/config.json create mode 100644 exercises/concept/log-parser/.meta/design.md create mode 100644 exercises/concept/log-parser/.meta/exemplar.ex create mode 100644 exercises/concept/log-parser/lib/log_parser.ex create mode 100644 exercises/concept/log-parser/mix.exs create mode 100644 exercises/concept/log-parser/test/log_parser_test.exs create mode 100644 exercises/concept/log-parser/test/test_helper.exs diff --git a/exercises/concept/log-parser/.docs/hints.md b/exercises/concept/log-parser/.docs/hints.md new file mode 100644 index 0000000000..1bd4c66d07 --- /dev/null +++ b/exercises/concept/log-parser/.docs/hints.md @@ -0,0 +1,21 @@ +# Hints + +## General + +- Review regular expression patterns from the introduction. 
Remember, when creating the pattern as a string, you must escape some characters. +- Read about the [`Regex` module][regex-docs] in the documentation. +- Read about the [regular expression sigil][sigils-regex] in the Getting Started guide. +- Check out this website about regular expressions: [Regular-Expressions.info][website-regex-info]. +- Check out this website about regular expressions: [Rex Egg - The world's most tyrannosauical regex tutorial][website-rexegg]. +- Check out this website about regular expressions: [RegexOne - Learn Regular Expressions with simple, interactive exercises.][website-regexone]. +- Check out this website about regular expressions: [Regular Expressions 101 - an online regex sandbox][website-regex-101]. +- Check out this website about regular expressions: [RegExr - an online regex sandbox][website-regexr]. + + +[regex-docs]: https://hexdocs.pm/elixir/Regex.html +[sigils-regex]: https://elixir-lang.org/getting-started/sigils.html#regular-expressions +[website-regex-info]: https://www.regular-expressions.info +[website-rexegg]: https://www.rexegg.com/ +[website-regexone]: https://regexone.com/ +[website-regex-101]: https://regex101.com/ +[website-regexr]: https://regexr.com/ diff --git a/exercises/concept/log-parser/.docs/instructions.md b/exercises/concept/log-parser/.docs/instructions.md new file mode 100644 index 0000000000..fda854e3c2 --- /dev/null +++ b/exercises/concept/log-parser/.docs/instructions.md @@ -0,0 +1,116 @@ +# Instructions + +This exercise addresses the parsing of log files. + +After a recent security review you have been asked to clean up the organization's archived log files. + +All strings passed to the functions are guaranteed to be non-null and without leading and trailing spaces. + +## 1. Identify garbled log lines + +You need some idea of how many log lines in your archive do not comply with current standards. +You believe that a simple test reveals whether a log line is valid.
+To be considered valid a line should begin with one of the following strings: + +- [TRC] +- [DBG] +- [INF] +- [WRN] +- [ERR] +- [FTL] + +Implement the `IsValidLine` function to return `false` if a string is not valid otherwise `true`. + +```go +IsValidLine("[ERR] A good error here"); +// => true +IsValidLine("Any old [ERR] text"); +// => false +IsValidLine("[BOB] Any old text"); +// => false +``` + +## 2. Split the log line + +A new team has joined the organization, and you find their log files are using a strange separator for "fields". +Instead of something sensible like a colon ":" they use a string such as "<--->" or "<=>" (because it's prettier) in fact any string that has a first character of "<" and a last character of ">" and any combination of the following characters "~", "\*", "=" and "-" in between. + +Implement the `SplitLogLine` function that takes a line and returns an array of strings each of which contains a field. + +```go +SplitLogLine("section 1<*>section 2<~~~>section 3") +// => []string{"section 1", "section 2", "section 3"}, +``` + +## 3. Count the number of lines containing `password` in quoted text + +The team needs to know about references to passwords in quoted text so that they can be examined manually. + +Implement the `CountQuotedPasswords` function to provide an indication of the likely scale of the manual exercise. + +Identify log lines where the string "password", which may be in any combination of upper or lower case, is surrounded by quotation marks. +You should account for the possibility of additional content between the quotation marks before and after "password". +Each line will contain at most two quotation marks. + +Lines passed to the routine may or may not be valid as defined in task 1. +We process them in the same way, whether or not they are valid. 
+ +```go +lines := []string{ + `[INF] passWord`, // contains 'password' but not surrounded by quotation marks + `"passWord"`, // count this one + `[INF] User saw error message "Unexpected Error" on page load.`, // does not contain 'password' + `[INF] The message "Please reset your password" was ignored by the user`, // count this one +} +// => 2 +``` + +## 4. Remove artifacts from log + +You have found that some upstream processing of the logs has been scattering the text "end-of-line" followed by a line number (without an intervening space) throughout the logs. + +Implement the `RemoveEndOfLineText` function to take a string and remove the end-of-line text and return a "clean" string. + +Lines not containing end-of-line text should be returned unmodified. + +Just remove the end of line string. Do not attempt to adjust the whitespaces. + +```go +RemoveEndOfLineText("[INF] end-of-line23033 Network Failure end-of-line27") +// => "[INF] Network Failure " +``` + +## 5. Tag lines with user names + +You have noticed that some of the log lines include sentences that refer to users. +These sentences always contain the string `"User"`, followed by one or more space characters, and then a user name. +You decide to tag such lines. + +Implement a function `TagWithUserName` that processes log lines: + +- Lines that do not contain the string `"User "` remain unchanged. +- For lines that contain the string `"User "`, prefix the line with `[USR]` followed by the user name. + +For example: + +```go +result := TagWithUserName([]string{ + "[WRN] User James123 has exceeded storage space.", + "[WRN] Host down. User Michelle4 lost connection.", + "[INF] Users can login again after 23:00.", + "[DBG] We need to check that user names are at least 6 chars long.", + +}) +// => []string { +// "[USR] James123 [WRN] User James123 has exceeded storage space.", +// "[USR] Michelle4 [WRN] Host down. 
User Michelle4 lost connection.", +// "[INF] Users can login again after 23:00.", +// "[DBG] We need to check that user names are at least 6 chars long." +// } +``` + +You can assume that: + +- User names are followed by at least one whitespace character in the log. +- There is at most one occurrence of the string `"User "` in each line. +- User names are non-empty strings that do not contain whitespace. diff --git a/exercises/concept/log-parser/.docs/introduction.md b/exercises/concept/log-parser/.docs/introduction.md new file mode 100644 index 0000000000..ef191dbaed --- /dev/null +++ b/exercises/concept/log-parser/.docs/introduction.md @@ -0,0 +1,99 @@ +# Introduction + +## Regular Expressions + +Regular expressions (regex) are a powerful tool for working with strings in Elixir. Regular expressions in Elixir follow the **PCRE** specification (**P**erl **C**ompatible **R**egular **E**xpressions). String patterns representing the regular expression's meaning are first compiled then used for matching all or part of a string. + +In Elixir, the most common way to create regular expressions is using the `~r` sigil. Sigils provide _syntactic sugar_ shortcuts for common tasks in Elixir. To match a _string literal_, we can use the string itself as a pattern following the sigil. + +```elixir +~r/test/ +``` + +The `=~/2` operator is useful to perform a regex match on a string to return a `boolean` result. + +```elixir +"this is a test" =~ ~r/test/ +# => true +``` + +Two notes about using sigils: + +- many different delimiters may be used depending on your requirements rather than `/` +- string patterns are already _escaped_, when writing the pattern as a string not using a regex, you will have to _escape_ backslashes (`\`) + +### Character classes + +Matching a range of characters using square brackets `[]` defines a _character class_. This will match any one character to the characters in the class. 
You can also specify a range of characters like `a-z`, as long as the start and end represent a contiguous range of code points. + +```elixir +regex = ~r/[a-z][ADKZ][0-9][!?]/ +"jZ5!" =~ regex +# => true +"jB5?" =~ regex +# => false +``` + +_Shorthand character classes_ make the pattern more concise. For example: + +- `\d` short for `[0-9]` (any digit) +- `\w` short for `[A-Za-z0-9_]` (any 'word' character) +- `\s` short for `[ \t\r\n\f]` (any whitespace character) + +When a _shorthand character class_ is used outside of a sigil, it must be escaped: `"\\d"` + +### Alternations + +_Alternations_ use `|` as a special character to denote matching one _or_ another. + +```elixir +regex = ~r/cat|bat/ +"bat" =~ regex +# => true +"cat" =~ regex +# => true +``` + +### Quantifiers + +_Quantifiers_ allow for a repeating pattern in the regex. They affect the group preceding the quantifier. + +- `{N,M}` where `N` is the minimum number of repetitions, and `M` is the maximum +- `{N,}` match `N` or more repetitions + - `{0,}` may also be written as `*`: match zero-or-more repetitions + - `{1,}` may also be written as `+`: match one-or-more repetitions +- `{0,N}` match up to `N` repetitions + +### Groups + +Round brackets `()` are used to denote _groups_ and _captures_. The group may also be _captured_ in some instances to be returned for use. In Elixir, these may be named or un-named. Captures are named by appending `?<name>` after the opening parenthesis. Groups function as a single unit, like when followed by _quantifiers_.
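
Named groups use the `(?<name>...)` syntax. As a minimal sketch of how they can be read back, the snippet below uses `Regex.named_captures/2`; the group names `level` and `message` and the sample strings are made up for illustration and do not come from the exercise.

```elixir
# Hypothetical pattern: `level` and `message` are illustrative group names.
regex = ~r/\[(?<level>[A-Z]+)\] (?<message>.+)/

# Regex.named_captures/2 returns a map of group name => captured text,
# or nil when the string does not match.
Regex.named_captures(regex, "[INFO] Server started")
# => %{"level" => "INFO", "message" => "Server started"}

Regex.named_captures(regex, "no level here")
# => nil
```

Returning a map keyed by name means the calling code does not depend on the order of the groups in the pattern.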
+ +```elixir +regex = ~r/(h)at/ +Regex.replace(regex, "hat", "\\1op") +# => "hop" + +regex = ~r/(?b)/ +Regex.scan(regex, "blueberry", capture: :all_names) +# => [["b"], ["b"]] +``` + +### Anchors + +_Anchors_ are used to tie the regular expression to the beginning or end of the string to be matched: + +- `^` anchors to the beginning of the string +- `$` anchors to the end of the string + +### Interpolation + +Because the `~r` is a shortcut for `"pattern" |> Regex.escape() |> Regex.compile!()`, you may also use string interpolation to dynamically build a regular expression pattern: + +```elixir +anchor = "$" +regex = ~r/end of the line#{anchor}/ +"end of the line?" =~ regex +# => false +"end of the line" =~ regex +# => true +``` diff --git a/exercises/concept/log-parser/.formatter.exs b/exercises/concept/log-parser/.formatter.exs new file mode 100644 index 0000000000..d2cda26edd --- /dev/null +++ b/exercises/concept/log-parser/.formatter.exs @@ -0,0 +1,4 @@ +# Used by "mix format" +[ + inputs: ["{mix,.formatter}.exs", "{config,lib,test}/**/*.{ex,exs}"] +] diff --git a/exercises/concept/log-parser/.gitignore b/exercises/concept/log-parser/.gitignore new file mode 100644 index 0000000000..737e559ec0 --- /dev/null +++ b/exercises/concept/log-parser/.gitignore @@ -0,0 +1,24 @@ +# The directory Mix will write compiled artifacts to. +/_build/ + +# If you run "mix test --cover", coverage assets end up here. +/cover/ + +# The directory Mix downloads your dependencies sources to. +/deps/ + +# Where third-party dependencies like ExDoc output generated docs. +/doc/ + +# Ignore .fetch files in case you like to edit your project deps locally. +/.fetch + +# If the VM crashes, it generates a dump, let's ignore it too. +erl_crash.dump + +# Also ignore archive artifacts (built via "mix archive.build"). +*.ez + +# Ignore package tarball (built via "mix hex.build"). 
+regular_expressions-*.tar + diff --git a/exercises/concept/log-parser/.meta/config.json b/exercises/concept/log-parser/.meta/config.json new file mode 100644 index 0000000000..6590333234 --- /dev/null +++ b/exercises/concept/log-parser/.meta/config.json @@ -0,0 +1,19 @@ +{ + "authors": [ + "angelikatyborska" + ], + "contributors": [], + "files": { + "solution": [ + "lib/log_parser.ex" + ], + "test": [ + "test/log_parser_test.exs" + ], + "exemplar": [ + ".meta/exemplar.ex" + ] + }, + "language_versions": ">=1.10", + "blurb": "Learn about regular expressions by parsing dates." +} diff --git a/exercises/concept/log-parser/.meta/design.md b/exercises/concept/log-parser/.meta/design.md new file mode 100644 index 0000000000..277bdb862a --- /dev/null +++ b/exercises/concept/log-parser/.meta/design.md @@ -0,0 +1,24 @@ +# Design + +We assume that the student already knows basic regular expressions. The goal of this exercise is to teach them Elixir-specific details only. + +## Learning objectives + +- Know about the Regex module +- Know about the `=~` operator +- Know that some String functions accept regular expressions, e.g. match?, replace, split. 
+- Know how to get a value from a captured named group +- Compiling Regular expressions with variable content + +## Out of scope + +- Teaching the syntax of regular expressions + +## Prerequisites + +- `strings` + +## Concepts + +- `regular-expressions` + diff --git a/exercises/concept/log-parser/.meta/exemplar.ex b/exercises/concept/log-parser/.meta/exemplar.ex new file mode 100644 index 0000000000..e69de29bb2 diff --git a/exercises/concept/log-parser/lib/log_parser.ex b/exercises/concept/log-parser/lib/log_parser.ex new file mode 100644 index 0000000000..03ba514284 --- /dev/null +++ b/exercises/concept/log-parser/lib/log_parser.ex @@ -0,0 +1,3 @@ +defmodule LogParser do + +end diff --git a/exercises/concept/log-parser/mix.exs b/exercises/concept/log-parser/mix.exs new file mode 100644 index 0000000000..3e208a9e8b --- /dev/null +++ b/exercises/concept/log-parser/mix.exs @@ -0,0 +1,28 @@ +defmodule LogParser.MixProject do + use Mix.Project + + def project do + [ + app: :log_parser, + version: "0.1.0", + # elixir: "~> 1.10", + start_permanent: Mix.env() == :prod, + deps: deps() + ] + end + + # Run "mix help compile.app" to learn about applications. + def application do + [ + extra_applications: [:logger] + ] + end + + # Run "mix help deps" to learn about dependencies. 
+ defp deps do + [ + # {:dep_from_hexpm, "~> 0.3.0"}, + # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"} + ] + end +end diff --git a/exercises/concept/log-parser/test/log_parser_test.exs b/exercises/concept/log-parser/test/log_parser_test.exs new file mode 100644 index 0000000000..a1b258c8d7 --- /dev/null +++ b/exercises/concept/log-parser/test/log_parser_test.exs @@ -0,0 +1,5 @@ +defmodule DateParserTest do + use ExUnit.Case + + @tag task_id: 1 +end diff --git a/exercises/concept/log-parser/test/test_helper.exs b/exercises/concept/log-parser/test/test_helper.exs new file mode 100644 index 0000000000..e8677a3440 --- /dev/null +++ b/exercises/concept/log-parser/test/test_helper.exs @@ -0,0 +1,2 @@ +ExUnit.start() +ExUnit.configure(exclude: :pending, trace: true, seed: 0) From e959308ee738f7b8826aa77490878f006042f683 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 15:50:02 +0200 Subject: [PATCH 02/16] Fork go's exercise --- .../concept/log-parser/.docs/instructions.md | 110 ++++------ .../concept/log-parser/.meta/config.json | 3 + exercises/concept/log-parser/.meta/design.md | 8 +- .../concept/log-parser/lib/log_parser.ex | 21 ++ .../log-parser/test/log_parser_test.exs | 188 +++++++++++++++++- 5 files changed, 251 insertions(+), 79 deletions(-) diff --git a/exercises/concept/log-parser/.docs/instructions.md b/exercises/concept/log-parser/.docs/instructions.md index fda854e3c2..cbe12075ff 100644 --- a/exercises/concept/log-parser/.docs/instructions.md +++ b/exercises/concept/log-parser/.docs/instructions.md @@ -1,116 +1,74 @@ # Instructions -This exercise addresses the parsing of log files. - After a recent security review you have been asked to clean up the organization's archived log files. -All strings passed to the functions are guaranteed to be non-null and without leading and trailing spaces. - ## 1. 
Identify garbled log lines You need some idea of how many log lines in your archive do not comply with current standards. You believe that a simple test reveals whether a log line is valid. To be considered valid a line should begin with one of the following strings: -- [TRC] -- [DBG] -- [INF] -- [WRN] -- [ERR] -- [FTL] - -Implement the `IsValidLine` function to return `false` if a string is not valid otherwise `true`. - -```go -IsValidLine("[ERR] A good error here"); -// => true -IsValidLine("Any old [ERR] text"); -// => false -IsValidLine("[BOB] Any old text"); -// => false -``` - -## 2. Split the log line +- [DEBUG] +- [INFO] +- [WARNING] +- [ERROR] -A new team has joined the organization, and you find their log files are using a strange separator for "fields". -Instead of something sensible like a colon ":" they use a string such as "<--->" or "<=>" (because it's prettier) in fact any string that has a first character of "<" and a last character of ">" and any combination of the following characters "~", "\*", "=" and "-" in between. +Implement the `valid_line?/1` function to return `true` if the log line is valid. -Implement the `SplitLogLine` function that takes a line and returns an array of strings each of which contains a field. +```elixir +LogParser.valid_line?("[ERROR] Network Failure") +# => true -```go -SplitLogLine("section 1<*>section 2<~~~>section 3") -// => []string{"section 1", "section 2", "section 3"}, +LogParser.valid_line?("Network Failure") +# => false ``` -## 3. Count the number of lines containing `password` in quoted text - -The team needs to know about references to passwords in quoted text so that they can be examined manually. +## 2. Split the log line -Implement the `CountQuotedPasswords` function to provide an indication of the likely scale of the manual exercise. +Shortly after starting the log parsing project, you realize that one application's logs aren't split into lines like the others.
In this project, what should have been separate lines is instead on a single line, connected by fancy arrows such as `<--->` or `<*~*~>`. -Identify log lines where the string "password", which may be in any combination of upper or lower case, is surrounded by quotation marks. -You should account for the possibility of additional content between the quotation marks before and after "password". -Each line will contain at most two quotation marks. +In fact, any string that has a first character of `<`, a last character of `>`, and any combination of the following characters `~`, `*`, `=`, and `-` in between can be used as a separator in this project's logs. -Lines passed to the routine may or may not be valid as defined in task 1. -We process them in the same way, whether or not they are valid. +Implement the `split_line/1` function that takes a line and returns a list of strings. -```go -lines := []string{ - `[INF] passWord`, // contains 'password' but not surrounded by quotation marks - `"passWord"`, // count this one - `[INF] User saw error message "Unexpected Error" on page load.`, // does not contain 'password' - `[INF] The message "Please reset your password" was ignored by the user`, // count this one -} -// => 2 +```elixir +LogParser.split_line("[INFO] Start.<*>[INFO] Processing...<~~~>[INFO] Success.") +# => ["[INFO] Start.", "[INFO] Processing...", "[INFO] Success."] ``` -## 4. Remove artifacts from log +## 3. Remove artifacts from log You have found that some upstream processing of the logs has been scattering the text "end-of-line" followed by a line number (without an intervening space) throughout the logs. -Implement the `RemoveEndOfLineText` function to take a string and remove the end-of-line text and return a "clean" string. +Implement the `remove_artifacts/1` function to take a string and remove all occurrences of end-of-line text and return a clean log line. Lines not containing end-of-line text should be returned unmodified.
Just remove the end of line string. Do not attempt to adjust the whitespaces. -```go -RemoveEndOfLineText("[INF] end-of-line23033 Network Failure end-of-line27") -// => "[INF] Network Failure " +```elixir +LogParser.remove_artifacts("[WARNING] end-of-line23033 Network Failure end-of-line27") +# => "[WARNING] Network Failure " ``` -## 5. Tag lines with user names +## 4. Tag lines with user names You have noticed that some of the log lines include sentences that refer to users. -These sentences always contain the string `"User"`, followed by one or more space characters, and then a user name. +These sentences always contain the string `"User"`, followed by one or more whitespace characters, and then a user name. You decide to tag such lines. -Implement a function `TagWithUserName` that processes log lines: - -- Lines that do not contain the string `"User "` remain unchanged. -- For lines that contain the string `"User "`, prefix the line with `[USR]` followed by the user name. - -For example: - -```go -result := TagWithUserName([]string{ - "[WRN] User James123 has exceeded storage space.", - "[WRN] Host down. User Michelle4 lost connection.", - "[INF] Users can login again after 23:00.", - "[DBG] We need to check that user names are at least 6 chars long.", - -}) -// => []string { -// "[USR] James123 [WRN] User James123 has exceeded storage space.", -// "[USR] Michelle4 [WRN] Host down. User Michelle4 lost connection.", -// "[INF] Users can login again after 23:00.", -// "[DBG] We need to check that user names are at least 6 chars long." -// } +Implement a function `tag_with_user_name/1` that processes log lines: + +- Lines that do not contain the string `"User"` remain unchanged. +- For lines that contain the string `"User"`, prefix the line with `[USER]` followed by the user name. 
+ +```elixir +LogParser.tag_with_user_name("[INFO] User Alice created a new project") +# => "[USER] Alice [INFO] User Alice created a new project" ``` You can assume that: -- User names are followed by at least one whitespace character in the log. -- There is at most one occurrence of the string `"User "` in each line. +- Each occurrence of the string `"User"` is followed by one or more whitespace characters and the user name. +- There is at most one occurrence of the string `"User"` on each line. - User names are non-empty strings that do not contain whitespace. diff --git a/exercises/concept/log-parser/.meta/config.json b/exercises/concept/log-parser/.meta/config.json index 6590333234..781be3c75e 100644 --- a/exercises/concept/log-parser/.meta/config.json +++ b/exercises/concept/log-parser/.meta/config.json @@ -14,6 +14,9 @@ ".meta/exemplar.ex" ] }, + "forked_from": [ + "go/parsing-log-files" + ], "language_versions": ">=1.10", "blurb": "Learn about regular expressions by parsing dates." } diff --git a/exercises/concept/log-parser/.meta/design.md b/exercises/concept/log-parser/.meta/design.md index 277bdb862a..3b5ec4066a 100644 --- a/exercises/concept/log-parser/.meta/design.md +++ b/exercises/concept/log-parser/.meta/design.md @@ -7,8 +7,10 @@ We assume that the student already knows basic regular expressions.
The goal of ## Prerequisites - `strings` +- `lists` +- `pattern-matching` +- `nil` +- `if` ## Concepts diff --git a/exercises/concept/log-parser/lib/log_parser.ex b/exercises/concept/log-parser/lib/log_parser.ex index 03ba514284..7df5e2a49f 100644 --- a/exercises/concept/log-parser/lib/log_parser.ex +++ b/exercises/concept/log-parser/lib/log_parser.ex @@ -1,3 +1,24 @@ defmodule LogParser do + def valid_line?(line) do + line =~ ~r/^\[(DEBUG|INFO|WARNING|ERROR)\]/ + end + def split_line(line) do + String.split(line, ~r/\<([\*\=\-\~])*\>/) + end + + def remove_artifacts(line) do + String.replace(line, ~r/end-of-line(\d)+/i, "") + end + + def tag_with_user_name(line) do + result = Regex.run(~r/User\s+(\S+)/, line) + + if result == nil do + line + else + [_ | user_name] = result + "[USER] #{user_name} #{line}" + end + end end diff --git a/exercises/concept/log-parser/test/log_parser_test.exs b/exercises/concept/log-parser/test/log_parser_test.exs index a1b258c8d7..e87583247a 100644 --- a/exercises/concept/log-parser/test/log_parser_test.exs +++ b/exercises/concept/log-parser/test/log_parser_test.exs @@ -1,5 +1,189 @@ -defmodule DateParserTest do +defmodule LogParserTest do use ExUnit.Case - @tag task_id: 1 + describe "valid_line?/1" do + @tag task_id: 1 + test "valid DEBUG message" do + assert LogParser.valid_line?("[DEBUG] response time 3ms") == true + end + + @tag task_id: 1 + test "valid INFO message" do + assert LogParser.valid_line?("[INFO] the latest information") == true + end + + @tag task_id: 1 + test "valid WARNING message" do + assert LogParser.valid_line?("[WARNING] something might be wrong") == true + end + + @tag task_id: 1 + test "valid ERROR message" do + assert LogParser.valid_line?("[ERROR] something really bad happened") == true + end + + @tag task_id: 1 + test "unknown level" do + assert LogParser.valid_line?("[BOB] something really bad happened") == false + end + + @tag task_id: 1 + test "line must start with level" do + assert
LogParser.valid_line?("bad start [DEBUG] ") == false + end + + @tag task_id: 1 + test "level must be wrapped in square brackets" do + assert LogParser.valid_line?("ERROR something really bad happened") == false + end + + @tag task_id: 1 + test "level must be uppercase" do + assert LogParser.valid_line?("[warning] something might be wrong") == false + end + end + + describe "split_line/1" do + @tag task_id: 2 + test "splits into three sections" do + assert LogParser.split_line("[INFO] Start.<*>[INFO] Processing...<~~~>[INFO] Success.") == [ + "[INFO] Start.", + "[INFO] Processing...", + "[INFO] Success." + ] + end + + @tag task_id: 2 + test "symbols =, ~, *, and - can be freely mixed" do + assert LogParser.split_line( + "[DEBUG] Attempt nr 2<=>[DEBUG] Attempt nr 3<-*~*->[ERROR] Failed to send SMS." + ) == [ + "[DEBUG] Attempt nr 2", + "[DEBUG] Attempt nr 3", + "[ERROR] Failed to send SMS." + ] + end + + @tag task_id: 2 + test "symbols other than =, ~, *, or - do not split" do + assert LogParser.split_line( + "[INFO] Attempt nr 1<=!>[INFO] Attempt nr 2< >[INFO] Attempt nr 3" + ) == [ + "[INFO] Attempt nr 1<=!>[INFO] Attempt nr 2< >[INFO] Attempt nr 3" + ] + end + + @tag task_id: 2 + test "symbols between angular brackets aren't required" do + assert LogParser.split_line("[INFO] Attempt nr 1<>[INFO] Attempt nr 2") == [ + "[INFO] Attempt nr 1", + "[INFO] Attempt nr 2" + ] + end + + @tag task_id: 2 + test "angular brackets are required" do + assert LogParser.split_line("[ERROR] Failed to send SMS**[ERROR] Invalid API key.") == [ + "[ERROR] Failed to send SMS**[ERROR] Invalid API key." + ] + end + + @tag task_id: 2 + test "angular brackets must be closed" do + assert LogParser.split_line("[ERROR] Failed to send SMS<**[ERROR] Invalid API key.") == [ + "[ERROR] Failed to send SMS<**[ERROR] Invalid API key."
+ ] + end + end + + describe "remove_artifacts/1" do + @tag task_id: 3 + test "removes a single 'end-of-line' followed by a line number" do + assert LogParser.remove_artifacts("[WARNING] Network Failure end-of-line27") == + "[WARNING] Network Failure " + end + + @tag task_id: 3 + test "leaves other lines unchanged" do + assert LogParser.remove_artifacts("[DEBUG] Process started") == + "[DEBUG] Process started" + end + + @tag task_id: 3 + test "removes multiple 'end-of-line's followed by line numbers" do + assert LogParser.remove_artifacts( + "[WARNING] end-of-line23033 Network Failure end-of-line27" + ) == "[WARNING] Network Failure " + end + + @tag task_id: 3 + test "removes 'end-of-line' and line numbers even when not separated from the rest of the log by a space" do + assert LogParser.remove_artifacts("[WARNING]end-of-line23033Network Failureend-of-line27") == + "[WARNING]Network Failure" + end + + @tag task_id: 3 + test "does not remove 'end-of-line' if not followed by a line number" do + assert LogParser.remove_artifacts("[INFO] end-of-line User disconnected end-of-lineXYZ") == + "[INFO] end-of-line User disconnected end-of-lineXYZ" + end + + @tag task_id: 3 + test "does not remove 'end-of-line' if a number is separated by a space" do + assert LogParser.remove_artifacts("[DEBUG] Query runtime:end-of-line 6ms") == + "[DEBUG] Query runtime:end-of-line 6ms" + end + + @tag task_id: 3 + test "is case-insensitive" do + assert LogParser.remove_artifacts("[DEBUG] END-of-LINE77 Process started End-Of-Line09") == + "[DEBUG] Process started " + end + end + + describe "tag_with_user_name/1" do + @tag task_id: 4 + test "extracts user name and prepends it to the line" do + assert LogParser.tag_with_user_name("[WARN] User James123 has exceeded storage space") == + "[USER] James123 [WARN] User James123 has exceeded storage space" + end + + @tag task_id: 4 + test "leaves other lines unchanged" do + assert LogParser.tag_with_user_name("[DEBUG] Process started") == + "[DEBUG]
Process started" + end + + @tag task_id: 4 + test "multiple spaces can appear after the word 'User'" do + assert LogParser.tag_with_user_name("[INFO] User Bob9 reported post fxa3qa") == + "[USER] Bob9 [INFO] User Bob9 reported post fxa3qa" + end + + @tag task_id: 4 + test "user name can be delimited by tabs" do + assert LogParser.tag_with_user_name( + "[ERROR] User\t!!!\tdoes not have a valid payment method" + ) == + "[USER] !!! [ERROR] User\t!!!\tdoes not have a valid payment method" + end + + @tag task_id: 4 + test "user name can be delimited by new lines" do + assert LogParser.tag_with_user_name("[DEBUG] Created User\nAlice908101\nat 14:02") == + "[USER] Alice908101 [DEBUG] Created User\nAlice908101\nat 14:02" + end + + @tag task_id: 4 + test "user name can end with the end of the line" do + assert LogParser.tag_with_user_name("[INFO] New log in for User __JOHNNY__") == + "[USER] __JOHNNY__ [INFO] New log in for User __JOHNNY__" + end + + @tag task_id: 4 + test "works for Ukrainian user names with emoji" do + assert LogParser.tag_with_user_name("[INFO] Promoted User АНАСТАСІЯ_🙂 to admin") == + "[USER] АНАСТАСІЯ_🙂 [INFO] Promoted User АНАСТАСІЯ_🙂 to admin" + end + end end From 0ff62499b8193e8857a80a747b2fceb94b2cd263 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 16:57:48 +0200 Subject: [PATCH 03/16] Update config --- config.json | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/config.json b/config.json index a119cc80a0..dd96eb26d7 100644 --- a/config.json +++ b/config.json @@ -219,13 +219,11 @@ "slug": "date-parser", "name": "Date Parser", "uuid": "57198686-71c9-4f38-973a-a111435560e7", - "concepts": [ - "regular-expressions" - ], + "concepts": [], "prerequisites": [ "strings" ], - "status": "active" + "status": "deprecated" }, { "slug": "rpg-character-sheet", @@ -650,6 +648,22 @@ "typespecs" ], "status": "beta" + }, + { + "slug": "log-parser", + "name": "Log Parser", + "uuid": 
"708fa8b8-59b9-43e7-be40-74759c3cc9a4", + "concepts": [ + "regular-expressions" + ], + "prerequisites": [ + "strings", + "lists", + "pattern-matching", + "nil", + "if" + ], + "status": "beta" } ], "practice": [ From 134e8d0d05ea3524049ec7ad52a53dd8a5d1c70a Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 17:58:20 +0200 Subject: [PATCH 04/16] Write hints --- exercises/concept/log-parser/.docs/hints.md | 27 +++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/exercises/concept/log-parser/.docs/hints.md b/exercises/concept/log-parser/.docs/hints.md index 1bd4c66d07..63be68d144 100644 --- a/exercises/concept/log-parser/.docs/hints.md +++ b/exercises/concept/log-parser/.docs/hints.md @@ -11,6 +11,23 @@ - Check out this website about regular expressions: [Regular Expressions 101 - an online regex sandbox][website-regex-101]. - Check out this website about regular expressions: [RegExr - an online regex sandbox][website-regexr]. +## 1. Identify garbled log lines + +- Use the [`r` sigil][sigil-r] to create a regular expression. +- There is [an operator][match-operator] that can be used to check a string against a regular expression. There is also a [`Regex` function][regex-match] and a [`String` function][string-match] that can do the same. + +## 2. Split the log line + +- There is a [`Regex` function][regex-split] as well as a [`String` function][string-split] that can split a string into a list of strings based on a regular expression. + +## 3. Remove artifacts from log + +- There is a [`Regex` function][regex-replace] as well as a [`String` function][string-replace] that can change a part of a string that matches a given regular expression to a different string. +- There is a [modifier][regex-modifiers] that can make the whole regular expression case-insensitive. + +## 4. Tag lines with user names + +There is a [`Regex` function][regex-run] that runs a regular expression against a string and returns all captures. 
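As a rough illustration of the functions these hints point to, here is a minimal sketch. The sample strings and patterns are invented for this example and deliberately do not solve the exercise tasks; only the function names come from the hints above.

```elixir
# Illustrative only: the inputs and patterns below are made up
# and are not taken from the exercise or its test suite.

# 1. Boolean check with the match operator or String.match?/2
"2022-06-12 started" =~ ~r/^\d{4}-\d{2}-\d{2}/
# => true
String.match?("started", ~r/^\d{4}/)
# => false

# 2. Splitting a string on every match of a regular expression
String.split("a1b22c", ~r/\d+/)
# => ["a", "b", "c"]

# 3. Replacing every match; the `i` modifier ignores letter case
String.replace("Foo fOO bar", ~r/foo/i, "x")
# => "x x bar"

# 4. Regex.run/2 returns the whole match followed by the captured groups,
#    or nil when nothing matches
Regex.run(~r/id=(\d+)/, "request id=42 done")
# => ["id=42", "42"]
Regex.run(~r/id=(\d+)/, "no identifier here")
# => nil
```

Because `Regex.run/2` may return `nil`, the caller usually has to handle the no-match case explicitly before reading the captures.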
[regex-docs]: https://hexdocs.pm/elixir/Regex.html [sigils-regex]: https://elixir-lang.org/getting-started/sigils.html#regular-expressions @@ -19,3 +36,13 @@ [website-regexone]: https://regexone.com/ [website-regex-101]: https://regex101.com/ [website-regexr]: https://regexr.com/ +[sigil-r]: https://hexdocs.pm/elixir/Kernel.html#sigil_r/2 +[match-operator]: https://hexdocs.pm/elixir/Kernel.html#=~/2 +[regex-match]: https://hexdocs.pm/elixir/Regex.html#match?/2 +[string-match]: https://hexdocs.pm/elixir/String.html#match?/2 +[regex-split]: https://hexdocs.pm/elixir/Regex.html#split/3 +[string-split]: https://hexdocs.pm/elixir/String.html#split/3 +[regex-replace]: https://hexdocs.pm/elixir/Regex.html#replace/4 +[string-replace]: https://hexdocs.pm/elixir/String.html#replace/4 +[regex-modifiers]: https://hexdocs.pm/elixir/Regex.html#module-modifiers +[regex-run]: https://hexdocs.pm/elixir/Regex.html#run/3 From 342a93fc1e76232a9efc453b72d02a82b1584ff2 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 17:58:35 +0200 Subject: [PATCH 05/16] Prepare boilerplate solution --- .../concept/log-parser/.meta/exemplar.ex | 24 +++++++++++++++++++ .../concept/log-parser/lib/log_parser.ex | 15 ++++-------- 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/exercises/concept/log-parser/.meta/exemplar.ex b/exercises/concept/log-parser/.meta/exemplar.ex index e69de29bb2..7df5e2a49f 100644 --- a/exercises/concept/log-parser/.meta/exemplar.ex +++ b/exercises/concept/log-parser/.meta/exemplar.ex @@ -0,0 +1,24 @@ +defmodule LogParser do + def valid_line?(line) do + line =~ ~r/^\[(DEBUG|INFO|WARNING|ERROR)\]/ + end + + def split_line(line) do + String.split(line, ~r/\<([\*\=\-\~])*\>/) + end + + def remove_artifacts(line) do + String.replace(line, ~r/end-of-line(\d)+/i, "") + end + + def tag_with_user_name(line) do + result = Regex.run(~r/User\s+(\S+)/, line) + + if result == nil do + line + else + [_ | user_name] = result + "[USER] #{user_name} #{line}" + 
end + end +end diff --git a/exercises/concept/log-parser/lib/log_parser.ex b/exercises/concept/log-parser/lib/log_parser.ex index 7df5e2a49f..e89de92564 100644 --- a/exercises/concept/log-parser/lib/log_parser.ex +++ b/exercises/concept/log-parser/lib/log_parser.ex @@ -1,24 +1,17 @@ defmodule LogParser do def valid_line?(line) do - line =~ ~r/^\[(DEBUG|INFO|WARNING|ERROR)\]/ + # Please implement the valid_line?/1 function end def split_line(line) do - String.split(line, ~r/\<([\*\=\-\~])*\>/) + # Please implement the split_line/1 function end def remove_artifacts(line) do - String.replace(line, ~r/end-of-line(\d)+/i, "") + # Please implement the remove_artifacts/1 function end def tag_with_user_name(line) do - result = Regex.run(~r/User\s+(\S+)/, line) - - if result == nil do - line - else - [_ | user_name] = result - "[USER] #{user_name} #{line}" - end + # Please implement the tag_with_user_name/1 function end end From 0df8ca4a9f0ebe767912680a63ef840b9b6323fe Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 17:58:51 +0200 Subject: [PATCH 06/16] Write simpler introduction --- concepts/regular-expressions/introduction.md | 95 +++++-------------- .../concept/log-parser/.docs/introduction.md | 95 +++++-------------- 2 files changed, 50 insertions(+), 140 deletions(-) diff --git a/concepts/regular-expressions/introduction.md b/concepts/regular-expressions/introduction.md index c4bb608799..8482df58ef 100644 --- a/concepts/regular-expressions/introduction.md +++ b/concepts/regular-expressions/introduction.md @@ -1,95 +1,52 @@ # Introduction -Regular expressions (regex) are a powerful tool for working with strings in Elixir. Regular expressions in Elixir follow the **PCRE** specification (**P**erl **C**ompatible **R**egular **E**xpressions). String patterns representing the regular expression's meaning are first compiled then used for matching all or part of a string. 
+## Regular Expressions -In Elixir, the most common way to create regular expressions is using the `~r` sigil. Sigils provide _syntactic sugar_ shortcuts for common tasks in Elixir. To match a _string literal_, we can use the string itself as a pattern following the sigil. +Regular expressions in Elixir follow the **PCRE** specification (**P**erl **C**ompatible **R**egular **E**xpressions), similarly to other popular languages like Java, JavaScript, or Ruby. -```elixir -~r/test/ -``` - -The `=~/2` operator is useful to perform a regex match on a string to return a `boolean` result. +The `Regex` module offers functions for working with regular expressions. Some of the `String` module functions accept regular expressions as arguments as well. -```elixir -"this is a test" =~ ~r/test/ -# => true -``` +~~~~exercism/note +This exercise assumes that you already know regular expression syntax, including character classes, quantifiers, groups, and captures. +~~~~ -Two notes about using sigils: +### Sigils -- many different delimiters may be used depending on your requirements rather than `/` -- string patterns are already _escaped_, when writing the pattern as a string not using a regex, you will have to _escape_ backslashes (`\`) - -## Character classes - -Matching a range of characters using square brackets `[]` defines a _character class_. This will match any one character to the characters in the class. You can also specify a range of characters like `a-z`, as long as the start and end represent a contiguous range of code points. +The most common way to create regular expressions is using the `~r` sigil. ```elixir -regex = ~r/[a-z][ADKZ][0-9][!?]/ -"jZ5!" =~ regex -# => true -"jB5?" =~ regex -# => false +~r/test/ ``` -_Shorthand character classes_ make the pattern more concise. 
For example: - -- `\d` short for `[0-9]` (any digit) -- `\w` short for `[A-Za-z0-9_]` (any 'word' character) -- `\s` short for `[ \t\r\n\f]` (any whitespace character) - -When a _shorthand character class_ is used outside of a sigil, it must be escaped: `"\\d"` +Note that all Elixir sigils support [different kinds of delimiters][sigils], not only `/`. -## Alternations +### Matching -_Alternations_ use `|` as a special character to denote matching one _or_ another +The `=~/2` operator can be used to perform a regex match that returns a `boolean` result. Alternatively, there are also `match?/2` functions in the `Regex` module as well as the `String` module. ```elixir -regex = ~r/cat|bat/ -"bat" =~ regex -# => true -"cat" =~ regex +"this is a test" =~ ~r/test/ # => true -``` - -## Quantifiers - -_Quantifiers_ allow for a repeating pattern in the regex. They affect the group preceding the quantifier. - -- `{N, M}` where `N` is the minimum number of repetitions, and `M` is the maximum -- `{N,}` match `N` or more repetitions - - `{0,}` may also be written as `*`: match zero-or-more repetitions - - `{1,}` may also be written as `+`: match one-or-more repetitions -- `{,N}` match up to `N` repetitions - -## Groups -Round brackets `()` are used to denote _groups_ and _captures_. The group may also be _captured_ in some instances to be returned for use. In Elixir, these may be named or un-named. Captures are named by appending `?` after the opening parenthesis. Groups function as a single unit, like when followed by _quantifiers_. 
- -```elixir -regex = ~r/(h)at/ -Regex.replace(regex, "hat", "\\1op") -# => "hop" - -regex = ~r/(?b)/ -Regex.scan(regex, "blueberry", capture: :all_names) -# => [["b"], ["b"]] +String.match?("Alice has 7 apples", ~r/\d{2}/) +# => false ``` -## Anchors +### Capturing -_Anchors_ are used to tie the regular expression to the beginning or end of the string to be matched: +If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always the whole string, and the following elements are matched groups. -- `^` anchors to the beginning of the string -- `$` anchors to the end of the string +### Modifiers -Because the `~r` is a shortcut for `"pattern" |> Regex.escape() |> Regex.compile!()`, you may also use string interpolation to dynamically build a regular expression pattern: +The behavior of a regular expression can be modified by appending special flags. When using a sigil to create a regular expression, add the modifiers after the second delimiter. + +Common modifiers are: +- `i` - makes the match case-insensitive. +- `u` - enables Unicode-specific patterns like `\p` and causes character classes like `\w`, `\s` etc. to also match Unicode. ```elixir -anchor = "$" -regex = ~r/end of the line#{anchor}/ -"end of the line?" =~ regex -# => false -"end of the line" =~ regex +"this is a TEST" =~ ~r/test/i # => true ``` + +[sigils]: https://hexdocs.pm/elixir/syntax-reference.html#sigils diff --git a/exercises/concept/log-parser/.docs/introduction.md b/exercises/concept/log-parser/.docs/introduction.md index ef191dbaed..8482df58ef 100644 --- a/exercises/concept/log-parser/.docs/introduction.md +++ b/exercises/concept/log-parser/.docs/introduction.md @@ -2,98 +2,51 @@ ## Regular Expressions -Regular expressions (regex) are a powerful tool for working with strings in Elixir. 
Regular expressions in Elixir follow the **PCRE** specification (**P**erl **C**ompatible **R**egular **E**xpressions). String patterns representing the regular expression's meaning are first compiled then used for matching all or part of a string. +Regular expressions in Elixir follow the **PCRE** specification (**P**erl **C**ompatible **R**egular **E**xpressions), similarly to other popular languages like Java, JavaScript, or Ruby. -In Elixir, the most common way to create regular expressions is using the `~r` sigil. Sigils provide _syntactic sugar_ shortcuts for common tasks in Elixir. To match a _string literal_, we can use the string itself as a pattern following the sigil. +The `Regex` module offers functions for working with regular expressions. Some of the `String` module functions accept regular expressions as arguments as well. -```elixir -~r/test/ -``` - -The `=~/2` operator is useful to perform a regex match on a string to return a `boolean` result. +~~~~exercism/note +This exercise assumes that you already know regular expression syntax, including character classes, quantifiers, groups, and captures. +~~~~ -```elixir -"this is a test" =~ ~r/test/ -# => true -``` +### Sigils -Two notes about using sigils: - -- many different delimiters may be used depending on your requirements rather than `/` -- string patterns are already _escaped_, when writing the pattern as a string not using a regex, you will have to _escape_ backslashes (`\`) - -### Character classes - -Matching a range of characters using square brackets `[]` defines a _character class_. This will match any one character to the characters in the class. You can also specify a range of characters like `a-z`, as long as the start and end represent a contiguous range of code points. +The most common way to create regular expressions is using the `~r` sigil. ```elixir -regex = ~r/[a-z][ADKZ][0-9][!?]/ -"jZ5!" =~ regex -# => true -"jB5?" 
=~ regex -# => false +~r/test/ ``` -_Shorthand character classes_ make the pattern more concise. For example: - -- `\d` short for `[0-9]` (any digit) -- `\w` short for `[A-Za-z0-9_]` (any 'word' character) -- `\s` short for `[ \t\r\n\f]` (any whitespace character) +Note that all Elixir sigils support [different kinds of delimiters][sigils], not only `/`. -When a _shorthand character class_ is used outside of a sigil, it must be escaped: `"\\d"` +### Matching -### Alternations - -_Alternations_ use `|` as a special character to denote matching one _or_ another +The `=~/2` operator can be used to perform a regex match that returns a `boolean` result. Alternatively, there are also `match?/2` functions in the `Regex` module as well as the `String` module. ```elixir -regex = ~r/cat|bat/ -"bat" =~ regex -# => true -"cat" =~ regex +"this is a test" =~ ~r/test/ # => true -``` - -### Quantifiers - -_Quantifiers_ allow for a repeating pattern in the regex. They affect the group preceding the quantifier. - -- `{N, M}` where `N` is the minimum number of repetitions, and `M` is the maximum -- `{N,}` match `N` or more repetitions - - `{0,}` may also be written as `*`: match zero-or-more repetitions - - `{1,}` may also be written as `+`: match one-or-more repetitions -- `{,N}` match up to `N` repetitions -### Groups - -Round brackets `()` are used to denote _groups_ and _captures_. The group may also be _captured_ in some instances to be returned for use. In Elixir, these may be named or un-named. Captures are named by appending `?` after the opening parenthesis. Groups function as a single unit, like when followed by _quantifiers_. 
- -```elixir -regex = ~r/(h)at/ -Regex.replace(regex, "hat", "\\1op") -# => "hop" - -regex = ~r/(?b)/ -Regex.scan(regex, "blueberry", capture: :all_names) -# => [["b"], ["b"]] +String.match?("Alice has 7 apples", ~r/\d{2}/) +# => false ``` -### Anchors +### Capturing -_Anchors_ are used to tie the regular expression to the beginning or end of the string to be matched: +If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always the whole string, and the following elements are matched groups. -- `^` anchors to the beginning of the string -- `$` anchors to the end of the string +### Modifiers -### Interpolation +The behavior of a regular expression can be modified by appending special flags. When using a sigil to create a regular expression, add the modifiers after the second delimiter. -Because the `~r` is a shortcut for `"pattern" |> Regex.escape() |> Regex.compile!()`, you may also use string interpolation to dynamically build a regular expression pattern: +Common modifiers are: +- `i` - makes the match case-insensitive. +- `u` - enables Unicode-specific patterns like `\p` and causes character classes like `\w`, `\s` etc. to also match Unicode. ```elixir -anchor = "$" -regex = ~r/end of the line#{anchor}/ -"end of the line?" 
=~ regex -# => false -"end of the line" =~ regex +"this is a TEST" =~ ~r/test/i # => true ``` + +[sigils]: https://hexdocs.pm/elixir/syntax-reference.html#sigils From 78276409191771e66ab7c6545844dfdb1ff1e703 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 17:59:59 +0200 Subject: [PATCH 07/16] Give up on interpolated regexes --- exercises/concept/log-parser/.meta/design.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercises/concept/log-parser/.meta/design.md b/exercises/concept/log-parser/.meta/design.md index 3b5ec4066a..c55c7a8b61 100644 --- a/exercises/concept/log-parser/.meta/design.md +++ b/exercises/concept/log-parser/.meta/design.md @@ -9,12 +9,12 @@ We assume that the student already knows basic regular expressions. The goal of - Know that some String functions accept regular expressions, e.g. match?, replace, split. - Know about modifiers (e.g. unicode, case-insensitive) - Know how to get a value from a captured group -- Compiling Regular expressions with variable content - Know that sigils can be used with different delimiters ## Out of scope - Teaching the syntax of regular expressions +- Compiling Regular expressions with variable content ## Prerequisites From 14b4244bc58c08d5df8a9ab2c31705400e4bb461 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:00:28 +0200 Subject: [PATCH 08/16] Fix gitignore --- exercises/concept/log-parser/.gitignore | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercises/concept/log-parser/.gitignore b/exercises/concept/log-parser/.gitignore index 737e559ec0..f1877a57ac 100644 --- a/exercises/concept/log-parser/.gitignore +++ b/exercises/concept/log-parser/.gitignore @@ -20,5 +20,5 @@ erl_crash.dump *.ez # Ignore package tarball (built via "mix hex.build"). 
-regular_expressions-*.tar +log-parser-*.tar From b97fd3cc57d210eba9411714a475011d2a2c2283 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:02:00 +0200 Subject: [PATCH 09/16] Fix config --- config.json | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/config.json b/config.json index dd96eb26d7..4e46f7f7b7 100644 --- a/config.json +++ b/config.json @@ -220,9 +220,7 @@ "name": "Date Parser", "uuid": "57198686-71c9-4f38-973a-a111435560e7", "concepts": [], - "prerequisites": [ - "strings" - ], + "prerequisites": [], "status": "deprecated" }, { From b07703aa56555f0d89f0a41abc6a512d749fb230 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:03:44 +0200 Subject: [PATCH 10/16] Fix formatting config --- exercises/concept/log-parser/.meta/config.json | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/exercises/concept/log-parser/.meta/config.json b/exercises/concept/log-parser/.meta/config.json index 781be3c75e..4950f2e9d1 100644 --- a/exercises/concept/log-parser/.meta/config.json +++ b/exercises/concept/log-parser/.meta/config.json @@ -2,7 +2,6 @@ "authors": [ "angelikatyborska" ], - "contributors": [], "files": { "solution": [ "lib/log_parser.ex" @@ -14,9 +13,9 @@ ".meta/exemplar.ex" ] }, + "language_versions": ">=1.10", "forked_from": [ "go/parsing-log-files" ], - "language_versions": ">=1.10", "blurb": "Learn about regular expressions by parsing dates." 
} From 531d46ea13c3ec6ea6718d36a15ea7d1f893185b Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:13:00 +0200 Subject: [PATCH 11/16] Hints need to be a list --- exercises/concept/log-parser/.docs/hints.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercises/concept/log-parser/.docs/hints.md b/exercises/concept/log-parser/.docs/hints.md index 63be68d144..76d5e06015 100644 --- a/exercises/concept/log-parser/.docs/hints.md +++ b/exercises/concept/log-parser/.docs/hints.md @@ -27,7 +27,7 @@ ## 4. Tag lines with user names -There is a [`Regex` function][regex-run] that runs a regular expression against a string and returns all captures. +- There is a [`Regex` function][regex-run] that runs a regular expression against a string and returns all captures. [regex-docs]: https://hexdocs.pm/elixir/Regex.html [sigils-regex]: https://elixir-lang.org/getting-started/sigils.html#regular-expressions From 6c175fbbf67562f55d87ae43b7f331ebf9ef69ae Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:13:37 +0200 Subject: [PATCH 12/16] Friendlier tone --- exercises/concept/log-parser/.docs/instructions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercises/concept/log-parser/.docs/instructions.md b/exercises/concept/log-parser/.docs/instructions.md index cbe12075ff..31e1f19afe 100644 --- a/exercises/concept/log-parser/.docs/instructions.md +++ b/exercises/concept/log-parser/.docs/instructions.md @@ -44,7 +44,7 @@ Implement the `remove_artifacts/1` function to take a string and remove all occu Lines not containing end-of-line text should be returned unmodified. -Just remove the end of line string. Do not attempt to adjust the whitespaces. +Just remove the end of line string, there's no need to adjust the whitespaces. 
```elixir LogParser.remove_artifacts("[WARNING] end-of-line23033 Network Failure end-of-line27") From f76d1494b79bb3a0ca41c1e6978f9cd6efdcdfb3 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:14:10 +0200 Subject: [PATCH 13/16] Fix headling levels in concept introduction --- concepts/regular-expressions/introduction.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/concepts/regular-expressions/introduction.md b/concepts/regular-expressions/introduction.md index 8482df58ef..a6c0857cd6 100644 --- a/concepts/regular-expressions/introduction.md +++ b/concepts/regular-expressions/introduction.md @@ -1,7 +1,5 @@ # Introduction -## Regular Expressions - Regular expressions in Elixir follow the **PCRE** specification (**P**erl **C**ompatible **R**egular **E**xpressions), similarly to other popular languages like Java, JavaScript, or Ruby. The `Regex` module offers functions for working with regular expressions. Some of the `String` module functions accept regular expressions as arguments as well. @@ -10,7 +8,7 @@ The `Regex` module offers functions for working with regular expressions. Some o This exercise assumes that you already know regular expression syntax, including character classes, quantifiers, groups, and captures. ~~~~ -### Sigils +## Sigils The most common way to create regular expressions is using the `~r` sigil. @@ -20,7 +18,7 @@ The most common way to create regular expressions is using the `~r` sigil. Note that all Elixir sigils support [different kinds of delimiters][sigils], not only `/`. -### Matching +## Matching The `=~/2` operator can be used to perform a regex match that returns a `boolean` result. Alternatively, there are also `match?/2` functions in the `Regex` module as well as the `String` module. 
@@ -32,11 +30,11 @@ String.match?("Alice has 7 apples", ~r/\d{2}/) # => false ``` -### Capturing +## Capturing If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always the whole string, and the following elements are matched groups. -### Modifiers +## Modifiers The behavior of a regular expression can be modified by appending special flags. When using a sigil to create a regular expression, add the modifiers after the second delimiter. From 876b20885755fc570333919d12f14696c1e2c349 Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sun, 12 Jun 2022 18:14:46 +0200 Subject: [PATCH 14/16] Update blurb --- exercises/concept/log-parser/.meta/config.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercises/concept/log-parser/.meta/config.json b/exercises/concept/log-parser/.meta/config.json index 4950f2e9d1..32ffb975d3 100644 --- a/exercises/concept/log-parser/.meta/config.json +++ b/exercises/concept/log-parser/.meta/config.json @@ -17,5 +17,5 @@ "forked_from": [ "go/parsing-log-files" ], - "blurb": "Learn about regular expressions by parsing dates." + "blurb": "Learn about regular expressions by parsing logs." 
} From a133b42b28f34414761df842f73a3cf5913e7d9e Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sat, 30 Jul 2022 10:22:12 +0200 Subject: [PATCH 15/16] Add tutorial links to assumed knowledge note --- concepts/regular-expressions/introduction.md | 2 ++ exercises/concept/log-parser/.docs/hints.md | 4 ++-- exercises/concept/log-parser/.docs/introduction.md | 7 +++++++ 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/concepts/regular-expressions/introduction.md b/concepts/regular-expressions/introduction.md index a6c0857cd6..5744dfbf3b 100644 --- a/concepts/regular-expressions/introduction.md +++ b/concepts/regular-expressions/introduction.md @@ -6,6 +6,8 @@ The `Regex` module offers functions for working with regular expressions. Some o ~~~~exercism/note This exercise assumes that you already know regular expression syntax, including character classes, quantifiers, groups, and captures. + +If you need to refresh your regular expression knowledge, check out one of these sources: [Regular-Expressions.info][website-regex-info], [Rex Egg][website-rexegg], [RegexOne][website-regexone], [Regular Expressions 101][website-regex-101], [RegExr][website-regexr]. ~~~~ ## Sigils diff --git a/exercises/concept/log-parser/.docs/hints.md b/exercises/concept/log-parser/.docs/hints.md index 76d5e06015..9f8e01f5f6 100644 --- a/exercises/concept/log-parser/.docs/hints.md +++ b/exercises/concept/log-parser/.docs/hints.md @@ -6,8 +6,8 @@ - Read about the [`Regex` module][regex-docs] in the documentation. - Read about the [regular expression sigil][sigils-regex] in the Getting Started guide. - Check out this website about regular expressions: [Regular-Expressions.info][website-regex-info]. -- Check out this website about regular expressions: [Rex Egg -The world's most tyrannosauical regex tutorial][website-rexegg]. -- Check out this website about regular expressions: [RegexOne - Learn Regular Expressions with simple, interactive exercises.][website-regexone]. 
+- Check out this website about regular expressions: [Rex Egg - The world's most tyrannosaurical regex tutorial][website-rexegg]. +- Check out this website about regular expressions: [RegexOne - Learn Regular Expressions with simple, interactive exercises][website-regexone]. - Check out this website about regular expressions: [Regular Expressions 101 - an online regex sandbox][website-regex-101]. - Check out this website about regular expressions: [RegExr - an online regex sandbox][website-regexr]. diff --git a/exercises/concept/log-parser/.docs/introduction.md b/exercises/concept/log-parser/.docs/introduction.md index 8482df58ef..14926d8a2d 100644 --- a/exercises/concept/log-parser/.docs/introduction.md +++ b/exercises/concept/log-parser/.docs/introduction.md @@ -8,6 +8,8 @@ The `Regex` module offers functions for working with regular expressions. Some o ~~~~exercism/note This exercise assumes that you already know regular expression syntax, including character classes, quantifiers, groups, and captures. + +If you need to refresh your regular expression knowledge, check out one of these sources: [Regular-Expressions.info][website-regex-info], [Rex Egg][website-rexegg], [RegexOne][website-regexone], [Regular Expressions 101][website-regex-101], [RegExr][website-regexr]. 
~~~~ ### Sigils @@ -50,3 +52,8 @@ Common modifiers are: ``` [sigils]: https://hexdocs.pm/elixir/syntax-reference.html#sigils +[website-regex-info]: https://www.regular-expressions.info +[website-rexegg]: https://www.rexegg.com/ +[website-regexone]: https://regexone.com/ +[website-regex-101]: https://regex101.com/ +[website-regexr]: https://regexr.com/ From 394686e70c2e022d5be9ae9a4eeb4ee6c719470c Mon Sep 17 00:00:00 2001 From: Angelika Tyborska Date: Sat, 30 Jul 2022 10:29:15 +0200 Subject: [PATCH 16/16] Add code example for capturing --- concepts/regular-expressions/introduction.md | 7 ++++++- exercises/concept/log-parser/.docs/introduction.md | 7 ++++++- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/concepts/regular-expressions/introduction.md b/concepts/regular-expressions/introduction.md index 5744dfbf3b..22d3d93b38 100644 --- a/concepts/regular-expressions/introduction.md +++ b/concepts/regular-expressions/introduction.md @@ -34,7 +34,12 @@ String.match?("Alice has 7 apples", ~r/\d{2}/) ## Capturing -If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always the whole string, and the following elements are matched groups. +If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always a match for the whole regular expression, and the following elements are matched groups. 
+ +```elixir +Regex.run(~r/(\d) apples/, "Alice has 7 apples") +# => ["7 apples", "7"] +``` ## Modifiers diff --git a/exercises/concept/log-parser/.docs/introduction.md b/exercises/concept/log-parser/.docs/introduction.md index 14926d8a2d..141e20b9c3 100644 --- a/exercises/concept/log-parser/.docs/introduction.md +++ b/exercises/concept/log-parser/.docs/introduction.md @@ -36,7 +36,12 @@ String.match?("Alice has 7 apples", ~r/\d{2}/) ### Capturing -If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always the whole string, and the following elements are matched groups. +If a simple boolean check is not enough, use the `Regex.run/3` function to get a list of all captures (or `nil` if there was no match). The first element in the returned list is always a match for the whole regular expression, and the following elements are matched groups. + +```elixir +Regex.run(~r/(\d) apples/, "Alice has 7 apples") +# => ["7 apples", "7"] +``` ### Modifiers