Add more assertion docs for directories
jfy133 committed Jan 18, 2025
1 parent 51d3c20 commit a6d7fe1
Showing 1 changed file with 42 additions and 0 deletions.
42 changes: 42 additions & 0 deletions sites/docs/src/content/docs/contributing/nf-test/assertions.md
@@ -88,6 +88,10 @@ which is equivalent to
log[0][1]
```

The first `get(0)` or `[0]` corresponds to the emitted channel object itself.

The `get(1)` or `[1]` corresponds to the second object of the channel object. Most nf-core modules and pipelines typically emit two sub-components of an object: a meta map and the file(s)/directories etc. Specifying `get(1)` or `[1]` thus corresponds to the file(s)/directories to be recorded in a snapshot.
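The indexing described above can be illustrated with plain Groovy lists. This is only a minimal sketch: the nested list below is a hypothetical stand-in for `process.out.log`, and the sample ID and path are made up for illustration.

```groovy
// Hypothetical stand-in for process.out.log: a list of emissions,
// where each emission is a [meta map, file path] pair.
def log = [
    [[id: "sample1"], "/work/ab/cdef/sample1.log"],
]

def emission = log.get(0)    // first emitted channel object
def meta     = emission.get(0) // first sub-component: the meta map
def path     = emission.get(1) // second sub-component: the file path

// The method and bracket notations are interchangeable
assert log.get(0).get(1) == log[0][1]
assert meta.id == "sample1"
```

In a real test, the second sub-component would be a file or directory path that nf-test records in the snapshot.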

## Additional Reading

- [Updating Snapshots](https://code.askimed.com/nf-test/docs/assertions/snapshots/#updating-snapshots)
@@ -316,6 +320,44 @@ _Motivation:_ I want to check the presence of a specific string or data pattern

_Explanation:_ check if a specific string (`"MT192765.1\t10214\t.\tATTTAC\tATTAC\t29.8242"`) is present in the content of a gzipped file, specified by `path(process.out.vcf[0][1]).linesGzip.toString()`.

### Snapshotting variable files in a channel emitting a directory

_Motivation:_ If a channel emits just a directory, by default nf-test will recursively list all files in that directory and all subdirectories, and generate md5sums of all the files. However, in some cases, _some_ of the files in the directory may have unstable md5sums. I want to snapshot all stable files with md5sums, but only snapshot the names of unstable files.

```groovy
then {
    // Collect files (not directories) whose md5sums are stable
    def stablefiles = []
    file(process.out.db.get(0).get(1)).eachFileRecurse{ file ->
        if (!file.isDirectory() && !["database.log", "database.fastaid2LCAtaxid", "database.taxids_with_multiple_offspring"].find { file.toString().endsWith(it) }) {
            stablefiles.add(file)
        }
    }
    // Collect only the names of files whose md5sums are unstable
    def unstablefiles = []
    file(process.out.db.get(0).get(1)).eachFileRecurse{ file ->
        if (["database.log", "database.fastaid2LCAtaxid", "database.taxids_with_multiple_offspring"].find { file.toString().endsWith(it) }) {
            unstablefiles.add(file.getName().toString())
        }
    }
    assertAll(
        { assert process.success },
        { assert snapshot(
            stablefiles.sort(),
            unstablefiles.sort(),
            process.out.versions
        ).match() }
    )
}
```

_Explanation:_ We create two lists of file paths within the emitted directory, one for stable and one for unstable files, and snapshot both lists.

In more detail, we generate an empty list (`stablefiles`). We then retrieve the directory from the channel `db` using `get(1)` (rather than the meta map), and iterate over all files and directories inside that directory using `eachFileRecurse`. However, we only append to the list (`.add(file)`) those entries that are not directories and whose paths do not end in (`endsWith`) the file names identified as unstable (`"database.log", "database.fastaid2LCAtaxid", "database.taxids_with_multiple_offspring"`).

We then do the reverse (`unstablefiles`): we loop through the directory again, but this time append only files that _do_ match the identified unstable file names. However, we do not append the path itself, just the file name as a string (`getName().toString()`).

Finally, we pass the two lists to `snapshot` inside `assertAll`. The stable files are snapshotted with their md5sums, whereas for the unstable files only the file names are recorded.

:::note
We have to explicitly exclude directories in the first case, because `eachFileRecurse` includes directories when listing all files.
If directories are included in the list of files to be snapshotted, nf-test by default looks inside any directory in the list (here called `stablefiles`) and also runs an md5sum on every file in that directory.
Therefore, even if you explicitly exclude a file with the `eachFileRecurse` and `find` calls, so that it is not in the `stablefiles` list itself, the file will still be picked up by nf-test via its parent directory.
By excluding directories, you avoid an accidental 'double' listing of the files you wish to exclude.
:::

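
The filtering logic can be tried outside nf-test with plain Groovy. The following is a minimal sketch under stated assumptions: the temporary directory, the file names `database.kdb` and `database.idx`, and the helper closure `isUnstable` are hypothetical and only stand in for a real emitted directory. It walks the directory once rather than twice, which is a simplification of the example above and not the approach used there.

```groovy
import java.nio.file.Files

// Hypothetical stand-in for the directory emitted by the channel
def dir = Files.createTempDirectory("database").toFile()
new File(dir, "sub").mkdirs()
new File(dir, "database.kdb").text = "stable content"
new File(dir, "sub/database.idx").text = "stable content"
new File(dir, "database.log").text = "timestamped content"

// File names assumed to have unstable md5sums (illustrative)
def unstableNames = ["database.log"]
def isUnstable = { f -> unstableNames.any { f.toString().endsWith(it) } }

def stablefiles = []
def unstablefiles = []
dir.eachFileRecurse { f ->
    if (f.isDirectory()) return                      // skip directories to avoid the 'double' listing
    if (isUnstable(f)) unstablefiles << f.getName()  // keep the name only
    else stablefiles << f                            // keep the full path
}

assert stablefiles*.name.sort() == ["database.idx", "database.kdb"]
assert unstablefiles == ["database.log"]
dir.deleteDir()
```

Groovy's `eachFileRecurse` also accepts a `groovy.io.FileType.FILES` argument to visit only regular files, which may be an alternative to the explicit `isDirectory()` check; whether that fits a given test is left to the reader.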
## Useful nf-test operators and functions
### Regular Expressions