Merge pull request #26 from jkshenton/master
Add soprano CLI
jkshenton authored Oct 3, 2024
2 parents 2bd5f54 + c70b940 commit 31ee78e
Showing 138 changed files with 25,646 additions and 2,095 deletions.
7 changes: 3 additions & 4 deletions .github/workflows/docs-build-deploy.yml
@@ -34,9 +34,8 @@ jobs:
with:
python-version: 3.11

- - name: Install dependencies
- run: |
- pip install -r requirements.txt
+ - name: Install Hatch
+ run: pip install hatch

# (optional) Cache your executed notebooks between runs
# if you have config:
@@ -51,7 +50,7 @@ jobs:
# Build the book
- name: Build the book
run: |
- jupyter-book build docs
+ hatch run docs:build
# Upload the book's HTML as an artifact
- name: Upload artifact
8 changes: 4 additions & 4 deletions .github/workflows/python-publish.yml
@@ -30,8 +30,8 @@ jobs:
- name: Build release distributions
run: |
# Seems to work for Soprano:
- python -m pip install build
- python -m build
+ python -m pip install hatch
+ hatch build
- name: Upload distributions
uses: actions/upload-artifact@v4
@@ -52,8 +52,8 @@ jobs:
# Dedicated environments with protections for publishing are strongly recommended.
environment:
name: pypi
- # OPTIONAL: uncomment and update to include your PyPI project URL in the deployment status:
- # url: https://pypi.org/p/YOURPROJECT
+ # include PyPI project URL in the deployment status:
+ url: https://pypi.org/project/Soprano

steps:
- name: Retrieve release distributions
3 changes: 3 additions & 0 deletions .gitignore
@@ -113,3 +113,6 @@ venv.bak/
# Test stuff
tests/test_save/*
tests/*.pkl

+ # Temporary files from tutorial notebooks
+ tutorials/_temp_output/*
2 changes: 1 addition & 1 deletion README.md
@@ -47,7 +47,7 @@ The AtomsCollection class generalises ASE's Atoms class by treating groups of st
Many functions in Soprano need to compute interatomic distances, for example when computing bonds or estimating NMR dipolar couplings. Soprano always takes the utmost care in dealing with periodic boundaries, using algorithms that ensure that the closest periodic copies are always properly accounted for in a fast and efficient way. The same approach can be used in custom functions, since the algorithm is available as the function `soprano.utils.minimum_periodic`.

### Easy processing of NMR parameters and spectral simulations
- ASE can read NMR parameters in the `.magres` file format, but Soprano can turn them to more meaningful physical quantities such as isotropies, anisotropies and asymmetries. In addition, with a full database of NMR active nuclei, Soprano can compute quadrupolar and dipolar couplings for specific isotopes. Finally, Soprano can produce a fast approximation of a powder spectrum - both MAS and static - in the diluted atoms approximation, or if that is not enough for your needs, provide an interface to NMR simulation software [Simpson](http://inano.au.dk/about/research-centers/nmr/software/simpson/).
+ ASE can read NMR parameters in the `.magres` file format, but Soprano can turn them to more meaningful physical quantities such as isotropies, anisotropies and asymmetries. In addition, with a full database of NMR active nuclei, Soprano can compute quadrupolar and dipolar couplings for specific isotopes. Finally, Soprano can produce a fast approximation of a powder spectrum - both MAS and static - in the diluted atoms approximation, or if that is not enough for your needs, provide an interface to NMR simulation software [Simpson](https://inano.au.dk/about/research-centers-and-projects/nmr/software/simpson).

### Machine learning and phylogenetic analysis
The `soprano.analyse.phylogen` module contains functionality to classify collections of structures based on relevant parameters of choice and identify similarities and patterns using Scipy's hierarchy and k-means clustering algorithms. This can be of great help when analysing collections of potential crystal structures to look for polymorphs, find defect sites, or analyse disordered systems.
14 changes: 12 additions & 2 deletions docs/_config.yml
@@ -23,8 +23,8 @@ latex:

# Information about where the book exists on the web
repository:
- url: https://github.com/CCP-NC/soprano # Online location of your book
- path_to_book: docs # Optional path to your book, relative to the repository root
+ url: https://github.com/jkshenton/soprano # Online location of your book
+ # path_to_book: docs # Optional path to your book, relative to the repository root
branch: master # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
@@ -38,6 +38,10 @@ html:
# google_analytics_id : "" # A GA id that can be used to track book views.
# announcement : "" # A banner announcement at the top of the site.

+ launch_buttons:
+ colab_url: "https://colab.research.google.com"
+ binderhub_url: "https://mybinder.org"

sphinx:
extra_extensions:
- 'sphinx.ext.autodoc'
@@ -46,6 +50,8 @@ sphinx:
- 'sphinx.ext.autosummary'
- 'sphinxcontrib.mermaid'
- 'sphinx.ext.mathjax'
+ - 'sphinx_click'
+ - 'sphinxcontrib.bibtex'
config:
add_module_names: True
html_theme: "sphinx_book_theme"
@@ -58,3 +64,7 @@ sphinx:
inherited-members: True
private-members: True
show-inheritance: True
+ # Automatically include type hints in the descriptions
+ autodoc_typehints: 'description' # Or 'both' if you want them in both the signature and description
+
+ bibtex_bibfiles: ['references.bib']
5 changes: 5 additions & 0 deletions docs/_toc.yml
@@ -2,6 +2,9 @@ format: jb-article
root: intro
sections:
- file: installation
+ - file: cli
+ sections:
+ - file: cli-cookbook
- file: tutorials
sections:
- file: tutorials/01-basic_concepts.ipynb
@@ -10,6 +13,8 @@ sections:
- file: tutorials/04-clustering.ipynb
- file: tutorials/05-nmr.ipynb
- file: tutorials/06-defect_calculations.ipynb
+ - file: tutorials/07-soprano-cli.ipynb
+ - file: submitter
- file: api
- file: citing
- file: references
219 changes: 219 additions & 0 deletions docs/cli-cookbook.md
@@ -0,0 +1,219 @@
# CLI Cookbook

## NMR data extraction
The `nmr` subcommand has a number of options to extract NMR data from a Magres file. You can see the full help by running `soprano nmr -h`. Here are some common examples:

* Extract a full summary (will look for both EFG and MS data):

```bash
soprano nmr seedname.magres
```

* Output summary to a CSV file:

```bash
soprano nmr seedname.magres -o summary.csv
```

* Output summary to a JSON file:

```bash
soprano nmr seedname.magres -o summary.json
```

* Extract a full summary for multiple files:

```bash
soprano nmr *.magres
```

* Extract a full summary for multiple files, merging into one table:

```bash
soprano nmr --merge *.magres
```

* Extract just the MS data:

```bash
soprano nmr -p ms seedname.magres
```

* Extract just the MS data for Carbon:

```bash
soprano nmr -p ms -s C seedname.magres
```

* Or just the first 4 Carbon atoms:

```bash
soprano nmr -p ms -s C.1-4 seedname.magres
```

* Extract just the MS data for Carbon and Nitrogen:

```bash
soprano nmr -p ms -s C,N seedname.magres
```

* Extract just MS data for the sites with label H1a:

```bash
soprano nmr -p ms -s H1a seedname.magres
```

* Set chemical shift references and gradients (non-specified references are set to zero and non-specified gradients are set to -1):

```bash
soprano nmr -p ms --references C:170,H:100 --gradients C:-1,H:-0.95 seedname.magres
```

* Set custom isotopes:

```bash
soprano nmr -p efg --isotopes 13C,2H seedname.magres
```

* By default, Soprano will reduce the structure to the unique sites (based either on CIF labels or symmetry operations). If you want to disable this, you can use the `--no-reduce` option:

```bash
soprano nmr --no-reduce seedname.magres
```

* You can construct queries that are applied to all loaded magres files using the pandas dataframe query syntax. For example, to extract the MS data for all H sites with a chemical shielding between 10 and 30 ppm *and* an asymmetry parameter greater than 0.5:

```bash
soprano nmr -s H --query "10 < MS_shielding < 30 and MS_asymmetry > 0.5" *.magres
```
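
The `--query` string uses the expression syntax of `pandas.DataFrame.query`. As a minimal sketch of how such an expression filters a table, the snippet below applies the same query to a small made-up DataFrame. The column names `MS_shielding` and `MS_asymmetry` come from the example above; the `labels` column and all the numbers are invented for illustration, and this is not Soprano's internal code.

```python
import pandas as pd

# Made-up values for illustration only; not real Soprano output.
df = pd.DataFrame(
    {
        "labels": ["H1", "H2", "H3", "H4"],
        "MS_shielding": [25.3, 31.0, 12.8, 28.9],
        "MS_asymmetry": [0.72, 0.90, 0.31, 0.55],
    }
)

# Same expression as passed to --query above: chained comparisons
# and boolean operators are part of the pandas query syntax.
selected = df.query("10 < MS_shielding < 30 and MS_asymmetry > 0.5")

# Rows must satisfy both conditions to be kept.
print(selected["labels"].tolist())
```

Trying an expression on a small DataFrame like this can be a quick way to check its logic before running it over many magres files.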

## 2D NMR plots

The `plotnmr` subcommand can be used to generate 2D NMR plots from a magres file. Most of the options are the same as for the `nmr` subcommand in terms of filtering sites, setting references, isotopes etc. You can see the full help by running `soprano plotnmr --help`.

Here are some common examples:

* Plot proton-proton correlation spectrum:

```bash
soprano plotnmr -p 2D -x H -y H seedname.magres
```

* Plot C-H correlation spectrum with marker sizes proportional to the dipolar coupling strength. Plot the chemical shift rather than shielding by supplying reference values:

```bash
soprano plotnmr -x C -y H --scale-marker-by dipolar --references C:180,H:30 seedname.magres
```

* As previous, but plot a heatmap and contour lines in addition to the markers:

```bash
soprano plotnmr -x C -y H --scale-marker-by dipolar --references C:180,H:30 --heatmap --contour seedname.magres
```

* Plot the H-H double quantum correlation spectrum:

```bash
soprano plotnmr -p 2D -x H -y H --yaxis-order 2Q seedname.magres
```

* As previous, but averaging over dynamic CH3 and NH3 sites:

```bash
soprano plotnmr -p 2D -x H -y H --yaxis-order 2Q -g CH3,NH3 seedname.magres
```

* By default, Soprano will reduce the system to the inequivalent sites first (sites that share a CIF label or are related by symmetry are treated as equivalent). To prevent this, use the `--no-reduce` option:

```bash
soprano plotnmr -p 2D -x H -y H --yaxis-order 2Q -g CH3,NH3 --no-reduce seedname.magres
```

* Impose a distance cut-off (in Å) between pairs of sites:

```bash
soprano plotnmr -p 2D -x C -y H --rcut 1.5 seedname.magres
```

* Combining several of these options:

```bash
soprano plotnmr -p 2D -x C -y H \
-g CH3 \
--rcut 1.5 \
--scale-marker-by dipolar \
--no-markers \
--references C:180,H:30 \
--heatmap \
--colormap "viridis" \
--contour \
--contour-levels 15 \
--contour-color "black" \
--contour-linewidth 0.5 \
seedname.magres
```
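
For the `--yaxis-order 2Q` examples above, it may help to recall how a double-quantum/single-quantum correlation is laid out: a pair of sites with single-quantum shifts δ_A and δ_B appears at the double-quantum position δ_A + δ_B. The sketch below shows that bookkeeping together with a distance cut-off in the spirit of `--rcut`. The helper `dq_peaks`, the site labels, shifts and distances are all hypothetical and not part of Soprano.

```python
from itertools import combinations


def dq_peaks(shifts, distances, rcut):
    """Return (single-quantum, double-quantum) peak positions in ppm.

    shifts:    dict mapping site label -> single-quantum shift (ppm)
    distances: dict mapping frozenset({a, b}) -> distance (Å)
    rcut:      pairs further apart than this (Å) are skipped
    """
    peaks = []
    for a, b in combinations(sorted(shifts), 2):
        if distances[frozenset((a, b))] > rcut:
            continue  # pair is beyond the cut-off
        dq = shifts[a] + shifts[b]  # double-quantum position of the pair
        # Each pair contributes two peaks, one at each single-quantum shift.
        peaks.append((shifts[a], dq))
        peaks.append((shifts[b], dq))
    return peaks


# Invented example data for three proton sites.
shifts = {"H1": 1.5, "H2": 4.5, "H3": 7.5}
distances = {
    frozenset(("H1", "H2")): 2.1,
    frozenset(("H1", "H3")): 4.8,
    frozenset(("H2", "H3")): 2.6,
}

peaks = dq_peaks(shifts, distances, rcut=3.0)
for x, y in peaks:
    print(f"SQ = {x:.1f} ppm, DQ = {y:.1f} ppm")
```

Here the H1-H3 pair is dropped because its distance exceeds the cut-off, which is the effect a tighter `--rcut` has on the plotted correlations.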



## Dipolar Couplings

* Extract dipolar couplings between all pairs of sites:

```bash
soprano dipolar seedname.magres
```

* Extract dipolar couplings between all pairs of sites, outputting to a CSV file:

```bash
soprano dipolar seedname.magres -o dipolar.csv
```

* Extract dipolar couplings between all pairs of sites, and print out those whose absolute value is greater than 10 kHz:

```bash
soprano dipolar --query "abs(D) > 10.0" seedname.magres
```
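
For a feel for the magnitudes behind a threshold such as 10 kHz, the dipolar coupling constant between two spins is commonly written as d = -μ0 ħ γ1 γ2 / (8π² r³), in Hz. The sketch below evaluates this for a ¹H-¹³C pair. The sign convention, the gyromagnetic ratios and the 1.09 Å distance are assumptions of this example; Soprano's own isotope data and conventions may differ.

```python
import math

MU0_OVER_4PI = 1e-7      # T m / A (approximate value of mu0 / 4 pi)
HBAR = 1.054571817e-34   # J s
GAMMA_1H = 267.522e6     # rad / (s T), assumed value for 1H
GAMMA_13C = 67.2828e6    # rad / (s T), assumed value for 13C


def dipolar_coupling_khz(gamma1, gamma2, r_angstrom):
    """Dipolar coupling constant in kHz for two spins r_angstrom apart."""
    r = r_angstrom * 1e-10  # convert Å to m
    # mu0/(4 pi) * hbar * g1 * g2 / (2 pi r^3) equals mu0 hbar g1 g2 / (8 pi^2 r^3)
    d_hz = -MU0_OVER_4PI * HBAR * gamma1 * gamma2 / (2 * math.pi * r**3)
    return d_hz / 1e3


d_ch = dipolar_coupling_khz(GAMMA_1H, GAMMA_13C, 1.09)
print(f"1H-13C at 1.09 Å: {d_ch:.1f} kHz")
```

A directly bonded C-H pair comes out at roughly -23 kHz, so it would pass the `abs(D) > 10.0` filter, and the 1/r³ dependence means more distant pairs fall below it quickly.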


## Split up molecules

The `splitmols` command can be used to split up a structure into its components (e.g. molecules, framework) based on a connectivity matrix. You can see the full help by running `soprano splitmols --help`. This should work with structure files in any format that ASE can read (= almost all structure formats).

By default the command will output the components to separate extended xyz files. For example:

* Split up a structure into molecules within the same unit cell etc. and output to separate .xyz files:

```bash
soprano splitmols seedname.cif
```

* Split up a structure into molecules and use the ASE GUI to view the structures (no files are written):

```bash
soprano splitmols seedname.cif --view --no-write
```

* Split up a structure into molecules and output to a directory in the CASTEP .cell format:

```bash
soprano splitmols seedname.cif -o output_directory -f cell
```

* Center the molecules in a new cell with a 10 Å vacuum spacing:

```bash
soprano splitmols seedname.cif -c --vacuum 10.0
```

* Split a zeolite framework with a molecule in a pore into separate files. Here the `--vdw-scale` option is used to increase the van der Waals radii of the atoms by 30% so that the framework stays intact and the molecule is kept separate. The `--no-cell-indices` option is used to prevent the framework atoms from crossing the cell boundaries. These settings work for the tests/test_data/ZSM-5_withH2O.cif example. In other cases you might need to tweak the vdW values manually using the `--vdw-custom` flag. Use the `-vvv` verbosity flag to see the vdW radii used.

```bash
soprano splitmols seedname.cif --vdw-scale 1.3 --no-cell-indices
```

* Split the molecules into a new cell defined manually. The cell can be given as a single float (a cubic cell with that lattice parameter), as a string of three floats (e.g. `"10 10 20"` for a 10x10x20 Å cell), as a string of six floats giving lengths and angles (e.g. `"10 10 10 90 90 90"` for a 10x10x10 Å cell with 90° angles), or as a string of nine floats (e.g. `"10 0 0 0 10 0 0 0 10"`) for a general cell.

```bash
soprano splitmols seedname.cif --cell "10 10 20"
```
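
The accepted forms of `--cell` can be told apart by counting the numbers supplied. The helper below is a hypothetical sketch of that rule, written only to make the four cases concrete. The name `parse_cell` and its return format are invented here and do not reflect Soprano's actual parser.

```python
def parse_cell(spec):
    """Classify a cell specification by how many numbers it contains.

    Returns a (kind, values) tuple, where kind is one of
    "cubic", "orthorhombic", "cellpar" or "matrix".
    """
    values = [float(v) for v in str(spec).split()]
    n = len(values)
    if n == 1:
        # Single float: cubic cell with that lattice parameter.
        a = values[0]
        return "cubic", [a, a, a, 90.0, 90.0, 90.0]
    if n == 3:
        # Three floats: cell lengths, all angles taken as 90 degrees.
        return "orthorhombic", values + [90.0, 90.0, 90.0]
    if n == 6:
        # Six floats: three lengths followed by three angles.
        return "cellpar", values
    if n == 9:
        # Nine floats: the three cell vectors, row by row.
        return "matrix", [values[0:3], values[3:6], values[6:9]]
    raise ValueError(f"expected 1, 3, 6 or 9 numbers, got {n}")


print(parse_cell("10 10 20"))
print(parse_cell("10 0 0 0 10 0 0 0 10"))
```

Any other count of numbers is rejected in this sketch, which is one reasonable way to catch a mistyped `--cell` string early.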
9 changes: 9 additions & 0 deletions docs/cli.rst
@@ -0,0 +1,9 @@
Command Line Interface
=======================================================

.. click:: soprano.scripts.cli:soprano
   :prog: soprano
   :show-nested:


