Docs (#68)
marcosfelt authored Sep 10, 2020
1 parent 8d64f6f commit e9dc6ca
Showing 41 changed files with 1,874 additions and 1,435 deletions.
67 changes: 67 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,67 @@
# Contributing

Some instructions for people contributing back.

### Downloading the code

1. Clone the repository:
```git clone https://github.com/sustainable-processes/summit_private.git```
2. Install poetry by following the instructions [here](https://python-poetry.org/docs/#installation). We use poetry for dependency management.
3. Install all dependencies:
```poetry install```
4. To run tests:
```poetry run pytest --doctest-modules --ignore=case_studies```

### Commit Workflow

- Use the [project board](https://github.com/orgs/sustainable-processes/projects/1) to keep track of issues. Issues will automatically be moved along in the board when they are closed in Github.
- Write tests in the tests/ folder
- Documentation follows the [numpy docstring format](https://numpydoc.readthedocs.io/en/latest/format.html#documenting-class-instances)
- Please include examples when possible that can be tested using [doctest](https://docs.python.org/3/library/doctest.html)
- All publicly available classes and methods should have a docstring
- Commit to a branch off master and submit pull requests to merge.
- To create a branch locally and push it:
```bash
$ git checkout -b BRANCH_NAME
# Once you've made some changes
$ git commit -am "commit message"
$ git push -u origin BRANCH_NAME
#Now if you come back to Github, your branch should exist
```
- All pull requests need one review.
- Tests will be run automatically when a pull request is created, and all tests need to pass before the pull request is merged.
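
As an illustration of the docstring and doctest guidelines above, here is a minimal sketch of a numpy-style docstring with a testable example. The function `space_time_yield` is hypothetical and is not part of Summit; it only shows the layout of the sections.

```python
import doctest


def space_time_yield(mass_kg, volume_m3, time_h):
    """Compute a space-time yield (hypothetical helper, for illustration only).

    Parameters
    ----------
    mass_kg : float
        Mass of product formed, in kg.
    volume_m3 : float
        Reactor volume, in m^3.
    time_h : float
        Reaction time, in hours.

    Returns
    -------
    float
        Space-time yield in kg/m^3/h.

    Examples
    --------
    >>> space_time_yield(2.0, 0.5, 4.0)
    1.0
    """
    return mass_kg / (volume_m3 * time_h)


if __name__ == "__main__":
    # Run the examples embedded in the docstrings of this module.
    print(doctest.testmod())
```

With `--doctest-modules`, pytest collects the `Examples` section and fails the run if the printed value does not match.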

### Docker
Sometimes, it is easier to run tests using a Docker container (e.g., on compute clusters). Here are the commands to build and run the Docker containers using the included Dockerfile. The container entrypoint is python, so you just need to specify the file name.

To build the container and upload it to Docker Hub:
```
docker build . -t marcosfelt/summit:latest
docker push marcosfelt/summit:latest
```
You can change the tag from `latest` to whatever is most appropriate (e.g., the branch name). I have found that this takes up a lot of space on disk, so I have been running the commands on our private servers.
Then, to run a container, here is an example with the SnAr experiment code. The home directory of the container is called `summit_user`, hence we mount the current working directory into that folder. We remove the container upon finishing using `--rm` and make it interactive using `-it` (remove this if you just want the container to run in the background). [Neptune.ai](https://neptune.ai/) is used for the experiments, so the API token is passed in. Finally, I specify the image name and the tag before referencing the python file I want to run.
```
export NEPTUNE_API_TOKEN= #place your neptune token here
sudo docker run -v `pwd`/:/summit_user --rm -it --env NEPTUNE_API_TOKEN=$NEPTUNE_API_TOKEN summit:snar_benchmark snar_experiment_2.py
```
Singularity (for running Docker containers on the HPC):
```
export NEPTUNE_API_TOKEN=
singularity exec -B `pwd`/:/summit_user docker://marcosfelt/summit:snar_benchmark snar_experiment.py
```
### Releases
Below is the old process for building a release. In the future, we will have this automated using Github actions.
1. Install [s3pypi](https://github.com/novemberfiveco/s3pypi) and [dephell](https://dephell.org/docs/installation.html)
2. Install AWS credentials to upload to pypi.rxns.io (Kobi is the one who controls this).
3. Bump the version in pyproject.toml and then run:
```dephell deps convert --from=pyproject.toml --to=setup.py```
4. Go into setup.py and delete the lines for extras_install_requires
5. Upload the package to the private pypi repository:
```s3pypi --bucket pypi.rxns.io```
116 changes: 50 additions & 66 deletions README.md
@@ -1,86 +1,70 @@
# Summit
![summit_banner](docs/source/_static/banner_4.png)

Summit is a set of tools for optimising chemical processes. We’ve started by targeting reactions.

Currently, reaction optimisation in the fine chemicals industry is done by intuition or design of experiments, which both scale poorly with the complexity of the problem. Summit applies recent advances in machine learning to make the process of reaction optimisation faster. Essentially, it applies algorithms that learn which conditions (e.g., temperature, stoichiometry, etc.) are important to maximising one or more objectives (e.g., yield, enantiomeric excess). This is achieved through an iterative cycle.
## What is Summit?
Currently, reaction optimisation in the fine chemicals industry is done by intuition or design of experiments. Both scale poorly with the complexity of the problem.

For a more academic treatment of Summit, check out “Benchmarking Machine Learning for Reaction Optimisation.” If you just want to try it out, check out our [tutorial](https://gosummit.readthedocs.io/en/latest/tutorial.html).
Summit uses recent advances in machine learning to make the process of reaction optimisation faster. Essentially, it applies algorithms that learn which conditions (e.g., temperature, stoichiometry, etc.) are important to maximising one or more objectives (e.g., yield, enantiomeric excess). This is achieved through an iterative cycle.

Summit has two key features:

- **Strategies**: Optimisation algorithms designed to find the best conditions in the fewest iterations. Summit has eight strategies implemented.
- **Benchmarks**: Simulations of chemical reactions that can be used to test strategies. We have both mechanistic and data-driven benchmarks.

To get started, see the Quick Start below or follow our [tutorial](https://gosummit.readthedocs.io/en/latest/tutorial.html).

Currently, Summit has the following strategies implemented:

- **TSEMO**: Multi-objective Bayesian optimisation strategy by [Bradford et al.]()
- **Gryffin**: Single-objective Bayesian optimisation strategy designed for categorical variables by [Häse et al.](https://arxiv.org/abs/2003.12127)
- **SOBO**: Single-objective Bayesian optimisation strategy ([GpyOpt](https://gpyopt.readthedocs.io/))
- **Nelder-Mead**: Single-objective optimisation strategy for local search
- **SNOBFIT**: Single-objective optimisation strategy by [Huyer et al.](https://www.mat.univie.ac.at/~neum/ms/snobfit.pdf)
- **Deep Reaction Optimiser**: Deep reinforcement learning by [Zhou et al.](https://pubs.acs.org/doi/10.1021/acscentsci.7b00492)
- **Factorial DoE**: Factorial design of experiments
- **Random**: Random search

## Installation

To install summit, use the following command:

```pip install git+https://github.com/sustainable-processes/summit.git@0.5.0#egg=summit```

## Documentation
## Quick Start

The documentation for summit can be found [here](https://gosummit.readthedocs.io/en/latest/index.html).
<!-- It would be great to add a "Quick Start" here.-->

## Development

Some instructions for people contributing back.

### Downloading the code

1. Clone the repository:
```git clone https://github.com/sustainable-processes/summit_private.git```
2. Install poetry by following the instructions [here](https://python-poetry.org/docs/#installation). We use poetry for dependency management.
3. Install all dependencies:
```poetry install```
4. To run tests:
```poetry run pytest --doctest-modules --ignore=case_studies```

### Commit Workflow

- Use the [project board](https://github.com/orgs/sustainable-processes/projects/1) to keep track of issues. Issues will automatically be moved along in the board when they are closed in Github.
- Write tests in the tests/ folder
- Documentation follows the [numpy docstring format](https://numpydoc.readthedocs.io/en/latest/format.html#documenting-class-instances)
- Please include examples when possible that can be tested using [doctest](https://docs.python.org/3/library/doctest.html)
- All publicly available classes and methods should have a docstring
- Commit to a branch off master and submit pull requests to merge.
- To create a branch locally and push it:
```bash
$ git checkout -b BRANCH_NAME
# Once you've made some changes
$ git commit -am "commit message"
$ git push -u origin BRANCH_NAME
#Now if you come back to Github, your branch should exist
```
- All pull requests need one review.
- Tests will be run automatically when a pull request is created, and all tests need to pass before the pull request is merged.

### Docker
Sometimes, it is easier to run tests using a Docker container (e.g., on compute clusters). Here are the commands to build and run the Docker containers using the included Dockerfile. The container entrypoint is python, so you just need to specify the file name.

To build the container and upload it to Docker Hub:
```
docker build . -t marcosfelt/summit:latest
docker push marcosfelt/summit:latest
```
You can change the tag from `latest` to whatever is most appropriate (e.g., the branch name). I have found that this takes up a lot of space on disk, so I have been running the commands on our private servers.
Below, we show how to use the Nelder-Mead strategy to optimise a benchmark representing a nucleophilic aromatic substitution (SnAr) reaction.
```python
# Import summit
from summit.benchmarks import SnarBenchmark, MultitoSingleObjective
from summit.strategies import NelderMead
from summit.run import Runner

Then, to run a container, here is an example with the SnAr experiment code. The home directory of the container is called `summit_user`, hence we mount the current working directory into that folder. We remove the container upon finishing using `--rm` and make it interactive using `-it` (remove this if you just want the container to run in the background). [Neptune.ai](https://neptune.ai/) is used for the experiments, so the API token is passed in. Finally, I specify the image name and the tag before referencing the python file I want to run.
# Instantiate the benchmark
exp = SnarBenchmark()

```
export NEPTUNE_API_TOKEN= #place your neptune token here
sudo docker run -v `pwd`/:/summit_user --rm -it --env NEPTUNE_API_TOKEN=$NEPTUNE_API_TOKEN summit:snar_benchmark snar_experiment_2.py
```
# Since the Snar benchmark has two objectives and Nelder-Mead is single objective, we need a multi-to-single objective transform
transform = MultitoSingleObjective(
    exp.domain, expression="-sty/1e4+e_factor/100", maximize=False
)

Singularity (for running Docker containers on the HPC):
```
export NEPTUNE_API_TOKEN=
singularity exec -B `pwd`/:/summit_user docker://marcosfelt/summit:snar_benchmark snar_experiment.py
# Set up the strategy, passing in the optimisation domain and transform
nm = NelderMead(exp.domain, transform=transform)

# Use the runner to run closed loop experiments
r = Runner(
    strategy=nm, experiment=exp, max_iterations=50
)
r.run()
```
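
To make the transform expression concrete, here is a minimal sketch in plain Python (it does not call Summit) that evaluates `-sty/1e4+e_factor/100` for the two SnAr results recorded in `docs/source/_static/snar_experiments_external_1.csv` in this commit. The function name `scalar_objective` is hypothetical and chosen for illustration. How Summit evaluates the expression internally is not shown here.

```python
def scalar_objective(sty, e_factor):
    """Evaluate -sty/1e4 + e_factor/100 for a single experiment."""
    return -sty / 1e4 + e_factor / 100


# (sty, e_factor) pairs copied from snar_experiments_external_1.csv
results = [
    (2577.8678212410014, 13.11795197598188),
    (1694.334796274914, 16.58069255511405),
]

scores = [scalar_objective(sty, e_factor) for sty, e_factor in results]
for (sty, e_factor), score in zip(results, scores):
    print(f"sty={sty:.1f}, e_factor={e_factor:.2f} -> {score:.4f}")
```

Since the transform is created with `maximize=False`, the combined value is presumably treated as a quantity to minimise: a higher space-time yield and a lower E-factor both push it down.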

### Releases
## Documentation

The documentation for summit can be found [here](https://gosummit.readthedocs.io/en/latest/index.html).

Below is the old process for building a release. In the future, we will have this automated using Github actions.

1. Install [s3pypi](https://github.com/novemberfiveco/s3pypi) and [dephell](https://dephell.org/docs/installation.html)
2. Install AWS credentials to upload to pypi.rxns.io (Kobi is the one who controls this).
3. Bump the version in pyproject.toml and then run:
```dephell deps convert --from=pyproject.toml --to=setup.py```
4. Go into setup.py and delete the lines for extras_install_requires
5. Upload the package to the private pypi repository:
```s3pypi --bucket pypi.rxns.io```
## Issues?
Submit an [issue](https://github.com/sustainable-processes/summit/issues) or send an email to kcmf2@cam.ac.uk.


Binary file removed docs/source/_static/TSEMO_DTLZ2.png
Binary file not shown.
Binary file added docs/source/_static/acquistion_function.png
Binary file added docs/source/_static/banner_4.png
7 changes: 2 additions & 5 deletions docs/source/_static/snar_experiments_external_0.csv
@@ -1,7 +1,4 @@
NAME,tau,equiv_pldn,conc_dfnb,temperature,strategy
TYPE,DATA,DATA,DATA,DATA,METADATA
0,0.65,2.2,0.4600000000000001,75.0,LHS
1,0.9500000000000001,3.0,0.38,111.0,LHS
2,1.25,4.6,0.14,57.0,LHS
3,1.85,3.8000000000000003,0.30000000000000004,39.0,LHS
4,1.55,1.4,0.22000000000000003,93.0,LHS
0,1.697276616717274,2.3766612355128656,0.49423078653350516,81.35696424017263,Single-objective BayOpt
1,1.9349141004941464,3.7215572584228926,0.2886673868745884,33.0546657397271,Single-objective BayOpt
7 changes: 2 additions & 5 deletions docs/source/_static/snar_experiments_external_1.csv
@@ -1,7 +1,4 @@
NAME,tau,equiv_pldn,conc_dfnb,temperature,sty,e_factor,computation_t,experiment_t,strategy
TYPE,DATA,DATA,DATA,DATA,DATA,DATA,METADATA,METADATA,METADATA
0,0.65,2.2,0.4600000000000001,75.0,7786.655447945639,10.987035663168497,0.0,0.01232290267944336,LHS
1,0.9500000000000001,3.0,0.38,111.0,2887.7432355153373,20.60376622418164,0.0,0.011729001998901367,LHS
2,1.25,4.6,0.14,57.0,1249.553463058851,32.831486727082876,0.0,0.009754657745361328,LHS
3,1.85,3.8000000000000003,0.30000000000000004,39.0,1796.9127067316595,16.483119872945093,0.0,0.011143207550048828,LHS
4,1.55,1.4,0.22000000000000003,93.0,1595.8336602961792,20.411290438847523,0.0,0.009853839874267578,LHS
0,1.697276616717274,2.3766612355128656,0.49423078653350516,81.35696424017263,2577.8678212410014,13.11795197598188,0.0,0.01884603500366211,Single-objective BayOpt
1,1.9349141004941464,3.7215572584228926,0.2886673868745884,33.0546657397271,1694.334796274914,16.58069255511405,0.0,0.015223026275634766,Single-objective BayOpt
10 changes: 3 additions & 7 deletions docs/source/_static/snar_experiments_external_2.csv
@@ -1,7 +1,3 @@
NAME,tau,equiv_pldn,conc_dfnb,temperature,sty,e_factor,strategy
TYPE,DATA,DATA,DATA,DATA,DATA,DATA,METADATA
418,0.5030621661413801,2.159430271452188,0.3331364369939005,39.349316576589885,8571.64186963127,20.263331718101174,TSEMO2
417,1.9427586165924804,2.4385333547452435,0.21844473638973613,40.65833629490961,1900.7421954026868,20.262233951542083,TSEMO2
340,0.9372710173172387,2.5229832750502386,0.4554495530196181,102.96622111313425,3068.1191873838484,20.300237076480652,TSEMO2
896,1.2733780600751703,4.840804510690066,0.21208537109030945,68.18683968991941,1260.200774131921,21.023509554987164,TSEMO2
506,0.9155072752567002,3.835044815518099,0.48601776052851564,109.58468407825623,3397.9204691962477,20.275176634930965,TSEMO2
NAME,tau,equiv_pldn,conc_dfnb,temperature,strategy
TYPE,DATA,DATA,DATA,DATA,METADATA
0,1.122318954003142,3.164344834741235,0.23598966596928997,34.44861796011246,Single-objective BayOpt
1 change: 1 addition & 0 deletions docs/source/_static/snar_sobo_external.json
@@ -0,0 +1 @@
{"name": "SOBO", "transform": {"transform_domain": [{"type": "ContinuousVariable", "is_objective": false, "name": "tau", "description": "residence time in minutes", "units": null, "bounds": [0.5, 2.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "equiv_pldn", "description": "equivalents of pyrrolidine", "units": null, "bounds": [1.0, 5.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "conc_dfnb", "description": "concentration of 2,4 dinitrofluorobenenze at reactor inlet (after mixing) in M", "units": null, "bounds": [0.1, 0.5]}, {"type": "ContinuousVariable", "is_objective": false, "name": "temperature", "description": "Reactor temperature in degress celsius", "units": null, "bounds": [30.0, 120.0]}, {"type": "ContinuousVariable", "is_objective": true, "name": "scalar_objective", "description": "-sty/1e4+e_factor/100", "units": null, "bounds": [0.0, 1.0]}], "name": "MultitoSingleObjective", "domain": [{"type": "ContinuousVariable", "is_objective": false, "name": "tau", "description": "residence time in minutes", "units": null, "bounds": [0.5, 2.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "equiv_pldn", "description": "equivalents of pyrrolidine", "units": null, "bounds": [1.0, 5.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "conc_dfnb", "description": "concentration of 2,4 dinitrofluorobenenze at reactor inlet (after mixing) in M", "units": null, "bounds": [0.1, 0.5]}, {"type": "ContinuousVariable", "is_objective": false, "name": "temperature", "description": "Reactor temperature in degress celsius", "units": null, "bounds": [30.0, 120.0]}, {"type": "ContinuousVariable", "is_objective": true, "name": "sty", "description": "space time yield (kg/m^3/h)", "units": null, "bounds": [0.0, 100.0]}, {"type": "ContinuousVariable", "is_objective": true, "name": "e_factor", "description": "E-factor", "units": null, "bounds": [0.0, 10.0]}], "transform_params": {"expression": 
"-sty/1e4+e_factor/100", "maximize": false}}, "strategy_params": {"prev_param": null, "use_descriptors": false, "gp_model_type": "GP", "acquisition_type": "EI", "optimizer_type": "lbfgs", "evaluator_type": "random", "kernel": {"input_dim": 4, "active_dims": [0, 1, 2, 3], "name": "Mat52", "useGPU": false, "variance": [1.0], "lengthscale": [1.0], "ARD": false, "class": "GPy.kern.Matern52"}, "exact_feval": false, "ARD": true, "standardize_outputs": true}}
1 change: 1 addition & 0 deletions docs/source/_static/snar_sobo_external_2.json
@@ -0,0 +1 @@
{"name": "SOBO", "transform": {"transform_domain": [{"type": "ContinuousVariable", "is_objective": false, "name": "tau", "description": "residence time in minutes", "units": null, "bounds": [0.5, 2.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "equiv_pldn", "description": "equivalents of pyrrolidine", "units": null, "bounds": [1.0, 5.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "conc_dfnb", "description": "concentration of 2,4 dinitrofluorobenenze at reactor inlet (after mixing) in M", "units": null, "bounds": [0.1, 0.5]}, {"type": "ContinuousVariable", "is_objective": false, "name": "temperature", "description": "Reactor temperature in degress celsius", "units": null, "bounds": [30.0, 120.0]}, {"type": "ContinuousVariable", "is_objective": true, "name": "scalar_objective", "description": "-sty/1e4+e_factor/100", "units": null, "bounds": [0.0, 1.0]}], "name": "MultitoSingleObjective", "domain": [{"type": "ContinuousVariable", "is_objective": false, "name": "tau", "description": "residence time in minutes", "units": null, "bounds": [0.5, 2.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "equiv_pldn", "description": "equivalents of pyrrolidine", "units": null, "bounds": [1.0, 5.0]}, {"type": "ContinuousVariable", "is_objective": false, "name": "conc_dfnb", "description": "concentration of 2,4 dinitrofluorobenenze at reactor inlet (after mixing) in M", "units": null, "bounds": [0.1, 0.5]}, {"type": "ContinuousVariable", "is_objective": false, "name": "temperature", "description": "Reactor temperature in degress celsius", "units": null, "bounds": [30.0, 120.0]}, {"type": "ContinuousVariable", "is_objective": true, "name": "sty", "description": "space time yield (kg/m^3/h)", "units": null, "bounds": [0.0, 100.0]}, {"type": "ContinuousVariable", "is_objective": true, "name": "e_factor", "description": "E-factor", "units": null, "bounds": [0.0, 10.0]}], "transform_params": {"expression": 
"-sty/1e4+e_factor/100", "maximize": false}}, "strategy_params": {"prev_param": [[[1.6972766167172741, 2.3766612355128656, 0.4942307865335052, 81.35696424017263], [1.9349141004941464, 3.7215572584228926, 0.2886673868745884, 33.0546657397271]], [[0.1266072623642813], [0.003626554076350874]]], "use_descriptors": false, "gp_model_type": "GP", "acquisition_type": "EI", "optimizer_type": "lbfgs", "evaluator_type": "random", "kernel": {"input_dim": 4, "active_dims": [0, 1, 2, 3], "name": "Mat52", "useGPU": false, "variance": [0.9996612796837149], "lengthscale": [2.246874970301238], "ARD": false, "class": "GPy.kern.Matern52"}, "exact_feval": false, "ARD": true, "standardize_outputs": true}}