Frame out the env spec cep #50

Open: wants to merge 5 commits into base `main`
131 changes: 131 additions & 0 deletions cep-??.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
<table>
<tr><td> Title </td><td> Environment yaml specification </td></tr>
<tr><td> Status </td><td> Draft </td></tr>
<!-- <tr><td> Status </td><td> Draft | Proposed | Accepted | Rejected | Deferred | Implemented </td></tr> -->
<tr><td> Author(s) </td>
<td>
Eric Dill
Marius van Niekerk
@ericdill ericdill May 5, 2023

@mariusvniekerk you interested in being a co-author of this?

Jaime Rodríguez-Guerra
Albert DeFusco
</td>
</tr>
<tr><td> Created </td><td> March 28, 2023</td></tr>
<tr><td> Updated </td><td> NA </td></tr>
<tr><td> Discussion </td><td> NA </td></tr>
<tr><td> Implementation </td><td> NA </td></tr>
</table>

## Abstract

This CEP aims to formalize the current state of the environment.yml specification.
It notably does not attempt to address any proposed changes to the specification.
Those changes are being discussed elsewhere.

## Motivation

A number of existing projects accept the environment.yml format as input.
Each project that wants to use the format has to implement its own parser.
As a result, there are multiple implementations of the same specification.
This makes the specification difficult to evolve, because right now it is only implicitly defined as however `conda-env` handles the environment.yml file.
This CEP aims to formalize the specification so that it can be used as a reference for other projects, including `conda-env`.
There are [existing docs](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) that define how a user should create an environment.yml file.
From what I can tell, this is the only formal specification for this file.
The objective of this CEP is to formalize the existing specification so that other projects can standardize on what they expect as input.
Evolutions to this spec are expected and welcome, but are out of scope for this initial proposal.

Issues found in the wild that result from environment.yml not having a published spec:
- [vscode issue #280](https://github.com/microsoft/vscode-python/issues/280)

## Specification

There are five keys that the conda env spec expects:

`name`, `str`, specifying the name of the conda env that will be created.

`channels`, `list`, specifying the channels that the solver should find and pull packages from.

`dependencies`, `list`, specifying the top-level dependencies that the solver should start with.
- `dependencies` may also contain an optional dict with a `pip` key whose value is a `list`. This key tells the solver to also pull the listed dependencies from PyPI.
- Each non-pip `dependencies` entry must follow the form expected by the current [MatchSpec](https://github.com/conda/conda/blob/a8e441e3c0e80b0d4e1595579f7d9eaad2b0fb2b/conda/models/match_spec.py#L92) implementation. Some examples of currently supported dependency specs (at least at the time of this CEP):
```
Examples:
>>> str(MatchSpec(name='foo', build='py2*', channel='conda-forge'))
'conda-forge::foo[build=py2*]'
>>> str(MatchSpec('foo 1.0 py27_0'))
'foo==1.0=py27_0'
>>> str(MatchSpec('foo=1.0=py27_0'))
'foo==1.0=py27_0'
>>> str(MatchSpec('conda-forge::foo[version=1.0.*]'))
'conda-forge::foo=1.0'
>>> str(MatchSpec('conda-forge/linux-64::foo>=1.0'))
"conda-forge/linux-64::foo[version='>=1.0']"
>>> str(MatchSpec('*/linux-64::foo>=1.0'))
"foo[subdir=linux-64,version='>=1.0']"
```
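
As a rough illustration of the `dependencies` layout described above, the sketch below separates plain conda spec strings from the optional `pip` dict. The helper name `split_dependencies` and the sample data are hypothetical and are not part of `conda-env`; the strings are passed through as-is and no MatchSpec validation is attempted.

```python
from typing import Any, List, Tuple


def split_dependencies(dependencies: List[Any]) -> Tuple[List[str], List[str]]:
    """Split a parsed `dependencies` list into conda specs and pip requirements.

    Hypothetical helper for illustration only; conda's own parser may differ.
    """
    conda_specs: List[str] = []
    pip_specs: List[str] = []
    for entry in dependencies:
        if isinstance(entry, str):
            # Plain strings are treated as MatchSpec-style conda specs.
            conda_specs.append(entry)
        elif (
            isinstance(entry, dict)
            and set(entry) == {"pip"}
            and isinstance(entry["pip"], list)
        ):
            # The only dict form described above is {"pip": [...]}.
            pip_specs.extend(entry["pip"])
        else:
            raise ValueError(f"unsupported dependency entry: {entry!r}")
    return conda_specs, pip_specs


# Sample data shaped like a parsed environment.yml `dependencies` list.
example = ["python=3.9", "conda-forge::numpy=1.21.*", "pip", {"pip": ["Flask-Testing"]}]
conda_specs, pip_specs = split_dependencies(example)
print(conda_specs)
print(pip_specs)
```

Note that the bare string `pip` stays in the conda list: it names the conda package, while the `pip` dict carries the requirements handed to pip.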

`prefix`, `str`, the full path to the environment. It's not clear to me how `name` and `prefix` interact.

`variables`, `dict`, a {`str`: `str`} mapping of environment variables to set/unset when the environment is activated/deactivated.
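
The keys listed above can be summarized as a typed mapping. This is a minimal sketch, assuming every key is optional (the JSON schema in this CEP marks none as required); the names `EnvironmentSpec` and `Dependency` are hypothetical and do not exist in conda.

```python
from typing import Dict, List, TypedDict, Union

# A dependency entry is either a MatchSpec-style string or a {"pip": [...]} dict.
Dependency = Union[str, Dict[str, List[str]]]


class EnvironmentSpec(TypedDict, total=False):
    """Hypothetical typed view of a parsed environment.yml file."""

    name: str                       # name of the environment to create
    channels: List[str]             # channels to pull packages from
    dependencies: List[Dependency]  # top-level specs, plus the optional pip dict
    prefix: str                     # full path to the environment
    variables: Dict[str, str]       # environment variables tied to activation


# Example instance; `prefix` is simply omitted because no key is required.
spec: EnvironmentSpec = {
    "name": "env-name",
    "channels": ["conda-forge", "defaults"],
    "dependencies": ["python=3.7", "codecov"],
    "variables": {"VAR1": "valueA", "VAR2": "valueB"},
}
print(sorted(spec))
```

A `TypedDict` only helps static type checkers; it performs no validation at runtime.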

JSON Schema version of the current spec:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "prefix": {"type": "string"},
    "channels": {
      "type": "array",
      "items": {"type": "string"}
    },
    "dependencies": {
      "type": "array",
      "items": {
        "anyOf": [
          {"type": "string"},
          {"$ref": "#/definitions/pip"}
        ]
      }
    },
    "variables": {
      "type": "object",
      "additionalProperties": {"type": "string"}
    }
  },
  "definitions": {
    "pip": {
      "type": "object",
      "required": ["pip"],
      "properties": {
        "pip": {"type": "array"}
      }
    }
  }
}
```

Author, on the `{"type": "string"}` dependency item:

How far down the MatchSpec rabbit hole do we want to go with this jsonschema? jsonschema does support various formatting validators like regex, but I'm a little hesitant to try and capture MatchSpec in a regex... Thoughts?

@mariusvniekerk @AlbertDeFusco @jezdez @wolfv @jaimergp

Contributor:

It feels like we need a JSON Schema for MatchSpec first, and then we can refer to it, if anything. Can we leave a "type hint" for now, e.g. `MatchSpecStr`?
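
To make the schema's intent concrete, here is a minimal standard-library sketch of the same checks. It is an approximation written for illustration, not a JSON Schema implementation: the function name `validate_env` is hypothetical, the `pip` entry is assumed to be a dict whose `pip` key holds a list, unknown top-level keys are ignored, and dependency strings are not checked against MatchSpec.

```python
from typing import Any, List


def validate_env(env: Any) -> List[str]:
    """Return a list of problems found in a parsed environment spec (empty if none)."""
    if not isinstance(env, dict):
        return ["top level must be a mapping"]
    errors: List[str] = []

    # `name` and `prefix` are plain strings when present.
    for key in ("name", "prefix"):
        if key in env and not isinstance(env[key], str):
            errors.append(f"{key} must be a string")

    # `channels` is a list of strings.
    channels = env.get("channels", [])
    if not isinstance(channels, list) or not all(isinstance(c, str) for c in channels):
        errors.append("channels must be a list of strings")

    # Each dependency is a string or a {"pip": [...]} dict (assumed shape).
    dependencies = env.get("dependencies", [])
    if not isinstance(dependencies, list):
        errors.append("dependencies must be a list")
    else:
        for i, dep in enumerate(dependencies):
            if isinstance(dep, str):
                continue
            if isinstance(dep, dict) and isinstance(dep.get("pip"), list):
                continue
            errors.append(f"dependencies[{i}] must be a string or a pip dict")

    # `variables` maps string names to string values.
    variables = env.get("variables", {})
    if not isinstance(variables, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in variables.items()
    ):
        errors.append("variables must map strings to strings")

    return errors


good = {"name": "stats", "dependencies": ["numpy", "pandas"]}
bad = {"name": 1, "dependencies": ["numpy", {"pip": "Flask-Testing"}]}
print(validate_env(good))
print(validate_env(bad))
```

Collecting every problem into a list, rather than raising on the first one, lets a tool report all issues in a file in a single pass.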

## Other sections

Other relevant sections of the proposal. Common sections include:

* Specification -- The technical details of the proposed change.
* Motivation -- Why the proposed change is needed.
* Rationale -- Why particular decisions were made in the proposal.
* Backwards Compatibility -- Will the proposed change break existing packages or workflows.
* Alternatives -- Any alternatives considered during the design.
* Sample Implementation -- Links to prototype or a sample implementation of the proposed change.
* FAQ -- Frequently asked questions (and answers to them).
* Resolution -- A short summary of the decision made by the community.
* Reference -- Any references used in the design of the CEP.

## Copyright

All CEPs are explicitly [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/).
4 changes: 4 additions & 0 deletions env-specs/env1.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
name: stats
dependencies:
  - numpy
  - pandas
12 changes: 12 additions & 0 deletions env-specs/env2.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
name: stats2
channels:
  - javascript
dependencies:
  - python=3.9
  - bokeh=2.4.2
  - conda-forge::numpy=1.21.*
  - nodejs=16.13.*
  - flask
  - pip
  - pip:
    - Flask-Testing
8 changes: 8 additions & 0 deletions env-specs/env3.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
name: env-name
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.7
  - codecov
prefix: /Users/username/anaconda3/envs/env-name
@ericdill ericdill May 5, 2023

@jezdez I've been poking at this name vs. prefix question locally with `conda env create` and `mamba env create`. Neither seems to do anything with the `prefix` key, even though it shows up in `conda env export`. It seems that this is a valid key that `conda env` will accept, but `conda env` doesn't actually do anything with it. Can you confirm that I'm understanding this behavior correctly?

$ conda env export

name: base
channels:
  - conda-forge
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_gnu
  - brotlipy=0.7.0=py310h5764c6d_1005
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.18.1=h7f98852_0
  - ca-certificates=2022.12.7=ha878542_0
  - certifi=2022.12.7=pyhd8ed1ab_0
  - cffi=1.15.1=py310h255011f_3
  - charset-normalizer=3.1.0=pyhd8ed1ab_0
  - colorama=0.4.6=pyhd8ed1ab_0
  - conda=23.1.0=py310hff52083_0
  - conda-package-handling=2.0.2=pyh38be061_0
  - conda-package-streaming=0.7.0=pyhd8ed1ab_1
  - cryptography=40.0.1=py310h34c0648_0
  - fmt=9.1.0=h924138e_0
  - icu=72.1=hcb278e6_0
  - idna=3.4=pyhd8ed1ab_0
  - keyutils=1.6.1=h166bdaf_0
  - krb5=1.20.1=h81ceb04_0
  - ld_impl_linux-64=2.40=h41732ed_0
  - libarchive=3.6.2=h3d51595_0
  - libcurl=7.88.1=hdc1c0ab_1
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libffi=3.4.2=h7f98852_5
  - libgcc-ng=12.2.0=h65d4601_19
  - libgomp=12.2.0=h65d4601_19
  - libiconv=1.17=h166bdaf_0
  - libmamba=1.4.1=hcea66bb_0
  - libmambapy=1.4.1=py310h1428755_0
  - libnghttp2=1.52.0=h61bc06f_0
  - libnsl=2.0.0=h7f98852_0
  - libsolv=0.7.23=h3eb15da_0
  - libsqlite=3.40.0=h753d276_0
  - libssh2=1.10.0=hf14f497_3
  - libstdcxx-ng=12.2.0=h46fd767_19
  - libuuid=2.38.1=h0b41bf4_0
  - libxml2=2.10.3=hfdac1af_6
  - libzlib=1.2.13=h166bdaf_4
  - lz4-c=1.9.4=hcb278e6_0
  - lzo=2.10=h516909a_1000
  - mamba=1.4.1=py310h51d5547_0
  - ncurses=6.3=h27087fc_1
  - openssl=3.1.0=h0b41bf4_0
  - pip=23.0.1=pyhd8ed1ab_0
  - pluggy=1.0.0=pyhd8ed1ab_5
  - pybind11-abi=4=hd8ed1ab_3
  - pycosat=0.6.4=py310h5764c6d_1
  - pycparser=2.21=pyhd8ed1ab_0
  - pyopenssl=23.1.1=pyhd8ed1ab_0
  - pysocks=1.7.1=pyha2e5f31_6
  - python=3.10.10=he550d4f_0_cpython
  - python_abi=3.10=3_cp310
  - readline=8.2=h8228510_1
  - reproc=14.2.4=h0b41bf4_0
  - reproc-cpp=14.2.4=hcb278e6_0
  - requests=2.28.2=pyhd8ed1ab_1
  - ruamel.yaml=0.17.21=py310h1fa729e_3
  - ruamel.yaml.clib=0.2.7=py310h1fa729e_1
  - setuptools=65.6.3=pyhd8ed1ab_0
  - tk=8.6.12=h27826a3_0
  - toolz=0.12.0=pyhd8ed1ab_0
  - tqdm=4.65.0=pyhd8ed1ab_1
  - tzdata=2023c=h71feb2d_0
  - urllib3=1.26.15=pyhd8ed1ab_0
  - wheel=0.40.0=pyhd8ed1ab_0
  - xz=5.2.6=h166bdaf_0
  - yaml-cpp=0.7.0=h27087fc_2
  - zstandard=0.19.0=py310hdeb6495_1
  - zstd=1.5.2=h3eb15da_6
prefix: /home/ericdill/mambaforge

Interesting. I've been confused by it since it started appearing.

10 changes: 10 additions & 0 deletions env-specs/env4.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
name: env-name
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.7
  - codecov
variables:
  VAR1: valueA
  VAR2: valueB
53 changes: 53 additions & 0 deletions env-specs/schema.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "prefix": {
      "type": "string"
    },
    "channels": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "dependencies": {
      "type": "array",
      "items": {
        "anyOf": [
          {
            "type": "string"
          },
          {
            "$ref": "#/definitions/pip"
          }
        ]
      }
    },
    "variables": {
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    }
  },
  "definitions": {
    "pip": {
      "type": "object",
      "required": [
        "pip"
      ],
      "properties": {
        "pip": {
          "type": "array"
        }
      }
    }
  }
}