0 parents
commit 5777861
Showing 40 changed files with 5,955 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
*.DS_Store | ||
*.pyc | ||
*.egg* | ||
venv* | ||
dsr/dsr/summary* | ||
*log_* | ||
.gitignore | ||
.ipynb_checkpoints | ||
~$* | ||
*.vscode/ | ||
dsr/build | ||
dsr/dsr/cyfunc* | ||
**/log/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
BSD 3-Clause License | ||
|
||
Copyright (c) 2018, Lawrence Livermore National Security, LLC | ||
All rights reserved. | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
|
||
* Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
|
||
* Redistributions in binary form must reproduce the above copyright notice, | ||
this list of conditions and the following disclaimer in the documentation | ||
and/or other materials provided with the distribution. | ||
|
||
* Neither the name of the copyright holder nor the names of its | ||
contributors may be used to endorse or promote products derived from | ||
this software without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" | ||
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE | ||
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | ||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR | ||
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER | ||
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, | ||
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
This work was produced under the auspices of the U.S. Department of | ||
Energy by Lawrence Livermore National Laboratory under Contract | ||
DE-AC52-07NA27344. | ||
|
||
This work was prepared as an account of work sponsored by an agency of | ||
the United States Government. Neither the United States Government nor | ||
Lawrence Livermore National Security, LLC, nor any of their employees | ||
makes any warranty, expressed or implied, or assumes any legal liability | ||
or responsibility for the accuracy, completeness, or usefulness of any | ||
information, apparatus, product, or process disclosed, or represents that | ||
its use would not infringe privately owned rights. | ||
|
||
Reference herein to any specific commercial product, process, or service | ||
by trade name, trademark, manufacturer, or otherwise does not necessarily | ||
constitute or imply its endorsement, recommendation, or favoring by the | ||
United States Government or Lawrence Livermore National Security, LLC. | ||
|
||
The views and opinions of authors expressed herein do not necessarily | ||
state or reflect those of the United States Government or Lawrence | ||
Livermore National Security, LLC, and shall not be used for advertising | ||
or product endorsement purposes. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
# Deep symbolic regression | ||
|
||
Deep symbolic regression (DSR) is a deep learning algorithm for symbolic regression--the task of recovering tractable mathematical expressions from an input dataset. The package `dsr` contains the code for DSR, including a single-point, parallelized launch script (`dsr/run.py`), a baseline genetic programming-based symbolic regression algorithm, and an sklearn-like interface for use with your own data. | ||
|
||
This code supports the ICLR 2021 paper [Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients](https://openreview.net/forum?id=m5Qsh0kBQG). | ||
|
||
# Installation | ||
|
||
Installation is straightforward in a Python 3 virtual environment using Pip. From the repository root: | ||
|
||
``` | ||
python3 -m venv venv3 # Create a Python 3 virtual environment | ||
source venv3/bin/activate # Activate the virtual environment | ||
pip install -r requirements.txt # Install Python dependencies | ||
export CFLAGS="-I $(python -c "import numpy; print(numpy.get_include())") $CFLAGS" # Needed on Mac to prevent fatal error: 'numpy/arrayobject.h' file not found | ||
pip install -e ./dsr # Install DSR package | ||
``` | ||
|
||
To perform experiments involving the GP baseline, you will need the additional package `deap`. | ||
|
||
# Example usage | ||
|
||
To try out DSR, use the following command from the repository root: | ||
|
||
``` | ||
python -m dsr.run ./dsr/dsr/config.json --b=Nguyen-6 | ||
``` | ||
|
||
This should solve in around 50 training steps (~30 seconds on a laptop). | ||
|
||
# Getting started | ||
|
||
## Configuring runs | ||
|
||
DSR uses JSON files to configure training. | ||
|
||
Top-level key "task" specifies details of the benchmark expression for DSR or GP. See docs in `regression.py` for details. | ||
|
||
Top-level key "training" specifies the training hyperparameters for DSR. See docs in `train.py` for details. | ||
|
||
Top-level key "controller" specifies the RNN controller hyperparameters for DSR. See docs in `controller.py` for details. | ||
|
||
Top-level key "gp" specifies the hyperparameters for GP if using the GP baseline. See docs for `dsr.baselines.gspr.GP` for details. | ||
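
The four top-level keys above can be pictured with a short Python sketch that writes a skeleton config file. Only the key names `task`, `training`, `controller` and `gp`, plus the `task_type` and `name` entries, come from this README; the empty sections are placeholders, and the real option names are documented in the files referenced above.

```python
import json
import os
import tempfile

# Top-level keys named in the README. The empty sections are placeholders:
# the actual option names are NOT reproduced here and must be taken from the
# docs in regression.py, train.py, controller.py and the GP baseline.
config = {
    "task": {
        "task_type": "regression",  # value shown in the README's dataset example
        "name": "my_task",          # arbitrary task name, as in the README
    },
    "training": {},    # DSR training hyperparameters (placeholder)
    "controller": {},  # RNN controller hyperparameters (placeholder)
    "gp": {},          # GP baseline hyperparameters (placeholder)
}

# Write the skeleton to a scratch directory so it can be inspected or edited.
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

# Reload to confirm the file is valid JSON with the four expected sections.
with open(config_path) as f:
    loaded = json.load(f)
print(sorted(loaded))
```

A skeleton like this is only a starting point; it is not expected to run until the placeholder sections are filled in.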
|
||
## Launching runs | ||
|
||
After configuring a run, launching it is simple: | ||
|
||
``` | ||
python -m dsr.run [PATH_TO_CONFIG] [--OPTIONS] | ||
``` | ||
|
||
## Sklearn interface | ||
|
||
DSR also provides an [sklearn-like regressor interface](https://scikit-learn.org/stable/modules/generated/sklearn.base.RegressorMixin.html). Example usage: | ||
|
||
``` | ||
from dsr import DeepSymbolicRegressor | ||
import numpy as np | ||
# Generate some data | ||
np.random.seed(0) | ||
X = np.random.random((10, 2)) | ||
y = np.sin(X[:,0]) + X[:,1] ** 2 | ||
# Create the model | ||
model = DeepSymbolicRegressor("config.json") | ||
# Fit the model | ||
model.fit(X, y) # Should solve in ~10 seconds | ||
# View the best expression | ||
print(model.program_.pretty()) | ||
# Make predictions | ||
model.predict(2 * X) | ||
``` | ||
|
||
## Using an external dataset | ||
|
||
To use your own dataset, simply provide the path to the `"dataset"` key in the config, and give your task an arbitrary name. | ||
|
||
``` | ||
"task": { | ||
    "task_type": "regression", | ||
    "name": "my_task", | ||
    "dataset": "./path/to/my_dataset.csv", | ||
    ... | ||
} | ||
``` | ||
|
||
Then run DSR: | ||
|
||
``` | ||
python -m dsr.run path/to/config.json | ||
``` | ||
|
||
Note the `--b` flag matches the name of the CSV file (without the `.csv` extension). | ||
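
As a rough illustration of preparing such a file, the sketch below writes a small CSV using only the standard library and derives the name that corresponds to it. The file layout is an assumption that this README does not state: no header row, one sample per row, input features first and the target in the last column. Check the docs in `regression.py` before relying on it.

```python
import csv
import math
import os
import random
import tempfile

# Toy data following the same formula as the sklearn example: y = sin(x1) + x2^2
random.seed(0)
rows = []
for _ in range(10):
    x1, x2 = random.random(), random.random()
    rows.append([x1, x2, math.sin(x1) + x2 ** 2])

# ASSUMED layout (not confirmed by the README): headerless CSV,
# one sample per row, target value in the last column.
csv_path = os.path.join(tempfile.mkdtemp(), "my_dataset.csv")
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# The dataset name is the file name without its ".csv" extension.
dataset_name = os.path.splitext(os.path.basename(csv_path))[0]
print(dataset_name)  # my_dataset
```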
|
||
## Command-line examples | ||
|
||
Show command-line help and quit | ||
|
||
``` | ||
python -m dsr.run --help | ||
``` | ||
|
||
Train 2 independent runs of DSR on the Nguyen-1 benchmark using 2 cores | ||
|
||
``` | ||
python -m dsr.run config.json --b=Nguyen-1 --mc=2 --num_cores=2 | ||
``` | ||
|
||
Train DSR on all 12 Nguyen benchmarks using 12 cores | ||
|
||
``` | ||
python -m dsr.run config.json --b=Nguyen --num_cores=12 | ||
``` | ||
|
||
Train 2 independent runs of GP on Nguyen-1 | ||
|
||
``` | ||
python -m dsr.run config.json --method=gp --b=Nguyen-1 --mc=2 --num_cores=2 | ||
``` | ||
|
||
Train DSR on Nguyen-1 and Nguyen-4 | ||
|
||
``` | ||
python -m dsr.run config.json --b=Nguyen-1 --b=Nguyen-4 | ||
``` | ||
|
||
# Release | ||
|
||
LLNL-CODE-647188 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
from dsr.core import DeepSymbolicOptimizer | ||
from dsr.task.regression.sklearn import DeepSymbolicRegressor | ||
|
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
"""Defines constraints for GP individuals, to be used as decorators for | ||
evolutionary operations.""" | ||
|
||
from dsr.functions import UNARY_TOKENS, BINARY_TOKENS | ||
|
||
TRIG_TOKENS = ["sin", "cos", "tan", "csc", "sec", "cot"] | ||
|
||
# Define inverse tokens | ||
INVERSE_TOKENS = { | ||
    "exp" : "log", | ||
    "neg" : "neg", | ||
    "inv" : "inv", | ||
    "sqrt" : "n2" | ||
} | ||
|
||
# Add inverse trig functions | ||
INVERSE_TOKENS.update({ | ||
    t : "arc" + t for t in TRIG_TOKENS | ||
}) | ||
|
||
# Add reverse | ||
INVERSE_TOKENS.update({ | ||
    v : k for k, v in INVERSE_TOKENS.items() | ||
}) | ||
|
||
DEBUG = False | ||
|
||
|
||
def check_inv(ind): | ||
    """Returns True if two sequential tokens are inverse unary operators.""" | ||
|
||
    names = [node.name for node in ind] | ||
    for i, name in enumerate(names[:-1]): | ||
        if name in INVERSE_TOKENS and names[i+1] == INVERSE_TOKENS[name]: | ||
            if DEBUG: | ||
                print("Constrained inverse:", ind) | ||
            return True | ||
    return False | ||
|
||
|
||
def check_const(ind): | ||
    """Returns True if children of a parent are all const tokens.""" | ||
|
||
    names = [node.name for node in ind] | ||
    for i, name in enumerate(names): | ||
        if name in UNARY_TOKENS and names[i+1] == "const": | ||
            if DEBUG: | ||
                print("Constrained const (unary)", ind) | ||
            return True | ||
        # In prefix order, a terminal first child sits at i+1 and the second child at i+2 | ||
        if name in BINARY_TOKENS and names[i+1] == "const" and names[i+2] == "const": | ||
            if DEBUG: | ||
                print("Constrained const (binary)", ind) | ||
            return True | ||
    return False | ||
|
||
|
||
def check_trig(ind): | ||
    """Returns True if a descendant of a trig operator is another trig | ||
    operator.""" | ||
|
||
    names = [node.name for node in ind] | ||
    trig_descendant = False # True when current node is a descendant of a trig operator | ||
    trig_dangling = None # Number of unselected nodes in trig subtree | ||
    for i, name in enumerate(names): | ||
        if name in TRIG_TOKENS: | ||
            if trig_descendant: | ||
                if DEBUG: | ||
                    print("Constrained trig:", ind) | ||
                return True | ||
            trig_descendant = True | ||
            trig_dangling = 1 | ||
        elif trig_descendant: | ||
            if name in BINARY_TOKENS: | ||
                trig_dangling += 1 | ||
            elif name not in UNARY_TOKENS: | ||
                trig_dangling -= 1 | ||
            if trig_dangling == 0: | ||
                trig_descendant = False | ||
    return False | ||
|
||
|
||
def make_check_min_len(min_length): | ||
    """Creates closure for minimum length constraint""" | ||
|
||
    def check_min_len(ind): | ||
        """Returns True if individual is less than minimum length""" | ||
|
||
        if len(ind) < min_length: | ||
            if DEBUG: | ||
                print("Constrained min len: {} (length {})".format(ind, len(ind))) | ||
            return True | ||
|
||
        return False | ||
|
||
    return check_min_len | ||
|
||
|
||
def make_check_max_len(max_length): | ||
    """Creates closure for maximum length constraint""" | ||
|
||
    def check_max_len(ind): | ||
        """Returns True if individual is greater than maximum length""" | ||
|
||
        if len(ind) > max_length: | ||
            if DEBUG: | ||
                print("Constrained max len: {} (length {})".format(ind, len(ind))) | ||
            return True | ||
|
||
        return False | ||
|
||
    return check_max_len | ||
|
||
|
||
def make_check_num_const(max_const): | ||
    """Creates closure for maximum number of constants constraint""" | ||
|
||
    def check_num_const(ind): | ||
        """Returns True if individual has more than max_const const tokens""" | ||
|
||
        num_const = len([t for t in ind if t.name == "const"]) | ||
        if num_const > max_const: | ||
            if DEBUG: | ||
                print("Constrained max const: {} ({} consts)".format(ind, num_const)) | ||
            return True | ||
|
||
        return False | ||
|
||
    return check_num_const |
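
The module docstring says these checks are meant to be used as decorators for evolutionary operations, but the wrapping code is not part of this file. The following is a minimal, self-contained sketch of one way that could look. `Node`, `INVERSE`, `reject_if` and `toy_mutate` are hypothetical names introduced only for illustration, and the simplified `check_inv` uses a small subset of the inverse-token table; none of this is the repository's actual wrapper.

```python
from collections import namedtuple
from functools import wraps

# Hypothetical stand-in for a GP node; the checks only read the .name attribute.
Node = namedtuple("Node", ["name"])

# Small subset of the inverse-token table, for illustration only.
INVERSE = {"exp": "log", "log": "exp", "neg": "neg", "inv": "inv"}


def check_inv(ind):
    """Return True if two sequential tokens are inverse unary operators."""
    names = [node.name for node in ind]
    return any(INVERSE.get(a) == b for a, b in zip(names, names[1:]))


def reject_if(check):
    """Hypothetical decorator factory: keep the parent when a child violates `check`."""
    def decorator(operator):
        @wraps(operator)
        def wrapper(*parents):
            children = operator(*parents)
            # Fall back to the corresponding parent for each violating child.
            return tuple(p if check(c) else c for p, c in zip(parents, children))
        return wrapper
    return decorator


@reject_if(check_inv)
def toy_mutate(ind):
    """Toy operator: wrap the individual in exp(...), returned as a 1-tuple."""
    return ([Node("exp")] + list(ind),)


ok = [Node("sin"), Node("x1")]   # sin(x1) in prefix order
bad = [Node("log"), Node("x1")]  # log(x1); exp(log(x1)) would be redundant

print([n.name for n in toy_mutate(ok)[0]])   # ['exp', 'sin', 'x1']
print([n.name for n in toy_mutate(bad)[0]])  # ['log', 'x1']
```

Returning the parent instead of the offending child keeps the population size fixed, which is one common way to enforce a hard constraint; resampling the operator until the check passes is an alternative.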