Learning-Based Coded Computation

Coded computation is an emerging approach that applies ideas from coding theory to impart resource-efficient resilience against slowdowns and failures that occur in large-scale distributed computing systems. This repository contains a framework for exploring the use of machine learning to apply coded computation to broader classes of computations. More background on coded computation, its challenges, and the potential for machine learning within coded computation is provided in this blog post.
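As a minimal illustration of the idea described above, the sketch below applies coded computation to a linear function, where classical coding ideas work exactly. The function `F`, the matrix `W`, and the choice of an addition encoder with a subtraction decoder are assumptions made for illustration; they are not taken from this repository's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical deployed computation: a linear map F(x) = x @ W.
W = rng.normal(size=(4, 3))

def F(x):
    return x @ W

# Two queries that would be sent to two different servers.
x1 = rng.normal(size=4)
x2 = rng.normal(size=4)

# Encode: build one extra "parity" query and send it to a third server.
parity = x1 + x2

# Suppose the server handling x2 is slow or has failed,
# so only these two results come back in time.
out1 = F(x1)
out_parity = F(parity)

# Decode: for a linear F, F(x1 + x2) = F(x1) + F(x2),
# so the missing output is recovered by subtraction.
recovered = out_parity - out1

print(np.allclose(recovered, F(x2)))
```

One extra server protects two queries here, instead of replicating each query. The recovery step relies on `F` being linear, which is why extending the approach to broader classes of computations calls for a different technique.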

This repository focuses primarily on applying learning-based coded computation to impart resilience to distributed systems performing inference with neural networks.

This repository contains the code used for the following papers:

This repository originated as the artifact associated with the SOSP 2019 paper Parity Models: Erasure-Coded Resilience for Prediction Serving Systems. It has since evolved beyond serving this paper in isolation. The original artifact associated with the SOSP 2019 paper is located in the sosp2019-artifact branch.

Repository structure

  • train: Code for training a neural network parity model
  • clipper-parm: Code for ParM, a prediction serving system that employs learning-based coded computation to impart resilience against slowdowns and failures. For more details, see our paper.

Please see the READMEs in each of these subdirectories for more details.
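To give a rough sense of what training a parity model involves, here is a toy sketch under stated assumptions. It is not the code in `train`: the deployed model `F` is a fixed `tanh` layer standing in for a neural network, the encoder is assumed to be addition, the decoder subtraction, and the "parity model" is a small linear map fitted by least squares rather than a neural network. All names (`F`, `features`, `parity_model`, `decode`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical deployed model: a fixed nonlinear function
# (a stand-in for a neural network).
W = rng.normal(size=(4, 3))

def F(x):
    return np.tanh(x @ W)

# Training data: pairs of queries, their parity, and the output
# the parity model should produce for that parity.
X1 = rng.normal(size=(500, 4))
X2 = rng.normal(size=(500, 4))
P = X1 + X2               # addition encoder (assumed)
target = F(X1) + F(X2)    # desired parity-model output

def features(p):
    # Toy feature map: the deployed model's output on the parity plus a bias column.
    return np.hstack([F(p), np.ones((p.shape[0], 1))])

# "Train" the toy parity model by least squares on the training pairs.
A, *_ = np.linalg.lstsq(features(P), target, rcond=None)

def parity_model(p):
    return features(p) @ A

def decode(parity_out, available_out):
    # Subtraction decoder (assumed): estimate the unavailable output.
    return parity_out - available_out

# Reconstruct F(X2) as if the servers handling X2 were unavailable.
approx = decode(parity_model(P), F(X1))
naive = decode(F(P), F(X1))  # baseline: reuse F itself on the parity

mse_learned = np.mean((approx - F(X2)) ** 2)
mse_naive = np.mean((naive - F(X2)) ** 2)
print(f"learned parity model MSE: {mse_learned:.4f}")
print(f"reusing F on the parity MSE: {mse_naive:.4f}")
```

Both errors are measured on the training pairs, so this only shows that fitting the parity model to the target `F(X1) + F(X2)` does at least as well as reusing `F` unchanged. Unlike the linear case, the reconstruction is approximate. A realistic evaluation would use held-out queries and an actual neural network parity model.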

Cloning the repository

If you are only interested in the training portion of the repository, you can clone it with:

git clone https://github.com/Thesys-lab/parity-models.git

If you are interested in running ParM, the prediction serving system employing parity models, you will need to clone required submodules using the --recursive flag:

git clone --recursive https://github.com/Thesys-lab/parity-models.git

License

Copyright 2019, Carnegie Mellon University

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Support

We gratefully acknowledge support from the National Science Foundation (NSF) under grant CNS-1850483, an NSF Graduate Research Fellowship (DGE-1745016 and DGE-1252522), and Amazon Web Services.