diff --git a/README.md b/README.md
index dce27b7..684cc7e 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,17 @@
+
+

SWT-Bench

+
+[![Build & Test](https://github.com/logic-star-ai/swt-bench/actions/workflows/build.yml/badge.svg)](https://github.com/logic-star-ai/swt-bench/actions/workflows/build.yml)
+
+ Build
+
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
+
+
+ ## 👋 Overview
+ SWT-Bench is a benchmark for evaluating large language models on test generation for real-world software issues collected from GitHub. Given a *codebase* and an *issue*, a language model is tasked with generating a *reproducing test* that fails in the original state of the codebase and passes after a patch resolving the issue has been applied.