Burro 🫏

Burro is a command-line interface (CLI) tool for evaluating Large Language Model (LLM) outputs. It runs several types of evaluation from JSON configuration files and stores your OpenAI API key in encrypted form.

🚀 Features

- Three specialized evaluation types:
  - Answer correctness evaluation with context
  - Close-ended QA matching
  - Simple output-expected comparison
- Secure OpenAI API key management
- JSON-based evaluation configurations

📋 Prerequisites

OpenAI API key

🛠️ Installation

macOS - Apple Silicon (M1/M2/M3)

sudo curl -L "https://github.com/thisguymartin/burro/releases/download/latest/build-mac-silicon" -o /usr/local/bin/burro && sudo chmod +x /usr/local/bin/burro

macOS - Intel

sudo curl -L "https://github.com/thisguymartin/burro/releases/download/latest/build-mac-intel" -o /usr/local/bin/burro && sudo chmod +x /usr/local/bin/burro

Linux - ARM

sudo curl -L "https://github.com/thisguymartin/burro/releases/download/latest/build-linux-arm" -o /usr/local/bin/burro && sudo chmod +x /usr/local/bin/burro

Linux - Intel

sudo curl -L "https://github.com/thisguymartin/burro/releases/download/latest/build-linux-intel" -o /usr/local/bin/burro && sudo chmod +x /usr/local/bin/burro

Windows

1. Download build-windows.exe from the releases page
2. Rename it to burro.exe
3. Move it to your desired location (e.g., C:\Program Files\burro\burro.exe)

🔧 Usage

Setting up API Keys

burro set-openai-key

Running Evaluations

burro run-eval <evaluation-file>

Replace <evaluation-file> with the path to a JSON evaluation file, such as closeqa.json or evals.json (both described under Evaluation Types).

📊 Evaluation Types

✅ Current Evaluation Types

Close QA (closeqa.json)
- Exact matching for close-ended questions
- Strict format validation
- Support for multiple correct answers
Simple Evals (evals.json)
- Basic output vs expected comparisons
- Quick and efficient validation
- Flexible matching options

🔜 Coming Soon

LLM-as-a-Judge Evaluations

Advanced evaluation methods using LLMs as judges:

🔜 Battle: Compare outputs from different models head-to-head
🔜 Humor: Evaluate the humor and wit in model responses
🔜 Moderation: Check content for safety and appropriateness
🔜 Security: Assess responses for potential security vulnerabilities
🔜 Summarization: Evaluate the quality and accuracy of text summaries
🔜 SQL: Verify the correctness of generated SQL queries
🔜 Translation: Assess translation quality across languages
🔜 Fine-tuned binary classifiers: Specialized evaluations using custom-trained models

Heuristic Evaluations

Mathematical and algorithmic comparison methods:

🔜 Levenshtein distance: Measure string similarity using edit distance
🔜 Exact match: Check for perfect matches between outputs
🔜 Numeric difference: Compare numerical values and tolerances
🔜 JSON diff: Analyze structural differences in JSON outputs
🔜 Jaccard distance: Calculate similarity between sets of tokens
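
These heuristics are not yet available in Burro, so the TypeScript below is only a minimal sketch of three of the listed metrics, not Burro's implementation. The function names (levenshtein, jaccardDistance, withinTolerance), the whitespace tokenization, and the tolerance parameter are all assumptions made for illustration.

```typescript
// Illustrative sketch only: these helpers are hypothetical and are not part of Burro.

// Levenshtein distance: the minimum number of single-character insertions,
// deletions, or substitutions needed to turn `a` into `b`.
// Compares UTF-16 code units, which is adequate for plain ASCII text.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first (i - 1) characters of a and the first j of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i]; // turning i characters into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      const deletion = prev[j] + 1;
      const insertion = curr[j - 1] + 1;
      const substitution = prev[j - 1] + cost; // cost is 0 when the characters match
      curr.push(Math.min(deletion, insertion, substitution));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Jaccard distance between the sets of lowercase, whitespace-separated tokens:
// 1 minus (size of intersection / size of union).
// Returns 0 for identical token sets and 1 for disjoint ones.
function jaccardDistance(a: string, b: string): number {
  const tokenize = (text: string): Set<string> =>
    new Set(text.toLowerCase().split(/\s+/).filter((t) => t.length > 0));
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 0; // two empty sets are identical

  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared++;
  });
  const union = setA.size + setB.size - shared;
  return 1 - shared / union;
}

// Numeric difference: true when `actual` is within `tolerance` of `expected`.
function withinTolerance(actual: number, expected: number, tolerance: number): boolean {
  return Math.abs(actual - expected) <= tolerance;
}

console.log(levenshtein("kitten", "sitting"));              // 3
console.log(jaccardDistance("the cat sat", "the cat ran")); // 0.5
console.log(withinTolerance(3.14, 3.14159, 0.01));          // true
```

A distance of 0 means the two strings (or token sets) are identical. Larger values mean the output drifts further from the expected answer, so a practical check would compare the distance against a threshold chosen for the task.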

Current Evaluation Types

1. Close QA (closeqa.json)

Checks responses to close-ended questions for an exact match.

Example format:

{
  "input": "List the first three prime numbers in ascending order, separated by commas.",
  "output": "2,3,5",
  "criteria": "Numbers must be in correct order, separated by commas with no spaces"
}
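
To make the shape of a Close QA case concrete, the TypeScript sketch below models the three fields from the example and checks that a parsed JSON value has them. The names CloseQaCase and isCloseQaCase are hypothetical. Whether closeqa.json holds a single object or an array of them is not stated here, so the check covers one object only.

```typescript
// Hypothetical type for one Close QA case, based only on the fields in the example above.
interface CloseQaCase {
  input: string;    // the question posed to the model
  output: string;   // the model's answer
  criteria: string; // the rule the answer is judged against
}

// Type guard: true when `value` is an object with all three fields as strings.
function isCloseQaCase(value: unknown): value is CloseQaCase {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["input"] === "string" &&
    typeof record["output"] === "string" &&
    typeof record["criteria"] === "string"
  );
}

const parsed: unknown = JSON.parse(`{
  "input": "List the first three prime numbers in ascending order, separated by commas.",
  "output": "2,3,5",
  "criteria": "Numbers must be in correct order, separated by commas with no spaces"
}`);

console.log(isCloseQaCase(parsed));           // true
console.log(isCloseQaCase({ input: "only" })); // false
```

Validating the file before running an evaluation makes a missing or misspelled field show up as a clear error instead of a confusing result.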

2. Simple Evals (evals.json)

Compares model outputs against expected answers.

Example format:

{
  "input": "What is the capital of France?",
  "output": "The capital city of France is Paris",
  "expected": "Paris"
}
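
The example suggests that a match need not be exact, since the output is a full sentence while the expected value is a single word. The rule Burro actually applies is not documented in this README, so the TypeScript below is only one plausible reading: a case-insensitive, whitespace-normalized containment check. SimpleEvalCase, normalize, and containsExpected are hypothetical names.

```typescript
// Hypothetical type for one Simple Evals case, based only on the fields in the example above.
interface SimpleEvalCase {
  input: string;
  output: string;
  expected: string;
}

// Lowercase the text and collapse runs of whitespace so that trivial
// formatting differences do not affect the comparison.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Assumed matching rule (not confirmed by the README): the case passes when
// the normalized expected answer appears anywhere in the normalized output.
function containsExpected(testCase: SimpleEvalCase): boolean {
  const expected = normalize(testCase.expected);
  if (expected.length === 0) return false; // an empty expectation cannot be matched meaningfully
  return normalize(testCase.output).indexOf(expected) !== -1;
}

const franceCase: SimpleEvalCase = {
  input: "What is the capital of France?",
  output: "The capital city of France is Paris",
  expected: "Paris",
};

console.log(containsExpected(franceCase)); // true
```

A containment check is lenient: it also accepts an output such as "Paris is not the capital", so stricter tasks are better served by exact matching.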

🔒 Security Features

AES encryption for API key storage
Secure key generation
Encrypted SQLite storage

System Architecture Check

To determine which version you should download, you can check your system's architecture:

macOS

uname -m

This will return:

arm64: Use Apple Silicon version (M1/M2/M3 Macs)
x86_64: Use Intel version

Linux

uname -m

This will return:

aarch64 or arm64: Use Linux ARM version
x86_64: Use Linux Intel version
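
The mapping above can be summarized as a small lookup. The TypeScript sketch below takes the output of uname -s and uname -m and returns the matching release asset name used in the installation commands. The function name burroAssetFor is hypothetical, and the sketch assumes uname -s reports Darwin on macOS and Linux on Linux.

```typescript
// Hypothetical helper: maps `uname -s` / `uname -m` output to the release asset
// names that appear in the installation commands. Not part of Burro itself.
function burroAssetFor(os: "Darwin" | "Linux", arch: string): string | undefined {
  const isArm = arch === "arm64" || arch === "aarch64";
  const isIntel = arch === "x86_64";
  if (!isArm && !isIntel) return undefined; // no matching build is listed for this architecture

  if (os === "Darwin") {
    return isArm ? "build-mac-silicon" : "build-mac-intel";
  }
  return isArm ? "build-linux-arm" : "build-linux-intel";
}

console.log(burroAssetFor("Darwin", "arm64"));  // build-mac-silicon
console.log(burroAssetFor("Linux", "x86_64"));  // build-linux-intel
console.log(burroAssetFor("Linux", "riscv64")); // undefined
```

Returning undefined for an unrecognized architecture keeps the caller from downloading a binary that will not run on the machine.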

Troubleshooting

Permission Denied

If you encounter permission issues during installation:

# Check current permissions
ls -l /usr/local/bin/burro

# Fix permissions if needed
sudo chmod +x /usr/local/bin/burro

Command Not Found

If the burro command is not found after installation:

- Verify that the installation location is in your PATH
- Try restarting your terminal
- Verify that the executable exists and has proper permissions

Uninstallation Guide

macOS & Linux

sudo rm /usr/local/bin/burro

# Verify removal
command -v burro  # Prints nothing if successfully removed

Windows

1. Delete burro.exe from your installation location
2. If you added its directory to PATH, remove it:
- Open System Properties (Win + Pause/Break)
- Click "Advanced system settings"
- Click "Environment Variables"
- Under "System variables" or "User variables", find "Path"
- Click "Edit"
- Remove the directory containing burro.exe
- Click "OK" to save changes

Verify removal (PowerShell):

where.exe burro  # Reports that no files were found if burro was removed
