🐢 Open-Source Evaluation & Testing for AI & LLM systems
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
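For intuition, here is a minimal, framework-agnostic sketch of what AutoML-style RAG optimization amounts to: enumerate a small grid of pipeline settings, score each configuration on an evaluation set, and keep the best one. The search space, pipeline builder, and metric below are illustrative placeholders, not AutoRAG's actual API.

```python
from itertools import product

# Hypothetical search space for a RAG pipeline; real sweeps would include
# retrievers, rerankers, chunking strategies, and prompt templates.
SEARCH_SPACE = {
    "chunk_size": [256, 512],
    "top_k": [3, 5],
    "prompt": ["concise", "detailed"],
}

def build_pipeline(config):
    # Stand-in for constructing a retriever + generator from `config`.
    return lambda question: f"answer to {question!r} with {config}"

def score_config(config, eval_set):
    # Stand-in metric: fraction of questions whose gold answer appears in the output.
    answer = build_pipeline(config)
    hits = sum(gold.lower() in answer(q).lower() for q, gold in eval_set)
    return hits / len(eval_set)

def optimize(eval_set):
    keys = list(SEARCH_SPACE)
    best_score, best_config = -1.0, None
    for values in product(*(SEARCH_SPACE[k] for k in keys)):
        config = dict(zip(keys, values))
        score = score_config(config, eval_set)
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score

if __name__ == "__main__":
    eval_set = [("What is RAG?", "retrieval-augmented generation")]
    print(optimize(eval_set))
```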
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.
Framework for testing vulnerabilities of large language models (LLMs).
This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.
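A comparison like this typically comes down to running the same queries through each framework and measuring latency (and, separately, answer quality). Below is a hedged sketch of such a timing harness, not the project's own code; the `dummy` callable stands in for a real LlamaIndex query engine or LangChain chain wrapped as a plain function.

```python
import statistics
import time

def benchmark(name, answer_fn, queries, repeats=3):
    """Run the same queries through one framework's callable and report latency."""
    latencies = []
    for _ in range(repeats):
        for q in queries:
            start = time.perf_counter()
            answer_fn(q)
            latencies.append(time.perf_counter() - start)
    latencies.sort()
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return {"framework": name, "mean_s": statistics.mean(latencies), "p95_s": p95}

if __name__ == "__main__":
    queries = ["What is RAG?", "How is retrieval evaluated?"]
    dummy = lambda q: q.upper()  # placeholder pipeline; swap in each real framework
    print(benchmark("dummy", dummy, queries))
```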
A framework for systematic evaluation of retrieval strategies and prompt engineering in RAG systems, featuring an interactive chat interface for document analysis.
Learn Retrieval-Augmented Generation (RAG) from scratch using LLMs from Hugging Face, with LangChain or plain Python.
Using MLflow to deploy your RAG pipeline built with LlamaIndex and LangChain, backed by Ollama, Hugging Face LLMs, or Groq.
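As a rough sketch of that pattern, the example below wraps a RAG pipeline in an `mlflow.pyfunc.PythonModel` so it can be logged, reloaded, and served through MLflow's standard model APIs. The pipeline itself is stubbed; in a real setup `load_context` would rebuild the LlamaIndex/LangChain chain and its Ollama, Hugging Face, or Groq backend, and exact keyword arguments may vary slightly across MLflow versions.

```python
import mlflow
import mlflow.pyfunc
import pandas as pd

class RAGModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # In a real deployment, rebuild or load the index/chain and its LLM backend here.
        self.pipeline = lambda question: f"stub answer for: {question}"

    def predict(self, context, model_input):
        # Served models typically receive a pandas DataFrame; assume a "question" column.
        return [self.pipeline(q) for q in model_input["question"]]

if __name__ == "__main__":
    with mlflow.start_run():
        info = mlflow.pyfunc.log_model(artifact_path="rag_model", python_model=RAGModel())
    loaded = mlflow.pyfunc.load_model(info.model_uri)
    print(loaded.predict(pd.DataFrame({"question": ["What does the indexed report conclude?"]})))
```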
Different approaches to evaluating RAG.
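Two of the simplest such approaches are checking whether the retriever surfaced the gold passage (hit rate) and measuring token overlap between the generated and reference answers (F1). The sketch below is generic and not tied to any framework listed here.

```python
def hit_rate(retrieved_ids, gold_id):
    """1.0 if the gold passage is among the retrieved passages, else 0.0."""
    return float(gold_id in retrieved_ids)

def token_f1(prediction, reference):
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if common == 0:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    print(hit_rate(["doc3", "doc7"], "doc7"))                              # 1.0
    print(round(token_f1("Paris is the capital", "The capital is Paris"), 2))  # 1.0
```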
Proposal for industry RAG evaluation: Generative Universal Evaluation of LLMs and Information retrieval
PandaChat-RAG: a benchmark for evaluating RAG systems on a non-synthetic Slovenian test dataset.