🔬 DeepEval
Evaluation
The open-source LLM evaluation framework with 50+ research-backed metrics.
7k stars · 600 forks · Python
About
DeepEval provides a pytest-style interface for unit-testing LLM outputs, with metrics covering hallucination, relevancy, faithfulness, safety, and more. It is backed by Confident AI's platform for team-level evaluations.
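A minimal sketch of the pytest-style workflow, following DeepEval's documented quickstart; the threshold, prompt, and output strings are illustrative, and class names can shift between releases:

```python
# test_relevancy.py -- run with `deepeval test run test_relevancy.py`
# or plain pytest; metrics use an LLM judge, so an evaluation model
# (e.g. an OPENAI_API_KEY) must be configured.
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

def test_answer_relevancy():
    # Fail the test if the answer's relevancy score falls below 0.7.
    metric = AnswerRelevancyMetric(threshold=0.7)
    test_case = LLMTestCase(
        input="What if these shoes don't fit?",  # illustrative prompt
        actual_output="We offer a 30-day full refund at no extra cost.",
    )
    assert_test(test_case, [metric])
```

Because each check is an ordinary test function, the same files can gate a CI/CD pipeline.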
Key Features
- 50+ metrics
- Pytest integration
- Hallucination detection (see the sketch after this list)
- Safety scoring
- RAG evaluation
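For metrics that need grounding, such as the hallucination detection referenced above, a test case can carry retrieval context. A sketch of direct, non-pytest usage; the strings are invented, and the `context` field and method names follow DeepEval's documentation but may differ across versions:

```python
from deepeval.metrics import HallucinationMetric
from deepeval.test_case import LLMTestCase

# Ground-truth passages the output is checked against (illustrative).
context = ["The Eiffel Tower was completed in 1889."]

test_case = LLMTestCase(
    input="When was the Eiffel Tower completed?",
    actual_output="It was completed in 1889.",
    context=context,
)

metric = HallucinationMetric(threshold=0.5)
metric.measure(test_case)           # populates .score and .reason
print(metric.score, metric.reason)
```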
Tags
Evaluation · Testing · LLM · Hallucination · CI/CD