Match the project to your task before installing it.
Observability and Evaluation · Public
evidently
Observability and evaluation project for turning logs, quality metrics, drift, or experiment results into reviewable signals.
What it can do: Observability setup paths, metric boundaries, sample-data redaction, evaluation checks, and failure triage.
Review the portable capability path.
Before continuing: Verify in a sandbox. Do not treat a preview pack as a proven local install.
GitHub snapshot: 7.6k stars · 865 forks · 97 contributors
Doramagic.ai · Last verification date: 2026-06-29 · Verification method: source evidence, semantic profile, public page gate, and static build acceptance.
Publication status · 2026-06-29
What is evidently?
- evidently helps developers observe, evaluate, or monitor AI/data application behavior and quality.
- Best fit: Developers who need reviewable observability or evaluation workflows for AI apps, data pipelines, or experiments.
- Not for: Users who have no logs or sample data, have not defined privacy boundaries, or only need a chat UI.
- Capability added to an AI workflow: Observability setup paths, metric boundaries, sample-data redaction, evaluation checks, and failure triage
- First safe verification step: Verify collection, metric interpretation, export, and deletion paths with redacted sample data first.
- Verification state: source, Quick Start, and sandbox install checks are recorded as passed.
- Top risk: May increase setup, validation, or first-run risk for the user.
- Evidence base: https://github.com/evidentlyai/evidently, https://github.com/evidentlyai/evidently#readme, Human Manual, Pitfall Log
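The first safe verification step above calls for redacted sample data. A minimal Python sketch of such a redaction pass follows, using only the standard library. The field names in `SENSITIVE_KEYS`, the salt, and the placeholder token are hypothetical choices for illustration; they are not part of evidently or of the source page. Salted hashing is pseudonymization rather than full anonymization, so treat this as a starting point only.

```python
import hashlib
import re

# Simple email matcher; an assumption for illustration, not an exhaustive PII detector.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical field names that should never reach a report in clear text.
SENSITIVE_KEYS = {"user_id", "email", "name"}


def pseudonymize(value: str, salt: str = "sandbox-only") -> str:
    """Replace a value with a short, stable, salted SHA-256 token."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "id_" + digest[:12]


def redact_record(record: dict) -> dict:
    """Return a copy of the record with sensitive fields and emails masked."""
    redacted = {}
    for key, value in record.items():
        if key in SENSITIVE_KEYS:
            # Whole field is sensitive: swap it for a pseudonym.
            redacted[key] = pseudonymize(str(value))
        elif isinstance(value, str):
            # Free text: mask any email addresses found inside it.
            redacted[key] = EMAIL_PATTERN.sub("[REDACTED_EMAIL]", value)
        else:
            # Numbers and other non-text values pass through unchanged.
            redacted[key] = value
    return redacted
```

Running `redact_record` over each sample row before it enters any collection or evaluation step keeps metric columns intact while removing direct identifiers. The same input always maps to the same token, so joins between redacted tables still work.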
01
Quick decision
Use this section to decide whether the project is worth a deeper read.
Observability and evaluation project for turning logs, quality metrics, drift, or experiment results into reviewable signals.
02
What it can do
Translate the upstream project into concrete capabilities the user can judge before installing.
Overview and System Architecture
Related topics: Core Evaluation Engine: Reports, Metrics, Presets, and Datasets; LLM Evaluation, Descriptors, Prompts, RAG, and Guardrails; UI Service, Storage Backends, and Deployment
Source: https://github.com/evidentlyai/evidently / Human Manual
Core Evaluation Engine: Reports, Metrics, Presets, and Datasets
Related topics: Overview and System Architecture; LLM Evaluation, Descriptors, Prompts, RAG, and Guardrails
Source: https://github.com/evidentlyai/evidently / Human Manual
LLM Evaluation, Descriptors, Prompts, RAG, and Guardrails
Related topics: Core Evaluation Engine: Reports, Metrics, Presets, and Datasets; UI Service, Storage Backends, and Deployment
Source: https://github.com/evidentlyai/evidently / Human Manual
UI Service, Storage Backends, and Deployment
Related topics: Overview and System Architecture; Core Evaluation Engine: Reports, Metrics, Presets, and Datasets; LLM Evaluation, Descriptors, Prompts, RAG, and Guardrails
Source: https://github.com/evidentlyai/evidently / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Source: Doramagic discovery, validation, and Project Pack records
Sources: https://github.com/evidentlyai/evidently, Human Manual, Project Pack evidence, and downstream validation signals.
03
Community Discussion Evidence
Project-level external discussion stays visible on the detail page, not only inside the manual.
12 source-linked items. Review these external discussions before using evidently with real data or production workflows. They are review inputs, not standalone proof that the project is production-ready.
- 01 · SemanticSimilarity fails with sentence-transformers > 5.3.0 · github / github_issue
- 02 · Make `LLMEval` descriptors plottable from Tests · github / github_issue
- 03 · Legacy metrics to new Report API · github / github_issue
- 04 · Unauthenticated path traversal arbitrary file read in Evidently UI datas · github / github_issue
- 05 · Plotly Graph Objects - Deprecated module is in use. · github / github_issue
- 06 · Protect this repo from AI-generated PRs · github / github_issue
- 07 · Fix semantic similarity in LLM eval tutorial · github / github_issue
- 08 · The fixed value for feel_zeroes in get_binned_data may lead to deviation · github / github_issue
- 09 · Error when trying to create collector config in self-hosted environment · github / github_issue
- 10 · python 3.13 support · github / github_issue
- 11 · Modify scales of plots generated in report · github / github_issue
- 12 · Installation risk requires verification · GitHub / issue
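Item 01 in the list above reports a failure with sentence-transformers versions above 5.3.0. One way to flag that condition before a run is a small preflight check such as the sketch below. The threshold comes from the issue title only; whether it still applies to a given evidently release is an assumption to confirm upstream. The helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Threshold copied from the issue title; confirm against the upstream issue before relying on it.
REPORTED_THRESHOLD = (5, 3, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '5.3.1rc1' into (5, 3, 1), keeping only leading numeric components."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def exceeds_threshold(version_text: str,
                      threshold: Tuple[int, ...] = REPORTED_THRESHOLD) -> bool:
    """True when the version is strictly newer than the reported threshold."""
    return parse_version(version_text) > threshold


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A preflight script might call `installed_version("sentence-transformers")` and, when the result is not `None` and `exceeds_threshold` returns `True`, print a warning that points to the issue rather than aborting.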
04
How to start
Only source-backed commands are shown here. Verify them in an isolated environment first.
- Try the prompt first (preview): Test the workflow without installing the upstream project.
- Read the Human Manual (manual): Understand inputs, outputs, limits, and failure modes.
- Take context to your AI host (context): Use the compiled assets in your preferred AI environment.
- Run sandbox verification (verify): Confirm install commands and rollback before using a primary environment.
pip install evidently
Official start command · https://github.com/evidentlyai/evidently#readme · verified: yes
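The sandbox verification step can be sketched with Python's standard `venv` module, which creates an isolated environment so the install never touches a primary interpreter. Only the `pip install evidently` command comes from the source; the helper names and the idea of scripting the step this way are assumptions for illustration.

```python
import sys
import venv
from pathlib import Path
from typing import List


def create_sandbox(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a fresh virtual environment and return the path to its Python interpreter."""
    # clear=True wipes any previous contents so each verification starts clean.
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(python: Path, package: str = "evidently") -> List[str]:
    """Build the pip command that installs the package into the sandbox only."""
    return [str(python), "-m", "pip", "install", package]
```

To perform the install, pass the result to `subprocess.run(install_command(python), check=True)` after calling `create_sandbox` on a scratch directory. Rollback is then a matter of deleting that directory.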
05
Human Manual
The English page must expose the real manual, not a short placeholder.
8+ sections · Human Manual
evidently Manual
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
Open the full manual
- https://github.com/evidentlyai/evidently Project Manual
- Table of Contents
- Overview and System Architecture
- Related Pages
- Purpose and Scope
- Repository Layout
- Core Subsystems
- Frontend Stack and Data Contracts
06
AI Context Pack and portable assets
After deciding to continue, take the project context into your own AI host.
Complete pack plus user-owned assets
These files are planning and verification assets for Claude Code, Codex, Gemini, Cursor, ChatGPT, and other AI hosts.
07
Preflight checks
Treat this page as a planning asset, not proof that your local environment is ready.
- The manual is generated from source-linked project files and Doramagic validation signals.
- Community evidence warnings stay visible instead of being converted into marketing claims.
- This English page is indexable because the locale quality gate passed and explicit English index approval is enabled.
- Use the upstream repository as the final authority for installation commands, license, and version-specific behavior.
08
Pitfall Log and verification risks
Doramagic surfaces high-risk items before users treat a candidate capability as verified.
Installation risk requires verification
May increase setup, validation, or first-run risk for the user.
Configuration risk requires verification
May increase setup, validation, or first-run risk for the user.
Runtime risk requires verification
May increase setup, validation, or first-run risk for the user.
Runtime risk requires verification
May increase setup, validation, or first-run risk for the user.
Runtime risk requires verification
May increase setup, validation, or first-run risk for the user.
Maintenance risk requires verification
May increase setup, validation, or first-run risk for the user.
Maintenance risk requires verification
May increase setup, validation, or first-run risk for the user.
Security or permission risk requires verification
May increase setup, validation, or first-run risk for the user.