deepchecks Manual - Doramagic.ai

Doramagic Project Pack · Human Manual

deepchecks

Related topics: Installation & Quickstart, Core Architecture, Checks & Suites Framework

Deepchecks Repository Overview


Deepchecks is an open-source Python library designed for validating and testing machine learning models and data throughout the ML lifecycle. It provides a comprehensive suite of checks organized into validation suites, enabling data scientists and ML engineers to systematically evaluate model quality, detect data integrity issues, and ensure robust model performance.

Purpose and Scope

Deepchecks addresses the critical need for systematic ML validation by providing:

Pre-training validation: Checks for data integrity, distribution analysis, and feature engineering validation
Post-training validation: Model performance evaluation, error analysis, and robustness testing
Ongoing monitoring: Drift detection for data and model predictions
Cross-domain support: Built-in support for tabular, image, and natural language processing (NLP) domains

The library is distributed under the GNU Affero General Public License (version 3 or later) and is designed to integrate seamlessly into existing ML workflows, including CI/CD pipelines.

Architecture Overview

The Deepchecks architecture follows a modular design with clear separation between core abstractions, domain-specific implementations, and utility functions.

graph TB
    subgraph "Core Layer"
        C[Core Checks]
        S[Suite Engine]
        R[Check Result]
        E[Errors & Validation]
    end
    
    subgraph "Domain Layer"
        T[Tabular Module]
        I[Vision Module]
        N[NLP Module]
    end
    
    subgraph "Utility Layer"
        U[Utils - Typing]
        V[Utils - Validation]
        M[Utils - Metrics]
        L[Utils - Logger]
        J[Utils - JSON]
    end
    
    T --> C
    I --> C
    N --> C
    S --> R
    U --> E
    V --> E
    M --> C
    L --> C
    J --> R

Core Abstractions

The core layer provides fundamental abstractions that all domain modules inherit from:

Component	Purpose	Key Classes
Checks	Individual validation tests	`BaseCheck`, `TrainTestCheck`, `SingleDatasetCheck`
Suites	Collection of organized checks	`Suite`, `SuiteResult`
Results	Output from check execution	`CheckResult`, `CheckFailure`
Conditions	Pass/fail criteria for checks	`ConditionResult`, `ConditionCategory`

Source: deepchecks/__init__.py

Model Protocol System

Deepchecks defines a protocol-based model interface system that supports various model types while maintaining flexibility.

Basic Model Protocol

The BasicModel protocol defines the minimal interface required for any model to work with Deepchecks checks:

from typing import Hashable, List, Protocol, runtime_checkable

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""

    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...

Source: deepchecks/utils/typing.py:1-50

Classification Model Protocol

Classification models require additional probability prediction capabilities:

@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""

    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...

Source: deepchecks/utils/typing.py:53-61
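
To make the protocol mechanics concrete, here is a minimal standalone sketch of how `runtime_checkable` protocols behave under `isinstance`. It redefines the two protocols locally so it runs without Deepchecks installed; `ConstantRegressor`, `CoinFlipClassifier`, and `NotAModel` are hypothetical classes invented for illustration.

```python
from typing import Hashable, List, Protocol, runtime_checkable


@runtime_checkable
class BasicModel(Protocol):
    def predict(self, X) -> List[Hashable]: ...


@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    def predict_proba(self, X) -> List[Hashable]: ...


class ConstantRegressor:
    """Hypothetical model exposing only predict()."""

    def predict(self, X):
        return [0.0 for _ in X]


class CoinFlipClassifier:
    """Hypothetical model exposing predict() and predict_proba()."""

    def predict(self, X):
        return [0 for _ in X]

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class NotAModel:
    """Hypothetical object with neither method."""

    def transform(self, X):
        return X


# isinstance() only checks that the named methods exist, not their signatures.
is_basic = isinstance(ConstantRegressor(), BasicModel)
is_classifier = isinstance(ConstantRegressor(), ClassificationModel)
coin_is_classifier = isinstance(CoinFlipClassifier(), ClassificationModel)
junk_is_basic = isinstance(NotAModel(), BasicModel)
```

Because the check is structural, a model never needs to inherit from a Deepchecks class; exposing the right method names is enough.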

Task Types

Deepchecks supports three primary machine learning task types:

Task Type	Value	Description
REGRESSION	'regression'	Continuous value prediction
BINARY	'binary'	Binary classification
MULTICLASS	'multiclass'	Multi-class classification

Source: deepchecks/tabular/utils/task_type.py

Validation Utilities

The validation module provides essential functions for input validation and model verification.

Model Validation

def model_type_validation(model: t.Any):
    """Receive any object and check if it's an instance of a model we support."""
    if not isinstance(model, BasicModel):
        raise errors.ModelValidationError(
            f'Model supplied does not meets the minimal interface requirements.'
        )

Source: deepchecks/tabular/utils/validation.py

Value Validation

def ensure_hashable_or_mutable_sequence(
        value: t.Union[T, t.MutableSequence[T]],
        message: str = (
                'Provided value is neither hashable nor mutable '
                'sequence of hashable items. Got {type}')
) -> t.List[T]:

Source: deepchecks/utils/validation.py

Feature Importance System

Feature importance calculations are central to many Deepchecks checks, enabling identification of the most impactful features.

def calculate_feature_importance_or_none(
        model: t.Any,
        dataset: t.Union['tabular.Dataset', pd.DataFrame],
        model_classes,
        observed_classes,
        task_type,
        ...

Source: deepchecks/tabular/utils/feature_importance.py

Supported Methods

Method	Description
Permutation Importance	Uses scikit-learn's `permutation_importance`
Built-in Importance	Extracts from models with `feature_importances_` attribute
Order-based	Falls back to feature column order when other methods unavailable

Metrics and Scoring

Deepchecks provides comprehensive metric utilities for model evaluation.

Scorer Utilities

def get_gain(base_score, score, perfect_score, max_gain):
    """Get gain between base score and score compared to the distance from the perfect score."""

Source: deepchecks/utils/metrics.py

Gain Calculation Logic

The gain calculation provides normalized performance improvement metrics:

Scenario	Return Value
Both base and score are perfect	0
Base score is perfect but score is not	-max_gain
Normal improvement	`scores_diff / distance_from_perfect`
Capped improvement	Clamped to [-max_gain, max_gain]
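
The table can be read as a small piece of arithmetic. The sketch below is an illustrative re-implementation under stated assumptions, not the library's code: `gain_sketch` is a hypothetical name, and the `-max_gain` row is interpreted as the case where the base score already equals the perfect score while the new score does not.

```python
def gain_sketch(base_score, score, perfect_score, max_gain):
    """Normalized improvement of score over base_score (illustrative only)."""
    distance_from_perfect = perfect_score - base_score
    scores_diff = score - base_score
    if distance_from_perfect == 0:
        # Base is already perfect: no gain if score matches it,
        # otherwise the worst possible gain (assumed reading of the table).
        return 0 if scores_diff == 0 else -max_gain
    ratio = scores_diff / distance_from_perfect
    # Clamp the ratio to the interval [-max_gain, max_gain].
    return max(-max_gain, min(max_gain, ratio))


halfway = gain_sketch(base_score=0.5, score=0.75, perfect_score=1.0, max_gain=1.0)
capped = gain_sketch(base_score=0.9, score=0.0, perfect_score=1.0, max_gain=5.0)
```

Here `halfway` is 0.5 because the score closes half of the remaining distance to a perfect score, and `capped` is clamped to -5.0.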

Logging System

Deepchecks implements a centralized logging system for debugging and progress tracking.

_logger = logging.getLogger('deepchecks')

def get_logger() -> logging.Logger:
    """Return the deepchecks logger."""
    return _logger

def set_verbosity(level: int):
    """Set the deepchecks logger verbosity level."""

Source: deepchecks/utils/logger.py

Verbosity Levels

Level	Effect
INFO	Shows progress bars and informational messages
WARNING	Suppresses progress bars, shows warnings only
ERROR	Shows only error messages
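
The level filtering in the table is standard Python `logging` behavior. The following sketch uses only the standard library and the logger name `'deepchecks'` shown above; `messages_shown` is a hypothetical helper, and the link between level and progress bars is Deepchecks behavior that this sketch does not reproduce.

```python
import logging

logger = logging.getLogger("deepchecks")

LEVELS = (("INFO", logging.INFO), ("WARNING", logging.WARNING), ("ERROR", logging.ERROR))


def messages_shown(level):
    """Return the names of the message levels that pass the given threshold."""
    logger.setLevel(level)
    return [name for name, value in LEVELS if logger.isEnabledFor(value)]


at_warning = messages_shown(logging.WARNING)
at_error = messages_shown(logging.ERROR)
```

Setting the threshold to `logging.WARNING` keeps warnings and errors and drops informational messages.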

Serialization and JSON Support

Deepchecks supports serialization of check results for persistence and integration with external systems.

def from_json(json_dict: t.Union[str, t.Dict]) -> t.Union[BaseCheckResult, SuiteResult]:
    """Convert a json object that was returned from one of our classes to_json."""
    if isinstance(json_dict, str):
        json_dict = jsonpickle.loads(json_dict)
    json_type = json_dict['type']
    if 'Check' in json_type:
        return BaseCheckResult.from_json(json_dict)
    if json_type == 'SuiteResult':
        return SuiteResult.from_json(json_dict)

Source: deepchecks/utils/json_utils.py

Note: There is a known issue (#2804) with WeakSegmentsPerformance().to_json() where the value field containing both weak_segments (DataFrame) and avg_score gets flattened during serialization.
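
The dispatch on the `type` field can be sketched without Deepchecks installed. This is a simplified illustration: `result_kind` is a hypothetical helper, it uses the standard `json` module rather than `jsonpickle`, and it raises on an unknown type instead of falling through.

```python
import json


def result_kind(payload):
    """Classify a serialized result by its 'type' field (illustrative only)."""
    data = json.loads(payload) if isinstance(payload, str) else payload
    json_type = data["type"]
    if "Check" in json_type:
        return "check"
    if json_type == "SuiteResult":
        return "suite"
    raise ValueError(f"Unrecognized result type: {json_type!r}")


from_string = result_kind('{"type": "CheckResult"}')
from_dict = result_kind({"type": "SuiteResult"})
```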

Outlier Detection

Deepchecks includes IQR-based outlier detection utilities:

def iqr_outliers_range(data: np.ndarray,
                       iqr_range: Tuple[int, int],
                       scale: float,
                       sharp_drop_ratio: float = 0.9) -> Tuple[float, float]:
    """Calculate outliers range on the data given using IQR."""

Source: deepchecks/utils/outliers.py

Parameters

Parameter	Type	Default	Description
data	np.ndarray	Required	Data to calculate outliers range for
iqr_range	Tuple[int, int]	Required	Two percentiles defining IQR range
scale	float	Required	Scale multiplier for IQR range
sharp_drop_ratio	float	0.9	Threshold for sharp drop detection
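
As a rough illustration of how `iqr_range` and `scale` interact, the pure-Python sketch below computes the classic IQR bounds. It is an assumption-laden simplification: `percentile` and `iqr_bounds_sketch` are hypothetical names, percentiles use linear interpolation, the input must be non-empty, and the `sharp_drop_ratio` logic is omitted entirely.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def iqr_bounds_sketch(data, iqr_range=(25, 75), scale=1.5):
    """Return (lower, upper) limits outside of which values count as outliers."""
    values = sorted(data)
    low_q = percentile(values, iqr_range[0])
    high_q = percentile(values, iqr_range[1])
    spread = high_q - low_q
    return low_q - scale * spread, high_q + scale * spread


bounds = iqr_bounds_sketch(range(1, 10))
```

For the values 1 through 9 the 25th and 75th percentiles are 3 and 7, so with `scale=1.5` the limits are -3.0 and 13.0.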

Simple Model Utilities

For testing and baseline comparisons, Deepchecks provides reference model implementations:

Model	Purpose
`PerfectModel`	Predicts perfectly from training labels
`RandomModel`	Random predictions for baseline testing
`ClassificationUniformModel`	Uniform probability distribution
`RegressionUniformModel`	Uniform continuous predictions

Source: deepchecks/utils/simple_models.py

Decorator System

Deepchecks uses decorators for documentation and code modification:

Decorator	Purpose
`@Substitution`	Dynamic docstring substitution
`@Appender`	Append content to docstrings
`@deprecate_kwarg`	Mark deprecated keyword arguments

Source: deepchecks/utils/decorators.py

Type Inference

Automated feature type detection supports categorical and numerical feature identification:

def infer_numerical_features(df: pd.DataFrame) -> t.List[Hashable]:
    """Infers which features are numerical."""
    
def infer_categorical_features(df: pd.DataFrame) -> t.List[Hashable]:
    """Infers which columns are categorical."""

Source: deepchecks/utils/type_inference.py

Execution Flow

graph LR
    A[User Code] --> B[Create Suite/Check]
    B --> C[Run with Dataset/Model]
    C --> D{Check Logic}
    D -->|Validation Pass| E[Generate CheckResult]
    D -->|Validation Fail| F[Raise DeepchecksValueError]
    E --> G[Apply Conditions]
    G --> H{Result Category}
    H -->|Pass| I[Display Green]
    H -->|Fail| J[Display Red/Warning]
    H -->|Error| K[Display Error]

Known Issues and Community Feedback

The following issues from the community are relevant to users working with the repository:

Issue	Description	Status
#2789	GPU runtime optimization not working for Image Property/Dataset Drift	Bug - needs triage
#2794	anywidget module not registered in visualization	Bug
#2806	`neg_log_loss` scorer incompatible with newer scikit-learn	Bug
#2802	Inaccurate conditions summary for Pairwise Correlation	Bug with proposed solution
#2803	Blank HTML page after `save_as_html()`	Bug
#2804	WeakSegmentsPerformance JSON serialization flattening	Bug

Feature Requests

Notable feature requests from the community include:

#1290: Add option to save reports as Markdown files for CI/CD integration (23 comments - most engaged issue)
#2767: LLM Support for evaluating language model-based applications
#2813: EU AI Act compliance mapping for validation checks (aligned with August 2026 enforcement deadline)
#2812: RAG failure-mode testing documentation using WFGY ProblemMap

Version Information

The current stable release is 0.19.1, which includes:

scikit-learn compatibility updates
Pandas version upgrade support

Recent releases:

Version	Key Changes
0.19.1	scikit-learn compatibility update, pandas upgrade
0.19.0	Contributor additions, 0.18.x release merge
0.18.1	Build fixes
0.18.0	Documentation improvements, contributor additions
0.17.4	Hotfix version bump

Installation & Quickstart

Related topics: Deepchecks Repository Overview, Tabular Data Validation, NLP Validation, Computer Vision Validation


This page covers installing Deepchecks and running a first validation. Deepchecks is an open-source library for validating machine learning models and data throughout the ML pipeline, from data integrity checks to model performance evaluation and drift detection.

Overview

Deepchecks supports multiple ML domains through specialized modules:

Tabular: Validation for tabular data and traditional ML models
Vision: Validation for image datasets and computer vision models
NLP: Validation for text data and natural language processing models

The installation process automatically handles core dependencies, with optional extras for domain-specific functionality.

System Requirements

Prerequisites

Requirement	Minimum Version	Notes
Python	3.8+	Tested up to Python 3.11
pip	21.0+	Recommended for installation
conda	4.10+	Alternative installation option

Optional Dependencies by Domain

Domain	Optional Packages	Installation Flag
Vision	albumentations, torch, torchvision	`pip install deepchecks[vision]`
NLP	transformers, nltk, spacy	`pip install deepchecks[nlp]`
All Extras	All optional packages	`pip install deepchecks[all]`

Source: requirements/vision-requirements.txt, requirements/nlp-requirements.txt

Installation Methods

PyPI Installation (Recommended)

The standard installation uses pip from PyPI:

# Core installation (tabular only)
pip install deepchecks

# With vision support (quotes keep shells such as zsh from expanding the brackets)
pip install "deepchecks[vision]"

# With NLP support
pip install "deepchecks[nlp]"

# All optional dependencies
pip install "deepchecks[all]"

Source: setup.py

Conda Installation

For conda users, Deepchecks is available through conda-forge:

conda install -c conda-forge deepchecks

Source: conda-recipe/meta.yaml

Development Installation

To install from source for development:

git clone https://github.com/deepchecks/deepchecks.git
cd deepchecks
pip install -e .

Environment Detection

Deepchecks automatically detects the execution environment to optimize display behavior. The library checks for:

Jupyter Notebook: Full interactive display with rich output
Google Colab: Optimized display for Colab notebooks
Kaggle: Environment-specific handling for Kaggle notebooks
Databricks/SageMaker: Cloud notebook environment detection
Terminal/Headless: Text-only output when GUI is unavailable

Source: deepchecks/utils/ipython.py:37-54

# Environment detection functions available in deepchecks.utils
from deepchecks.utils.ipython import (
    is_notebook,
    is_colab_env,
    is_kaggle_env,
    is_databricks_env,
    is_sagemaker_env,
    is_headless
)

Headless Mode Configuration

When running in CI/CD environments or servers without display capabilities, Deepchecks operates in headless mode. The library automatically detects headless environments but can be explicitly configured:

import logging

from deepchecks.utils.logger import set_verbosity

# Progress bars are disabled at WARNING level
set_verbosity(logging.WARNING)

Source: deepchecks/utils/logger.py:38-45

Quickstart Guide

Basic Tabular Validation

The fastest way to validate a tabular model using built-in datasets:

from deepchecks.tabular.datasets.classification import iris
from deepchecks.tabular.suites import full_suite

# Load the built-in Iris sample as train and test Dataset objects
train, test = iris.load_data()

# Run a full validation suite
suite = full_suite()
result = suite.run(train, test)

# Display results (works in notebooks)
result.show()

Source: deepchecks/utils/builtin_datasets_utils.py

Running Individual Checks

For more granular control, run individual checks:

from deepchecks.tabular.checks import MixedNulls
from deepchecks.tabular.datasets.classification import iris

# Load data
train, _ = iris.load_data()

# Run a single check
check = MixedNulls()
result = check.run(train)
result.show()

Model Validation with Custom Models

Deepchecks supports any model implementing the basic model interface:

from deepchecks.tabular import Dataset
from deepchecks.tabular.checks import TrainTestPerformance
from sklearn.ensemble import RandomForestClassifier

# train_df and test_df are pandas DataFrames that share a 'target' label column
train_dataset = Dataset(train_df, label='target')
test_dataset = Dataset(test_df, label='target')

# Fit any model that exposes predict (and predict_proba for classification)
model = RandomForestClassifier(random_state=0)
model.fit(train_df.drop(columns='target'), train_df['target'])

# Validate the model
check = TrainTestPerformance()
result = check.run(train_dataset, test_dataset, model)

Source: deepchecks/utils/typing.py:22-27

# Required model interface
class BasicModel(Protocol):
    """Minimal interface required by Deepchecks."""
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...

class ClassificationModel(BasicModel, Protocol):
    """Classification models require probability predictions."""
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...

Core Concepts

Checks and Suites

Deepchecks organizes validation into two conceptual levels:

graph TD
    A[Suite] --> B[Check 1]
    A --> C[Check 2]
    A --> D[Check N]
    B --> E[Result with Conditions]
    C --> F[Result with Conditions]
    D --> G[Result with Conditions]
    E --> H[SuiteResult]
    F --> H
    G --> H

Checks are individual validation tests that return structured results. Suites are collections of checks that run together and aggregate results.

Source: deepchecks/core/check_result.py

Dataset Structure

The Dataset class wraps pandas DataFrames with additional metadata:

from deepchecks.tabular import Dataset

# Minimal: DataFrame plus the name of the label column
dataset = Dataset(df, label='target_column')

# Optional: specify feature roles explicitly
dataset = Dataset(
    df,
    label='target_column',
    features=['feature1', 'feature2', 'categorical_feature'],
    cat_features=['categorical_feature'],
    index_name='id_column',
    datetime_name='timestamp_column'
)

Source: deepchecks/utils/validation.py:26-43

Conditions and Thresholds

Checks produce results that can be evaluated against conditions:

from deepchecks.tabular.checks import PercentOfNulls

# Create a check with a condition: at most 5% nulls per column
check = PercentOfNulls().add_condition_percent_of_nulls_not_greater_than(0.05)
result = check.run(train)

# Inspect each evaluated condition
for condition_result in result.conditions_results:
    print(f"{condition_result.name}: {condition_result.category}")

Task Types

Deepchecks automatically infers the task type or can be explicitly specified:

from deepchecks.tabular.utils.task_type import TaskType

# Task types available
TaskType.REGRESSION   # For regression models
TaskType.BINARY       # For binary classification
TaskType.MULTICLASS   # For multi-class classification

Source: deepchecks/tabular/utils/task_type.py:14-19

Saving and Exporting Results

HTML Export

Save validation reports as standalone HTML files:

result = suite.run(train, test)
result.save_as_html('validation_report.html')

Note: Users have reported issues with blank HTML pages when using certain versions of anywidget. If you encounter this, ensure you have a compatible version installed.

Source: deepchecks/issues/2794, deepchecks/issues/2803

JSON Export

Serialize results for programmatic processing:

json_output = result.to_json()

Note: Some checks like WeakSegmentsPerformance may require special handling when converting to JSON due to nested DataFrame structures.

Source: deepchecks/issues/2804, deepchecks/utils/json_utils.py

Validation Workflow

graph LR
    A[Prepare Data] --> B[Create Datasets]
    B --> C[Load/Define Model]
    C --> D[Select Suite or Checks]
    D --> E[Configure Conditions]
    E --> F[Run Validation]
    F --> G[Review Results]
    G --> H{Issues Found?}
    H -->|Yes| I[Address Issues]
    I --> A
    H -->|No| J[Deploy Model]

Common Installation Issues

GPU/CUDA Configuration for Vision and NLP

Some checks can leverage GPU acceleration for faster computation:

# For image drift checks, GPU can be enabled
# Note: Currently limited runtime optimization support

Known Issue: GPU acceleration for Image Property Drift and Image Dataset Drift has limited runtime optimization support in version 0.19.x.

Source: deepchecks/issues/2789

Package Compatibility

Package	Known Issues	Recommended Action
scikit-learn	`neg_log_loss` scorer incompatible in newer versions	Use `make_scorer` with explicit parameters
transformers/optimum	Model download issues with latest versions	Pin compatible versions
anywidget	Version conflicts affecting HTML display	Install specific compatible versions

Source: deepchecks/issues/2806, deepchecks/issues/2630

Verifying Installation

Run this simple verification:

import deepchecks

# Verify core installation
print(f"Deepchecks version: {deepchecks.__version__}")

# The tabular module ships with the core package; vision and nlp need their extras
from deepchecks import tabular

# Load a built-in dataset as a smoke test
from deepchecks.tabular.datasets.classification import iris
train, test = iris.load_data()
print(f"Loaded Iris dataset: {train.n_samples} train, {test.n_samples} test samples")

Next Steps

After installation, explore:

Topic	Description
Tabular Checks	Individual validation checks for tabular data
Suites	Pre-built validation suites
Vision	Image and computer vision validation
NLP	Text and NLP validation
Integrations	CI/CD and MLOps integrations

Core Architecture

Related topics: Checks & Suites Framework, Serialization & Output Formats, Creating Custom Checks


Overview

The Deepchecks Core Architecture provides the foundational building blocks for model validation, data integrity checks, and testing workflows across all supported data modalities (tabular, vision, NLP). The architecture is designed around a Check abstraction that encapsulates validation logic, a Condition system for defining pass/fail thresholds, and a Suite orchestration mechanism for running multiple checks together.

The core module establishes the protocol for model interfaces, defines the check lifecycle, manages execution context, and provides utilities for serialization, logging, and environment detection. This architecture enables Deepchecks to support diverse ML frameworks while maintaining a consistent API for users.

Source: deepchecks/utils/typing.py:1-30

Source: https://github.com/deepchecks/deepchecks / Human Manual

Checks & Suites Framework

Related topics: Core Architecture, Tabular Data Validation, NLP Validation, Computer Vision Validation, Creating Custom Checks


Overview

The Checks & Suites Framework is the foundational architecture of Deepchecks, providing a standardized mechanism for validating machine learning models, data, and pipelines. This framework enables domain-specific validation through a pluggable check system organized into suites that can be executed together or individually.

Checks are atomic validation units that evaluate specific aspects of ML systems, such as data integrity, model performance, or drift detection. Suites aggregate multiple checks into cohesive validation pipelines that can be run as a whole or configured with custom conditions.

Source: deepchecks/core/checks.py:1-50

Architecture Overview

graph TD
    A[User Code] --> B[Suite or Check]
    B --> C[Check.run]
    C --> D[Check Result]
    D --> E[Conditions Evaluation]
    E --> F[Condition Results]
    F --> G[Suite Result]
    G --> H[Output: Display/JSON/HTML]
    
    I[Domain: Tabular] --> J[BaseCheck]
    K[Domain: NLP] --> J
    L[Domain: Vision] --> J
    J --> M[BaseCheckResult]

Core Components

Check Base Classes

The framework defines a hierarchical class structure with domain-specific base classes that inherit from a common core.

Component	File	Purpose
`BaseCheck`	`deepchecks/core/checks.py`	Core abstract check implementation
`BaseCheckResult`	`deepchecks/core/checks.py`	Base result container
`TabularBaseCheck`	`deepchecks/tabular/base_checks.py`	Tabular domain checks
`NLPCBBaseCheck`	`deepchecks/nlp/base_checks.py`	NLP domain checks
`VisionBaseCheck`	`deepchecks/vision/base_checks.py`	Vision domain checks

Source: deepchecks/core/checks.py:100-150

Model Protocol Definitions

Checks interact with models through standardized protocols defined in typing.py:

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...

@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...

Source: deepchecks/utils/typing.py:50-70

Check Structure

Check Lifecycle

stateDiagram-v2
    [*] --> Initialization
    Initialization --> Configuration: Set conditions
    Configuration --> Execution: run() called
    Execution --> Computation: compute()
    Computation --> ResultCreation: Create CheckResult
    ResultCreation --> ConditionEvaluation: Evaluate conditions
    ConditionEvaluation --> [*]
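
The lifecycle in the diagram can be mimicked with a toy class. This is a hedged sketch, not the Deepchecks `BaseCheck` API: `ToyCheck`, its `add_condition` signature, and the dictionary it returns are all hypothetical and exist only to show the order of the stages.

```python
class ToyCheck:
    """Hypothetical check that measures the ratio of missing values."""

    def __init__(self):
        # Initialization: start with no conditions attached.
        self._conditions = []

    def add_condition(self, name, predicate):
        # Configuration: register a named pass/fail predicate; return self to allow chaining.
        self._conditions.append((name, predicate))
        return self

    def compute(self, data):
        # Computation: the check-specific logic.
        return sum(1 for item in data if item is None) / len(data)

    def run(self, data):
        # Execution: compute the value, then evaluate every condition against it.
        value = self.compute(data)
        outcomes = {name: predicate(value) for name, predicate in self._conditions}
        return {"value": value, "conditions_results": outcomes}


check = ToyCheck().add_condition("null ratio <= 0.25", lambda v: v <= 0.25)
result = check.run([1, None, 3, 4])
```

One of the four items is missing, so the computed value is 0.25 and the single condition passes.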

Check Result Structure

Each check produces a CheckResult containing:

Field	Type	Description
`value`	Any	Primary computed value
`display`	List	Visualization elements
`conditions_results`	List[ConditionResult]	Evaluated conditions
`header`	str	Check name/identifier
`reduce_output`	Any	Aggregated value for suites

Source: deepchecks/core/checks.py:200-280

Suite Framework

Suites organize checks into logical groupings for comprehensive validation. Each domain (tabular, NLP, vision) provides pre-built suites.

Suite Execution Flow

graph LR
    A[Suite Instance] --> B[Initialize All Checks]
    B --> C{For Each Check}
    C -->|Success| D[Add CheckResult]
    C -->|Failure| E[Add CheckFailure]
    D --> F{More Checks?}
    E --> F
    F -->|Yes| C
    F -->|No| G[Return SuiteResult]
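
The loop in the diagram amounts to running each check inside a try/except and collecting either a result or a failure. The sketch below illustrates that pattern with plain callables; `ToyResult`, `ToyFailure`, and `run_toy_suite` are hypothetical names, not Deepchecks classes.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyResult:
    check_name: str
    value: Any


@dataclass
class ToyFailure:
    check_name: str
    error: Exception


def run_toy_suite(checks, data):
    """Run every (name, function) pair; one failing check never stops the rest."""
    collected = []
    for name, func in checks:
        try:
            collected.append(ToyResult(name, func(data)))
        except Exception as exc:
            collected.append(ToyFailure(name, exc))
    return collected


checks = [
    ("row_count", len),
    ("mean", lambda values: sum(values) / len(values)),
]
# An empty input makes the second check raise ZeroDivisionError.
outcome = run_toy_suite(checks, [])
```

Recording failures as data rather than letting them propagate is what lets a suite report on every check in a single run.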

Default Suite Composition

#### Tabular Default Suites

Source: deepchecks/tabular/suites/default_suites.py

Suite Name	Purpose	Typical Checks
`single_dataset_integrity`	Data quality validation	Missing values, special characters, data duplications
`train_test_validation`	Train/test split validation	Feature drift, label drift, train-test leakage
`model_evaluation`	Model performance	Performance metrics, confusion matrix, class balance
`full_suite`	Comprehensive validation	All tabular checks

#### NLP Default Suites

Source: deepchecks/nlp/suites/default_suites.py

Suite Name	Purpose
`train_test_validation`	NLP-specific train/test validation
`model_evaluation`	NLP model performance checks
`full_suite`	Complete NLP validation pipeline

#### Vision Default Suites

Source: deepchecks/vision/suites/default_suites.py

Suite Name	Purpose
`single_dataset_integrity`	Image/data integrity validation
`train_test_validation`	Vision-specific drift detection
`model_evaluation`	Classification/detection performance
`full_suite`	Complete vision validation

Conditions System

Conditions define pass/fail thresholds for check results and are central to automated validation.

Condition Structure

condition = {
    'name': 'string',
    'comparison_type': 'operator_type',
    'operator': 'comparator',
    'value': threshold
}

Supported Condition Operators

Operator	Description	Example
`greater_than`	Value > threshold	Score > 0.8
`less_than`	Value < threshold	Drift < 0.2
`greater_than_or_equal`	Value >= threshold	Accuracy >= 0.9
`between`	Threshold1 <= Value <= Threshold2	0.1 <= Drift <= 0.3

Source: deepchecks/core/checks.py:300-400
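
The operator table maps naturally onto Python's `operator` module. The following is a minimal sketch of that mapping, assuming the operator names listed above; `evaluate_condition` is a hypothetical helper and not part of the Deepchecks API.

```python
import operator

# Operator names from the table, mapped to standard comparison functions.
OPERATORS = {
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "greater_than_or_equal": operator.ge,
}


def evaluate_condition(operator_name, value, threshold):
    """Return True when value satisfies the named comparison against threshold."""
    if operator_name == "between":
        # For 'between', threshold is a (low, high) pair with inclusive ends.
        low, high = threshold
        return low <= value <= high
    return OPERATORS[operator_name](value, threshold)


score_ok = evaluate_condition("greater_than", 0.85, 0.8)
drift_ok = evaluate_condition("less_than", 0.25, 0.2)
in_band = evaluate_condition("between", 0.2, (0.1, 0.3))
```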

Domain-Specific Implementations

Tabular Checks

Tabular checks inherit from TabularBaseCheck and work with pandas DataFrames:

class TabularBaseCheck(BaseCheck, RunMonitor):
    """Base class for Tabular checks."""
    
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

Common tabular check categories:

Data Integrity: Missing values, outliers, duplicates
Train-Test Drift: Feature drift, label drift, concept drift
Model Performance: Metrics evaluation, confusion matrix analysis
Feature Validation: Feature importance, correlation analysis

Source: deepchecks/tabular/base_checks.py:1-80

NLP Checks

NLP checks inherit from NLPCBBaseCheck and work with text data:

class NLPCBBaseCheck(BaseCheck, RunMonitor):
    """Base class for Natural Language Processing checks."""

Common NLP check categories:

Text Statistics: Token length, vocabulary size
Text Drift: Property drift, embedding drift
Label Validation: Label distribution, annotation quality

Source: deepchecks/nlp/base_checks.py:1-80

Vision Checks

Vision checks inherit from VisionBaseCheck and work with image data:

class VisionBaseCheck(BaseCheck, RunMonitor):
    """Base class for Vision checks."""

Common vision check categories:

Image Statistics: Brightness, contrast, aspect ratio
Image Drift: Property drift, dataset drift
Model Performance: Classification metrics, detection accuracy

Source: deepchecks/vision/base_checks.py:1-80

Usage Patterns

Running a Single Check

from deepchecks.tabular.checks import FeatureDrift

# Initialize check
check = FeatureDrift()

# Run with data
result = check.run(train_dataset=train_ds, test_dataset=test_ds)

# Display results
result.show()

Running a Suite

from deepchecks.tabular.suites import full_suite

# Get default suite
suite = full_suite()

# Run suite
result = suite.run(
    train_dataset=train_ds,
    test_dataset=test_ds,
    model=trained_model
)

# Display results
result.show()

Customizing with Conditions

from deepchecks.tabular.checks import FeatureDrift

# Add a custom condition: fail when any feature's drift score reaches 0.1
check = FeatureDrift()
check.add_condition_drift_score_less_than(
    max_allowed_categorical_score=0.1,
    max_allowed_numeric_score=0.1
)

# Run with the condition attached
result = check.run(train_dataset=train_ds, test_dataset=test_ds)

Saving Results

# Save as HTML
result.save_as_html('report.html')

# Save as JSON
json_str = result.to_json()

# Load from JSON
from deepchecks.utils.json_utils import from_json
loaded_result = from_json(json_str)

Note: Users have reported blank HTML pages when saving reports with save_as_html() alongside certain versions of anywidget. The blank-page report is tracked in issue #2803 and the related anywidget registration problem in issue #2794.

Configuration Options

Check Configuration Parameters

Parameter	Description	Default
`n_top_columns`	Number of top columns to display	10
`n_top_samples`	Number of samples for display	5
`aggregation_method`	Method for aggregating multi-class results	'mean'

Suite Configuration Parameters

Parameter	Description	Default
`conditions`	List of condition configurations	[]
`include_random_samples`	Include random samples in output	True
`random_samples`	Number of random samples	3

Source: deepchecks/core/checks.py:400-500

Result Aggregation

When checks are run as part of a suite, individual results can be aggregated using the reduce mechanism.

graph TD
    A[Multiple CheckResults] --> B[Reduce Classes]
    B --> C[column_importance_sorter_dict]
    B --> D[column_importance_sorter_df]
    B --> E[CategoryReducerAgg]
    C --> F[Aggregated Output]
    D --> F
    E --> F

Source: deepchecks/core/reduce_classes.py

Common Issues and Troubleshooting

Model Validation Errors

If you encounter ModelValidationError, ensure your model implements the required interface:

from deepchecks.utils.typing import BasicModel

# Check if model is valid
assert isinstance(your_model, BasicModel)
assert hasattr(your_model, 'predict')

Condition Evaluation Failures

When conditions fail to evaluate:

Check that the check produces the expected value type
Verify condition thresholds are appropriate for your data
Review the condition's expected value format

Serialization Issues

When serializing results to JSON:

Known Issue: The WeakSegmentsPerformance check produces nested structures that may not serialize correctly. See issue #2804 for details.

Scorer Compatibility

Some custom scorers may be incompatible with newer scikit-learn versions. The neg_log_loss scorer has known issues with make_scorer parameters as documented in issue #2806.

Integration with CI/CD

Programmatic Suite Execution

from deepchecks.tabular.suites import full_suite

def run_validation(train_ds, test_ds, model):
    """Run the full suite and raise if any condition fails."""
    suite = full_suite()
    result = suite.run(
        train_dataset=train_ds,
        test_dataset=test_ds,
        model=model
    )

    # Fail if any condition fails
    if not result.passed():
        raise ValueError("Validation suite failed")

    return result

Output Formats

Format	Method	Use Case
Interactive Display	`result.show()`	Jupyter notebooks, GUI
HTML Report	`result.save_as_html(path)`	Static reports, sharing
JSON	`result.to_json()`	CI/CD integration, automation

Serialization & Output Formats

Overview

Deepchecks can serialize validation results so that they can be stored, shared, and consumed by other tools. Supported outputs include JSON for programmatic access and HTML for human-readable reports.

The primary goal of the serialization subsystem is to capture complete validation results—including check outputs, conditions, metrics, and metadata—in a format that can be reliably stored, transmitted, and reconstructed.

graph TD
    A[CheckResult / SuiteResult] --> B{Serialization Request}
    B --> C[JSON Format]
    B --> D[HTML Format]
    B --> E[W&B Integration]
    C --> F[from_json]
    D --> G[save_as_html]
    E --> H[WandbLogger]
    F --> I[Reconstructed Objects]

Output Format Types

JSON Serialization

JSON is the primary machine-readable format for Deepchecks outputs. Both individual CheckResult objects and SuiteResult objects support JSON serialization through their to_json() method.

The JSON output preserves the complete structure of validation results including:

Component	Description	Data Type
`type`	Object type identifier	String
`check_name`	Name of the validation check	String
`value`	Check-specific output data	Various
`conditions_results`	Condition evaluation outcomes	List
`have_passed`	Overall pass/fail status	Boolean
`metadata`	Additional context and parameters	Dict

Deserialization

Use the from_json() utility function to reconstruct objects from JSON:

from deepchecks.utils.json_utils import from_json

# Load from JSON string or dictionary
result = from_json(json_data)

Source: deepchecks/utils/json_utils.py:24-49

The from_json function handles type dispatch automatically, returning either a BaseCheckResult or SuiteResult based on the JSON's type field:

def from_json(json_dict: t.Union[str, t.Dict]) -> t.Union[BaseCheckResult, SuiteResult]:
    if isinstance(json_dict, str):
        json_dict = jsonpickle.loads(json_dict)
    json_type = json_dict['type']
    if 'Check' in json_type:
        return BaseCheckResult.from_json(json_dict)
    if json_type == 'SuiteResult':
        return SuiteResult.from_json(json_dict)
    raise ValueError('Expected json object to be one of '
                     '[CheckFailure, CheckResult, SuiteResult]')

Source: deepchecks/utils/json_utils.py:24-49
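
The dispatch logic above can be tried without installing Deepchecks. The following sketch reproduces only the branching on the `type` field, using the standard `json` module; `load_result_payload` and the returned labels are hypothetical, and the real function returns result objects rather than tuples.

```python
import json


def load_result_payload(payload):
    """Return (kind, data) for a serialized result, dispatching on `type`."""
    # Accept either a JSON string or an already-parsed dictionary.
    data = json.loads(payload) if isinstance(payload, str) else payload
    json_type = data["type"]
    if "Check" in json_type:          # e.g. CheckResult, CheckFailure
        return "check", data
    if json_type == "SuiteResult":
        return "suite", data
    raise ValueError(f"unsupported result type: {json_type!r}")


kind, _ = load_result_payload('{"type": "CheckResult", "value": 0.1}')
print(kind)  # check
```

Note that the substring test means any type name containing "Check" takes the first branch, which is how both CheckResult and CheckFailure are handled by a single condition.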

HTML Reports

HTML output provides self-contained, interactive visualization of validation results. Use the save_as_html() method to generate standalone HTML files:

result.save_as_html('validation_report.html')

HTML reports include embedded styles, JavaScript for interactivity, and all necessary assets for offline viewing. The output leverages anywidget for interactive visualizations.

Common HTML Output Issues

Users have reported issues with HTML report generation:

Issue	Description	Reference
Blank page	HTML renders empty when opened in browser	Issue #2803
anywidget errors	Failed to load model class from anywidget module	Issue #2794
Widget state	Interactive elements fail to initialize	Issue #2794

When encountering blank HTML pages, ensure:

anywidget package is properly installed (pip install anywidget)
Browser console shows no JavaScript errors
Assets are correctly embedded in the saved file

Weights & Biases (W&B) Integration

Deepchecks supports direct integration with Weights & Biases for experiment tracking. Check results can be logged to W&B using the built-in WandbLogger:

from deepchecks.core.serialization.check_result.wandb import WandbLogger

logger = WandbLogger(project='ml-validation')
logger.log_check_result(result)

Check Result Structure

BaseCheckResult Components

The BaseCheckResult class provides the foundation for all check outputs:

class BaseCheckResult:
    value: Any              # Primary output data
    header: str            # Human-readable title
    display: List[Any]     # Visual elements for output
    conditions_results: List[ConditionResult]
    extra_data: Dict       # Supplementary information

Source: deepchecks/core/check_result.py

Condition Results

Conditions represent automated pass/fail criteria defined for checks. Each condition result includes:

Field	Type	Description
`name`	String	Condition identifier
`category`	String	PASS, FAIL, or WARN
`details`	String	Explanation of the result

Serialization Data Flow

sequenceDiagram
    participant User
    participant CheckResult
    participant Serializer
    participant Output
    
    User->>CheckResult: to_json()
    CheckResult->>Serializer: Serialize value
    Serializer->>Serializer: Handle complex types
    Serializer->>Output: JSON string
    
    User->>CheckResult: save_as_html()
    CheckResult->>Serializer: Generate HTML
    Serializer->>Output: HTML file

Known Limitations and Issues

JSON Serialization Concerns

Several community-reported issues relate to JSON serialization:

DataFrame in Results

When checks like WeakSegmentsPerformance return DataFrames as part of their value, the to_json() method may flatten the output incorrectly. The result's value field containing both DataFrames and scalar values gets passed to the serializer as a dictionary, causing nested structures to be flattened.

Reference: Issue #2804

Precision Control

Currently, the precision of floating-point values in DataFrame serialization is fixed at 2 decimal places and cannot be configured by users. This may not be suitable for values requiring higher precision.

Reference: Issue #2598
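
A small worked example shows why a fixed two-decimal precision can matter. The metric names and values below are made up for illustration. The second half assumes you still have the raw values in memory and can serialize them yourself.

```python
import json

scores = {"drift_score": 0.12345, "p_value": 0.0049}

# Rounding to two decimals, as described above, collapses small values.
rounded = {name: round(value, 2) for name, value in scores.items()}
print(rounded)  # {'drift_score': 0.12, 'p_value': 0.0}

# Serializing the raw floats yourself keeps full precision.
restored = json.loads(json.dumps(scores))
print(restored == scores)  # True
```

Here a p-value of 0.0049 becomes 0.0 after rounding, which is indistinguishable from an exact zero in any downstream comparison.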

HTML Output Issues

Blank Page After Save

Users running save_as_html() have reported blank pages when opening the generated HTML file. This typically occurs when:

The anywidget package version is incompatible
JavaScript execution is blocked in the browser
Embedded assets fail to load

Reference: Issue #2803

Missing Markdown Export

Currently, Deepchecks does not support exporting validation results as Markdown files. Users who want CML (Continuous Machine Learning) integration have asked for this feature so that CI/CD pipelines can publish Markdown-formatted reports.

Reference: Issue #1290

Common Patterns

Saving Results in CI/CD

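from deepchecks.tabular.checks import DataDuplicates

# Run validation. DataDuplicates is only an example; any check or suite works here.
# `dataset` is assumed to be an existing deepchecks.tabular.Dataset.
result = DataDuplicates().run(dataset)

# Save for CI/CD artifact
result.save_as_html('validation-report.html')

# to_json() returns a JSON string, so write it to a file explicitly
with open('validation-result.json', 'w') as f:
    f.write(result.to_json())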

Reconstructing Results

from deepchecks.utils.json_utils import from_json

# Load previously saved result
with open('validation-result.json', 'r') as f:
    saved_result = from_json(f.read())

# Access results programmatically
print(saved_result.passed_conditions())

Working with Complex Outputs

For checks that return complex nested data structures:

result = check.run(dataset)

# Access structured data
if hasattr(result, 'value'):
    if isinstance(result.value, dict):
        for key, value in result.value.items():
            # Handle each component
            pass

Best Practices

Version Compatibility: Ensure consistent Deepchecks versions when sharing serialized results between environments.

Large Results: For checks producing large outputs, consider using to_json(), whose output is more compact than HTML for storage.

Error Handling: Wrap deserialization in try-except blocks to handle format changes between versions.

HTML for Review: Use HTML reports for human review; use JSON for programmatic processing and CI/CD integration.

Asset Management: HTML reports are self-contained but may require anywidget for full interactivity in Jupyter environments.
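
The error-handling advice can be sketched as a thin wrapper around whatever loader you use. `safe_load` is a hypothetical helper, not part of Deepchecks. It defaults to `json.loads` so that it runs on its own; in practice you would pass `from_json` as the `loader`, on the assumption that it raises `ValueError` or `KeyError` for payloads it cannot interpret (as the source excerpt earlier in this manual suggests).

```python
import json


def safe_load(text, loader=json.loads):
    """Call `loader` on `text`, returning None instead of raising."""
    try:
        return loader(text)
    except (ValueError, KeyError) as error:
        # json.JSONDecodeError is a subclass of ValueError, so it is caught too.
        print(f"could not load saved result: {error!r}")
        return None


print(safe_load('{"type": "CheckResult"}'))  # {'type': 'CheckResult'}
print(safe_load("not json") is None)         # True (after the error message)
```

Returning `None` keeps a CI job in control of what happens next, for example re-running the validation instead of crashing on a stale artifact.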

Tabular Data Validation

Related topics: Checks & Suites Framework, Creating Custom Checks

Tabular Data Validation is the core module of Deepchecks that provides comprehensive validation capabilities for tabular datasets and machine learning models. This module enables data scientists and ML engineers to validate data integrity, evaluate model performance, and detect train-test distribution shifts through a suite of automated checks organized into logical categories.

Overview

The tabular validation system in Deepchecks is designed to catch common data quality issues and model problems before they impact production systems. It supports both classification and regression tasks across multiple domain areas, providing consistent validation patterns regardless of the specific use case.

Key Capabilities

Data Integrity Checks: Validate dataset structure, detect anomalies, and identify data quality issues
Model Evaluation: Assess model performance using various metrics and comparison baselines
Train-Test Validation: Detect distribution shifts and validate consistency between training and test datasets
Feature Analysis: Analyze feature properties, correlations, and importance
Label Verification: Validate label distribution and detect labeling issues

The system is built around three core abstractions: Dataset, Context, and Check, which work together to provide a flexible and extensible validation framework.

Source: deepchecks/tabular/__init__.py

Architecture

Core Components

graph TD
    A[User Data] --> B[Dataset]
    A --> C[Model]
    B --> D[Context]
    C --> D
    D --> E[Check]
    E --> F[CheckResult]
    F --> G[Display / Export]
    
    B --> H[Data Integrity Checks]
    D --> I[Model Evaluation Checks]
    D --> J[Train-Test Validation Checks]
    
    H -.-> E
    I -.-> E
    J -.-> E

Data Flow

sequenceDiagram
    participant User
    participant Dataset
    participant Context
    participant Check
    participant Result
    
    User->>Dataset: Create Dataset
    User->>Context: Create Context with Dataset(s)
    User->>Context: Optional: Add Model
    Context->>Check: Run Check
    Check->>Dataset: Access data via Context
    Check->>Check: Compute validation logic
    Check->>Result: Generate CheckResult
    Result->>User: Display / Export

Dataset Class

The Dataset class is the fundamental data container for tabular validation. It wraps pandas DataFrames and provides metadata about the dataset structure, including feature types and label column information.

Source: deepchecks/tabular/dataset.py

Creating a Dataset

from deepchecks.tabular import Dataset

# Basic dataset creation
ds = Dataset(df)

# With explicit label column
ds = Dataset(df, label='target_column')

# With feature type hints
ds = Dataset(df, 
             label='target',
             features=['feature1', 'feature2'],
             cat_features=['categorical_column'])

Dataset Properties

Property	Type	Description
`data`	`pd.DataFrame`	The underlying DataFrame
`label_name`	`Optional[Hashable]`	Name of the label column
`label_col`	`Optional[pd.Series]`	Values of the label column
`features`	`List[Hashable]`	List of feature column names
`cat_features`	`List[Hashable]`	List of categorical feature names
`index_name`	`Optional[Hashable]`	Name of the index column, if any
`datetime_name`	`Optional[Hashable]`	Name of the datetime column, if any

Feature Type Inference

Deepchecks automatically infers feature types based on column data. The system uses the following inference rules:

Source: deepchecks/utils/type_inference.py

def infer_numerical_features(df: pd.DataFrame) -> List[Hashable]:
    """Infers which features are numerical."""
    # Columns with numeric dtype are inferred as numerical
    # Object columns may still contain numeric data

def infer_categorical_features(df: pd.DataFrame) -> List[Hashable]:
    """Infers which features are categorical."""
    # String columns with few unique values are typically categorical
    # Boolean columns are treated as categorical
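
The stubs above only name the rules. As a rough, dependency-free illustration of those rules, the sketch below classifies a single column given as a Python list. `infer_column_kind`, the `max_categories` cutoff of 10, and the "other" label are assumptions made for this example and are not taken from the Deepchecks source.

```python
from numbers import Number


def infer_column_kind(values, max_categories=10):
    """Classify a column as 'numerical', 'categorical', or 'other'."""
    present = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so test for it before numbers.
    if present and all(isinstance(v, bool) for v in present):
        return "categorical"
    if present and all(isinstance(v, Number) for v in present):
        return "numerical"
    # Non-numeric columns with few distinct values look categorical.
    if len(set(present)) <= max_categories:
        return "categorical"
    return "other"  # e.g. free text or identifiers


print(infer_column_kind([1.5, 2.0, 3.25]))         # numerical
print(infer_column_kind([True, False, True]))      # categorical
print(infer_column_kind(["red", "blue", "red"]))   # categorical
```

Because inference is heuristic, passing `cat_features` explicitly to `Dataset` remains the more predictable option when you already know the column types.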

Context Class

The Context class serves as the orchestration layer that manages datasets, models, and configuration for validation runs. It provides a unified interface for checks to access the data they need.

Source: deepchecks/tabular/context.py

Context Creation

from deepchecks.tabular import Context, Dataset

# Training and test datasets
train_ds = Dataset(train_df, label='target')
test_ds = Dataset(test_df, label='target')

# Create context with both datasets
context = Context(
    train_dataset=train_ds,
    test_dataset=test_ds,
    model=trained_model  # Optional
)

Context Parameters

Parameter	Type	Required	Description
`train_dataset`	`Dataset`	Yes	Training dataset
`test_dataset`	`Dataset`	No	Test dataset for comparison
`model`	`BasicModel`	No	Trained model to validate
`model_name`	`str`	No	Name identifier for the model
`scorers`	`Dict`	No	Custom metric scorers
`scorers_required_average`	`Dict`	No	Scorers requiring averaging

Model Base Classes

Deepchecks defines protocol classes that describe the interface models must implement to work with the validation framework. These are defined using Python's Protocol for structural subtyping.

Source: deepchecks/utils/typing.py

BasicModel Protocol

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""
    
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...

ClassificationModel Protocol

@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""
    
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...

Supported Model Integrations

Model Type	Required Methods	Use Cases
Classification	`predict`, `predict_proba`	Class probability checks, ROC curves
Regression	`predict` only	Residual analysis, performance metrics
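
Structural subtyping means a model never has to inherit from these protocols. The sketch below re-declares two look-alike protocols locally so that it runs without Deepchecks installed; `HasPredict`, `HasPredictProba`, and `MeanRegressor` are hypothetical names standing in for the classes shown above.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasPredict(Protocol):
    def predict(self, X): ...


@runtime_checkable
class HasPredictProba(HasPredict, Protocol):
    def predict_proba(self, X): ...


class MeanRegressor:
    """Toy regressor: no inheritance from the protocols is needed."""

    def predict(self, X):
        return [0.0 for _ in X]


model = MeanRegressor()
print(isinstance(model, HasPredict))       # True
print(isinstance(model, HasPredictProba))  # False
```

A runtime `isinstance` check against a protocol only verifies that the named methods exist. It does not validate their signatures or return types, so a passing check is necessary but not sufficient for a model to work with every check.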

Check Categories

Checks in Deepchecks are organized into three main categories that address different validation concerns.

Source: deepchecks/tabular/checks/data_integrity/__init__.py

Data Integrity Checks

Data integrity checks validate the structure and quality of datasets. These checks can run on a single dataset without requiring a model or train-test comparison.

Source: deepchecks/tabular/checks/data_integrity/__init__.py

Available Checks:

StringMismatch - Detect string format inconsistencies
IsSingleValue - Identify columns with only one unique value
DataDuplicates - Find duplicate rows
MixedNulls - Detect mixed null value representations
StringLengthOutOfBounds - Find strings outside expected length ranges
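
As a rough illustration of what a duplicate-row check measures, the sketch below computes the share of rows that exactly repeat an earlier row. It works on plain dictionaries rather than a Dataset, and `duplicate_ratio` is a hypothetical helper, not the DataDuplicates implementation.

```python
from collections import Counter


def duplicate_ratio(rows):
    """Fraction of rows that repeat an earlier row exactly."""
    # Turn each row into a hashable, order-independent key.
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(rows) if rows else 0.0


rows = [
    {"age": 31, "city": "Oslo"},
    {"age": 31, "city": "Oslo"},
    {"age": 45, "city": "Lima"},
    {"age": 31, "city": "Oslo"},
]
print(duplicate_ratio(rows))  # 0.5
```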

Model Evaluation Checks

Model evaluation checks assess model performance using various metrics and comparison techniques. These checks require a trained model.

Source: deepchecks/tabular/checks/model_evaluation/__init__.py

Available Checks:

TrainTestPerformance - Compare model performance across train/test
PerformanceReport - Generate comprehensive performance metrics
ConfusionMatrixReport - Display confusion matrix for classification
ClassPerformance - Analyze per-class performance
WeakSegmentsPerformance - Identify underperforming data segments

Train-Test Validation Checks

Train-test validation checks detect distribution shifts and inconsistencies between training and test datasets.

Source: deepchecks/tabular/checks/train_test_validation/__init__.py

Available Checks:

FeatureDrift - Detect feature distribution changes
LabelDrift - Detect label distribution changes
TrainTestFeatureDrift - Compare feature distributions
IndexTrainTestLeakage - Detect index-based data leakage
MultivariateDrift - Detect overall distribution shifts
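
This manual does not state which statistic each drift check uses. Purely as a generic illustration of what "distribution change" means for a categorical column such as a label, the sketch below computes the total variation distance between two samples. The function name and data are made up, and the measure is a stand-in chosen for simplicity, not the one Deepchecks applies.

```python
from collections import Counter


def total_variation_distance(train_labels, test_labels):
    """Half the L1 distance between two empirical label distributions."""
    train_freq = Counter(train_labels)
    test_freq = Counter(test_labels)
    categories = set(train_freq) | set(test_freq)
    return 0.5 * sum(
        abs(train_freq[c] / len(train_labels) - test_freq[c] / len(test_labels))
        for c in categories
    )


train = ["cat"] * 50 + ["dog"] * 50
test = ["cat"] * 80 + ["dog"] * 20
print(round(total_variation_distance(train, test), 2))  # 0.3
```

A value of 0 means the two samples have identical category frequencies, and 1 means they share no categories at all.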

Running Checks

Single Check Execution

from deepchecks.tabular.checks.data_integrity import DataDuplicates

# Create and run a single check
check = DataDuplicates()
result = check.run(dataset=my_dataset)
result.show()  # Display results

Using Suites

Suites group multiple checks together for comprehensive validation:

from deepchecks.tabular.suites import data_integrity

# Run predefined suite
suite = data_integrity()
result = suite.run(my_dataset)

Custom Scorers

Deepchecks supports custom scikit-learn compatible scorers for model evaluation:

Source: deepchecks/tabular/metric_utils/scorers.py

from sklearn.metrics import make_scorer, log_loss

# Custom scorer for model evaluation
context = Context(
    train_dataset=train_ds,
    test_dataset=test_ds,
    model=model,
    scorers={
        'neg_log_loss': make_scorer(log_loss, greater_is_better=False, needs_proba=True)
    }
)

Note: When using custom scorers with make_scorer, ensure compatibility with your scikit-learn version. In particular, the needs_proba argument used above was deprecated in scikit-learn 1.4 in favor of response_method='predict_proba', so the snippet may warn or fail on newer releases. See Issue #2806 for known compatibility issues.

Feature Importance

Deepchecks provides utilities for extracting and using feature importance values from models.

Source: deepchecks/tabular/utils/feature_importance.py

Supported Importance Sources

Source	Priority	Description
`feature_importances_` attribute	1	Scikit-learn feature importances
`coef_` attribute	2	Linear model coefficients
`permutation`	3	Computed permutation importance
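
The priority order in the table amounts to a simple fallback chain. The sketch below shows that chain with toy model classes. `pick_importance_source` and the three classes are hypothetical, and the function only reports which source would be used rather than computing any importances.

```python
def pick_importance_source(model):
    """Return the name of the first importance source the model offers."""
    if hasattr(model, "feature_importances_"):
        return "feature_importances_"
    if hasattr(model, "coef_"):
        return "coef_"
    # Nothing built in: fall back to computing permutation importance.
    return "permutation"


class TreeLike:
    feature_importances_ = [0.7, 0.3]


class LinearLike:
    coef_ = [1.5, -0.2]


class Opaque:
    pass


print([pick_importance_source(m()) for m in (TreeLike, LinearLike, Opaque)])
# ['feature_importances_', 'coef_', 'permutation']
```

Permutation importance is the most expensive of the three because it requires repeated predictions on shuffled data, which is one reason to place it last.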

Usage

from deepchecks.tabular.utils.feature_importance import get_feature_importance

# Get feature importance from model
importance = get_feature_importance(model, dataset)

Task Type Detection

Deepchecks automatically detects the task type (classification or regression) based on the label column characteristics.

Source: deepchecks/utils/type_inference.py

from typing import Literal

import pandas as pd
from pandas.api.types import is_numeric_dtype

def infer_task_type(label_column: pd.Series) -> Literal['classification', 'regression']:
    """Infer task type based on label characteristics."""
    if is_numeric_dtype(label_column):
        # Check if unique values suggest classification
        n_unique = label_column.nunique()
        if n_unique <= 20:  # Threshold for classification
            return 'classification'
        return 'regression'
    return 'classification'

Built-in Datasets

Deepchecks includes utilities for loading example datasets for testing and demonstration purposes.

Source: deepchecks/tabular/datasets/__init__.py

Loading Example Data

from deepchecks.tabular.datasets.classification import iris
from deepchecks.tabular.datasets.regression import avocado

# Load classification dataset
train_ds, test_ds = iris.load_data()

# Load regression dataset
train_ds, test_ds = avocado.load_data()

Output and Export

Displaying Results

# Display in notebook
result.show()

# Display in an iframe (an alternative when widgets do not render)
result.show_in_iframe()

Exporting Results

# Save as HTML
result.save_as_html('report.html')

# Export to JSON
json_output = result.to_json()

Known Issue: When saving reports as HTML, ensure compatibility between deepchecks and anywidget versions. Blank pages may appear with certain version combinations. See Issue #2794 and Issue #2803.

Common Issues and Solutions

Pairwise Correlation Display

A known issue affects the accuracy of the conditions summary and heatmap for pairwise correlation displays. See Issue #2802 for details and proposed solutions.

JSON Serialization

When serializing WeakSegmentsPerformance results to JSON, the value field containing both weak_segments (DataFrame) and avg_score may be flattened. See Issue #2804.

Model Serialization Precision

Currently, the precision of values in DataFrame serialization is fixed at 2 decimal places and cannot be configured. See Feature Request #2598.

NLP Validation

Related topics: Checks & Suites Framework

Overview

Deepchecks' NLP (Natural Language Processing) Validation module provides a comprehensive framework for validating text-based machine learning models and datasets. This module enables data scientists and ML engineers to perform rigorous checks on text data integrity, model evaluation, and train-test validation for NLP tasks.

The NLP validation framework is designed to work seamlessly with the broader Deepchecks ecosystem, providing specialized checks tailored to the unique characteristics of text data including tokenization, embeddings, and linguistic properties.

Source: deepchecks/nlp/__init__.py

Architecture

The NLP validation module follows a modular architecture with distinct components for data representation, context management, task configuration, and validation checks.

graph TD
    A[NLP Validation Module] --> B[TextData]
    A --> C[NLPContext]
    A --> D[TaskType]
    A --> E[NLP Checks]
    
    B --> B1[Raw Text]
    B --> B2[Metadata]
    B --> B3[Embeddings Cache]
    
    C --> C1[Single Dataset Mode]
    C --> C2[Train-Test Mode]
    
    E --> E1[Data Integrity]
    E --> E2[Model Evaluation]
    E --> E3[Train-Test Validation]
    
    D --> D1[TextClassification]
    D --> D2[TokenClassification]
    D --> D3[SequenceClassification]

Core Components

Component	Purpose	Source File
`TextData`	Represents text datasets with metadata and caching	text_data.py
`NLPContext`	Manages validation context for single or train-test scenarios	context.py
`TaskType`	Enum defining supported NLP task types	task_type.py
`input_validations`	Input validation utilities for NLP data	input_validations.py

Source: deepchecks/nlp/__init__.py

TextData Class

The TextData class is the fundamental data structure for representing text datasets in Deepchecks NLP validation. It encapsulates raw text data, optional metadata, and computed properties such as embeddings.

Source: deepchecks/nlp/text_data.py

Constructor Parameters

Parameter	Type	Required	Description
`raw_text`	List[str] or pd.Series	Yes	List of text samples
`task_type`	TaskType	Yes	Type of NLP task
`label`	List or pd.Series	No	Labels for samples
`metadata`	pd.DataFrame	No	Additional metadata columns
`embeddings_provider`	object	No	Provider for computing embeddings
`embeddings_cache_dir`	str	No	Directory for caching embeddings

Supported Task Types

The module supports three primary NLP task types:

Task Type	Description	Use Case
`TEXT_CLASSIFICATION`	Single-label or multi-label classification	Sentiment analysis, topic classification
`TOKEN_CLASSIFICATION`	Per-token labeling	Named Entity Recognition (NER), Part-of-Speech tagging
`SEQUENCE_CLASSIFICATION`	Sequence-to-label tasks	Document classification

Source: deepchecks/nlp/task_type.py

Data Integrity Validation

Input validation ensures that the provided data meets the requirements for NLP processing:

def validate_texts_not_empty(raw_text: t.List) -> None:
    """Validate that texts list is not empty."""
    
def validate_label_format(label, task_type: TaskType) -> None:
    """Validate label format matches task type requirements."""

Source: deepchecks/nlp/input_validations.py

Text Properties

Deepchecks NLP provides utilities for computing various text properties that can be used for drift detection and data integrity checks.

Source: deepchecks/nlp/utils/text_properties.py

Available Text Properties

Property	Description	Category
`TextLength`	Number of characters in text	Basic
`WordCount`	Number of words in text	Basic
`SentenceCount`	Number of sentences	Basic
`AverageWordLength`	Average length of words	Basic
`MaxWordLength`	Length of longest word	Basic
`Language`	Detected language	Linguistic
`Sentiment`	Sentiment score	Linguistic
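
The basic properties in the table are simple to approximate by hand. The sketch below computes four of them with whitespace tokenization. The dictionary keys mirror the table above, `basic_text_properties` is a hypothetical helper, and the library's own tokenization rules may differ, so the numbers are not guaranteed to match Deepchecks output.

```python
def basic_text_properties(text):
    """Compute a few length-based properties using whitespace tokenization."""
    words = text.split()
    lengths = [len(word) for word in words]
    return {
        "TextLength": len(text),
        "WordCount": len(words),
        "AverageWordLength": sum(lengths) / len(lengths) if lengths else 0.0,
        "MaxWordLength": max(lengths, default=0),
    }


print(basic_text_properties("Deepchecks validates text data"))
# {'TextLength': 30, 'WordCount': 4, 'AverageWordLength': 6.75, 'MaxWordLength': 10}
```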

Property Calculation

Text properties are calculated lazily and cached to optimize performance. The property calculator handles different scenarios:

def calculate_text_properties(
    raw_text: t.List[str],
    properties_list: t.List[str] = None,
    device: str = 'cpu'
) -> pd.DataFrame:
    """Calculate text properties for a list of texts."""

Note: GPU acceleration for property calculation is available for certain properties. Some community issues have reported limitations with GPU runtime optimization for drift checks. See Issue #2789 for details.

Source: deepchecks/nlp/utils/text_properties.py

Text Embeddings

Deepchecks supports text embeddings for semantic similarity and drift detection in NLP validation.

Source: deepchecks/nlp/utils/text_embeddings.py

Embedding Providers

The module supports multiple embedding providers through a unified interface:

Provider	Description	Notes
`transformers`	Hugging Face Transformers models	Requires `transformers` package
`sentence_transformers`	Sentence-BERT models	Recommended for semantic similarity
`sklearn`	TF-IDF or other sklearn embedders	Built-in, no extra dependencies

Embedding Configuration

class EmbeddingsCalculator:
    """Calculate embeddings for text data."""
    
    def __init__(
        self,
        provider: str = 'sentence_transformers',
        model_name: str = 'all-MiniLM-L6-v2',
        device: str = 'cpu'
    ):
        """Initialize embeddings calculator."""

Important: Some users have reported issues downloading property/embedding models with newer versions of transformers and optimum packages. See Issue #2630 for compatibility information.

Source: deepchecks/nlp/utils/text_embeddings.py

Validation Checks

Deepchecks NLP provides three categories of validation checks:

Source: deepchecks/nlp/checks/data_integrity/__init__.py

Data Integrity Checks

These checks validate the quality and consistency of text data before model training or evaluation.

Check	Purpose	Source
`TextPropertyOutliers`	Detect text samples with unusual property values	data_integrity/__init__.py
`SpecialCharacters`	Identify samples with unexpected special characters	data_integrity/__init__.py
`StringMismatch`	Detect string formatting inconsistencies	data_integrity/__init__.py

Model Evaluation Checks

These checks assess model performance and behavior on a labeled dataset.

Source: deepchecks/nlp/checks/model_evaluation/__init__.py

Check	Purpose	Source
`ConfusionMatrixReport`	Display confusion matrix for classification	model_evaluation/__init__.py
`ClassPerformance`	Compare per-class model performance	model_evaluation/__init__.py
`PredictionDrift`	Detect drift in model predictions	model_evaluation/__init__.py

Train-Test Validation Checks

These checks compare training and test datasets to detect distribution shift and potential data leakage.

Source: deepchecks/nlp/checks/train_test_validation/__init__.py

Check	Purpose	Source
`TextPropertyDrift`	Detect drift in text properties	train_test_validation/__init__.py
`PropertyLabelCorrelation`	Check correlation between properties and labels	train_test_validation/__init__.py
`TrainTestFeatureDrift`	Detect feature distribution shift	train_test_validation/__init__.py

NLPContext

The NLPContext class manages the validation workflow, handling both single-dataset and train-test validation scenarios.

Source: deepchecks/nlp/context.py

Context Modes

graph LR
    A[NLPContext] --> B[Single Dataset Mode]
    A --> C[Train-Test Mode]
    
    B --> B1[Train Dataset Only]
    B --> B2[Data Integrity Checks]
    B --> B3[Model Evaluation Checks]
    
    C --> C1[Train Dataset]
    C --> C2[Test Dataset]
    C --> C3[Train-Test Validation Checks]

Creating and Running a Context

from deepchecks.nlp import TextData, NLPContext
from deepchecks.nlp.checks import TextPropertyDrift

# Single dataset mode
context = NLPContext(train_dataset)
context.run()

# Train-test mode
context = NLPContext(train_dataset, test_dataset)
context.add_check(TextPropertyDrift())
context.run()

Source: deepchecks/nlp/context.py

Built-in Datasets

Deepchecks provides built-in NLP datasets for testing and learning purposes.

Source: deepchecks/nlp/datasets/__init__.py

Available Datasets

Dataset	Task Type	Description
`load_builtin_dataset()`	Various	Load pre-configured NLP datasets
`load_dataset_from_list()`	Various	Create TextData from custom lists

Visualization

NLP validation results can be visualized using Plotly-based plots that integrate with Jupyter notebooks and can be exported to HTML.

Source: deepchecks/nlp/utils/nlp_plot.py

Plot Types

Plot Type	Purpose
Distribution plots	Display property or prediction distributions
Drift plots	Visualize train-test drift
Heatmaps	Show confusion matrices and correlations

Note: There have been reported issues with saving reports as HTML in certain environments. Users have reported blank pages when using save_as_html(). See Issue #2794 and Issue #2803 for workarounds.

Usage Patterns

Basic Single-Dataset Validation

from deepchecks.nlp import TextData, NLPContext
from deepchecks.nlp.checks.data_integrity import TextPropertyOutliers

# Create text data
train_data = TextData(
    raw_text=['This is positive', 'This is negative', 'Neutral text'],
    task_type='text_classification',
    label=[1, 0, 1]
)

# Run validation
context = NLPContext(train_data)
context.run()

Train-Test Drift Detection

from deepchecks.nlp import TextData, NLPContext
from deepchecks.nlp.checks.train_test_validation import TextPropertyDrift

train_data = TextData(raw_text=train_texts, task_type='text_classification', label=train_labels)
test_data = TextData(raw_text=test_texts, task_type='text_classification', label=test_labels)

context = NLPContext(train_data, test_data)
context.add_check(TextPropertyDrift())
result = context.run()
result.save_as_html('drift_report.html')

Common Issues and Troubleshooting

Model Download Failures

Issue: Cannot download models for property calculation or embeddings.

Solution: This may be caused by incompatibilities with newer versions of transformers or optimum. Ensure you have compatible versions installed:

pip install transformers==4.30.0 optimum==1.12.0

See Issue #2630 for details.

HTML Report Display Issues

Issue: Blank HTML page after saving report.

Solution:

Ensure anywidget is properly installed and registered
Try using the requirejs parameter when saving:

result.save_as_html('report.html', requirejs=True)

See Issue #2803 for details.

GPU Runtime for Drift Checks

Issue: GPU not utilized for image/text drift checks despite documentation suggesting optimization.

Note: GPU acceleration support for NLP drift checks is limited. The documentation regarding runtime optimization may not apply uniformly across all check types. See Issue #2789 for status updates.

Computer Vision Validation

Related topics: Checks & Suites Framework

Overview

The Computer Vision Validation module in Deepchecks provides a comprehensive framework for validating image classification, object detection, and semantic segmentation models. This module enables ML practitioners to detect data integrity issues, evaluate model performance, and identify distribution shifts between training and test datasets.

Purpose: The vision validation system performs automated checks on image datasets and vision models to ensure data quality, model reliability, and consistency across different data splits. It supports multiple vision task types including classification, object detection, and segmentation.

Scope: The module covers data integrity validation (corrupted images, missing labels, class imbalance), model evaluation checks (performance metrics, confusion analysis), and train-test validation (drift detection, weak segment identification).

Source: deepchecks/vision/__init__.py

Source: https://github.com/deepchecks/deepchecks / Human Manual

Creating Custom Checks

Related topics: Core Architecture, Checks & Suites Framework

This page documents how to create custom checks in Deepchecks, enabling you to extend the validation framework with domain-specific validation logic for tabular, NLP, and vision use cases.

Overview

Deepchecks provides a flexible check architecture that allows users to create custom validation checks beyond the built-in suite. A Check in Deepchecks is a self-contained unit of validation that analyzes data, models, or their relationships and returns structured results with optional pass/fail conditions.

The check system is designed around several key principles:

Modularity: Each check focuses on a single validation concern
Reusability: Checks can be composed into Suites for batch execution
Extensibility: Domain-specific checks can be added without modifying core code
Conditional Logic: Checks support conditions that define pass/fail thresholds

Source: deepchecks/core/checks.py

Architecture

Check Hierarchy

Deepchecks implements a hierarchical check system with different base classes for each domain:

graph TD
    A["BaseCheck<br/>(core)"] --> B["TrainTestCheck<br/>(tabular)"]
    A --> C["SingleDatasetCheck<br/>(tabular)"]
    A --> D["ModelOnlyCheck<br/>(tabular)"]
    A --> E["TrainTestCheck<br/>(nlp)"]
    A --> F["SingleDatasetCheck<br/>(nlp)"]
    A --> G["TrainTestCheck<br/>(vision)"]
    A --> H["SingleDatasetCheck<br/>(vision)"]
    
    B --> I["Concrete Check<br/>Implementation"]
    C --> I
    D --> I
    E --> I
    F --> I
    G --> I
    H --> I

Core Components

The check system relies on several core components defined in the abstract layer:

Component	File	Purpose
`BaseCheck`	deepchecks/core/checks.py	Abstract base class for all checks
`BasicModel`	deepchecks/utils/typing.py	Protocol defining minimal model interface
`ClassificationModel`	deepchecks/utils/typing.py	Protocol for classification models with `predict_proba`
`TrainTestCheck`	deepchecks/tabular/base_checks.py	Check comparing train and test data

Source: deepchecks/utils/typing.py:17-30

Model Protocols

Before creating checks, understand the model protocols Deepchecks uses:

BasicModel Protocol

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""

    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...

ClassificationModel Protocol

@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""

    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...

Source: deepchecks/utils/typing.py:17-35
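Because both protocols are marked `@runtime_checkable`, an `isinstance` check only verifies that the named methods exist; it does not inspect signatures or return types. The sketch below illustrates that behaviour with local stand-in protocols that mirror the definitions above rather than importing them. `MeanRegressor` and `CoinFlipClassifier` are hypothetical models invented for the example.

```python
from typing import Hashable, List, Protocol, runtime_checkable


@runtime_checkable
class BasicModel(Protocol):
    """Local stand-in mirroring the protocol shown above."""

    def predict(self, X) -> List[Hashable]:
        ...


@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Local stand-in: adds predict_proba on top of predict."""

    def predict_proba(self, X) -> List[Hashable]:
        ...


class MeanRegressor:
    """Hypothetical model exposing only predict."""

    def predict(self, X):
        return [sum(row) / len(row) for row in X]


class CoinFlipClassifier:
    """Hypothetical classifier exposing predict and predict_proba."""

    def predict(self, X):
        return [0 for _ in X]

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


regressor = MeanRegressor()
classifier = CoinFlipClassifier()

# Structural checks: neither class inherits from the protocols.
is_basic = isinstance(regressor, BasicModel)
is_classifier = isinstance(regressor, ClassificationModel)
classifier_ok = isinstance(classifier, ClassificationModel)
```

Neither model subclasses a protocol, yet the classifier satisfies both. This is why a wrapper around a third-party model only needs to expose the right method names.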

Creating a Custom Tabular Check

Step 1: Choose the Appropriate Base Class

For tabular data, choose from:

Base Class	Use Case
`TrainTestCheck`	Compare training and testing data distributions
`SingleDatasetCheck`	Validate a single dataset
`ModelOnlyCheck`	Validate model properties without data

Source: deepchecks/tabular/base_checks.py

Step 2: Implement the Check

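from deepchecks.core import CheckResult
from deepchecks.tabular import Context, TrainTestCheck


class MyCustomDriftCheck(TrainTestCheck):
    """Custom check to detect feature drift between train and test sets."""

    def __init__(self, threshold: float = 0.1, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def run_logic(self, context: Context) -> CheckResult:
        """Implement the check's validation logic."""
        train = context.train  # deepchecks.tabular.Dataset
        test = context.test

        # Compute one drift score per feature
        drift_scores = self._calculate_drift(train, test)

        # Wrap the value so conditions and suites can consume it
        return CheckResult(drift_scores)

    def _calculate_drift(self, train, test):
        # Placeholder metric (illustrative only): absolute difference of
        # means, scaled by the train standard deviation
        scores = {}
        for name in train.numerical_features:
            spread = train.data[name].std() or 1.0
            scores[name] = abs(train.data[name].mean() - test.data[name].mean()) / spread
        return scores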

Step 3: Add Conditions

Conditions define pass/fail criteria for checks:

from deepchecks.core import ConditionCategory, ConditionResult
from deepchecks.tabular import TrainTestCheck


class MyCustomDriftCheck(TrainTestCheck):
    # ... initialization and run_logic ...

    def add_condition_drift_not_exceeds_threshold(self, threshold=0.1):
        """Add a condition that drift scores should not exceed threshold."""
        def condition(value):
            # The condition receives the check result's value
            # (here, the feature -> score mapping)
            failed_features = [
                feature for feature, score in value.items()
                if score > threshold
            ]
            if failed_features:
                return ConditionResult(
                    ConditionCategory.FAIL,
                    f'Features with drift > {threshold}: {failed_features}'
                )
            return ConditionResult(
                ConditionCategory.PASS,
                'All features within drift threshold'
            )

        return self.add_condition(
            f'Drift score is not greater than {threshold}',
            condition
        )

Source: deepchecks/utils/decorators.py
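The threshold logic inside a condition can be developed and unit-tested without the library. The sketch below assumes the check's value is a dictionary of per-feature drift scores and uses a hypothetical `Outcome` tuple as a stand-in for a condition result. Neither `Outcome` nor `drift_condition` is a Deepchecks name.

```python
from typing import Dict, List, NamedTuple


class Outcome(NamedTuple):
    """Hypothetical stand-in for a condition result (not a Deepchecks class)."""
    passed: bool
    details: str


def drift_condition(scores: Dict[str, float], threshold: float = 0.1) -> Outcome:
    """Fail when any feature's drift score is strictly above the threshold."""
    failed: List[str] = sorted(
        name for name, score in scores.items() if score > threshold
    )
    if failed:
        return Outcome(False, f"Features with drift > {threshold}: {failed}")
    return Outcome(True, "All features within drift threshold")


# Made-up scores for illustration only.
ok = drift_condition({"age": 0.02, "income": 0.05})
bad = drift_condition({"age": 0.02, "income": 0.31, "city": 0.12})
```

Keeping the comparison in a plain function like this makes it easy to check edge cases, such as a score exactly equal to the threshold, before wiring it into a check.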

Validation Utilities

Deepchecks provides utilities for validating inputs in custom checks:

Ensure Hashable or Sequence

from deepchecks.utils.validation import ensure_hashable_or_mutable_sequence

def my_check_logic(self, feature_name, feature_values):
    # Validate that feature_name is hashable or a sequence of hashables
    validated_features = ensure_hashable_or_mutable_sequence(
        feature_name,
        message='Feature name must be hashable or a sequence of hashables'
    )

Check if Sequence (Not String)

from deepchecks.utils.validation import is_sequence_not_str

def my_check_logic(self, data):
    if is_sequence_not_str(data):
        # Handle sequence data
        pass

Source: deepchecks/utils/validation.py
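The idea behind both helpers is to accept either a single column name or a list of names and normalise it before use. The following is a standalone sketch of that pattern, not the library's implementation. The names `as_hashable_list` and `is_sequence_but_not_str` are hypothetical and chosen to avoid implying they match the real functions' behaviour in every detail.

```python
from collections.abc import Hashable, MutableSequence, Sequence
from typing import Any, List


def as_hashable_list(value: Any) -> List[Any]:
    """Return [value] for a single hashable, or a list copy of a mutable
    sequence whose items are all hashable."""
    if isinstance(value, Hashable):
        return [value]
    if isinstance(value, MutableSequence):
        for item in value:
            if not isinstance(item, Hashable):
                raise TypeError(f"Unhashable item in sequence: {item!r}")
        return list(value)
    raise TypeError("Expected a hashable or a mutable sequence of hashables")


def is_sequence_but_not_str(value: Any) -> bool:
    """True for lists, tuples and similar sequences, but False for strings."""
    return isinstance(value, Sequence) and not isinstance(value, str)


single = as_hashable_list("age")
several = as_hashable_list(["age", "income"])
```

Strings need the explicit exclusion because a `str` is itself a sequence of characters. Without it, a column name such as `"age"` would be treated as three separate items.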

Decorators for Check Development

Deepchecks provides decorators that help with documentation and parameter handling:

Substitution Decorator

The Substitution decorator allows replacing documentation placeholders:

from deepchecks.utils.decorators import Substitution

@Substitution(
    feature_average_greater_than=0.5,
    feature_average_greater_than_info='The minimum ratio between...'
)
def _feature_segment_condition_scorer_parameter(self, feature_avg: float, ...

Appender Decorator

The Appender decorator adds information to docstrings:

from deepchecks.utils.decorators import Appender

@Appender(
    TrainTestCheck.run.__doc__ + """
    Returns

Source: https://github.com/deepchecks/deepchecks / Human Manual

Integrations

Related topics: Serialization & Output Formats

Deepchecks provides a comprehensive integrations framework that enables seamless connectivity with popular ML platforms, experiment tracking tools, and orchestration systems. This page documents the integration architecture, available integrations, and how to extend Deepchecks with custom integrations.

Overview

Deepchecks integrations allow users to:

Log validation results to experiment tracking platforms
Embed checks within ML pipelines using orchestration tools
Leverage external model serving for validation workflows
Use pretrained embeddings from Hugging Face for NLP checks

The integration system is designed to be modular, allowing each integration to be installed and used independently without requiring the full Deepchecks ecosystem.

Source: deepchecks/tabular/integrations/__init__.py

Architecture

The integration architecture follows a plugin-like pattern where each integration module implements a consistent interface. Integrations are primarily located in deepchecks/tabular/integrations/ for tabular-specific integrations, while utility modules for cross-cutting concerns reside in deepchecks/utils/.

graph TD
    User[User Code] --> Deepchecks[Deepchecks Core]
    Deepchecks --> Integrations[Integration Layer]
    
    Integrations --> WNB[Weights & Biases]
    Integrations --> H2O[H2O.ai]
    Integrations --> Airflow[Apache Airflow]
    Integrations --> HF[Hugging Face]
    
    WNB --> WNBCloud[wandb.ai]
    H2O --> H2OCloud[H2O.ai Platform]
    Airflow --> AirflowScheduler[Airflow Scheduler]
    HF --> HFHub[Hugging Face Hub]
    
    NLP[Deepchecks NLP] --> TextEmbeddings[Text Embeddings]
    TextEmbeddings --> HF[HF Transformers]

Available Integrations

Weights & Biases (wandb) Integration

Deepchecks provides native integration with Weights & Biases for logging validation results alongside model training metrics. This integration is implemented in deepchecks/utils/wandb_utils.py.

#### Key Features

Automatic result logging: Validation results are automatically logged to W&B runs
Suite-level logging: Complete suites can be logged as a single W&B Table
Check-level granularity: Individual check results can be logged separately
Config synchronization: Check configurations are logged as W&B hyperparameters

#### Usage Pattern

from deepchecks.tabular.suites import full_suite

# Run validation
suite = full_suite()
result = suite.run(train_dataset=train, test_dataset=test, model=model)

# Log results to W&B; keyword arguments are forwarded to wandb.init
result.to_wandb(project="model-validation")

#### Configuration Options

Parameter	Type	Description	Default
`project`	`str`	W&B project name	`"deepchecks"`
`name`	`str`	Run name for the validation	Auto-generated
`tags`	`List[str]`	Tags for the W&B run	`[]`
`resume`	`bool`	Resume an existing run	`False`

Source: deepchecks/utils/wandb_utils.py

H2O.ai Integration

Deepchecks integrates with H2O.ai's model validation ecosystem, enabling validation of H2O models using the Deepchecks check library. This integration is implemented in deepchecks/tabular/integrations/h2o.py.

#### Supported H2O Models

The integration supports H2O's supervised learning models including:

H2O Generalized Linear Models (GLM)
H2O Gradient Boosting Machines (GBM)
H2O Random Forest
H2O Deep Learning (AutoML models)

#### Usage Pattern

from deepchecks.tabular.integrations.h2o import H2OChecker
import h2o

# Initialize H2O
h2o.init()

# Load H2O model
model = h2o.load_model("path/to/model.zip")

# Run Deepchecks validation
checker = H2OChecker()
result = checker.run(model=model, train=h2o_train, test=h2o_test)

#### Key Functions

Function	Purpose
`H2OChecker`	Main integration class for H2O models
`validate_h2o_model()`	Validates H2O model compatibility
`convert_h2o_dataset()`	Converts H2OFrame to Deepchecks Dataset

Source: deepchecks/tabular/integrations/h2o.py

Apache Airflow Integration

Deepchecks provides an Apache Airflow operator for embedding validation checks within ML pipelines. This integration is documented in examples/integrations/airflow/README.rst.

#### Airflow DAG Integration

from airflow import DAG
from airflow.operators.python import PythonOperator
from deepchecks.airflow.operators import DeepchecksValidationOperator
from datetime import datetime

with DAG('model_validation_dag', start_date=datetime(2024, 1, 1)) as dag:
    
    validate_model = DeepchecksValidationOperator(
        task_id='run_model_validation',
        model_path='/path/to/model',
        test_dataset='test_dataset.parquet',
        suite='model_evaluation',
        check_config={
            'TrainTestPerformance': {
                'params': {'n_samples': 10000}
            }
        }
    )

#### Workflow

graph LR
    A[Train Model] --> B[Deploy Model]
    B --> C[Deepchecks Validation]
    C --> D{Pass?}
    D -->|Yes| E[Production]
    D -->|No| F[Alert & Rollback]

Source: examples/integrations/airflow/README.rst

Hugging Face Integration

Deepchecks integrates with Hugging Face's ecosystem for NLP model validation, leveraging pretrained models and tokenizers for computing text embeddings and detecting drift.

#### Text Embeddings

The Hugging Face integration provides text embedding capabilities for NLP checks, implemented in deepchecks/nlp/utils/text_embeddings.py.

Embedding Model	Use Case	Model Size
`sentence-transformers/all-MiniLM-L6-v2`	Fast, general purpose	22M params
`sentence-transformers/all-mpnet-base-v2`	High quality	110M params
`bert-base-uncased`	Classification tasks	110M params

#### Supported Tasks

Text Classification: Validate classification models with per-class metrics
Token Classification: NER and other token-level predictions
Text Generation: Quality assessment for generative models
Embedding Drift Detection: Detect distribution shift using embedding-based methods

Source: deepchecks/nlp/utils/text_embeddings.py

NLP Embedding-Based Integrations

Deepchecks uses multivariate embedding techniques for advanced NLP validation, particularly for drift detection.

Multivariate Embeddings Drift Detection

The drift detection system uses sentence embeddings from Hugging Face transformers to compute embedding-based drift scores. This is implemented in deepchecks/nlp/utils/multivariate_embeddings_drift_utils.py.

#### Architecture

graph TD
    Text1[Test Text 1] --> Emb1[Embedding Model]
    Text2[Test Text 2] --> Emb2[Embedding Model]
    
    Emb1 --> Vec1[Embedding Vector]
    Emb2 --> Vec2[Embedding Vector]
    
    Vec1 --> Dist[Distance Calculation]
    Vec2 --> Dist
    
    Dist --> Score[Drift Score]
    Score --> Threshold{Threshold}
    Threshold --> Pass[No Drift]
    Threshold --> Fail[Drift Detected]

#### Configuration

Parameter	Type	Description
`embedding_model`	`str`	Hugging Face model identifier
`batch_size`	`int`	Batch size for embedding computation
`device`	`str`	Device for computation (`cpu`, `cuda`)
`drift_threshold`	`float`	Threshold for drift detection

Source: deepchecks/nlp/utils/multivariate_embeddings_drift_utils.py
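To make the pipeline in the diagram concrete, here is a toy distance-based score. It averages each set's embedding vectors and compares the two centroids with cosine distance. This is an illustrative assumption, not necessarily the method implemented in multivariate_embeddings_drift_utils.py. The function names, the two-dimensional vectors, and the 0.2 threshold are all made up for the example.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty collection of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def centroid_cosine_drift(train: Sequence[Vector], test: Sequence[Vector]) -> float:
    """Toy drift score: cosine distance between the two set centroids."""
    return cosine_distance(centroid(train), centroid(test))


# Made-up 2-D "embeddings"; real sentence embeddings have hundreds of dimensions.
train_embeddings = [[1.0, 0.0], [1.0, 0.0]]
similar_embeddings = [[1.0, 0.0], [1.0, 0.0]]
shifted_embeddings = [[0.0, 1.0], [0.0, 1.0]]

DRIFT_THRESHOLD = 0.2  # arbitrary value for the example
score_similar = centroid_cosine_drift(train_embeddings, similar_embeddings)
score_shifted = centroid_cosine_drift(train_embeddings, shifted_embeddings)
drift_detected = score_shifted > DRIFT_THRESHOLD
```

A centroid comparison is cheap but only sees shifts in the mean direction of the embeddings. Changes in spread or in the mix of clusters would need a richer method.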

Model Protocol Interfaces

Deepchecks defines protocol interfaces that integrations must implement to ensure compatibility with the validation framework.

BasicModel Protocol

The BasicModel protocol defines the minimal interface required for all models:

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""
    
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...

ClassificationModel Protocol

For classification tasks, models must also implement:

@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""
    
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...

Source: deepchecks/utils/typing.py

Common Issues and Troubleshooting

Integration-Specific Issues

Issue	Cause	Solution
W&B results not appearing	API key not configured	Run `wandb login` or set `WANDB_API_KEY` environment variable
H2O model validation timeout	Large dataset	Reduce `n_samples` parameter or use sampling
Hugging Face model download fails	Network issues	Set `HF_HOME` to a directory with cached models
Airflow operator failing	Missing dependencies	Install `apache-airflow-providers-cncf-kubernetes`

GPU Runtime Configuration

When using GPU-accelerated checks (particularly for image and text drift detection), ensure the runtime device is properly configured:

# For Hugging Face embeddings with GPU
from deepchecks.nlp.utils.text_embeddings import TextEmbeddings

embeddings = TextEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    device="cuda"  # or "cpu"
)

Note: Some users have reported issues with GPU runtime configuration for Image Property Drift and Image Dataset Drift checks. See GitHub Issue #2789 for troubleshooting guidance.

Model Validation Errors

If you encounter model validation errors:

Verify the model implements the required protocol (BasicModel or ClassificationModel)
Check that predict method returns compatible types
For classification models, ensure predict_proba returns probability arrays
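The three checks above can be automated as a quick pre-flight test before a model is handed to a suite. The helper below is a hypothetical sketch and not part of Deepchecks. It assumes list-like inputs and treats probability rows as valid when they sum to 1 within a small tolerance.

```python
from typing import Any, List


def find_model_problems(model: Any, sample: List[List[float]],
                        classification: bool = False) -> List[str]:
    """Return human-readable problems found with a model's interface."""
    predict = getattr(model, "predict", None)
    if not callable(predict):
        return ["model has no callable predict method"]

    problems: List[str] = []
    if len(list(predict(sample))) != len(sample):
        problems.append("predict returned a different number of rows than the input")

    if classification:
        predict_proba = getattr(model, "predict_proba", None)
        if not callable(predict_proba):
            problems.append("classification model has no callable predict_proba method")
        elif any(abs(sum(row) - 1.0) > 1e-6 for row in predict_proba(sample)):
            problems.append("predict_proba rows do not sum to 1")
    return problems


class GoodClassifier:
    """Hypothetical well-behaved classifier."""

    def predict(self, X):
        return [1 for _ in X]

    def predict_proba(self, X):
        return [[0.25, 0.75] for _ in X]


class BrokenClassifier(GoodClassifier):
    """Hypothetical classifier returning unnormalised scores."""

    def predict_proba(self, X):
        return [[0.9, 0.3] for _ in X]


sample_rows = [[0.1, 0.2], [0.3, 0.4]]
good_report = find_model_problems(GoodClassifier(), sample_rows, classification=True)
bad_report = find_model_problems(BrokenClassifier(), sample_rows, classification=True)
```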

Extension Points

Creating Custom Integrations

To create a custom integration for Deepchecks:

Implement the Model Protocol: Ensure your model wrapper implements BasicModel or ClassificationModel

Create Dataset Adapter: Wrap your data format in a Deepchecks-compatible Dataset

Register Integration: (Future feature) Create a PR to add your integration to the registry

from deepchecks.utils.typing import BasicModel

class CustomModelWrapper(BasicModel):
    def __init__(self, model):
        self.model = model
    
    def predict(self, X):
        # Transform X to your model's expected format
        return self.model.predict(X)

Integration with CI/CD Systems

For CI/CD integration, Deepchecks supports:

Exit codes: Call result.passed() in your CI script and exit with a non-zero status on validation failure
JSON output: Use result.to_json() for programmatic parsing
HTML reports: Use result.save_as_html() for visual reports
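One way to turn a suite result into a process exit status in a CI script is sketched below. It assumes the result object exposes a `passed()` method that returns a boolean. `FakeResult` is a hypothetical stand-in so the snippet runs without the library, and `ci_exit_code` is an illustrative helper name.

```python
import json
from typing import Any


class FakeResult:
    """Hypothetical stand-in for a suite result (not a Deepchecks class)."""

    def __init__(self, ok: bool):
        self._ok = ok

    def passed(self) -> bool:
        return self._ok

    def to_json(self) -> str:
        return json.dumps({"passed": self._ok})


def ci_exit_code(result: Any) -> int:
    """Map a validation result to an exit status: 0 on success, 1 on failure."""
    return 0 if result.passed() else 1


exit_ok = ci_exit_code(FakeResult(True))
exit_fail = ci_exit_code(FakeResult(False))
payload = json.loads(FakeResult(False).to_json())

# In a real pipeline step the script would end with:
#     sys.exit(ci_exit_code(result))
```

Most CI systems mark a step as failed on any non-zero exit status, so returning 1 is enough to block a merge or deployment.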

Note: Users have requested Markdown export support for CI/CD integration. See GitHub Issue #1290 for the feature request.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 22 structured pitfall item(s), including 5 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_492bcbfbeaac498b94f2f869074b9edc | https://github.com/deepchecks/deepchecks/issues/2803

2. Installation risk: Installation risk requires verification

Severity: high
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_4bac6c577dee471fa096434516861696 | https://github.com/deepchecks/deepchecks/issues/2794

3. Maintenance risk: Maintenance risk requires verification

Severity: high
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_b9f771df5da2458d9e368765d829e5c7 | https://github.com/deepchecks/deepchecks/issues/2789

4. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_aaa0c0bcbdbf41d6855980523e0d7682 | https://github.com/deepchecks/deepchecks/issues/2813

5. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_edf8cf14dc8f49898cbcab292f3abbeb | https://github.com/deepchecks/deepchecks/issues/2802

6. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_4789fb40d7364096958752494c3054a2 | https://github.com/deepchecks/deepchecks/releases/tag/0.18.0

7. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_53c2c4845d134a54b0989b29725c1c93 | https://github.com/deepchecks/deepchecks/releases/tag/0.18.1

8. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_e24dd97e69674ab2b766fef25a401070 | https://github.com/deepchecks/deepchecks/issues/2812

9. Installation risk: Installation risk requires verification

Severity: medium
Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_6e307650606b4ed380c4adc96caa8c28 | https://github.com/deepchecks/deepchecks/issues/2806

10. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_b64249b0551146afb696a981f404b3e6 | https://github.com/deepchecks/deepchecks/releases/tag/0.17.3

11. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_b3a17f2d1dcd4fcb8c3f19ea7065d12e | https://github.com/deepchecks/deepchecks/issues/2804

12. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | cevd_e55595065e6b4c72b0409a303bb46b11 | https://github.com/deepchecks/deepchecks/releases/tag/0.17.1

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using deepchecks with real data or production workflows.

[[FEAT] LLM Support?](https://github.com/deepchecks/deepchecks/issues/2767) - github / github_issue
[[BUG] GPU not being able to change runtime of Image Property Drift and I](https://github.com/deepchecks/deepchecks/issues/2789) - github / github_issue
Failed to load model class 'AnyModel' from module 'anywidget' Error: No - github / github_issue
[[DEE-170] [FEAT] Add tests for python 3.10 & 3.11](https://github.com/deepchecks/deepchecks/issues/2161) - github / github_issue
Feature Request: EU AI Act compliance mapping for validation checks - github / github_issue
Deepchecks Fix - Additional Checks NLP - github / github_issue
Proposal: Doc/example for RAG failure-mode testing using WFGY 16-problem - github / github_issue
https://github.com/deepchecks/deepchecks/blob/98475d17b08a21fca29d533b94 - github / github_issue
[[BUG] neg_log_loss scorer incompatible with newer scikit-learn version](https://github.com/deepchecks/deepchecks/issues/2806) - github / github_issue
[[BUG] Inaccurate Conditions Summary and Heatmap for Pairwise Correlation](https://github.com/deepchecks/deepchecks/issues/2802) - github / github_issue
Blank html page after saving report using save_as_html - github / github_issue
[[FEAT] NLP property - sudden stop](https://github.com/deepchecks/deepchecks/issues/2722) - github / github_issue

Source: Project Pack community evidence and pitfall evidence
