Doramagic Project Pack · Human Manual
deepchecks
Related topics: Installation & Quickstart, Core Architecture, Checks & Suites Framework
Deepchecks Repository Overview
Deepchecks is an open-source Python library designed for validating and testing machine learning models and data throughout the ML lifecycle. It provides a comprehensive suite of checks organized into validation suites, enabling data scientists and ML engineers to systematically evaluate model quality, detect data integrity issues, and ensure robust model performance.
Purpose and Scope
Deepchecks addresses the critical need for systematic ML validation by providing:
- Pre-training validation: Checks for data integrity, distribution analysis, and feature engineering validation
- Post-training validation: Model performance evaluation, error analysis, and robustness testing
- Ongoing monitoring: Drift detection for data and model predictions
- Cross-domain support: Built-in support for tabular, image, and natural language processing (NLP) domains
The library is distributed under the GNU Affero General Public License (version 3 or later) and is designed to integrate seamlessly into existing ML workflows, including CI/CD pipelines.
Architecture Overview
The Deepchecks architecture follows a modular design with clear separation between core abstractions, domain-specific implementations, and utility functions.
graph TB
subgraph "Core Layer"
C[Core Checks]
S[Suite Engine]
R[Check Result]
E[Errors & Validation]
end
subgraph "Domain Layer"
T[Tabular Module]
I[Vision Module]
N[NLP Module]
end
subgraph "Utility Layer"
U[Utils - Typing]
V[Utils - Validation]
M[Utils - Metrics]
L[Utils - Logger]
J[Utils - JSON]
end
T --> C
I --> C
N --> C
S --> R
U --> E
V --> E
M --> C
L --> C
J --> R
Core Abstractions
The core layer provides fundamental abstractions that all domain modules inherit from:
| Component | Purpose | Key Classes |
|---|---|---|
| Checks | Individual validation tests | BaseCheck, TrainTestCheck, SingleDatasetCheck |
| Suites | Collection of organized checks | Suite, SuiteResult |
| Results | Output from check execution | CheckResult, CheckFailure |
| Conditions | Pass/fail criteria for checks | ConditionResult, ConditionCategory |
Source: deepchecks/__init__.py
Model Protocol System
Deepchecks defines a protocol-based model interface system that supports various model types while maintaining flexibility.
Basic Model Protocol
The BasicModel protocol defines the minimal interface required for any model to work with Deepchecks checks:
@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...
Source: deepchecks/utils/typing.py:1-50
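Because the protocol is marked `@runtime_checkable`, an `isinstance` check only verifies that a `predict` attribute is present; it does not inspect the signature. The sketch below illustrates this with a local stand-in protocol so that it runs without Deepchecks installed. `BasicModelSketch`, `MeanRegressor`, and `NotAModel` are hypothetical names used only for illustration.

```python
from typing import Hashable, List, Protocol, runtime_checkable


@runtime_checkable
class BasicModelSketch(Protocol):
    """Local stand-in for the protocol quoted above (not imported from deepchecks)."""

    def predict(self, X) -> List[Hashable]:
        ...


class MeanRegressor:
    """Hypothetical model: always predicts the mean of the training targets."""

    def __init__(self, targets):
        self.mean_ = sum(targets) / len(targets)

    def predict(self, X):
        return [self.mean_ for _ in X]


class NotAModel:
    """Has no predict method, so it fails the structural check."""


model = MeanRegressor([1.0, 2.0, 3.0])
is_model = isinstance(model, BasicModelSketch)            # predict is present
is_not_model = isinstance(NotAModel(), BasicModelSketch)  # predict is missing
predictions = model.predict([[0], [1]])
```

No inheritance is involved: any object with a `predict` method passes, which is what lets estimators from different frameworks be used without adapters.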
Classification Model Protocol
Classification models require additional probability prediction capabilities:
@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...
Source: deepchecks/utils/typing.py:53-61
Task Types
Deepchecks supports three primary machine learning task types:
| Task Type | Value | Description |
|---|---|---|
| REGRESSION | 'regression' | Continuous value prediction |
| BINARY | 'binary' | Binary classification |
| MULTICLASS | 'multiclass' | Multi-class classification |
Source: deepchecks/tabular/utils/task_type.py
Validation Utilities
The validation module provides essential functions for input validation and model verification.
Model Validation
def model_type_validation(model: t.Any):
    """Receive any object and check if it's an instance of a model we support."""
    if not isinstance(model, BasicModel):
        raise errors.ModelValidationError(
            f'Model supplied does not meets the minimal interface requirements.'
        )
Source: deepchecks/tabular/utils/validation.py
Value Validation
def ensure_hashable_or_mutable_sequence(
value: t.Union[T, t.MutableSequence[T]],
message: str = (
'Provided value is neither hashable nor mutable '
'sequence of hashable items. Got {type}')
) -> t.List[T]:
Source: deepchecks/utils/validation.py
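Only the signature and default message are shown above, so the body below is a guess at the behaviour they imply: wrap a single hashable value in a list, accept a mutable sequence whose items are all hashable, and reject anything else. The function name `ensure_list_of_hashables_sketch` and the choice of `TypeError` are assumptions, not the library's implementation.

```python
from collections.abc import Hashable, MutableSequence


def ensure_list_of_hashables_sketch(value):
    """Return `value` as a list of hashable items, or raise TypeError.

    Assumed behaviour, inferred from the signature and default message only.
    """
    # A single hashable value (str, int, tuple, ...) becomes a one-item list.
    if isinstance(value, Hashable):
        return [value]
    # A mutable sequence is accepted only if every item is hashable.
    if isinstance(value, MutableSequence) and all(isinstance(v, Hashable) for v in value):
        return list(value)
    raise TypeError(
        'Provided value is neither hashable nor mutable '
        f'sequence of hashable items. Got {type(value).__name__}'
    )


single = ensure_list_of_hashables_sketch('age')
several = ensure_list_of_hashables_sketch(['age', 'income'])
```

A helper of this shape lets an API accept either one column name or a list of names while the rest of the code always works with a list.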
Feature Importance System
Feature importance calculations are central to many Deepchecks checks, enabling identification of the most impactful features.
def calculate_feature_importance_or_none(
model: t.Any,
dataset: t.Union['tabular.Dataset', pd.DataFrame],
model_classes,
observed_classes,
task_type,
...
Source: deepchecks/tabular/utils/feature_importance.py
Supported Methods
| Method | Description |
|---|---|
| Permutation Importance | Uses scikit-learn's permutation_importance |
| Built-in Importance | Extracts from models with feature_importances_ attribute |
| Order-based | Falls back to feature column order when other methods unavailable |
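To make the permutation idea concrete, here is a minimal pure-Python sketch: shuffle one column at a time and measure how much accuracy drops. The table above says the library relies on scikit-learn's `permutation_importance`; this stand-in only illustrates the principle, and `permutation_importance_sketch` and `ThresholdModel` are hypothetical names.

```python
import random


def permutation_importance_sketch(model, rows, labels, n_repeats=5, seed=0):
    """Return one score per column: the mean accuracy drop after shuffling it."""
    rng = random.Random(seed)

    def accuracy(data):
        preds = model.predict(data)
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)

    baseline = accuracy(rows)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled_col = [row[col] for row in rows]
            rng.shuffle(shuffled_col)
            # Rebuild the rows with only this column permuted.
            permuted = [row[:col] + [v] + row[col + 1:]
                        for row, v in zip(rows, shuffled_col)]
            drops.append(baseline - accuracy(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances


class ThresholdModel:
    """Hypothetical model that looks only at column 0."""

    def predict(self, X):
        return [int(row[0] > 0.5) for row in X]


rows = [[0.1, 7.0], [0.9, 3.0], [0.2, 5.0], [0.8, 1.0], [0.3, 9.0], [0.7, 2.0]]
labels = [0, 1, 0, 1, 0, 1]
scores = permutation_importance_sketch(ThresholdModel(), rows, labels)
```

Column 1 is never read by the model, so shuffling it cannot change any prediction and its score is exactly zero; column 0 can only lose accuracy when shuffled.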
Metrics and Scoring
Deepchecks provides comprehensive metric utilities for model evaluation.
Scorer Utilities
def get_gain(base_score, score, perfect_score, max_gain):
    """Get gain between base score and score compared to the distance from the perfect score."""
Source: deepchecks/utils/metrics.py
Gain Calculation Logic
The gain calculation provides normalized performance improvement metrics:
| Scenario | Return Value |
|---|---|
| Both base and score are perfect | 0 |
| Base score is better than score | -max_gain |
| Normal improvement | scores_diff / distance_from_perfect |
| Capped improvement | Clamped to [-max_gain, max_gain] |
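The table can be read as the following sketch. It assumes that the `-max_gain` row applies when the base score already equals the perfect score (so there is no distance left to divide by) and that every other case is the clamped ratio. `gain_sketch` is a hypothetical name and the body is a reconstruction from the table, not a copy of the library code.

```python
def gain_sketch(base_score, score, perfect_score, max_gain):
    """Normalised improvement of `score` over `base_score`, clamped to +/- max_gain."""
    distance_from_perfect = perfect_score - base_score
    scores_diff = score - base_score
    if distance_from_perfect == 0:
        # Base is already perfect: equal scores give 0, anything else the worst gain.
        return 0 if scores_diff == 0 else -max_gain
    ratio = scores_diff / distance_from_perfect
    # Clamp to [-max_gain, max_gain].
    return max(-max_gain, min(max_gain, ratio))


normal = gain_sketch(base_score=0.5, score=0.75, perfect_score=1.0, max_gain=50)
both_perfect = gain_sketch(base_score=1.0, score=1.0, perfect_score=1.0, max_gain=50)
base_perfect_only = gain_sketch(base_score=1.0, score=0.5, perfect_score=1.0, max_gain=50)
capped = gain_sketch(base_score=0.5, score=-100.0, perfect_score=1.0, max_gain=50)
```

In the first call the model closes half of the remaining distance to a perfect score (0.25 of 0.5), so the gain is 0.5.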
Logging System
Deepchecks implements a centralized logging system for debugging and progress tracking.
_logger = logging.getLogger('deepchecks')
def get_logger() -> logging.Logger:
    """Return the deepchecks logger."""
    return _logger
def set_verbosity(level: int):
    """Set the deepchecks logger verbosity level."""
Source: deepchecks/utils/logger.py
Verbosity Levels
| Level | Effect |
|---|---|
| INFO | Shows progress bars and informational messages |
| WARNING | Suppresses progress bars, shows warnings only |
| ERROR | Shows only error messages |
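Since the excerpt shows a standard `logging.Logger` named `'deepchecks'`, the level thresholds follow the usual standard-library rules. The sketch below uses only the `logging` module and does not call `set_verbosity`; whether that function does anything beyond setting the logger level is not shown in the source.

```python
import logging

# Fetch the same named logger the excerpt above creates.
logger = logging.getLogger('deepchecks')

# Raising the threshold to WARNING filters out INFO-level messages.
logger.setLevel(logging.WARNING)

info_enabled = logger.isEnabledFor(logging.INFO)
warning_enabled = logger.isEnabledFor(logging.WARNING)
```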
Serialization and JSON Support
Deepchecks supports serialization of check results for persistence and integration with external systems.
def from_json(json_dict: t.Union[str, t.Dict]) -> t.Union[BaseCheckResult, SuiteResult]:
    """Convert a json object that was returned from one of our classes to_json."""
    if isinstance(json_dict, str):
        json_dict = jsonpickle.loads(json_dict)
    json_type = json_dict['type']
    if 'Check' in json_type:
        return BaseCheckResult.from_json(json_dict)
    if json_type == 'SuiteResult':
        return SuiteResult.from_json(json_dict)
Source: deepchecks/utils/json_utils.py
Note: There is a known issue (#2804) with WeakSegmentsPerformance().to_json() where the value field containing both weak_segments (DataFrame) and avg_score gets flattened during serialization.
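The dispatch-on-`type` pattern in `from_json` can be sketched with the standard library alone. The version below swaps `jsonpickle` for `json`, returns a tag instead of a result object, and raises on an unknown type; all three are illustration choices rather than library behaviour, and the payloads are made up.

```python
import json


def load_result_sketch(payload):
    """Dispatch on the 'type' field, mirroring the branching quoted above."""
    if isinstance(payload, str):
        payload = json.loads(payload)
    kind = payload['type']
    if 'Check' in kind:
        return ('check', payload)
    if kind == 'SuiteResult':
        return ('suite', payload)
    # Assumption: fail loudly rather than fall through and return None.
    raise ValueError(f'Unsupported result type: {kind!r}')


check_kind, check_payload = load_result_sketch('{"type": "CheckResult", "value": 0.93}')
suite_kind, _ = load_result_sketch({'type': 'SuiteResult', 'results': []})
```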
Outlier Detection
Deepchecks includes IQR-based outlier detection utilities:
def iqr_outliers_range(data: np.ndarray,
                       iqr_range: Tuple[int, int],
                       scale: float,
                       sharp_drop_ratio: float = 0.9) -> Tuple[float, float]:
    """Calculate outliers range on the data given using IQR."""
Source: deepchecks/utils/outliers.py
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | np.ndarray | Required | Data to calculate outliers range for |
| iqr_range | Tuple[int, int] | Required | Two percentiles defining IQR range |
| scale | float | Required | Scale multiplier for IQR range |
| sharp_drop_ratio | float | 0.9 | Threshold for sharp drop detection |
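A minimal sketch of the textbook IQR rule that the first three parameters describe: take two percentiles, then widen the interval by `scale` times their difference. It is written in pure Python rather than NumPy, ignores `sharp_drop_ratio` entirely (its handling is not shown in the source), and uses the hypothetical names `percentile_sketch` and `iqr_range_sketch`.

```python
def percentile_sketch(sorted_values, pct):
    """Linear-interpolation percentile on an already sorted list."""
    pos = (len(sorted_values) - 1) * pct / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


def iqr_range_sketch(data, iqr_range, scale):
    """Return (low, high); values outside this interval count as outliers."""
    values = sorted(data)
    q_low = percentile_sketch(values, iqr_range[0])
    q_high = percentile_sketch(values, iqr_range[1])
    iqr = q_high - q_low
    return q_low - scale * iqr, q_high + scale * iqr


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
low, high = iqr_range_sketch(data, iqr_range=(25, 75), scale=1.5)
outliers = [v for v in data if v < low or v > high]
```

With these ten values the 25th and 75th percentiles are 3.25 and 7.75, so the interval is (-3.5, 14.5) and only 100 falls outside it.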
Simple Model Utilities
For testing and baseline comparisons, Deepchecks provides reference model implementations:
| Model | Purpose |
|---|---|
| PerfectModel | Predicts perfectly from training labels |
| RandomModel | Random predictions for baseline testing |
| ClassificationUniformModel | Uniform probability distribution |
| RegressionUniformModel | Uniform continuous predictions |
Source: deepchecks/utils/simple_models.py
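As an illustration of what a uniform baseline looks like, the sketch below assigns every class the same probability. It is not the library's `ClassificationUniformModel`; `UniformClassifierSketch` is a hypothetical class, and breaking ties by returning the first class is an arbitrary choice.

```python
class UniformClassifierSketch:
    """Baseline that gives every class the same probability."""

    def __init__(self, classes):
        self.classes_ = list(classes)

    def predict_proba(self, X):
        p = 1 / len(self.classes_)
        return [[p] * len(self.classes_) for _ in X]

    def predict(self, X):
        # All classes tie, so pick the first one deterministically.
        return [self.classes_[0] for _ in X]


baseline = UniformClassifierSketch(['cat', 'dog'])
probabilities = baseline.predict_proba([[0.1], [0.9]])
predicted = baseline.predict([[0.1], [0.9]])
```

A real model that cannot beat a baseline like this on its evaluation metric has learned nothing useful from the features.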
Decorator System
Deepchecks uses decorators for documentation and code modification:
| Decorator | Purpose |
|---|---|
| @Substitution | Dynamic docstring substitution |
| @Appender | Append content to docstrings |
| @deprecate_kwarg | Mark deprecated keyword arguments |
Source: deepchecks/utils/decorators.py
Type Inference
Automated feature type detection supports categorical and numerical feature identification:
def infer_numerical_features(df: pd.DataFrame) -> t.List[Hashable]:
    """Infers which features are numerical."""
def infer_categorical_features(df: pd.DataFrame) -> t.List[Hashable]:
    """Infers which columns are categorical."""
Source: deepchecks/utils/type_inference.py
Execution Flow
graph LR
A[User Code] --> B[Create Suite/Check]
B --> C[Run with Dataset/Model]
C --> D{Check Logic}
D -->|Validation Pass| E[Generate CheckResult]
D -->|Validation Fail| F[Raise DeepchecksValueError]
E --> G[Apply Conditions]
G --> H{Result Category}
H -->|Pass| I[Display Green]
H -->|Fail| J[Display Red/Warning]
H -->|Error| K[Display Error]
Known Issues and Community Feedback
The following issues from the community are relevant to users working with the repository:
| Issue | Description | Status |
|---|---|---|
| #2789 | GPU runtime optimization not working for Image Property/Dataset Drift | Bug - needs triage |
| #2794 | anywidget module not registered in visualization | Bug |
| #2806 | neg_log_loss scorer incompatible with newer scikit-learn | Bug |
| #2802 | Inaccurate conditions summary for Pairwise Correlation | Bug with proposed solution |
| #2803 | Blank HTML page after save_as_html() | Bug |
| #2804 | WeakSegmentsPerformance JSON serialization flattening | Bug |
Feature Requests
Notable feature requests from the community include:
- #1290: Add option to save reports as Markdown files for CI/CD integration (23 comments - most engaged issue)
- #2767: LLM Support for evaluating language model-based applications
- #2813: EU AI Act compliance mapping for validation checks (aligned with August 2026 enforcement deadline)
- #2812: RAG failure-mode testing documentation using WFGY ProblemMap
Version Information
The current stable release is 0.19.1, which includes:
- scikit-learn compatibility updates
- Pandas version upgrade support
Recent releases:
| Version | Key Changes |
|---|---|
| 0.19.1 | scikit-learn compatibility updates, pandas upgrade support |
| 0.19.0 | Contributor additions, 0.18.x release merge |
| 0.18.1 | Build fixes |
| 0.18.0 | Documentation improvements, contributor additions |
| 0.17.4 | Hotfix version bump |
Source: https://github.com/deepchecks/deepchecks / Human Manual
Installation & Quickstart
Related topics: Deepchecks Repository Overview, Tabular Data Validation, NLP Validation, Computer Vision Validation
This page provides comprehensive guidance for installing Deepchecks and getting started with model validation. Deepchecks is an open-source library for validating machine learning models and data throughout the ML pipeline—from data integrity checks to model performance evaluation and drift detection.
Overview
Deepchecks supports multiple ML domains through specialized modules:
- Tabular: Validation for tabular data and traditional ML models
- Vision: Validation for image datasets and computer vision models
- NLP: Validation for text data and natural language processing models
The installation process automatically handles core dependencies, with optional extras for domain-specific functionality.
System Requirements
Prerequisites
| Requirement | Minimum Version | Notes |
|---|---|---|
| Python | 3.8+ | Tested up to Python 3.11 |
| pip | 21.0+ | Recommended for installation |
| conda | 4.10+ | Alternative installation option |
Optional Dependencies by Domain
| Domain | Optional Packages | Installation Flag |
|---|---|---|
| Vision | albumentations, torch, torchvision | pip install deepchecks[vision] |
| NLP | transformers, nltk, spacy | pip install deepchecks[nlp] |
| All Extras | All optional packages | pip install deepchecks[all] |
Source: requirements/vision-requirements.txt, requirements/nlp-requirements.txt
Installation Methods
PyPI Installation (Recommended)
The standard installation uses pip from PyPI:
# Core installation (tabular only)
pip install deepchecks
# With vision support
pip install deepchecks[vision]
# With NLP support
pip install deepchecks[nlp]
# All optional dependencies
pip install deepchecks[all]
Source: setup.py
Conda Installation
For conda users, Deepchecks is available through conda-forge:
conda install -c conda-forge deepchecks
Source: conda-recipe/meta.yaml
Development Installation
To install from source for development:
git clone https://github.com/deepchecks/deepchecks.git
cd deepchecks
pip install -e .
Environment Detection
Deepchecks automatically detects the execution environment to optimize display behavior. The library checks for:
- Jupyter Notebook: Full interactive display with rich output
- Google Colab: Optimized display for Colab notebooks
- Kaggle: Environment-specific handling for Kaggle notebooks
- Databricks/SageMaker: Cloud notebook environment detection
- Terminal/Headless: Text-only output when GUI is unavailable
Source: deepchecks/utils/ipython.py:37-54
# Environment detection functions available in deepchecks.utils
from deepchecks.utils.ipython import (
is_notebook,
is_colab_env,
is_kaggle_env,
is_databricks_env,
is_sagemaker_env,
is_headless
)
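How such helpers might work can be sketched with two widely used heuristics: inspect the active IPython shell class, and look for Colab's `google.colab` module. These are common community idioms, not the library's implementation, and both function names are hypothetical.

```python
import sys


def is_notebook_sketch() -> bool:
    """Heuristic: True when the code runs inside a Jupyter kernel."""
    ipython = sys.modules.get('IPython')
    if ipython is None:
        return False  # IPython was never imported, so no kernel is active
    shell = ipython.get_ipython()
    # Jupyter kernels use ZMQInteractiveShell; a terminal IPython session does not.
    return shell is not None and type(shell).__name__ == 'ZMQInteractiveShell'


def is_colab_sketch() -> bool:
    """Heuristic: Colab preloads the google.colab package."""
    return 'google.colab' in sys.modules


running_in_notebook = is_notebook_sketch()
running_in_colab = is_colab_sketch()
```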
Headless Mode Configuration
When running in CI/CD environments or servers without display capabilities, Deepchecks operates in headless mode. The library automatically detects headless environments but can be explicitly configured:
import logging
from deepchecks.utils.logger import set_verbosity
# Progress bars are disabled at WARNING level
set_verbosity(logging.WARNING)
Source: deepchecks/utils/logger.py:38-45
Quickstart Guide
Basic Tabular Validation
The fastest way to validate a tabular model using built-in datasets:
from deepchecks.tabular.datasets.classification import iris
from deepchecks.tabular.suites import full_suite
# Load the built-in Iris sample as train/test Dataset objects
train, test = iris.load_data(data_format='Dataset', as_train_test=True)
# Run a full validation suite
suite = full_suite()
result = suite.run(train, test)
# Display results (works in notebooks)
result.show()
Source: deepchecks/utils/builtin_datasets_utils.py
Running Individual Checks
For more granular control, run individual checks:
from deepchecks.tabular.checks import MixedNulls
from deepchecks.tabular.datasets.classification import iris
# Load data
train, _ = iris.load_data(data_format='Dataset', as_train_test=True)
# Run a single check (MixedNulls looks for several null representations in one column)
check = MixedNulls()
result = check.run(dataset=train)
result.show()
Model Validation with Custom Models
Deepchecks supports any model implementing the basic model interface:
from deepchecks.tabular import Dataset
from deepchecks.tabular.checks import ModelInfo
from sklearn.ensemble import RandomForestClassifier
# Create datasets from pandas DataFrames (train_df and test_df are assumed to exist)
train_dataset = Dataset(train_df, label='target')
test_dataset = Dataset(test_df, label='target')
# Fit any estimator that exposes predict()
model = RandomForestClassifier(random_state=0)
model.fit(train_dataset.features_columns, train_dataset.label_col)
# ModelInfo is a model-only check, so it receives just the model
check = ModelInfo()
result = check.run(model)
Source: deepchecks/utils/typing.py:22-27
# Required model interface
class BasicModel(Protocol):
    """Minimal interface required by Deepchecks."""
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...
class ClassificationModel(BasicModel, Protocol):
    """Classification models require probability predictions."""
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...
Core Concepts
Checks and Suites
Deepchecks organizes validation into two conceptual levels:
graph TD
A[Suite] --> B[Check 1]
A --> C[Check 2]
A --> D[Check N]
B --> E[Result with Conditions]
C --> F[Result with Conditions]
D --> G[Result with Conditions]
E --> H[SuiteResult]
F --> H
G --> H
Checks are individual validation tests that return structured results. Suites are collections of checks that run together and aggregate results.
Source: deepchecks/core/check_result.py
Dataset Structure
The Dataset class wraps pandas DataFrames with additional metadata:
from deepchecks.tabular import Dataset
# Required: DataFrame and label column
dataset = Dataset(df, label='target_column')
# Optional: Specify feature types
dataset = Dataset(
df,
label='target_column',
features=['feature1', 'feature2'],
cat_features=['categorical_feature'],
index='id_column',
datetime='timestamp_column'
)
Source: deepchecks/utils/validation.py:26-43
Conditions and Thresholds
Checks produce results that can be evaluated against conditions:
from deepchecks.tabular.checks import PercentOfNulls
# Create check with a condition: no column may have more than 5% nulls
check = PercentOfNulls().add_condition_percent_of_nulls_not_greater_than(0.05)
result = check.run(dataset=train)
# Check condition status
for condition_result in result.conditions_results:
    print(f"{condition_result.name}: {condition_result.category}")
Task Types
Deepchecks automatically infers the task type or can be explicitly specified:
from deepchecks.tabular.utils.task_type import TaskType
# Task types available
TaskType.REGRESSION # For regression models
TaskType.BINARY # For binary classification
TaskType.MULTICLASS # For multi-class classification
Source: deepchecks/tabular/utils/task_type.py:14-19
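The source does not show how the task type is inferred, so the following is only one plausible heuristic: labels that are not whole numbers, or that take many distinct values, suggest regression; otherwise the number of distinct classes decides between binary and multiclass. `TaskTypeSketch`, `infer_task_type_sketch`, and the `max_classes` cut-off are all hypothetical.

```python
from enum import Enum


class TaskTypeSketch(Enum):
    """Local stand-in mirroring the three values listed above."""
    REGRESSION = 'regression'
    BINARY = 'binary'
    MULTICLASS = 'multiclass'


def infer_task_type_sketch(labels, max_classes=10):
    """Guess the task type from the label values (illustrative heuristic only)."""
    unique = set(labels)
    # Strings and booleans are class names; numbers count only if they are whole.
    looks_discrete = all(
        isinstance(v, (str, bool)) or float(v).is_integer() for v in unique
    )
    if not looks_discrete or len(unique) > max_classes:
        return TaskTypeSketch.REGRESSION
    return TaskTypeSketch.BINARY if len(unique) <= 2 else TaskTypeSketch.MULTICLASS


binary_task = infer_task_type_sketch([0, 1, 1, 0])
multiclass_task = infer_task_type_sketch(['a', 'b', 'c', 'a'])
regression_task = infer_task_type_sketch([0.5, 1.7, 2.2])
```

Heuristics like this misfire on edge cases such as integer-valued regression targets, which is why passing the task type explicitly is the safer option when it is known.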
Saving and Exporting Results
HTML Export
Save validation reports as standalone HTML files:
result = suite.run(train, test)
result.save_as_html('validation_report.html')
Note: Users have reported issues with blank HTML pages when using certain versions of anywidget. If you encounter this, ensure you have a compatible version installed.
Source: deepchecks/issues/2794, deepchecks/issues/2803
JSON Export
Serialize results for programmatic processing:
json_output = result.to_json()
Note: Some checks like WeakSegmentsPerformance may require special handling when converting to JSON due to nested DataFrame structures.
Source: deepchecks/issues/2804, deepchecks/utils/json_utils.py
Validation Workflow
graph LR
A[Prepare Data] --> B[Create Datasets]
B --> C[Load/Define Model]
C --> D[Select Suite or Checks]
D --> E[Configure Conditions]
E --> F[Run Validation]
F --> G[Review Results]
G --> H{Issues Found?}
H -->|Yes| I[Address Issues]
I --> A
H -->|No| J[Deploy Model]
Common Installation Issues
GPU/CUDA Configuration for Vision and NLP
Some checks can leverage GPU acceleration for faster computation:
# For image drift checks, GPU can be enabled
# Note: Currently limited runtime optimization support
Known Issue: GPU acceleration for Image Property Drift and Image Dataset Drift has limited runtime optimization support in version 0.19.x.
Source: deepchecks/issues/2789
Package Compatibility
| Package | Known Issues | Recommended Action |
|---|---|---|
| scikit-learn | neg_log_loss scorer incompatible in newer versions | Use make_scorer with explicit parameters |
| transformers/optimum | Model download issues with latest versions | Pin compatible versions |
| anywidget | Version conflicts affecting HTML display | Install specific compatible versions |
Source: deepchecks/issues/2806, deepchecks/issues/2630
Verifying Installation
Run this simple verification:
import deepchecks
# Verify core installation
print(f"Deepchecks version: {deepchecks.__version__}")
# Check available modules
from deepchecks import tabular, vision, nlp
# Run a simple check
from deepchecks.tabular.datasets.classification import iris
train, test = iris.load_data(data_format='Dataset', as_train_test=True)
print(f"Loaded Iris dataset: {train.n_samples} train, {test.n_samples} test samples")
Next Steps
After installation, explore:
| Topic | Description |
|---|---|
| Tabular Checks | Individual validation checks for tabular data |
| Suites | Pre-built validation suites |
| Vision | Image and computer vision validation |
| NLP | Text and NLP validation |
| Integrations | CI/CD and MLOps integrations |
See Also
- Tabular Module Documentation - Comprehensive tabular validation guide
- Built-in Datasets - Available sample datasets for testing
- Supported Models - Model compatibility information
- Contributing Guide - How to contribute to Deepchecks
Core Architecture
Related topics: Checks & Suites Framework, Serialization & Output Formats, Creating Custom Checks
Overview
The Deepchecks Core Architecture provides the foundational building blocks for model validation, data integrity checks, and testing workflows across all supported data modalities (tabular, vision, NLP). The architecture is designed around a Check abstraction that encapsulates validation logic, a Condition system for defining pass/fail thresholds, and a Suite orchestration mechanism for running multiple checks together.
The core module establishes the protocol for model interfaces, defines the check lifecycle, manages execution context, and provides utilities for serialization, logging, and environment detection. This architecture enables Deepchecks to support diverse ML frameworks while maintaining a consistent API for users.
Source: deepchecks/utils/typing.py:1-30
Checks & Suites Framework
Related topics: Core Architecture, Tabular Data Validation, NLP Validation, Computer Vision Validation, Creating Custom Checks
Overview
The Checks & Suites Framework is the foundational architecture of Deepchecks, providing a standardized mechanism for validating machine learning models, data, and pipelines. This framework enables domain-specific validation through a pluggable check system organized into suites that can be executed together or individually.
Checks are atomic validation units that evaluate specific aspects of ML systems, such as data integrity, model performance, or drift detection. Suites aggregate multiple checks into cohesive validation pipelines that can be run as a whole or configured with custom conditions.
Source: deepchecks/core/checks.py:1-50
Architecture Overview
graph TD
A[User Code] --> B[Suite or Check]
B --> C[Check.run]
C --> D[Check Result]
D --> E[Conditions Evaluation]
E --> F[Condition Results]
F --> G[Suite Result]
G --> H[Output: Display/JSON/HTML]
I[Domain: Tabular] --> J[BaseCheck]
K[Domain: NLP] --> J
L[Domain: Vision] --> J
J --> M[BaseCheckResult]
Core Components
Check Base Classes
The framework defines a hierarchical class structure with domain-specific base classes that inherit from a common core.
| Component | File | Purpose |
|---|---|---|
| BaseCheck | deepchecks/core/checks.py | Core abstract check implementation |
| BaseCheckResult | deepchecks/core/checks.py | Base result container |
| TabularBaseCheck | deepchecks/tabular/base_checks.py | Tabular domain checks |
| NLPCBBaseCheck | deepchecks/nlp/base_checks.py | NLP domain checks |
| VisionBaseCheck | deepchecks/vision/base_checks.py | Vision domain checks |
Source: deepchecks/core/checks.py:100-150
Model Protocol Definitions
Checks interact with models through standardized protocols defined in typing.py:
@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...
@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...
Source: deepchecks/utils/typing.py:50-70
Check Structure
Check Lifecycle
stateDiagram-v2
[*] --> Initialization
Initialization --> Configuration: Set conditions
Configuration --> Execution: run() called
Execution --> Computation: compute()
Computation --> ResultCreation: Create CheckResult
ResultCreation --> ConditionEvaluation: Evaluate conditions
ConditionEvaluation --> [*]
Check Result Structure
Each check produces a CheckResult containing:
| Field | Type | Description |
|---|---|---|
| value | Any | Primary computed value |
| display | List | Visualization elements |
| conditions_results | List[ConditionResult] | Evaluated conditions |
| header | str | Check name/identifier |
| reduce_output | Any | Aggregated value for suites |
Source: deepchecks/core/checks.py:200-280
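The field layout in the table can be pictured as a small data container. The dataclasses below are a simplified stand-in, not the library's `CheckResult`: they omit `reduce_output`, and the names `CheckResultSketch`, `ConditionResultSketch`, and `all_conditions_passed` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ConditionResultSketch:
    """Outcome of one pass/fail rule applied to a check's value."""
    name: str
    passed: bool
    details: str = ''


@dataclass
class CheckResultSketch:
    """Container holding what a check computed and how its conditions fared."""
    value: Any
    header: str
    display: List[Any] = field(default_factory=list)
    conditions_results: List[ConditionResultSketch] = field(default_factory=list)

    def all_conditions_passed(self) -> bool:
        return all(c.passed for c in self.conditions_results)


result = CheckResultSketch(value={'null_ratio': 0.02}, header='Null Ratio')
result.conditions_results.append(
    ConditionResultSketch(name='Null ratio below 5%', passed=result.value['null_ratio'] < 0.05)
)
```

Keeping the raw `value` separate from the evaluated conditions means the same computed result can be re-checked against different thresholds without rerunning the check.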
Suite Framework
Suites organize checks into logical groupings for comprehensive validation. Each domain (tabular, NLP, vision) provides pre-built suites.
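The core of a suite runner is a loop that keeps going when a single check raises, recording a failure entry in place of a result. The sketch below shows that pattern with plain functions and dictionaries; `run_suite_sketch` and the three example checks are hypothetical and far simpler than the library's `Suite` class.

```python
def run_suite_sketch(checks, rows):
    """Run every (name, function) pair; one failure never stops the rest."""
    outcomes = []
    for name, check_fn in checks:
        try:
            outcomes.append({'check': name, 'ok': True, 'value': check_fn(rows)})
        except Exception as exc:
            # Recorded instead of raised, playing the role of a failure entry.
            outcomes.append({'check': name, 'ok': False, 'error': str(exc)})
    return outcomes


def row_count(rows):
    return len(rows)


def missing_ratio(rows):
    cells = [cell for row in rows for cell in row]
    return sum(cell is None for cell in cells) / len(cells)


def broken_check(rows):
    raise RuntimeError('label column not found')


rows = [[1, None], [2, 5], [3, 6], [4, None]]
outcomes = run_suite_sketch(
    [('row_count', row_count), ('missing_ratio', missing_ratio), ('broken', broken_check)],
    rows,
)
```

Collecting failures rather than propagating them lets a report show every check's outcome in one pass, which matters when a suite contains dozens of checks.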
Suite Execution Flow
graph LR
A[Suite Instance] --> B[Initialize All Checks]
B --> C{For Each Check}
C -->|Success| D[Add CheckResult]
C -->|Failure| E[Add CheckFailure]
D --> F{More Checks?}
E --> F
F -->|Yes| C
F -->|No| G[Return SuiteResult]
Default Suite Composition
#### Tabular Default Suites
Source: deepchecks/tabular/suites/default_suites.py
| Suite Name | Purpose | Typical Checks |
|---|---|---|
| single_dataset_integrity | Data quality validation | Missing values, special characters, data duplications |
| train_test_validation | Train/test split validation | Feature drift, label drift, train-test leakage |
| model_evaluation | Model performance | Performance metrics, confusion matrix, class balance |
| full_suite | Comprehensive validation | All tabular checks |
#### NLP Default Suites
Source: deepchecks/nlp/suites/default_suites.py
| Suite Name | Purpose |
|---|---|
| train_test_validation | NLP-specific train/test validation |
| model_evaluation | NLP model performance checks |
| full_suite | Complete NLP validation pipeline |
#### Vision Default Suites
Source: deepchecks/vision/suites/default_suites.py
| Suite Name | Purpose |
|---|---|
| single_dataset_integrity | Image/data integrity validation |
| train_test_validation | Vision-specific drift detection |
| model_evaluation | Classification/detection performance |
| full_suite | Complete vision validation |
Conditions System
Conditions define pass/fail thresholds for check results and are central to automated validation.
Condition Structure
condition = {
'name': 'string',
'comparison_type': 'operator_type',
'operator': 'comparator',
'value': threshold
}
Supported Condition Operators
| Operator | Description | Example |
|---|---|---|
| greater_than | Value > threshold | Score > 0.8 |
| less_than | Value < threshold | Drift < 0.2 |
| greater_than_or_equal | Value >= threshold | Accuracy >= 0.9 |
| between | Threshold1 <= Value <= Threshold2 | 0.1 <= Drift <= 0.3 |
Source: deepchecks/core/checks.py:300-400
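One way to picture the operator table is as a lookup from operator name to comparison function. The sketch below evaluates dictionaries shaped like the condition structure shown earlier; the `OPERATORS` mapping and `evaluate_condition_sketch` are hypothetical, and in the library conditions are normally attached through `add_condition_...` methods on a check instead.

```python
import operator

# Operator names from the table mapped to plain comparison functions.
OPERATORS = {
    'greater_than': operator.gt,
    'less_than': operator.lt,
    'greater_than_or_equal': operator.ge,
    'between': lambda value, bounds: bounds[0] <= value <= bounds[1],
}


def evaluate_condition_sketch(condition, value):
    """Return True when `value` satisfies the condition dictionary."""
    compare = OPERATORS[condition['operator']]
    return compare(value, condition['value'])


drift_ok = evaluate_condition_sketch(
    {'name': 'Drift below 0.2', 'operator': 'less_than', 'value': 0.2}, 0.15
)
accuracy_ok = evaluate_condition_sketch(
    {'name': 'Accuracy at least 0.9', 'operator': 'greater_than_or_equal', 'value': 0.9}, 0.87
)
in_band = evaluate_condition_sketch(
    {'name': 'Drift in band', 'operator': 'between', 'value': (0.1, 0.3)}, 0.3
)
```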
Domain-Specific Implementations
Tabular Checks
Tabular checks inherit from TabularBaseCheck and work with pandas DataFrames:
class TabularBaseCheck(BaseCheck, RunMonitor):
    """Base class for Tabular checks."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
Common tabular check categories:
- Data Integrity: Missing values, outliers, duplicates
- Train-Test Drift: Feature drift, label drift, concept drift
- Model Performance: Metrics evaluation, confusion matrix analysis
- Feature Validation: Feature importance, correlation analysis
Source: deepchecks/tabular/base_checks.py:1-80
NLP Checks
NLP checks inherit from NLPCBBaseCheck and work with text data:
class NLPCBBaseCheck(BaseCheck, RunMonitor):
    """Base class for Natural Language Processing checks."""
Common NLP check categories:
- Text Statistics: Token length, vocabulary size
- Text Drift: Property drift, embedding drift
- Label Validation: Label distribution, annotation quality
Source: deepchecks/nlp/base_checks.py:1-80
Vision Checks
Vision checks inherit from VisionBaseCheck and work with image data:
class VisionBaseCheck(BaseCheck, RunMonitor):
    """Base class for Vision checks."""
Common vision check categories:
- Image Statistics: Brightness, contrast, aspect ratio
- Image Drift: Property drift, dataset drift
- Model Performance: Classification metrics, detection accuracy
Source: deepchecks/vision/base_checks.py:1-80
Usage Patterns
Running a Single Check
from deepchecks.tabular.checks import FeatureDrift
# Initialize check
check = FeatureDrift()
# Run with data
result = check.run(train_dataset=train_ds, test_dataset=test_ds)
# Display results
result.show()
Running a Suite
from deepchecks.tabular.suites import full_suite
# Get default suite
suite = full_suite()
# Run suite
result = suite.run(
train_dataset=train_ds,
test_dataset=test_ds,
model=trained_model
)
# Display results
result.show()
Customizing with Conditions
from deepchecks.tabular.checks import FeatureDrift
# Add custom condition: fail when any feature's drift score reaches 0.1
check = FeatureDrift()
check.add_condition_drift_score_less_than(
    max_allowed_categorical_score=0.1,
    max_allowed_numeric_score=0.1,
)
# Run with condition
result = check.run(train_dataset=train_ds, test_dataset=test_ds, model=model)
Saving Results
# Save as HTML
result.save_as_html('report.html')
# Save as JSON
json_str = result.to_json()
# Load from JSON
from deepchecks.utils.json_utils import from_json
loaded_result = from_json(json_str)
Note: Users have reported blank HTML pages when saving reports with save_as_html() under certain versions of anywidget. The reports are tracked in issues #2794 (anywidget module not registered) and #2803 (blank page after save_as_html()).
Configuration Options
Check Configuration Parameters
| Parameter | Description | Default |
|---|---|---|
n_top_columns | Number of top columns to display | 10 |
n_top_samples | Number of samples for display | 5 |
aggregation_method | Method for aggregating multi-class results | 'mean' |
Suite Configuration Parameters
| Parameter | Description | Default |
|---|---|---|
conditions | List of condition configurations | [] |
include_random_samples | Include random samples in output | True |
random_samples | Number of random samples | 3 |
Source: deepchecks/core/checks.py:400-500
Result Aggregation
When checks are run as part of a suite, individual results can be aggregated using the reduce mechanism.
graph TD
A[Multiple CheckResults] --> B[Reduce Classes]
B --> C[column_importance_sorter_dict]
B --> D[column_importance_sorter_df]
B --> E[CategoryReducerAgg]
C --> F[Aggregated Output]
D --> F
E --> F
Source: deepchecks/core/reduce_classes.py
Common Issues and Troubleshooting
Model Validation Errors
If you encounter ModelValidationError, ensure your model implements the required interface:
from deepchecks.utils.typing import BasicModel
# Check if model is valid
assert isinstance(your_model, BasicModel)
assert hasattr(your_model, 'predict')
Condition Evaluation Failures
When conditions fail to evaluate:
- Check that the check produces the expected value type
- Verify condition thresholds are appropriate for your data
- Review the condition's expected value format
Serialization Issues
When serializing results to JSON:
Known Issue: The WeakSegmentsPerformance check produces nested structures that may not serialize correctly. See issue #2804 for details.
Scorer Compatibility
Some custom scorers may be incompatible with newer scikit-learn versions. The neg_log_loss scorer has known issues with make_scorer parameters as documented in issue #2806.
Integration with CI/CD
Programmatic Suite Execution
from deepchecks.tabular.suites import full_suite
def run_validation(train_ds, test_ds, model):
    """Run the full suite and raise if any condition fails."""
    suite = full_suite()
    result = suite.run(
        train_dataset=train_ds,
        test_dataset=test_ds,
        model=model
    )
    # Fail if any condition fails
    if not result.passed():
        raise ValueError("Validation suite failed")
    return result
Output Formats
| Format | Method | Use Case |
|---|---|---|
| Interactive Display | result.show() | Jupyter notebooks, GUI |
| HTML Report | result.save_as_html(path) | Static reports, sharing |
| JSON | result.to_json() | CI/CD integration, automation |
See Also
Source: https://github.com/deepchecks/deepchecks / Human Manual
Serialization & Output Formats
Related topics: Core Architecture
Overview
Deepchecks can serialize validation results so that they can be stored, shared, and consumed by other tools. The supported outputs are JSON for programmatic access, HTML for human-readable reports, and logging to Weights & Biases.
The serialization subsystem aims to capture the complete validation result (check outputs, conditions, metrics, and metadata) in a form that can be reliably stored, transmitted, and reconstructed.
graph TD
A[CheckResult / SuiteResult] --> B{Serialization Request}
B --> C[JSON Format]
B --> D[HTML Format]
B --> E[W&B Integration]
C --> F[from_json]
D --> G[save_as_html]
E --> H[WandbLogger]
F --> I[Reconstructed Objects]
Output Format Types
JSON Serialization
JSON is the primary machine-readable format for Deepchecks outputs. Both individual CheckResult objects and SuiteResult objects support JSON serialization through their to_json() method.
The JSON output preserves the complete structure of validation results including:
| Component | Description | Data Type |
|---|---|---|
| type | Object type identifier | String |
| check_name | Name of the validation check | String |
| value | Check-specific output data | Various |
| conditions_results | Condition evaluation outcomes | List |
| have_passed | Overall pass/fail status | Boolean |
| metadata | Additional context and parameters | Dict |
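To make the table concrete, here is a small sketch that reads a payload shaped like the one described and lists the failing conditions. The payload is hand-written for illustration, and the keys inside each condition entry (`name`, `category`, `details`) are assumptions based on the tables in this manual; inspect a real to_json() output before relying on them.

```python
import json

# Hand-written payload that only mimics the fields listed in the table.
payload = json.dumps({
    "type": "CheckResult",
    "check_name": "Example Check",
    "value": {"score": 0.93},
    "conditions_results": [
        {"name": "score is above 0.9", "category": "PASS", "details": "0.93"},
        {"name": "score is above 0.95", "category": "FAIL", "details": "0.93"},
    ],
})


def failed_conditions(raw: str) -> list:
    """Return the names of conditions whose category is FAIL."""
    data = json.loads(raw)
    return [
        cond["name"]
        for cond in data.get("conditions_results", [])
        if cond.get("category") == "FAIL"
    ]


failures = failed_conditions(payload)
```

A CI job could fail the build whenever `failures` is non-empty, without importing Deepchecks in the reporting step.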
Deserialization
Use the from_json() utility function to reconstruct objects from JSON:
from deepchecks.utils.json_utils import from_json
# Load from JSON string or dictionary
result = from_json(json_data)
Source: deepchecks/utils/json_utils.py:24-49
The from_json function handles type dispatch automatically, returning either a BaseCheckResult or SuiteResult based on the JSON's type field:
def from_json(json_dict: t.Union[str, t.Dict]) -> t.Union[BaseCheckResult, SuiteResult]:
    if isinstance(json_dict, str):
        json_dict = jsonpickle.loads(json_dict)
    json_type = json_dict['type']
    if 'Check' in json_type:
        return BaseCheckResult.from_json(json_dict)
    if json_type == 'SuiteResult':
        return SuiteResult.from_json(json_dict)
    raise ValueError('Expected json object to be one of '
                     '[CheckFailure, CheckResult, SuiteResult]')
Source: deepchecks/utils/json_utils.py:24-49
HTML Reports
HTML output provides self-contained, interactive visualization of validation results. Use the save_as_html() method to generate standalone HTML files:
result.save_as_html('validation_report.html')
HTML reports include embedded styles, JavaScript for interactivity, and all necessary assets for offline viewing. The output leverages anywidget for interactive visualizations.
Common HTML Output Issues
Users have reported issues with HTML report generation:
| Issue | Description | Reference |
|---|---|---|
| Blank page | HTML renders empty when opened in browser | Issue #2803 |
| anywidget errors | Failed to load model class from anywidget module | Issue #2794 |
| Widget state | Interactive elements fail to initialize | Issue #2794 |
When encountering blank HTML pages, ensure:
- The anywidget package is properly installed (pip install anywidget)
- The browser console shows no JavaScript errors
- Assets are correctly embedded in the saved file
Weights & Biases (W&B) Integration
Deepchecks supports direct integration with Weights & Biases for experiment tracking. Check results can be logged to W&B using the built-in WandbLogger:
from deepchecks.core.serialization.check_result.wandb import WandbLogger
logger = WandbLogger(project='ml-validation')
logger.log_check_result(result)
Check Result Structure
BaseCheckResult Components
The BaseCheckResult class provides the foundation for all check outputs:
class BaseCheckResult:
    value: Any                  # Primary output data
    header: str                 # Human-readable title
    display: List[Any]          # Visual elements for output
    conditions_results: List[ConditionResult]
    extra_data: Dict            # Supplementary information
Source: deepchecks/core/check_result.py
Condition Results
Conditions represent automated pass/fail criteria defined for checks. Each condition result includes:
| Field | Type | Description |
|---|---|---|
| name | String | Condition identifier |
| category | String | PASS, FAIL, or WARN |
| details | String | Explanation of the result |
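The relationship between a check value and a condition result can be sketched without Deepchecks at all. The example below is an illustrative model only: `Category`, `Outcome`, and `make_min_threshold` are hypothetical names, not library classes. It also shows why a value of the wrong type prevents a condition from being evaluated.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Category(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    WARN = "WARN"


@dataclass
class Outcome:
    name: str
    category: Category
    details: str


def make_min_threshold(name: str, threshold: float,
                       warn_margin: float = 0.0) -> Callable[[float], Outcome]:
    """Build a condition that passes when the value is at least `threshold`."""
    def condition(value: float) -> Outcome:
        if not isinstance(value, (int, float)):
            # A mismatched value type is a common reason a condition cannot run.
            raise TypeError(f"{name}: expected a number, got {type(value).__name__}")
        if value >= threshold:
            return Outcome(name, Category.PASS, f"{value} >= {threshold}")
        if value >= threshold - warn_margin:
            return Outcome(name, Category.WARN,
                           f"{value} is within {warn_margin} of {threshold}")
        return Outcome(name, Category.FAIL, f"{value} < {threshold}")
    return condition


accuracy_ok = make_min_threshold("accuracy >= 0.8", 0.8, warn_margin=0.05)
outcomes = [accuracy_ok(v) for v in (0.9, 0.77, 0.5)]
```

The three sample values land in the PASS, WARN, and FAIL categories respectively, while passing a dictionary instead of a number raises a `TypeError`.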
Serialization Data Flow
sequenceDiagram
participant User
participant CheckResult
participant Serializer
participant Output
User->>CheckResult: to_json()
CheckResult->>Serializer: Serialize value
Serializer->>Serializer: Handle complex types
Serializer->>Output: JSON string
User->>CheckResult: save_as_html()
CheckResult->>Serializer: Generate HTML
Serializer->>Output: HTML file
Known Limitations and Issues
JSON Serialization Concerns
Several community-reported issues relate to JSON serialization:
DataFrame in Results
When checks like WeakSegmentsPerformance return DataFrames as part of their value, the to_json() method may flatten the output incorrectly. The result's value field containing both DataFrames and scalar values gets passed to the serializer as a dictionary, causing nested structures to be flattened.
Reference: Issue #2804
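Until the issue is resolved, one possible workaround is to serialize the raw value yourself so that each nested component keeps its own key. The sketch below assumes the value is a dictionary whose entries may be DataFrame-like objects exposing `to_dict(orient="records")`, as pandas DataFrames do. The `to_jsonable` helper and the `FakeFrame` stand-in are hypothetical, and the key names follow the description of this issue elsewhere in this manual.

```python
import json
from typing import Any


def to_jsonable(value: Any) -> Any:
    """Recursively convert a nested value into JSON-friendly types.

    Objects exposing to_dict(orient=...) are stored as a list of row
    records, so they keep their own key in the output.
    """
    if hasattr(value, "to_dict"):
        return value.to_dict(orient="records")
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


class FakeFrame:
    """Stand-in for a DataFrame so the sketch runs without pandas."""

    def __init__(self, rows):
        self._rows = rows

    def to_dict(self, orient="records"):
        return list(self._rows)


value = {
    "weak_segments": FakeFrame([{"feature": "age", "score": 0.61}]),
    "avg_score": 0.82,
}
encoded = json.dumps(to_jsonable(value))
```

With a real result, the same helper would be applied to `result.value` rather than to a hand-built dictionary.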
Precision Control
Currently, the precision of floating-point values in DataFrame serialization is fixed at 2 decimal places and cannot be configured by users. This may not be suitable for values requiring higher precision.
Reference: Issue #2598
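If higher or lower precision is needed, one option is to round the data yourself before writing it out. This is a minimal sketch that works on plain Python structures you have extracted from a result; `round_floats` is a hypothetical helper and does not change how to_json() itself behaves.

```python
import json
from typing import Any


def round_floats(value: Any, ndigits: int) -> Any:
    """Return a copy of `value` with every float rounded to `ndigits`."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {k: round_floats(v, ndigits) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [round_floats(v, ndigits) for v in value]
    return value


raw_value = {"avg_score": 0.8234567, "rows": [{"score": 0.6123456}]}
encoded = json.dumps(round_floats(raw_value, ndigits=4))
```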
HTML Output Issues
Blank Page After Save
Users running save_as_html() have reported blank pages when opening the generated HTML file. This typically occurs when:
- The anywidget package version is incompatible
- JavaScript execution is blocked in the browser
- Embedded assets fail to load
Reference: Issue #2803
Missing Markdown Export
Deepchecks does not currently support exporting validation results as Markdown. Users working with CML (Continuous Machine Learning) have requested this feature so that reports can be published in Markdown form from CI/CD pipelines.
Reference: Issue #1290
Common Patterns
Saving Results in CI/CD
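from deepchecks.tabular import Dataset
from deepchecks.tabular.checks import DataDuplicates
# Run validation (df is a pandas DataFrame loaded earlier)
dataset = Dataset(df, label='target')
result = DataDuplicates().run(dataset)
# Save for CI/CD artifacts
result.save_as_html('validation-report.html')
# to_json() returns a string, so write it to a file explicitly
with open('validation-result.json', 'w') as f:
    f.write(result.to_json())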
Reconstructing Results
from deepchecks.utils.json_utils import from_json
# Load previously saved result
with open('validation-result.json', 'r') as f:
    saved_result = from_json(f.read())
# Access results programmatically
print(saved_result.passed_conditions())
Working with Complex Outputs
For checks that return complex nested data structures:
result = check.run(dataset)
# Access structured data
if hasattr(result, 'value'):
    if isinstance(result.value, dict):
        for key, value in result.value.items():
            # Handle each component
            pass
Best Practices
- Version Compatibility: Ensure consistent Deepchecks versions when sharing serialized results between environments.
- Large Results: For checks producing large outputs, consider using to_json(), which is more compact than HTML for storage.
- Error Handling: Wrap deserialization in try-except blocks to handle format changes between versions.
- HTML for Review: Use HTML reports for human review; use JSON for programmatic processing and CI/CD integration.
- Asset Management: HTML reports are self-contained but may require anywidget for full interactivity in Jupyter environments.
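The error-handling advice above can be sketched with the standard library alone. The example validates a saved payload before it is handed to any deserializer; `load_result_payload` is a hypothetical helper, and the set of accepted type names is taken from the error message in the from_json excerpt shown earlier in this section.

```python
import json
from typing import Optional

KNOWN_TYPES = {"CheckResult", "CheckFailure", "SuiteResult"}


def load_result_payload(raw: str) -> Optional[dict]:
    """Parse a saved result and return it only if it looks usable."""
    try:
        data = json.loads(raw)
        result_type = data["type"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Malformed JSON, a missing "type" key, or a non-dict payload.
        print(f"Could not read saved result: {exc!r}")
        return None
    if result_type not in KNOWN_TYPES:
        print(f"Unexpected result type: {result_type!r}")
        return None
    return data


ok = load_result_payload('{"type": "SuiteResult", "results": []}')
broken = load_result_payload("{not valid json")
unknown = load_result_payload('{"type": "SomethingElse"}')
```

Returning `None` instead of raising keeps a reporting job alive when an artifact from an older version cannot be read; a stricter pipeline might prefer to re-raise.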
See Also
- Checks Documentation - Individual check documentation
- Suite Configuration - Organizing multiple checks
- Integration Guides - CI/CD and experiment tracking integrations
- Troubleshooting Guide - Common serialization problems
Source: https://github.com/deepchecks/deepchecks / Human Manual
Tabular Data Validation
Related topics: Checks & Suites Framework, Creating Custom Checks
Tabular Data Validation is the core module of Deepchecks that provides comprehensive validation capabilities for tabular datasets and machine learning models. This module enables data scientists and ML engineers to validate data integrity, evaluate model performance, and detect train-test distribution shifts through a suite of automated checks organized into logical categories.
Overview
The tabular validation system in Deepchecks is designed to catch common data quality issues and model problems before they impact production systems. It supports both classification and regression tasks across multiple domain areas, providing consistent validation patterns regardless of the specific use case.
Key Capabilities
- Data Integrity Checks: Validate dataset structure, detect anomalies, and identify data quality issues
- Model Evaluation: Assess model performance using various metrics and comparison baselines
- Train-Test Validation: Detect distribution shifts and validate consistency between training and test datasets
- Feature Analysis: Analyze feature properties, correlations, and importance
- Label Verification: Validate label distribution and detect labeling issues
The system is built around three core abstractions: Dataset, Context, and Check, which work together to provide a flexible and extensible validation framework. Source: deepchecks/tabular/__init__.py
Architecture
Core Components
graph TD
A[User Data] --> B[Dataset]
A --> C[Model]
B --> D[Context]
C --> D
D --> E[Check]
E --> F[CheckResult]
F --> G[Display / Export]
B --> H[Data Integrity Checks]
D --> I[Model Evaluation Checks]
D --> J[Train-Test Validation Checks]
H -.-> E
I -.-> E
J -.-> E
Data Flow
sequenceDiagram
participant User
participant Dataset
participant Context
participant Check
participant Result
User->>Dataset: Create Dataset
User->>Context: Create Context with Dataset(s)
User->>Context: Optional: Add Model
Context->>Check: Run Check
Check->>Dataset: Access data via Context
Check->>Check: Compute validation logic
Check->>Result: Generate CheckResult
Result->>User: Display / Export
Dataset Class
The Dataset class is the fundamental data container for tabular validation. It wraps pandas DataFrames and provides metadata about the dataset structure, including feature types and label column information.
Source: deepchecks/tabular/dataset.py
Creating a Dataset
from deepchecks.tabular import Dataset
# Basic dataset creation
ds = Dataset(df)
# With explicit label column
ds = Dataset(df, label='target_column')
# With feature type hints
ds = Dataset(df,
             label='target',
             features=['feature1', 'feature2'],
             cat_features=['categorical_column'])
Dataset Properties
| Property | Type | Description |
|---|---|---|
| df | pd.DataFrame | The underlying DataFrame |
| label_col | str | Name of the label column |
| features | List[Hashable] | List of feature column names |
| cat_features | List[Hashable] | List of categorical feature names |
| index_col | Optional[Hashable] | Optional index column |
| datetime_col | Optional[Hashable] | Optional datetime column |
Feature Type Inference
Deepchecks automatically infers feature types based on column data. The system uses the following inference rules:
Source: deepchecks/utils/type_inference.py
def infer_numerical_features(df: pd.DataFrame) -> List[Hashable]:
    """Infers which features are numerical."""
    # Columns with numeric dtype are inferred as numerical
    # Object columns may still contain numeric data
def infer_categorical_features(df: pd.DataFrame) -> List[Hashable]:
    """Infers which features are categorical."""
    # String columns with few unique values are typically categorical
    # Boolean columns are treated as categorical
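The rules in the comments above can be illustrated with a small standalone sketch. It is not the algorithm in type_inference.py: the `infer_column_kinds` helper, the plain-dictionary table, and the `max_categories` threshold are assumptions made for the example.

```python
from typing import Dict, List, Sequence


def infer_column_kinds(
    columns: Dict[str, Sequence[object]],
    max_categories: int = 3,
) -> Dict[str, List[str]]:
    """Split columns into numerical, categorical, and other."""
    kinds: Dict[str, List[str]] = {"numerical": [], "categorical": [], "other": []}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        if not present:
            kinds["other"].append(name)
            continue
        all_bool = all(isinstance(v, bool) for v in present)
        # bool is a subclass of int, so exclude it from the numeric test.
        all_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        if all_numeric:
            kinds["numerical"].append(name)
        elif all_bool or len(set(present)) <= max_categories:
            # Booleans and low-cardinality values are treated as categorical.
            kinds["categorical"].append(name)
        else:
            kinds["other"].append(name)
    return kinds


table = {
    "age": [31, 45, None, 27],
    "is_member": [True, False, True, True],
    "plan": ["basic", "pro", "basic", "team"],
    "comment": ["great", "too slow", "ok", "love it"],
}
kinds = infer_column_kinds(table)
```

When automatic inference gives the wrong answer for a column, passing `cat_features` explicitly to `Dataset`, as shown earlier, avoids the ambiguity.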
Context Class
The Context class serves as the orchestration layer that manages datasets, models, and configuration for validation runs. It provides a unified interface for checks to access the data they need.
Source: deepchecks/tabular/context.py
Context Creation
from deepchecks.tabular import Context, Dataset
# Training and test datasets
train_ds = Dataset(train_df, label='target')
test_ds = Dataset(test_df, label='target')
# Create context with both datasets
context = Context(
    train_dataset=train_ds,
    test_dataset=test_ds,
    model=trained_model  # Optional
)
Context Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| train_dataset | Dataset | Yes | Training dataset |
| test_dataset | Dataset | No | Test dataset for comparison |
| model | BasicModel | No | Trained model to validate |
| model_name | str | No | Name identifier for the model |
| scorers | Dict | No | Custom metric scorers |
| scorers_required_average | Dict | No | Scorers requiring averaging |
Model Base Classes
Deepchecks defines protocol classes that describe the interface models must implement to work with the validation framework. These are defined using Python's Protocol for structural subtyping.
Source: deepchecks/utils/typing.py
BasicModel Protocol
@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""
    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...
ClassificationModel Protocol
@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""
    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...
Supported Model Integrations
| Model Type | Required Methods | Use Cases |
|---|---|---|
| Classification | predict, predict_proba | Class probability checks, ROC curves |
| Regression | predict only | Residual analysis, performance metrics |
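Because these are runtime-checkable protocols, any object with the right methods qualifies; no inheritance is required. The sketch below reproduces the pattern with local stand-in protocols (`HasPredict` and `HasPredictProba` are hypothetical names used so that the example runs without Deepchecks installed).

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class HasPredict(Protocol):
    def predict(self, X) -> List: ...


@runtime_checkable
class HasPredictProba(HasPredict, Protocol):
    def predict_proba(self, X) -> List: ...


class MeanRegressor:
    """Toy regressor: always predicts the same constant."""

    def predict(self, X):
        return [0.0 for _ in X]


class CoinFlipClassifier:
    """Toy classifier exposing both methods."""

    def predict(self, X):
        return [0 for _ in X]

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


checks = {
    "regressor_is_basic": isinstance(MeanRegressor(), HasPredict),
    "regressor_is_classifier": isinstance(MeanRegressor(), HasPredictProba),
    "classifier_is_classifier": isinstance(CoinFlipClassifier(), HasPredictProba),
    "object_is_basic": isinstance(object(), HasPredict),
}
```

Note that `isinstance` against a runtime-checkable protocol only verifies that the methods exist; it does not check their signatures or return types.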
Check Categories
Checks in Deepchecks are organized into three main categories that address different validation concerns.
Source: deepchecks/tabular/checks/data_integrity/__init__.py
Data Integrity Checks
Data integrity checks validate the structure and quality of datasets. These checks can run on a single dataset without requiring a model or train-test comparison.
Source: deepchecks/tabular/checks/data_integrity/__init__.py
Available Checks:
- StringMismatchComparison - Detect string format inconsistencies
- IsSingleValue - Identify columns with only one unique value
- DataDuplicates - Find duplicate rows
- MixedNulls - Detect mixed null value representations
- StringLengthOutOfBounds - Find strings outside expected length ranges
Model Evaluation Checks
Model evaluation checks assess model performance using various metrics and comparison techniques. These checks require a trained model.
Source: deepchecks/tabular/checks/model_evaluation/__init__.py
Available Checks:
- TrainTestPerformance - Compare model performance across train/test
- PerformanceReport - Generate comprehensive performance metrics
- ConfusionMatrixReport - Display confusion matrix for classification
- ClassPerformance - Analyze per-class performance
- WeakSegmentsPerformance - Identify underperforming data segments
Train-Test Validation Checks
Train-test validation checks detect distribution shifts and inconsistencies between training and test datasets.
Source: deepchecks/tabular/checks/train_test_validation/__init__.py
Available Checks:
- FeatureDrift - Detect feature distribution changes
- LabelDrift - Detect label distribution changes
- TrainTestFeatureDrift - Compare feature distributions
- IndexTrainTestLeakage - Detect index-based data leakage
- MultivariateDrift - Detect overall distribution shifts
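To give a feel for what a drift score measures, the sketch below computes the Population Stability Index (PSI) between a training and a test sample of one numeric feature. PSI is used here as a generic, well-known drift measure; this is not a claim about which statistic a particular Deepchecks check uses, and the `histogram` and `psi` helpers are hypothetical.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], edges: List[float]) -> List[float]:
    """Fraction of values in each bin; the last bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts) or 1
    return [c / total for c in counts]


def psi(train: Sequence[float], test: Sequence[float],
        n_bins: int = 4, eps: float = 1e-6) -> float:
    """Population Stability Index between two numeric samples."""
    lo = min(min(train), min(test))
    hi = max(max(train), max(test))
    if lo == hi:
        return 0.0  # Both samples are constant and identical.
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins)] + [hi]
    expected = histogram(train, edges)
    actual = histogram(test, edges)
    # eps avoids log(0) when a bin is empty in one of the samples.
    return sum(
        (a - e) * math.log((a + eps) / (e + eps))
        for e, a in zip(expected, actual)
    )


same = psi([1, 2, 3, 4], [1, 2, 3, 4])
shifted = psi([1, 2, 3, 4, 5, 6, 7, 8], [5, 6, 7, 8, 8, 8, 8, 8])
```

Identical samples give a score of zero, and the score grows as the test distribution moves away from the training distribution.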
Running Checks
Single Check Execution
from deepchecks.tabular.checks.data_integrity import DataDuplicates
# Create and run a single check
check = DataDuplicates()
result = check.run(dataset=my_dataset)
result.show() # Display results
Using Suites
Suites group multiple checks together for comprehensive validation:
from deepchecks.tabular.suites import data_integrity
# Run predefined suite
suite = data_integrity()
result = suite.run(train_dataset=my_dataset)
Custom Scorers
Deepchecks supports custom scikit-learn compatible scorers for model evaluation:
Source: deepchecks/tabular/metric_utils/scorers.py
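from sklearn.metrics import make_scorer, log_loss
from deepchecks.tabular import Context
# Custom scorer for model evaluation.
# needs_proba is only accepted by older scikit-learn releases;
# newer releases use response_method='predict_proba' instead.
context = Context(
    train_dataset=train_ds,
    test_dataset=test_ds,
    model=model,
    scorers={
        'neg_log_loss': make_scorer(log_loss, greater_is_better=False, needs_proba=True)
    }
)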
Note: When using custom scorers with make_scorer, ensure compatibility with your scikit-learn version. Some older scorer configurations may not be compatible with newer scikit-learn versions. See Issue #2806 for known compatibility issues.
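One plausible source of such incompatibilities is that scikit-learn replaced the `needs_proba` argument of `make_scorer` with `response_method` in newer releases. The sketch below picks the right keyword by inspecting the function signature. `proba_scorer_kwargs` is a hypothetical helper, and the two stand-in functions only imitate the old and new signatures so that the example runs without scikit-learn installed.

```python
import inspect
from typing import Any, Callable, Dict


def proba_scorer_kwargs(make_scorer_fn: Callable[..., Any]) -> Dict[str, Any]:
    """Pick the keyword that requests predict_proba output.

    Newer scikit-learn releases accept response_method; older ones only
    know needs_proba. The choice is made from the function signature.
    """
    params = inspect.signature(make_scorer_fn).parameters
    if "response_method" in params:
        return {"response_method": "predict_proba"}
    return {"needs_proba": True}


# Stand-ins with the two signature styles (not the real make_scorer).
def old_style(score_func, *, greater_is_better=True, needs_proba=False, **kwargs):
    return None


def new_style(score_func, *, response_method=None, greater_is_better=True, **kwargs):
    return None


old_kwargs = proba_scorer_kwargs(old_style)
new_kwargs = proba_scorer_kwargs(new_style)
```

With scikit-learn installed, the intended use would be `make_scorer(log_loss, greater_is_better=False, **proba_scorer_kwargs(make_scorer))`.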
Feature Importance
Deepchecks provides utilities for extracting and using feature importance values from models.
Source: deepchecks/tabular/utils/feature_importance.py
Supported Importance Sources
| Source | Priority | Description |
|---|---|---|
| feature_importances_ attribute | 1 | Scikit-learn feature importances |
| coef_ attribute | 2 | Linear model coefficients |
| permutation | 3 | Computed permutation importance |
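The priority order in the table amounts to a simple fallback chain. The following sketch illustrates it with toy model classes; `resolve_importance` is a hypothetical helper, it assumes one-dimensional `coef_` values, and the permutation step is represented by a callable supplied by the caller rather than a real computation.

```python
from typing import Callable, List, Optional, Tuple


def resolve_importance(
    model: object,
    permutation_fallback: Optional[Callable[[object], List[float]]] = None,
) -> Tuple[str, List[float]]:
    """Return (source, importances) following the priority order above."""
    builtin = getattr(model, "feature_importances_", None)
    if builtin is not None:
        return "feature_importances_", [float(v) for v in builtin]
    coef = getattr(model, "coef_", None)
    if coef is not None:
        # Coefficient magnitude serves as a rough importance proxy.
        return "coef_", [abs(float(v)) for v in coef]
    if permutation_fallback is not None:
        return "permutation", permutation_fallback(model)
    raise ValueError("No importance source available for this model")


class TreeLike:
    feature_importances_ = [0.7, 0.3]


class LinearLike:
    coef_ = [-2.0, 0.5]


class Opaque:
    pass


sources = [
    resolve_importance(TreeLike()),
    resolve_importance(LinearLike()),
    resolve_importance(Opaque(), permutation_fallback=lambda m: [0.5, 0.5]),
]
```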
Usage
from deepchecks.tabular.utils.feature_importance import get_feature_importance
# Get feature importance from model
importance = get_feature_importance(model, dataset)
Task Type Detection
Deepchecks automatically detects the task type (classification or regression) based on the label column characteristics.
Source: deepchecks/utils/type_inference.py
from typing import Literal
import pandas as pd
from pandas.api.types import is_numeric_dtype
def infer_task_type(label_column: pd.Series) -> Literal['classification', 'regression']:
    """Infer task type based on label characteristics."""
    if is_numeric_dtype(label_column):
        # Check if unique values suggest classification
        n_unique = label_column.nunique()
        if n_unique <= 20:  # Threshold for classification
            return 'classification'
        return 'regression'
    # Non-numeric labels are always treated as classification
    return 'classification'
Built-in Datasets
Deepchecks includes utilities for loading example datasets for testing and demonstration purposes.
Source: deepchecks/tabular/datasets/__init__.py
Loading Example Data
from deepchecks.tabular.datasets.classification import iris
from deepchecks.tabular.datasets.regression import avocado
# Load classification dataset
train_ds, test_ds = iris.load_data()
# Load regression dataset
train_ds, test_ds = avocado.load_data()
Output and Export
Displaying Results
# Display in notebook
result.show()
# Display inside an iframe
result.show_in_iframe()
Exporting Results
# Save as HTML
result.save_as_html('report.html')
# Export to JSON
json_output = result.to_json()
Known Issue: When saving reports as HTML, ensure compatibility between deepchecks and anywidget versions. Blank pages may appear with certain version combinations. See Issue #2794 and Issue #2803.
Common Issues and Solutions
Pairwise Correlation Display
A known issue affects the accuracy of the conditions summary and heatmap for pairwise correlation displays. See Issue #2802 for details and proposed solutions.
JSON Serialization
When serializing WeakSegmentsPerformance results to JSON, the value field containing both weak_segments (DataFrame) and avg_score may be flattened. See Issue #2804.
Model Serialization Precision
Currently, the precision of values in DataFrame serialization is fixed at 2 decimal places and cannot be configured. See Feature Request #2598.
See Also
- Vision Data Validation - Validating image and computer vision data
- NLP Data Validation - Validating text and natural language processing data
- Custom Checks Guide - Creating custom validation checks
- Suite Configuration - Building and configuring validation suites
- API Reference - Complete API documentation
Source: https://github.com/deepchecks/deepchecks / Human Manual
NLP Validation
Related topics: Checks & Suites Framework
Overview
Deepchecks' NLP (Natural Language Processing) Validation module provides a comprehensive framework for validating text-based machine learning models and datasets. This module enables data scientists and ML engineers to perform rigorous checks on text data integrity, model evaluation, and train-test validation for NLP tasks.
The NLP validation framework is designed to work seamlessly with the broader Deepchecks ecosystem, providing specialized checks tailored to the unique characteristics of text data including tokenization, embeddings, and linguistic properties.
Source: deepchecks/nlp/__init__.py
Architecture
The NLP validation module follows a modular architecture with distinct components for data representation, context management, task configuration, and validation checks.
graph TD
A[NLP Validation Module] --> B[TextData]
A --> C[NLPContext]
A --> D[TaskType]
A --> E[NLP Checks]
B --> B1[Raw Text]
B --> B2[Metadata]
B --> B3[Embeddings Cache]
C --> C1[Single Dataset Mode]
C --> C2[Train-Test Mode]
E --> E1[Data Integrity]
E --> E2[Model Evaluation]
E --> E3[Train-Test Validation]
D --> D1[TextClassification]
D --> D2[TokenClassification]
D --> D3[SequenceClassification]
Core Components
| Component | Purpose | Source File |
|---|---|---|
| TextData | Represents text datasets with metadata and caching | text_data.py |
| NLPContext | Manages validation context for single or train-test scenarios | context.py |
| TaskType | Enum defining supported NLP task types | task_type.py |
| input_validations | Input validation utilities for NLP data | input_validations.py |
Source: deepchecks/nlp/__init__.py
TextData Class
The TextData class is the fundamental data structure for representing text datasets in Deepchecks NLP validation. It encapsulates raw text data, optional metadata, and computed properties such as embeddings.
Source: deepchecks/nlp/text_data.py
Constructor Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| raw_text | List[str] or pd.Series | Yes | List of text samples |
| task_type | TaskType | Yes | Type of NLP task |
| label | List or pd.Series | No | Labels for samples |
| metadata | pd.DataFrame | No | Additional metadata columns |
| embeddings_provider | object | No | Provider for computing embeddings |
| embeddings_cache_dir | str | No | Directory for caching embeddings |
Supported Task Types
The module supports three primary NLP task types:
| Task Type | Description | Use Case |
|---|---|---|
| TEXT_CLASSIFICATION | Single-label or multi-label classification | Sentiment analysis, topic classification |
| TOKEN_CLASSIFICATION | Per-token labeling | Named Entity Recognition (NER), Part-of-Speech tagging |
| SEQUENCE_CLASSIFICATION | Sequence-to-label tasks | Document classification |
Source: deepchecks/nlp/task_type.py
Data Integrity Validation
Input validation ensures that the provided data meets the requirements for NLP processing:
def validate_texts_not_empty(raw_text: t.List) -> None:
    """Validate that texts list is not empty."""
def validate_label_format(label, task_type: TaskType) -> None:
    """Validate label format matches task type requirements."""
Source: deepchecks/nlp/input_validations.py
Text Properties
Deepchecks NLP provides utilities for computing various text properties that can be used for drift detection and data integrity checks.
Source: deepchecks/nlp/utils/text_properties.py
Available Text Properties
| Property | Description | Category |
|---|---|---|
| TextLength | Number of characters in text | Basic |
| WordCount | Number of words in text | Basic |
| SentenceCount | Number of sentences | Basic |
| AverageWordLength | Average length of words | Basic |
| MaxWordLength | Length of longest word | Basic |
| Language | Detected language | Linguistic |
| Sentiment | Sentiment score | Linguistic |
Property Calculation
Text properties are calculated lazily and cached to optimize performance. The property calculator handles different scenarios:
def calculate_text_properties(
    raw_text: t.List[str],
    properties_list: t.List[str] = None,
    device: str = 'cpu'
) -> pd.DataFrame:
    """Calculate text properties for a list of texts."""
Note: GPU acceleration for property calculation is available for certain properties. Some community issues have reported limitations with GPU runtime optimization for drift checks. See Issue #2789 for details.
Source: deepchecks/nlp/utils/text_properties.py
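The basic properties listed earlier are simple enough to approximate by hand, which can help when sanity-checking reported values. The sketch below is a rough standalone approximation, not the implementation in text_properties.py; `basic_text_properties` is a hypothetical helper and its word-splitting rule (runs of word characters) is an assumption.

```python
import re
from typing import Dict, List


def basic_text_properties(texts: List[str]) -> List[Dict[str, float]]:
    """Compute a few length-based properties for each text sample."""
    rows = []
    for text in texts:
        words = re.findall(r"\w+", text)
        lengths = [len(w) for w in words]
        rows.append({
            "TextLength": len(text),
            "WordCount": len(words),
            "AverageWordLength": sum(lengths) / len(lengths) if lengths else 0.0,
            "MaxWordLength": max(lengths, default=0),
        })
    return rows


props = basic_text_properties(["Deepchecks validates text data", ""])
```

Empty strings are handled explicitly so that a missing sample yields zeros instead of a division error.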
Text Embeddings
Deepchecks supports text embeddings for semantic similarity and drift detection in NLP validation.
Source: deepchecks/nlp/utils/text_embeddings.py
Embedding Providers
The module supports multiple embedding providers through a unified interface:
| Provider | Description | Notes |
|---|---|---|
| transformers | Hugging Face Transformers models | Requires transformers package |
| sentence_transformers | Sentence-BERT models | Recommended for semantic similarity |
| sklearn | TF-IDF or other sklearn embedders | Built-in, no extra dependencies |
Embedding Configuration
class EmbeddingsCalculator:
    """Calculate embeddings for text data."""
    def __init__(
        self,
        provider: str = 'sentence_transformers',
        model_name: str = 'all-MiniLM-L6-v2',
        device: str = 'cpu'
    ):
        """Initialize embeddings calculator."""
Important: Some users have reported issues downloading property/embedding models with newer versions of transformers and optimum packages. See Issue #2630 for compatibility information.
Source: deepchecks/nlp/utils/text_embeddings.py
Validation Checks
Deepchecks NLP provides three categories of validation checks:
Source: deepchecks/nlp/checks/data_integrity/__init__.py
Data Integrity Checks
These checks validate the quality and consistency of text data before model training or evaluation.
| Check | Purpose | Source |
|---|---|---|
| TextPropertyOutliers | Detect text samples with unusual property values | data_integrity/__init__.py |
| SpecialCharacters | Identify samples with unexpected special characters | data_integrity/__init__.py |
| StringMismatch | Detect string formatting inconsistencies | data_integrity/__init__.py |
Model Evaluation Checks
These checks assess model performance and behavior on a labeled dataset.
Source: deepchecks/nlp/checks/model_evaluation/__init__.py
| Check | Purpose | Source |
|---|---|---|
| ConfusionMatrixReport | Display confusion matrix for classification | model_evaluation/__init__.py |
| ClassPerformance | Compare per-class model performance | model_evaluation/__init__.py |
| PredictionDrift | Detect drift in model predictions | model_evaluation/__init__.py |
Train-Test Validation Checks
These checks compare training and test datasets to detect distribution shift and potential data leakage.
Source: deepchecks/nlp/checks/train_test_validation/__init__.py
| Check | Purpose | Source |
|---|---|---|
| TextPropertyDrift | Detect drift in text properties | train_test_validation/__init__.py |
| PropertyLabelCorrelation | Check correlation between properties and labels | train_test_validation/__init__.py |
| TrainTestFeatureDrift | Detect feature distribution shift | train_test_validation/__init__.py |
NLPContext
The NLPContext class manages the validation workflow, handling both single-dataset and train-test validation scenarios.
Source: deepchecks/nlp/context.py
Context Modes
graph LR
A[NLPContext] --> B[Single Dataset Mode]
A --> C[Train-Test Mode]
B --> B1[train Dataset Only]
B --> B2[Data Integrity Checks]
B --> B3[Model Evaluation Checks]
C --> C1[train Dataset]
C --> C2[Test Dataset]
C --> C3[Train-Test Validation Checks]
Creating and Running a Context
from deepchecks.nlp import TextData, NLPContext
from deepchecks.nlp.checks import TextPropertyDrift
# Single dataset mode
context = NLPContext(train_dataset)
context.run()
# Train-test mode
context = NLPContext(train_dataset, test_dataset)
context.add_check(TextPropertyDrift())
context.run()
Source: deepchecks/nlp/context.py
Built-in Datasets
Deepchecks provides built-in NLP datasets for testing and learning purposes.
Source: deepchecks/nlp/datasets/__init__.py
Available Datasets
| Dataset | Task Type | Description |
|---|---|---|
| load_builtin_dataset() | Various | Load pre-configured NLP datasets |
| load_dataset_from_list() | Various | Create TextData from custom lists |
Visualization
NLP validation results can be visualized using Plotly-based plots that integrate with Jupyter notebooks and can be exported to HTML.
Source: deepchecks/nlp/utils/nlp_plot.py
Plot Types
| Plot Type | Purpose |
|---|---|
| Distribution plots | Display property or prediction distributions |
| Drift plots | Visualize train-test drift |
| Heatmaps | Show confusion matrices and correlations |
Note: There have been reported issues with saving reports as HTML in certain environments. Users have reported blank pages when using save_as_html(). See Issue #2794 and Issue #2803 for workarounds.
Usage Patterns
Basic Single-Dataset Validation
from deepchecks.nlp import TextData, NLPContext
from deepchecks.nlp.checks.data_integrity import TextPropertyOutliers
# Create text data
train_data = TextData(
    raw_text=['This is positive', 'This is negative', 'Neutral text'],
    task_type='TEXT_CLASSIFICATION',
    label=[1, 0, 1]
)
# Run validation
context = NLPContext(train_data)
context.run()
Train-Test Drift Detection
from deepchecks.nlp import TextData, NLPContext
from deepchecks.nlp.checks.train_test_validation import TextPropertyDrift
train_data = TextData(raw_text=train_texts, task_type='TEXT_CLASSIFICATION', label=train_labels)
test_data = TextData(raw_text=test_texts, task_type='TEXT_CLASSIFICATION', label=test_labels)
context = NLPContext(train_data, test_data)
context.add_check(TextPropertyDrift())
result = context.run()
result.save_as_html('drift_report.html')
Common Issues and Troubleshooting
Model Download Failures
Issue: Cannot download models for property calculation or embeddings.
Solution: This may be caused by incompatibilities with newer versions of transformers or optimum. Ensure you have compatible versions installed:
pip install transformers==4.30.0 optimum==1.12.0
See Issue #2630 for details.
HTML Report Display Issues
Issue: Blank HTML page after saving report.
Solution:
- Ensure anywidget is properly installed and registered
- Try using the requirejs parameter when saving:
result.save_as_html('report.html', requirejs=True)
See Issue #2803 for details.
GPU Runtime for Drift Checks
Issue: GPU not utilized for image/text drift checks despite documentation suggesting optimization.
Note: GPU acceleration support for NLP drift checks is limited. The documentation regarding runtime optimization may not apply uniformly across all check types. See Issue #2789 for status updates.
See Also
- Tabular Validation - For tabular data validation
- Vision Validation - For image-based model validation
- Deepchecks Documentation - Official documentation portal
- NLP Checks Reference - Complete list of NLP checks
Source: https://github.com/deepchecks/deepchecks / Human Manual
Computer Vision Validation
Related topics: Checks & Suites Framework
Overview
The Computer Vision Validation module in Deepchecks provides a framework for validating image classification, object detection, and semantic segmentation models. It lets ML practitioners detect data integrity issues, evaluate model performance, and identify distribution shifts between training and test datasets.
Purpose: The vision validation system performs automated checks on image datasets and vision models to ensure data quality, model reliability, and consistency across different data splits. It supports multiple vision task types including classification, object detection, and segmentation.
Scope: The module covers data integrity validation (corrupted images, missing labels, class imbalance), model evaluation checks (performance metrics, confusion analysis), and train-test validation (drift detection, weak segment identification).
Source: deepchecks/vision/__init__.py
Source: https://github.com/deepchecks/deepchecks / Human Manual
Creating Custom Checks
Related topics: Core Architecture, Checks & Suites Framework
This page documents how to create custom checks in Deepchecks, enabling you to extend the validation framework with domain-specific validation logic for tabular, NLP, and vision use cases.
Overview
Deepchecks provides a flexible check architecture that allows users to create custom validation checks beyond the built-in suite. A Check in Deepchecks is a self-contained unit of validation that analyzes data, models, or their relationships and returns structured results with optional pass/fail conditions.
The check system is designed around several key principles:
- Modularity: Each check focuses on a single validation concern
- Reusability: Checks can be composed into Suites for batch execution
- Extensibility: Domain-specific checks can be added without modifying core code
- Conditional Logic: Checks support conditions that define pass/fail thresholds
Source: deepchecks/core/checks.py
Architecture
Check Hierarchy
Deepchecks implements a hierarchical check system with different base classes for each domain:
graph TD
A["BaseCheck<br/>(core)"] --> B["TrainTestCheck<br/>(tabular)"]
A --> C["SingleDatasetCheck<br/>(tabular)"]
A --> D["ModelOnlyCheck<br/>(tabular)"]
A --> E["TrainTestCheck<br/>(nlp)"]
A --> F["SingleDatasetCheck<br/>(nlp)"]
A --> G["TrainTestCheck<br/>(vision)"]
A --> H["SingleDatasetCheck<br/>(vision)"]
B --> I["Tabular Check<br/>Implementation"]
C --> I
D --> I
E --> I
F --> I
G --> I
H --> I
Core Components
The check system relies on several core components defined in the abstract layer:
| Component | File | Purpose |
|---|---|---|
| BaseCheck | deepchecks/core/checks.py | Abstract base class for all checks |
| BasicModel | deepchecks/utils/typing.py | Protocol defining minimal model interface |
| ClassificationModel | deepchecks/utils/typing.py | Protocol for classification models with predict_proba |
| TrainTestCheck | deepchecks/tabular/base_checks.py | Check comparing train and test data |
Source: deepchecks/utils/typing.py:17-30
Model Protocols
Before creating checks, understand the model protocols Deepchecks uses:
BasicModel Protocol
from typing import Hashable, List, Protocol, runtime_checkable

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""

    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...
ClassificationModel Protocol
@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""

    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...
Source: deepchecks/utils/typing.py:17-35
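Because both protocols are marked `@runtime_checkable`, a plain `isinstance` test is enough to see which one a model object satisfies structurally, with no inheritance required. The sketch below redeclares the two protocols locally so it runs without deepchecks installed; `MajorityClassModel`, `CoinFlipClassifier`, and `supported_protocols` are hypothetical names used only for illustration.

```python
from typing import Hashable, List, Protocol, runtime_checkable


@runtime_checkable
class BasicModel(Protocol):
    """Local copy of the protocol shown above (not imported from deepchecks)."""

    def predict(self, X) -> List[Hashable]:
        ...


@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Local copy of the classification protocol shown above."""

    def predict_proba(self, X) -> List[Hashable]:
        ...


class MajorityClassModel:
    """Hypothetical model that always predicts the same label."""

    def predict(self, X):
        return [0 for _ in X]


class CoinFlipClassifier(MajorityClassModel):
    """Hypothetical classifier that also exposes class probabilities."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def supported_protocols(model) -> List[str]:
    """Return the names of the protocols that `model` structurally satisfies."""
    protocols = [BasicModel, ClassificationModel]
    return [p.__name__ for p in protocols if isinstance(model, p)]
```

Note that a runtime protocol check only verifies that the methods exist; it does not validate their signatures or return types.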
Creating a Custom Tabular Check
Step 1: Choose the Appropriate Base Class
For tabular data, choose from:
| Base Class | Use Case |
|---|---|
| TrainTestCheck | Compare training and testing data distributions |
| SingleDatasetCheck | Validate a single dataset |
| ModelOnlyCheck | Validate model properties without data |
Source: deepchecks/tabular/base_checks.py
Step 2: Implement the Check
from deepchecks.tabular import TrainTestCheck
from deepchecks.core import CheckResult, ConditionResult

class MyCustomDriftCheck(TrainTestCheck):
    """Custom check to detect feature drift between train and test sets."""

    def __init__(self, threshold: float = 0.1, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def run_logic(self, context):
        """Implement the check's validation logic."""
        train = context.train  # deepchecks Dataset objects
        test = context.test
        # Your validation logic here
        drift_scores = self._calculate_drift(train, test)
        # Wrap the per-feature scores so conditions can read them from the result value
        return CheckResult(drift_scores)

    def _calculate_drift(self, train, test):
        # Placeholder metric: absolute difference between feature means.
        # Swap in the drift measure that suits your data.
        features = train.numerical_features
        delta = train.data[features].mean() - test.data[features].mean()
        return delta.abs().to_dict()
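The drift calculation itself is left to the check author. One widely used choice for numeric features is the Population Stability Index (PSI), sketched below with the standard library only. This is an illustrative stand-in, not the metric Deepchecks uses internally; `psi` and `calculate_drift` are hypothetical helpers, and the inputs are assumed to be plain mappings of feature name to a list of values rather than Deepchecks `Dataset` objects.

```python
import math
from typing import Dict, List, Sequence


def psi(train: Sequence[float], test: Sequence[float], n_bins: int = 4) -> float:
    """Population Stability Index of `test` against `train` for one numeric feature."""
    ordered = sorted(train)
    # Bin edges are taken from the training quantiles.
    edges = [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            # The bin index is the number of edges the value reaches or exceeds.
            counts[sum(v >= e for e in edges)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(train), proportions(test)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def calculate_drift(train: Dict[str, Sequence[float]],
                    test: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Per-feature PSI for the features present in both column mappings."""
    return {name: psi(train[name], test[name]) for name in train if name in test}
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but suitable thresholds depend on the data and should be tuned per project.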
Step 3: Add Conditions
Conditions define pass/fail criteria for checks:
from deepchecks.core import ConditionCategory, ConditionResult

class MyCustomDriftCheck(TrainTestCheck):
    # ... initialization and run_logic ...

    def add_condition_drift_not_exceeds_threshold(self, threshold=0.1):
        """Add a condition that drift scores should not exceed threshold."""

        def condition(value):
            # `value` is the check result value: a mapping of feature -> drift score
            failed_features = [
                feature for feature, score in value.items()
                if score > threshold
            ]
            if failed_features:
                return ConditionResult(
                    ConditionCategory.FAIL,
                    f'Features with drift > {threshold}: {failed_features}'
                )
            return ConditionResult(ConditionCategory.PASS, 'All features within drift threshold')

        return self.add_condition('Drift below threshold', condition)
Source: deepchecks/utils/decorators.py
Validation Utilities
Deepchecks provides utilities for validating inputs in custom checks:
Ensure Hashable or Sequence
from deepchecks.utils.validation import ensure_hashable_or_mutable_sequence
def my_check_logic(self, feature_name, feature_values):
    # Validate that feature_name is hashable or a sequence of hashables
    validated_features = ensure_hashable_or_mutable_sequence(
        feature_name,
        message='Feature name must be hashable or a sequence of hashables'
    )
Check if Sequence (Not String)
from deepchecks.utils.validation import is_sequence_not_str
def my_check_logic(self, data):
    if is_sequence_not_str(data):
        # Handle sequence data
        pass
Source: deepchecks/utils/validation.py
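To make the intent of these helpers concrete, the following is a simplified standard-library approximation of the two behaviours: telling list-like input apart from strings, and normalizing "one name or many names" into a list. `looks_like_sequence` and `as_hashable_list` are hypothetical local stand-ins written for this sketch; the real Deepchecks functions may differ in edge cases and raise their own error types.

```python
from collections.abc import Hashable, MutableSequence, Sequence
from typing import Any, List


def looks_like_sequence(value: Any) -> bool:
    """True for list-like sequences, False for str/bytes and for non-sequences."""
    return isinstance(value, Sequence) and not isinstance(value, (str, bytes))


def as_hashable_list(value: Any) -> List[Hashable]:
    """Normalize a single hashable, or a mutable sequence of hashables, into a list."""
    if isinstance(value, MutableSequence):
        items = list(value)
    elif isinstance(value, Hashable):
        return [value]
    else:
        raise TypeError(f'Unsupported value type: {type(value).__name__}')
    for item in items:
        if not isinstance(item, Hashable):
            raise TypeError(f'Expected hashable items, got {type(item).__name__}')
    return items
```

Normalizing early means the rest of a check can always iterate over a list of feature names, whether the caller passed `'age'` or `['age', 'income']`.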
Decorators for Check Development
Deepchecks provides decorators that help with documentation and parameter handling:
Substitution Decorator
The Substitution decorator allows replacing documentation placeholders:
from deepchecks.utils.decorators import Substitution
@Substitution(
    feature_average_greater_than=0.5,
    feature_average_greater_than_info='The minimum ratio between...'
)
def _feature_segment_condition_scorer_parameter(self, feature_avg: float, ...
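The general idea behind a docstring-substitution decorator can be sketched in a few lines: named placeholders in the docstring are filled in when the function is defined. The version below is a simplified illustration, assuming `str.format`-style `{name}` placeholders; `substitute_doc` and `drift_is_acceptable` are hypothetical names, and the real `Substitution` class may handle indentation and missing docstrings differently.

```python
from typing import Any, Callable, TypeVar

F = TypeVar('F', bound=Callable[..., Any])


def substitute_doc(**params: Any) -> Callable[[F], F]:
    """Fill `{name}` placeholders in the decorated function's docstring."""

    def decorator(func: F) -> F:
        # Functions without a docstring are returned unchanged.
        if func.__doc__:
            func.__doc__ = func.__doc__.format(**params)
        return func

    return decorator


@substitute_doc(default_threshold=0.1, unit='drift score')
def drift_is_acceptable(score: float, threshold: float = 0.1) -> bool:
    """Return True when the {unit} is at most `threshold` (default: {default_threshold})."""
    return score <= threshold
```

The benefit is that a parameter description shared by many checks is written once and stays consistent across their docstrings.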
Appender Decorator
The Appender decorator adds information to docstrings:
from deepchecks.utils.decorators import Appender
@Appender(
TrainTestCheck.run.__doc__ + """
Returns
Source: https://github.com/deepchecks/deepchecks / Human Manual
Integrations
Related topics: Serialization & Output Formats
Deepchecks provides a comprehensive integrations framework that enables seamless connectivity with popular ML platforms, experiment tracking tools, and orchestration systems. This page documents the integration architecture, available integrations, and how to extend Deepchecks with custom integrations.
Overview
Deepchecks integrations allow users to:
- Log validation results to experiment tracking platforms
- Embed checks within ML pipelines using orchestration tools
- Leverage external model serving for validation workflows
- Use pretrained embeddings from Hugging Face for NLP checks
The integration system is designed to be modular, allowing each integration to be installed and used independently without requiring the full Deepchecks ecosystem.
Source: deepchecks/tabular/integrations/__init__.py
Architecture
The integration architecture follows a plugin-like pattern where each integration module implements a consistent interface. Integrations are primarily located in deepchecks/tabular/integrations/ for tabular-specific integrations, while utility modules for cross-cutting concerns reside in deepchecks/utils/.
graph TD
User[User Code] --> Deepchecks[Deepchecks Core]
Deepchecks --> Integrations[Integration Layer]
Integrations --> WNB[Weights & Biases]
Integrations --> H2O[H2O.ai]
Integrations --> Airflow[Apache Airflow]
Integrations --> HF[Hugging Face]
WNB --> WNBCloud[wandb.ai]
H2O --> H2OCloud[H2O.ai Platform]
Airflow --> AirflowScheduler[Airflow Scheduler]
HF --> HFHub[Hugging Face Hub]
NLP[Deepchecks NLP] --> TextEmbeddings[Text Embeddings]
TextEmbeddings --> HF[HF Transformers]
Available Integrations
Weights & Biases (wandb) Integration
Deepchecks provides native integration with Weights & Biases for logging validation results alongside model training metrics. This integration is implemented in deepchecks/utils/wandb_utils.py.
#### Key Features
- Automatic result logging: Validation results are automatically logged to W&B runs
- Suite-level logging: Complete suites can be logged as a single W&B Table
- Check-level granularity: Individual check results can be logged separately
- Config synchronization: Check configurations are logged as W&B hyperparameters
#### Usage Pattern
from deepchecks.tabular.suites import full_suite
# Run validation
suite = full_suite()
result = suite.run(train_dataset=train, test_dataset=test, model=model)
# Log results to W&B (keyword arguments are forwarded to wandb.init)
result.to_wandb(project="model-validation")
#### Configuration Options
| Parameter | Type | Description | Default |
|---|---|---|---|
| project | str | W&B project name | "deepchecks" |
| name | str | Run name for the validation | Auto-generated |
| tags | List[str] | Tags for the W&B run | [] |
| resume | bool | Resume an existing run | False |
Source: deepchecks/utils/wandb_utils.py
H2O.ai Integration
Deepchecks integrates with H2O.ai's model validation ecosystem, enabling validation of H2O models using the Deepchecks check library. This integration is implemented in deepchecks/tabular/integrations/h2o.py.
#### Supported H2O Models
The integration supports H2O's supervised learning models including:
- H2O Generalized Linear Models (GLM)
- H2O Gradient Boosting Machines (GBM)
- H2O Random Forest
- H2O Deep Learning (AutoML models)
#### Usage Pattern
from deepchecks.tabular.integrations.h2o import H2OChecker
import h2o
# Initialize H2O
h2o.init()
# Load H2O model
model = h2o.load_model("path/to/model.zip")
# Run Deepchecks validation
checker = H2OChecker()
result = checker.run(model=model, train=h2o_train, test=h2o_test)
#### Key Functions
| Function | Purpose |
|---|---|
| H2OChecker | Main integration class for H2O models |
| validate_h2o_model() | Validates H2O model compatibility |
| convert_h2o_dataset() | Converts H2OFrame to Deepchecks Dataset |
Source: deepchecks/tabular/integrations/h2o.py
Apache Airflow Integration
Deepchecks provides an Apache Airflow operator for embedding validation checks within ML pipelines. This integration is documented in examples/integrations/airflow/README.rst.
#### Airflow DAG Integration
from airflow import DAG
from airflow.operators.python import PythonOperator
from deepchecks.airflow.operators import DeepchecksValidationOperator
from datetime import datetime
with DAG('model_validation_dag', start_date=datetime(2024, 1, 1)) as dag:
    validate_model = DeepchecksValidationOperator(
        task_id='run_model_validation',
        model_path='/path/to/model',
        test_dataset='test_dataset.parquet',
        suite='model_evaluation',
        check_config={
            'TrainTestPerformance': {
                'params': {'n_samples': 10000}
            }
        }
    )
#### Workflow
graph LR
A[Train Model] --> B[Deploy Model]
B --> C[Deepchecks Validation]
C --> D{Pass?}
D -->|Yes| E[Production]
D -->|No| F[Alert & Rollback]
Source: examples/integrations/airflow/README.rst
Hugging Face Integration
Deepchecks integrates with Hugging Face's ecosystem for NLP model validation, leveraging pretrained models and tokenizers for computing text embeddings and detecting drift.
#### Text Embeddings
The Hugging Face integration provides text embedding capabilities for NLP checks, implemented in deepchecks/nlp/utils/text_embeddings.py.
| Embedding Model | Use Case | Model Size |
|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | Fast, general purpose | 22M params |
| sentence-transformers/all-mpnet-base-v2 | High quality | 110M params |
| bert-base-uncased | Classification tasks | 110M params |
#### Supported Tasks
- Text Classification: Validate classification models with per-class metrics
- Token Classification: NER and other token-level predictions
- Text Generation: Quality assessment for generative models
- Embedding Drift Detection: Detect distribution shift using embedding-based methods
Source: deepchecks/nlp/utils/text_embeddings.py
NLP Embedding-Based Integrations
Deepchecks uses multivariate embedding techniques for advanced NLP validation, particularly for drift detection.
Multivariate Embeddings Drift Detection
The drift detection system uses sentence embeddings from Hugging Face transformers to compute embedding-based drift scores. This is implemented in deepchecks/nlp/utils/multivariate_embeddings_drift_utils.py.
#### Architecture
graph TD
Text1[Test Text 1] --> Emb1[Embedding Model]
Text2[Test Text 2] --> Emb2[Embedding Model]
Emb1 --> Vec1[Embedding Vector]
Emb2 --> Vec2[Embedding Vector]
Vec1 --> Dist[Distance Calculation]
Vec2 --> Dist
Dist --> Score[Drift Score]
Score --> Threshold{Threshold}
Threshold --> Pass[No Drift]
Threshold --> Fail[Drift Detected]
#### Configuration
| Parameter | Type | Description |
|---|---|---|
| embedding_model | str | Hugging Face model identifier |
| batch_size | int | Batch size for embedding computation |
| device | str | Device for computation (cpu, cuda) |
| drift_threshold | float | Threshold for drift detection |
Source: deepchecks/nlp/utils/multivariate_embeddings_drift_utils.py
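The flow in the diagram (embed, measure distance, compare against a threshold) can be illustrated with a toy calculation. The sketch below assumes embeddings are already available as lists of floats and uses the cosine distance between the mean train and mean test vectors as the drift score. This is a deliberately simple stand-in for illustration only; the library's multivariate drift implementation may use a different method, and `mean_vector`, `cosine_distance`, and `embedding_drift` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Union

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of equally sized vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; 0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError('cosine distance is undefined for zero vectors')
    return 1.0 - dot / norm


def embedding_drift(train: Sequence[Vector], test: Sequence[Vector],
                    drift_threshold: float = 0.2) -> Dict[str, Union[float, bool]]:
    """Score drift between two embedding sets and flag it against a threshold."""
    score = cosine_distance(mean_vector(train), mean_vector(test))
    return {'score': score, 'drift_detected': score > drift_threshold}
```

Comparing only the mean vectors ignores changes in spread or cluster structure, which is one reason production drift detectors tend to rely on richer multivariate methods.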
Model Protocol Interfaces
Deepchecks defines protocol interfaces that integrations must implement to ensure compatibility with the validation framework.
BasicModel Protocol
The BasicModel protocol defines the minimal interface required for all models:
from typing import Hashable, List, Protocol, runtime_checkable

@runtime_checkable
class BasicModel(Protocol):
    """Traits of a model that are necessary for deepchecks."""

    def predict(self, X) -> List[Hashable]:
        """Predict on given X."""
        ...
ClassificationModel Protocol
For classification tasks, models must also implement:
@runtime_checkable
class ClassificationModel(BasicModel, Protocol):
    """Traits of a classification model that are used by deepchecks."""

    def predict_proba(self, X) -> List[Hashable]:
        """Predict probabilities on given X."""
        ...
Source: deepchecks/utils/typing.py
Common Issues and Troubleshooting
Integration-Specific Issues
| Issue | Cause | Solution |
|---|---|---|
| W&B results not appearing | API key not configured | Run wandb login or set WANDB_API_KEY environment variable |
| H2O model validation timeout | Large dataset | Reduce n_samples parameter or use sampling |
| Hugging Face model download fails | Network issues | Set HF_HOME to a directory with cached models |
| Airflow operator failing | Missing dependencies | Install apache-airflow-providers-cncf-kubernetes |
GPU Runtime Configuration
When using GPU-accelerated checks (particularly for image and text drift detection), ensure the runtime device is properly configured:
# For Hugging Face embeddings with GPU
from deepchecks.nlp.utils.text_embeddings import TextEmbeddings
embeddings = TextEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    device="cuda"  # or "cpu"
)
Note: Some users have reported issues with GPU runtime configuration for Image Property Drift and Image Dataset Drift checks. See GitHub Issue #2789 for troubleshooting guidance.
Model Validation Errors
If you encounter model validation errors:
- Verify the model implements the required protocol (BasicModel or ClassificationModel)
- Check that the predict method returns compatible types
- For classification models, ensure predict_proba returns probability arrays
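The last item can be checked mechanically before handing a model to a suite. The sketch below is a hypothetical pre-flight helper (`check_probabilities` is not part of Deepchecks) that inspects a `predict_proba`-style output given as nested lists and reports rows with the wrong width, out-of-range values, or sums that deviate from 1.

```python
import math
from typing import List, Sequence


def check_probabilities(proba: Sequence[Sequence[float]], n_classes: int,
                        tol: float = 1e-6) -> List[str]:
    """Return human-readable problems; an empty list means the output looks valid."""
    problems = []
    for i, row in enumerate(proba):
        if len(row) != n_classes:
            problems.append(f'row {i}: expected {n_classes} values, got {len(row)}')
        elif any(p < 0 or p > 1 for p in row):
            problems.append(f'row {i}: value outside [0, 1]')
        elif not math.isclose(sum(row), 1.0, abs_tol=tol):
            problems.append(f'row {i}: probabilities sum to {sum(row):.4f}, not 1')
    return problems
```

Running this on a small sample of the model's output, for example the first few rows converted with `.tolist()`, is usually enough to catch a wrapper that returns logits or class labels instead of probabilities.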
Extension Points
Creating Custom Integrations
To create a custom integration for Deepchecks:
- Implement the Model Protocol: Ensure your model wrapper implements BasicModel or ClassificationModel
- Create Dataset Adapter: Wrap your data format in a Deepchecks-compatible Dataset
- Register Integration: (Future feature) Create a PR to add your integration to the registry
from deepchecks.utils.typing import BasicModel
class CustomModelWrapper(BasicModel):
    def __init__(self, model):
        self.model = model

    def predict(self, X):
        # Transform X to your model's expected format
        return self.model.predict(X)
Integration with CI/CD Systems
For CI/CD integration, Deepchecks supports:
- Exit codes: Returns non-zero on validation failure
- JSON output: Use result.to_json() for programmatic parsing
- HTML reports: Use result.save_as_html() for visual reports
Note: Users have requested Markdown export support for CI/CD integration. See GitHub Issue #1290 for the feature request.
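In a CI job, the exit-code behaviour listed above is typically produced by a small wrapper script around the validation run. The sketch below shows one way to map a result to an exit code, assuming the result object exposes a `passed()` method that returns a boolean (suite results in recent Deepchecks versions appear to, but verify this against your installed version). `ci_exit_code` and `FakeResult` are hypothetical names; `FakeResult` exists only so the helper can be exercised without running a real suite.

```python
def ci_exit_code(result) -> int:
    """Map a validation result to a process exit code (0 = success, 1 = failure)."""
    return 0 if result.passed() else 1


class FakeResult:
    """Stand-in for a suite result, used only to exercise the helper."""

    def __init__(self, ok: bool):
        self._ok = ok

    def passed(self) -> bool:
        return self._ok
```

In a pipeline script, the last line would be `sys.exit(ci_exit_code(result))` with `result` coming from the suite run, so that a failed validation stops the job.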
Source: https://github.com/deepchecks/deepchecks / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 22 structured pitfall item(s), including 5 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: high
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_492bcbfbeaac498b94f2f869074b9edc | https://github.com/deepchecks/deepchecks/issues/2803
2. Installation risk: Installation risk requires verification
- Severity: high
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_4bac6c577dee471fa096434516861696 | https://github.com/deepchecks/deepchecks/issues/2794
3. Maintenance risk: Maintenance risk requires verification
- Severity: high
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_b9f771df5da2458d9e368765d829e5c7 | https://github.com/deepchecks/deepchecks/issues/2789
4. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_aaa0c0bcbdbf41d6855980523e0d7682 | https://github.com/deepchecks/deepchecks/issues/2813
5. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_edf8cf14dc8f49898cbcab292f3abbeb | https://github.com/deepchecks/deepchecks/issues/2802
6. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_4789fb40d7364096958752494c3054a2 | https://github.com/deepchecks/deepchecks/releases/tag/0.18.0
7. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_53c2c4845d134a54b0989b29725c1c93 | https://github.com/deepchecks/deepchecks/releases/tag/0.18.1
8. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_e24dd97e69674ab2b766fef25a401070 | https://github.com/deepchecks/deepchecks/issues/2812
9. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_6e307650606b4ed380c4adc96caa8c28 | https://github.com/deepchecks/deepchecks/issues/2806
10. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_b64249b0551146afb696a981f404b3e6 | https://github.com/deepchecks/deepchecks/releases/tag/0.17.3
11. Configuration risk: Configuration risk requires verification
- Severity: medium
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_b3a17f2d1dcd4fcb8c3f19ea7065d12e | https://github.com/deepchecks/deepchecks/issues/2804
12. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: Project evidence flags a capability evidence risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | cevd_e55595065e6b4c72b0409a303bb46b11 | https://github.com/deepchecks/deepchecks/releases/tag/0.17.1
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using deepchecks with real data or production workflows.
- [[FEAT] LLM Support?](https://github.com/deepchecks/deepchecks/issues/2767) - github / github_issue
- [[BUG] GPU not being able to change runtime of Image Property Drift and I](https://github.com/deepchecks/deepchecks/issues/2789) - github / github_issue
- Failed to load model class 'AnyModel' from module 'anywidget' Error: No - github / github_issue
- [[DEE-170] [FEAT] Add tests for python 3.10 & 3.11](https://github.com/deepchecks/deepchecks/issues/2161) - github / github_issue
- Feature Request: EU AI Act compliance mapping for validation checks - github / github_issue
- Deepchecks Fix - Additional Checks NLP - github / github_issue
- Proposal: Doc/example for RAG failure-mode testing using WFGY 16-problem - github / github_issue
- https://github.com/deepchecks/deepchecks/blob/98475d17b08a21fca29d533b94 - github / github_issue
- [[BUG] neg_log_loss scorer incompatible with newer scikit-learn version](https://github.com/deepchecks/deepchecks/issues/2806) - github / github_issue
- [[BUG] Inaccurate Conditions Summary and Heatmap for Pairwise Correlation](https://github.com/deepchecks/deepchecks/issues/2802) - github / github_issue
- Blank html page after saving report using save_as_html - github / github_issue
- [[FEAT] NLP property - sudden stop](https://github.com/deepchecks/deepchecks/issues/2722) - github / github_issue
Source: Project Pack community evidence and pitfall evidence