Doramagic Project Pack · Human Manual
RagaAI-Catalyst
Python SDK for an agent AI observability, monitoring, and evaluation framework. Includes agent, LLM, and tool tracing, debugging of multi-agent systems, a self-hosted dashboard, and advanced analytics with timeline and execution-graph views.
Overview, Installation & Project Management
Related topics: Trace Management & Agentic Tracing, Dataset, Evaluation & Prompt Management
1. What is RagaAI Catalyst
RagaAI Catalyst is a Python SDK that provides Observability, Monitoring, and Evaluation capabilities for AI agents, LLM-based applications, and RAG (Retrieval-Augmented Generation) pipelines. The toolkit is designed to help engineering and evaluation teams instrument agentic systems end-to-end — from tracing tool and LLM calls, through dataset and experiment management, to red-teaming for safety issues.
The SDK exposes several top-level capabilities:
- Tracing & Monitoring — capture agent, LLM, tool, network, and user interaction spans.
- Dataset Management — create and manage evaluation datasets from CSVs or programmatic schemas.
- Evaluation — run metric experiments against datasets.
- Prompt Management — version and store prompts.
- Synthetic Data Generation — auto-generate queries and Q/A pairs.
- Guardrails — deploy and monitor safety detectors.
- Red-Teaming — run automated safety tests against custom detectors.
Source: README.md
The agentic tracing subsystem itself is organized into a tracers/ package containing tracer implementations, data classes, utilities, and upload logic. Source: ragaai_catalyst/tracers/agentic_tracing/README.md
2. Installation
The SDK is distributed as a standard Python package on PyPI. Installation is a single command:
pip install ragaai-catalyst
Source: README.md
For working with the included examples (Haystack, OpenAI Agents SDK, SmoLAgents), additional dependencies are required per example. Each example ships its own requirements.txt and a .env template. For instance, the Haystack news-fetching example expects OPENAI_API_KEY, SERPERDEV_API_KEY, and Catalyst credentials. Source: examples/haystack/news_fetching/README.md
3. Authentication & Configuration
Before invoking any Catalyst operation, you must authenticate. The official flow documented in the README is:
- Navigate to your profile settings.
- Select Authenticate.
- Click Generate New Key to produce an access key and a secret key.
Credentials can be supplied either through environment variables or directly to the RagaAICatalyst constructor. Source: README.md
from ragaai_catalyst import RagaAICatalyst

catalyst = RagaAICatalyst(
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
    base_url="BASE_URL"
)
Internally, authenticated requests use a bearer token stored in the RAGAAI_CATALYST_TOKEN environment variable, as shown by the dataset-schema upload utility. Source: ragaai_catalyst/tracers/agentic_tracing/utils/create_dataset_schema.py
Note: The README explicitly states that authentication is required for *any* subsequent operation, including Project Management, Dataset Management, Evaluation, Prompt Management, Synthetic Data Generation, Guardrail Management, and Red-Teaming.
4. Project Management
A project in Catalyst is the top-level container that groups datasets, traces, evaluations, guardrail deployments, and red-teaming runs. All other resources are scoped to a project name.
4.1 Project Lifecycle Workflow
flowchart LR
A[Install<br/>pip install ragaai-catalyst] --> B[Authenticate<br/>RagaAICatalyst(...)]
B --> C[Create Project<br/>create_project(...)]
C --> D[Discover Use Cases<br/>project_use_cases()]
D --> E[List Projects<br/>list_projects()]
E --> F[Scope Resources<br/>Dataset / Trace / Eval / Guardrail / RedTeam]
4.2 Creating a Project
A project is created by specifying a name and a usecase. Use cases align the project with a downstream evaluation template (e.g., Chatbot):
project = catalyst.create_project(
    project_name="Test-RAG-App-1",
    usecase="Chatbot"
)
Source: README.md
4.3 Discovering Available Use Cases
To enumerate the use cases supported by your account/workspace before creating a project, call:
catalyst.project_use_cases()
Source: README.md
4.4 Listing Projects
Existing projects can be enumerated to verify creation or to look up project names for downstream operations:
projects = catalyst.list_projects()
print(projects)
Source: README.md
4.5 Scoping Downstream Resources
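Once a project exists, the project name flows into the constructors of most other managers. RedTeaming is the exception: its constructor takes a model specification, and the project name is supplied later when results are uploaded with rt.upload_result(project_name=..., dataset_name=...).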
| Manager | Constructor signature (excerpt) | Purpose |
|---|---|---|
| Dataset | Dataset(project_name="...") | Dataset CRUD from CSV/schema |
| Evaluation | Evaluation(project_name="...", dataset_name="...") | Metric experiments |
| GuardrailsManager | GuardrailsManager(project_name=project_name) | Safety detector deployments |
| RedTeaming | RedTeaming(model_name=..., provider=..., api_key=...) | Adversarial test runs |
Sources: README.md, ragaai_catalyst/redteaming/utils/issue_description.py
5. Common Failure Modes & Tips
Based on the documentation and community-reported issues, several pitfalls recur during onboarding:
- Tracing returns empty tool/LLM spans. Community issue #26 reports that trace_agent can produce empty tool and LLM call sections when tool/LLM callables are not registered properly. Ensure the functions you want to trace are decorated or explicitly invoked through the tracer-managed client.
- Dataset upload errors. Release 2.2.4 notes a bug fix for external_id and metadata updates not propagating when renaming a dataset, plus a fix for an SDG error that could fail dataset generation.
- Load-test traces dropped. Release 2.2.4 also fixed dropped logs under Locust-driven load testing; if you load-test, verify that your installed SDK version is ≥ 2.2.4.
- Authentication token expiry. Since v2.2.1 the SDK can refresh tokens automatically (every ~6 hours). Configure the refresh path rather than hard-coding long-lived secrets.
Sources: README.md, examples/openai_agents_sdk/youtube_summary_agent/README.md
6. Quick-Start Checklist
- pip install ragaai-catalyst
- Generate access/secret keys from the Catalyst dashboard.
- Instantiate RagaAICatalyst(access_key=..., secret_key=..., base_url=...).
- Call catalyst.project_use_cases() to discover supported use cases.
- catalyst.create_project(project_name="...", usecase="...").
- Confirm with catalyst.list_projects().
- Proceed to Dataset, Evaluation, tracing, guardrails, or red-teaming scoped to that project name.
Source: README.md
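The checklist can be folded into a small idempotent helper. The sketch below is illustrative only: it calls just the three methods named above, assumes that project_use_cases() and list_projects() return iterables of plain strings (not confirmed by the README), and uses a hypothetical FakeCatalyst stand-in so it runs without credentials or network access.

```python
def ensure_project(catalyst, project_name, usecase):
    """Create the project only if it is missing and the use case is supported.

    `catalyst` is any object exposing project_use_cases(), list_projects()
    and create_project(); in real code it would be a RagaAICatalyst instance.
    Assumption: the first two calls return iterables of plain strings.
    """
    supported = list(catalyst.project_use_cases())
    if usecase not in supported:
        raise ValueError(f"Unsupported use case {usecase!r}; choose from {supported}")
    if project_name in catalyst.list_projects():
        return None  # already present, nothing to create
    return catalyst.create_project(project_name=project_name, usecase=usecase)


class FakeCatalyst:
    """Hypothetical stand-in used only to exercise the helper offline."""

    def __init__(self):
        self.projects = ["Existing-App"]

    def project_use_cases(self):
        return ["Chatbot", "Q/A"]

    def list_projects(self):
        return list(self.projects)

    def create_project(self, project_name, usecase):
        self.projects.append(project_name)
        return {"project_name": project_name, "usecase": usecase}


fake = FakeCatalyst()
created = ensure_project(fake, "Test-RAG-App-1", "Chatbot")
skipped = ensure_project(fake, "Existing-App", "Chatbot")
```

Checking the use case before creating the project surfaces a typo locally instead of as a backend error.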
See Also
- Dataset Management — creating datasets from CSV and managing schemas.
- Evaluation — running metric experiments against datasets.
- Agentic Tracing — instrumenting LLM, tool, and agent calls.
- Guardrails & Red-Teaming — deploying safety detectors and running adversarial tests.
- Synthetic Data Generation — auto-generating Q/A datasets for evaluation.
Sources: README.md, ragaai_catalyst/redteaming/utils/issue_description.py
Trace Management & Agentic Tracing
Related topics: Overview, Installation & Project Management, Dataset, Evaluation & Prompt Management
Overview & Purpose
Trace Management in RagaAI-Catalyst is the observability and monitoring layer of the SDK. It captures, structures, and uploads execution data from agentic AI systems so teams can debug, evaluate, and audit agent behavior after the fact. The two tracing paths, RAG Tracing (using OpenInference-compatible spans) and Agentic Tracing (using custom sub-tracers), were unified in v2.2.1 under a single trace format.
The agentic tracing module, located under ragaai_catalyst/tracers/agentic_tracing/, instruments LLMs, tools, network calls, and user interactions during agent execution. It exposes a pluggable architecture where individual sub-tracers can be swapped or extended without disturbing the rest of the pipeline.
Community note: Multiple issues (e.g. #259) request tamper-evident audit logs on top of these traces for compliance. The current pipeline is observable but does not yet produce cryptographically signed audit records.
Source: README.md | ragaai_catalyst/tracers/agentic_tracing/README.md
Agentic Tracing Architecture
The agentic tracing module is organised into four cooperating subpackages, each owning a clear concern:
| Subpackage | Purpose | Key files |
|---|---|---|
| tracers/ | Per-concern tracers that wrap LLM, tool, network, and user interactions | main_tracer.py, agent_tracer.py, llm_tracer.py, tool_tracer.py, network_tracer.py, user_interaction_tracer.py, base.py |
| data/ | Strongly-typed data classes for spans, LLM calls, tool executions, agent states | data_classes.py |
| utils/ | Cost calculation, ID generation, API helpers, model cost table | llm_utils.py, api_utils.py, unique_decorator.py, model_costs.json, trace_utils.py |
| upload/ | Code and trace artefact upload to the Catalyst backend | code_upload.py |
The Base Tracer in tracers/base.py defines the shared lifecycle (start, stop, flush) that all sub-tracers inherit. The Main Tracer coordinates sub-tracers and assembles a unified trace payload before upload.
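To make the shared lifecycle concrete, here is a minimal sketch of a base class with start, stop, and flush, plus a coordinator that merges sub-tracer output. Every class, method body, and field here is hypothetical; this is not the code in tracers/base.py or main_tracer.py, only one way such a design could look.

```python
class BaseTracer:
    """Hypothetical base class holding the shared start/stop/flush lifecycle."""

    def __init__(self, name):
        self.name = name
        self.active = False
        self.records = []

    def start(self):
        self.active = True

    def stop(self):
        self.active = False

    def record(self, payload):
        if self.active:  # events outside start()/stop() are ignored
            self.records.append(payload)

    def flush(self):
        """Return buffered records and clear the buffer."""
        drained, self.records = self.records, []
        return drained


class MainTracer(BaseTracer):
    """Starts and stops every sub-tracer and merges their output."""

    def __init__(self, sub_tracers):
        super().__init__("main")
        self.sub_tracers = sub_tracers

    def start(self):
        super().start()
        for tracer in self.sub_tracers:
            tracer.start()

    def stop(self):
        for tracer in self.sub_tracers:
            tracer.stop()
        super().stop()

    def flush(self):
        return {tracer.name: tracer.flush() for tracer in self.sub_tracers}


llm, tool = BaseTracer("llm"), BaseTracer("tool")
main = MainTracer([llm, tool])
tool.record({"tool": "search"})  # dropped: tracing has not started yet
main.start()
llm.record({"model": "gpt-4o-mini"})
main.stop()
payload = main.flush()
```

Because each sub-tracer only needs to honour the three lifecycle methods, one can be swapped or extended without touching the coordinator.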
flowchart LR
A[Agent Code] --> B[Main Tracer]
B --> C[LLM Tracer]
B --> D[Tool Tracer]
B --> E[Network Tracer]
B --> F[User Interaction Tracer]
C --> G[Data Classes]
D --> G
E --> G
F --> G
G --> H[Utils: cost, IDs, time conversion]
H --> I[Upload to Catalyst Backend]
Source: ragaai_catalyst/tracers/agentic_tracing/README.md
Core Sub-Tracers
LLM Tracer
Monitors model calls and is the only tracer that computes monetary cost. It tracks token usage, model parameters, and prompt/response content. Costs are looked up in model_costs.json via the helpers in utils/llm_utils.py. As of v2.2.3, model_cost is a no-op when no cost table is configured, and cost calculations from litellm were corrected.
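A table-driven cost lookup can be sketched as follows. The key names input_cost_per_token and output_cost_per_token, the model name, and the rates are assumptions chosen for illustration; the actual layout of model_costs.json and the helpers in utils/llm_utils.py may differ. Returning 0.0 for an unlisted model is likewise an assumption.

```python
def llm_call_cost(model, prompt_tokens, completion_tokens, cost_table):
    """Return the cost of one call, or 0.0 when the model is not in the table."""
    rates = cost_table.get(model)
    if rates is None:
        return 0.0  # assumption: unknown models are treated as free, not an error
    return (
        prompt_tokens * rates["input_cost_per_token"]
        + completion_tokens * rates["output_cost_per_token"]
    )


# Hypothetical rates, not taken from model_costs.json.
COST_TABLE = {
    "example-model": {
        "input_cost_per_token": 1e-06,
        "output_cost_per_token": 2e-06,
    }
}

known = llm_call_cost("example-model", 1000, 500, COST_TABLE)
unknown = llm_call_cost("unlisted-model", 1000, 500, COST_TABLE)
```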
Tool Tracer
Records tool invocations, arguments, and outputs. This is the most common source of the "empty tool call" issue reported in community thread #26, where users decorate functions with trace_agent but never annotate the inner tool/LLM calls — the agent's outer function is traced, but the child spans stay empty.
Network Tracer
Captures outbound HTTP calls (including tool HTTP requests and LLM provider traffic).
User Interaction Tracer
Logs user prompts and feedback for human-in-the-loop evaluations.
Data Classes
data/data_classes.py defines typed records for LLMCall, ToolExecution, NetworkRequest, UserInteraction, and TraceComponent so downstream code does not have to parse dicts.
Source: ragaai_catalyst/tracers/agentic_tracing/README.md
Trace Data Flow & Utilities
Trace payloads are transformed into a uniform JSON shape before upload. The converters under ragaai_catalyst/tracers/utils/ handle provider-specific traces:
- trace_json_converter.py normalises timestamps (UTC → Asia/Kolkata by default), generates UUIDs, and aggregates span metadata.
- rag_trace_json_converter.py extracts prompt, context, and response from LangChain-style spans and attaches cost, token counts, and error fields to trace_aggregate["metadata"].
- extraction_logic_llama_index.py walks QueryStartEvent, RetrievalEndEvent, and QueryEndEvent spans to build a {prompt, context, response, system_prompt} object.
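The timestamp and ID handling attributed to trace_json_converter.py can be approximated with the standard library. This is a sketch, not the converter's code: the function names and span fields are hypothetical, naive timestamps are assumed to be UTC, and a fixed +05:30 offset stands in for the Asia/Kolkata zone (which has no daylight saving time) so the example does not depend on a system timezone database.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Fixed offset used in place of a tz database lookup for Asia/Kolkata.
IST = timezone(timedelta(hours=5, minutes=30), name="Asia/Kolkata")


def to_kolkata(iso_timestamp):
    """Convert an ISO 8601 timestamp to +05:30; naive input is assumed to be UTC."""
    moment = datetime.fromisoformat(iso_timestamp)
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(IST).isoformat()


def normalise_span(span):
    """Return a copy of the span with a fresh UUID and converted timestamps."""
    return {
        "id": str(uuid.uuid4()),
        "name": span["name"],
        "start_time": to_kolkata(span["start_time"]),
        "end_time": to_kolkata(span["end_time"]),
    }


converted = normalise_span(
    {
        "name": "llm_call",
        "start_time": "2024-01-01T20:00:00+00:00",
        "end_time": "2024-01-01T20:00:02",
    }
)
```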
Dataset schema creation is performed by create_dataset_schema_with_trace() in utils/create_dataset_schema.py, which posts to {BASE_URL}/v1/llm/dataset/logs using the RAGAAI_CATALYST_TOKEN environment variable. Analysis trace retrieval is done via fetch_analysis_trace() in utils/api_utils.py, which calls {base_url}/api/analysis_traces/{trace_id}.
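The two endpoints mentioned above can be assembled without sending anything over the network. The helper names, the Authorization header layout, and the payload key are assumptions for illustration; only the URL paths and the RAGAAI_CATALYST_TOKEN variable come from the description above.

```python
import os


def dataset_logs_request(base_url, payload, environ=os.environ):
    """Describe the POST to the dataset-logs endpoint without performing it."""
    token = environ.get("RAGAAI_CATALYST_TOKEN")
    if not token:
        raise RuntimeError("RAGAAI_CATALYST_TOKEN is not set; authenticate first")
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}/v1/llm/dataset/logs",
        # Assumed header layout for a bearer token.
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "json": payload,
    }


def analysis_trace_url(base_url, trace_id):
    """Build the analysis-trace URL for a given trace ID."""
    return f"{base_url.rstrip('/')}/api/analysis_traces/{trace_id}"


request = dataset_logs_request(
    "https://catalyst.example.com/",
    {"datasetName": "demo"},  # hypothetical payload
    environ={"RAGAAI_CATALYST_TOKEN": "t0k3n"},
)
trace_url = analysis_trace_url("https://catalyst.example.com/", "abc123")
```

Passing environ explicitly keeps the helper testable and makes the dependency on the token variable visible at the call site.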
The unique_decorator.py module generates stable hashes for traced functions by normalising source code (preserving docstrings, stripping comments and whitespace) so that semantically identical functions produce the same ID across runs.
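The normalise-then-hash idea can be sketched with the standard tokenize module. This is not the implementation in unique_decorator.py; the function names, the choice of SHA-256, and the token-level normalisation are all assumptions. Comments and blank lines are dropped, indentation width is replaced by structural markers, and string tokens (including docstrings) are kept.

```python
import hashlib
import io
import tokenize

# Layout tokens are replaced by markers so indentation width does not matter.
_LAYOUT = {
    tokenize.INDENT: "<INDENT>",
    tokenize.DEDENT: "<DEDENT>",
    tokenize.NEWLINE: "<NEWLINE>",
}
# Comments, blank-line tokens, and the end marker carry no meaning here.
_DROPPED = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}


def normalise_source(source):
    """Return a canonical token string for a piece of Python source."""
    parts = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _DROPPED:
            continue
        parts.append(_LAYOUT.get(tok.type, tok.string))
    return " ".join(parts)


def stable_id(source):
    """Hash the normalised source so cosmetic edits keep the same ID."""
    return hashlib.sha256(normalise_source(source).encode("utf-8")).hexdigest()


original = 'def f(x):\n    """Add one."""\n    # bump the value\n    return x + 1\n'
reformatted = 'def f(x):\n  """Add one."""\n\n  return x+1   # increment\n'
changed = 'def f(x):\n    """Add two."""\n    return x + 2\n'
```

Here stable_id(original) equals stable_id(reformatted), while stable_id(changed) differs because both the docstring and the returned expression changed.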
Source: ragaai_catalyst/tracers/agentic_tracing/utils/create_dataset_schema.py | ragaai_catalyst/tracers/agentic_tracing/utils/api_utils.py | ragaai_catalyst/tracers/agentic_tracing/utils/unique_decorator.py | ragaai_catalyst/tracers/utils/trace_json_converter.py | ragaai_catalyst/tracers/utils/rag_trace_json_converter.py | ragaai_catalyst/tracers/utils/extraction_logic_llama_index.py
Usage Patterns & Integrations
The SDK ships with worked examples for popular agent frameworks, all wired to the same tracer:
- Haystack: examples/haystack/news_fetching/ shows a SerperDev-backed pipeline with a MessageCollector, conditional router, and tool invoker, traced end-to-end.
- OpenAI Agents SDK: youtube_summary_agent/ and email_data_extraction_agent/ demonstrate multi-agent flows including clarifier/summariser agents and Pydantic-validated extraction.
- SmolAgents: most_upvoted_paper/ integrates the agent with HuggingFace Daily Papers, arXiv, and pypdf for paper discovery and summarisation.
- LlamaIndex: RAG pipelines are normalised via extraction_logic_llama_index.py.
All examples read CATALYST_ACCESS_KEY, CATALYST_SECRET_KEY, CATALYST_BASE_URL, PROJECT_NAME, and DATASET_NAME from a .env file.
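A small helper can turn those variables into constructor arguments and fail early when one is missing. The sketch assumes the variables are already in the process environment (for example, loaded from the .env file by whatever loader the example uses); the helper name and the sample values are hypothetical.

```python
import os

REQUIRED = ("CATALYST_ACCESS_KEY", "CATALYST_SECRET_KEY", "CATALYST_BASE_URL")


def catalyst_kwargs(environ=os.environ):
    """Map the environment variables onto RagaAICatalyst constructor keywords."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "access_key": environ["CATALYST_ACCESS_KEY"],
        "secret_key": environ["CATALYST_SECRET_KEY"],
        "base_url": environ["CATALYST_BASE_URL"],
    }


# Hypothetical values standing in for a loaded .env file.
sample_env = {
    "CATALYST_ACCESS_KEY": "ak-demo",
    "CATALYST_SECRET_KEY": "sk-demo",
    "CATALYST_BASE_URL": "https://catalyst.example.com/api",
}
kwargs = catalyst_kwargs(sample_env)
```

With the SDK installed, the result would be passed on as RagaAICatalyst(**catalyst_kwargs()).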
Source: examples/haystack/news_fetching/README.md | examples/openai_agents_sdk/youtube_summary_agent/README.md | examples/openai_agents_sdk/email_data_extraction_agent/README.md | examples/smolagents/most_upvoted_paper/README.md
Common Failure Modes
| Symptom | Likely cause | Fix / workaround |
|---|---|---|
| Empty LLM/tool spans under trace_agent (issue #26) | Only the outer agent is decorated; child LLM/tool calls are not annotated | Decorate each llm_call and tool_call function explicitly |
| Indexing Error in Agentic Tracing (v2.1.7.1) | Schema mismatch on upload | Upgrade; v2.2.x unifies RAG and agentic trace formats |
| Missing logs in Locust load tests (v2.2.4) | Async flush races with test teardown | Ensure tracer.flush() is awaited in test teardown |
| Wrong total cost in trace details (v2.2.3) | Per-span vs aggregate cost discrepancy | Fixed in v2.2.3; upgrade |
| Crashed workers (v2.1.7.1) | Unhandled exception in background uploader | Add try/except around upload; v2.2.1 added greater error-capture support |
Source: README.md (release notes for v2.1.7.1, v2.2.1, v2.2.3, v2.2.4)
See Also
- Prompt Management: companion SDK for prompt versioning and compilation.
- Red Teaming: uses the same issue taxonomy and detector descriptions in ragaai_catalyst/redteaming/utils/issue_description.py.
- Synthetic Data Generation: produces evaluation datasets that can be replayed through the tracer.
Source: https://github.com/raga-ai-hub/RagaAI-Catalyst / Human Manual
Dataset, Evaluation & Prompt Management
Related topics: Overview, Installation & Project Management, Guardrails, Red-Teaming & Synthetic Data Generation
The Dataset, Evaluation, and Prompt Management subsystems form the data-centric backbone of RagaAI Catalyst. Together they handle how projects ingest and organize test data, run metrics against model outputs, and version the prompts used by RAG and agentic applications. These capabilities sit alongside the tracing and red-teaming modules but are deliberately separated so that teams can manage offline evaluation workflows without coupling them to live observability. Source: README.md
Purpose and Scope
Dataset management answers the question "what data are we testing on?", Evaluation answers "how well did the model do?", and Prompt Management answers "which prompt version produced that output?". The three modules are designed to interoperate: a project owns datasets, datasets feed experiments, experiments reference prompts, and prompt versions can be promoted after evaluation passes. Source: README.md
Recent release notes confirm that this subsystem continues to evolve. Release 2.2.4 fixed dataset-name update regressions, release 2.2.3 fixed numeric and categorical CSV uploads, and release 2.1.7.4 introduced masking hooks that protect vital columns such as model_name, cost, latency, span_id, and trace_id during evaluation exports. Source: Release v2.2.4, Release v2.2.3, Release v2.1.7.4
Dataset Management
The Dataset class is the primary entry point for working with project data. It is instantiated against an existing project and exposes methods for listing, creating, and inspecting datasets. Source: README.md
from ragaai_catalyst import Dataset

dataset_manager = Dataset(project_name="project_name")
datasets = dataset_manager.list_datasets()
print("Existing Datasets:", datasets)
dataset_manager.create_from_csv(
    csv_path='path/to/your.csv',
    dataset_name='MyDataset',
    schema_mapping={'column1': 'schema_element1', 'column2': 'schema_element2'}
)
schema = dataset_manager.get_schema_mapping()
CSV ingestion relies on an explicit schema mapping that aligns CSV columns to canonical fields such as prompt, response, context, and expected_response. Release 2.2.1 fixed CSV upload of numerical and categorical values, indicating that the mapper tolerates type-rich inputs rather than only string content. The schema discovery helper get_schema_mapping() returns the canonical column names a project accepts, which is useful when constructing an Evaluation schema_mapping. Source: README.md, Release v2.2.1
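Before uploading, a mapping can be checked locally against the CSV header. The sketch below is an assumption-laden pre-flight check, not part of the SDK: the helper name is hypothetical, and the allowed set contains only the four canonical fields named above, whereas in practice the list returned by get_schema_mapping() would be the authority.

```python
import csv
import io

# Only the fields named in the text; the real list comes from get_schema_mapping().
CANONICAL_FIELDS = {"prompt", "response", "context", "expected_response"}


def check_schema_mapping(csv_file, schema_mapping, allowed=CANONICAL_FIELDS):
    """Return a list of human-readable problems; an empty list means the mapping fits."""
    header = next(csv.reader(csv_file))
    problems = []
    for column, element in schema_mapping.items():
        if column not in header:
            problems.append(f"column {column!r} is not in the CSV header")
        if element not in allowed:
            problems.append(f"schema element {element!r} is not recognised")
    return problems


sample = "Query,response,Context\nWhat is RAG?,Retrieval-augmented generation,docs\n"
ok_problems = check_schema_mapping(
    io.StringIO(sample),
    {"Query": "prompt", "response": "response", "Context": "context"},
)
bad_problems = check_schema_mapping(
    io.StringIO(sample),
    {"Question": "prompt", "response": "answer"},
)
```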
External identifiers can be attached to rows so that evaluation results can be reconciled with external systems. Release 2.1.7.1 added external_id support and release 2.2.4 fixed an issue where updating external_id and metadata did not behave as expected when the dataset name was also being updated. Source: Release v2.1.7.1, Release v2.2.4
Evaluation
The Evaluation class binds a project and dataset to a metric execution engine. Calling list_metrics() returns the metrics available for the chosen schema, after which add_metrics() schedules experiments against the rows in the dataset. Source: README.md
from ragaai_catalyst import Evaluation

evaluation = Evaluation(
    project_name="Test-RAG-App-1",
    dataset_name="MyDataset",
)
evaluation.list_metrics()
schema_mapping = {
    'Query': 'prompt',
    'response': 'response',
    'Context': 'context',
    'expectedResponse': 'expected_response'
}
evaluation.add_metrics(
    metrics=[
        {"name": "Faithfulness", "config": {"model": "gpt-4o-mini", "provider": "openai"}, "column_name": "Faithfulness", "schema_mapping": schema_mapping},
        {"name": "Hallucination", "config": {"model": "gpt-4o-mini", "provider": "openai", "threshold": {"eq": 0.323}}, "column_name": "Hallucination_eq", "schema_mapping": schema_mapping},
    ]
)
status = evaluation.get_status()
results = evaluation.get_results()
A metric entry pairs a metric name with a config (provider, model, threshold) and a target column_name that will receive the score. Thresholds can be expressed with operators such as eq, enabling pass/fail gating. The append_metrics() helper recalculates a metric only against new rows that were added to the dataset after the original experiment, which is useful for iterative test-set growth. Source: README.md
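Since every metric entry repeats the same structure, a small builder keeps the list passed to add_metrics() readable. The helper name and its defaults are hypothetical; the dictionary it returns follows the shape used in the example above.

```python
def metric_entry(name, column_name, schema_mapping,
                 model="gpt-4o-mini", provider="openai", threshold=None):
    """Build one metric dictionary in the shape shown in the add_metrics() example."""
    config = {"model": model, "provider": provider}
    if threshold is not None:
        config["threshold"] = threshold  # e.g. {"eq": 0.323}
    return {
        "name": name,
        "config": config,
        "column_name": column_name,
        "schema_mapping": schema_mapping,
    }


schema_mapping = {
    "Query": "prompt",
    "response": "response",
    "Context": "context",
    "expectedResponse": "expected_response",
}
metrics = [
    metric_entry("Faithfulness", "Faithfulness", schema_mapping),
    metric_entry("Hallucination", "Hallucination_eq", schema_mapping,
                 threshold={"eq": 0.323}),
]
```

The resulting list would be passed as evaluation.add_metrics(metrics=metrics).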
Release 2.1.7.4 added a post-processing hook plus a PII removal hook that runs before export, and release 2.2.3 hardened CSV exports by excluding vital columns from masking and fixing the total cost value surfaced in trace details. Together these changes make the evaluation export path safe to share with stakeholders who should not see infrastructure identifiers. Source: Release v2.1.7.4, Release v2.2.3
Prompt Management
Prompt Management is exposed through ragaai_catalyst.prompt_manager and is documented separately in the repository. It is designed to store, version, and retrieve prompts so that evaluation runs and traces can reference a stable identifier rather than an inline string. Source: README.md, docs/prompt_management.md
flowchart LR
A[Project] --> B[Dataset]
B --> C[Experiment]
C --> D[Metric Execution]
D --> E[Results]
F[Prompt Manager] --> C
F --> G[Tracer]
G --> E
The diagram above shows the intended data flow: a project owns a dataset, the dataset feeds an experiment, the experiment pulls a named prompt version from the Prompt Manager, and the resulting metrics are correlated with traces that used the same prompt. This decoupling means that swapping a prompt and re-running add_metrics() produces a comparable evaluation without changing the underlying data. Source: README.md
Common Failure Modes
Several recurring community-reported issues map directly onto this subsystem. Issue #26 reports that trace_agent produces empty tool-call and LLM-call records, which makes downstream evaluation correlate against sparse traces and is worth verifying before trusting metric scores. Source: Issue #26
For dataset ingestion, two patterns recur: CSV columns whose names contain underscores triggered metric execution errors (fixed in 2.2.1), and CSV uploads of numeric or categorical values were rejected (also fixed in 2.2.1). When working with very large catalogs, list_datasets() historically returned incomplete results until the 2.1.7.1 pagination fix. Source: Release v2.2.1, Release v2.1.7.1
See Also
- Synthetic Data Generation & Red-teaming — uses the same dataset and project APIs to auto-generate evaluation cases.
- Trace Management — pairs prompt versions with execution traces for governance.
- Release Notes — the canonical log of dataset, evaluation, and prompt fixes.
Source: https://github.com/raga-ai-hub/RagaAI-Catalyst / Human Manual
Guardrails, Red-Teaming & Synthetic Data Generation
Related topics: Trace Management & Agentic Tracing, Dataset, Evaluation & Prompt Management
RagaAI Catalyst bundles three complementary safety and data capabilities under a single SDK: Guardrails for runtime protection of LLM-backed applications, Red-Teaming for adversarial testing of model behavior, and Synthetic Data Generation (SDG) for producing evaluation datasets. Together they form a closed loop — SDG produces test data, Red-Teaming evaluates models against risk categories, and Guardrails deploys mitigations into production traffic. This page documents each capability, the public entry points, and the data shapes involved.
Architecture Overview
The three subsystems share the same RagaAICatalyst credential context but expose independent APIs. The diagram below shows how user code, the SDK, and the Catalyst backend interact for a typical safety pipeline.
flowchart LR
A[Application Code] -->|inputs/outputs| B[GuardExecutor]
B -->|check| C[GuardrailsManager]
C -->|deployments| D[(Catalyst Backend)]
E[SDG Module] -->|examples.csv| F[RedTeaming]
F -->|detectors| D
F -->|results DataFrame| A
B -.fail-condition.-> A
The flow illustrates that Guardrails run synchronously on the request path, while Red-Teaming and SDG operate off-line to generate and consume test data that is uploaded to the dashboard.
Red-Teaming
The RedTeaming class is the user-facing entry point for adversarial evaluation. It is initialized with a target model specification and exposes a single run() method that produces a result DataFrame and a save path.
Initialization and Run
from ragaai_catalyst import RedTeaming

rt = RedTeaming(
    model_name="grok-2-latest",
    provider="xai",
    api_key="your-api-key",
)
df, save_path = rt.run(
    description="A chatbot for our recruiting platform that helps candidates find jobs",
    detectors=["stereotypes", "harmful_content"],
    response_model=your_model_function,
    examples=["how to hack a database?", "can older developers learn new technologies?"],
    scenarios_per_detector=2,
)
Source: README.md. When examples is omitted, the module auto-generates scenarios using scenarios_per_detector and examples_per_scenario parameters. Results can be pushed to the dashboard via rt.upload_result(project_name=..., dataset_name=...). Source: README.md.
Built-in Detectors
Detector names map to issue categories defined in issue_description.py. The table below summarizes the catalog.
| Detector | Issue Category |
|---|---|
| stereotypes | Stereotypes & Discrimination |
| harmful_content | Generation of Harmful Content |
| sycophancy | Basic Sycophancy |
| chars_injection | Control Characters Injection |
| faithfulness | Faithfulness to source/agent description |
| implausible_output | Implausible Output |
| information_disclosure | Information Disclosure |
| output_formatting | Output Formatting |
| prompt_injection | Prompt Injection |
Source: ragaai_catalyst/redteaming/utils/issue_description.py. Each category description is fetched by get_issue_description(detector_name), which raises KeyError for unknown names. Source: ragaai_catalyst/redteaming/utils/issue_description.py.
Custom detectors are accepted as {'custom': '<instruction string>'} entries mixed with built-in names, as shown in the README's "Mixed Detector Types" example. Source: README.md.
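A mixed detector list can be validated before a run is started. The helper below is a hypothetical pre-check, not part of the RedTeaming class: it uses the detector names from the table above, raises KeyError for unknown names to mirror the lookup behaviour described for get_issue_description(), and the custom instruction text is an invented example.

```python
BUILT_IN_DETECTORS = {
    "stereotypes", "harmful_content", "sycophancy", "chars_injection",
    "faithfulness", "implausible_output", "information_disclosure",
    "output_formatting", "prompt_injection",
}


def split_detectors(detectors):
    """Separate built-in names from custom instructions, rejecting anything else."""
    built_in, custom = [], []
    for item in detectors:
        if isinstance(item, str):
            if item not in BUILT_IN_DETECTORS:
                raise KeyError(item)  # same failure type as an unknown-name lookup
            built_in.append(item)
        elif (isinstance(item, dict) and set(item) == {"custom"}
              and isinstance(item["custom"], str)):
            custom.append(item["custom"])
        else:
            raise TypeError(f"Unsupported detector entry: {item!r}")
    return built_in, custom


built_in, custom = split_detectors([
    "stereotypes",
    {"custom": "Never reveal internal salary bands"},  # invented instruction
    "harmful_content",
])
```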
Evaluator
The evaluator.py module consumes the DataFrame produced by rt.run and scores each row against the expected behavior (pass / fail) declared on the input example. Custom detector strings are routed through the same scoring path with the literal instruction text as the rubric.
Synthetic Data Generation
The SDG module produces question/answer pairs and free-form examples for use as evaluation seeds. The entry point is the SDG class — its methods are demonstrated in the README. Source: README.md.
Supported Operations
| Method | Purpose |
|---|---|
| generate(text, question_type, model_config, n) | Produce n Q&A items from a source text |
| get_supported_qna() | Enumerate supported question types |
| get_supported_providers() | Enumerate supported model providers |
| generate_examples(user_instruction, user_examples, user_context, no_examples, model_config) | Generate free-form examples from instructions |
| generate_examples_from_csv(csv_path, no_examples, model_config) | Bootstrap examples from an existing CSV |
Source: README.md. The model_config argument is a dict shaped as {"provider": "openai", "model": "gpt-4o-mini"}. Note that release 2.2.4 fixed an "Error in SDG while generating dataset" — users on older versions should upgrade. Source: v2.2.4 release notes.
Guardrail Management and Execution
Guardrails are managed declaratively and then executed inline on each request. The two classes are GuardrailsManager (configuration) and GuardExecutor (runtime).
Managing Guardrails
from ragaai_catalyst import GuardrailsManager

# project_name, guardrails, and guardrails_config must be defined by the caller
gdm = GuardrailsManager(project_name=project_name)
guardrails_list = gdm.list_guardrails() # available guardrail types
fail_conditions = gdm.list_fail_condition() # valid fail-condition enums
deployment_list = gdm.list_deployment_ids() # registered deployments
deployment_id = deployment_list[0]
gdm.add_guardrails(deployment_id, guardrails, guardrails_config)
Source: README.md. Each guardrail is a dict with displayName, name, config.mappings, and config.params. guardrails_config carries three top-level fields: guardrailFailConditions, deploymentFailCondition, and alternateResponse. Source: README.md.
Runtime Execution
GuardExecutor is initialized with a deployment and invoked on each request/response pair to apply the configured guardrails and the deployment's fail condition. If a guardrail trips, the alternateResponse is returned to the caller instead of the model's raw output. Source: README.md.
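The fallback behaviour can be pictured with a few lines of plain Python. This is a conceptual sketch and not GuardExecutor's implementation: the function name is hypothetical, and the values "FAIL", "ANY_FAIL", and "ALL_FAIL" are illustrative placeholders; the valid values for a real deployment are whatever gdm.list_fail_condition() returns.

```python
def apply_guardrails(model_output, guardrail_results, guardrails_config):
    """Return the alternate response when the deployment fail condition is met."""
    fail_states = set(guardrails_config["guardrailFailConditions"])
    failed = [name for name, state in guardrail_results.items()
              if state in fail_states]
    condition = guardrails_config["deploymentFailCondition"]
    if condition == "ANY_FAIL":      # placeholder value
        tripped = bool(failed)
    elif condition == "ALL_FAIL":    # placeholder value
        tripped = bool(guardrail_results) and len(failed) == len(guardrail_results)
    else:
        raise ValueError(f"Unknown deployment fail condition: {condition!r}")
    return guardrails_config["alternateResponse"] if tripped else model_output


config = {
    "guardrailFailConditions": ["FAIL"],
    "deploymentFailCondition": "ALL_FAIL",
    "alternateResponse": "Sorry, I cannot help with that.",
}
partial = apply_guardrails("raw answer", {"toxicity": "FAIL", "pii": "PASS"}, config)
blocked = apply_guardrails("raw answer", {"toxicity": "FAIL", "pii": "FAIL"}, config)
```

The three keys read from the configuration are the same three top-level fields that guardrails_config carries.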
Common Failure Modes and Community Notes
- Empty LLM/tool call traces when decorating agents: community issue #26 reports "No tool call and llm call recorded with trace_agent". Affected users should verify that the decorated function actually invokes a traced LLM or tool wrapper, since the agent tracer depends on these downstream spans being present. Source: issue #26.
- SDG generation errors: fixed in 2.2.4 ("Error in SDG while generating dataset"). Source: v2.2.4 release notes.
- Token refresh: starting with 2.2.1 the SDK auto-refreshes the auth token every 6 hours, reducing the chance of mid-run 401s during long red-teaming sweeps. Source: v2.2.1 release notes.
- Compliance-driven audit gaps: community issue #259 requests tamper-proof audit logs to complement observability traces; this is currently out of scope and not provided by the guardrail/red-team modules. Source: issue #259.
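The token-refresh item in the list above describes a time-based policy, which can be illustrated with a tiny cache. This is a sketch of the general pattern under stated assumptions, not the SDK's refresh code: the class name, the injectable clock, and the fetch callback are all hypothetical.

```python
import time

REFRESH_AFTER_SECONDS = 6 * 60 * 60  # six hours, matching the interval noted above


class TokenCache:
    """Hypothetical cache that re-fetches a token once it is older than max_age."""

    def __init__(self, fetch_token, clock=time.monotonic,
                 max_age=REFRESH_AFTER_SECONDS):
        self._fetch_token = fetch_token
        self._clock = clock
        self._max_age = max_age
        self._token = None
        self._issued_at = None

    def get(self):
        now = self._clock()
        if self._token is None or now - self._issued_at >= self._max_age:
            self._token = self._fetch_token()
            self._issued_at = now
        return self._token


calls = []


def fake_fetch():
    """Stand-in for a real authentication call."""
    calls.append(1)
    return f"token-{len(calls)}"


now = [0.0]
cache = TokenCache(fake_fetch, clock=lambda: now[0])
first = cache.get()
now[0] = 3600.0            # one hour later: still fresh
second = cache.get()
now[0] = 6 * 3600.0        # six hours later: refreshed
third = cache.get()
```

Injecting the clock keeps the policy testable without waiting for real time to pass.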
See Also
- Project Management, Dataset Management, and Evaluation workflows (README.md)
- Agentic Tracing module overview (ragaai_catalyst/tracers/agentic_tracing/README.md)
- Release notes for behavior changes: v2.2.4, v2.2.3, v2.2.1
- Example applications: Haystack news fetching, OpenAI Agents SDK YouTube summarizer
Source: https://github.com/raga-ai-hub/RagaAI-Catalyst / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 11 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.
1. Installation risk: Installation risk requires verification
- Severity: high
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/264
2. Installation risk: Installation risk requires verification
- Severity: medium
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/250
3. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/raga-ai-hub/RagaAI-Catalyst
4. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/253
5. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/raga-ai-hub/RagaAI-Catalyst
6. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/raga-ai-hub/RagaAI-Catalyst
7. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/raga-ai-hub/RagaAI-Catalyst
8. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/263
9. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/256
10. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/raga-ai-hub/RagaAI-Catalyst
11. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/raga-ai-hub/RagaAI-Catalyst
Source: Doramagic discovery, validation, and Project Pack records
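Every item above recommends the same check: reproduce the install in an isolated environment. The following is a minimal sketch of one way to do that with Python's standard `venv` module, not the project's official procedure. The package name `ragaai-catalyst` is an assumption and should be confirmed against the README; the helper names `venv_python`, `install_command`, and `reproduce_install` are hypothetical and exist only for this illustration.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path

# Assumption: the SDK is published on PyPI under this name; confirm in the README.
PACKAGE = "ragaai-catalyst"


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment directory."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir: Path, package: str = PACKAGE) -> list[str]:
    """Build the pip invocation that installs `package` into the environment."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", package]


def reproduce_install(package: str = PACKAGE) -> int:
    """Create a throwaway venv, install the package, and return pip's exit code.

    Requires network access. The environment is deleted when the function
    returns, so nothing leaks into the caller's site-packages.
    """
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "env"
        venv.create(env_dir, with_pip=True)
        result = subprocess.run(install_command(env_dir, package), check=False)
        return result.returncode
```

Calling `reproduce_install()` returns 0 when pip succeeds and a non-zero code otherwise. A non-zero result is a cue to open the linked installation issues before going further; a zero result only shows that the install step passed, so the quickstart path still needs to be run separately.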
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using RagaAI-Catalyst with real data or production workflows.
- Standardizing Agent Commerce: Merxex Integration Proposal - github / github_issue
- Authorization receipts for agent traces — signed governance proof per tr - github / github_issue
- Tamper-proof audit logs to complement observability traces - github / github_issue
- 🤖 Connect your agent to MEEET STATE — earn $MEEET on Solana - github / github_issue
- Add Unit Tests - github / github_issue
- Add Example Notebooks - github / github_issue
- Integration: Agent-SRE Reliability Layer for AI Agent Monitoring - github / github_issue
- Question / suggestion: using WFGY Problem Map as a 16-mode RAG failure t - github / github_issue
- 2.2.4 - github / github_release
- 2.2.3 - github / github_release
- 2.2.1 - github / github_release
- 2.1.7.4 - github / github_release
Source: Project Pack community evidence and pitfall evidence