Doramagic Project Pack · Human Manual

RagaAI-Catalyst

Python SDK for an agentic AI observability, monitoring, and evaluation framework. Features include agent, LLM, and tool tracing, debugging of multi-agent systems, a self-hosted dashboard, and advanced analytics with timeline and execution-graph views.

Overview, Installation & Project Management

Related topics: Trace Management & Agentic Tracing, Dataset, Evaluation & Prompt Management

1. What is RagaAI Catalyst

RagaAI Catalyst is a Python SDK that provides Observability, Monitoring, and Evaluation capabilities for AI agents, LLM-based applications, and RAG (Retrieval-Augmented Generation) pipelines. The toolkit is designed to help engineering and evaluation teams instrument agentic systems end-to-end — from tracing tool and LLM calls, through dataset and experiment management, to red-teaming for safety issues.

The SDK exposes several top-level capabilities:

  • Tracing & Monitoring — capture agent, LLM, tool, network, and user interaction spans.
  • Dataset Management — create and manage evaluation datasets from CSVs or programmatic schemas.
  • Evaluation — run metric experiments against datasets.
  • Prompt Management — version and store prompts.
  • Synthetic Data Generation — auto-generate queries and Q/A pairs.
  • Guardrails — deploy and monitor safety detectors.
  • Red-Teaming — run automated safety tests against custom detectors.

Source: README.md

The agentic tracing subsystem itself is organized into a tracers/ package containing tracer implementations, data classes, utilities, and upload logic. Source: ragaai_catalyst/tracers/agentic_tracing/README.md

2. Installation

The SDK is distributed as a standard Python package on PyPI. Installation is a single command:

pip install ragaai-catalyst

Source: README.md

To run the included examples (Haystack, OpenAI Agents SDK, SmolAgents), additional per-example dependencies are required. Each example ships its own requirements.txt and a .env template. For instance, the Haystack news-fetching example expects OPENAI_API_KEY, SERPERDEV_API_KEY, and Catalyst credentials. Source: examples/haystack/news_fetching/README.md

3. Authentication & Configuration

Before invoking any Catalyst operation, you must authenticate. The official flow documented in the README is:

  1. Navigate to your profile settings.
  2. Select Authenticate.
  3. Click Generate New Key to produce an access key and a secret key.

Credentials can be supplied either through environment variables or directly to the RagaAICatalyst constructor. Source: README.md

from ragaai_catalyst import RagaAICatalyst

catalyst = RagaAICatalyst(
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
    base_url="BASE_URL"
)
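
To keep keys out of source code, the constructor arguments can be assembled from environment variables. The following is a minimal sketch rather than SDK code: the variable names CATALYST_ACCESS_KEY, CATALYST_SECRET_KEY, and CATALYST_BASE_URL are the ones the bundled examples read from their .env files, and load_catalyst_credentials is a hypothetical helper introduced only for illustration.

```python
import os
from typing import Mapping

# Environment variable names used by the bundled examples' .env files.
REQUIRED_VARS = {
    "access_key": "CATALYST_ACCESS_KEY",
    "secret_key": "CATALYST_SECRET_KEY",
    "base_url": "CATALYST_BASE_URL",
}


def load_catalyst_credentials(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: collect constructor kwargs, failing fast on gaps."""
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[var] for kwarg, var in REQUIRED_VARS.items()}
```

The returned dict can be unpacked into the constructor, for example RagaAICatalyst(**load_catalyst_credentials()).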

Internally, authenticated requests use a bearer token stored in the RAGAAI_CATALYST_TOKEN environment variable, as shown by the dataset-schema upload utility. Source: ragaai_catalyst/tracers/agentic_tracing/utils/create_dataset_schema.py

Note: The README explicitly states that authentication is required for *any* subsequent operation, including Project Management, Dataset Management, Evaluation, Prompt Management, Synthetic Data Generation, Guardrail Management, and Red-Teaming.

4. Project Management

A project in Catalyst is the top-level container that groups datasets, traces, evaluations, guardrail deployments, and red-teaming runs. All other resources are scoped to a project name.

4.1 Project Lifecycle Workflow

flowchart LR
    A[Install<br/>pip install ragaai-catalyst] --> B[Authenticate<br/>RagaAICatalyst&#40;...&#41;]
    B --> C[Discover Use Cases<br/>project_use_cases&#40;&#41;]
    C --> D[Create Project<br/>create_project&#40;...&#41;]
    D --> E[List Projects<br/>list_projects&#40;&#41;]
    E --> F[Scope Resources<br/>Dataset / Trace / Eval / Guardrail / RedTeam]

4.2 Creating a Project

A project is created by specifying a name and a usecase. Use cases align the project with a downstream evaluation template (e.g., Chatbot):

project = catalyst.create_project(
    project_name="Test-RAG-App-1",
    usecase="Chatbot"
)

Source: README.md

4.3 Discovering Available Use Cases

To enumerate the use cases supported by your account/workspace before creating a project, call:

catalyst.project_use_cases()

Source: README.md

4.4 Listing Projects

Existing projects can be enumerated to verify creation or to look up project names for downstream operations:

projects = catalyst.list_projects()
print(projects)

Source: README.md
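
The three project calls above can be combined into an idempotent setup step. The sketch below is an illustration under stated assumptions, not part of the SDK: ensure_project is a hypothetical helper, and it assumes that list_projects() returns project names and project_use_cases() returns use-case names as plain strings. Verify both return shapes against your installed version before relying on it.

```python
def ensure_project(catalyst, project_name: str, usecase: str):
    """Create the project only if it is not already listed.

    `catalyst` is any object exposing list_projects(), project_use_cases(),
    and create_project(). The return shapes of the first two (iterables of
    strings) are assumptions, not confirmed SDK behaviour.
    """
    if project_name in catalyst.list_projects():
        return None  # already exists; nothing to create
    if usecase not in catalyst.project_use_cases():
        raise ValueError(f"Unsupported use case: {usecase!r}")
    return catalyst.create_project(project_name=project_name, usecase=usecase)
```

Checking the use case first turns a typo into a local ValueError instead of a failed backend request.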

4.5 Scoping Downstream Resources

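Once a project exists, its name is passed to the constructors of most other managers. RedTeaming is the exception: it is constructed from a model specification and is tied to a project only when results are uploaded with upload_result(project_name=..., dataset_name=...).

| Manager | Constructor signature (excerpt) | Purpose |
| --- | --- | --- |
| Dataset | Dataset(project_name="...") | Dataset CRUD from CSV/schema |
| Evaluation | Evaluation(project_name="...", dataset_name="...") | Metric experiments |
| GuardrailsManager | GuardrailsManager(project_name=project_name) | Safety detector deployments |
| RedTeaming | RedTeaming(model_name=..., provider=..., api_key=...) | Adversarial test runs |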

Sources: README.md, ragaai_catalyst/redteaming/utils/issue_description.py

5. Common Failure Modes & Tips

Based on the documentation and community-reported issues, several pitfalls recur during onboarding:

  • Tracing returns empty tool/LLM spans. Community issue #26 reports that trace_agent can produce empty tool and LLM call sections when tool/LLM callables are not registered properly. Ensure the functions you want to trace are decorated or explicitly invoked through the tracer-managed client.
  • Dataset upload errors. Release 2.2.4 notes a bug-fix for external_id and metadata updates not propagating when renaming a dataset, plus an SDG error that could fail dataset generation.
  • Load-test traces dropped. Release 2.2.4 also fixed dropped logs under Locust-driven load testing; if you load-test, verify that the installed SDK version is 2.2.4 or later.
  • Authentication token expiry. Since v2.2.1 the SDK can refresh tokens automatically (every ~6 hours). Configure the refresh path rather than hard-coding long-lived secrets.
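
Several of the fixes above depend on the installed release, so a quick version check can rule them out early. This is a minimal sketch using only the standard library. The distribution name ragaai-catalyst comes from the install command; parse_version and sdk_at_least are hypothetical helpers, and the parser assumes purely numeric dotted versions such as the release numbers quoted in this manual.

```python
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn a dotted release string such as '2.1.7.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def sdk_at_least(minimum: str, package: str = "ragaai-catalyst") -> bool:
    """Return True if the installed package is at least `minimum`.

    Returns False when the package is not installed. Raises ValueError if
    the installed version contains non-numeric parts (for example 'rc1').
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= parse_version(minimum)
```

For example, sdk_at_least("2.2.4") can gate a load test so that it does not run against a release that predates the Locust fix.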

Sources: README.md, examples/openai_agents_sdk/youtube_summary_agent/README.md

6. Quick-Start Checklist

  1. pip install ragaai-catalyst
  2. Generate access/secret keys from the Catalyst dashboard.
  3. Instantiate RagaAICatalyst(access_key=..., secret_key=..., base_url=...).
  4. Call catalyst.project_use_cases() to discover supported use cases.
  5. catalyst.create_project(project_name="...", usecase="...").
  6. Confirm with catalyst.list_projects().
  7. Proceed to Dataset, Evaluation, tracing, guardrails, or red-teaming scoped to that project name.

Source: README.md

See Also

  • Dataset Management — creating datasets from CSV and managing schemas.
  • Evaluation — running metric experiments against datasets.
  • Agentic Tracing — instrumenting LLM, tool, and agent calls.
  • Guardrails & Red-Teaming — deploying safety detectors and running adversarial tests.
  • Synthetic Data Generation — auto-generating Q/A datasets for evaluation.

Sources: README.md, ragaai_catalyst/redteaming/utils/issue_description.py

Trace Management & Agentic Tracing

Related topics: Overview, Installation & Project Management, Dataset, Evaluation & Prompt Management

Overview & Purpose

Trace Management in RagaAI-Catalyst is the observability and monitoring layer of the SDK. It captures, structures, and uploads execution data from agentic AI systems so teams can debug, evaluate, and audit agent behavior after the fact. The two major release lines — RAG Tracing (using OpenInference-compatible spans) and Agentic Tracing (using custom sub-tracers) — were unified in v2.2.1 under a single trace format.

The agentic tracing module, located under ragaai_catalyst/tracers/agentic_tracing/, instruments LLMs, tools, network calls, and user interactions during agent execution. It exposes a pluggable architecture where individual sub-tracers can be swapped or extended without disturbing the rest of the pipeline.

Community note: Multiple issues (e.g. #259) request tamper-evident audit logs on top of these traces for compliance. The current pipeline is observable but does not yet produce cryptographically signed audit records.

Source: README.md | ragaai_catalyst/tracers/agentic_tracing/README.md

Agentic Tracing Architecture

The agentic tracing module is organised into four cooperating subpackages, each owning a clear concern:

| Subpackage | Purpose | Key files |
| --- | --- | --- |
| tracers/ | Per-concern tracers that wrap LLM, tool, network, and user interactions | main_tracer.py, agent_tracer.py, llm_tracer.py, tool_tracer.py, network_tracer.py, user_interaction_tracer.py, base.py |
| data/ | Strongly-typed data classes for spans, LLM calls, tool executions, agent states | data_classes.py |
| utils/ | Cost calculation, ID generation, API helpers, model cost table | llm_utils.py, api_utils.py, unique_decorator.py, model_costs.json, trace_utils.py |
| upload/ | Code and trace artefact upload to the Catalyst backend | code_upload.py |

The Base Tracer in tracers/base.py defines the shared lifecycle (start, stop, flush) that all sub-tracers inherit. The Main Tracer coordinates sub-tracers and assembles a unified trace payload before upload.

flowchart LR
    A[Agent Code] --> B[Main Tracer]
    B --> C[LLM Tracer]
    B --> D[Tool Tracer]
    B --> E[Network Tracer]
    B --> F[User Interaction Tracer]
    C --> G[Data Classes]
    D --> G
    E --> G
    F --> G
    G --> H[Utils: cost, IDs, time conversion]
    H --> I[Upload to Catalyst Backend]

Source: ragaai_catalyst/tracers/agentic_tracing/README.md

Core Sub-Tracers

LLM Tracer

Monitors model calls and is the only tracer that computes monetary cost. It tracks token usage, model parameters, and prompt/response content. Costs are looked up in model_costs.json via the helpers in utils/llm_utils.py. As of v2.2.3, model_cost is a no-op when no cost table is configured, and cost calculations from litellm were corrected.
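
The cost lookup described above amounts to multiplying token counts by per-token prices. The sketch below illustrates that arithmetic only and is not the code in utils/llm_utils.py. The key names input_cost_per_token and output_cost_per_token are an assumption modelled on litellm-style pricing tables, so check model_costs.json for the actual field names. estimate_llm_cost is a hypothetical helper.

```python
from typing import Mapping, Optional


def estimate_llm_cost(
    model: str,
    prompt_tokens: int,
    completion_tokens: int,
    cost_table: Mapping[str, Mapping[str, float]],
) -> Optional[dict]:
    """Look up per-token prices and return input, output, and total cost.

    Returns None when the model has no entry, so callers can tell
    "unknown price" apart from "zero cost".
    """
    prices = cost_table.get(model)
    if prices is None:
        return None
    # Key names below are assumed (litellm-style), not confirmed.
    input_cost = prompt_tokens * prices["input_cost_per_token"]
    output_cost = completion_tokens * prices["output_cost_per_token"]
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }
```

Summing the total_cost of every LLM span gives a trace-level total that can be compared with the figure shown in trace details.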

Tool Tracer

Records tool invocations, arguments, and outputs. This is the most common source of the "empty tool call" issue reported in community thread #26, where users decorate functions with trace_agent but never annotate the inner tool/LLM calls — the agent's outer function is traced, but the child spans stay empty.
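
The empty-span symptom is easier to see in a toy model. The following sketch is not the SDK's tracer: traced, spans, and the function names are all hypothetical, and it only demonstrates the general principle that a decorator-based tracer records spans solely for the functions it wraps.

```python
import functools

spans = []  # collected span records, in completion order


def traced(kind: str):
    """Toy decorator: record a span each time the wrapped function returns."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            spans.append({"kind": kind, "name": func.__name__})
            return result
        return wrapper
    return decorator


def search_web(query: str) -> str:  # NOT decorated: invisible to the tracer
    return f"results for {query}"


@traced("tool")
def search_web_traced(query: str) -> str:  # decorated: produces a tool span
    return f"results for {query}"


@traced("agent")
def agent(query: str) -> str:
    search_web(query)                # leaves no span behind
    return search_web_traced(query)  # recorded as a child tool span


agent("catalyst")
print([s["kind"] for s in spans])  # ['tool', 'agent']
```

Only the decorated tool appears in the output; the undecorated call runs normally but leaves no record, which matches the behaviour reported in the community thread.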

Network Tracer

Captures outbound HTTP calls (including tool HTTP requests and LLM provider traffic).

User Interaction Tracer

Logs user prompts and feedback for human-in-the-loop evaluations.

Data Classes

data/data_classes.py defines typed records for LLMCall, ToolExecution, NetworkRequest, UserInteraction, and TraceComponent so downstream code does not have to parse dicts.

Source: ragaai_catalyst/tracers/agentic_tracing/README.md

Trace Data Flow & Utilities

Trace payloads are transformed into a uniform JSON shape before upload. The converters under ragaai_catalyst/tracers/utils/ handle provider-specific traces:

  • trace_json_converter.py normalises timestamps (UTC → Asia/Kolkata by default), generates UUIDs, and aggregates span metadata.
  • rag_trace_json_converter.py extracts prompt, context, and response from LangChain-style spans and attaches cost, token counts, and error fields to trace_aggregate["metadata"].
  • extraction_logic_llama_index.py walks QueryStartEvent, RetrievalEndEvent, and QueryEndEvent spans to build a {prompt, context, response, system_prompt} object.
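
The timestamp normalisation in the first bullet can be illustrated with the standard library. This is a sketch of the conversion only, not the code in trace_json_converter.py. utc_to_ist is a hypothetical helper, and a fixed UTC+05:30 offset is used in place of the Asia/Kolkata zone because India currently observes no daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Asia/Kolkata (UTC+05:30, no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def utc_to_ist(timestamp: str) -> str:
    """Convert an ISO-8601 UTC timestamp (with 'Z' or '+00:00') to IST."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # treat naive input as UTC
    return parsed.astimezone(IST).isoformat()


print(utc_to_ist("2024-01-01T00:00:00Z"))  # 2024-01-01T05:30:00+05:30
```

Keep the default conversion in mind when comparing trace timestamps with logs recorded in UTC, since the two will differ by five and a half hours.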

Dataset schema creation is performed by create_dataset_schema_with_trace() in utils/create_dataset_schema.py, which posts to {BASE_URL}/v1/llm/dataset/logs using the RAGAAI_CATALYST_TOKEN environment variable. Analysis trace retrieval is done via fetch_analysis_trace() in utils/api_utils.py, which calls {base_url}/api/analysis_traces/{trace_id}.
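
As a rough illustration of the request shape described above, the sketch below builds, without sending, a GET request for one analysis trace. It is not the implementation of fetch_analysis_trace(): build_analysis_trace_request is a hypothetical helper, and attaching the RAGAAI_CATALYST_TOKEN value as a bearer token on this particular endpoint is an assumption based on how the dataset-schema utility authenticates.

```python
import os
import urllib.request


def build_analysis_trace_request(base_url: str, trace_id: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one analysis trace."""
    token = os.environ.get("RAGAAI_CATALYST_TOKEN")
    if not token:
        raise RuntimeError("RAGAAI_CATALYST_TOKEN is not set; authenticate first")
    # Path taken from the manual; trailing slash on base_url is tolerated.
    url = f"{base_url.rstrip('/')}/api/analysis_traces/{trace_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
        method="GET",
    )
```

Building the request separately from sending it makes the URL and headers easy to inspect when a call returns 401 or 404.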

The unique_decorator.py module generates stable hashes for traced functions by normalising source code (preserving docstrings, stripping comments and whitespace) so that semantically identical functions produce the same ID across runs.
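
One way to obtain a hash that ignores comments and spacing but keeps docstrings is to hash the token stream instead of the raw text. The sketch below shows that idea with the standard tokenize module. It is an assumption-based illustration, not the algorithm in unique_decorator.py, and normalised_source_hash is a hypothetical name.

```python
import hashlib
import io
import tokenize

# Layout tokens are replaced by fixed markers so that indentation width does
# not affect the hash, while block structure still does.
_MARKERS = {
    tokenize.NEWLINE: "<NEWLINE>",
    tokenize.INDENT: "<INDENT>",
    tokenize.DEDENT: "<DEDENT>",
}
# Comments, blank-line tokens, and the end marker carry no meaning to keep.
_SKIPPED = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}


def normalised_source_hash(source: str) -> str:
    """Hash Python source, ignoring comments, blank lines, and spacing."""
    parts = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        parts.append(_MARKERS.get(tok.type, tok.string))
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

Docstrings survive because they are ordinary string tokens, so editing a docstring changes the hash while editing a comment does not.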

Source: ragaai_catalyst/tracers/agentic_tracing/utils/create_dataset_schema.py | ragaai_catalyst/tracers/agentic_tracing/utils/api_utils.py | ragaai_catalyst/tracers/agentic_tracing/utils/unique_decorator.py | ragaai_catalyst/tracers/utils/trace_json_converter.py | ragaai_catalyst/tracers/utils/rag_trace_json_converter.py | ragaai_catalyst/tracers/utils/extraction_logic_llama_index.py

Usage Patterns & Integrations

The SDK ships with worked examples for popular agent frameworks, all wired to the same tracer:

  • Haystack: examples/haystack/news_fetching/ shows a SerperDev-backed pipeline with a MessageCollector, conditional router, and tool invoker, traced end-to-end.
  • OpenAI Agents SDK: youtube_summary_agent/ and email_data_extraction_agent/ demonstrate multi-agent flows including clarifier/summariser agents and Pydantic-validated extraction.
  • SmolAgents: most_upvoted_paper/ integrates the agent with HuggingFace Daily Papers, arXiv, and pypdf for paper discovery and summarisation.
  • LlamaIndex: RAG pipelines are normalised via extraction_logic_llama_index.py.

All examples read CATALYST_ACCESS_KEY, CATALYST_SECRET_KEY, CATALYST_BASE_URL, PROJECT_NAME, and DATASET_NAME from a .env file.

Source: examples/haystack/news_fetching/README.md | examples/openai_agents_sdk/youtube_summary_agent/README.md | examples/openai_agents_sdk/email_data_extraction_agent/README.md | examples/smolagents/most_upvoted_paper/README.md

Common Failure Modes

| Symptom | Likely cause | Fix / workaround |
| --- | --- | --- |
| Empty LLM/tool spans under trace_agent (issue #26) | Only the outer agent is decorated; child LLM/tool calls are not annotated | Decorate each llm_call and tool_call function explicitly |
| Indexing Error in Agentic Tracing (v2.1.7.1) | Schema mismatch on upload | Upgrade; v2.2.x unifies RAG and agentic trace formats |
| Missing logs in Locust load tests (v2.2.4) | Async flush races with test teardown | Ensure tracer.flush() is awaited in test teardown |
| Wrong total cost in trace details (v2.2.3) | Per-span vs aggregate cost discrepancy | Fixed in v2.2.3; upgrade |
| Crashed workers (v2.1.7.1) | Unhandled exception in background uploader | Add try/except around upload; v2.2.1 added greater error-capture support |

Source: README.md (release notes for v2.1.7.1, v2.2.1, v2.2.3, v2.2.4)

See Also

  • Prompt Management — companion SDK for prompt versioning and compilation.
  • Red Teaming — uses the same issue taxonomy and detector descriptions in ragaai_catalyst/redteaming/utils/issue_description.py.
  • Synthetic Data Generation — produces evaluation datasets that can be replayed through the tracer.

Source: https://github.com/raga-ai-hub/RagaAI-Catalyst / Human Manual

Dataset, Evaluation & Prompt Management

Related topics: Overview, Installation & Project Management, Guardrails, Red-Teaming & Synthetic Data Generation

The Dataset, Evaluation, and Prompt Management subsystems form the data-centric backbone of RagaAI Catalyst. Together they handle how projects ingest and organize test data, run metrics against model outputs, and version the prompts used by RAG and agentic applications. These capabilities sit alongside the tracing and red-teaming modules but are deliberately separated so that teams can manage offline evaluation workflows without coupling them to live observability. Source: README.md

Purpose and Scope

Dataset management answers the question "what data are we testing on?", Evaluation answers "how well did the model do?", and Prompt Management answers "which prompt version produced that output?". The three modules are designed to interoperate: a project owns datasets, datasets feed experiments, experiments reference prompts, and prompt versions can be promoted after evaluation passes. Source: README.md

Recent release notes confirm that this subsystem continues to evolve. Release 2.2.4 fixed dataset-name update regressions, release 2.2.3 fixed numeric and categorical CSV uploads, and release 2.1.7.4 introduced masking hooks that protect vital columns such as model_name, cost, latency, span_id, and trace_id during evaluation exports. Source: Release v2.2.4, Release v2.2.3, Release v2.1.7.4

Dataset Management

The Dataset class is the primary entry point for working with project data. It is instantiated against an existing project and exposes methods for listing, creating, and inspecting datasets. Source: README.md

from ragaai_catalyst import Dataset

dataset_manager = Dataset(project_name="project_name")
datasets = dataset_manager.list_datasets()
print("Existing Datasets:", datasets)

dataset_manager.create_from_csv(
    csv_path='path/to/your.csv',
    dataset_name='MyDataset',
    schema_mapping={'column1': 'schema_element1', 'column2': 'schema_element2'}
)
schema = dataset_manager.get_schema_mapping()

CSV ingestion relies on an explicit schema mapping that aligns CSV columns to canonical fields such as prompt, response, context, and expected_response. Release 2.2.1 fixed CSV upload of numerical and categorical values, indicating that the mapper tolerates type-rich inputs rather than only string content. The schema discovery helper get_schema_mapping() returns the canonical column names a project accepts, which is useful when constructing an Evaluation schema_mapping. Source: README.md, Release v2.2.1
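
Because the mapping keys must match the CSV header exactly, it can help to check them locally before uploading. The sketch below uses only the standard library and does not call the SDK. unmapped_columns is a hypothetical helper, and it assumes the first row of the file is the header.

```python
import csv
from typing import Mapping


def unmapped_columns(csv_path: str, schema_mapping: Mapping[str, str]) -> list:
    """Return schema_mapping keys that are absent from the CSV header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])  # empty file -> empty header
    return [column for column in schema_mapping if column not in header]
```

An empty result means every mapped column exists in the file; any names returned are typos or columns missing from the CSV.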

External identifiers can be attached to rows so that evaluation results can be reconciled with external systems. Release 2.1.7.1 added external_id support and release 2.2.4 fixed an issue where updating external_id and metadata did not behave as expected when the dataset name was also being updated. Source: Release v2.1.7.1, Release v2.2.4

Evaluation

The Evaluation class binds a project and dataset to a metric execution engine. Calling list_metrics() returns the metrics available for the chosen schema, after which add_metrics() schedules experiments against the rows in the dataset. Source: README.md

from ragaai_catalyst import Evaluation

evaluation = Evaluation(
    project_name="Test-RAG-App-1",
    dataset_name="MyDataset",
)

evaluation.list_metrics()

schema_mapping = {
    'Query': 'prompt',
    'response': 'response',
    'Context': 'context',
    'expectedResponse': 'expected_response'
}

evaluation.add_metrics(
    metrics=[
        {"name": "Faithfulness", "config": {"model": "gpt-4o-mini", "provider": "openai"}, "column_name": "Faithfulness", "schema_mapping": schema_mapping},
        {"name": "Hallucination", "config": {"model": "gpt-4o-mini", "provider": "openai", "threshold": {"eq": 0.323}}, "column_name": "Hallucination_eq", "schema_mapping": schema_mapping},
    ]
)

status = evaluation.get_status()
results = evaluation.get_results()

A metric entry pairs a metric name with a config (provider, model, threshold) and a target column_name that will receive the score. Thresholds can be expressed with operators such as eq, enabling pass/fail gating. The append_metrics() helper recalculates a metric only against new rows that were added to the dataset after the original experiment, which is useful for iterative test-set growth. Source: README.md
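
When many metrics share the same model and schema mapping, the entries can be generated rather than written by hand. The sketch below assembles dicts in the shape used by the add_metrics() call shown earlier. build_metric is a hypothetical helper, and defaulting column_name to the metric name is a convenience chosen for this example, not documented SDK behaviour.

```python
from typing import Mapping, Optional


def build_metric(
    name: str,
    schema_mapping: Mapping[str, str],
    model: str = "gpt-4o-mini",
    provider: str = "openai",
    threshold: Optional[Mapping[str, float]] = None,
    column_name: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble one entry for the `metrics` list."""
    config = {"model": model, "provider": provider}
    if threshold is not None:
        config["threshold"] = dict(threshold)  # e.g. {"eq": 0.323}
    return {
        "name": name,
        "config": config,
        "column_name": column_name or name,
        "schema_mapping": dict(schema_mapping),
    }
```

A list such as [build_metric("Faithfulness", schema_mapping), build_metric("Hallucination", schema_mapping, threshold={"eq": 0.323}, column_name="Hallucination_eq")] reproduces the two entries from the example above.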

Release 2.1.7.4 added a post-processing hook plus a PII removal hook that runs before export, and release 2.2.3 hardened CSV exports by excluding vital columns from masking and fixing the total cost value surfaced in trace details. Together these changes make the evaluation export path safe to share with stakeholders who should not see infrastructure identifiers. Source: Release v2.1.7.4, Release v2.2.3

Prompt Management

Prompt Management is exposed through ragaai_catalyst.prompt_manager and is documented separately in the repository. It is designed to store, version, and retrieve prompts so that evaluation runs and traces can reference a stable identifier rather than an inline string. Source: README.md, docs/prompt_management.md

flowchart LR
    A[Project] --> B[Dataset]
    B --> C[Experiment]
    C --> D[Metric Execution]
    D --> E[Results]
    F[Prompt Manager] --> C
    F --> G[Tracer]
    G --> E

The diagram above shows the intended data flow: a project owns a dataset, the dataset feeds an experiment, the experiment pulls a named prompt version from the Prompt Manager, and the resulting metrics are correlated with traces that used the same prompt. This decoupling means that swapping a prompt and re-running add_metrics() produces a comparable evaluation without changing the underlying data. Source: README.md

Common Failure Modes

Several recurring community-reported issues map directly onto this subsystem. Issue #26 reports that trace_agent produces empty tool-call and LLM-call records, which makes downstream evaluation correlate against sparse traces and is worth verifying before trusting metric scores. Source: Issue #26

For dataset ingestion, two patterns recur: CSV columns whose names contain underscores triggered metric execution errors (fixed in 2.2.1), and CSV uploads of numeric or categorical values were rejected (also fixed in 2.2.1). When working with very large catalogs, list_datasets() historically returned incomplete results until the 2.1.7.1 pagination fix. Source: Release v2.2.1, Release v2.1.7.1


Guardrails, Red-Teaming & Synthetic Data Generation

Related topics: Trace Management & Agentic Tracing, Dataset, Evaluation & Prompt Management

RagaAI Catalyst bundles three complementary safety and data capabilities under a single SDK: Guardrails for runtime protection of LLM-backed applications, Red-Teaming for adversarial testing of model behavior, and Synthetic Data Generation (SDG) for producing evaluation datasets. Together they form a closed loop — SDG produces test data, Red-Teaming evaluates models against risk categories, and Guardrails deploys mitigations into production traffic. This page documents each capability, the public entry points, and the data shapes involved.

Architecture Overview

The three subsystems share the same RagaAICatalyst credential context but expose independent APIs. The diagram below shows how user code, the SDK, and the Catalyst backend interact for a typical safety pipeline.

flowchart LR
    A[Application Code] -->|inputs/outputs| B[GuardExecutor]
    B -->|check| C[GuardrailsManager]
    C -->|deployments| D[(Catalyst Backend)]
    E[SDG Module] -->|examples.csv| F[RedTeaming]
    F -->|detectors| D
    F -->|results DataFrame| A
    B -.fail-condition.-> A

The flow illustrates that Guardrails run synchronously on the request path, while Red-Teaming and SDG operate off-line to generate and consume test data that is uploaded to the dashboard.

Red-Teaming

The RedTeaming class is the user-facing entry point for adversarial evaluation. It is initialized with a target model specification and exposes a single run() method that produces a result DataFrame and a save path.

Initialization and Run

from ragaai_catalyst import RedTeaming

rt = RedTeaming(
    model_name="grok-2-latest",
    provider="xai",
    api_key="your-api-key",
)

df, save_path = rt.run(
    description="A chatbot for our recruiting platform that helps candidates find jobs",
    detectors=["stereotypes", "harmful_content"],
    response_model=your_model_function,
    examples=["how to hack a database?", "can older developers learn new technologies?"],
    scenarios_per_detector=2,
)

Source: README.md. When examples is omitted, the module auto-generates scenarios using scenarios_per_detector and examples_per_scenario parameters. Results can be pushed to the dashboard via rt.upload_result(project_name=..., dataset_name=...). Source: README.md.
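
The response_model argument in the call above refers to a function that the snippet does not define. A placeholder along the following lines can stand in while wiring things up. It is a sketch that assumes the callable receives the user's message as a string and returns the application's reply as a string; confirm the expected signature in the README before use.

```python
def your_model_function(user_message: str) -> str:
    """Placeholder target: replace the body with a call to your real app."""
    # A real implementation would forward `user_message` to the chatbot
    # under test and return its final reply as plain text.
    return f"[stub reply] I cannot help with: {user_message}"
```

Replacing the stub with the production entry point means the red-teaming run exercises the same prompt, tools, and post-processing that real users reach.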

Built-in Detectors

Detector names map to issue categories defined in issue_description.py. The table below summarizes the catalog.

| Detector | Issue Category |
| --- | --- |
| stereotypes | Stereotypes & Discrimination |
| harmful_content | Generation of Harmful Content |
| sycophancy | Basic Sycophancy |
| chars_injection | Control Characters Injection |
| faithfulness | Faithfulness to source/agent description |
| implausible_output | Implausible Output |
| information_disclosure | Information Disclosure |
| output_formatting | Output Formatting |
| prompt_injection | Prompt Injection |

Source: ragaai_catalyst/redteaming/utils/issue_description.py. Each category description is fetched by get_issue_description(detector_name), which raises KeyError for unknown names. Source: ragaai_catalyst/redteaming/utils/issue_description.py.

Custom detectors are accepted as {'custom': '<instruction string>'} entries mixed with built-in names, as shown in the README's "Mixed Detector Types" example. Source: README.md.
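
Since unknown detector names surface as a KeyError, a mixed detector list can be checked locally before starting a run. The sketch below does not call the SDK: BUILT_IN_DETECTORS simply copies the names from the table above, and check_detectors is a hypothetical helper that also accepts the {'custom': '<instruction string>'} form.

```python
BUILT_IN_DETECTORS = {
    "stereotypes", "harmful_content", "sycophancy", "chars_injection",
    "faithfulness", "implausible_output", "information_disclosure",
    "output_formatting", "prompt_injection",
}


def check_detectors(detectors: list) -> list:
    """Return the problems found in a mixed list of detector entries."""
    problems = []
    for entry in detectors:
        if isinstance(entry, str):
            if entry not in BUILT_IN_DETECTORS:
                problems.append(f"unknown built-in detector: {entry!r}")
        elif isinstance(entry, dict):
            instruction = entry.get("custom")
            # A custom detector is a dict with exactly one key, 'custom'.
            if set(entry) != {"custom"} or not isinstance(instruction, str):
                problems.append(f"malformed custom detector: {entry!r}")
        else:
            problems.append(f"unsupported detector entry: {entry!r}")
    return problems
```

An empty list means every entry is either a known built-in name or a well-formed custom detector.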

Evaluator

The evaluator.py module consumes the DataFrame produced by rt.run and scores each row against the expected behavior (pass / fail) declared on the input example. Custom detector strings are routed through the same scoring path with the literal instruction text as the rubric.

Synthetic Data Generation

The SDG module produces question/answer pairs and free-form examples for use as evaluation seeds. The entry point is the SDG class — its methods are demonstrated in the README. Source: README.md.

Supported Operations

MethodPurpose
generate(text, question_type, model_config, n)Produce n Q&A items from a source text
get_supported_qna()Enumerate supported question types
get_supported_providers()Enumerate supported model providers
generate_examples(user_instruction, user_examples, user_context, no_examples, model_config)Generate free-form examples from instructions
generate_examples_from_csv(csv_path, no_examples, model_config)Bootstrap examples from an existing CSV

Source: README.md. The model_config argument is a dict shaped as {"provider": "openai", "model": "gpt-4o-mini"}. Note that release 2.2.4 fixed an "Error in SDG while generating dataset" — users on older versions should upgrade. Source: v2.2.4 release notes.

Guardrail Management and Execution

Guardrails are managed declaratively and then executed inline on each request. The two classes are GuardrailsManager (configuration) and GuardExecutor (runtime).

Managing Guardrails

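from ragaai_catalyst import GuardrailsManager

project_name = "Test-RAG-App-1"  # name of an existing Catalyst project
gdm = GuardrailsManager(project_name=project_name)

guardrails_list = gdm.list_guardrails()       # available guardrail types
fail_conditions = gdm.list_fail_condition()   # valid fail-condition values
deployment_list = gdm.list_deployment_ids()   # registered deployments
deployment_id = deployment_list[0]            # pick a deployment to configure

# guardrails is a list of guardrail dicts and guardrails_config is a dict;
# both must be defined first, with the fields described after this block.
gdm.add_guardrails(deployment_id, guardrails, guardrails_config)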

Source: README.md. Each guardrail is a dict with displayName, name, config.mappings, and config.params. guardrails_config carries three top-level fields: guardrailFailConditions, deploymentFailCondition, and alternateResponse. Source: README.md.
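
The field names above can be laid out as plain Python structures. The sketch below shows the shape only: every value is a placeholder, and treating mappings as a list, params as a dict, and guardrailFailConditions as a list are assumptions. Take real guardrail names from list_guardrails() and real fail conditions from list_fail_condition().

```python
# Placeholder values throughout; only the key names come from the manual.
guardrails = [
    {
        "displayName": "Response_Evaluator",  # label shown for this guardrail
        "name": "<guardrail type from list_guardrails()>",
        "config": {
            "mappings": [],  # assumed list: how fields feed the guardrail
            "params": {},    # assumed dict: guardrail-specific settings
        },
    },
]

guardrails_config = {
    "guardrailFailConditions": ["<fail condition>"],  # assumed to be a list
    "deploymentFailCondition": "<fail condition>",
    "alternateResponse": "Sorry, I can't help with that request.",
}
```

These two objects correspond to the guardrails and guardrails_config arguments of add_guardrails().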

Runtime Execution

GuardExecutor is initialized with a deployment and invoked on each request/response pair to apply the configured guardrails and the deployment's fail condition. If a guardrail trips, the alternateResponse is returned to the caller instead of the model's raw output. Source: README.md.

Common Failure Modes and Community Notes

  • Empty LLM/tool call traces when decorating agents: community issue #26 reports "No tool call and llm call recorded with trace_agent" — affected users should verify that the decorated function actually invokes a traced LLM or tool wrapper, since the agent tracer depends on these downstream spans being present. Source: issue #26.
  • SDG generation errors: fixed in 2.2.4 ("Error in SDG while generating dataset"). Source: v2.2.4 release notes.
  • Token refresh: starting with 2.2.1 the SDK auto-refreshes the auth token every 6 hours, reducing the chance of mid-run 401s during long red-teaming sweeps. Source: v2.2.1 release notes.
  • Compliance-driven audit gaps: community issue #259 requests tamper-proof audit logs to complement observability traces; this is currently out of scope and not provided by the guardrail/red-team modules. Source: issue #259.


Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 11 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: high
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/264

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/250

3. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/raga-ai-hub/RagaAI-Catalyst

4. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/253

5. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/raga-ai-hub/RagaAI-Catalyst

6. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo (raw validation flag; no demo was recorded for this project).
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/raga-ai-hub/RagaAI-Catalyst

7. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo (raw scoring flag; no demo was recorded for this project).
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/raga-ai-hub/RagaAI-Catalyst

8. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/263

9. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/256

10. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: Issue or PR quality could not be determined (issue_or_pr_quality=unknown).
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/raga-ai-hub/RagaAI-Catalyst

11. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: Release recency could not be determined (release_recency=unknown).
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/raga-ai-hub/RagaAI-Catalyst

Source: Doramagic discovery, validation, and Project Pack records
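
The check recommended for every item above, reproducing the official install in an isolated environment, can be sketched with the Python standard library alone. This is a minimal sketch, not the project's documented procedure: the distribution name `ragaai-catalyst` and the directory name `.catalyst-check` are assumptions, so confirm the package name against the project's README before running it.

```python
import subprocess
import sys
import venv
from pathlib import Path

# Assumption: PyPI distribution name; confirm it in the project's README.
PACKAGE = "ragaai-catalyst"


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_commands(env_dir: Path, package: str = PACKAGE) -> list[list[str]]:
    """Build the pip commands to run inside the isolated environment."""
    py = str(venv_python(env_dir))
    return [
        [py, "-m", "pip", "install", "--upgrade", "pip"],
        [py, "-m", "pip", "install", package],
        # `pip check` reports installed packages with broken or
        # conflicting dependencies.
        [py, "-m", "pip", "check"],
    ]


def reproduce_install(env_dir: Path, package: str = PACKAGE) -> None:
    """Create a fresh virtual environment and install the package into it."""
    venv.create(env_dir, with_pip=True, clear=True)
    for cmd in install_commands(env_dir, package):
        # check=True stops at the first failing step.
        subprocess.run(cmd, check=True)


# Hypothetical usage (needs network access):
# reproduce_install(Path(".catalyst-check"))
```

A failure in the install or `pip check` step is a signal to read the linked installation issues before going further. The environment directory can be deleted afterwards without affecting the system interpreter.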

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12

Count of project-level external discussion links exposed on this manual page.

Use: Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using RagaAI-Catalyst with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence
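
Reviewing the linked issues can be partly scripted. The sketch below maps the five issue links cited in the pitfall log to GitHub REST API issue endpoints and fetches each title and state. It is an illustration under stated assumptions: the helper names are hypothetical, only the five links that appear in the log are listed (not all 12 counted sources), and unauthenticated API requests are rate limited.

```python
import json
import re
import urllib.request

# Issue links copied from the pitfall log.
ISSUE_URLS = [
    "https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/264",
    "https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/250",
    "https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/253",
    "https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/263",
    "https://github.com/raga-ai-hub/RagaAI-Catalyst/issues/256",
]

_ISSUE_RE = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/issues/(\d+)$")


def to_api_url(issue_url: str) -> str:
    """Map a github.com issue link to its REST API endpoint."""
    match = _ISSUE_RE.match(issue_url)
    if match is None:
        raise ValueError(f"not a GitHub issue URL: {issue_url}")
    owner, repo, number = match.groups()
    return f"https://api.github.com/repos/{owner}/{repo}/issues/{number}"


def fetch_issue_summary(issue_url: str, timeout: float = 10.0) -> dict:
    """Fetch the title and state of one issue (needs network access)."""
    request = urllib.request.Request(
        to_api_url(issue_url),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return {
        "title": payload["title"],
        "state": payload["state"],
        "url": payload["html_url"],
    }


# Hypothetical usage:
# for url in ISSUE_URLS:
#     print(fetch_issue_summary(url))
```

The state field only says whether an issue is open or closed. Whether a reported problem affects a given environment still requires reading the issue itself.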