Doramagic Project Pack · Human Manual

pandas-ai

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

Overview, Installation, and Quickstart

Related topics: Code Execution, Sandbox, and Security Model, LLM Backends, Local Models, and Extension Ecosystem

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Development install

Continue reading this section for the full explanation and source context.

Section Optional extensions

Continue reading this section for the full explanation and source context.

Related topics: Code Execution, Sandbox, and Security Model, LLM Backends, Local Models, and Extension Ecosystem

Overview, Installation, and Quickstart

What is PandasAI

PandasAI is a Python library that lets users ask questions about their data in natural language. It targets two audiences: non-technical users who want to query datasets conversationally, and technical users who want to accelerate exploratory data analysis. The library is distributed as the core pandasai package on PyPI, with separate extension packages for additional LLM providers and vector stores. Source: README.md.

At a high level, the library works by sending the user's question together with a serialized representation of the dataframe to a Large Language Model (LLM), receiving generated Python code in response, and executing that code to produce a result. This flow is reflected in the prompt architecture: a BasePrompt renders Jinja2 templates, which are passed to an LLM that extends pandasai.llm.base.LLM. Source: pandasai/core/prompts/base.py:1-45, pandasai/llm/base.py:1-40.

The system message prompt class, for example, loads its template from disk through BasePrompt.template_path, which is how instructions are injected into the LLM context. Source: pandasai/core/prompts/generate_system_message.py:1-5.

Installation

PandasAI requires Python 3.8+ up to 3.11 at the time of v3.0.0. This constraint is enforced through a dependency on scipy==1.10.1, which itself caps Python at <3.12. Community requests to support Python 3.12 are tracked in issues #1850, #1787, and #1872. Source: README.md.

Install the core library and an LLM provider extension with either pip or poetry:

# pip
pip install pandasai
pip install pandasai-litellm

# poetry
poetry add pandasai
poetry add pandasai-litellm

The pandasai-litellm extension is a common choice because it routes requests through LiteLLM, supporting many providers with a single interface. Source: README.md.

Development install

Contributors are instructed to use Poetry (not pip or conda) and to install all extras plus dev dependencies:

poetry install --all-extras --with dev
pre-commit install

The project uses ruff for linting and pytest for tests. Source: CONTRIBUTING.md.

Optional extensions

ExtensionPurposeSource
pandasai-litellmMulti-provider LLM routingREADME.md
pandasai-openaiNative OpenAI / Azure OpenAIextensions/llms/openai/pandasai_openai/openai.py
pandasai-chromadbChromaDB vector store (EE)extensions/ee/vectorstores/chromadb/
pandasai-pineconePinecone vector store (EE)extensions/ee/vectorstores/pinecone/
pandasai-milvusMilvus vector store (EE)extensions/ee/vectorstores/milvus/
pandasai-qdrantQdrant vector store (EE)extensions/ee/vectorstores/qdrant/

Vector-store extensions fall under the Sinaptik GmbH Enterprise License and are intended for commercial use under that license. Source: extensions/ee/vectorstores/pinecone/README.md, extensions/ee/vectorstores/qdrant/README.md.

Quickstart

The minimal end-to-end example uses pandasai together with the LiteLLM extension. Source: README.md.

import pandasai as pai
from pandasai_litellm.litellm import LiteLLM

# 1. Configure the LLM
llm = LiteLLM(model="gpt-4.1-mini", api_key="YOUR_OPENAI_API_KEY")
pai.config.set("llm", llm)

# 2. Load a dataframe
df = pai.read_csv("employees.csv")

# 3. Ask a question
print(df.chat("Which employee has the highest salary?"))

Each .chat() call serializes the dataframe, builds prompts, calls the LLM, and executes the returned code. The default OpenAI extension supports a wide range of chat and completion model IDs, including gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, and gpt-4o-mini. Source: extensions/llms/openai/pandasai_openai/openai.py:1-40.

The BaseOpenAI class exposes standard inference parameters such as temperature, max_tokens, top_p, frequency_penalty, presence_penalty, and seed, all of which are forwarded to the underlying OpenAI client. Source: extensions/llms/openai/pandasai_openai/base.py:1-40.

Architecture and Data Flow

The following diagram summarizes the request path for a single chat() call:

flowchart LR
    A[User question] --> B[SmartDataframe / Agent]
    B --> C[Prompt Builder<br/>BasePrompt + Jinja2]
    C --> D[LLM<br/>LLM subclass]
    D --> E[Generated Python code]
    E --> F[Code Executor]
    F --> G[Result]
    B -.optional.-> H[Vector Store<br/>SemanticLayer / Memory]
    D <-. context .-> H
  • The prompt builder uses BasePrompt, which supports both inline template strings and template_path files loaded via Jinja2's FileSystemLoader. Source: pandasai/core/prompts/base.py:1-45.
  • The LLM is any subclass of pandasai.llm.base.LLM. The base class exposes is_pandasai_llm, type, and a _polish_code helper that strips leading python markers and stray backticks from generated snippets. Source: pandasai/llm/base.py:1-40.
  • The vector store is an optional component used to retrieve relevant documents or past question/answer pairs. VectorStore defines abstract methods such as add_docs, update_docs, delete_docs, get_relevant_docs, and get_relevant_qa_documents. Source: pandasai/vectorstores/vectorstore.py:1-90.

Common Failure Modes and Limitations

A few caveats from the v3.0.0 release and community discussions are worth noting during setup:

  • Python version pin. Running on Python 3.12 will fail to install scipy==1.10.1. Use 3.8–3.11 or wait for upstream support. Source: README.md, community issue #1872.
  • Code execution is not sandboxed by default. The default code executor runs LLM-generated code with full builtins available, so untrusted model output can lead to arbitrary code execution. Community issues #1893 and #1895 describe this risk; production deployments should add a sandbox or restrict input sources.
  • Pillow CVE. Older transitive dependencies on pillow ^10.1.0 carry an out-of-bounds CVE; community issue #1871 requests upgrading to 12.1.1.
  • Reasoning models. GPT-5 and other reasoning models are not yet supported out of the box. See community issue #1867.
  • Local models. Local LLM support (Ollama, LM Studio, Open WebUI) is requested frequently in community issues #187, #799, #1181, and #1888.

See Also

Source: https://github.com/sinaptik-ai/pandas-ai / Human Manual

Code Execution, Sandbox, and Security Model

Related topics: Overview, Installation, and Quickstart, LLM Backends, Local Models, and Extension Ecosystem

Section Related Pages

Continue reading this section for the full explanation and source context.

Related topics: Overview, Installation, and Quickstart, LLM Backends, Local Models, and Extension Ecosystem

Code Execution, Sandbox, and Security Model

Overview

PandasAI converts natural-language questions into Python (and sometimes SQL) code via a language model, then executes that code against the user's data. The path that LLM-generated code travels — from prompt construction, to LLM call, to code polishing, to exec — is therefore a security boundary. The repository exposes a Sandbox abstraction as the designated extension point for isolating execution, but the default code path runs generated code in-process with no sandbox attached. The community has flagged this surface as a significant risk vector (see issues #1893 and #1895), and understanding where isolation is — and is not — applied is essential for any production deployment.

The Sandbox Abstraction

The sandbox contract is defined in pandasai/sandbox/sandbox.py and re-exported from pandasai/sandbox/__init__.py. The base class declares four abstract methods that concrete implementations must provide:

MethodPurpose
start()Boot the isolated runtime (e.g. container, microVM)
stop()Tear the runtime down
execute(code, environment)Run generated code inside the sandbox with the supplied environment namespace
transfer_file(csv_data, filename)Move a CSV payload into the sandbox
_exec_code(code, environment)Internal worker that performs the actual execution

execute() lazily calls start() on first use, then delegates to _exec_code(). The base class also defines _extract_sql_queries_from_code(), a small ast.NodeVisitor that walks generated Python source looking for SELECT/WITH query string assignments and call arguments — useful for routing SQL fragments to a query engine rather than executing them via Python eval. Because every method other than execute() raises NotImplementedError, any subclass must implement the full lifecycle, and a missing sandbox means the agent's executor falls back to a non-isolated path.

The Code Generation and Prompt Pipeline

Generated code is shaped by the prompt templates before it ever reaches an executor. pandasai/core/prompts/base.py defines BasePrompt, which renders either an inline template string or a file-loaded Jinja2 template from a templates/ sibling directory, collapses runs of three or more newlines, and caches the resolved string in _resolved_prompt. Subclasses specialize the rendering surface:

Once the LLM responds, pandasai/llm/base.py runs the raw response through _polish_code(), which strips leading python/py markers, removes surrounding backtick fences, and trims non-code preamble. The polished string is what the executor ultimately receives.

flowchart LR
    A[BasePrompt render] --> B[LLM call]
    B --> C[Raw response]
    C --> D["_polish_code()"]
    D --> E{Sandbox configured?}
    E -- yes --> F["Sandbox.execute()"]
    E -- no --> G["In-process exec"]
    F --> H[Result]
    G --> H[Result]

LLM Integration and the Security Boundary

The LLM transport layer is itself relevant to the threat model because the prompt — and therefore any data, schema, or instruction the model sees — is fully under the caller's control until it leaves the boundary. extensions/llms/openai/pandasai_openai/base.py shows the two transport paths: completion() prepends a system prompt to a raw string and hits the legacy completions endpoint, while chat_completion() builds an OpenAI-style message list from Memory.to_openai_messages() and calls the chat endpoint. extensions/llms/openai/pandasai_openai/openai.py fixes the default model to gpt-4.1-mini, supports a broad set of gpt-4.1* chat models, and reads OPENAI_API_KEY, OPENAI_API_BASE, and OPENAI_PROXY from the environment. Critically, none of these layers sanitize the LLM's output before it is executed — _polish_code() only normalizes formatting.

Security Implications and Community Concerns

Because the default executor is a plain exec with no sandbox and no __builtins__ restriction, the system trusts the LLM's output completely. Two community issues document the resulting exposure:

  • Issue #1895 — "Default code executor runs LLM-generated code with full builtins (no sandbox by default) → RCE via indirect prompt injection." The reporter notes that the default namespace exposes pd, plt, and np and leaves __builtins__ unrestricted, so any prompt-injection payload that reaches the model can return arbitrary Python that runs in the host process.
  • Issue #1893 — "Code Injection in CodeExecutor.execute Allows Arbitrary Code Execution via LLM-Generated Code" in pandasai 3.0.0. The same pattern is reported against the v3.0.0 release line, indicating the exposure is not historical.

The architectural mitigation present in the codebase is the Sandbox class itself: a deployment can substitute a hardened Sandbox subclass (e.g. a container-based runner) and wire it into the agent so that execute() is invoked with the generated code and a deliberately minimal environment dictionary. Until such a sandbox is configured, the practical guidance from the source is that PandasAI should be treated as running LLM-generated code with full local privileges, and untrusted data sources should not be allowed to flow into the prompt without external filtering.

See Also

Source: https://github.com/sinaptik-ai/pandas-ai / Human Manual

LLM Backends, Local Models, and Extension Ecosystem

Related topics: Overview, Installation, and Quickstart, Code Execution, Sandbox, and Security Model, Agent Lifecycle, Prompts, and Semantic Layer

Section Related Pages

Continue reading this section for the full explanation and source context.

Section LLM Backend Extensions

Continue reading this section for the full explanation and source context.

Section Vector Store Extensions

Continue reading this section for the full explanation and source context.

Related topics: Overview, Installation, and Quickstart, Code Execution, Sandbox, and Security Model, Agent Lifecycle, Prompts, and Semantic Layer

LLM Backends, Local Models, and Extension Ecosystem

Overview

PandasAI ships with a pluggable LLM abstraction so that the same conversational dataframe interface can be driven by hosted providers, local inference servers, or enterprise vector stores. The LLM base class defines the contract every backend must implement, while individual backends live in optional extensions/ packages that can be installed independently. This design lets users switch providers without modifying their analytics code.

The base interface is intentionally minimal: an LLM must expose a type property, a call(instruction, context) method, and code-polishing helpers. The full set of pandasai/llm/__init__.py re-exports only the LLM symbol, indicating that the package is a framework for subclasses rather than a list of preconfigured clients. Source: pandasai/llm/__init__.py:1-4. Source: pandasai/llm/base.py:1-15.

Built-in LLM Base Class

The LLM class in pandasai/llm/base.py provides the contract that every backend must satisfy. The constructor stores an optional api_key and additional keyword arguments, while the is_pandasai_llm() method returns True so the agent loop can recognize first-party backends. The type property raises APIKeyNotFoundError if a subclass does not override it, enforcing that each backend declares an identifier. Source: pandasai/llm/base.py:25-65.

The _polish_code helper strips Markdown code fences, leading language tags, and stray backticks so the LLM-generated snippet can be fed directly to the code executor. The call() method is declared abstractmethod, requiring every concrete backend to translate a BasePrompt instruction into a string response. Source: pandasai/llm/base.py:67-110. Prompts themselves are Jinja2 templates rendered through the BasePrompt class in pandasai/core/prompts/base.py, which supports both inline template strings and external template_path files such as generate_system_message.tmpl. Source: pandasai/core/prompts/base.py:1-65. Source: pandasai/core/prompts/generate_system_message.py:1-7.

Extension Ecosystem

PandasAI organizes optional backends under a top-level extensions/ directory, split into two tiers:

TierLocationLicenseExamples
LLM backendsextensions/llms/<provider>/pandasai_<provider>/Open sourceopenai, litellm
Enterprise extensionsextensions/ee/<category>/pandasai_<vendor>/Sinaptik GmbH Enterprisepinecone, milvus, chromadb

LLM Backend Extensions

The OpenAI extension in extensions/llms/openai/pandasai_openai/openai.py declares a model default of gpt-4.1-mini and lists supported chat models including gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, and their dated snapshots. The constructor resolves the API token from the OPENAI_API_KEY environment variable, raises APIKeyNotFoundError when missing, and supports a custom api_base for OpenAI-compatible proxies. Source: extensions/llms/openai/pandasai_openai/openai.py:1-60.

The Azure OpenAI backend in extensions/llms/openai/pandasai_openai/azure_openai.py extends the OpenAI client with azure_endpoint, api_version, and deployment_name parameters, and validates each one with explicit APIKeyNotFoundError and MissingModelError exceptions. The shared base.py supplies completion() and chat_completion() helpers, default sampling parameters (temperature=0, max_tokens=1000, presence_penalty=0.6), and an http_client hook for custom transport configuration. Source: extensions/llms/openai/pandasai_openai/base.py:1-80). Source: extensions/llms/openai/pandasai_openai/azure_openai.py:1-70.

The LiteLLM wrapper in extensions/llms/litellm/pandasai_litellm/litellm.py is the community-favoured universal adapter. It accepts a model string plus arbitrary **kwargs that LiteLLM forwards to the underlying provider, and overrides call() to call litellm.completion directly. The README example shows the recommended usage:

from pandasai_litellm.litellm import LiteLLM
llm = LiteLLM(model="gpt-4.1-mini", api_key="YOUR_OPENAI_API_KEY")
pai.config.set({"llm": llm})

Source: extensions/llms/litellm/pandasai_litellm/litellm.py:1-50. Source: README.md:1-30.

Vector Store Extensions

PandasAI extends its semantic-layer caching with vector-store adapters under extensions/ee/vectorstores/. The abstract base in pandasai/vectorstores/vectorstore.py defines the contract: add_docs, update_docs, delete_question_and_answers, get_relevant_docs, and get_relevant_qa_documents. Each concrete backend must implement these methods or inherit the NotImplementedError defaults. Source: pandasai/vectorstores/vectorstore.py:1-80.

VendorPackageNotable Method
Pineconepandasai-pinecone_filter_docs_based_on_distance cosine threshold
Milvuspandasai-milvus_initiate_docs_collection with COSINE index params
ChromaDBpandasai-chromadb_filter_docs_based_on_distance over QueryResult

All three Enterprise extensions are licensed under the Sinaptik GmbH Enterprise License, as stated in the Pinecone README. Source: extensions/ee/vectorstores/pinecone/README.md:1-20. Source: extensions/ee/vectorstores/milvus/pandasai_milvus/milvus.py:1-60. Source: extensions/ee/vectorstores/pinecone/pandasai_pinecone/pinecone.py:1-30. Source: extensions/ee/vectorstores/chromadb/pandasai_chromadb/chroma.py:1-20.

Local Model Support and Community Demand

A large share of community engagement is driven by requests for self-hosted inference. Issue #187 (38 comments) calls for StarCoder/MPT support, #799 (15 comments) requests LM Studio, and #1181 requests Open WebUI compatibility. The historical LocalLLM import path (from pandasai.llm.local_llm import LocalLLM) raised ModuleNotFoundError in v3.0.0 as reported in issue #1888.

The recommended pattern for local models is therefore to point an OpenAI-compatible backend (LiteLLM or the base OpenAI extension) at a local server, or to use LiteLLM's broad provider coverage. The pai.config.set({"llm": llm}) call in pandasai/config.py is the single integration point regardless of which backend is chosen. Source: pandasai/config.py:1-30.

Configuration and Operational Notes

Two recurring operational themes appear in community discussions. First, Python 3.12 compatibility is blocked by an upper-bound dependency on scipy==1.10.1 and is tracked in issues #1850 and #1787. Second, the default code executor in v3.0.0 invokes LLM-generated code via exec with __builtins__ exposed, which has been flagged as a remote-code-execution risk in issues #1893 and #1895; users handling untrusted data should sandbox the executor or restrict __builtins__ explicitly.

flowchart LR
  User[User Prompt] --> Agent[Agent / SmartDataframe]
  Agent --> Config[pai.config]
  Config --> LLM[LLM Backend]
  LLM -->|completion| Provider[(OpenAI / Azure / LiteLLM / Local)]
  Provider --> Code[Generated Code]
  Code --> Executor[CodeExecutor]
  Executor -->|result| Agent
  Agent --> Vector[(Vector Store\nPinecone / Milvus / ChromaDB)]
  Agent --> Response[Answer + Chart]

See Also

  • SmartDataframe and Agent architecture
  • Prompt templates and code generation pipeline
  • Code sandboxing and security best practices
  • Contributing guide and pre-commit setup (CONTRIBUTING.md)

Source: https://github.com/sinaptik-ai/pandas-ai / Human Manual

Agent Lifecycle, Prompts, and Semantic Layer

Related topics: Overview, Installation, and Quickstart, Code Execution, Sandbox, and Security Model, LLM Backends, Local Models, and Extension Ecosystem

Section Related Pages

Continue reading this section for the full explanation and source context.

Related topics: Overview, Installation, and Quickstart, Code Execution, Sandbox, and Security Model, LLM Backends, Local Models, and Extension Ecosystem

Agent Lifecycle, Prompts, and Semantic Layer

PandasAI exposes a thin conversational layer on top of pandas through an Agent class that orchestrates prompt construction, LLM invocation, code execution, and (optionally) a semantic layer for retrieval-augmented few-shot prompting. This page describes how an Agent is instantiated, how prompts are rendered and sent to the LLM, and how the semantic-layer vector stores plug in to supply prior question/answer context.

1. Agent Entry Point and Lifecycle

The public Agent surface is intentionally narrow. pandasai/agent/__init__.py re-exports a single symbol:

from .base import Agent
__all__ = ["Agent"]

Source: pandasai/agent/__init__.py:1-3

Internally, the Agent keeps an AgentState that holds datasets, memory of past turns, the most recent generated code, and the configured LLM. Each call to chat() is expected to be a "clean start" turn, but community reports (#1855) document that agent.chat() sometimes fails to fully reset last_code_generated, so residual state can leak into the next prompt. When this happens, the LLM sees stale code mixed with the new user question and may produce an answer that depends on identifiers from the previous turn.

The LLM contract is defined abstractly in pandasai/llm/base.py. Every concrete LLM (OpenAI, Azure, LiteLLM, etc.) must implement:

  • call(instruction, context) – execute the prompt against the model.
  • type – a string identifier (e.g. "openai", "litellm", "azure-openai").
  • generate_code(instruction, context) – wraps call and extracts a runnable Python code block via _extract_code / _polish_code.

Source: pandasai/llm/base.py:96-115

The base class also exposes helpers that the Agent uses to assemble a turn: prepend_system_prompt(prompt, memory) and get_messages(memory), which read the conversation history from the Memory object before the request is dispatched.

2. Prompt Construction Pipeline

All prompts in pandas-ai inherit from BasePrompt defined in pandasai/core/prompts/base.py. A prompt is either an inline Jinja2 string (template) or a file loaded from the templates/ directory next to the module (template_path). The class resolves the template at construction time and caches the rendered output in _resolved_prompt, exposed via to_string() / __str__. A to_json() hook lets structured prompts (e.g. the SQL prompt or the error-correction prompt) serialise themselves for chat-style APIs.

Source: pandasai/core/prompts/base.py:13-58

The system message used to instruct the model that it must answer with Python code is built by GenerateSystemMessagePrompt, which simply loads generate_system_message.tmpl. This template is rendered with the agent description, the conversation memory, and any custom instructions.

Source: pandasai/core/prompts/generate_system_message.py:1-6

A specialised structured prompt, CorrectOutputTypeErrorPrompt, is rendered into JSON and serialised the conversation, datasets, system prompt, the failing code, the exception trace, and the expected output_type whenever the executor returns the wrong type. The LLM is then asked to produce a corrected snippet.

Source: pandasai/core/prompts/correct_output_type_error_prompt.py:1-28

The diagram below summarises the lifecycle from user input to executed code:

flowchart LR
    A[User query] --> B[Agent.chat]
    B --> C[Build system prompt<br/>GenerateSystemMessagePrompt]
    C --> D[Render instruction prompt<br/>BasePrompt.to_string]
    B --> M[Query semantic layer<br/>VectorStore.get_relevant_qa_documents]
    M --> D
    D --> E[LLM.call]
    E --> F[extract_code / polish_code]
    F --> G[CodeExecutor.execute]
    G -->|type error| H[CorrectOutputTypeErrorPrompt]
    H --> E
    G --> I[Result]

A known bug in this pipeline (#1853) is that agent.description is extracted on the Python side but never reaches the LLM because the corresponding Jinja template omits the System Prompt placeholder. A second bug (#1856) causes the SQL variant (generate_python_code_with_sql.tmpl) to skip the conversation-history block, so multi-turn SQL agents lose context.

3. LLM Backends and the Semantic Layer

PandasAI ships several concrete LLM implementations. The OpenAI family (extensions/llms/openai/pandasai_openai/) shares BaseOpenAI, which sets defaults for temperature, max_tokens, top_p, frequency_penalty, presence_penalty, and supports an injectable http_client and proxy. OpenAI validates the API key and the model name against a whitelist that includes gpt-4.1-mini, gpt-4.1-mini-2025-04-14, and gpt-3.5-turbo-instruct.

Source: extensions/llms/openai/pandasai_openai/base.py:18-43, extensions/llms/openai/pandasai_openai/openai.py:1-40

The semantic layer is built on top of an abstract VectorStore (pandasai/vectorstores/vectorstore.py) that defines a contract with two collections – documents and question/answer pairs – and a uniform set of methods:

MethodPurpose
add_docs / update_docsInsert or update free-form documents
add_question_answer / update_question_answerInsert or update few-shot Q/A examples
get_relevant_docs(question, k)Retrieve similar documents for a query
get_relevant_question_answers(question, k)Retrieve similar prior Q/A pairs
delete_docs / delete_question_and_answersRemove entries by ID

Source: pandasai/vectorstores/vectorstore.py:1-90

Concrete backends implement this contract. ChromaDBVectorStore queries two collections and post-filters by a similarity threshold; MilvusVectorStore creates explicit schemas with VARCHAR IDs and FLOAT_VECTOR embeddings indexed by COSINE distance; LanceDB and Pinecone follow the same dual-collection pattern.

Source: extensions/ee/vectorstores/chromadb/pandasai_chromadb/chroma.py:1-40, extensions/ee/vectorstores/milvus/pandasai_milvus/milvus.py:1-40

Community issue #1874 reports that the limit attribute on SemanticLayerSchema appears to have no effect on how many rows are included in the prompt; this is consistent with the observation that schema-level controls are not always wired through the rendering pipeline.

4. Known Failure Modes

Several recurring failure modes surface from the community and are reflected in the code:

  • Unsafe code execution (#1893, #1895) – the default executor runs LLM-generated code with full __builtins__, exposing the host to RCE through indirect prompt injection. Mitigations must be applied at the executor level, not via prompts.
  • Dead system-prompt placeholder (#1853)agent.description is dropped before the LLM call.
  • Missing conversation context in SQL prompt (#1856) – the SQL template does not render prior turns.
  • Stale state on chat() (#1855)last_code_generated is not cleared, contaminating the next prompt.
  • Python version constraints (#1850, #1872) – the package pins <3.12 because of scipy==1.10.1.
  • Local-model ergonomics (#187, #799, #1181, #1888) – repeated requests for first-class Ollama, LM Studio, and Open WebUI support; these depend on the LocalLLM shim and a working BaseOpenAI-style client.

See Also

  • SmartDataframe and SmartDatalake
  • Vector store extensions (ChromaDB, Milvus, LanceDB, Pinecone)
  • LLM backends (OpenAI, Azure OpenAI, LiteLLM, Bedrock)

Source: https://github.com/sinaptik-ai/pandas-ai / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

high Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

high Runtime risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

Doramagic Pitfall Log

Found 19 structured pitfall item(s), including 2 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: high
  • Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1872

2. Runtime risk: Runtime risk requires verification

  • Severity: high
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1896

3. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1868

4. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1853

5. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1856

6. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/sinaptik-ai/pandas-ai

7. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1888

8. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: packet_text.keyword_scan | https://github.com/sinaptik-ai/pandas-ai

9. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1874

10. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/sinaptik-ai/pandas-ai/issues/1855

11. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/sinaptik-ai/pandas-ai

12. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/sinaptik-ai/pandas-ai

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources 12

Count of project-level external discussion links exposed on this manual page.

Use Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Community Discussion Evidence

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using pandas-ai with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence