# presidio - Doramagic AI Context Pack

> Purpose: pre-work context for the user's host AI. This pack does not prove that the project has been installed, run, or validated.

## Project

- canonical_name: `microsoft/presidio`
- capability: An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
- expected_user_outcome: Detect, redact, mask, and anonymize sensitive data (PII) in text, images, and structured data, using NLP, pattern matching, and customizable pipelines.

## Operating Boundaries

- Do not claim that the project has been installed, run, called through an API, or used on local files unless separate evidence proves it.
- Project facts must come from repo evidence, Claim Graph, or explicit source references.
- When a capability is not verified, mark it as unverified instead of completing it as fact.
- publish_status: `publishable`
- blocking_gaps: none

---

## Doramagic Context Augmentation

The following sections strengthen the repository context for a host AI. Human Manual data is a reading route, and pitfall notes become operating constraints.

## Human Manual Outline

Usage rule: this is only a reading route and salience signal, not factual authority. Concrete claims must still return to repo evidence or Claim Graph.

Host AI hard rules:
- Do not treat page titles, section order, summaries, or importance values as factual project evidence.
- When explaining the Human Manual outline, state that it is only a reading route or salience signal.
- Capability, installation, compatibility, runtime state, and risk claims must cite repo evidence, source paths, or Claim Graph.

- **Presidio Overview & System Architecture**: importance `high`
  - source_paths: README.MD, docs/index.md, docs/design.md, docs/text_anonymization.md, docs/learn_presidio/concepts.md
- **Analyzer: PII Detection, NLP Engines & Recognizers**: importance `high`
  - source_paths: presidio-analyzer/presidio_analyzer/analyzer_engine.py, presidio-analyzer/presidio_analyzer/analyzer_engine_provider.py, presidio-analyzer/presidio_analyzer/recognizer_registry/recognizer_registry.py, presidio-analyzer/presidio_analyzer/recognizer_registry/recognizer_registry_provider.py, presidio-analyzer/presidio_analyzer/recognizer_registry/recognizers_loader_utils.py
- **Anonymization, Image Redaction & DICOM Processing**: importance `high`
  - source_paths: presidio-anonymizer/presidio_anonymizer/anonymizer_engine.py, presidio-anonymizer/presidio_anonymizer/deanonymize_engine.py, presidio-anonymizer/presidio_anonymizer/batch_anonymizer_engine.py, presidio-anonymizer/presidio_anonymizer/entities/conflict_resolution_strategy.py, presidio-anonymizer/presidio_anonymizer/operators/__init__.py
- **Structured Data, CLI, Deployment & Extensibility**: importance `medium`
  - source_paths: presidio-structured/presidio_structured/structured_engine.py, presidio-structured/presidio_structured/analysis_builder.py, presidio-structured/presidio_structured/config/structured_analysis.py, presidio-structured/presidio_structured/data/data_processors.py, presidio-structured/presidio_structured/data/data_reader.py

## Repo Inspection Evidence

- repo_clone_verified: true
- repo_inspection_verified: true
- repo_commit: `2901c7fcb316d8f8b78d0632e415d8eb49145f74`
- inspected_files: `README.MD`, `docker-compose.yml`, `pyproject.toml`, `docs/ahds_integration.md`, `docs/analyzer/adding_recognizers.md`, `docs/analyzer/analyzer_engine_provider.md`, `docs/analyzer/customizing_nlp_models.md`, `docs/analyzer/decision_process.md`, `docs/analyzer/developing_recognizers.md`, `docs/analyzer/filtering_by_country.md`, `docs/analyzer/index.md`, `docs/analyzer/languages-config.yml`, `docs/analyzer/languages.md`, `docs/analyzer/nlp_engines/gpu_usage.md`, `docs/analyzer/nlp_engines/spacy_stanza.md`, `docs/analyzer/nlp_engines/transformers.md`, `docs/analyzer/recognizer_registry_provider.md`, `docs/anonymizer/adding_operators.md`, `docs/anonymizer/index.md`, `docs/api/analyzer_python.md`

Host AI hard rules:
- Without repo_clone_verified=true, do not claim that the source code has been read.
- Without repo_inspection_verified=true, do not write README, docs, or package-file conclusions as facts.
- Without quick_start_verified=true, do not claim that the Quick Start path has run successfully.
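
The three hard rules above amount to gating each kind of claim on one verification flag. The following is a minimal sketch of that gating, not part of Doramagic or Presidio. The names `InspectionEvidence` and `allowed_claims` are hypothetical. Treating a flag that the pack does not list (here `quick_start_verified`) as `False` is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class InspectionEvidence:
    """Verification flags (hypothetical container, named for illustration)."""

    repo_clone_verified: bool = False
    repo_inspection_verified: bool = False
    # Assumption: a flag the pack does not list is treated as not verified.
    quick_start_verified: bool = False


def allowed_claims(evidence: InspectionEvidence) -> Dict[str, bool]:
    """Map each claim type to whether the hard rules permit making it."""
    return {
        "source_code_was_read": evidence.repo_clone_verified,
        "docs_conclusions_as_fact": evidence.repo_inspection_verified,
        "quick_start_ran_successfully": evidence.quick_start_verified,
    }


# Flags as stated in this pack; quick_start_verified is not listed there.
pack_evidence = InspectionEvidence(
    repo_clone_verified=True,
    repo_inspection_verified=True,
)
claims = allowed_claims(pack_evidence)
```

Under these assumptions, a host AI could state that the source was read and cite README or docs conclusions, but could not claim that the Quick Start path ran successfully.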

## Doramagic Pitfall Constraints

These rules come from Doramagic discovery, validation, or compilation findings. The host AI must treat them as operating constraints, not background notes.
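
Each constraint in this section carries the same hard boundary: it stays open until later evidence explicitly closes it. One way a host AI might track that state is sketched below. This is illustrative only; `PitfallConstraint` and its `closing_evidence` field are hypothetical names and not part of any Doramagic schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PitfallConstraint:
    """One pitfall constraint (hypothetical record, named for illustration)."""

    number: int
    category: str
    evidence: str
    # Hypothetical field: stays None until explicit closing evidence is recorded.
    closing_evidence: Optional[str] = None

    def may_present_as_solved(self) -> bool:
        """A pitfall may be presented as solved only once it has closing evidence."""
        return self.closing_evidence is not None


# Constraint 1 as listed in this section; no closing evidence is recorded.
constraint_1 = PitfallConstraint(
    number=1,
    category="runtime",
    evidence="https://github.com/microsoft/presidio/issues/1251",
)
```

With no closing evidence recorded, `constraint_1.may_present_as_solved()` returns `False`, matching the hard boundary stated for every constraint.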

### Constraint 1: Runtime risk requires verification

- Trigger: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/microsoft/presidio/issues/1251
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 2: Security or permission risk requires verification

- Trigger: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/microsoft/presidio/issues/2080
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 3: Capability evidence risk requires verification

- Trigger: Capability claims rest on the assumption that the README/documentation is current enough for a first validation pass.
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/microsoft/presidio
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 4: Runtime risk requires verification

- Trigger: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/microsoft/presidio/issues/2083
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 5: Maintenance risk requires verification

- Trigger: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/presidio
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 6: Security or permission risk requires verification

- Trigger: no_demo
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/microsoft/presidio
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 7: Security or permission risk requires verification

- Trigger: no_demo
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/microsoft/presidio
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 8: Security or permission risk requires verification

- Trigger: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/microsoft/presidio/issues/1882
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 9: Maintenance risk requires verification

- Trigger: issue_or_pr_quality=unknown
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/presidio
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.

### Constraint 10: Maintenance risk requires verification

- Trigger: release_recency=unknown
- Host AI rule: Reproduce the official install and quickstart path in an isolated environment.
- Why it matters: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/microsoft/presidio
- Hard boundary: Do not present this pitfall as solved, verified, or ignorable unless later evidence explicitly closes it.
