What it does
Classifies the files in a Kiteworks folder by sensitivity (public, internal, confidential, restricted) using a local LLM combined with deterministic entity detection. Results are metadata-only: file content never leaves the platform.
See it in action
In My Folder/Test, find which documents contain sensitive information and which kind
Pinning My Folder/Test and classifying each supported file locally. Deterministic detection runs over the full text first; the local model adds context. You get labels, scores and counts only — the documents never leave the platform.
| File | Label | Score | Why |
|---|---|---|---|
| customer-list.csv | restricted | 92 | 14 emails, 2 credit cards |
| forecast-q3.md | confidential | 78 | financial context |
| press-release.txt | public | 8 | approved marketing copy |
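The interplay between deterministic detection and the local model can be pictured as "the model may raise a label, but never lower it below what the detected entities require". The sketch below is only an illustration of that idea, not the agent's actual code: the `ENTITY_FLOORS` mapping, the entity names, and the `classify` function are hypothetical, and the real floors are not documented on this page.

```python
# Sensitivity labels from least to most sensitive, as listed on this page.
LABELS = ["public", "internal", "confidential", "restricted"]

# Hypothetical floors: the minimum label each detected entity type forces.
# The real mapping used by the agent is not documented here.
ENTITY_FLOORS = {"email": "confidential", "credit_card": "restricted"}


def rank(label: str) -> int:
    """Position of a label in the sensitivity ordering."""
    return LABELS.index(label)


def classify(entity_counts: dict[str, int], model_label: str) -> str:
    """Return the stricter of the model's label and the deterministic floor."""
    floor = "public"
    for entity, count in entity_counts.items():
        if count > 0 and entity in ENTITY_FLOORS:
            candidate = ENTITY_FLOORS[entity]
            if rank(candidate) > rank(floor):
                floor = candidate
    # The model can only move the result up from the floor, never down.
    return max(model_label, floor, key=rank)
```

With these assumed floors, a file with 14 emails and 2 credit cards stays `restricted` even if the model suggests `internal`, which matches the first row of the table above.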
Relevant regulations and standards
Frameworks and mandates this agent helps you address. Not a certification — your own controls and assessment still apply.
What's new
Latest: 0.3.0. Published version history. The latest version is what new installs receive; your administrator chooses when to upgrade.
- 0.3.0 (stable, latest, 2026-06-14). Multi-format Phase B: PDF and image classification via OCR. Minor: a user-visible capability expansion with no input-schema break.
  - Now classifies PDFs and images (.pdf, .png, .jpg, .jpeg, .tiff, .bmp, .webp) via on-prem OCR, when the deployment opts in (KW_ENTITIES_OCR=1 plus a local Tesseract). The OCR text is read only by the in-deployment classifier; it never leaves the platform and never appears in agent output, audit, or logs. Without OCR enabled, PDFs and images are reported unscanned (ocr_unavailable), never clean.
  - PDFs are classified by rendering each page and OCR'ing it, so both the text layer and any scanned or raster content are covered; there is no separate PDF text-layer parser in this version.
  - The deterministic entity floors are recomputed inside the platform over the OCR text, exactly as for text and Office files. Injected text in a scanned document cannot talk the classifier down past a hard identifier.
  - Honest OCR coverage: a page below the OCR confidence floor leaves the file classified with an ocr_low_confidence coverage gap (partially scanned, complete:false); a file that yields no usable OCR text is reported unscanned (ocr_low_confidence), never clean. Encrypted, oversized, too-many-page, or unsupported-codec files are reported unscanned with their specific reason.
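The OCR coverage rules in the 0.3.0 notes can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the agent's implementation: the `Coverage` type, the `assess_coverage` function, and the `OCR_CONFIDENCE_FLOOR` value are hypothetical, and "no usable OCR text" is approximated here as "no page reaches the confidence floor".

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the real confidence floor is not documented here.
OCR_CONFIDENCE_FLOOR = 0.60


@dataclass
class Coverage:
    status: str                      # "classified" or "unscanned"
    complete: bool                   # False whenever any page was not covered
    reasons: list[str] = field(default_factory=list)


def assess_coverage(page_confidences: list[float], ocr_enabled: bool = True) -> Coverage:
    """Map per-page OCR confidences to a coverage outcome."""
    if not ocr_enabled:
        # Without OCR the file is reported unscanned, never clean.
        return Coverage("unscanned", False, ["ocr_unavailable"])
    usable = [c for c in page_confidences if c >= OCR_CONFIDENCE_FLOOR]
    if not usable:
        # Assumption: no page above the floor means no usable OCR text.
        return Coverage("unscanned", False, ["ocr_low_confidence"])
    if len(usable) < len(page_confidences):
        # Partially scanned: still classified, but with a coverage gap.
        return Coverage("classified", False, ["ocr_low_confidence"])
    return Coverage("classified", True)
```

The point of the design is that an unreadable file never produces a "clean" result: the function returns either a classification with an explicit gap or an `unscanned` status with a reason.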
Install in Claude Code
claude plugin marketplace add \
kiteworks/agent-marketplace
claude plugin install \
kiteworks-classification-sentinel@kiteworks
Prerequisites
- Kiteworks Compliance Runtime: install via pip install kw-mcp-gateway (host >=1.0.0,<2.0.0). This agent calls into the runtime for deterministic, audited execution.
- Official Kiteworks MCP >=9.3.0 (used by the runtime): install and sign in from github.com/kiteworks/mcp.
- Python >=3.11.
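A quick local check of the Python-side prerequisites might look like the sketch below. It assumes that the `>=1.0.0,<2.0.0` range applies to the installed `kw-mcp-gateway` distribution, which this page does not state explicitly; the helper names are hypothetical, and the version parser only handles plain dotted releases (no pre-release tags). It does not check the Kiteworks MCP version.

```python
import sys
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as "1.4.2" (no pre-release tags)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version: str, minimum: str, below: str) -> bool:
    """True when minimum <= version < below."""
    return parse_version(minimum) <= parse_version(version) < parse_version(below)


def check_prerequisites() -> list[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info < (3, 11):
        problems.append("Python >=3.11 is required")
    try:
        installed = metadata.version("kw-mcp-gateway")
    except metadata.PackageNotFoundError:
        problems.append("kw-mcp-gateway is not installed")
    else:
        # Assumption: the stated range applies to this distribution's version.
        if not in_range(installed, "1.0.0", "2.0.0"):
            problems.append(f"kw-mcp-gateway {installed} is outside >=1.0.0,<2.0.0")
    return problems
```

Calling `check_prerequisites()` before installing the plugin returns an empty list when both checks pass, or one message per unmet requirement.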
Connect from Claude
Add this marketplace as a remote MCP connector in Claude Desktop or Claude Code — point it at <your-host>/mcp. One process per deployment; no per-machine install. Requires the official Kiteworks MCP to be configured.