Classification Sentinel

What it does

Classifies the files in a Kiteworks folder by sensitivity (public, internal, confidential, restricted) using a local LLM combined with deterministic entity detection. Results are metadata only: file content never leaves the platform.
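
To illustrate what "deterministic entity detection" can look like, here is a minimal sketch that counts email addresses and Luhn-valid card numbers in a text. The patterns and the names EMAIL_RE, CARD_RE, luhn_valid, and count_entities are assumptions made for this example; the agent's actual detectors are not documented on this page.

```python
import re

# Hypothetical patterns for illustration only; not the product's real detectors.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def count_entities(text: str) -> dict:
    """Count emails and Luhn-valid card-like numbers; returns counts only."""
    emails = EMAIL_RE.findall(text)
    cards = 0
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            cards += 1
    return {"email": len(emails), "credit_card": cards}
```

Because the same input always yields the same counts, a check of this kind can act as a fixed baseline alongside a model's contextual judgment.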

local-classification

See it in action

You

In My Folder/Test, find which documents contain sensitive information and which kind

Classification Sentinel

Pinning My Folder/Test and classifying each supported file locally. Deterministic detection runs over the full text first; the local model adds context. You get labels, scores and counts only — the documents never leave the platform.

9 Files classified

2 Restricted

1 Needs review

File	Label	Score	Why
customer-list.csv	restricted	92	14 emails, 2 credit cards
forecast-q3.md	confidential	78	financial context
press-release.txt	public	8	approved marketing copy

Read-only. Files the platform could not read are listed as unscanned and are never counted as clean. Hard identifiers always raise the label to at least their floor, whatever the document text claims.
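
The floor and unscanned rules can be sketched roughly as follows. The mapping from entity type to floor label (ENTITY_FLOORS) and the function names are assumptions for illustration; this page does not document which identifiers map to which floor.

```python
# Sensitivity levels in ascending order, as listed on this page.
LEVELS = ["public", "internal", "confidential", "restricted"]

# Hypothetical floors: the minimum label implied by each detected entity type.
ENTITY_FLOORS = {"credit_card": "restricted", "email": "confidential"}


def apply_floor(model_label: str, entity_counts: dict) -> str:
    """Raise the model's label to the highest floor implied by detected entities.

    The label can only move up, never down.
    """
    rank = LEVELS.index(model_label)
    for entity, count in entity_counts.items():
        if count > 0 and entity in ENTITY_FLOORS:
            rank = max(rank, LEVELS.index(ENTITY_FLOORS[entity]))
    return LEVELS[rank]


def summarize(files: list) -> dict:
    """Count files per label; files without a label are tallied as unscanned."""
    counts = {level: 0 for level in LEVELS}
    counts["unscanned"] = 0
    for f in files:
        label = f.get("label")
        if label is None:
            counts["unscanned"] += 1  # unreadable is never treated as public
        else:
            counts[label] += 1
    return counts
```

For example, under these assumed floors, apply_floor("public", {"credit_card": 2}) returns "restricted" even if the model judged the text harmless.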

Illustrative example · not live tenant data

Relevant regulations and standards

Frameworks and mandates this agent helps you address. Not a certification — your own controls and assessment still apply.

GDPR PCI-DSS

What's new

latest 0.3.0

Published version history. The latest version is what new installs receive; your administrator chooses when to upgrade.

0.3.0 stable latest 2026-06-14
Multi-format Phase B: PDF and image classification via OCR. Minor release: a user-visible capability expansion with no input-schema break.
- Now classifies PDFs and images (.pdf, .png, .jpg, .jpeg, .tiff, .bmp, .webp) via on-prem OCR when the deployment opts in (KW_ENTITIES_OCR=1 plus a local Tesseract installation). The OCR text is read only by the in-deployment classifier; it never leaves the platform and never appears in agent output, audit, or logs. Without OCR enabled, PDFs and images are reported as unscanned (ocr_unavailable), never as clean.
- PDFs are classified by rendering each page and OCR'ing it (so both the text layer and any scanned/raster content are covered); there is no separate PDF text-layer parser in this version.
- The deterministic entity floors are recomputed inside the platform over the OCR text, exactly as for text/Office files — injected text in a scanned document cannot talk the classifier down past a hard identifier.
- Honest OCR coverage: a page below the OCR confidence floor leaves the file classified with an ocr_low_confidence coverage gap (partially scanned, complete:false); a file that yields no usable OCR text is reported unscanned (ocr_low_confidence), never clean. Encrypted, oversized, too-many-page, or unsupported-codec files are reported unscanned with their specific reason.
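
The coverage rules in this release can be sketched as a small decision function. This is an illustration under stated assumptions: the threshold value, the Coverage type, and the function names are hypothetical, and "no usable OCR text" is modeled here as no page reaching the confidence floor. Only KW_ENTITIES_OCR and the reason strings ocr_unavailable and ocr_low_confidence come from the notes above.

```python
from dataclasses import dataclass, field
from typing import List, Mapping, Optional

# Hypothetical threshold; the real OCR confidence floor is not documented here.
OCR_CONFIDENCE_FLOOR = 0.60


@dataclass
class Coverage:
    status: str                    # "classified" or "unscanned"
    complete: bool                 # False when any page was not usable
    reason: Optional[str] = None   # e.g. "ocr_unavailable", "ocr_low_confidence"
    gap_pages: List[int] = field(default_factory=list)


def ocr_opted_in(env: Mapping[str, str]) -> bool:
    """Check the opt-in variable only; Tesseract availability is not checked here."""
    return env.get("KW_ENTITIES_OCR") == "1"


def assess_coverage(page_confidences: List[float], enabled: bool) -> Coverage:
    """Map per-page OCR confidences to a coverage result."""
    if not enabled:
        return Coverage("unscanned", False, "ocr_unavailable")
    gaps = [i for i, c in enumerate(page_confidences, start=1)
            if c < OCR_CONFIDENCE_FLOOR]
    usable = len(page_confidences) - len(gaps)
    if usable == 0:
        # Nothing readable: report unscanned, never clean.
        return Coverage("unscanned", False, "ocr_low_confidence")
    if gaps:
        # Some pages readable: classify, but flag the coverage gap.
        return Coverage("classified", False, "ocr_low_confidence", gaps)
    return Coverage("classified", True)
```

The point of the design is that every path that loses information returns complete=False with a reason, so a partially read file cannot be mistaken for a fully scanned one.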

Install in Claude Code

claude plugin marketplace add \
  kiteworks/agent-marketplace
claude plugin install \
  kiteworks-classification-sentinel@kiteworks

Prerequisites

- Kiteworks Compliance Runtime: install with pip install kw-mcp-gateway (host >=1.0.0,<2.0.0). This agent calls into the runtime for deterministic, audited execution.
- Official Kiteworks MCP >=9.3.0 (used by the runtime): install and sign in from github.com/kiteworks/mcp.
- Python >=3.11.

Connect from Claude

Add this marketplace as a remote MCP connector in Claude Desktop or Claude Code and point it at <your-host>/mcp. One process serves each deployment, so no per-machine install is needed. The official Kiteworks MCP must already be configured.