OpenRay.ai
Announcement · May 18, 2026

Ekō: A Context-Aware Security Proxy for AI Applications in African Regulated Domains

A context-aware security proxy that strips and tokenizes credentials, PII, and regulated financial data before prompts leave your perimeter. It combines deterministic patterns with an optional fine-tuned SLM and is built with first-class support for African regulated domains.

Narcisse Egonu

Most AI integrations leak before they fail. A support agent pastes a customer ticket into ChatGPT to summarize it. An internal copilot threads KYC or compliance notes through a third-party model. Once that text lands in a provider's logs, the only honest answer to "where is it now?" is "we don't know."

Ekō is an open-source security proxy that sits between your applications and AI providers. It catches credentials, PII, and regulated identifiers in every prompt, then either redacts the values or replaces them with reversible session tokens — with first-class detection for African banking, fintech, and health workflows.

It runs as a single Go binary, ships as a Docker image, and is available today on GitHub.

Why this matters in practice

A single, ordinary support-ticket sentence exercises both layers of Ekō at once:

{
  "original_prompt": "Amina Yusuf can be reached at +234 802 111 3344. Her BVN is 22334455667 and she lives at No. 30 Ikeja Road, Lagos. Amina Obi is the head of IT and in charge of database administration",
  "sanitized_prompt": "Xxxxx Xxxxc can be reached at +234 000 000 0001. Her BVN is 00000000001 and she lives at Xx. 02 Xxxxx Xxxx, Xxxxx. Xxxxx Xxa is the head of IT and in charge of database administration"
}

A few things to notice.

  • The phone number and BVN are caught by the deterministic pattern layer. These are the easy wins for a regex pipeline that knows about Nigerian formats — and the kind of value a generic DLP often passes through unchanged.
  • The two names and the address are caught by the optional SLM sidecar. These are the entities a regex layer cannot reach reliably. The model also treats the two Aminas as distinct entities rather than collapsing them into a single token.
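
To make the first bullet concrete, here is a minimal sketch of what a deterministic pass could look like. The two regexes, the `placeholder` helper, and the per-pattern counter are illustrative assumptions chosen to reproduce the phone and BVN substitutions in the example above; they are not Ekō's shipped patterns or its actual masking scheme.

```python
import re

# Hypothetical, simplified patterns: (regex, prefix to keep verbatim).
# A production pattern would also use surrounding context to cut false positives.
PATTERNS = {
    "ng_mobile": (re.compile(r"\+234[ -]?\d{3}[ -]?\d{3}[ -]?\d{4}"), "+234"),
    "ng_bvn": (re.compile(r"\b\d{11}\b"), ""),
}


def placeholder(value: str, n: int, keep_prefix: str = "") -> str:
    """Swap each digit after keep_prefix for a zero-padded counter, keeping separators."""
    body = value[len(keep_prefix):]
    width = sum(ch.isdigit() for ch in body)
    digits = iter(str(n).zfill(width))
    return keep_prefix + "".join(next(digits) if ch.isdigit() else ch for ch in body)


def sanitize(text: str) -> str:
    """Apply each pattern in turn; a repeated value reuses its first placeholder."""
    seen: dict[tuple[str, str], str] = {}
    for name, (pattern, prefix) in PATTERNS.items():
        def repl(match: re.Match, name: str = name, prefix: str = prefix) -> str:
            key = (name, match.group(0))
            if key not in seen:
                # Counter is per pattern, so the first phone and the first BVN both get 1.
                n = sum(1 for k in seen if k[0] == name) + 1
                seen[key] = placeholder(match.group(0), n, prefix)
            return seen[key]
        text = pattern.sub(repl, text)
    return text


print(sanitize("Reach her at +234 802 111 3344. Her BVN is 22334455667."))
```

Names and addresses are deliberately out of scope here: no fixed pattern separates "Amina Yusuf" from any other pair of capitalized words, which is the gap the SLM sidecar is meant to fill.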

What is not redacted matters as much as what is. The clause "head of IT and in charge of database administration" survives untouched. The downstream model still gets a coherent ticket it can summarize or route; the regulated identifiers and the human identities are gone before they ever reach a provider's logs.

The gap the standard answers don't cover

Banks and fintechs are shipping customer-facing AI and connecting employees to internal models. Every prompt is a potential leak vector:

  • Private hosting ≠ sanitization. Azure OpenAI and AWS Bedrock keep the network private, but they will faithfully forward a BVN, a JWT, or a customer's address to the model exactly as written.
  • Manual review doesn't scale to per-prompt traffic. The volume that makes AI useful is the same volume that makes review impossible.
  • Training doesn't survive deadlines. A tired engineer pasting a stack trace at 6 pm is operating on muscle memory, not policy.
  • Homemade regex catches the easy 60% and silently misses the half-formatted account numbers, regional ID schemes, and names that actually matter.

Ekō is opinionated about this layer being infrastructure, not a feature inside each application: change one base URL, get a sanitized provider; or call the core API directly from any language.

Where Ekō fits

Ekō exposes two interfaces — a drop-in OpenAI proxy and a core /v1/sanitize API. What changes from one deployment to the next is who is calling it and where the prompt was generated.

An internal sanitization playground. A small web app backed by /v1/sanitize, where staff paste text before they paste it into ChatGPT, Claude, or Gemini. Not a structural control, but it turns "don't paste customer data into AI" from a policy line into a habit with a button.

Inside the AI products you ship. Banking assistants, KYC reviewers, fraud copilots, loan-eligibility chats. The team points its OpenAI SDK at Ekō and stops thinking about it. Sanitization becomes structural — a developer can't forget, because the egress path is Ekō.

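import os

from openai import OpenAI

# Point the SDK at Ekō instead of the provider; the rest of the client code is unchanged.
client = OpenAI(
  base_url="http://eko.internal:8080/v1",
  api_key=os.environ["OPENAI_API_KEY"],
)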

In front of self-hosted chat UIs. assistant-ui, LibreChat, Open WebUI, AnythingLLM, Chatbox — anywhere an internal "ChatGPT but inside our walls" has been stood up. One config line covers everyone behind it.

Slack, Teams, and browser middleware. A thin middleware intercepts @AI bot messages, calls /v1/sanitize, then forwards to the model. A browser extension hooking submit on chatgpt.com, claude.ai, and gemini.google.com offers the same defense without asking anyone to change behaviour.

Batch and pipeline jobs. Historical tickets going into a fine-tuning corpus. Airflow or Dagster jobs feeding PDFs to an LLM. Warehouse outbound syncs to analytics providers that run models on the data. Sanitize the corpus once on the way out.
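
A batch job of this kind might look like the sketch below. It is only an illustration: the request field `prompt`, the response field `sanitized_prompt`, and the record field `body` are assumptions, not the documented /v1/sanitize schema, so check the README before relying on them. The transport is injected as a plain callable so the pipeline step can be tested without a running Ekō instance.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator


def http_sanitizer(base_url: str) -> Callable[[str], str]:
    """Return a function that POSTs text to {base_url}/v1/sanitize.

    ASSUMPTION: the "prompt" request field and "sanitized_prompt" response
    field are guesses made for illustration.
    """
    def sanitize(text: str) -> str:
        request = urllib.request.Request(
            f"{base_url}/v1/sanitize",
            data=json.dumps({"prompt": text}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response)["sanitized_prompt"]
    return sanitize


def sanitize_corpus(
    records: Iterable[dict],
    sanitize: Callable[[str], str],
    field: str = "body",
) -> Iterator[dict]:
    """Yield a copy of each record with `field` sanitized and other fields untouched."""
    for record in records:
        yield {**record, field: sanitize(record[field])}
```

In an Airflow or Dagster task, `sanitize_corpus(tickets, http_sanitizer("http://eko.internal:8080"))` would sit between the extract step and whatever writes the fine-tuning corpus.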

Infrastructure-level egress control. Deploy Ekō as an Istio or Envoy sidecar, or as the only route to api.openai.com and friends, enforced by NetworkPolicy. No application can reach a public LLM without going through it.

Surfaces that matter for African deployments

Two surfaces are worth calling out because they generate the kind of unstructured, identifier-dense text the SLM sidecar was built for:

  • WhatsApp Business API bots. WhatsApp is a primary customer channel across Nigeria, Kenya, Ghana, and South Africa. Inbound messages are full of BVNs, NUBANs, M-Pesa codes, and ID numbers, often mid-sentence with no formatting cues. Sanitize the message body before it reaches the LLM that drafts the reply.
  • Call-centre and IVR transcripts. Customer service teams are increasingly piping call transcripts to LLMs for summarization, sentiment, or routing. Transcripts contain identifiers read aloud and rendered by ASR — exactly the high-volume, partly-mangled input where the regex layer drops recall and the contextual model earns its keep.

What's in the release

  • Single Go binary, built on Gin, published as openray/eko:main-latest on Docker Hub
  • Drop-in OpenAI proxy (/v1/chat/completions and /v1/responses) with streaming pass-through, tool-field preservation, and text sanitization for Responses API content blocks
  • Core sanitization API (POST /v1/sanitize) with redact-or-tokenize modes
  • Session-scoped reversible tokenization — same value gets the same token within a conversation
  • 24+ built-in detection patterns plus arbitrary user-defined YAML patterns
  • Optional SLM sidecar (Python/FastAPI) loading openray-ai/privacy-filter-nigeria for contextual PII
  • Encrypted Redis token vault with optional HashiCorp Vault Transit for key rotation
  • Production ops: Prometheus metrics, health and readiness endpoints, Docker Compose, Kubernetes deployment example

Detection coverage

Credentials and secrets: OpenAI, Anthropic, Google, and AWS API keys; JWTs and OAuth tokens; SSH private keys; database connection strings (Postgres, MongoDB, MySQL); environment variables and generic secrets.

Financial: Credit cards (Luhn-validated), IBAN, SWIFT/BIC, CVV, generic bank-account numbers.

African identity and finance: Nigerian BVN, NIN, NUBAN, mobile (+234). Kenyan M-Pesa transaction codes, mobile (+254). South African 13-digit national ID, mobile (+27). Ghanaian mobile (+233).

Generic PII: Email addresses, phone numbers (international formats).

Custom: Any regex declared in YAML, with per-pattern severity (BLOCK / WARN / LOG) and per-pattern action (redact or tokenize).
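
"Luhn-validated" in the Financial category means a candidate digit run is only treated as a card number if its checksum passes, which filters out most random 16-digit strings. The sketch below shows the standard Luhn check in Python; the function name and the 13 to 19 digit length bound are our choices for illustration, and Ekō itself implements its detectors in Go.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not 13 <= len(digits) <= 19:  # assumed plausible card-number lengths
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # well-known test number
print(luhn_valid("4111 1111 1111 1112"))  # same number with the last digit changed
```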

Tokenize, don't just redact

Many flows lose value the moment a string becomes [REDACTED]. A support agent summarizing a case still needs to refer to "that customer" consistently across turns; a downstream system may need to round-trip a value back to the original record.

Ekō's tokenizer assigns a deterministic, opaque token per detected value within a session, stored in an encrypted Redis vault with optional Vault Transit envelope encryption. The same BVN seen twice in the same conversation gets the same token; a downstream service holding the session key can resolve it back.
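
One way to picture this is the sketch below: an HMAC keyed per session yields a token that is stable within the session and opaque outside it, and a lookup table stands in for the vault. The class name, token format, and in-memory dictionary are assumptions for illustration; this is not Ekō's implementation, and it omits the encryption at rest described above.

```python
import hashlib
import hmac


class SessionVault:
    """In-memory stand-in for a session-scoped token vault (no encryption here)."""

    def __init__(self, session_key: bytes) -> None:
        self._key = session_key
        self._values: dict[str, str] = {}  # token -> original value

    def tokenize(self, kind: str, value: str) -> str:
        # The same value under the same session key always yields the same token.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        token = f"[{kind}_{digest[:12]}]"
        self._values[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only a holder of this session's vault can map a token back.
        return self._values[token]


vault = SessionVault(session_key=b"per-session secret")
first = vault.tokenize("BVN", "22334455667")
second = vault.tokenize("BVN", "22334455667")
print(first == second, vault.detokenize(first))
```

Because the key differs per session, the same BVN in two separate conversations maps to unrelated tokens, so tokens cannot be joined across sessions by anyone who only sees provider-side logs.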

Performance

Ekō is built to sit on the request path. Internal benchmarks on 4 vCPU / 8 GB show p95 latency under 5 ms for the core sanitization API and under 50 ms end-to-end through the proxy at 1,000 req/s, with a memory footprint around 50 MB. The repo includes the benchmark harness, baseline comparison, and memory-ceiling test so you can reproduce these numbers on your own hardware.

Research preview, not a compliance product

This is a v0.1 release. Treat it accordingly:

  • It is a data-minimization layer — not anonymization, not a compliance certification, and not the only control you should rely on for legal, regulatory, or irreversible decisions.
  • The OpenAI proxy is the only proxy implementation today. Anthropic and Google providers are on the roadmap; the core /v1/sanitize API works with any model in the meantime.
  • The SLM sidecar is recall-oriented and will over-redact in some cases. It is opt-in per request and falls back via a circuit breaker if unavailable.
  • Production deployment should pair Ekō with representative local evaluation, access controls, audit logging, and human review where a missed detection could cause harm.
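
The circuit-breaker fallback mentioned in the list above follows a common pattern, sketched here in Python. The thresholds, the class shape, and the choice of the pattern-only result as the fallback are assumptions for illustration rather than Ekō's actual configuration.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Skip a failing dependency for a cooling-off period after repeated errors."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    def call(self, primary: Callable, fallback: Callable, *args):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return fallback(*args)  # open: do not touch the sidecar at all
            self._opened_at = None  # half-open: allow one trial call
        try:
            result = primary(*args)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback(*args)
        self._failures = 0  # a success closes the breaker again
        return result
```

With `primary` as the SLM call and `fallback` as the regex-only path, an unavailable sidecar costs a few failed attempts and then nothing until the reset window passes, so request latency stays bounded while contextual detection is down.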

Feedback welcome

We are particularly interested in:

  • Real African-context prompts where Ekō missed a sensitive entity or flagged something benign
  • Local data formats not yet covered — additional jurisdictions, ID schemes, financial identifiers
  • Integration patterns from production deployments — what worked, what required workarounds
  • OCR-derived text from Nigerian identity documents and bank statements
  • Hard-negative cases for the contextual detector

Open an issue on the repository, or contribute a pattern via the workflow in CONTRIBUTING.md. Patterns are the most direct way to make Ekō more accurate for your jurisdiction.

Get started:

docker run -p 8080:8080 openray/eko:main-latest

Or read the full quickstart in the README.