No cloud · No API key · No leash

Reads everything.
Beats the frontier.
Runs on your hardware.

No more context-window limits. No RAG. No lossy retrieval guessing. Awarity reads every token you throw at it and, paired with GPT-4.1, scores within a point of GPT-5.4 on our synthesis benchmark (90.1 vs. 90.9). It also runs the whole pipeline offline on open models and the hardware you already own. No API key. No cloud. Nothing phoned home.

Download free · Mac & Windows · Runs offline
See the benchmarks →
$0
To run, fully local
Runs on free, open models (Qwen) on your own Mac. No API fees, no per-query cost — ever.
0
Keys, sign-ups, telemetry
It works the second you install it and phones home to no one. Add a cloud key only if you actually want one.
100%
On your machine
Fully offline and air-gap capable. Your contracts, financials, and patient data never leave your computer.
400M+
Tokens read, zero sampled
No context ceiling, no RAG roulette. Reads every token, every time — while retrieval-based tools guess and miss.
100% on your machine

Send nothing.
Runs entirely on your Mac.

Awarity now runs end-to-end on open AI models on your own computer. Install the app and it reads, maps, and answers across your entire document set — without ever touching the cloud. No API key. No uploads. No per-query fees. The most private way to put AI on the documents you can't send anywhere.

Nothing leaves your computer

Every document and every query stays local — by default, not as an enterprise add-on. Awarity has no server to send your data to.

No API key, no bill

Works the moment you install it, on free open models. Add an OpenAI, Anthropic, or Gemini key only if you specifically want a cloud model.

Air-gap ready

Runs with no internet connection at all. Built for legal, healthcare, finance, and government data that can never leave the building.

Download free — runs offline on Mac →
The problem

Three things conventional AI struggles with

01

Context window limits

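Hosted LLMs such as Claude or ChatGPT have a fixed context window, typically from a few hundred thousand up to about 1 million tokens depending on the model. Drop in a large dataset and you're already over the limit: the input gets truncated or refused, and answer quality suffers. Awarity has been tested beyond 400 million tokens.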

02

RAG is lossy by design

Retrieval-Augmented Generation answers only from the chunks it chooses to retrieve, and that choice can be wrong. Embeddings compress meaning, vector search misses edge cases, and the answer you get is only as good as what was fetched. Awarity reads everything, every time.

03

Data leaves your hands

Most AI tools require uploading documents to a third-party cloud. Awarity runs on-prem or fully offline. Your contracts, financials, and patient data never leave your environment.

Benchmark results

Equal or better quality. Radically lower cost.

90.1
GPT-4.1 + ECW synthesis score
Nearly matches GPT-5.4 alone (90.9) at 92% lower cost. Awarity's Elastic Context Window (ECW) lifts a cheaper model to near-flagship performance.
$0.31
Per synthesis case with ECW
GPT-5.4 alone costs $3.67 for the same task. That makes ECW about 92% cheaper at near-identical quality (90.1 vs. 90.9).
New
~$0.04
With an on-prem map model
Run the map phase on Qwen or Llama locally and it incurs no API fees. Only the final infer call is billed, which is where the ~$0.04 comes from. Trade-off: slower than cloud.
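The savings figures above are simple ratios, and a quick check shows how they follow from the quoted per-case costs. This is only a sketch: the function name and constants are illustrative, the dollar amounts are copied from the cards above, and the on-prem result is derived here from the approximate ~$0.04 figure rather than quoted from the benchmark.

```python
def savings_percent(baseline_cost: float, new_cost: float) -> float:
    """Return how much cheaper new_cost is than baseline_cost, in percent."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (1 - new_cost / baseline_cost) * 100


# Per-synthesis-case costs quoted on this page (US dollars).
GPT_5_4_ALONE = 3.67
ECW_CLOUD = 0.31
ECW_LOCAL_MAP = 0.04  # approximate; the page gives it as "~$0.04"

print(round(savings_percent(GPT_5_4_ALONE, ECW_CLOUD)))      # 92
print(round(savings_percent(GPT_5_4_ALONE, ECW_LOCAL_MAP)))  # 99
```

The first line reproduces the 92% figure. The second is a derived estimate only, since the local-map cost is itself approximate.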
Full benchmark results — including on-prem model testing
Our solution

How the Elastic Context Window (ECW) works

Step 01 — Ingest

Build your catalog

Drop in any documents — PDFs, Word files, spreadsheets, CSVs. Awarity chunks and indexes them into a local catalog. No cloud upload required.
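To make the ingest step concrete, here is a minimal sketch of chunking documents into a catalog. It is not Awarity's implementation: the function names, the word-count chunking rule, and the catalog record shape are all assumptions chosen for illustration, and real ingest would first extract text from PDFs, Word files, and spreadsheets.

```python
def chunk_words(text: str, max_words: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def build_catalog(documents: dict[str, str], max_words: int = 200) -> list[dict]:
    """Turn {filename: text} into a flat list of chunk records (hypothetical shape)."""
    catalog = []
    for source, text in documents.items():
        for index, chunk in enumerate(chunk_words(text, max_words)):
            catalog.append({"source": source, "index": index, "text": chunk})
    return catalog


catalog = build_catalog({"lease.txt": "one two three four five"}, max_words=2)
for record in catalog:
    print(record["source"], record["index"], record["text"])
```

Every word of every document lands in exactly one chunk, so nothing is sampled or dropped at this stage.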

Step 02 — Map

Every chunk, in parallel

ECW sends every chunk through a fast model in parallel, running either locally on your machine or in the cloud. Each call is small enough to stay under the model's context limit, so no document is truncated. The notes from every chunk are collected for the next step.

Step 03 — Infer

One coherent answer

The notes, not the raw documents, are synthesized by your chosen model, local or cloud. Because the infer step sees only the notes, its context stays small even as the catalog grows. That is why ECW runs free on your own machine, or 68–92% cheaper in the cloud.
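The map and infer steps described above follow a map-then-synthesize pattern, which can be sketched in a few lines. This is an illustration under stated assumptions, not Awarity's code: map_chunks, infer, and the two toy "models" are hypothetical stand-ins (plain keyword matching instead of an LLM), so the example runs with no model installed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A "model" here is any function (question, text) -> text. Hypothetical interface.
Model = Callable[[str, str], str]


def map_chunks(chunks: list[str], question: str, map_model: Model,
               max_workers: int = 4) -> list[str]:
    """Map step: run the map model over every chunk in parallel, one note per chunk."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        return list(pool.map(lambda chunk: map_model(question, chunk), chunks))


def infer(notes: list[str], question: str, infer_model: Model) -> str:
    """Infer step: synthesize an answer from the non-empty notes only."""
    relevant = [note for note in notes if note]
    return infer_model(question, "\n".join(relevant))


def toy_map_model(question: str, chunk: str) -> str:
    """Stand-in for an LLM: keep the chunk if it shares a keyword with the question."""
    keywords = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    hit = any(w.lower().strip("?.,") in keywords for w in chunk.split())
    return chunk if hit else ""


def toy_infer_model(question: str, notes: str) -> str:
    """Stand-in for an LLM: return the collected notes as the answer."""
    return notes if notes else "No relevant notes found."


chunks = [
    "The lease ends in 2027.",
    "Lunch menu for Friday.",
    "Lease renewal needs 90 days notice.",
]
question = "When does the lease end?"
notes = map_chunks(chunks, question, toy_map_model)
print(infer(notes, question, toy_infer_model))
```

Every chunk is read by the map step, but only the notes reach the final call, so the size of the infer prompt tracks the notes rather than the raw documents.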

You know who else thinks context window size is important? Claude does.

Claude on context window size
Runs on: Local Qwen · Local Llama · OpenAI · Anthropic · Gemini · Any LLM
Use it your way

Desktop app, CLI, or embedded in your stack

UI

Desktop app

Download Awarity for Mac and start querying your documents in minutes. Ingest files, build catalogs, run queries — no command line required. Free, self-contained, and runs entirely offline on your machine.

CLI

Integrate into pipelines

The Awarity CLI drops into any existing workflow. Run awarity ask, awarity extract, or awarity for-each inside scripts, CI/CD jobs, or scheduled tasks. If you have a pipeline, Awarity plugs in.
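As one way to call the CLI from a script or scheduled job, the sketch below wraps it with Python's subprocess module. The helper name run_awarity is hypothetical, and the sketch assumes an awarity executable is on your PATH. Only the three subcommand names come from this page; the arguments each one accepts are not documented here, so the helper passes them through untouched rather than guessing at flags.

```python
import subprocess
from typing import Callable, Sequence

# Subcommands named on this page. Their arguments are NOT documented here.
SUBCOMMANDS = ("ask", "extract", "for-each")


def run_awarity(subcommand: str, args: Sequence[str],
                runner: Callable = subprocess.run) -> str:
    """Run `awarity <subcommand> <args...>` and return its standard output.

    `runner` defaults to subprocess.run and can be swapped out in tests.
    """
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand!r}")
    result = runner(
        ["awarity", subcommand, *args],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Because check=True raises on failure, a CI job using this wrapper stops when the CLI exits with an error instead of continuing with empty output. Check the CLI's own help output for the real argument syntax before relying on any particular invocation.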

API

Embed in your apps

Deploy Awarity as an Azure Function, AWS Lambda, or Docker container. First-class TypeScript and Node.js support. Call document reasoning from any application or service.

Windows Alpha

Windows Alpha now available

An early unsigned build is available for Windows x64 and ARM64. Windows will show a SmartScreen warning — click "More info" then "Run anyway" to install.

Download for Windows →
Linux build coming soon — join the Linux waitlist.

Download free.
Run in minutes.

Awarity runs entirely on your machine. Ingest your documents and start getting answers on free open models, and add your own API key only if you want a cloud model. No SaaS subscription, no data upload, no vendor lock-in.

Download for Mac · Apple Silicon & Intel · Free
macOS 12+ required · Windows alpha available · Linux coming soon