How Ostler is built

For developers and the technically curious. Everything runs on a Mac Mini. Nothing leaves your network.

Three-store architecture

Ostler uses three specialised databases, each optimised for a different type of query:

| Store | Technology | Purpose |
| --- | --- | --- |
| Vector store | Qdrant | Semantic search: "Find people similar to this description." Stores 148K+ vectors with nomic-embed-text embeddings. |
| Knowledge graph | Oxigraph | Structured relationships: SPARQL queries over 2M+ RDF triples. "Who knows whom? What happened when?" |
| Cache + message bus | Redis | Fast lookups, real-time message routing between services, session state. |

All three run as Docker containers. On a Mac Mini M4, the database containers use under 2GB RAM, leaving the rest for Ollama and the AI models (which need 6–12GB depending on model size).
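A minimal Compose file for the three containers might look like the sketch below. The image tags, ports, and volume paths are illustrative assumptions, not Ostler's actual configuration:

```yaml
# Sketch only: image versions, ports, and volume paths are assumptions.
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports: ["6333:6333"]
    volumes: ["./data/qdrant:/qdrant/storage"]
  oxigraph:
    image: ghcr.io/oxigraph/oxigraph:latest
    command: serve --location /data --bind 0.0.0.0:7878
    ports: ["7878:7878"]
    volumes: ["./data/oxigraph:/data"]
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
```

Each service persists to a bind-mounted directory, so the stores survive container restarts.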

Local LLM inference

All AI inference runs locally via Ollama. No cloud API calls. No usage billing. No data exfiltration.

| Model | Use | Performance |
| --- | --- | --- |
| Qwen 3.5 9B | AI assistant (Marvin), conversation processing, fact extraction | ~30 tok/s on M4 |
| nomic-embed-text | Vector embeddings for semantic search | ~200 embeddings/s |
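Talking to a local Ollama server needs nothing beyond the standard library. A minimal sketch (the default port 11434 and the `/api/embeddings` endpoint are Ollama's documented defaults; the helper names are our own):

```python
# Minimal sketch of calling a local Ollama server. No cloud endpoint,
# no API key: just HTTP to localhost.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def embed_request(text: str, model: str = "nomic-embed-text") -> dict:
    """Build the JSON body for Ollama's /api/embeddings endpoint."""
    return {"model": model, "prompt": text}

def embed(text: str) -> list[float]:
    """POST to the local Ollama server and return the embedding vector."""
    body = json.dumps(embed_request(text)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

The same pattern applies to chat generation via `/api/generate`, just with a different payload.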

The system is hardware-adaptive. Settings profiles configure model selection and batch sizes based on available hardware. A Mac Mini M1 runs smaller models; a Mac Studio M2 Ultra runs larger ones.
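Profile selection can be sketched as a simple lookup keyed on available unified memory. The thresholds, model names, and batch sizes below are illustrative assumptions, not Ostler's actual profiles:

```python
# Sketch of hardware-adaptive model selection. Thresholds, model names,
# and batch sizes are illustrative, not Ostler's real settings profiles.
def pick_profile(unified_memory_gb: int) -> dict:
    """Choose chat model and embedding batch size from available memory."""
    if unified_memory_gb >= 64:   # e.g. Mac Studio M2 Ultra
        return {"chat_model": "qwen-large", "batch_size": 64}
    if unified_memory_gb >= 16:   # e.g. Mac Mini M4
        return {"chat_model": "qwen-9b", "batch_size": 16}
    return {"chat_model": "qwen-small", "batch_size": 4}  # e.g. Mac Mini M1
```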

Instant onboarding (macOS data)

The moment you install, Ostler reads data directly from your Mac's built-in apps. No exports needed. No waiting.

| Source | What we read | Permission |
| --- | --- | --- |
| Safari | Browsing history, bookmarks, reading list | Full Disk Access |
| iMessage | Conversations, participants, timestamps | Full Disk Access |
| Apple Notes | Note titles, text content, folders | Full Disk Access |
| Calendar | Events, attendees, locations | Full Disk Access |
| Photos | Face labels, GPS locations, dates (not image content) | Full Disk Access |
| Reminders | Tasks, due dates, lists | Full Disk Access |
| Apple Mail | Subjects, senders, dates (not email body) | Full Disk Access |

All databases are opened read-only to prevent corruption. Each extractor handles schema differences across macOS versions (Ventura, Sonoma, Sequoia). Full Disk Access is optional – you can skip it and still use GDPR imports.
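SQLite supports read-only opens natively via URI filenames, which is how an extractor can guarantee it never writes to an app's database. A minimal sketch (the helper name is ours; the Safari path is the standard macOS location):

```python
# Sketch of opening a macOS app database read-only. mode=ro makes any
# write attempt fail with OperationalError instead of touching the file.
import sqlite3

SAFARI_HISTORY = "~/Library/Safari/History.db"  # standard macOS location

def open_readonly(path: str) -> sqlite3.Connection:
    """Open a SQLite database via URI in read-only mode."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```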

GDPR import pipeline

For deeper historical data, 20 parsers read from GDPR data exports:

| Platform | Data imported | Format |
| --- | --- | --- |
| LinkedIn | Connections, career, endorsements, messages (metadata) | CSV |
| Facebook | Friends, events, timeline | JSON |
| Instagram | Followers, following, close friends | JSON |
| WhatsApp | Phone cross-references | JSON |
| Twitter / X | Synced contacts (phone cross-ref) | JS |
| Google Calendar | Events, attendees, locations | ICS |
| iCloud | Contacts (via CardDAV) | vCard |
| Email | Signature mining, header analysis | MBOX |
| Browser | History URLs, page titles | Safari / Chrome |
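As a flavour of what one of these parsers does, here is a sketch for LinkedIn's `Connections.csv`. The column names match recent LinkedIn exports but are an assumption and can drift, so the parser tolerates missing fields rather than hard-coding a schema:

```python
# Sketch of a GDPR-export parser for LinkedIn's Connections.csv.
# Column names ("First Name", "URL", ...) are assumed from recent
# exports; missing columns produce empty strings, not crashes.
import csv
import io

def parse_linkedin_connections(text: str) -> list[dict]:
    """Return one normalised contact per CSV row, skipping nameless rows."""
    contacts = []
    for row in csv.DictReader(io.StringIO(text)):
        parts = (row.get("First Name"), row.get("Last Name"))
        name = " ".join(p for p in parts if p).strip()
        if not name:
            continue
        contacts.append({
            "name": name,
            "linkedin_url": row.get("URL", ""),
            "company": row.get("Company", ""),
            "position": row.get("Position", ""),
        })
    return contacts
```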

Identity resolution

The same person appears differently across platforms. "John Smith" on LinkedIn, "johnnyboy" on Instagram, "+44 7XXX XXXXXX" on WhatsApp. The identity resolver matches these using:

Exact matching: LinkedIn URL, email address, phone number (last 8 digits).

Fuzzy matching: Jaro-Winkler string distance on names, corroborated by shared organisation, email domain, or platform overlap.

Manual review queue: Uncertain matches go to a review queue. The user approves or rejects. No automatic merges without confidence.

The resolver has 38 automated tests covering exact, fuzzy, phone, email, and name-subset matching strategies.
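The exact and fuzzy strategies can be sketched as follows. Note the hedges: `difflib.SequenceMatcher` stands in for the resolver's actual Jaro-Winkler distance, and the 0.85 threshold is illustrative:

```python
# Sketch of two matching strategies. SequenceMatcher is a stand-in for
# Jaro-Winkler (not the resolver's real implementation); the threshold
# is illustrative.
import difflib
import re

def phone_key(raw: str) -> str:
    """Exact-match key: last 8 digits, ignoring formatting and country code."""
    digits = re.sub(r"\D", "", raw)
    return digits[-8:]

def name_similarity(a: str, b: str) -> float:
    """Fuzzy score in [0, 1]; a real resolver would use Jaro-Winkler."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_candidate_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """An exact phone match wins outright; otherwise fall back to names."""
    if a.get("phone") and b.get("phone"):
        if phone_key(a["phone"]) == phone_key(b["phone"]):
            return True
    return name_similarity(a.get("name", ""), b.get("name", "")) >= threshold
```

In the real pipeline, a fuzzy hit alone would only queue a candidate for review, not merge it.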

Conversation processing

When a conversation is recorded (via the macOS companion app or manual import), it passes through a multi-step pipeline:

1. Classification – setting (work/social/family), shape (meeting/1:1/group), stakes (high/medium/low)
2. Fact extraction – 12.6 facts per conversation on average, with quality gates
3. Relationship signals – warmth, reciprocity, energy, power dynamics
4. Coaching observations – longitudinal patterns in how the user communicates
5. Cross-conversation linking – semantic similarity between conversation summaries

Each step is idempotent (re-runnable without duplicates), has exponential backoff on failure, and records the prompt version that generated it.
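The backoff behaviour can be sketched as a small runner (names and delays are illustrative, not the pipeline's actual code):

```python
# Sketch of per-step exponential backoff: retry delays double after each
# failure (0.5s, 1s, 2s, ...), and the last failure is re-raised.
import time

def run_step(step_fn, *, retries: int = 4, base_delay: float = 0.5):
    """Run one pipeline step, backing off exponentially between failures."""
    for attempt in range(retries):
        try:
            return step_fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Because each step is idempotent, re-running after a crash is always safe: a retried step overwrites nothing and duplicates nothing.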

Stack summary

Capture Layer: macOS databases (instant) · GDPR imports · macOS app · iOS app · Browser extension
Processing Layer: Conversation pipeline · Identity resolver · Fact extraction · Relationship signals
Intelligence Layer: Ollama (Qwen 3.5 9B) · nomic-embed-text · SPARQL queries · Vector search
Storage Layer: Qdrant (vectors) · Oxigraph (RDF graph) · Redis (cache + bus) · SQLite (coaching)
Interface Layer: Marvin (iMessage · WhatsApp · Email) · Personal Wiki · iOS app

Total dependencies: Python 3.11+, Docker, Ollama. No cloud accounts required. No API keys. No subscriptions.
