Onyx AI

Private. Local. Yours.

open-source · self-hosted · ollama + local llms · multi-provider api · node.js · tailscale mesh · cloudflare edge · persistent memory · knowledge base · web search · deep research · yours to fork

Why I built this.

I was using ChatGPT and Claude subscriptions, and neither felt like mine — context reset between sessions, data left the machine, and I couldn't tune the behavior for my actual workflow. So I built something that didn't have those problems.

The interesting part wasn't the model side — Ollama makes that easy. It was the infrastructure: threading a Tailscale mesh between two machines, writing auth middleware that enforces per-user session isolation, adding a Cloudflare tunnel so it's reliably reachable from anywhere. The kind of plumbing that doesn't show up in tutorials but is the whole reason it works.
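For a sense of what per-user session isolation can look like, here is a minimal sketch. It is not Onyx's actual middleware: `SessionStore`, `canAccess`, the in-memory map, and the one-hour default lifetime are all assumptions made for illustration. The idea is that every token resolves to exactly one user, and a request only passes when that user owns the resource it asks for.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical shapes; none of these names come from the Onyx source.
interface Session {
  userId: string;
  expiresAt: number; // epoch milliseconds
}

class SessionStore {
  private sessions = new Map<string, Session>();
  private ttlMs: number;

  constructor(ttlMs: number = 60 * 60 * 1000) {
    this.ttlMs = ttlMs; // assumed default: one hour
  }

  // Issue an unguessable token bound to exactly one user.
  issue(userId: string, now: number = Date.now()): string {
    const token = randomBytes(32).toString("hex");
    this.sessions.set(token, { userId, expiresAt: now + this.ttlMs });
    return token;
  }

  // Resolve a token to its user, or null if it is unknown or expired.
  resolve(token: string, now: number = Date.now()): string | null {
    const session = this.sessions.get(token);
    if (!session) return null;
    if (session.expiresAt <= now) {
      this.sessions.delete(token);
      return null;
    }
    return session.userId;
  }
}

// Middleware-style guard: a caller may only touch resources it owns.
function canAccess(
  store: SessionStore,
  token: string,
  ownerId: string,
  now: number = Date.now()
): boolean {
  const userId = store.resolve(token, now);
  return userId !== null && userId === ownerId;
}
```

In a real server the guard would run before any handler reads chat history or knowledge-base rows, and the store would likely sit in a database rather than in process memory.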

Inference: Ollama (local GPU) with multi-provider API fallback (Anthropic, OpenAI, Gemini)

Network: Tailscale mesh across two servers, plus a Cloudflare tunnel for public access

Memory: SQLite for persistent per-user chat history, knowledge base, and searchable context

Auth: Two-layer, with Cloudflare Access at the edge and custom session tokens in the backend
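The inference row describes a local-first setup with hosted providers as fallback. One way that could be wired up is sketched below, assuming a hypothetical `Provider` interface and `completeWithFallback` helper. Neither is taken from Onyx, and the real SDK or HTTP calls would live inside each provider's `complete` method.

```typescript
// Hypothetical provider shape; real Ollama or hosted-API calls go inside `complete`.
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface FallbackResult {
  provider: string; // which provider actually answered
  text: string;
}

// Try providers in order and return the first success.
// If every provider fails, throw one error that lists each failure.
async function completeWithFallback(
  providers: Provider[],
  prompt: string
): Promise<FallbackResult> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}
```

Putting the local model first in the list keeps prompts on the machine whenever the GPU box is reachable; hosted providers are only tried after it fails.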

Onyx AI — private, local, yours.

Take Onyx for a spin.