I’ve been running self-hosted AI agents for a while. Tools like OpenClaw and Hermes do this well and were a big inspiration, but they’re CLI/dev-first and headless. I wanted that kind of power with a real, mobile-friendly UI my non-technical wife could actually use from her phone. I couldn’t find it, so I built it for my own household and open-sourced it. I’m not claiming to reinvent anything (there’s a new “AI agents platform” every other week right now); I just took the UI-first angle.

Self-hosting fundamentals:

  • Single Docker container. Bun + SQLite, no Postgres, no Redis, no external cloud. All state in one volume.
  • Light enough to run on a small home server: well under 1 GB of RAM and no GPU, because the models run at your provider, not in the app.
  • Secrets are stored in an AES-256-GCM encrypted vault and never sent to the LLM provider.
  • Reachable over Telegram, WhatsApp, Slack, Discord, Signal and Matrix.
  • Bring your own keys: Anthropic, OpenAI, Gemini, OpenRouter (an OpenAI-compatible endpoint for llama.cpp / LM Studio / vLLM is in progress).
  • MIT, actively maintained.
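
For anyone wondering what an AES-256-GCM vault looks like in practice, here is a minimal sketch using `node:crypto` (which Bun also supports). The `SealedSecret` shape and the `seal`/`unseal` names are made up for illustration; this is not Hivekeep's actual storage format or key handling.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical record shape for illustration; not Hivekeep's real format.
interface SealedSecret {
  iv: string; // 12-byte nonce, base64
  tag: string; // 16-byte GCM authentication tag, base64
  data: string; // ciphertext, base64
}

function seal(plaintext: string, key: Buffer): SealedSecret {
  // A fresh random nonce per secret; never reuse one with the same key.
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

function unseal(sealed: SealedSecret, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(sealed.data, "base64")),
    decipher.final(), // throws if the ciphertext or tag was tampered with
  ]);
  return plain.toString("utf8");
}

// Assumption: a 32-byte key generated here; a real vault would load it from outside the database.
const key = randomBytes(32);
const sealed = seal("sk-example-not-a-real-key", key);
console.log(unseal(sealed, key));
```

The point of GCM over plain CBC is the authentication tag: a modified ciphertext fails loudly on decrypt instead of returning garbage.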

The parts I focused on (where having a UI actually pays off):

  • A proper web UI that works on mobile, not a terminal.
  • Full transparency into what the model sees: you can inspect the exact context sent to the LLM and the token cost of every message. No black box.
  • Tool calls rendered visually in the chat with custom renderers (a weather call shows a weather card, not raw JSON).
  • Mini-apps embedded in the UI: small interactive apps, dashboards, even live background services.
  • Create your own tools from inside the platform, instantly reusable by any agent.
  • A Kanban board to manage projects and tickets the agents work on.
  • A plugin system (NPM) to add providers, channels, tools and more.
  • Connected accounts with triggers (e.g. an incoming email can wake an agent).
  • A workspace file browser and terminal, in the UI.
  • Conversational setup: an onboarding agent walks you through configuring everything.
  • Image generation and TTS/STT built in.
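
To make the custom-renderer idea concrete, here is a rough sketch of a renderer registry with a raw-JSON fallback. The `ToolCall` type, the `renderers` map and `renderToolCall` are hypothetical names for illustration only; Hivekeep's real plugin API may look quite different, and a real renderer would return a UI component rather than a string.

```typescript
// Hypothetical types; not Hivekeep's actual plugin API.
interface ToolCall {
  tool: string;
  result: unknown;
}

type Renderer = (result: unknown) => string;

// One renderer per tool name.
const renderers = new Map<string, Renderer>();

renderers.set("weather", (result) => {
  const r = result as { city: string; tempC: number; summary: string };
  return `${r.city}: ${r.tempC}°C, ${r.summary}`;
});

function renderToolCall(call: ToolCall): string {
  const render = renderers.get(call.tool);
  // Fall back to pretty-printed JSON when no custom renderer is registered.
  return render ? render(call.result) : JSON.stringify(call.result, null, 2);
}

console.log(
  renderToolCall({ tool: "weather", result: { city: "Lyon", tempC: 21, summary: "clear" } }),
);
```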

Install:

docker run -d -p 3000:3000 -v hivekeep:/app/data ghcr.io/marlburrow/hivekeep:latest

Open the web UI and the setup agent takes it from there.

GitHub: https://github.com/MarlBurroW/hivekeep
Site + demo: https://hivekeep.app/

It’s young and I’m after honest feedback. Disclosure: I’m the author, happy to answer anything.

  • Mike Wooskey@lemmy.thewooskeys.com
    1 hour ago

    This looks interesting, especially the persistent memory. I want to try it out but it seems likely to me that multiple simultaneous agents would require significant hardware. Even if they were serially activated, reloading contexts with each switch would take time. I have a pretty beefy GPU and experience significant (almost ridiculous) slowdown when opencode runs 2 subagents simultaneously.

    But perhaps the memory storage/lookup keeps contexts very small?

    Anyway, I can’t find any mention in the repo or docs what the suggested minimum hardware is.

    • MarlburroW@lemmy.worldOP
      5 minutes ago

      Good question, and you’re right that it’s missing from the docs (just added a Hardware requirements section to the README to fix that).

      The key thing: Hivekeep doesn’t run the models itself. It calls your provider (Anthropic/OpenAI/etc.) or a local OpenAI-compatible endpoint, so the heavy compute lives there, not in the app. The platform itself is a single Bun process over SQLite, no GPU, no extra services. It runs in well under 1 GB of RAM on a small home server.

      On the multi-agent worry: agents are activated serially per message, not all firing at once, and the persistent memory is exactly what keeps each context small (hybrid vector + keyword recall, re-ranked, instead of replaying the whole history). So adding agents mostly means more calls routed to your provider, not multiplied local load.
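
To illustrate the recall idea, here is a toy sketch of hybrid retrieval: rank memories once by vector similarity and once by keyword overlap, then merge the two rankings. The merge uses reciprocal rank fusion as a stand-in for the re-ranking step, which is an assumption and not necessarily what Hivekeep does. The `Memory` type, the three-dimensional embeddings and all function names are made up for the example.

```typescript
// Hypothetical memory record with a toy 3-dimensional embedding.
interface Memory {
  id: string;
  text: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * (b[i] ?? 0), 0);
  const na = Math.sqrt(a.reduce((s, x) => s + x * x, 0));
  const nb = Math.sqrt(b.reduce((s, x) => s + x * x, 0));
  return na === 0 || nb === 0 ? 0 : dot / (na * nb);
}

// Counts how many query words appear in the text (a crude keyword score).
function keywordScore(query: string, text: string): number {
  const words = new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  return query.toLowerCase().split(/\W+/).filter((w) => w && words.has(w)).length;
}

// Returns memory ids ordered by descending score.
function rank(memories: Memory[], score: (m: Memory) => number): string[] {
  return memories
    .map((m) => ({ id: m.id, s: score(m) }))
    .sort((x, y) => y.s - x.s)
    .map((x) => x.id);
}

function recall(query: string, queryEmbedding: number[], memories: Memory[], topK = 2, k = 60): Memory[] {
  const rankings = [
    rank(memories, (m) => cosine(queryEmbedding, m.embedding)),
    rank(memories, (m) => keywordScore(query, m.text)),
  ];
  // Reciprocal rank fusion: each ranking contributes 1 / (k + position).
  const fused = new Map<string, number>();
  for (const ids of rankings) {
    ids.forEach((id, i) => fused.set(id, (fused.get(id) ?? 0) + 1 / (k + i + 1)));
  }
  return [...memories]
    .sort((a, b) => (fused.get(b.id) ?? 0) - (fused.get(a.id) ?? 0))
    .slice(0, topK);
}

const memories: Memory[] = [
  { id: "m1", text: "Wife prefers oat milk in her coffee", embedding: [1, 0, 0] },
  { id: "m2", text: "Router admin password rotated in March", embedding: [0, 1, 0] },
  { id: "m3", text: "Grocery list: oat milk, eggs, bread", embedding: [0.9, 0.1, 0] },
];

// Only the top matches go into the prompt, not the whole history.
console.log(recall("oat milk coffee", [1, 0, 0], memories).map((m) => m.id));
```

Whatever the exact scoring, the effect is the same: the context sent to the model grows with `topK`, not with the length of the conversation history.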

      The opencode slowdown you saw is on the inference side: if you point Hivekeep at local models (llama.cpp / LM Studio / Ollama / vLLM), the hardware question moves to your inference server, same as any other client. If you use a hosted provider, your machine barely feels it.

    • irmadlad@lemmy.world
      4 hours ago

      Anyway, I can’t find any mention in the repo or docs what the suggested minimum hardware is.

      Same.