Do you host your own AI?

SuspiciousCarrot78@aussie.zone · 8 hours ago

brucethemoose@lemmy.world · edited 4 hours ago

Yep.

I have an RTX 3090 + 128GB of system RAM.

Currently I run my own custom IQ3_KT quantization of MiMo 2.5 300B, and it’s crazy good. It’s better than API models from not that long ago, and it’s served at about reading speed.
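For a rough sense of why a 300B model can fit on this kind of box, here is a back-of-the-envelope sketch. The 3.2 bits-per-weight figure is an assumption for illustration (the real average for a custom IQ3_KT mix depends on which tensors get which quant type), and the function names are made up for this example. It counts weights only, ignoring KV cache, context buffers, and whatever the OS needs.

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits per weight / 8 bits per byte = GB
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         vram_gb: float, ram_gb: float) -> bool:
    """True if the weights alone fit in combined VRAM + system RAM."""
    return quantized_size_gb(params_billions, bits_per_weight) <= vram_gb + ram_gb


VRAM_GB = 24        # RTX 3090
RAM_GB = 128        # system RAM from the post
ASSUMED_BPW = 3.2   # assumption: rough average for a ~3-bit quant mix

size = quantized_size_gb(300, ASSUMED_BPW)
print(f"~{size:.0f} GB of weights vs {VRAM_GB + RAM_GB} GB of total memory")
print("fits:", fits(300, ASSUMED_BPW, VRAM_GB, RAM_GB))
```

Under that assumption the weights fit with some headroom, while the same model at 8 bits per weight would not come close, which is why the low-bit quant matters here.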

Never thought I’d ever run such a thing on my lowly desktop.

For quick scripts or as a code assistant, I sometimes use Qwen 27B (another custom quant, currently experimenting with exllama). Or Gemma 12B for messing with image/audio input. But TBH, MiMo 2.5 with thinking disabled is smarter than Qwen 27B with thinking enabled.

…And honestly, I use the GLM 5.2 API a good bit.

I was lucky enough to get a yearly subscription for like $30 about six months ago. I do self-host the UIs, or whatever takes the prompts, though.