Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • brucethemoose@lemmy.world
    3 hours ago

    Oh, and I just saw you have a 3090.

    To get more specific, you can actually run much better models than Qwen 3.5 and DeepSeek Coder (both of which are obsolete now). The best practical option depends on how much CPU RAM you have, but at minimum you can run Qwen 3.6 27B with a better-optimized quant, like the ones here: https://huggingface.co/ubergarm/Qwen3.6-27B-GGUF/tree/main

    Or Gemma 31B QAT: https://huggingface.co/unsloth/gemma-4-31B-it-qat-GGUF

    If you have 128GB CPU RAM, I can upload my custom MiMo 2.5 quant. That should “beat” the cheapest Claude, give or take.

    If you have 64GB, I’d suggest a quantization of Step 3.7.

    If you have 32GB or 48GB, I’m not sure. I’d need to check whether any “small” MoE is actually better than Qwen 27B now.
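
    For quick reference, the RAM tiers above could be sketched as a small lookup. This is only an illustration: the function name `suggest_model` and the returned labels are hypothetical, it assumes the 3090 mentioned above, and treating each figure as a "this much or more" threshold is an assumption rather than something stated in the comment.

```python
# Hypothetical helper: the name, thresholds-as-minimums, and labels are illustrative only.
def suggest_model(cpu_ram_gb: int) -> str:
    """Map CPU RAM (in GB) to the model tier suggested above, assuming a 3090."""
    if cpu_ram_gb >= 128:
        return "MiMo 2.5 (custom quant)"
    if cpu_ram_gb >= 64:
        return "Step 3.7 (quantized)"
    # 32GB/48GB: no confirmed better "small" MoE, so fall back to the baseline picks.
    return "Qwen 3.6 27B or Gemma 31B QAT"


if __name__ == "__main__":
    for ram in (32, 48, 64, 128):
        print(f"{ram}GB -> {suggest_model(ram)}")
```

    The fallback branch returns the baseline picks because the 32GB/48GB case is left open above.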