Hey guys,

What’s currently the best LLM for a machine with only 6 GB of VRAM? I’ve got 32 GB of RAM as well.

I’m experimenting a little with SillyTavern and I’m curious which model gets the most out of my setup. It should be multilingual and suitable for “casual chatting”.

I know I will probably not get very far with this, but I’m still interested in how far we’ve already come.

(Using KoboldCPP if that matters).

~sp3ctre

  • lime!@feddit.nu
    17 days ago

    With a 20B model on weak hardware you’ll be waiting more like 10 minutes, unless the OS kills your process first for using too much memory.
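
For a rough sense of why a 20B model is a stretch on 6 GB of VRAM, here is a back-of-the-envelope sketch. The numbers are assumptions, not measurements: about 4.5 bits per weight for a typical 4-bit quantization, a hypothetical 40-layer model, and 1.5 GB of VRAM held back for context cache and overhead. The function names are made up for illustration.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 10^9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def layers_on_gpu(total_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 1.5) -> int:
    """Estimate how many equally sized layers fit in VRAM.

    reserve_gb is an assumed allowance for context cache and overhead.
    """
    per_layer_gb = total_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    # Assumed values: 4.5 bits per weight, 40 layers for the 20B model,
    # 32 layers for the 7B model.
    big = weights_gb(20, 4.5)
    small = weights_gb(7, 4.5)
    print(f"20B weights: ~{big:.1f} GB, layers on a 6 GB GPU: {layers_on_gpu(big, 40, 6)} of 40")
    print(f"7B weights:  ~{small:.1f} GB, layers on a 6 GB GPU: {layers_on_gpu(small, 32, 6)} of 32")
```

Under these assumptions the 20B weights alone are well over 6 GB, so most layers would run from system RAM on the CPU, which is where the long waits come from. A model in the 7B range fits mostly or entirely on the GPU. In KoboldCPP the split is controlled by the `--gpulayers` setting; the right value for a given model is best found by trial.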