I've realized I need to upgrade my little NUC to something bigger for faster inference with bigger llama models. I want something you can still keep on your living room's TV bench, so no monster rack please, but that also has the necessary muscle when llama needs it. Budget doesn't matter right now; I want to understand what's good and what's out there. Thanks

EDIT: Wow, thanks for the inspiration, guess I need to look a bit into “how to stuff a huge graphics card into a mini box”. To clarify a bit more what I want with it: I want to build a responsive personal assistant. I am dreaming of models bigger than 8B, with good tool calling for things like memory, web search etc. No coding, image generation, or video generation required. Image recognition would be good but is not a must. Regarding footprint: no monster ;) Something that you can have in your living room and that could be wife approved - so no big gaming rig with exhaust pipes and stuff, it needs to be good looking ;)

  • anamethatisnt@sopuli.xyz
    4 hours ago

    Con: Fewer guides, a more complicated setup, and you have to sort out the translation from CUDA yourself with IPEX-LLM and so on. Not everything will run.

    Pro: An Intel Arc Pro B70 with 32 GB for less than half the price of an RTX 5090 sure makes one curious to try it.
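
For a rough feel of what "bigger than 8B" means on a 32 GB card, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) and adds a flat allowance for KV cache and runtime buffers. The function names and the 4 GiB allowance are assumptions made up for illustration, not measurements from any particular runtime, and it treats the card's 32 GB as 32 GiB.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float = 32.0, overhead_gib: float = 4.0) -> bool:
    """True if weights plus a flat allowance fit into VRAM.

    overhead_gib is an assumed placeholder for KV cache and buffers;
    real usage depends on context length and the runtime.
    """
    return weights_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # A few illustrative size/quantization combinations.
    for size, bits in [(8, 16), (32, 4), (70, 4)]:
        print(f"{size}B @ {bits}-bit: {weights_gib(size, bits):.1f} GiB weights, "
              f"fits in 32 GiB: {fits(size, bits)}")
```

By this estimate a model around 30B at 4-bit leaves headroom on a single 32 GB card, while a 70B model at 4-bit does not fit on the weights alone.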