There are already other providers like Deepinfra offering DeepSeek. So while the the average person (like me) couldn’t run it themselves, they do have alternative options.
There are already other providers like Deepinfra offering DeepSeek. So while the the average person (like me) couldn’t run it themselves, they do have alternative options.
A server grade CPU with a lot of RAM and memory bandwidth would work reasonable well, and cost “only” ~$10k rather than 100k+…
To be fair, most people can’t actually self-host Deepseek, but there already are other providers offering API access to it.
You got that backwards. They’re other models - qwen or llama - fine-tuned on synthetic data generated by Deepseek-R1. Specifically, reasoning data, so that they can learn some of its reasoning ability.
But the base model - and so the base capability there - is that of the corresponding qwen or llama model. Calling them “Deepseek-R1-something” doesn’t change what they fundamentally are, it’s just marketing.