Self-hosting seems a lot less worthwhile for normal people. I mean, he couldn't run the 'smart' model even with his setup (let alone someone without a Threadripper build).
edit: not to mention I wasn't even aware DeepSeek has a 'reasoning model' rather than just 'predicting the next most likely word'.
That is a step above. Interesting. I will stop using Bing.
Yes. The minimum cost of entry for a decent local LLM will set you back about $4k USD (a used Nvidia workstation card with 48GB of VRAM).
CPU inference is extremely slow; forget it. This software was developed for and on Nvidia GPUs and performs best on them.
I currently run Qwen2.5-coder 14B (~8GB of VRAM) via ollama on an Nvidia 4070 and it's extremely snappy, but it doesn't have broad knowledge. I have to hand-feed it small bits of context, which is tedious but sometimes works out in my favor.
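"Hand-feeding context" in practice looks roughly like this: a minimal sketch against ollama's local HTTP API (it serves /api/generate on localhost:11434 by default); the snippet and the prompt here are placeholders for whatever you're actually working on.

    import json
    import urllib.request

    # ollama serves a local HTTP API on port 11434 by default.
    OLLAMA_URL = "http://localhost:11434/api/generate"

    # The hand-fed context: a small, relevant snippet pasted into the
    # prompt, since the model can't go look anything up on its own.
    context = '''
    def parse_config(path):
        with open(path) as f:
            return json.load(f)
    '''

    prompt = ("Given this helper:\n" + context +
              "\nAdd error handling for a missing or malformed file.")

    payload = json.dumps({
        "model": "qwen2.5-coder:14b",  # tag as published in the ollama library
        "prompt": prompt,
        "stream": False,  # return one complete response instead of chunks
    }).encode()

    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])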
To get to DeepSeek's level of breadth of knowledge and reasoning, you need around 480GB of VRAM, so it's about a five-figure investment to get running. And the thing probably draws 1kW for a few seconds when answering a prompt.
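Napkin math on the VRAM, if you're wondering where a number like that comes from (671B parameters is DeepSeek-V3/R1's published size; the quantization widths and the 1.2x overhead factor are my guesses):

    # Rough VRAM estimate: parameters * bytes-per-parameter, plus some
    # headroom for the KV cache and runtime buffers (the 1.2x is a guess).
    params = 671e9  # DeepSeek-V3/R1 parameter count

    for name, bytes_per_param in [("fp16", 2), ("fp8", 1), ("4-bit", 0.5)]:
        weights_gb = params * bytes_per_param / 1e9
        total_gb = weights_gb * 1.2
        print(f"{name}: ~{weights_gb:.0f}GB weights, ~{total_gb:.0f}GB total")

    # fp8 is ~670GB of weights alone; 4-bit quantization plus overhead
    # lands in the 400-500GB ballpark.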
So... yeah...
The problem is that as the computer gets smarter and holds more knowledge, it takes dramatically more processing power and memory. The human brain runs on ~20W of power (the whole body idles around 100W). The overgrown calculator can run faster than the human brain, but needs thousands of watts of instantaneous draw to perform the task.
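Back of the envelope on energy per answer (the power draws and durations here are made-up illustrative numbers, not measurements):

    # Energy per answer: power * time.
    gpu_watts, gpu_seconds = 1000, 5      # big rig answering one prompt
    brain_watts, brain_seconds = 20, 60   # a person mulling the same question

    gpu_joules = gpu_watts * gpu_seconds        # 5,000 J
    brain_joules = brain_watts * brain_seconds  # 1,200 J

    print(f"rig: {gpu_joules} J, brain: {brain_joules} J, "
          f"ratio ~{gpu_joules / brain_joules:.1f}x")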
In theory, chip, hardware, and software optimization might get a human-equivalent computer brain operating on only 2x-5x a brain's electricity; at that point it becomes feasible to convert electricity into thought.
Of course, we'd still have to solve the problem of where to get 2-5x the electricity we currently generate without cooking the planet.
The advantage of an LLM is that you can use it as a side-arm brain working out hard problems for you in the background. For me, I leverage it when reasonable, and it expands what I can do a bit, which is awesome.
It would be a second-generation bicycle for the mind, making every human so much more effective, and that's pretty cool from a technological perspective.
I would say the technology has a way to go, but the current state of the art is starting to get impressive.