Friendica Social Network

DeepSeek V4—almost on the frontier, a fraction of the price

Chinese AI lab DeepSeek’s last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the shape of two …

Simon Willison’s Weblog

#technology

zikzak025

in reply to ☆ Yσɠƚԋσʂ ☆ • 3 weeks ago

Slop is slop

Dr_Vindaloo

in reply to zikzak025 • 3 weeks ago

I get the AI hate when it comes to a lot of things, but it is genuinely a useful tool for software development.

slacktoid

in reply to Dr_Vindaloo • 3 weeks ago

Usually it's people who don't code or understand the complexities involved that go that way.

HiddenLayer555

in reply to ☆ Yσɠƚԋσʂ ☆ • 3 weeks ago

Really seems like DeepSeek is one of the only vendors actually focusing on performance per unit of compute and not just throwing infinite compute at the problem. Calling it now: when the bubble bursts, they'll be one of the few to make it out with a usable product.

This entry was edited (3 weeks ago)

☆ Yσɠƚԋσʂ ☆

in reply to HiddenLayer555 • 3 weeks ago

For sure, they've probably dropped more significant papers in the past year than any other group. It does seem like the mindset in China is very different overall though. In the States, it's basically a cult at this point where they're trying to build a god with AGI. In China, it's just treated like another tool for automation, and companies see it as common infrastructure, akin to Linux, that people will build interesting things on. Hence why pretty much all the models in China are developed on an open basis. Everybody there seems to realize that there's no real path towards monetizing the models themselves.

audaxdreik

in reply to ☆ Yσɠƚԋσʂ ☆ • 3 weeks ago

Gary Marcus has written articles theorizing that this is why LLM/neural network models are so appealing to American capitalists. They at least have the appearance of something that can be infinitely scaled with investment (screw diminishing returns, right?)