Brian Lovin
/
Hacker News
Daily Digest email

Get the top HN stories in your inbox every day.

deevus

I've been using DeepSeek V4 a lot in the last week and I am very happy with it. If you have a really gnarly bug, you might need a SOTA model like Opus. For most things it is very, very good, and costs significantly less (even without the discount).

I've been using it as part of a complex DOS game decompilation project[0]. I'm working on refactoring the software rendering pipeline so that we can add GPU rendering. The hardest part so far has been converting the '90s polygon rendering from screen space to world space.

It spun its wheels a few times doing a large mostly mechanical change. After resetting and improving my prompts it was able to get through it. I'm using Matt Pocock's skills[1] for this work, which has been quite nice.

[0]: https://github.com/FatalDecomp/ROLLER

[1]: https://github.com/mattpocock/skills

doctoboggan

What agentic harness do you use deepseek with?

EEnsw3r

I find it hard to understand why nobody in this thread considers that the current pricing might not actually be below cost. The discount was supposed to end on May 5, and then shortly after that they extended it to May 31. They clearly made a judgment call there, rather than treating it as a desperate loss-leader.

If you have actually used DeepSeek, you will have noticed that the cache-hit rate is extremely high, and the cache invalidation window is much longer than any other provider's. That suggests DeepSeek is simply much better at utilizing its infrastructure than other vendors.

I am also highly skeptical that the average user's input is worth more than the API cost of processing it. Do people really think DeepSeek researchers enjoy panning for gold in a river of boilerplate and half-baked code?

zozbot234

DeepSeek's KV cache is tiny compared to other open weight models. This actually makes very large inference batches viable even on consumer hardware, even when resorting to SSD offload for weights. Once support is added to the main inference frameworks, it should be an absolute game changer for SOTA local inference.
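A rough sense of the arithmetic behind that claim (the layer counts and dimensions below are illustrative ballpark figures for a V3-era MLA model versus a typical GQA model, not exact specs for any particular release):

```python
# Per-token KV cache size = entries stored per layer * layers * bytes per element.
def kv_per_token(entries_per_layer: int, layers: int, bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache needed per token (bytes_per_elem=2 for fp16/bf16)."""
    return entries_per_layer * layers * bytes_per_elem

# MLA stores one compressed latent per token per layer (~576 dims in V3-class models).
mla = kv_per_token(576, 61)

# A GQA model stores full K and V per KV head: 2 * kv_heads * head_dim
# (8 KV heads and head_dim 128 over 80 layers is a typical 70B-class config).
gqa = kv_per_token(2 * 8 * 128, 80)

print(f"MLA ≈ {mla / 1024:.0f} KiB/token, GQA ≈ {gqa / 1024:.0f} KiB/token")
print(f"ratio ≈ {gqa / mla:.1f}x")
```

With numbers in that ballpark, the MLA cache comes out several times smaller per token, which is what makes the large-batch (and offload-heavy) serving story plausible.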

ern

A few days ago we were hearing about how the "free lunch is over", now we're seeing discounts and increased usage limits.

niobe

This is clearly a well-timed, loss-leading, strategic market-share grab! Anthropic have blown a lot of user trust in the last couple of months.

But, overall, the current AI pricing is completely unsustainable across all AI companies, except via the exponential growth they are relying on. Dylan Patel did the most insightful analysis of this I've come across: https://youtu.be/mDG_Hx3BSUE?si=nyJu4adwYCH1igbJ

sidrag22

Really feel like the current versions are for sure "good enough". That's not how market capture is gonna function, though, and they're gonna keep pushing, because the only moat is staying ahead, so the product's gonna stay strange. At some point more compute isn't a reasonable answer and optimization is, and my feeling is we're well past that point from a product perspective. But IPOs, etc. etc.

2ndorderthought

The only moat is the US trying to buy all the compute hardware in the world for the next two years. Then China, AMD, etc. are just making their own chips.

niobe

So I think the current generation of models are arguably all about the same in terms of capability. However, the requirement for exponential growth I mentioned is all about the economics.

AI companies are trying to ride a growth wave where the income curve lags the expense curve by 1-2 years, and at the same time investing 10x their historical income on next year's projected demand.

Everyone is selling their API calls at a loss, because to capture the investment required to scale the business up and the costs down, you need to grow your market now (in relative and absolute terms). And history shows that in big tech you often get winner-takes-all outcomes, or at least a couple of big firms dominate and the rest die. That's where market share becomes a key strategic goal.

But to secure that, they also need to be building next year's compute now. And if their anticipated compute needs are 10x this year's, they've got a serious funding problem, one that can only be filled by capital with an appropriate risk appetite. You can only get this high-risk capital when the potential payoff is even more enormous, or when it's a smaller bite of a much bigger pie; hence MS putting money into OpenAI and so on. But the investment needs are getting so big that we're starting to see some pullback from more conservative sources, alongside record deals from others.

Now say an AI company does get the capital they need to grow. Well, they've still got a very serious supply problem. RAM, GPUs, water, electricity etc. Hence why there's a lot of deals and cross-investment going on - everyone is trying to secure resources and lower their overall risk exposure while keeping a foot in every possible door, so they can switch alliances whenever it's expedient, and because collaboration also helps the overall market to grow.

This all explains to me why the industry _needs_ the hype. These companies can't exist without it, because the money they need to sink in, in order to even be around in 18 months, far outstrips all reasonable financial practices. So it's capitalism on steroids or nothing. If you believe the AI story, then to that extent, it's rational.

But note that nowhere in this scenario does it suggest that actual consumers will be getting a consistent product at a consistent price!

flakiness

We're subsidized by the Chinese government!

https://www.reuters.com/world/asia-pacific/deepseek-nears-45...

2ndorderthought

Cool, go download Qwen 3.6 and run it on a single GPU; then you can avoid paying into a subsidized model.

serf

Why are we pretending these are equivalents?

Yes, single-GPU open models exist. Now show me the one that can keep up with a SOTA API model on more than short code-block evals.

2ndorderthought

People don't understand that DeepSeek is running a plausibly sustainable business, like Qwen/Alibaba is.

jarym

Every AI vendor is trying to steal market share. For now the competition is good!

HWR_14

I'm guessing there was a pullback in usage as the free lunch started ending. So we get some more subsidized usage.

ttul

* from Chinese labs

splatzone

What advantage do you think they have?

ralph84

Operating in a jurisdiction where US companies can't sue them.

serf

A lack of existential threat in the form of pay-seeking and remediation from the people you stole training materials from, which allows for an intrinsically different pace of operation than the Western competition.

cogman10

A sane government policy that invests heavily in innovative businesses.

peyton

I’m not happy with their privacy policy [1]. I’m unfamiliar with the phrase “Parties with Other Legal Rights”. Given the well-documented struggles of Anthropic and others to provide enough compute, I wonder if “Parties with Other Legal Rights” constitutes part of the advantage here.

[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...

dyauspitr

They need to build data centers and lots of them everywhere, preferably powered with renewable energy. Let the tokens flow like water. The models are finally getting to the point where the LLM just knows what you’re asking for and gives it to you.

mannanj

Free lunch? More like "free data". The fools who hand their life data and most intimate intellectual property over to the AI companies for free: yes, that's a free lunch that won't stay subsidized much longer, once the unsustainable cost they've been carrying (their data being harvested for non-training purposes) catches up with them.

Sincerely: I see you, AI companies, harvesting our data and giving us discounted subscriptions so we don't realize we are paying you to take our own data!

wxw

Per 1M tokens (input cache hit / input cache miss / output)

v4-pro (75% off): $0.003625 / $0.435 / $0.87

v4-pro (regular): $0.0145 / $1.74 / $3.48

v4-flash: $0.0028 / $0.14 / $0.28

That is damn cheap.
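To put those per-million rates in concrete terms, a quick cost sketch (the session token counts below are made-up illustrative numbers, not benchmarks):

```python
# USD per 1M tokens, from the rates posted above: (cache hit, cache miss, output)
PRICES = {
    "v4-pro-discount": (0.003625, 0.435, 0.87),
    "v4-pro-regular":  (0.0145,   1.74,  3.48),
    "v4-flash":        (0.0028,   0.14,  0.28),
}

def cost(model: str, hit_toks: int, miss_toks: int, out_toks: int) -> float:
    """Dollar cost of one request mix under a given price tier."""
    hit, miss, out = PRICES[model]
    return (hit_toks * hit + miss_toks * miss + out_toks * out) / 1e6

# Hypothetical agent session: 2M cached input, 200k fresh input, 100k output.
for model in PRICES:
    print(f"{model}: ${cost(model, 2_000_000, 200_000, 100_000):.4f}")
```

Under those assumptions the whole session costs well under a dollar on every tier, and the cache-hit rate dominates the bill, which is why the long cache window discussed upthread matters so much.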

yehosef

You are the product. The book is called "So long, and thanks for all the secrets"

niobe

What if I told you.. this was no different to every US company

croon

There is not a single LLM provider I trust enough to send secrets to. If you firewall accordingly, the provider (or a local model) becomes interchangeable, barring capability differences of course.

I also struggle to find a provider that can credibly convince me I wouldn't be the product when using it. Have you found one?

wanderlust123

You are the product whenever you are sending your data to an LLM not controlled by you.

Nothing specific to Deepseek.

jack_pp

Generous of you to think I'm doing top secret coding and not just another cat website

samdhar

Cached input at $0.003625/M, output at $0.435/M. Aggressive pricing.

For anyone doing the "should I self-host on rented GPUs?" math: at this rate you'd need to push roughly 1B output tokens/day to break even against an 8xH100 fleet on Vast/Lambda (assuming 3-5k tokens/sec aggregate throughput). The vast majority of "I should run my own LLM" use cases don't come close to that volume.
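The back-of-envelope behind that break-even figure, made explicit (the $2/GPU-hour rate and 4k tokens/sec throughput are assumptions; spot prices and real throughput vary widely):

```python
# Break-even: daily output tokens at which API spend equals GPU rental spend.
GPU_HOURLY = 2.00            # assumed $/H100/hour on a spot market (varies widely)
GPUS = 8
OUTPUT_PRICE_PER_M = 0.435   # discounted DeepSeek output price, $/1M tokens

daily_rig_cost = GPU_HOURLY * GPUS * 24                       # $/day for the fleet
breakeven_tokens = daily_rig_cost / OUTPUT_PRICE_PER_M * 1e6  # tokens/day
print(f"break-even ≈ {breakeven_tokens / 1e9:.2f}B output tokens/day")

# Sanity check: can the rig even produce that many tokens?
TOKS_PER_SEC = 4000          # assumed aggregate throughput
max_daily = TOKS_PER_SEC * 86400
print(f"rig maximum ≈ {max_daily / 1e6:.0f}M tokens/day")
```

Under these assumptions the break-even volume lands near 0.9B tokens/day, while the rig's theoretical output is only a few hundred million, which makes the case against self-hosting at this price point even starker.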

Every API price drop kills another tranche of "self-host the open model" use cases. The implied bet: even if regular pricing ($1.74/M output) is also subsidized, exponential demand growth eventually makes the unit economics work. We'll see.

yehosef

Is anyone concerned about these services and China’s National Intelligence Law?

2ndorderthought

No, because China can only do so much to me as someone who doesn't live there and never will.

It's the same reason I prefer VPNs based in countries other than my own.

yehosef

Unless you're very careful, it's trivial for my secrets to be sent to the LLM. If it reads your .env just to see the variable names, the secrets have been sent to the servers. Now, they probably don't care about you and your secrets, but it makes me uncomfortable that they have them.

This is true of Anthropic or OpenAI too, but for some reason I think the US govt or anyone else will have a harder time getting my data from them than the CCP will from any Chinese company.
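The failure mode described above can be sketched in a few lines: a tool that naively reads a .env to learn variable names ships the values too. A hypothetical client-side redaction pass (the function and behavior here are illustrative, not any real agent's API) would keep the names and drop the secrets before anything leaves the machine:

```python
# Sketch: strip values from .env-style text before it goes into a prompt.
def redact_env(text: str) -> str:
    """Replace every KEY=value pair with KEY=<redacted>; keep comments/blanks."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)                      # comments and blank lines pass through
        elif "=" in stripped:
            key, _, _ = stripped.partition("=")
            out.append(f"{key}=<redacted>")       # keep the variable name, drop the secret
        else:
            out.append(line)                      # leave anything unrecognized alone
    return "\n".join(out)

print(redact_env("API_KEY=sk-12345\n# db config\nDB_URL=postgres://u:p@h/db"))
```

The point is only that the redaction has to happen on your side of the wire; once the raw file is in the context window, it's on someone else's servers.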

0xbadcafebee

> for some reason I think the us govt or anyone else will have a harder time getting to my data

US companies are required by law to hand over your data if given a warrant by USG. They don't need a warrant if they have a subpoena for less invasive data, or a FISA request. They can also ask without any justification, and see if the company will cough it up anyway (they often do). Any AI company with government contracts will want to give up data quicker so as not to threaten deals worth hundreds of millions.

bean469

> but for some reason I think the us govt or anyone else will have a harder time getting to my data from them than the CCP will any chinese company

With what we know about the US government's mass surveillance in cooperation with tech giants, I would highly doubt that either country is "better" in this regard

ndiddy

> but for some reason I think the us govt or anyone else will have a harder time getting to my data from them than the CCP will any chinese company.

US tech companies voluntarily give their data to the US government. Don't you remember PRISM? You think they stopped doing that?

> Internal NSA presentation slides included in the various media disclosures show that the NSA could unilaterally access data and perform "extensive, in-depth surveillance on live communications and stored information" with examples including email, video and voice chat, videos, photos, voice-over-IP chats (such as Skype), file transfers, and social networking details.[2] Snowden summarized that "in general, the reality is this: if an NSA, FBI, CIA, DIA, etc. analyst has access to query raw SIGINT [signals intelligence] databases, they can enter and get results for anything they want."[13]

protocolture

>I think the us govt or anyone else will have a harder time getting to my data than the CCP will any chinese company.

Why? You don't think Five Eyes cyber peeps use every advantage they can get? And on the way out leave a dusting of evidence pointing at the Russkies or Chinese?

2ndorderthought

Why would two companies that are burning hundreds of billions of dollars and are not profitable be safe keepers of your data, when there is a huge market for all of it in the US, and the US has really weak protections, so the companies can sell it to defense agencies?

Thing is, either way your data is getting hoovered up. If not today, then eventually; it's just a matter of where. If you work in an industry where nation states might want to do you irreparable harm, then yeah, don't let your data leave the country.

tasuki

> If it reads your .env just to see the variable names, the secrets have been sent to the servers.

But these are development secrets. They're of no use to anyone.

Production secrets probably live in another repo, or at least are stored encrypted. In my case both. And I'm managing the least sensitive stuff you can imagine: a pre-school website, a non-profit event website, personal blog...

striking

It's unlikely that you're special enough that someone will genuinely look through the massive amount of data produced by this system in order to target You Specifically. If you are that special you can just use another provider.

From this line of reasoning, my guess is that the huge discount is not so much intended to sell the data collection system as much as it is intended to sell the model. If you had to wring a geopolitical consequence from this, it would be that the US labs producing models would be impacted by a vastly less expensive competitor.

missedthecue

Not for my purposes tbh. Enjoy my shitty javascript, Xi.

brcmthrowaway

Should be TypeScript

martin_henk

Yes. Imagine getting denied at the border or something because of data you shared with DeepSeek, WeChat, or any other China-centric service.

2ndorderthought

Are you actually planning on travelling out of the country right now? It's probably not a good idea even if you don't use Chinese products, which by the way you definitely do.

dylan604

The people that travel out of the country are typically not the same ones aligned with the current administration. The vast majority of the MAGA base are more likely to not have a passport, while a large portion have probably never left their state.

peyton

Definitely would select the frowny face if that happened.

dylan604

Might as well answer yes to the "are you a subversive" question

mannanj

The US does that to you too, for not liking your opinions about particular parties or intelligence apparatuses.

inerte

I think martin_henk is fully aware of that and it's why of all the examples of how a government can use your data, he picked this one...

mdni007

No I'm more concerned with OpenAI and Anthropic AI models being used as a tool to murder brown people in the middle east for our "greatest ally".

protocolture

More worried about the Epstein regime

dyauspitr

Eh I’m using it for stuff where there is nothing proprietary or identifiable.

dancemethis

Eh, I'd be more concerned about the Three-Letters and the One country that dropped an A-bomb.

serf

>the One country that dropped an A-bomb.

i'd like to point out that the soviet RDS-3 was an airdropped A-bomb.

I get that you mean 'in anger', but I don't feel that bad being a pedant against a propagandist statement that's also pedantically wrong.

dylan604

I don't think anyone will ever be confused by "only country to use the bomb" in this context. Your pedantry is not something to feel good about, as it does nothing constructive for the conversation.


WatchDog

What coding agent (ideally CLI) have people found works well with this?

Occasionally I go and try different agents with openrouter models, but nothing seems to really get close to the proprietary ones like claude-code.

croon

As sibling said, Pi is great, and you can absolutely run it directly (there's even a plugin to use itself as a sub-agent), but I mainly run it as a sub-agent from other harnesses: for example, running a more capable model in Copilot and delegating simpler chunks to Pi (using a cheaper model) as the sub-agent. I've tried Gas Town and some others but never got into that way of working. I'm going to try OpenCode though, as a less vendor-specific harness than Copilot/Claude/Gemini.

speu

I've been using OpenCode for a few days now, I like it. It doesn't feel any "less heavy" than Claude Code (they're both massive piles of vibe-coded typescript) but for me it's essentially a 1:1 replacement for Claude Code.

Sidenote, I've been trying deepseek-v4-flash and I'm blown away. It's no Opus, but it's as cheap as tap water and punches far above its weight as a Flash model. I keep throwing tasks at it out of curiosity and it keeps solving them.

flakiness

Pi (pi.dev) is fine. I'm using it with DS v4 right now. It's not close to Claude Code, but I think that's the point.

By the way, the OpenRouter version is very slow for some reason. The DeepSeek platform is faster (and cheaper with the discount) if you don't mind passing your credit card number / email to this company.


pupppet

Have any regular Opus users taken V4 for a spin? What’s your take?

binary132

Anecdatally, out of all the popular LLMs I’ve only found Gemini to be any use for entry-level Ford Power Stroke Diesel mechanics and diagnostics. :)

grovel4brown

lmao i can pay them to steal my ideas and code
