
manofmanysmiles

I love the following section of their copy:

> Even More Value for Upgraders

> The new 14- and 16-inch MacBook Pro with M5 Pro and M5 Max mark a major leap for pro users. There’s never been a better time for customers to upgrade from a previous generation of MacBook Pro with Apple silicon or an Intel-based Mac.

I read that as "Whoops, we made the M1 MacBook Pro too good, please upgrade!"

I think I will get another 2-5 years out of mine.

Apple: If you document the hardware enough for the Asahi team to deliver a polished Linux experience, I'll buy one this year!

dawnerd

My 32gb m1 max was probably the best purchase I've made. Still plenty of headroom in performance left in this beast. Wonder what reason they'll use to end software support in the future. Bet it'll be some security hardware they make up for the sake of forcing upgrades.

kobalsky

my tinfoil hat theory is that they make small features depend on new hardware.

for example, let's say the new OS depends on the M5's exclusive thumbnail-generator accelerator, and let's say it improves speed by 20%.

now, your M1 notebook, which on previous OSes used standard GPU acceleration for thumbnails, won't have this specialized hardware acceleration; it will get a software fallback that is 90% slower.

you won't notice it at first because it's still fast, but it eats a bit of the processor.

multiply this by 1000 features and you have a slow machine.

I don't know how else to explain how an iPad Pro cannot even scroll a menu without stuttering. It's insane how fast these things were on release.

coldtea

>my tinfoil hat theory is that they make small features depend on new hardware.

The general case is hardly a "tinfoil hat theory". They openly do that, and the major reason is to drive new hardware adoption.

That said, it doesn't usually work the way you describe. It's not adding new features that depend on HW optimization to slow older machines down (after all, one could just not use those features on an older machine, or toggle them off).

It's rather: you want these shiny new features, which are all we advertise for iOS/macOS N+1, and the main new changes? The big ones will only work if you have a newer machine, even though we could trivially enable them on older machines (and some don't even need special hardware, as there are third-party hacks that unlock them, and they work fine).

compounding_it

yes pretty much this. make useless features use up resources and make basic scrolling slow.

Liquid Glass, for example, is probably not so great when it comes to resources. It probably works better with the latest Metal and the hardware blocks on the M5's GPU, as opposed to using GPU cores and unified memory on an 8GB M1, which makes the latest macOS run not so great. I have the 8GB M1 Air and it is really slow on Tahoe. It was snappy just a couple of years ago on a fresh install.

jtokoph

This makes Asahi Linux so valuable to me. I'll just move to linux on my M2 Max when MacOS drops support.

theodric

A brand-new iMac G4 couldn't scroll smoothly in 2002. Apple has a long history of great-looking terrible performance.

That G4 was a dog in Mac OS X 10.1. I installed Yellow Dog, and it lit a rocket under its ass.

danielxt

It's not tinfoil, that's just how publicly traded companies work - increasing the share value

subpixel

I have a perfectly good 2015 MacBook that can't use Apple's own Passwords app, presumably hobbled intentionally to make me upgrade.

hx8

You're too far down the rabbit hole. At any time they could make the M1 incompatible with the latest version of macOS, which would push most people to upgrade.

GeorgeOldfield

my 64GB M1 Max is still fast AF. faster than my regular M3 in practice

i wish the new mbp had 256GB of ram :(

bluescrn

Shame the keycaps wear so poorly. Just a cosmetic issue, but on a £3k machine that’s otherwise amazing, it’s annoying to have keycaps that look rather dirty/greasy as they wear and develop shiny patches.

(Can at least replace them via the self-service repair store. Fiddly job but worth it)

debian3

I was surprised to learn that they still replace the keyboard on m1 max when they service the battery. Probably you are due at this point. I just had mine done

andy_ppp

I use my machine daily for 5 years and the keyboard looks new, what are you doing to it? ;-)

lordnacho

Mine still runs like the first day I had it. There's basically nothing that is limiting me with the machine as it is, everything is just me being slow to code.

I don't see why I need a new computer at the moment. In the past, I always got to a stage where the machine felt sluggish.

chairmanwow1

Yeah my M1 is still insanely snappy. Would be nice to have some extra legroom for things like compilation, but I'm far from feeling this device isn't sufficient for me.

vr46

Agreed - I was just picking mine up from a repair at the Apple Store - they replaced the top case as the keyboard was borked, found a logic board issue, and replaced the board. It's as good as new, and it's already lasted longer than any Mac I've ever owned. I want for nothing, although I wouldn't mind double the RAM and SSD. It's the perfect laptop.

karolist

Ditto, I don't see myself upgrading in the near future. The 64GB M1 Max I paid $2,499 for at the end of 2023 still feels like a new machine; nothing I do can slow it down. Apple kept the OS updated for around 6 years in Intel times, so I don't see how they can drop support for this one, tbh. I'm still paying for AppleCare since I depend on it so much.

manmal

Some of my M1 MBP Max keys are losing their coating, and the battery is at 74% capacity. At some point soon I'll need a service. But other than that, I have no real complaints. Even the case edge where my arms constantly rest doesn't look too bad.

My next MBP will have 128GB memory, but these prices just make me wanna wait longer.

abustamam

If you don't mind a bit of DIY, Apple runs a self-service repair program.

https://support.apple.com/self-service-repair

LTL_FTC

Those keys are easily replaced, my friend.

duskdozer

Do they need a reason? I see plenty that amounts to nothing more than "that's old"

ramijames

I've been on a Macbook M1 Pro since 2022 (bought refurbished on Amazon for cheap) and it's still such a powerhouse. It doesn't struggle at all with anything that I throw at it. Kind of amazing.

Nothing has broken and I consistently get 4-6 hours of heavy work time while on battery. An amazing machine for the price I paid.


Nevermark

> I read that as "Whoops, we made the M1 MacBook Pro too good, please upgrade!"

As the target of that marketing, I can report it hits home!

But objectively, there is nothing wrong with my current experience at all.

I have never had that experience over many generations and types of machines. The M1 keeps looking better and better in hindsight.

---

Looking forward, either the M5 is the next M1, a bump so good it will last, or Apple will really be firing on all cylinders if it can "obsolete" the M5 anytime soon.

neya

I upgraded to an M3 Pro from an M1 Pro. I sold my M1 Pro at 90% of the original cost (not even exaggerating) on Facebook marketplace AFTER 2 YEARS.

I thought the buyer was insane to buy it at that price. But, of course mine had a decent spec and still had the Apple care warranty with very low battery cycle count. After the sale, the buyer told me the truth: The M1 is the best chip Apple ever made and I wouldn't see much of a difference in real world between the M1 Pro and an M3 Pro unless it was the Max version of the chip.

I didn't believe him then. But after a year of being on the M3 Pro, I gotta say he was spot on. Don't get me wrong, the M3 Pro is definitely faster at a lot of things. But not 3x or 2x faster like Apple always likes to market. I can open a few extra tabs without slowing down, and compile times (Elixir) did get somewhat faster. But definitely not faster to the point where there were two generations' worth of performance improvements like Apple claimed.

The M1 chip series is vastly underrated.

freehorse

M3 was a weird generation, as the chips contained fewer transistors than the previous ones. It is slightly faster in single-core tasks, and has a few more cores, but they are very close. But in terms of GPU, M3s are quite nerfed, especially because they lowered the memory bandwidth, so on LLM performance they are on par. I have both an M3 and an M1 Max, one of them from work, so I have tested them extensively (though the M3 is binned and 14”, the M1 full and 16”). The M3 had better TTFT but the M1 had a bit higher tokens/s.

neya

Wow! Thanks for sharing. I didn't know this. Time to upgrade to M5? What do you think about the M5? I know it's too early for tests. But I would love to hear your opinion.

retired

Impressive. Four years in and my once €2100 M1 Pro is worth maybe €600.

neya

You can try selling it in Asia (Singapore / Malaysia). You can usually get a good deal for it there if your battery cycle count is low. One thing I really learned is that it's super important to keep the battery cycle count low if you want a good resale value on your machine. I was fortunate with the M1 Pro to have always used it plugged in, because I was constantly worried about not having enough battery when I actually needed it.

seanalltogether

Same. In fact, the only reason right now that I would upgrade my M1 Pro is if they threaten to change the design by getting rid of the HDMI port or SD card slot, or do something stupid like when they added the Touch Bar. I was locked into my old Intel Pro for so long because of all the bad hardware choices they were making.

virgildotcodes

You may get your wish with all the rumors of a touch screen on the M6 MBPs.

throwforfeds

Love that they didn't learn anything from the touchbar.

xp84

A touch screen could be useful! I love having one on my HP. It’s just another option that doesn’t hurt you if/when you aren’t using it. Unlike the Touch Bar that deleted 13 keys and replaced them with garbage.

fragmede

how about a cell modem in one

renewiltord

I have an M1 Max with 64 GB and an M4 Max with 128 GB and the latter feels noticeably snappier than the former. The latest MacOS release fucked up the M1’s performance. Wish I could downgrade easily. I want off that ride.

brailsafe

I have the M3 Pro (32gb) and an M4 Pro 16" (48gb), and the latter is sufficiently snappier to make me happy I waited to upgrade from my horrible Intel 13" i5 with 16gb. The M1 Pro I used for work a few years ago was great too. I'm not on Tahoe on either computer, thank god.

simonvc

I'm running Asahi Fedora with niri on my M1 Air, and apart from DisplayPort over USB-C not working (it's coming) it's perfect.

not too annoying to set up if the first thing you install is claude-cli

mkurz

> DisplayPort over USB-C not working (it's coming) it's perfect.

I am on a Macbook Pro M1 Pro running Asahi and a 28 inch external display via USB-C / dp alt mode as of typing this comment. They have a `fairydust` branch in their kernel repo which is meant for devs to test and hack on dp alt mode support, but it just works for me without problems.

See https://www.reddit.com/r/AsahiLinux/comments/1pzht74/dpaltmo...

riffraff

Yeah reading these announcements I realized my M1 Pro is supposed to be obsolete but I still see no reason to upgrade.

Also, my wife's still using the older Touch Bar MBP, and well, it works fine for her too.

I'm not sure who needs the newer pros.

phil21

Mostly folks who bought the base model with small amounts of RAM, I imagine.

While it’s workable, anything less than 24GB to me feels rather constrained. I definitely am not efficient though - leaving way too many browser tabs open I never actually get back to, running a few chrome profiles for work/side hustle/personal, etc.

I don't think I've been CPU constrained for many years now. The few times I need to do something that maxes out the CPU, it just isn't worth the upgrade vs. taking a break to grab a cup of coffee.

gbalduzzi

I'm still using the touch bar MBP. For doing web work using vs code, it works very well.

You start having problems when a heavy compilation is required, e.g. Android / iOS builds.

lelandfe

My 2020 M1 MBP just had its touch bar die a couple weeks ago :( Now I don't have function keys

Jolter

Everyone who’s still running Intel hardware, especially on Windows.

I recently swapped out my work PC (a beefy workstation laptop) for an M4 Pro and it’s an amazing upgrade.

jeanlucas

Well, I just upgraded from Intel late last year. There are lots of users still on Intel :)

bsimpson

There was a magical window at Google where you could be issued an iMac Pro 5k. (To this day, the standard issue monitor is still 1440p.)

~9 years later, there are a lot of people still using it as their main machine, waiting until we get kicked off the corp network for lack of software support.

bombcar

Was that one of the ones that could do "target display mode" and become a monitor for another machine?

apple4ever

I am coming from a 2020 iMac 27", and waiting for the M5 Mac Studio. I thought about upgrading last year, but I didn't really have the money. But now I do!

jbellis

I chased down what the "4x faster at AI tasks" was measuring:

> Testing conducted by Apple in January 2026 using preproduction 13-inch and 15-inch MacBook Air systems with Apple M5, 10-core CPU, 10-core GPU, 32GB of unified memory, and 4TB SSD, and production 13-inch and 15-inch MacBook Air systems with Apple M4, 10-core CPU, 10-core GPU, 32GB of unified memory, and 2TB SSD. Time to first token measured with an 8K-token prompt using a 14-billion parameter model with 4-bit quantization, and LM Studio 0.4.1 (Build 1). Performance tests are conducted using specific computer systems and reflect the approximate performance of MacBook Air.

butILoveLife

>Time to first token measured with an 8K-token prompt using a 14-billion parameter model with 4-bit quantization

Oh dear 14B and 4-bit quant? There are going to be a lot of embarrassed programmers who need to explain to their engineering managers why their Macbook can't reasonably run LLMs like they said it could. (This already happened at my fortune 20 company lol)

VladVladikoff

I don't really get why people are smack-talking this. Are there other laptops available that can do better?

diffeomorphism

Wrong question. If you sell a 6k€ machine "for AI", then you are judged on your own merits.

Replies like "but, but other laptops" are very weak attempts at deflection.

butILoveLife

My 2023 Nvidia 3060 laptop I spent $700 on?

piokoch

Nope, but other producers don't claim that their hardware "can run AI".


knicholes

I wonder if Apple has foresight into locally running LLMs becoming sufficiently useful.

DiscourseFan

It won't handle serious tasks, but I have Gemma 3 installed on my M2 Mac and it is good for most of my needs, especially data I don't want a corporation getting its hands on.

b112

They do! "You're holding it wrong."

velcrovan

This wasn’t a statement about capability. It’s just a detail about what model they used to compare the speed of two chips for this purpose. You want a bigger model, run a bigger model.

bbshfishe

Yeah no it didn’t. If you have a fully specced-out M3/M4 MacBook with enough memory you're running pretty decent models locally already. But no one is using local models anyway.

razster

I run a local model on the daily. I have it making tickets when certain emails come in, and I made a small tool that I can click to approve ticket creation. It follows my instructions and has a nice chain-of-thought process trained in. Local LLMs are starting to become very useful. Not OpenClaw crap.

weird-eye-issue

> Yeah no it didn’t

What is "it" and what didn't it do?

me551ah

If your company can afford a fully specced-out M3/M4 MacBook, then it can also afford cloud AI costs.

jordhy

With OpenClaw and powerful local models like Kimi 2.5, these specs make a lot of sense.

whynotmaybe

Quite interesting that it's now a selling point just like fps in Crysis was a long time ago.

re-thc

Next is the fps of an AI playing Crysis.

bartvk

If AI actually becomes somewhat sentient, it may be bored out of its skull in between our queries, and may want to do some "light gaming".

dana321

Or tasks per minute of the AI doing your job for you

patates

Now that you mentioned it, these Macs could theoretically also run Crysis, if it supported ARM and such! They should add that to the marketing material :)

gslepak

That is talking about battery life, not AI tasks. Footnote 53, where it says, "Up to 18 hours battery life":

https://www.apple.com/macbook-pro/

fulafel

So it's not measuring output tokens/s, just how long it takes to start generating tokens. Seems we'll have to wait for independent benchmarks to get useful numbers.

dotancohen

For many workflows involving real-time human interaction, such as voice assistants, this is the most important metric. Very few tasks are as sensitive to quality, once a certain response-quality threshold has been achieved, as the software planning and writing tasks that most HN readers are likely familiar with.

raw_anon_1111

The way that voice assistants work, even in the age of LLMs, is:

Voice -> Speech to Text -> LLM to determine intent -> JSON -> API call -> response -> LLM -> Text to Speech.

TTFT is irrelevant; you have to process everything through the pipeline before you can generate a response. A fast model is more important than a good model.

Source: I do this kind of stuff for call centers. Yes I know modern LLMs don’t go through the voice -> text -> LLM -> text -> voice anymore. But that only works when you don’t have to call external sources
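
To make that concrete, here's a minimal sketch of the pipeline in Python. Every function is a hypothetical placeholder (no particular vendor's API); the point is that nothing can be spoken until the whole chain has run:

  def speech_to_text(audio: bytes) -> str:
      # placeholder ASR step; in production this is a streaming STT model
      return "what's the weather in berlin"

  def llm_intent(text: str) -> dict:
      # placeholder: a fast LLM maps the utterance to a structured intent
      return {"intent": "get_weather", "city": "Berlin"}

  def backend_call(intent: dict) -> dict:
      # placeholder external API call the assistant must wait on
      return {"temp_c": 4, "sky": "overcast"}

  def llm_reply(intent: dict, result: dict) -> str:
      # placeholder: a second LLM pass renders the raw result as a sentence
      return f"It's {result['temp_c']}C and {result['sky']} in {intent['city']}."

  def text_to_speech(text: str) -> bytes:
      return text.encode()  # placeholder TTS

  def handle_turn(audio: bytes) -> bytes:
      # end-to-end latency is the sum of every stage, which is why overall
      # model speed matters more here than time-to-first-token
      text = speech_to_text(audio)
      intent = llm_intent(text)
      result = backend_call(intent)
      return text_to_speech(llm_reply(intent, result))

  print(handle_turn(b"...").decode())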

Art9681

It's going to be faster no matter what. My M3 Max prints tokens faster than I can read with the new MoE models. It's the prompt processing that kills it when the context grows beyond a threshold, which is easy to do in modern agentic loops.

fulafel

If your computer was faster at it, you could run more capable models at the same token rate.

oofbey

Token/s is entirely determined by memory bandwidth. TTFT is compute bound.
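
A rough back-of-envelope using the 14B/4-bit model from Apple's footnote makes the split clear. The bandwidth and FLOPS figures below are illustrative assumptions, not measured numbers:

  params = 14e9
  weights_gb = params * 0.5 / 1e9          # 4-bit weights ~= 7 GB

  # decode: every generated token streams all weights through the cores once
  bandwidth_gb_s = 546                     # assumed M4 Max-class bandwidth
  decode_ceiling = bandwidth_gb_s / weights_gb          # ~78 tok/s upper bound

  # prefill: ~2 FLOPs per parameter per prompt token, batched into big matmuls
  prompt_tokens = 8192
  fp16_flops = 34e12                       # assumed GPU fp16 throughput
  ttft_floor = 2 * params * prompt_tokens / fp16_flops  # ~6.7 s of compute

  print(f"decode <= {decode_ceiling:.0f} tok/s, prefill >= {ttft_floor:.1f} s")

More compute moves only the prefill number; more bandwidth moves only the decode number.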

fulafel

This is broadly correct for currently favoured software, but in computer science optimization problems you can usually trade off compute for memory and vice versa.

For example just now from the front page: https://news.ycombinator.com/item?id=47242637 "Speculative Speculative Decoding"

Or this: https://openreview.net/forum?id=960Ny6IjEr "Low-Rank Compression of Language Models Via Differentiable Rank Selection"

easygenes

Topical. My hobby project this week (0) has been hyper-optimizing microgpt for M5's CPU cores (and comparing to MLX performance). Wonder if anything changes under the regime I've been chasing with these new chips.

0: https://entrpi.github.io/eemicrogpt/

gok

consider using fp16 or bf16 for the matrix math (in SME you can use svmopa_za16_f16_m or svmopa_za16_bf16_m)

lastdong

14-billion parameter model with 4-bit quantization seems rather small

derefr

I think these aren't meant to be representative of arbitrary userland-workload LLM inferences, but rather the kinds of tasks macOS might spin up a background LLM inference for. Like the Apple Intelligence stuff, or Photos auto-tagging, etc. You wouldn't want the OS to ever be spinning up a model that uses 98% of RAM, so Apple probably considers themselves to have at most 50% of RAM as working headroom for any such workloads.

duskwuff

Also: they're advertising the degree of improvement ("4x faster"), not an absolute level of performance.

giancarlostoro

On my 24GB RAM M4 Pro MBP some models run very quickly through LM Studio to Zed; I was able to ask it to write some code. Of course my fan starts spinning like the world's ending, but it's still impressive what I can do 100% locally. I can't imagine what it's like on a more serious setup like the Mac Studio.

jbellis

Your limitation after prefill is memory bandwidth. A maxed-out Studio has less of it than a single 3090 (really).

efxhoy

How is the output quality of the smaller models?

kraig911

what model were you using?

simlevesque

It's not much for a frontier AI but it can be a very useful specialized LLM.

bilbo0s

It is.

That's how they make loot on their 128GB MacBook Pros. By kneecapping the cheap stuff. Don't think for a second that the specs weren't chosen so that professional developers would have to shell out the 8 grand for the legit machine. They're only gonna let us do the bare minimum on a MacBook Air.

butILoveLife

For anyone who has been watching Apple since the iPod commercials, Apple really, really operates in a grey area when it comes to honesty in their marketing.

And not even diehard Apple fanboys deny this.

I genuinely feel bad for people who fall for their marketing thinking they will run LLMs. Oh well, I got scammed on runescape as a child when someone said they could trim my armor... Everyone needs to learn.

zitterbewegung

Yesterday I ran qwen3.5:27b on an M1 Max with 64 GB of RAM. I have even run Llama 70B back when llama.cpp came out. These run sufficiently well, if somewhat slow, but given the improvements in the M5 Max it should be a much faster experience.

giwook

I don't know that there would be a huge overlap between the people who would fall for this type of marketing and the people who want to run LLMs locally.

There definitely are some who fit into this category, but if they're buying the latest and greatest on a whim then they've likely got money to burn and you probably don't need to feel bad for them.

Reminds me of the saying: "A fool and his money are soon parted".

mptest

In retrospect, was there a better place to learn about the cruelty of the world than runescape? Must've got scammed thrice before I lost the youthful light in my eye

jki275

I run local models on my M1 Max. There are a number of them that are quite useful.

jtbaker

my mac mini m4 is getting to be a good substitute for claude for a lot of use cases. LM Studio + qwen3.5, tailscale, and an opencode CLI harness. It doesn't do well with super long context or complexity but it has gotten production quality code out for me this week (with some fairly detailed instructions/background).

nine_k

There used to be a polite way to call this out, the "Steve Jobs's reality distortion field".

azinman2

Seems very reasonable to me

tux3

A bit strange to use time to first token instead of throughput.

Latency to the first token is not like a web page where first paint already has useful things to show. The first token is "The ", and you'll be very happy it's there in 50ms instead of 200ms... but then what you really want to know is how quickly you'll get the rest of the sentence (throughput)
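
If you want both numbers for your own machine, a rough timing sketch against any OpenAI-compatible local server works (LM Studio serves one on localhost:1234 by default; the model name below is a placeholder, and chunks are only an approximation of tokens):

  import time
  from openai import OpenAI  # pip install openai

  client = OpenAI(base_url="http://localhost:1234/v1", api_key="unused")

  start = time.perf_counter()
  ttft, chunks = None, 0
  stream = client.chat.completions.create(
      model="local-model",  # placeholder; use whatever model you have loaded
      messages=[{"role": "user", "content": "Explain TTFT vs throughput."}],
      stream=True,
  )
  for chunk in stream:
      if chunk.choices and chunk.choices[0].delta.content:
          if ttft is None:
              ttft = time.perf_counter() - start   # latency to first token
          chunks += 1
  total = time.perf_counter() - start

  if ttft is not None and total > ttft:
      print(f"TTFT: {ttft:.2f}s")                  # what Apple benchmarked
      print(f"decode: ~{chunks / (total - ttft):.1f} chunks/s")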

jbellis

As far as benchmarketing goes, they clearly went with prefill because it's much easier for Apple to improve prefill numbers (FLOPS-dominated) than decode (bandwidth-dominated, at least for local inference); M5 unified memory bandwidth is only about 10% better than the M4's.

GeekyBear

In previous generations, throughput was excellent for an integrated GPU, but the time to first token was lacking.

hedgehog

Not strange: for the kinds of applications models of that size are often used for, prefill is the main factor in responsiveness. Large prompt, small completion.

case540

I assume it's time to first output token, so it's basically throughput: how fast can it process 8,000 prompt tokens and emit the 8,001st?

fragmede

No you don't. Not as a sticky, mushy human with emotions watching tokens drip in. There's a lot of feeling and emotion not backed by hard facts and data going around, and most people would rather see something happening even if it takes longer overall. Hence spinner.gif, which doesn't actually do a damned thing, but gives users reassurance that they're waiting for something good. So human psychology makes time to first token an important metric to look at, although it's not the only one.

nabakin

I would consider it reasonable if this were 4x TTFT and throughput, but it seems like it's only TTFT.

hydroreadsstuff

The 4x comes from the neural accelerators (tensor cores in NVIDIA jargon). It's 4x fp16 over the vector path (and 8x compared to M1, because at some point they 2x'd the fp16 vector path). Therefore LLM prefill (context processing/TTFT), diffusion models (image gen), and e.g. video and photo effects that make use of them can be up to 4x faster. At fp16 that's the same speed at the same clock as NVIDIA, but NVIDIA still has 2x fp8 and 4x nvfp4.

Batch-1 token generation, the number that is often quoted, does not benefit from this. It's purely RAM-bandwidth-limited.

Tangokat

"Scaling up performance from M5 and offering the same breakthrough GPU architecture with a Neural Accelerator in each core, M5 Pro and M5 Max deliver up to 4x faster LLM prompt processing than M4 Pro and M4 Max, and up to 8x AI image generation than M1 Pro and M1 Max."

Are they doubling down on local LLMs then?

I still think Apple has a huge opportunity in privacy first LLMs but so far I'm not seeing much execution. Wondering if that will change with the overhaul of Siri this spring.

butILoveLife

I think its just marketing, and the marketing is working. Look how many people bought Minis and ended up just paying for API calls anyway. (Saw it IRL 2x, see it on reddit openclaw daily)

I don't mind it; I own Apple stock. But I'm def not buying into their rebranding of integrated GPU under the guise of Unified Memory.

jsheard

> Look how many people bought Minis and ended up just paying for API calls anyway. (Saw it IRL 2x, see it on reddit openclaw daily)

Aren't the OpenClaw enjoyers buying Mac Minis because it's the cheapest thing which runs macOS, the only platform which can programmatically interface with iMessage and other Apple ecosystem stuff? It has nothing to do with the hardware really.

Still, buying a brand new Mac Mini for that purpose seems kind of pointless when a used M1 model would achieve the same thing.

ErneX

It's exactly that. They are buying the base model just for that. You are not going to do much local AI with those 16GB of RAM anyway; it could be useful for small things, but the main purpose of the Mini is being able to interact with the Apple apps and services.

re-thc

> Aren't the OpenClaw enjoyers buying Mac Minis because it's the cheapest thing which runs macOS

That's likely only part of the reason. The Mac Mini is now "cheap" because everything else exploded in price. RAM and SSDs etc. have all gone up massively. Not to mention the Mac Mini is an easy out-of-the-box experience.

philistine

There are so few used Mac Minis around; those are all gone, and what is left is to buy new.

renewiltord

Bro. The used M1 mini and studio are all gone. I was thinking of buying one for local AI before openclaw came out and went back to look and the order book is near empty. Swappa is cleared out. eBay is to the point that the m1 studio is selling for at least a thousand more.

This arb you’re talking about doesn’t exist. An m1 studio with 64 gb was $1300 prior to openclaw. You’re not getting that today.

I would have preferred that too since I could Asahi it later. It’s just not cheap any more. The m4 is flat $500 at microcenter.

BeetleB

Can't they simply run MacOS on a VM on existing Mac hardware?

llmslave

yes, and it's funny that all these critical people don't know this

rafram

Why not? The integrated GPUs are quite powerful, and having access to 32+ GB of GPU memory is amazing. There's a reason people buy Macs for local LLM work. Nothing else on the market really beats it right now.

mleo

My M4 MacBook Pro for work just came a few weeks ago with 128 GB of RAM. Some simple voice customization started using 90GB. The unified memory value is there.

lizknope

Jeff Geerling had a video of using 4 Mac Studios each with 512GB RAM connected by Thunderbolt. Each machine is around $10K so this isn't cheap but the performance is impressive.

https://www.youtube.com/watch?v=x4_RsUxRjKU

Greed

If $40k is the barrier to entry for impressive, that doesn't really sell the use case of local LLMs very well.

For the same price in API calls, you could fund AI driven development across a small team for quite a long while.

Whether that remains the case once those models are no longer subsidized, TBD. But as of today the comparison isn't even close.

Hamuko

I've tried to use a local LLM on an M4 Pro machine and it's quite painful. Not surprised that people into LLMs would pay for tokens instead of trying to force their poor MacBooks to do it.

atwrk

Local LLM inference is all about memory bandwidth, and an M4 pro only has about the same as a Strix Halo or DGX Spark. That's why the older ultras are popular with the local LLM crowd.

usagisushi

Qwen 3.5 35B-A3B and 27B have changed the game for me. I expect we'll see something comparable to Sonnet 4.6 running locally sometime this year.

freeone3000

I’m super happy with it for embedding, image recog, and semantic video segmentation tasks.

giancarlostoro

What are the other specs and how's your setup look? You need a minimum of 24GB of RAM to run models of 16GB or less.

andoando

Local LLMs are useful for stuff like tool calling

tcmart14

I'm not really into AI and LLMs. I personally don't like anything they output. But the people I know who are into it and into running their own local setups are buying Studios and Minis for their at-home local LLM setups. Really, everyone I personally know who is doing the build-your-own local LLM thing is doing this. I don't know anyone buying other computers and NVIDIA graphics cards for it anymore.

threatofrain

The biggest problem with personal ML workflows on Mac right now is the software.

cmdrmac

I'm curious to know what software you're referring to.

0x457

I think people buying those don't realize the requirements to run something as big as Opus; they think those gigabytes of memory on a Mac Studio/Mini are a lot, only to find out that it's "meh" in the context of LLMs. Plus most buy it as a gateway into the Apple ecosystem for their Claws, iMessage for example.

> But I'm def not buying into their rebranding of integrated GPU under the guise of Unified Memory.

But it is unified memory? Thanks to Intel iGPUs, the term has been tainted for a long time.

whizzter

We had a workshop 6 months ago, and while I've always been sceptical of OpenAI et al's silly AGI/ASI claims, the investments have shown the way to a lot of new technology and let a genie out of the bottle that won't be put back.

Now extrapolating in line with how Sun servers around the year 2000 cost a fortune and can be emulated by a $5 VPS today, Apple is seeing that it can maybe grab the local LLM workloads if it acts now with its integrated chip development.

But to grab that, they need developers to rely less on CUDA via Python, or have other proper hardware support for those environments, and that won't happen without the hardware being there first and the machines being able to be built with enough memory (refreshing to see Apple support 128GB, even if it'll probably bleed you dry).

fny

I feel like the push by devs towards Metal compatibility has been 10x that of AMD. I assume that's because the majority of us run MacBooks.

whizzter

I think that might be partly because on regular PCs you can just go and buy an NVidia card instead of futzing around with software issues, and those on laptops probably hope that something like Zluda will solve it via software shims or MS-backed ML APIs.

Basically, too many choices to "focus on" makes no one a winner except the incumbent.

davidmurdoch

Who is "us" in this case? Majority of devs that took the stack overflow survey use Windows:

https://survey.stackoverflow.co/2025/technology/#1-computer-...

pjmlp

Which majority?

I certainly only use Macs when assigned to them by a project, and then there are plenty of developers out there whose job has nothing to do with what Apple offers.

Also while Metal is a very cool API, I rather play with Vulkan, CUDA and DirectX, as do the large majority of game developers.

well_ackshually

The only "push" towards Metal compatibility there's been has been complaints on github issues. Not only has none of the work been done, absolutely nobody in their right mind wants to work on Metal compatibility. Replacing proprietary with proprietary is absolutely nobody's weekend project. or paid project.

freeone3000

Torch MPS support on my local MacBook outperforms a CUDA T4 on Colab.

pjmlp

Except CUDA feels really cozy, because like Microsoft, NVidia understands the Developers, Developers, Developers mantra.

People always overlook that CUDA is a polyglot ecosystem, the IDE and graphical debugging experience where one can even single step on GPU code, the libraries ecosystem.

And as of last year, NVidia has started to take Python seriously, and now with cuTile-based JIT it is possible to write CUDA kernels in pure Python, not having Python generate C++ code that other tools then ingest.

They are getting ahead of Modular, with Python.

woadwarrior01

> Are they doubling down on local LLMs then?

Neural Accelerators (aka NAX) accelerate matmuls with tile sizes >= 32. From a very high-level perspective, LLM inference has two phases: (chunked) prefill and decode. The former is matmuls (GEMM) and the latter is matrix-vector mults (GEMV). Neural Accelerators make the former (prefill) faster and have no impact on the latter.
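
In shapes, with numpy standing in as a toy model of the real kernels (sizes are arbitrary illustrations):

  import numpy as np

  d_model, seq_len = 4096, 8192
  W = np.random.randn(d_model, d_model).astype(np.float16)  # one weight matrix

  # prefill: all prompt tokens at once -> (seq_len, d) @ (d, d) is a GEMM;
  # big tiles, compute-bound, exactly what a tensor/neural accelerator eats
  prompt = np.random.randn(seq_len, d_model).astype(np.float16)
  prefill_out = prompt @ W

  # decode: one new token per step -> (1, d) @ (d, d) is effectively a GEMV;
  # too skinny to tile, and each step re-reads all of W, so bandwidth dominates
  token = np.random.randn(1, d_model).astype(np.float16)
  decode_out = token @ W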

Lalabadie

There already are a bunch of task-specific models running on their devices, it makes sense to maintain and build capacity in that area.

I assume they have a moderate bet on on-device SLMs in addition to other ML models, but not much planned for LLMs, which at that scale might be good as generalists but very poor at guaranteeing success for each specific minute task you want done.

In short: 8GB storing tens of very small and fast purpose-specific models is much better than a single 8GB LLM trying to do everything.

Munachi1869

Probably possible for pure coding models. I see on-device models becoming viable and usable in like 2-3 years.


tiffanyh

> Are they doubling down on local LLMs then?

Apple is in the hardware business.

They want you to buy their hardware.

People using Cloud for compute is essentially competitive to their core business.

causal

"Doubling down on already being the best hardware for local inference"

Sharlin

"Apple Intelligence is even more capable while protecting users’ privacy at every step."

Remains to be seen how capable it actually is. But they're certainly trying to sell the privacy aspect.

re-thc

> Remains to be seen how capable it actually is.

It's the best. We all turned it off. 100% privacy.

jmyeet

Apple absolutely has a massive opportunity here because they used a shared memory architecture.

So as most people in or adjacent to the AI space know, NVidia gatekeeps their best GPUs with the most memory by making them eye-wateringly expensive. It's a form of market segmentation. So consumer GPUs top out at 16GB (5090 currently) while the best AI GPU (H200?) is 141GB (I just had to search)? I think the previous gen was 80GB.

But these GPUs are north of $30k.

Now the Mac Studio currently tops out at 512GB of SHARED memory. That means you can potentially run a much larger model locally without distributing it across machines. Currently that retails at $9,500, but that's relatively cheap, in comparison.

But, as it stands now, the best Apple chips have significantly lower memory bandwidth than NVidia GPUs and that really impacts tokens/second.

So I've been waiting to see if Apple will realize this and address it in the next generation of Mac Studios (and, to a lesser extent, MacBook Pros). The H200 seems to be 4.8TB/s. IIRC the 5090 is ~1.8TB/s. The best Apple has is (IIRC) 819GB/s on the M3 Ultra.

Apple could really make a dent in NVidia's monopoly here if they address some of these technical limitations.

So I just checked the memory bandwidth of these new chips and it seems like the M5 is 153GB/s, M5 Pro is ~300 and M5 Max is ~600. I was hoping for higher. This isn't a big jump from the M4 generation. I suspect the new Studios will probably barely break 1TB/s. I had been hoping for higher.

SirMaster

>So consumer GPUs top out at 16GB (5090 currently)

5090 has 32GB, and the 4090 and 3090 both have 24GB.


fridder

It will be interesting to see the specs on an m5 ultra. Probably have to wait until WWDC at the earliest to see it though

ericd

Hard to get 6000+ bit memory bus HBM bandwidth out of a 512 or 1024 bit memory bus tied to DDR... I think it's also just tough to physically tie in 512 gigs close enough to the GPU to run at those speeds. But yeah, I wish there was a very competitive local option, too, short of spending $50k+.

saagarjha

There is a reason those data center GPUs are so expensive: it’s not trivial to “just” 5x the memory bandwidth.

Nevermark

• Having NPU cores since the M1 would seem to verify that running models has been a game plan for a while. LLMs coming along can only have increased that focus.

• Studios with Ultra Mx, now 4-way RDMA over Thunderbolt 5, and enormous RAM and SSD options, suggest a strong focus. I don't know what else that RAM would be intended for. Four Studio Ultras (total of 360 GPU cores with M5 Ultras?) with 2TB of unified RAM is a local model beast.

• They refashioned their GPU cores to better support both graphic and neural processing, despite already having focused NPU cores.

I would say they have been leaning into local models for several years.

I expect we will see more models being optimized for smaller sizes, as demand for them increases. With hardware performance and neural focus trending up, and model requirements/quality trending down, the next few years will be interesting times.

What would make me happy: Ultra x 2 (i.e. 2xUltra, 4xMax, 8xPro, 16xM5) packaging in the Studio. With 8-way RDMA. Mac Kong. Perhaps Apple will start making server cards again.

wincy

I typed “RAM” to search for it and boy they hammer home how lucky I am to be getting 1TB SSD standard, but no mention of RAM anywhere on this page. Anyway, the MacBook Pro starts with 16GB of RAM. It’s $400 to go from 16GB to 32GB.

Interestingly, 36-128GB models are showing as “currently unavailable” on the store page, and you can’t even place an order for them right now? But for anyone curious, it’s quoting $5099 for the 128GB RAM 14” MacBook Pro model.

jsheard

> It’s $400 to go from 16GB to 32GB.

No change from the previous models then, 16GB->32GB was already $400. They're cutting into their previously enormous margins to keep the prices stable, rather than hiking the prices to maintain their margins.

philistine

They bought the fab time for that RAM 2-3 years ago. Apple is renowned for their foresight and preparation. We'll eventually see price increases from Apple's RAM upgrade, but we're not there yet.

scottyah

Commodity futures made sense to me at FedEx- they would pay money with a supplier for the option to buy gas/oil at X price at Y date in the future. It costs more than just agreeing to pay for it at that price in the future, but if deliveries went way down (or prices) it'd be less costly to "back out".

I wonder if there's a fab time secondary market where Wall Street types are making millions off speculating fab time.

stingraycharles

I thought they bought RAM externally, before soldering it on their chips?

Regardless, point still stands, they probably ordered this several years ago.

daveidol

Their margins may not have changed actually. https://youtu.be/IGCzo6s768o

niwtsol

This is not exactly correct. If you have an M5 Pro chip instead of the base M5 chip - I just built a 16-inch with the M5 Pro - it is $400 to go from 24GB -> 48GB. An additional $200 ($600 over base) to go to 64GB. So the memory prices change based on the chip. The M5 Max chip starts with 48GB of memory.

abhikul0

M5 Max starts at 36GB memory at $3599. M4 Max started at the same memory at $3199. They have doubled the default storage from 1TB to 2TB, that's a $400 increase I'm paying even if I don't want the extra 1TB.

aroman

They raised the base price by $200.

carefree-bob

Apple's previous policy of price gouging for RAM means no need to raise prices yet, they still have a buffer.

__loam

They also have long term contracts with the suppliers in all likelihood

sgt

In practice, you can really go a long way on 16GB on a Mac with unified memory. I like to say it's comparable to 32GB during the old Intel days.

cardanome

They advertise local LLMs, which will be severely limited with 16GB of RAM. Plus the GPU could in theory provide decent gaming performance, but again might suffer from the RAM limit.

Most people can totally live with 16 gigs, but it is kind of a waste for the horsepower. They know what they are doing. Apple is a master of upselling.

Though personally I don't mind the aggressive upselling as long as the quality is there. Problem is, the hardware quality is great but the software side is severely lacking and getting worse.

cthalupa

If anything, it's less, because you're giving up more RAM to the GPU.

Which, I mean, I love unified memory, as one of those weirdos that does do local LLM stuff and am contemplating if it's time to upgrade my m2 max.

But if you needed 32gb then you still need at least 32gb now. Unless swap on nvme disks is enough for you - and it isn't for me.

jsheard

RAM is still RAM, the switch from crusty HDDs to fast NVMe SSDs may have helped to smooth things over when you spill into swap but it's not going to do miracles.

jeroenhd

I know RAM is scarce and everything, but doubling down on LLM local acceleration with all of that dedicated silicon while at the same time sticking with Apple's traditional lack of RAM availability makes for a very weird product proposition to me.

raincole

> M5 Pro supports up to 64GB of unified memory with up to 307GB/s of memory bandwidth, while M5 Max supports up to 128GB of unified memory with up to 614GB/s of memory bandwidth

Isn't this it?

wincy

Ah yeah you’re right, thanks. I tried to at least make my post useful and pull up prices for the different tiers. Overall, those prices are surprisingly competitive now compared to the rest of the laptop market!

stetrain

On the M5 Pro tier (not the base M5 tier that was released last November), the base memory is 24GB.

My M3 Pro from a few years ago for the same price had 18GB.

kylec

Apple doesn't tend to use "RAM" in their marketing materials, they usually use "memory", which appears 9 times in the press release.

tonyedgecombe

>Anyway, it starts with 16GB of RAM. $400 to go from 16GB to 32GB

Interesting that this hasn't budged since the memory shortages appeared.

mschuster91

> Interesting that this hasn't budged since the memory shortages appeared.

Apple has had a war chest big enough to buy the entirety of TSMC's new capacity years in advance in the past.

If I were to guess, Apple locked in their entire BOM and production capacity two years ago. That's something even the large players cannot replicate because they run cash-lean and have too many different SKUs, and the small players (Framework, System76, even Steam) are entirely left to the forces of the markets.

lm28469

They sell you 1GB of LPDDR5X for $25 while buying it at $5; don't worry about their margins...

WarmWash

Fair chance that Apple has price/purchase agreements already in place. Consumers are left to fight over the excess capacity after megabuyers get their orders filled.

armsaw

Preorders open tomorrow according to the store page. You can’t order the base RAM model today, either.

aurareturn

It starts at 16GB for the base M5 and 24GB for the Pro/Max. It's been like this.

bob1029

I feel like Apple pulled an Instant Pot with the M1 MacBook Pro. I still haven't had a single situation where I felt like spending more money would improve my experience. The battery is wearing out a bit, but it started out life with so much runtime that losing a few hours doesn't seem to matter.

swyx

> The battery is wearing out a bit, but it started out life with so much runtime that losing a few hours doesn't seem to matter.

this is the exact opposite of my experience. my M3 Max from 2 years ago now has <2hrs battery life at best. wondering if any experts here can help me figure out what is going on? what should i be expecting?

varenc

As others have said, keep the battery in the 80%-30% range. Use the `batt` CLI tool to hard limit your max charge to 80%. Sadly, if you're already down to <2hrs, this might not make sense for you. Also prevent it being exposed to very hot or cold temps (even when not in use)

I type this from an M3 Max 2023 MBP that still has 98% battery health. But admittedly it's only gone through 102 charge cycles in ~2 years.

(use `pmset -g rawbatt` to get cycle count or `system_profiler SPPowerDataType | grep -A3 'Health'` to get health and cycles)
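
If you'd rather poll those numbers programmatically, a small sketch that scrapes the same "Health Information" fields out of the system_profiler text output (text parsing like this is inherently brittle across macOS versions):

  import re
  import subprocess

  def battery_health() -> dict:
      # pulls the same fields the grep above prints
      out = subprocess.run(
          ["system_profiler", "SPPowerDataType"],
          capture_output=True, text=True, check=True,
      ).stdout
      pattern = r"(Cycle Count|Condition|Maximum Capacity):\s*(.+)"
      return {key: val.strip() for key, val in re.findall(pattern, out)}

  print(battery_health())
  # e.g. {'Cycle Count': '102', 'Condition': 'Normal', 'Maximum Capacity': '98%'}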

rootusrootus

> I type this from an M3 Max 2023 MBP that still has 98% battery health.

That's amazing. I have an early 2023 M2 Max MBP that mostly charges in desktop mode, which limits to 80%. I just looked in battery health and it says 82%. Damn! :(

For giggles, earlier today I asked Apple how much they'd give me for this machine if I traded it in on a brand new $5K M5 Max equivalent. $825. Ouch. I think I will keep it for a few more years. 96GB is enough memory to do anything I want, and it's been such a great performer that it's easily my favorite MacBook ever. I do wish the battery weren't so degraded though.

For anec-science, here goes:

  % pmset -g rawbatt
  03/03/2026 18:29:51
   AC; Not Charging; 76%; Cap=76: FCC=100; Design=6075;   Time=1092:15; 0mA; Cycles=63/1000; Location=0;
   Polled boot=02/09/2026 07:24:50; Full=03/03/2026   18:24:52; User visible=03/03/2026 18:28:52

  % system_profiler SPPowerDataType | grep -A3 'Health'
        Health Information:
            Cycle Count: 63
            Condition: Normal
            Maximum Capacity: 82%

intelkishan

The option to have an 80% cap is being added in the beta versions of macOS. I think within a few months it should be available to general users without extra tools.

Analemma_

I set Claude loose on my computer and said “why is my battery life so bad?” and it found an always-running audio subsystem kernel extension (Parrot) which didn’t need to be there and was preventing the CPU from going into low-power states. My battery life got noticeably better when I deleted it.

I’m not even sure how it got installed, possibly when I installed Zoom for an interview once but I don’t know. Point is, at least in one case, AI can help track down battery hogs.

1123581321

What is your maximum capacity in Settings > Battery Health? What processes are running with significant CPU? What's the typical temperature of the laptop according to a stats app? (Temperature is a good proxy for general energy use.)

I'm typing this on an M3 Max; its max battery capacity is 88%. I've got some things running (laptop average temp is 50-55C, fans off), screen is half brightness, and it's projected to go from 90% to 0% in five hours. I don't usually baby it enough to test this, but 8-10 hours should be achievable.


windowsrookie

Either your battery was defective or something is using all your battery. Even my 2018 Intel MacBook still lasts 3+ hours on a charge.

Apple will replace the battery for $249 if you choose to. https://support.apple.com/mac-laptops/repair?services=servic...

stouset

People here are suggesting limiting your battery charge as a proactive measure to prevent degradation, but an M3 is far too new for you to be getting such poor battery life from use, even if you spent all day every day charging and discharging it.

The only plausible answers are either: something you’re running is eating CPU/GPU cycles like crazy (browser tabs gone amok, background processes) or you have a defective battery. Use Activity Monitor to look for energy usage and that will give you a pretty good idea.

saagarjha

This. The issue is not your battery but something running in the background.

sgt

I've got an M2 Pro from 3 years ago and the battery is still so good I can go to a whole day of meetings and not even need to bring my charger. Then I can probably work all night as well without plugging it in. Battery time is insane.

Unless of course you're doing something that truly sucks down your battery! If I spin up a few Docker instances doing 100% CPU then obviously battery will go down much quicker.

0_____0

Charge habits with batteries make a huge difference. If your use pattern is that once per day, you take the device from 100% to 10%, you put a lot more wear on the battery than if it kind of hovers in the 30%-80% range for example, or if it just hangs out nearish top-of-charge all day when you're at your desk.

Hot take: people should get used to, and expect to, replace device batteries 1 or 2 times during the device lifetime. They're the main limiting factor on portable device longevity, and engineers make all kinds of design tradeoffs just to make the 1 battery that the device ships with last long enough to not annoy users. If we could get people used to taking their device in for a new battery once every couple of years, we could dramatically reduce device waste, and also unlock functionality that's hidden behind battery-preserving mechanisms.

Analemma_

BatFi is a macOS application which will prevent your battery from charging over 80% by default. macOS does have a version of this built in, but it's an "intelligent charging" feature I don't really trust, and I'd rather just have a hard 80% limit except when I override it.

BugsJustFindMe

> Charge habits with batteries make a huge difference.

> Hot take: people should get used to, and expect to, replace device batteries 1 or 2 times during the device lifetime.

I agree that people should get used to replacing device batteries, but if you accept that then you should just stop worrying about charge habits. An MBP that doesn't have a defective or extreme-heat-damaged battery should stay above 80% battery capacity for at least 600 charge cycles without any special care at all. That's many years of regular charging, and 80% capacity is still good for all day usage.

hmottestad

My M3 Max can burn through battery much faster than my M1 Max ever could.

And some apps are really inefficient. The new Codex app drains my battery. If you are using Codex I recommend minimizing it, since it's the UI that uses most of the power.

linsomniac

A couple weeks ago I was working remote and didn't bring a power adapter, and I realized a couple hours in that my battery was getting kind of low. I clicked on the battery icon and got a list of what was using a lot of power: 1 was an hour long video chat using Google Meet, the other was Claude desktop (which I hadn't used at all that morning).

What in the world is an idle Claude Desktop doing that uses so much power?

willis936

I just bought this model in the past year for $600 and it still feels like a great bargain.

rfwhyte

You can very easily replace the battery yourself for less than $100 USD too, if it ever becomes enough of an issue that you feel you actually need to do something about it. My M1 Max is at about 88% battery health, but it still gets 4x-6x longer on battery (at full performance to boot) compared to my old PoS Razer laptop, so I likely won't be replacing my battery any time soon.

cromka

I've bought an almost-brand-new top case with battery twice now for 50 USD on eBay. That was for an M1 Air, but I can't imagine the Pro would be much more expensive, especially because the keyboard is replaceable in the Pro. Takes an hour to replace everything.

rajma

M1 Pro MacBook Pro here as well. Just today I was thinking I have no need to upgrade until the M7, and by then maybe even a MacBook Air would do. Especially since I will have my home server (DGX Spark) available for anything serious anyway. So excited for the Mac Studio configs though. An M5 Ultra with 1TB would be a huge leap for serious home server builders.

maxverse

I use an M1 for personal development and an M4 for work. I'm a typical dev. I don't feel any difference.

ireflect

Same. It looks like battery replacement from ifixit is not too difficult, so I plan to do that when the time comes.

Incidentally, I just switched to Asahi Linux, but that was for software quality and openness reasons, rather than anything to do with performance.

fridder

How's Asahi treating you? If I upgrade from my M1 Max, I was going to try it out.

darknavi

I wish this sort of thing was encouraged in the modern capitalist technology space.

Unfortunately, number always must go up (and the rate at which the number goes up, also must go up).

nsbk

The hardware looks amazing! Too bad they will ship with Tahoe installed. I’m not upgrading until I see in which direction the next Mac OS release goes

satoqz

This. I have been a big (and loud) fan of M-series hardware from the beginning, but if Apple is going to keep making their software worse, I will find myself lingering on older generations that run Asahi Linux or going back to a traditional x86_64 laptop instead of buying into new generations.

midtake

I don't trust Asahi after the whole Asahi Lina thing. Lina being an alt in denial of her other identity is a big red flag. If Hector was honest about it I would feel differently. The deception behind the Lina identity is very weird to me.

LeonM

I'm not sure what Hector's personal choices have to do with not "trusting" a piece of software? It's open source, so if you don't trust the quality of the software, then just inspect it yourself?

Also, FWIW: Hector/Lina is no longer associated with Asahi.

ladyanita22

Oh, who cares about that?

solarkraft

You making a big deal out of it is very weird to me.

carlmr

I've upgraded to Tahoe at 26.2, zero complaints from my side. Haven't had any runaway memory leaks or similar that were reported.

jillesvangurp

Same here. I know some people are unhappy with some of the UX tweaks but honestly I don't notice much of it. The whole liquid glass thing is a bit gimmicky. Other than that, I don't see much difference. The rounded corners on windows are a bit silly. But I don't spend a lot of time fiddling with windows. Most of my windows are maximized (not full screen). I'm sure there are other issues people dislike that I just haven't noticed.

I use my laptop for development. I don't actually use most of the built in applications. My browser is Firefox, I use codex, vs code, intellij, iterm2, etc. Most of that works just fine just as it did on previous versions of the OS. I actually on purpose keep my tool chains portable as I like to have the option to switch back to Linux when I want to. I've done that a few times. I come back for the hardware, not the OS.

In my experience, if you don't like Apple's OS changes that is unfortunate but they don't seem to generally respond to a lot of the criticism. Your choices are to get further and further out of date, switch to something else, or just swallow your pride. Been there done that. Windows is a "Hell No" for me at this point. I'll take the UX, with all the pastel colors that came and went and all the other crap that got unleashed on macs over the last ten years. Definitely a case of the grass not being greener on Windows. Even with the tele tubby default desktop in XP back in the day.

I can deal with Linux (and use that on and off on one of my laptops). However, that just doesn't run that well on mac hardware. And any other hardware seems like a big downgrade to me. Both Windows and Linux are arguably a lot worse in terms of UX (or lack thereof). Linux you can tweak. And you kind of have to. But it just never adds up to consistent and delightful. Windows, well, at this point liking that is probably a form of Stockholm Syndrome. If that doesn't bother you, good for you.

So, macOS it is for me, as everything else is worse. I've deferred updates to new versions of macOS in the past as well. Generally you can do that for a while, but eventually it becomes annoying when things like Homebrew and other development toys start assuming you run something more recent. And of course for security reasons you might just not want to drag your feet too long. Just my personal, pragmatic take.

iknowstuff

Is your Spotlight usable? Mine literally will not find an app

Searching for Chat yields "Ask ChatGPT", "ChatGPT Atlas", "ChatGPT Atlas" the website, and chatgpt.com. Does not yield the actual ChatGPT.app which I have currently open lol.

arianvanp

Closing tabs in Safari still takes more than a second though. And if you hold Cmd-W to close all of them it just completely locks up and crashes. Still not fixed since the release of Safari 26.

Literally unusable

Analemma_

I'm on an M4 Pro MacBook-- basically the fastest computer you could buy from Apple before today-- and opening/closing the tab sidebar in Safari on Tahoe takes multiple seconds, even if I have only 4-6 tabs open, and seems to drop to 5 FPS. It's comically bad.

It's so bad I switched back to Chrome. I had thought Chrome had a major battery life penalty compared to Safari on Macs, but I checked more up-to-date info and apparently that's outdated.

nozzlegear

Never had this problem, been on Tahoe since it released. My safari tabs are buttery, silken smooth.

michelb

I have this issue as well on multiple Tahoe Macs. Opening a new Safari window takes 500ms to 1000ms. Adding a tab is faster most of the time. But Safari frequently loses tabs, turning them into a blank page without a URL. Searching in the Passwords app takes multiple seconds. This is on multiple Macs with different iCloud accounts, even.

AdamN

Works fine for me. I wonder if you have some extension or script on one of the sites you use slowing down the tab closure.

alwillis

I’ve been running the macOS 26.4 beta and have none of these issues.

achierius

Do you have more info on those crashes (e.g. crashlogs)? I work on Safari and might be able to get that forwarded to other people.

clint

That's objectively false. I use Safari all day every day and have never experienced any of that stuff.

herpderperator

This sounds like memory needing to be swapped back in and then released. Check your memory usage.

gas9S9zw3P9c

I moved away from Mac because of the OS and couldn't be happier. The hardware may be great, but non-Apple hardware is fine too, and Linux is a significantly better experience than macOS these days.

satvikpendem

The next macOS will be touch-screen centric, with elements getting bigger when you're close to touching them, rumors say. That being said, I run Tahoe and it works perfectly fine for me; I am not sure what issues people have with it. Sure, some corner radii aren't exactly the same, but I honestly couldn't give less of a shit as long as it runs the programs I need.

nsbk

Safari routinely using 20+ GB of memory with a handful of tabs open. Safari tabs refusing to close. Unresponsive System Settings window. Random application freezes and crashes, Apple Music not playing music. This is on a 32GB M1 Max. My M1 Air on Sequoia doesn't experience any of these issues, even though it has half the unified memory.

nuker

Just do a fresh OS install, erasing the full disk first. Then be careful what apps you add. Done.

clint

Never seen any of this even once.

satvikpendem

I never had any of those issues, but then again I don't use Safari or other Apple apps like Music.

ErneX

I read a rumor about it being “touch friendly instead of touch 1st”.

benoau

Presumably touch will be fully interchangeable and equivalent with mouse clicks and trackpad gestures.

pier25

Yeah this is a real issue with these new Macs. I would wait until macOS 27 to see the direction Apple takes.

silverwind

Hopefully less `border-radius`.

sdevonoes

Same. I'm waiting for the next macOS release. Tahoe is ugly as hell.

john_alan

same, but will it change?

jonplackett

Unfortunately it won't be long 'til we're all forced up to Tahoe anyway. Well, we iOS developers will be, anyway, once they make the latest Xcode only work with it…

brikym

Exactly. My org forces me to use Tahoe. The left hand slows you down while the right giveth performance and taketh money.

dirk94018

On an M4 Max 128GB we're seeing ~100 tok/s generation on a 30B parameter model in our from-scratch inference engine. Very curious what the "4x faster LLM prompt processing" translates to in practice. Smallish, local 30B-70B inference is genuinely usable territory for real dev workflows, not just demos. It will require staying plugged in, though.

fotcorn

The memory bandwidth on the M4 Max is 546 GB/s; on the M5 Max it's 614 GB/s, so not a huge jump.

The new tensor cores, sorry, "Neural Accelerator" only really help with prompt preprocessing aka prefill, and not with token generation. Token generation is memory bound.

Hopefully the Ultra version (if it exists) has a bigger jump in memory bandwidth and maximum RAM.
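
A back-of-envelope sketch of that bandwidth math (illustrative numbers only; the 4-bit quantization factor is an assumption, not a measurement):

  # Ceiling on decode speed for a memory-bound model: every generated
  # token streams the active weights through memory once. Illustrative
  # only; real throughput is lower (KV-cache reads, scheduling overhead).

  def decode_ceiling_tps(bandwidth_gb_s: float,
                         active_params_billions: float,
                         bytes_per_param: float = 0.5) -> float:
      """tok/s ceiling = memory bandwidth / bytes read per token."""
      bytes_per_token = active_params_billions * 1e9 * bytes_per_param
      return bandwidth_gb_s * 1e9 / bytes_per_token

  for name, bw in [("M4 Max", 546), ("M5 Max", 614)]:
      # assume a 30B dense model, 4-bit quantized (~0.5 bytes/param)
      print(f"{name}: ~{decode_ceiling_tps(bw, 30):.0f} tok/s ceiling")
  # 614/546 is only ~1.12x, which is why decode barely moves. (MoE models
  # stream just their active experts, so they can beat the dense ceiling.)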

anentropic

Do any frameworks manage to use the neural engine cores for that?

Most stuff ends up running via Metal on the GPU, I thought.

abhikul0

It's referring to the neural cores (for matrix multiplication) in the GPU itself, not the NPU.

https://creativestrategies.com/research/m5-apple-silicon-its...

irusensei

I noticed that even on my M3, MLX tends to do prefill a lot faster than llama.cpp with GGML models. Does anyone know how they do it?

storus

4x faster is about token prefill, i.e. the time to first token. It should be on par with DGX Spark there while being slightly faster than M4 for token generation. I.e. when you have long context, you don't need to wait 15 minutes, only 4 minutes.
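
Rough time-to-first-token arithmetic (the tok/s figures below are hypothetical placeholders; only the 4x ratio is Apple's claim):

  # Time to first token ~= prompt length / prefill throughput.
  prompt_tokens = 100_000                 # a long agentic-coding context
  prefill_tps_old = 110                   # hypothetical pre-M5 prefill speed
  prefill_tps_new = prefill_tps_old * 4   # claimed "4x prompt processing"

  print(prompt_tokens / prefill_tps_old / 60)  # ~15 minutes to first token
  print(prompt_tokens / prefill_tps_new / 60)  # ~3.8 minutes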

hu3

What about real workloads? Because as context gets larger, these local LLMs approach the useless end of the spectrum with regard to t/s.

Someone1234

I strongly agree. People see local "GPT-4 level" responses and get excited, which I totally get. But how quick is the fall-off as the context size grows? Because if it cannot hold and reference a single source-code file in its context, the efficiency will absolutely crater.

That's actually the biggest growth area in LLMs: it is no longer about smarts, it is about context windows (usable ones, not spec-sheet hypotheticals). Smart enough is mostly solved; tackling larger problems is slowly improving with every major release (but there is no ceiling).

zozbot234

The thing about context/KV cache is that you can swap it out efficiently, which you can't with the activations because they're rewritten for every token. It will slow down as context grows (decode is often compute-limited when context is large) but it will run.
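
For a sense of scale, KV-cache size grows linearly with context length; a sketch with hypothetical model dimensions (not any specific model's config):

  # KV-cache footprint: 2 entries (K and V) per layer, per KV head,
  # per token. All dimensions below are illustrative.

  def kv_cache_gb(context_len: int, n_layers: int, n_kv_heads: int,
                  head_dim: int, bytes_per_elem: int = 2) -> float:
      return (2 * n_layers * n_kv_heads * head_dim
              * bytes_per_elem * context_len) / 1e9

  # e.g. 48 layers, 8 KV heads (GQA), head_dim 128, fp16 cache:
  print(kv_cache_gb(8_192, 48, 8, 128))    # ~1.6 GB
  print(kv_cache_gb(131_072, 48, 8, 128))  # ~25.8 GB -- worth swapping out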

satvikpendem

That should be covered by the harness rather than the LLM itself, no? Compaction and summarization should allow the LLM to still run smoothly even on large contexts.

hu3

Sometimes it really needs a lot of data to work.

barumrho

100 tok/s sounds pretty good. What do you get with 70B? With 128GB, you need quantization to fit a 70B model, right?

Wondering if a local LLM (for coding) is a realistic option; otherwise I wouldn't need to max out the RAM.
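
Rule-of-thumb arithmetic on that (weights only, ignoring KV cache and runtime overhead):

  # Model footprint ~= params x bytes/param; ignores KV cache and overhead.
  params_billions = 70
  for name, bytes_per_param in [("fp16", 2.0), ("q8", 1.0), ("q4", 0.5)]:
      gb = params_billions * bytes_per_param
      verdict = "fits" if gb < 128 else "does not fit"
      print(f"70B @ {name}: ~{gb:.0f} GB -> {verdict} in 128 GB")
  # fp16 (~140 GB) won't fit, so yes: q8 (~70 GB) or q4 (~35 GB) it is.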

super_mario

I run the gpt-oss 120b model on ollama (the model is about 65 GB on disk) with a 128k context size (the model is super optimized and only uses 4.8 GB of additional RAM for KV cache at this context size) on an M4 Max 128 GB RAM Mac Studio, and I get 65 tokens/s.
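
A minimal sketch of that kind of large-context setup via ollama's Python client (assuming the ollama package; the num_ctx option sets the context window):

  # Sketch of a 128k-context session through the ollama Python client.
  # Assumes `pip install ollama` and that gpt-oss:120b is pulled locally.
  import ollama

  response = ollama.chat(
      model="gpt-oss:120b",
      messages=[{"role": "user", "content": "Summarize this codebase..."}],
      options={"num_ctx": 131_072},  # 128k-token context window
  )
  print(response["message"]["content"])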

abhikul0

Have you tried the dense (27B, 9B) Qwen3.5 models? Or any diffusion models (Flux Klein, Zimage)? I'm trying to gauge how much of a perf boost I'd get upgrading from an M3 Pro.

For reference:

  | model                          |       size |     params | backend    | threads |            test |                  t/s |
  | ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
  | qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           pp512 |        288.90 ± 0.67 |
  | qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           tg128 |         16.58 ± 0.05 |

  | model                          |       size |     params | backend    | threads |            test |                  t/s |
  | ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           pp512 |        615.94 ± 2.23 |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           tg128 |         42.85 ± 0.61 |

  Klein 4B completes a 1024px generation in 72 seconds.

eknkc

I find time to first token more important than tok/s generally, as these models wait an ungodly amount of time before streaming results. It looks like the claims are true based on M5: https://www.macstories.net/stories/ipad-pro-m5-neural-benchm... so this might work great.

fulafel

The marketing subterfuge might be exactly about this: technically, prompt processing means the prefill phase of inference. So the prompt goes in 4x as fast, but token generation is barely faster.

This seems all the more likely given that the memory bandwidth hasn't increased enough for those kinds of speedups, and prefill is more likely to be compute-bound (vs memory-bandwidth bound).

petercooper

> So the prompt goes in 4x as fast, but token generation is barely faster.

I'd take that tradeoff. On my M3 Ultra, the inference is surprisingly fast, but the prompt processing speed makes it painful for anything except fallback use or experimentation, especially with agentic coding tools.

bytesandbits

4x faster PREFILL, not decode. Decode is bandwidth-bound. Prefill is FLOPS-bound.

nbardy

How much of your RAM does that use, including KV cache? Is there enough left to run real dev workloads AND the LLM?

Also, can you batch effectively, like vLLM on CUDA?

Enough to run multiple agents at the same time with throughput?

aurareturn

Whoah, both the Pro and Max CPUs feature 18 cores. This hasn't happened since M1 Pro/Max. This is a surprise.

Also, the mix of cores has changed drastically.

- 6 "Super cores"

- 12 "Performance cores"

I'm guessing these are just renamed performance and efficiency cores from previous generations.

This is a massive change from the M4 Max:

- 12 performance cores

- 4 efficiency cores

This seems like a downgrade (in core config, though maybe not in actual MT performance), assuming super = performance and performance = efficiency cores.

klausa

I don't think the "new" Performance cores are just "renamed" "E" / "Efficiency" cores; Apple has retroactively renamed the baseline M5 nomenclature to say it has a "10-core CPU with 4 super cores and 6 efficiency cores", so they're clearly keeping the "efficiency cores" nomenclature around.

I think this is a new design, with Apple having three tiers of cores now, similar to what Qualcomm has been doing for a while.

I think how it breaks down is:

- "Super" are the old "P" cores, and the top tier cores now

- "Performance" cores are a new tier and seen for the first time here, slotting between "old" P and E in performance

- "Efficiency" / "E" are still going to be around; but maybe not in desktop/Pro/Max anymore.

aurareturn

Interesting. This is clearly a big CPU change if so. I wonder why there are no E cores. I'm sure E cores would be more efficient at OS tasks than the new performance cores.

For example, 6 super, 8 performance, and 4 efficiency.

NetMageSCW

Another commenter stated the P cores can be scaled down to be E cores dynamically, so why not?

netruk44

I think super cores are a new type/tier of core, not a rename of performance.

The base M5 has super/efficiency cores.

The Pro and Max have super/performance cores.

aurareturn

  Whoah, both the Pro and Max CPUs feature 18 cores. This hasn't happened since M1 Pro/Max. This is a surprise.
Replying to my own post. In hindsight, this shouldn't be any surprise because these chips are now chiplets. Apple is connecting a CPU die with a GPU die. This means they're designing just one CPU die rather than two. An Ultra would just be two of these CPU dies.

jacobp100

I was looking into this. The M5 performance cores can be scaled down to match efficiency cores in performance and power usage.

I believe they lower the clock speed, limit how much work is done in parallel on each core, and limit how aggressive the speculative execution is so less work is wasted.

aurareturn

  The M5 performance cores can be scaled down to match efficiency cores in performance and power usage.
Source for this?

pier25

Not sure if this was available in previous gens but my M4 Pro can run in low power mode. It's amazing. I can work for hours and only use 10-15% of the battery.

cced

So they renamed performance to mean efficiency and are now using super in place of performance?

petu

"Super" is the old "performance" core:

> The industry-leading super core was first introduced as performance cores in M5, which also adopts the super core name for all M5-based products

But the new "performance" core is claimed to be a new design (= not just an overclocked efficiency core from the base M5?):

> M5 Pro and M5 Max also introduce an all-new performance core that is optimized to deliver greater power-efficient, multithreaded performance for pro workloads.

quotes from https://www.apple.com/newsroom/2026/03/apple-debuts-m5-pro-a...

Havoc

Intel is totally gonna steal that. They're catching so much flak for their "efficiency cores" that I'm surprised they haven't done a rebrand yet.

gkanai

Apple, if you are reading this: I'm not investing in new hardware until you fix the mess that is Tahoe. My M1 Max is doing fine atm.

zer0zzz

Don't forget, Asahi runs real nice on the M1 and M2 series! You can't run Sequoia or Asahi Linux on an M5.

semiinfinitely

does it run on M3?

zer0zzz

Not yet. They got Wayland to boot in software rendering mode.

wslh

Full support now?

zer0zzz

Depends on your definition. Most things apart from Touch ID and USB-C display output work really well on M1-M2.

eviks

They are listening: MAXing out is a loud signal that they're doing great. How would you like your liquid glass served?

tristor

I am very excited by this, but my enthusiasm is a bit dampened by the maximum memory being 128GB. I was really hoping for 256GB, which would allow me to run frontier models locally. I think with 128GB it's still feasible to use this with something like Qwen3-Coder-Next and MiniMax-M2.5, but things like Kimi-K2.5 will require significant quantization to fit, and model performance will really suffer.

I really want to build proper local-first AI workflows at home, and I think Apple has an opportunity to make that possible in a way other companies aren't really focused on. But we need significantly larger memory capacities to do it, which I know is tough in the current memory market but should be available for a cost.

vardump

Tell me about it. I checked the page wondering whether I should go for the 256 GB or the 512 GB RAM model.

128 GB maximum.

Sigh.

bombcar

I suspect they're going to move to an "Ultra every third gen" cadence, so we will see an M6 Ultra.

tristor

I spent the last day deep-diving on what I can do with MLX and local models. I still feel limited, because you have to use quantized models, but I think it's enough to do /something/, so I bit the bullet and pre-ordered just now. I am driven a little bit by concern about ongoing memory-market pressures over the next 1-3 years, and the thinking that it's a bit now-or-never.
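
For reference, the basic mlx-lm flow looks roughly like this (a sketch; assumes pip install mlx-lm on Apple Silicon, and the model repo name is illustrative):

  # Minimal MLX generation sketch; the model is an illustrative 4-bit
  # community conversion, not a recommendation.
  from mlx_lm import load, generate

  model, tokenizer = load("mlx-community/Qwen2.5-32B-Instruct-4bit")

  messages = [{"role": "user", "content": "Write a haiku about unified memory."}]
  prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

  print(generate(model, tokenizer, prompt=prompt, max_tokens=128))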

vardump

Sigh. Maybe you are right.

heurs

Honest question: is it possible to install an earlier version of macOS on these machines? Liquid glass looks so... unprofessional to my eyes. And I hear it's also unstable.

dmix

You barely see any liquid glass on Tahoe. I keep my dock hidden, and mostly it's just the icons, which aren't that different from before.

myHNAccount123

Same here. Not really understanding the complaints about macOS. I think the addition of icons in context menus is worse than the glass.

hakube

The border radii are terrible.

philistine

I've had a base M5 since last year. You cannot, no. It is literally impossible. Do with that what you will.

adamtaylor_13

That's a big part of what's keeping me from upgrading. Every time I look at my wife's iPhone I'm dumbfounded by just how bad the liquid glass looks.

It's the first time I've ever been so repulsed by a design that I actively avoid it just... out of sheer preference.

sysworld

I wasn't a fan either. But you get used to it.

Hasz

Accessibility settings can turn off some (but not all) of the garish animations, transparencies, etc.

icambron

It does look terrible, but I haven't found it to be unstable, personally

zffr

Yes. This page has several ways to get older macOS versions: https://support.apple.com/en-us/102662, but the earliest macOS version you can use on Apple Silicon is macOS 11.

If you move your home directory to a different disk partition, you can even share it between two different macOS versions!

asimovDev

These Macs can't go below Tahoe. People on MacRumors were complaining that M5 MacBooks are unable to install Sequoia, so it's safe to assume the Pro/Max chips will be the same.

angulardragon03

This. You can’t downgrade below the version the device ships with (a forked build of the current version at time of mass production)

testfrequency

I have a fairly maxed out M2 Ultra (24 cores, 192GB RAM), and still cannot get this machine to choke on anything.

I have not once felt the need to upgrade in years, and that’s with doing pretty demanding 3D and LLM work.

prodigycorp

If there's anything these past three years have taught me, it's that modern CPUs can performantly do every task except for streaming text over the internet.

pmdr

I had to upgrade the CPU in a 10-year-old machine (from i5 to i7) to have decently working JavaScript on websites. Every other piece of software worked fine, though.

hobofan

I'm pretty sure that's just LLMs' tendency to replicate bad React patterns.

_jab

I've found current-generation Macs so capable that I've switched to using a MacBook Air. Would strongly recommend - it's still a powerful machine and it's significantly lighter and cheaper.

asimovDev

I have an M3 Max, and I'm considering that when I upgrade in 5 years or so, I'll go with a base Mac Studio and a base MBA. If I need to compile something or run a local LLM, I'd just run that on the Studio and SSH in from the Air. I wouldn't be running these heavy workflows while on the go anyways.

testing22321

Yep. I got a used M1 Air with 16GB and 2TB. It's still the fastest computer I've ever used.

xbryanx

Would love to do that if it could support two additional displays with the lid open.

Aurornis

I have a powerful older Mac that doesn’t really “choke” on anything, but I could always use more speed.

The high memory Macs have been great for being able to run LLMs, but the prompt processing has always been on the slow side. The new AI acceleration in these should help with that.

There are also workloads like compiling code where I’ll take all the extra speed I can get. Every little bit of reduced cycle time helps me finish earlier in the day.

And then there's gaming. I don't game much, but M1- and M2-era Apple Silicon feels sluggish relative to what I have on the NVIDIA side.

aurareturn

   and that’s with doing pretty demanding 3D and LLM work.
It definitely chokes with larger models that can fit the 192GB of RAM. Prompt processing is a big bottleneck before M5.

magicalist

> It definitely chokes with larger models that can fit the 192GB of RAM

M5 Max maxes out at 128GB, so that will have to wait for the eventual M5 Ultra anyways.

Sharlin

AI video generation can fairly easily choke anything that's not NVIDIA's flagship card. Even the latest local image-gen models are so large that they can be frustratingly slow on non-optimal hardware, even when they fit in VRAM. IIRC, when I had an M2, it was about 4x slower at running the venerable Stable Diffusion (and SDXL) than my meager RTX 3060.

testfrequency

I don't do anything with AI video, but I imagine running it locally would be a hog on a Mac, especially if not optimised for Metal.

steve_adams_86

Mine is an M2 Max with only 32GB of RAM. While I'm sure you're doing things that would choke it, and there are a few things I'd like to be able to do but can't, it's insane how rarely I ever notice load on it. It feels like it'll be sufficient for a long time.

replwoacause

Sounds pretty beefy. What kind of local LLM is that thing capable of running? Does it open up real alternatives to cloud providers like OpenAI and Claude, or are the local models this hardware is capable of running still pretty far behind?

mikert89

Yeah I have an M1 Max, and I really want to upgrade, but there’s no reason to.
