The Codex App

Daily Digest email

Get the top HN stories in your inbox every day.

OlympicMarmoto

It is baffling how these AI companies, with billions of dollars, cannot build native applications, even with the help of AI. From a UI perspective, these are mostly just chat apps, which are not particularly difficult to code from scratch. Before the usual excuses come about how it is impossible to build a custom UI, consider software that is orders of magnitude more complex, such as raddbg, 10x, Superluminal, Blender, Godot, Unity, and UE5, or any video game with a UI. On top of that, programs like Claude Cowork or Codex should, by design, integrate as deeply with the OS as possible. This requires calling native APIs (e.g., Win32), which is not feasible from Electron.

gloosx

>This requires calling native APIs (e.g., Win32), which is not feasible from Electron.

Who told you that? You can write entire C libraries and call them from Electron just fine. Browser is a native application after all. All this "native applications" debate boils down to the UI implementation strategy. Maintaining three separate UI stacks (WinUI, SwiftUI, GTK/Qt) is dramatically more expensive and slower to iterate on than a single web-based UI with shared logic

We already have three major OSes, all doing things differently. The browsers, on the other hand, use the same language, same rendering model, same layout system, and same accessibility layer everywhere, which is a massive abstraction win.

You don't casually give up massive abstraction wins just to say "it's native". If "just build it natively" were actually easier, faster, or cheaper at scale, everyone would do just that.

gbalduzzi

It baffles me how much the discourse over native apps rarely takes this into consideration.

You reduce development effort by a third, it is ok to debate whether a company so big should invest into a better product anyway but it is pretty clear why they are doing this

SPICLK2

That might be true (although you do add in the mess of web frameworks), but I strongly believe that resource usage must factor into these calculations too. It's a net negative to end users if you can develop an app a bit quicker but require the end users to have multiple more times RAM, CPU, etc.

pydry

>You reduce development effort by a third

Done by the company which sells software which is supposed to reduce it tenfold?

vlovich123

There are cross platform GUI toolkits out there so while I am in team web for lots of reasons, generally it’s because web apps are faster and cheaper to iterate.

falcor84

> You reduce development effort by a third

Sorry to nitpick, but this should be "by three" or "by two thirds", right?

eloisant

The real question is how much better are native apps compared to Electron apps.

Yes that would take much disk space, but it takes 50Mb or 500Mb isn't noticeable for most users. Same goes for memory, there is a gain for sure but unless you open your system monitor you wouldn't know.

So even if it's something the company could afford, is it even worth it?

Also it's not just about cost but opportunity cost. If a feature takes longer to implement natively compared to Electron, that can cause costly delays.

vintagedave

> If "just build it natively" were actually easier, faster, or cheaper at scale, everyone would do just that

Value prop of product quality aside, isn't the AI claim that it helps you be more productive? I would expect that OpenAI would run multiple frontends and that they'd use Codex to do it.

Ie are they using their own AI (I would assume it's semi-vibe-coded) to just get out a new product or using AI to create a new product using the productivity gains to let them produce higher quality?

vintagedave

On a side note, the company I work for (RemObjects, not speaking on their behalf) has a value ethos specifically about using the native UI layers, and encouraging our customers to do the same. (We make dev tools, a compiler supporting six languages (C#, Java, Go, etc) plus IDEs.)

Our IDE does this: common code / logic, then a native macOS layer and a WPF layer. Yes, it takes a little more work (less than you'd think!) but we think it is the right way to do it.

And what I hope is that AI will let people do the same -- lower the cost and effort to do things like this. If Electron was used because it was a cheap way to get cross-platform apps out, AI should now be the same layer, the same intermediate 'get stuff done' layer, but done better. And I don't think this prevents doing things faster because AI can work in parallel. Instead of one agent to update the frontend, you have two to update both frontends, you know?

We're building an AI agent, btw. Initially targeting Delphi, which is a third party's product we try to support and provide modern solutions for. We'll be adding support for our own toolchains too.

What I fear is that people will apply AI at the wrong level. That they'll produce the same things, but faster: not the same things, but better (and faster.)

Closi

It's about consistency - you want to build an app that looks and functions the same on all platforms as much as possible. Regardless of if you are hand-coding or vibe-coding 3 entirely separate software stacks, getting everything consistent is going to be a challenge and subtle inconsistencies will sneak in.

It comes back to fundamental programming guidelines like DRY (Don't Repeat Yourself) - if you have three separate implementations in different languages for everything, changes will be come harder and you will move slower. These golden guidelines still stand in a vibe-code world.

AdamN

The gap here is that the company has the money and native apps are so clearly better. With an interactive app a company like OpenAI could really tweak the experience for Android and iOS which have different UX philosophies and featuresets in order to give the best experience possible. It's really a no brainer imho.

pimterry

> the company has the money

It's not about money. It's not a tradeoff in cost vs quality - it's a tradeoff in development speed. Shipping N separate native versions requires more development time for any given change: you must implement everything (at least every UI) N times, which drastically increases the design & planning & coordination required vs just building and shipping one implementation.

Do you want to move slower to get "native feel", or do you want to ship fast and get N times as much feature dev done? In a competitive race while the new features are flowing, development speed always wins.

Once feature development settles down, polish starts to matter more and the slowdown becomes less important, and then you can refocus.

GorbachevyChase

Wouldn’t maintaining the different UI stacks be something a language model could handle? Creating a new front end where the core logic is already defined or making a new one from an existing example has gone pretty fast for me. The “maintenance“ cost might not be as high as you think.

outime

>If "just build it natively" were actually easier, faster, or cheaper at scale, everyone would do just that.

Exactly. Years go by and HN keeps crying about this despite it being extremely easy to understand for anyone. For such a smart community, it's baffling how some debates are so dumb.

The only metric really worth reviewing is resource usage (and perhaps appearance). These factors aren't relevant to the general population as otherwise, most people wouldn't use these apps (which clearly isn't the case).

theknarf

React Native is able to build abstractions on top of both Android and iOS that uses native UI. Microsoft even have a package for doing a "React Native" for Windows: https://github.com/microsoft/react-native-windows

It's weird that we don't have a unified "React Native Desktop" that would build upon the react-native-windows package and add similar backends for MacOS and Linux. That way we could be building native apps while keeping the stuff developers like from React.

reverius42

There are such implementations for React Native: https://reactnative.dev/docs/out-of-tree-platforms

realusername

React Native desktop on Linux isn't a thing, the GTK backend is abandonned.

So if you want a multiplatform desktop app also supporting Linux, React Native isn't going to cut it.

knoopx

the three OSes is BS, none of them cares about linux

nineteen999

This is such a toy webdev take. It's like you guys forget that the web-browser wouldn't work at all if not for the server half, all compiled to native code.

The browser is compiled to native code. It wasn't that long ago that we had three seperate web browsers who couldn't agree on the same set of standards either.

Try porting your browser to Java or C# and see how much faster it is then. The OS the browser and the server run on are compiled to native code. Sun gave up on HotJava web browser in the 1990's, because it couldn't do %10 or %20 of what Netscape or IE could do, and was 10 x slower.

Not everybody is running a website selling internet widgets. Some of us actually have more on the line if our systems fail or are not performant than "oooh our shareholders are gonna be pissed".

People running critical emergency response systems day in, day out.

The very system you typed this bullshit on is running native code. But oh no, thats "too hard" for the webdev crowd. Everyone should bend to accomodate them. The OS should be ported to run inside the browser, because the browser is "so good".

Good one. It's hilarious to see this Silicon Valley/Bay Area, chia-seed eating bullshit in cheap spandex riding their bikes while the trucks shipping shit from coast to coast passing them by.

namelosw

The situation for Desktop development is nasty. Microsoft had so many halfassed frameworks and nobody knows which one to use. It’s probably the de facto platform on Windows IS Electron, and Microsoft use them often, too.

On MacOS is much better. But most of the team either ended up with locked in Mac-only or go cross platform with Electron.

tombert

I guess it shows how geriatric I am with desktop app development these days, but does no one use Qt anymore? Wasn't the dream for that to be a portable and native platform to write GUI apps? Presumably that could abstract away which bullshit Microsoft framework they came out with this week.

I haven't touched desktop application programming in a very long time and I have no desire to ever do so again after trying to learn raw GTK a million years ago, so I'm admittedly kind of speaking out of my ass here.

ogoffart

Qt is still used, but I think part of the reason it is less used is that C++ isn't always the right language anymore for building GUI application.

That’s actually why we're working on Slint (https://slint.dev): It's a cross-platform native UI toolkit where the UI layer is decoupled from the application language, so you can use Rust, JavaScript, Python, etc. for the logic depending on what fits the project better.

nine_k

Qt means C++. I'll take Typescript over C++ for a GUI task any day.

Qt is also pretty memory-hungry; maybe rich declarative (QML) skinnable adaptable UIs with full a11y support, etc just require some RAM no matter what. And it also looks a wee bit "non-native" to purists, except on Windows, where the art of uniform native look is lost.

Also, if you ever plan extensions / plugin support, you already basically have it built-in.

Yes, a Qt-based program may be wonderfully responsive. But an Electron-based app can be wonderfully responsive, too. And both can feel sluggish, even on great hardware. It all depends on a right architecture, on not doing any (not even "guaranteed fast") I/O in the GUI thread, mostly. This takes a bit of skill and, most importantly, consideration; both are in short supply, as usual.

The biggest problem with Electron apps is their size. Tauri, which relies on the system-provided web view component, is the reasonable way.

LtdJorge

And GTK4 is even very usable from Rust too. It’s not a bad development experience, but these companies probably find 100 webdevs for every system programmer.

rubymamis

I built my Block Editor (Notion-style) in Qt C++ and QML[1].

[1] https://get-notes.com

Anonyneko

One reason why I personally never bothered is the licensing of some of its important parts, which is a choice of either GPL or commercial. Which is fair, but too bothersome for some use-cases (e.g. mobile apps which are inherently GPL-unfriendly). Electron and the likes are typically MIT/BSD/etc licensed.

rjh29

Qt is still pretty good, but it's dated in comparison to newer frameworks like Flutter and React Native. No hot reloading of changes, manual widget management vs. React where you just re-define the whole UI every frame and it handles changes magically, no single source of truth for state, etc.

OlympicMarmoto

This is another common excuse.

You don't need to use microsoft's or apple's or google's shit UI frameworks. E.g. see https://filepilot.tech/

You can just write all the rendering yourself using metal/gl/dx. if you didn't want to write the rendering yourself there are plenty of libraries like skia, flutter's renderer, nanovg, etc

jarek-foksa

Customers simply don't care. I don't recall a single complain about RAM or disk usage of my Electron-based app to be reported in the past 10 years.

You will be outcompeted if you waste your time reinventing the wheel and optimizing for stuff that doesn't matter. There is some market for highly optimized apps like e.g. Sublime Text, but you can clearly see that the companies behind them are struggling.

incr_me

How is File Pilot for accessibility and for all of the little niceties like native scrolling, clipboard interaction, drag and drop, and so on? My impression is that the creator is has expertly focused on most/all of these details, but I don't have Windows to test.

I insist on good UI as well, and, as a web developer, have spent many hours hand rolling web components that use <canvas>. The most complicated one is a spreadsheet/data grid component that can handle millions of rows, basically a reproduction of Google Sheets tailored to my app's needs. I insist on not bloating the front-end package with a whole graph of dependencies. I enjoy my NIH syndrome. So I know quality when I see it (File Pilot). But I also know how tedious reinventing the wheel is, and there are certain corners that I regularly cut. For example there's no way a blind user could use my spreadsheet-based web app (https://github.com/glideapps/glide-data-grid is better than me in this aspect, but there's no way I'm bringing in a million dependencies just to use someone else's attempt to reinvent the wheel and get stuck with all of their compromises).

The answer to your original question about why these billion dollar companies don't create artisanal software is pretty straightforward and bleak, I imagine. But there are a few actually good reasons not to take the artisanal path.

josephg

I'd love to see some opensource projects actually do a good job of this. Its a lot of work, especially if you want:

- Good cross platform support (missing in filepilot)

- Want applications to feel native everywhere. For example, all the obscure keyboard shortcuts for moving around a text input box on mac and windows should work. iOS and Android should use their native keyboards. IME needs to work. Etc

- Accessibility support for people who are blind and low vision. (Screen readers, font scaling, etc)

- Ergonomic language bindings

Hitting these features is more or less a requirement if you want to unseat electron.

I think this would be a wonderful project for a person or a small, dedicated team to take on. Its easier than it ever used to be thanks to improvements in font rendering, cross platform graphics libraries (like webgpu, vulcan, etc) and improvements in layout engines (Clay). And how much users have dropped their standards for UI consistency ever since electron got popular and microsoft gave up having a consistent UI toolkit in windows.

There are a few examples of teams doing this in house (eg Zed). But we need a good opensource project.

embedding-shape

> You don't need to use microsoft's or apple's or google's shit UI frameworks. E.g. see https://filepilot.tech/

That's only for Windows though, it seems? Maybe the whole "just write all the rendering yourself using metal/gl/dx" is slightly harder than you think.

anaisbetts

That'll work great until your first customer from a CJK or RTL language writes in, "Hey, how come I can't type in your app?", or the blind user writes in "Hey how come your app is completely blank?" then you'll be right in the middle of the "Find Out" phase

These strategies are fine for toy apps but you cannot ship a production app to millions or even thousands of people without these basics.

Macha

“Render yourself with GPU APIs” has all the same problems with a11y, compatibility, inconsistent behaviour that electron has - the only one it might fix is performance and plenty of apps have messed that one up too

browningstreet

They’re all iterating products really fast. This Codex is already different than the last Codex app. This is all disposable software until the landscape settles.

namelosw

It's essentially asking application developers to wipe ass for OS developers like Microsoft. It's applaudible when you do it, understandable when you don't.

Even though OpenAI has a lot of cash to burn, they're not in a good position now and getting butchered by Anthropic and possibly Gemini later.

If any major player in this AI field has the power to do it's probably Google. But again, they've done the Flutter part, and the result is somewhat mixed.

At the end of the day, it's only HN people and a fraction of Redditors who care. Electron is tolerated by the silent majority. Nice native or local-first alternatives are often separate, niche value propositions when developers can squeeze themselves in over-saturated markets. There's a long way before the AI stuff loses novelty and becomes saturated.

zem

"native" is used for different things, from "use the platform's default gui toolkit" to "compile to a machine code binary". the former is a bit of a mess, but the latter is strictly better than wrapping a web view and shipping an entire chrome fork to display and interpret it. just write something in qt and forget about native look and feel, and the performance gain will be enough to greatly improve the user experience.

belfthrow

Should just use javafx or swing. Take a leaf out of intellij which while it as it's own performance problems (although not from the fact of the ui framework) has a fantastic ui across Mac / windows / nix

wizzledonker

Qt with QML works fine. The real reason is that companies can't hire enough native developers because the skill is comparitively rare.

harikb

As I outlined in a sibling comment. You can still use React and your JS developers. Just don't ship a whole browser with your app.

May be an app that is as complex as Outlook needs the pixel-perfect tweaking of every little button that they need to ship their own browser for exact version match. But everything else can use *system native browser*. Use Tauri or Wails or many other solutions like these

That said, I do agree on the other comments about TUIs etc. Yes, nobody cares about the right abstractions, not even the companies that literally depend on automating these applications

frumplestlatz

Given how much money they have, and the reach they're attempting to achieve, is it really asking too much that they hire native development teams? It's not like an application of this scale requires an army of engineers.

weaksauce

microsoft also uses react native for the start menu and also bricked that during a recent upgrade apparently... along with breaking other stuff.

r0fl

These companies have BILLIONS of dollars and some of the smartest people in the world and access to bleeding edge AI

There should be no excuses! Figure it out!

itemize123

it'll be the least important thing to do

deterministic

Win32 is the platform to use on Microsoft Windows. Everything else is built on top of it. So it will (a) work (b) be there forever.

herval

It’s just irrelevant for most users. These companies are getting more adoption than they can handle, no matter how clunky their desktop apps are. They’re optimizing for experimentation. Not performance.

preston-kwei

While this may be true for casual users, for dev native products like Codex, the desktop experience actually matters a lot. When you are living in the tool for hours, latency, keyboard handling, file system access, and OS-level integration stop being “nice to have” and start affecting real productivity. web or Electron apps are fine for experimentation, but they hit a ceiling fast for serious workflows -- especially if the icp is mostly technical users

Xenoamorphous

VSCode is arguably one of the most if not the most popular code editor these days…

tokioyoyo

Still good enough for the majority of the users.

supriyo-biswas

[flagged]

csomar

It's not irrelevant for developers neither for users. Tiktok has shown that users deeply care about the experience and they'll flock en-masse to something that has a good experience.

ramraj07

The experience in the claude app is fine.

IhateAI

More adoption? I don't think so... It feels to me that these models && tools are getting more verbose/consuming more tokens to compensate for a decrease in usage. I know my usage of these tools has fallen off a cliff as it become glaringly obvious they're useful in very limited scopes.

I think most people start off overusing these tools, then they find the few small things that genuinely improve their workflows which tend to be isolated and small tasks.

Moltbot et al, to me, seems like a psyop by these companies to get token consumption back to levels that justify the investments they need. The clock is ticking, they need more money.

I'd put my money on token prices doubling to tripling over the next 12-24 months.

deaux

> I'd put my money on token prices doubling to tripling over the next 12-24 months.

Chinese open weights models make this completely infeasible.

pilgrim0

I suspect making the models more verbose is also a source of inflation. You’d expect an advanced model to nail down the problem succinctly, rather than spawning a swarm of agents that brute force something resembling an answer. Biggest scam ever.

rafram

- Video games often use HTML/JS-based UI these days.

- UE5 has its own custom UI framework, which definitely does not feel "native" on any platform. Not really any better than Electron.

- You can easily call native APIs from Electron.

I agree that Electron apps that feel "web-y" or hog resources unnecessarily are distasteful, but most people don't know or care whether the apps they're running use native UI frameworks, and being able to reassign web developers to work on desktop apps is a significant selling point that will keep companies coming back to Electron instead of native.

harikb

I have been building Desktop apps with Go + Wails[1]. I happen to know Go, but if you are ai-coding even that is not necessary.

A full fledged app, that does everything I want, is ~ 10MB. I know Tauri+Rust can get it to probably 1 MB. But it is a far cry from these Electron based apps shipping 140MB+ . My app at 10MB does a lot more, has tons of screens.

Yes, it can be vibe coded and it is especially not an excuse these days.

[1] https://wails.io/

Microsoft Teams, Outlook, Slack, Spotify? Cursor? VsCode? I have like 10 copies of Chrome in my machine!

rafram

I've looked into Tauri and Wails, but they don't seem realistic for a cross-platform app with wide distribution across multiple platforms and platform versions.

One of Electron's main selling points is that you control the browser version. Anything that relies on the system web view (like Tauri and Wails) will either force you to aggressively drop support for out-of-date OS versions, or constantly check caniuse.com and ship polyfills like you're writing a normal web app. It also forces you to test CSS that touches form controls or window chrome on every supported major version of every browser, which is just a huge pain. And you'll inevitably run into bugs with the native -> web glue that you wouldn't hit with Electron.

It is absolutely wasteful to ship a copy of Chrome with every desktop app, but Tauri/Wails don't seem like viable alternatives at the moment. As far as I can tell, there aren't really any popular commercial apps using them, so I imagine others have come to the same conclusion.

hodanli

I am in love with wails. having python and JS background with no go experience. I pm'ed Ai agents to create a fairly complex desktop app for my own use. it is day and night in terms of performance compared to lightest electron app.

keyle

Wails is pretty good. I wrote a couple of apps but since I'm on macOS I ended up rewriting them in SwiftUI and that's much lighter of course since it uses all native APIs.

alabhyajindal

Wow Wails looks interesting! Hadn't heard of it before.

ukuina

Interesting. Does the Wails renderer support the full set of what Webkit/Chromium supports?

anonymous908213

Given that OpenAI managed to single-handedly triple the price of RAM globally, people will very much care about their chat application consuming what little scraps are left over for them, even if they don't know enough about anything to know why their system is running poorly.

undefined

[deleted]

pama

My main take is exactly the opposite. Why not build everything with a simple text interface (shell command) so the models learn to use these tools natively in pretraining. Even TUI like codex-cli or claude code are needless abstractions for such use cases and make full automation hard. You could add as many observability or input layers for humans as you want but the core should be simple calls that are saved in historical and documentation logs. [the headless/noninteractive modes come close, as do the session logs]

markbao

TUI is easy to train on, but hard to use for users. Part of the reason it’s easier to have LLMs use a bunch of Unix tools for us is that their text interface is tedious and hard to remember. If you’re a top 5% expert in those tools it doesn’t matter as much I guess but most people aren’t.

Even a full-featured TUI like Claude Code is highly limited compared to a visual UI. Conversation branching, selectively applying edits, flipping between files, all are things visual UI does fine that are extremely tedious in TUI.

Overall it comes down to the fact that people have to use TUI and that’s more important than it being easy to train, and there’s a reason we use websites and not terminals for rich applications these days.

pama

I use headless mode (-p) and keep my named shell histories as journals (so no curses/TUI or GUI). But session management or branching could improve at the tool level and allow seamless integration with completion tools, which could be a simple fast AI looking at recent sessions or even be visual, say for navigating and extending a particular complex branch of a past session. It is not too hard to work with such shell-based flows within Emacs for me, but it would be nice if there was some standardization effort on the tool fronts and some additional care for future automation. I dont want my AI clicking buttons if it can be precise instead. And I certainly want multithreading. I think of AI more as an OS; it needs a shell more than it needs windows at this point in time.

rdslw

all the examples for visual UI, are tasks which already are (or soon be) done by the agent, not human. hence not needed.

I suspect that final(*) UI is much more similar to TUI: being kind of conversational (human <> AI). Current GUIs provided by your bank/etc are much less effective/useful for us, comparing to conversation way: 'show/do me sth which I just need'. Not to mention (lack of) walled garden effect, and attention grabbing not in the user interest (popups, self-promo, nagging). Also if taking into account age factor. Also that we do not have to learn, yet another GUI (teach a new bank to your mom ;). So at least 4 distinct and important advantages for TUI.

My bet: TUI/conversation win (*).

*) there will be some UI where graphical information density is important (air controller?) especially in time critical environments. yet even there I suspect it's more like conversation with dynamic image/report/graph generated on the go. Not the UI per se.

ryandrake

It would be cool if I didn't have to worry about whether I was "in" or "out" of the AI TUI. Right now, I need at least two terminals open: One running my shell, that I use to issue shell commands, and one running Claude, where I can't. It would be nice if it could just be my shell, and when I wanted to invoke claude, I'd just type:

   c Do this programming task for me.

Right in the shell.

niobe

isn't that what Simon Willison's `llm` does?

edit: [link](https://github.com/simonw/llm)

stevejb

Most AI agents have a 'bash mode' and, you can use Warp terminal which is terminal first, but easy to activate the AI from the terminal. For example, if you mangle a jq command, it will use AI to suggest the right way to do it.

scottmf

-p

vardalab

I agree. I like using Claude or Codex in VM on top of the tmux. Much more flexibility that way. I open a new tmux window for each issue/task big enough to warrant it, issue a prompt to create a worktree and agents and let them go to town. I actually use claude and codex a the same time. I still get observability because of tmux and I can close my laptop and let them cook for a while in yolo mode since the VM is frequently backed up in proxmox pbs. I am a retired hobbyist but this has been a nice force multiplier without devolving a complete viby mess. I hope these new orchestration tool support this like vs code remote development does. Same for cloud. I want them to support my personal "cloud" instead of laggy github mess.

bopbopbop7

> even with the help of AI.

This is what you get when you build with AI, an electron app with an input field.

hdjrudni

Doesn't have to be. I just revived one of my C++ GLFW app from 2009. Codex was able to help me get it running again and added some nice new features.

I guess you get an Electron app if you don't prompt it otherwise. Probably because it's learned from what all the humans are putting out there these days.

That said.. unless you know better, it's going to keep happening. Even moreso when folks aren't learning the fundamentals anymore.

airstrike

I've done way, way more than that, as I'm sure others have too.

This is just bad product management.

bopbopbop7

So where is all this amazing software that you and others built with AI?

All I see is hype blog posts and pre-IPO marketing by AI companies, not much being shipped though.

measurablefunc

Their goal is to ship as fast as possible b/c they don't care about what you care about. Their objective is to gather as much data as possible & electron is good enough for that.

OlympicMarmoto

Claude code is perfectly capable of writing low level rendering and input code and it can be equally as mindless as vibe coding web apps.

E.g. just say "write a c++ gui widget library using dx11 and win32 and copy flutters layout philosophy, use harfbuzz for shaping, etc etc"

babypuncher

At the end of the day LLMs just reproduce the consensus of the internet, so it makes sense that a coding agent would spit out software that looks like most of what's on the internet.

LLM output is called slop for a reason.

AstroBen

What features are they missing that a native app would allow for?

No-one outside of a small sliver of the tech community cares if an app is built with web tech

Electron also opens up easier porting to Linux which almost certainly wouldn't happen if companies insist on native only

FridgeSeal

Users care about performance and jank, it’s just that they’ve been successfully forced to shut-up-and-deal-with-it. They’re not involved in purchasing or feedback, and the people that are don’t use it enough to care, or just don’t care. Users who complain about it may as well shout into the void for how much companies take note, but hey, at least we got an ai button now!

Atlassian products are a great example of this. Everyone knows Atlassian has garbage performance. Everyone complains about it. Never gets fixed though. Everyone I know could write customer complaints about its performance in every feedback box for a year, and the only thing that would happen is that we’d have wasted our own time.

Users _care_ about this stuff. They just aren’t empowered to feedback about it, or are trained to just sigh and put up with it.

AstroBen

I find outside of specific use cases the performance and jank are down to the developers and not whether it's native or not

Obsidian is an Electron app which is pretty much universally loved. We can both give single examples

itemize123

i think you've to be more nuanced here - perf becomes important only on the extreme. i think there are compromises to be made between perf and go-to-market.

tokioyoyo

“They just aren’t empowered to feedback about it, or are trained to just sigh and put up with it” is a roundabout way of saying users don’t care about it enough.

827a

It's baffling to me that people still throw around the word "native" like it means anything. Go use VSCode or Obsidian, then go use Apple Music. Electron can be so much better than anything native. The problem isn't that macos ChatGPT, Codex, or Claude isn't native. Their apps just really suck. They're poorly engineered and bad.

barumrho

Apple Music isn't native either.

827a

Oh right, I forgot, "native" just means "good". So if an app is bad, it can't be native, and if an electron app is actually good its because they're doing crazy optimizations that aren't feasible for mortal souls so don't even think about it. This is the "Hackers News Law of Application Nativeness".

jtrn

People's mileage may vary, but in my instance, this was so bad that I actually got angry while trying to use it.

It's slow and stupid. It does not do proper research. It does not follow instructions. It randomly decides to stop being agentic, and instead just dumps the code for me to paste. It has the extremely annoying habit of just doing stuff without understanding what I meant, making a mess, then claiming everything is fine. The outdated training data is extremely annoying when working with Nuxt 4+. It is not creative at solving problems. It dosent show the thinking. The Undo code does not give proper feedback on the diff and if it actually did "undo." And I hate the personality. It HAS to be better than it comes off for me because I am actually in a bad mood after having worked with it. I would rather YOLO code with Gemini 3 flash, since it's actually smarter in my assessment, and at least I can iterate faster, and it feels like it has better common sense.

Just as an example, I found an old, terrible app I made years ago for our firm that handles room reservations. I told it to update from Bootstrap to Flowbite UI. Codex just took forever to make a mess, installed version 2.7 when 4.0.1 is the latest, even when I explicitly stated that it should use the absolute latest version. Then it tried to install it and failed, so it reverted to the outdated CDN.

I gave the same task to Claude Code. Same prompt. It one-shotted it quickly. Then I asked it to swap out ALL the fetch logic to have SPA-like functionality with the new beta 4 version of HTMX, and it one-shot that too in the time Codex spent just trying to read a few files in the project.

This reminds me of the feeling I had when I got the Nokia N800. It was so promising on paper, but the product was so bad and terrible to use that I knew Nokia was done for. If this was their take on what an acceptable smartphone could be, it proves that the whole foundation is doomed. If this is OpenAI's take on what an agentic coding assistant should be—something that can run by itself and iterate until it completes its task in an intelligent and creative way.... OpenAI is doomed.

prodigycorp

If you're using 5.2 high, with all due respect, this has to be a skill issue. If you're using 5.2 Codex high — use 5.2 high. gpt-5.2 is slow, yes (ok, keeping it real, it's excruciatingly slow). But it's not the moronic caricature you're saying it is.

If you need it to be up to date with your version of a framework, then ask it to use the context7 mcp server. Expecting training data to be up to date is unreasonable for any LLM and we now have useful solutions to the training data issue.

If you need it to specify the latest version, don't say "latest". That word would be interpreted differently by humans as well.

Claude is well known at its one-shotting skills. But that's at the expense of strict instruction following adherence and thinner context (it doesn't spend as much time to gather context in larger codebases).

tomashubelbauer

I am using GPT-5.2 Codex with reasoning set to high via OpenCode and Codex and when I ask it to fix an E2E test it tells me that it fixed it and prints a command I can run to test the changes, instead of checking whether it fixed the test and looping until it did. This is just one example of how lazy/stupid the model is. It _is_ a skill issue, on the model's part.

Sammi

Non codex gpt 5.2 is much better than codex gpt 5.2 for me. It does everything better.

theshrike79

Codex runs in a stupidly tight sandbox and because of that it refuses to run anything.

But using the same model through pi, for example, it's super smart because pi just doesn't have ANY safeguards :D

prodigycorp

i refuse to defend the 5.2-codex models. They are awful.

stitched2gethr

Perhaps if he was able to get Claude Code to do what he wanted in less time, and with a better experience, then maybe that's not a skill he (or the rest of us) want to develop.

blitzar

Talking LLMs off a ledge is a skill we will all need going forward.

keeganpoppen

still a skill issue, not a codex issue. sure, this line of critique is also one levied by tech bros who want to transfer your company's balance sheet from salaries to ai-SaaS(-ery), but in what world does that automatically make the tech fraudulent or even deficient? and since when is not wanting to develop a skill a reasonable substitute for anything? if my doctor decided they didn't want to keep up on medical advances, i would find a different doctor. but yet somehow finding fault with an ai because it can't read your mind and, in response to that adversity, refusing to introspect at all about why that might be and blaming it on the technology is a reasonable critique? somehow we have magically discovered a technology to manufacture cognition from nothing more than the intricate weaving of silicon, dopants, et al., and the takeaway is that it sucks because it is too slow, doesn't get everything exactly right, etc.? and the craziest part is that the more time you spend with it, the better intuition you get for getting whatever it is you want out of it. but, yeah... let's lend even more of an ear to the head-in-sand crowd-- that's where the real thought leaders are. you don't have to be an ai techno-utopian maximalist to see the profound worthiness and promise of the technology; these things are manifestly self-evident.

prodigycorp

Sure, that's fine. I wrote my comment for the people who don't get angry at an AI agents after using them for the first time within five hours of their release. For those who aren't interested in portending doom for OpenAI. (I have elaborate setups for Codex/Claude btw, there's no fanboying in this space.)

Some things aren't common sense yet so I'm trying my part to make them so.

miki123211

TBH, "use a package manager, don't specify versions manually unless necessary, don't edit package files manually" is an instructions that most agents still need to be given explicitly. They love manually editing package.json / cargo.toml / pyproject.toml / what have you, and using whatever version is given in their training data. They still don't have an intuition for which files should be manually written and which files should be generated by a command.

prodigycorp

Agree, especially if they're not given access to the web, or if they're not strongly prompted to use the web to gather context. It's tough to judge models and harnesses by pure feel until you understand their proclivities.

jtrn

I try to make it a point to acknowledge when I am wrong. And after some time, and maybe after the release of Codex 5.3, the Codex App is a VERY good agentic coder.

I stand by my initial impression, so far as the claim that it did piss me off, but I completely retract my prediction that Codex/OpenAI is going down the Nokia death spiral based on the performance of the current Codex App paired with the Codex 5.3 model.

I think a combination of the new model version, and me having maximum bad luck in my first test (it truly didn’t do that great), made me come to the wrong conclusion. Codex App is actually pretty good. Thanks for pushing back.

hyeomans

Ty for the tip on context7 mcp btw

adammarples

How would a person interpret the latest version of flowbite?

undefined

[deleted]

jtrn

Ok. You do you. I'll stick with the models that understand what latest version of a framework means.

chandureddyvari

Agreed, had the same experience. Codex feels lazy - I have to explicitly tell it to research existing code before it stops giving hand-wavy answers. Doc lookup is particularly bad; I even gave it access to a Context7 MCP server for documentation and it barely made a difference. The personality also feels off-putting, even after tweaking the experimental flag settings to make it friendlier.

For people suggesting it’s a skill issue: I’ve been using Claude Code for the past 6 months and I genuinely want to make Codex work - it was highly recommended by peers and friends. I’ve tried different model settings, explicitly instructed it to plan first and only execute after my approval, tested it on both Python and TypeScript backend codebases. Results are consistently underwhelming compared to Claude Code.

Claude Code just works for me out of the box. My default workflow is plan mode - a few iterations to nail the approach, then Claude one-shots the implementation after I approve. Haven’t been able to replicate anything close to that with Codex

rohansood15

+1 to this. Been using Codex the last few months, and this morning I asked it to plan a change. It gave me generic instructions like 'Check if you're using X' or 'Determine if logic is doing Y' - I was like WTF.

dudeinhawaii

Curious, are you doing the same planning with Codex out-of-band or otherwise? In order to have the same measurable outcome you'd need to perhaps use Codex in a plan state (there's experimental settings - not recommended) or other means (explicit detailed -reusable- prompt for planning a change). It's a missing feature if your preference is planning in CLI (I do not prefer this).

You are correct in that this mode isn't "out of the box" as it is with Claude (but I don't use it in Claude either).

My preference is to have smart models generate a plan with provided source. I wrote (with AI) a simple python tool that'll filter a codebase and let me select all files or just a subset. I then attach that as context and have a smart model with large context (usually Opus, GPT-5.2, and Gemini 3 Pro in parallel), give me their version of a plan. I then take the best parts of each plan, slap it into a single markdown and have Codex execute in a phased manner. I usually specify that the plan should be phased.

I prefer out-of-CLI planning because frankly it doesn't matter how good Codex or Claude Code dive in, they always miss something unless they read every single file and config. And if they do that, they tip over. Doing it out of band with specialized tools, I can ensure they give me a high quality plan that aligns with the code and expectations, in a single shot (much faster).

Then Claude/Codex/Gemini implement the phased plan - either all at once - or stepwise with me testing the app at each stage.

But yeah, it's not a skill issue on your part if you're used to Plan -> Implement within Claude Code. The Experimental /collab feature does this but it's not supported and more experimental than even the experimental settings.

adithyassekhar

I'm not taking OpenAI's side here but have you reviewed what claude did?

spiderfarmer

I only use claude through the chat ui because it’s faster and it gives me more control. I read most of it and the code is almost always better than what I would do, simply because lazy ass me likes to take shortcuts way too often.

dbbk

I just want Anthropic to spend like two weeks making their own "Codex app", but with Opus.

strongpigeon

Genuinely excited to try this out. I've started using Codex much more heavily in the past two months and honestly, it's been shockingly good. Not perfect mind you, but it keeps impressing me with what it's able to "get". It often gets stuff wrong, and at times runs with faulty assumptions, but overall it's no worse than having average L3-L4 engs at your disposal.

That being said, the app is stuck at the launch screen, with "Loading projects..." taking forever...

Edit: A lot of links to documentation aren't working yet. E.g.: https://developers.openai.com/codex/guides/environments. My current setup involves having a bunch of different environments in their own VMs using Tart and using VS Code Remote for each of them. I'm not married to that setup, but I'm curious how it handles multiple environments.

Edit 2: Link is working now. Looks like I might have to tweak my setup to have port offsets instead of running VMs.

raw_anon_1111

I have the $20 a month subscription for ChatGPT and the $200/year subscription to Claude (company reimbursed).

I have yet to hit usage limits with Codex. I continuously reach it with Claude. I use them both the same way - hands on the wheel and very interactive, small changes and tell them both to update a file to keep up with what’s done and what to do as I test.

Codex gets caught in a loop more often trying to fix an issue. I tell it to summarize the issue, what it’s tried and then I throw Claude at it.

Claude can usually fix it. Once it is fixed, I tell Claude to note in the same file and then go back to Codex

strongpigeon

The trick to reach the usage limit is to run many agents in parallel. Not that it’s an explicit goal of mine but I keep thinking of this blog post [0] and then try to get Codex to do as much for me as possible in parallel

[0]: http://theoryofconstraints.blogspot.com/2007/06/toc-stories-...

raw_anon_1111

Telling a bunch of agents to do stuff is like treating it as a senior developer who you trust to take an ambiguous business requirement and letting them use their best judgment and them asking you if they have a question .

But doing that with AI feels like hiring an outsourcing firm for a project and they come back with an unmaintable mess that’s hard to reason through 5 weeks later.

I very much micro manage my AI agents and test and validate its output. I treat it like a mid level ticket taker code monkey.

motbus3

I will say that doing small modifications or asking a bunch of stuff fills the context the same in my observations. It depends on your codebase and the rest of stuff you use (sub agents, skills, etc)

I was once minimising the changes and trying to take the max of it. I did an uncountable numbers of tests and and variations. Didn't really matter much if I told it to do it all or change one line. I feel Claude code tries to fill the context as fast as possible anyway

I am not sure how worth Claude is right now. I still prefer that rather than codex, but I am starting to feel that's just a bias

girvo

I don’t think it’s bias: I have no love for any of these tools, but in every evaluation we’ve done at work, Opus 4.5 continually comes out ahead in real world performance

Codex and Gemini are both good, but slower and less “smart” when it comes to our code base

petesergeant

I have a found Codex to be an exceptional code-reviewer of Claude's work.

650REDHAIR

I hit the Claude limit within an hour.

Most of my tokens are used arguing with the hallucinations.

I’ve given up on it.

hnsr

Do you use Claude Code, or do you use the models from some other tool?

I find it quite hard to hit the limits with Claude Code, but I have several colleagues complaining a lot about hitting limits and they use Cursor. Recently they also seem to be dealing with poor results (context rot?) a lot, which I haven't really encountered yet.

I wonder if Claude Code is doing something smart/special

TuxSH

In my case I've had it (Opus Thinking in CC) hit 80% of the 5-hour limit and 100% of the context window with one single tricky prompt, only to end up with worthless output.

Codex at least 'knows' to give up in half the time and 1/10th of the limits when that happens.

theshrike79

I don't want to be That Guy, but if you're "arguing with hallucinations" with an AI Agent in 2026 you're either holding it wrong or you're working on something highly nonstandard.

Areibman

Your goal should be to run agents all the time, all in parallel. If you’re not hitting limits, you’re massively underutilizing the VC intelligence subsidy

https://hyperengineering.bottlenecklabs.com/p/the-infinite-m...

dkundel

Hey thank you for calling out the broken link. That should be fixed now. Will make sure to track down the other broken links. We'll track down why loading is taking a while for you. Should definitely be snappier.

wahnfrieden

Is this the only announcement for Apple platform devs?

I thought Codex team tweeted about something coming for Xcode users - but maybe it just meant devs who are Apple users, not devs working on Apple platform apps...

SunshineTheCat

Same here. From my experience, codex usually knocks backend/highly "logical?" tasks out of the park while fairly basic front-end/UI tasks it stumbles over at times.

But overall it does seem to be consistently improving. Looking to see how this makes it easier to work with.

adithyassekhar

Backend, regardless of language or framework are often set in stone. There's a well defined/most used way for everything. Especially since most apps when reduced is CRUD. Frontend by the nature of how frontend works, can be completely different from project to project if one wants to architect it efficiently.

xiphias2

Cool, looks like I'll stay on Cursor. All alternatives come out buggy, they care a lot about developer experience.

BTW OpenAI should think a bit about polishing their main apps instead of trying to come out with new ones while the originals are still buggy.

embirico

(I work on Codex) One detail you might appreciate is that we built the app with a ton of code sharing with the CLI (as core agent harness) and the VSCode extension (UI layer), so that as we improve any of those, we polish them all.

theLiminator

Any chance you'll enable remote development on a self-hosted machine with this app?

Ie. I think the codex webapp on a self-hosted machine would be great. This is impotant when you need a beefier machine (with potentially a GPU).

thefounder

Any reason to switch from vscode with codex to this app? To me it looks like this app is more for non-developers but maybe I’m missing something

tomjen3

Awesome. Any chance we will see a phone app?

I know coding on a phone sounds stupid, but with an agent it’s mostly approvals and small comments.

nr378

Looks like another Claude App/Cowork-type competitor with slightly different tradeoffs (Cowork just calls Claude Code in a VM, this just calls Codex CLI with OS sandboxing).

Here's the Codex tech stack in case anyone was interested like me.

Framework: Electron 40.0.0

Frontend:

- React 19.2.0

- Jotai (state management)

- TanStack React Form

- Vite (bundler)

- TypeScript

Backend/Main Process:

- Node.js

- better-sqlite3 (local database)

- node-pty (terminal emulation)

- Zod (validation)

- Immer (immutable state)

Build & Dev:

- pnpm (package manager)

- Electron Forge

- Vitest (testing)

- ESLint + Prettier

Native/macOS:

- Sparkle (auto-updates)

- Squirrel (installer)

- electron-liquid-glass (macOS vibrancy effects)

- Sentry (error tracking)

epolanski

They have the same stack of a boot camper, quite telling.

undefined

[deleted]

dcre

The use of the name Codex and the focus on diffs and worktrees suggests this is still more dev-focused than Cowork.

nxobject

It's a smart move – while Codex has the same aspirations, limiting it to savvy power users will likely lead to better feedback, and less catastrophic misuse.

elpakal

> this just calls Codex CLI with OS sandboxing

The git and terminal views are a big plus for me. I usually have those open and active in addition to my codex CLI sessions.

Excited to try skills, too.

another_twist

Is the integration with Sentry native or via MCP ?

hdjrudni

What does Sentry via MCP even mean? You want the LLM to call Sentry itself whenever it encounters an error?

another_twist

Meaning sentry exposes an MCP layer with a tool call layer and tool registry. In this case, the layer is provided by Sentry. Native would mean if calling specific Sentry APIs is provided as a specific integration path depending on the context. Atleast thats how I categorize.

firemelt

wow where did you find the stack? how about claude app stack?

samuelstros

It's basically what Emdash (https://www.emdash.sh/), Conductor (https://www.conductor.build/) & CO have been building but as first class product from OpenAI.

Begs the question if Anthropic will follow up with a first-class Claude Code "multi agent" (git worktree) app themselves.

FanaHOVA

https://code.claude.com/docs/en/desktop

samuelstros

oh i didn't know that claude code has a desktop app already

esafak

And it uses worktrees.

mcintyre1994

It isn’t its own app, but it’s built in to their desktop, mobile and web apps.

desireco42

I never heard of Emdash before and I am following on AI tools closely. It just shows you how much noise there is and how hard is to promote the apps. Emdash looks solid. I almost went to build something similar because I wasn't aware of it.

another_twist

I am not sure if multi agent approach is what it is hyped up to be. As long we are working on parallel work streams with defined contracts (say an agreed upon API def that backend implements and frontend uses), I'd assume that running independent agent coding sessions is faster and in fact more desirable so that neither side bends the code to comply with under specified contracts.

sepositus

Usually I find the hype is centered around creating software no one cares about. If you're creating a prototype for dozens of investors to demo - I seriously doubt you'd take the "mainstream" approach.

IMTDb

Maybe a dumb question on my side; but if you are using a GUI like emdash with Claude Code, are you getting the full claude code harness under the hood or are you "just" leveraging the model ?

atestu

I can answer for Conductor: you're getting the full Claude Code, it's just a GUI wrapper on top of CC. It makes it easy to create worktrees (1 click) and manage them.

heystefan

I don't think this is true. Try running `/skills` or `/context` in both and you'll see.

mritchie712

yeah, I wanted a better terminal for operating many TUI agent's at once and none of these worked because they all want to own the agent.

I ended up building a terminal[0] with Tauri and xterm that works exactly how I want.

0 - screenshot: https://x.com/thisritchie/status/2016861571897606504?s=20

saadn92

looks like we both did haha: https://github.com/saadnvd1/aTerm

arnestrickmann

Emdash is inducing CC, Codex, etc. natively. Therefore users are getting the raw version of each agent.

krzyzanowskim

until then, the https://CommanderAI.app tries to fill the gap on Mac

asdev

They have Claude Code web in research preview

dbbk

It still doesn't support plan mode... I'm really confused why that's so hard to do

nycdatasci

The landing page for the demo game "Voxel Velocity" mentions "<Enter> start" at the bottom, but <Enter> actually changes selection. One would think that after 7mm tokens and use of a QA agent, they would catch something like this.

anematode

It's interesting, isn't it? On the one hand the game is quite impressive. Although it doesn't have anything particularly novel (and it shouldn't, given the prompt), it still would have taken me several days, probably a week, working nonstop. On the other hand, there's plenty of paper cuts.

I think these subtle issues are just harder to provide a "harness" for, like a compiler or rigorous test suite that lets the LLM converge toward a good (if sometimes inelegant) solution. Probably a finer-tuned QA agent would have changed the final result.

why_at

It's also interesting how the functionality of the game barely changes between 60k tokens, 800k tokens, and 7MM tokens. It seems like the additional tokens made the game look more finished, but it plays almost exactly the same in all of them.

I wonder what it was doing with all those tokens?

zamadatix

I'd bet the initial token usage is all net new while the later token usage probably has reading+regenerating significant portions of the project for individual minor changes/fixes.

E.g. I wouldn't be surprised if identifying the lack of touch screen support on the menu, feeding it in, and then regenerating the menu code sometime between 800k and 7MM took a lot of tokens.

mazswojejzony

Sadly, my own small game-dev adventures look similar: I can implement the core mechanics fairly quickly, but polishing the game takes ages.

UPDATE: without AI usage at all (just to clarify).

oneneptune

I'm a Claude Code user primarily. The best UI based orchestrator I've used is Zenflow by Zencoder.ai -- I am in no way affiliated with them, but their UI / tool can connect to any model or service you have. They offer their own model but I've not used it.

What I like is that the sessions are highly configurable from their plan.md which translates a md document into a process. So you can tweak and add steps. This is similar to some of the other workflow tools I've seen around hooks and such -- but presented in a way that is easy for me to use. I also like that it can update the plan.md as it goes to dynamically add steps and even add "hooks" as needed based on the problem.

sepositus

Always sounds so interesting and then I do a search only to found out it's another product trying to sell you your 20th "AI credit package." I really don't see how these apps will last that long. I pay for the big three already - and no I don't want to cancel them just so I can use your product.

cactusplant7374

Aren't there 500+ aggregator services?

vzaliva

How about us, Linux users? This is Mac only. Do they plan to support CLI version with all the features they are adding to desktop app?

romainhuet

Hi! Romain here, I work at OpenAI. The team actually built the Codex app in Electron so we can support both Windows and Linux very soon. Stay tuned!

tonkinai

Do you plan to release a build for Mac Intel?

miscend

You can try this script https://github.com/Miscend/codex-rebuilder

undefined

[deleted]

freedomben

Nice, thank you for sharing!

miscend

Would it be tricky to make an Intel Mac build?

AlexCoventry

Are you planning to open-source it?

kurtis_reed

Let me guess, you use MacOS yourself?

wutwutwat

not only is it mac only, it appears to be arm only as well. App won't launch on my intel mac

miscend

there a non official script to get it running on intel https://github.com/Miscend/codex-rebuilder

Rudybega

Yeah, I'm having the same issue. Disappointing limitations.

goniszewski

Guess MacOS gives you pass for early-access stuff, right? /s

From a developer's perspective it makes sense, though. You can test experimental stuff where configurations are almost the same in terms of OS and underlying hardware, so no weird, edge-case bugs at this stage.

waldopat

It seems the big feature is working agents in parallel? I've been working agents in parallel in Claude Code for almost 9 months now. Just create a command in .claude/commands that references an agent in .claude/agents. You can also just call parallel default Task agents to work concurrently.

Using slash commands and agents has been a game changer for me for anything from creating and executing on plans to following proper CI/CD policies when I commit changes.

To Codex more generally, I love it for surgical changes or whenever Claude chases its tail. It's also very, very good at finding Claude's blindspots on plans. Using AI tools adversarially is another big win in terms of getting things 90% right the first time. Once you get the right execution plan with the right code snippets, Claude is essentially a very fast typer. That's how I prefer to do AI-assisted development personally.

That said, I agree with the comments on tokens. I can use Codex until the sun goes down on $20/month. I use the $200/month pro plan with Claude and have only maxxed out a couple times, but I do find the volume to quality to be better with Claude. So far it's worth the money.

surrTurr

- looks like OpenAIs answer to Claude Code Desktop / Cowork

- workspace agent runner apps (like Conductor) get more and more obsolete

- "vibe working" is becoming a thing - people use folder based agents to do their work (not just coding)

- new workflows seem to be evolving into folder based workspaces, where agents can self-configure MCP servers and skills + memory files and instructions

kinda interested to see if openai has the ideas & shipping power to compete with anthropic going forward; anthropic does not only have an edge over openai because of how op their models are at coding, but also because they innovate on workflows and ai tooling standards; openai so far has only followed in adoption (mcp, skills, now codex desktop) but rarely pushed the SOTA themselves.

OkGoDoIt

Also interesting that they are both only for macOS. I’m feeling a bit left out on the Windows and Linux side, but this seems like an ongoing trend.

surrTurr

my guess is that openai/anthropic employees work on macOS and mostly vibe code these new applications (be it Atlas browser or now Codex Desktop); i wouldn't be surprised if Codex Desktop was built in a month or less;

linux / windows requires extra testing as well as some adjustments to the software stack (e.g. liquid glass only works on mac); to get the thing out the door ASAP, they release macos first.

abshkbh

We did train Codex models natively on Windows - https://openai.com/index/introducing-gpt-5-2-codex/ (and even 5.1-codex-max)

hdjrudni

I appreciate this (as a Windows user) but I'm also curious how necessary this was.

Like I notice in Codex in PhpStorm it uses Get-Whatever style PowerShell commands but firstly, I have a perfectly working Git-Bash installed that's like 98% compatible with Linux and Mac. Could it not use that instead of being retrained on Windows-centric commands?

But better yet, probably 95% of the commands it actually needs to run are like cat and ripgrep. Can't you just bundle the top 20 commands, make them OS-agnostic and train on that?

The last tiny bit of the puzzle I would think is the stuff that actually is OS-specific, but I don't know what that would be. Maybe some differences in file systems, sandboxing, networking.

scottyah

A lot of companies that use Windows are likely to use Microsoft Office products, and they were all basically forced to sign a non-compete where they can't run other models- just copilot.

RazerWazer

I'm so sick and tired of the macOS elitism in the AI/LLM world.

theshrike79

It's just realism.

MacOS is unix under the hood so the models can just use bash and cli tools easily instead of dealing with WSL or Powershell.

MacOS has built-in sandboxing at a better level than Windows (afaik the Codex App is delayed for Windows due to sandboxing complexities)

Also the vast majority of devs use MacBooks unless they work for Microsoft or are in a company where the vast majority of employees are locked to Windows for some reason (usually software related).

mellosouls

Mac only. Again.

Apple is great but this is OpenAI devs showing their disconnect from the mainstream. Its complacent at best, contemptuous at worst.

SamA or somebody really needs to give the product managers here a kick up the arse.

romainhuet

Hi! Romain here, I work on Codex at OpenAI. We totally hear you. The team actually built the app in Electron specifically so we can support Windows and Linux as well. We shipped macOS first, but Windows is coming very soon. Appreciate you calling this out. Stay tuned!

anonymous908213

Electron? Why can't Codex write, or at least translate, your application to native code instead of using a multi-hundred-mb browser wrapper to display text? Is this the future of software engineering Codex is promising me?

embirico

Only thing i'd add re windows is it's taking us some time to get really solid sandboxing working on Windows, where there are fewer OS-level primitives for it. There's some more at https://developers.openai.com/codex/windows and we'd love help with testing and feedback to make it robust.

Oras

Curios why electron not native?

Wouldn’t native give better performance and more system integration?

ForHackernews

When you're a trillion dollar company that burns more coal than Bangladesh in order to harness a hyperintelligent Machine God to serve your whims, you don't have the resources to maintain native clients for three separate targets.

rolymath

[flagged]

EastSmith

Really hoped you'd say Linux next.

romainhuet

We should be able to make it available for Linux very soon as well!

tptacek

If you were going to release a product for developers as soon as it was ready for developers to try, such that you could only launch on one platform and then follow up later with the rest, macOS is the obvious choice. There's nothing contemptuous about that.

nashashmi

Check out https://news.ycombinator.com/item?id=46829029

mellosouls

Kudos to the OpenAI reps for responding to my comment and doing so politely.

My ire was provoked by this following on from the Windows ChatGPT app that was just a container for the webpage compared to the earlier bells and whistles Mac app. Perceptions are built on those sorts of decisions.

piskov

Because of that windows had thinking budget selectors for months before ios and macos (those got this only last week)

dkundel

Windows is almost ready. It's already running but we are solving a few more things before the release to make sure it works well.

rubslopes

To me, the obvious next step for these companies is to integrate their products with web hosting. At this point, the remaining hurdle for non-developers is deploying their creations to the cloud with built-in monetization.

Bishonen88

I think deploying can already be done with the help of LLMs using docker and vpc's (e.g. hetzner and co.) rather easily.

What I struggle with is the legal overhead of e.g. collecting money for an app/website. I have a semi-finished app which I know I could delploy within a few hours but to collect money, living in Germany is a minefield from what I understand. I don't want my name made public with the app. GmbH (LLCs) cost thousands (?). The whole GDPR minefield, google-font usage scam etc. makes me hold back.

Googling/reddit only gives so much insights.

If someone has a good reference about starting a SaaS/App from within EU/Germany with all the legalities etc. I'd be super interested!

jimmy76615

Just tell it to use your gcp/aws account using the cli, makes it infinitely powerful in terms of deployment. (Also, while I might miss some parts of programming that I have given to AI, I certainly don't miss working with clouds).

wiether

> Just tell it to use your gcp/aws account using the cli

Please don't.

People burning through their tokens allowance on Claude Code is one thing.

People having their agent unknowingly provisioning thousands of $ of cloud resources is something completely different.

theshrike79

This is also on the cloud providers for not giving us good tools to manage costs.

trunnell

How about, "tell the agent to write instructions for cloud deployment with a cost estimate"

iamnotarobottho

and specifically, the big companies, in a way that people notice. Claude Artifacts, AI Studio, etc. all kinda suck. If you have used Manus or connected your own CF, GCP, AWS, etc. you see how easy it could be if one of the big guys wanted it to be (or could get out of their own way).

the big boys probably don't want people who don't know sec deploying on their infra lol.

halflings

Deploying from Antigravity is as easy as say connecting the Firebase MCP [1] and asking it "deploy my app to firebase".

[1] https://firebase.google.com/docs/ai-assistance/mcp-server

falloutx

I dont think these are made for non-devs, Lovable and other which are built for non-devs already provide hosting.

fabianlindfors

We have been working on this, letting any coding agent define infrastructure so we can define it effortlessly: https://specific.dev. We aren't just targeting non-developers though, we think this is useful to anyone building primarily through coding agents.

anthonypasq

Replit already does this

johnpaulkiser

interestingly opencode's first product was an IaC platform... seems to be where this is all going.

Daily Digest email

Get the top HN stories in your inbox every day.