Brian Lovin
/
Hacker News
Daily Digest email

Get the top HN stories in your inbox every day.

danpalmer

I've used many of the CI systems that the author covers here, and I've done a lot of CircleCI and GitHub Actions, and I don't come to quite the same conclusions. One caveat though: I haven't used Buildkite, which the author seems to recommend.

Over the years CI tools have gone from specialist to generalist. Jenkins was originally very good at building Java projects and not much else, Travis had explicit steps for Rails projects, and CircleCI was similar back in the day.

This was a dead end. CI is not special. We realised as a community that in fact CI jobs were varied, that encoding knowledge of the web framework or even language into the CI system was a bad idea, and CI systems became _general workflow orchestrators_, with some logging and pass/fail UI slapped on top. This was a good thing!

I orchestrated a move off CircleCI 2 to GitHub Actions, precisely because CircleCI botched the migration from the specialist to generalist model, and we were unable to express a performant and correct CI system in their model at the time. We could express it with GHA.

GHA is not without its faults by any stretch, but... the log browser? So what, just download the file, at least the CI works. The YAML? So it's not-quite-yaml, they weren't the first or last to put additional semantics on a config format, all CI systems have idiosyncrasies. Plugins being Docker images? Maybe heavyweight, but honestly this isn't a bad UX.

What does matter? Owning your compute? Yeah! This is an important one, but you can do that on all the major CI systems, it's not a differentiator. Dynamic pipelines? That's really neat, and a good reason to pick Buildkite.

My takeaway from my experience with these platforms is that Actions is _pretty good_ in the ways that truly matter, and not a problem in most other ways. If I were starting a company I'd probably choose Buildkite, sure, but for my open source projects, Actions is good.

dijit

I actually have the opposite opinion.

In game development we care a lot about build systems- and annoyingly, we have vanishingly few companies coming to throw money at our problems.

The few that do, charge a kings ransom (Incredibuild). Our build times are pretty long, and minimising them is ideal.

If, then, your build system does not understand your build-graph then you’re waiting even longer for builds or you’re keeping around incremental state and dirty workspaces (which introduces transient bugs, as now the compiler has to do the hard job of incrementally building anyway).

So our build systems need to be acutely aware of the intricacies of how the game is built (leading to things like UnrealEngine Horde and UBA).

If we used a “general purpose” approach we’d be waiting in some cases over a day for a build, even with crazy good hardware.

maccard

Also game dev here - I disagree with your take. Our _build tools_ need to be hyper-aware, but our CI systems absolutely do not, and would be better served as general purpose. What good is Horde when you need to deploy your already-packaged game to Steam via steamcmd, or when you need to update a remote config file for a content hotfix? Horde used BuildGraph, meaning you needed a full engine-synced node just to run `curl -X POST whatever.com`

Game dev has a serious case of NIH - sometimes for good reasons, but in lots of cases it’s because things have been set up in a way that makes changing them impractical. Using UBA as an example: FastBuild, Incredibuild, SN-DBS, and sccache all exist as either caching or distribution systems. Compiling a game engine isn’t much different to compiling a web browser (which ninja was written for).

I’ve worked at two game studios where we’ve used general-purpose CI systems and been able to push out builds in < 15 minutes. Horde and UBA exist to handle how Epic does things internally, rather than as an inherent requirement for using the tools effectively. If you don’t have the same constraints as developing Unreal Engine (and Fortnite), then you don’t have the same needs.

(I worked for Epic when Horde came online, but don’t any more.)

frankharrison

If you're at a games studio that values build-times, value that. I worked at a very good SRE-mindset studio and missed it, deeply, after I left. Back then I expected everyone to think and care about such things and have spent many, many hours advocating for best-in-class, more efficient, cheaper development practices.

WRT GitHub Actions... I agree with OOP: they leave much to be desired, especially when working on high-velocity work. My CI/CD runs locally first, and then GHA is a (slower), low-noise verification step.

SOLAR_FIELDS

Actions is many things. It’s an event dispatcher, an orchestrator, an execution engine and runtime, an artifact registry and caching system, a workflow modeler, a marketplace, and a secrets manager. And I didn’t even list all of the things Actions is. It’s better at some of those things and not others.

The systems I like to design that use GHA usually only use the good parts. GitHub is a fine event dispatcher, for instance, but a very bad workflow orchestrator. So delegate that to a system that is good at it instead.

kyleee

Has anyone done the “GitHub Actions: The Good Parts” book yet?

dataflow

> but... the log browser? So what, just download the file, at least the CI works.

They answer your "so what" quite directly:

>> Build logs look like terminal output, because they are terminal output. ANSI colors work. Your test framework’s fancy formatting comes through intact. You’re not squinting at a web UI that has eaten your escape codes and rendered them as mojibake. This sounds minor. It is not minor. You are reading build logs dozens of times a day. The experience of reading them matters in the way that a comfortable chair matters. You only notice how much it matters after you’ve been sitting in a bad one for six hours and your back has filed a formal complaint.

Having to mentally ignore ANSI escape codes in raw logs (let alone being unable to search for text through them) is annoying as hell, to put it mildly.

usr1106

Doesn't `less -R` solve the ANSI escape problem?

dataflow

No, it's insane to have to rely on that workaround. Having to download raw logs, bring up a terminal, go to that directory, and type `less -R` is already a massive pain. And after all that, you don't even get a basic scrollbar.

And how do you expect people to even know about this workaround, and how to search for text with it? It's not like the GitHub UI even tells you. Not everyone is a Linux pro.

Nobody is saying it's impossible to get past the ANSI escape codes. People eventually figure out ways to do it. The question is how much of your time you want to lose to friction in a process you have to repeat frequently. It's insane for it to be this hard.

bayindirh

> Having to mentally ignore ANSI escape codes in raw logs (let alone being unable to search for text through them) is annoying as hell, to put it mildly.

You have a tool here, which is noted elsewhere: it's "less -R". Also there's another tool which analyzes your logs and color-codes them: "lnav".

lnav is incredibly powerful and helps you understand what's happening, when, and where. It can also tail logs. Recommended usage is "your_command 2>&1 | lnav -t".
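If you just want plain text, the common color sequences can also be stripped with a short sed filter. A sketch, assuming GNU sed (which supports the `\x1b` escape); the pattern covers the usual CSI color codes, not every possible escape sequence:

```shell
# Simulate a colored CI log line, then strip the ANSI color codes from it.
printf 'tests: \033[32mPASS\033[0m (42 passed)\n' > build.log
# ESC '[' params final-letter: matches the common CSI sequences (GNU sed).
sed -e 's/\x1b\[[0-9;]*[A-Za-z]//g' build.log
# → tests: PASS (42 passed)
```

The cleaned output is then searchable with any ordinary pager or grep.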

estimator7292

> Owning your compute? Yeah! This is an important one, but you can do that on all the major CI systems

Except for GitHub charging you monthly to run your own CI jobs on your own hardware.

apothegm

I was a very early customer of BuildKite. It’s lovely, very ergonomic, and gives you so much control.

mickeyp

The winning strategy for all CI environments is a build system facsimile that works on your machine, your CI's machine, and your test/uat/production with as few changes between them as your project requirements demand.

I start with a Makefile. The Makefile drives everything. Docker (compose), CI build steps, linting, and more. Sometimes a project outgrows it; other times it does not.

But it starts with one unitary tool for triggering work.
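A minimal sketch of that shape (the tool invocations are placeholders for whatever the project actually uses): phony task targets that a developer and the CI config invoke identically, so the CI step reduces to `make ci`:

```make
.PHONY: lint test build ci

# Recipe lines below are indented with tabs, as make requires.
lint:
	ruff check .

test:
	pytest -q

build:
	docker compose build

# CI runs exactly what a developer runs locally.
ci: lint test build
```

The CI config then contains little more than `make ci` plus credentials and environment, which is what keeps it portable between providers.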

carlsmedstad

This line of thinking inspired me to write mkincl [0] which makes Makefiles composable and reusable across projects. We're a couple of years into adoption at work and it's proven to be both intuitive and flexible.

[0]: https://github.com/mkincl/mkincl

zahlman

I think the README would be better with a clearer, up-front explanation of what this adds on top of using `make` directly.

PunchyHamster

[flagged]

antonvs

Because, in 2026, most build tools still aren't really all that good when it comes to integrating all the steps needed to build applications with non-trivial build requirements.

And, many of them lack some of the basic features that 'make' has had for half a century.

chedabob

Yeah, kick off into some higher-level language instead of being at the mercy of your CI provider's plugins.

I use Fastlane extensively on mobile, as it reduces boilerplate and gives enough structure that the inherent risk of depending on a 3rd party is worth it. If all else fails, it's just Ruby, so you can break out of it.

krautsauer

Make is incredibly cursed. My favorite example is it having a built-in rule (oversimplified, some extra Makefile code that is pretended to exist in every Makefile) that will extract files from a version control system. https://www.gnu.org/software/make/manual/html_node/Catalogue...

What you're saying is essentially "Just Write Bash Scripts", but with an extra layer of insanity on top. I hate it when I encounter a project like this.

nicoburns

https://github.com/casey/just is an uncursed make (for task running purposes - it's not a general build system)

arielcostas

How does `just` compare to Task (https://taskfile.dev/)?

mickeyp

No, I'm saying use Makefiles, which work just fine. Mark your targets with .PHONY and move on.

krautsauer

You still get bash scripts in the targets, with $ escape hell and weirdness around multiline scripts, ordering & parallelism control headaches, and no support for background services.

The only sane use for Makefiles is running a few simple commands in independent targets, but do you really need make then?

(The argument that "everyone has it installed" is moot to me. I don't.)

wtcactus

I agree, but this is kind of an unachievable dream in medium to big projects.

I had this fight for some years at my present work and was really nagging in the beginning about the path we were getting into by not allowing the developers to run the full pipeline (or most of it) on their local machines… the project decided otherwise, and now we spend a lot of time and resources on a behemoth of a CI infrastructure, because each MR takes about 10 builds (of trial and error) in the pipeline to be properly tested.

mickeyp

It's not an unachievable dream. It's a trade-off made by people who may or may not have made the right call. Some things just don't run on a local machine: fair. But a lot of things do, even very large things. Things can be scaled down; the same harnesses can drive your development environment, your CI environment, and your prod environment. You don't need a full prod DB; you need a facsimile mirroring the real thing at 1/50th the size.

Yes, there will always be special exemptions: they suck, and we suffer as developers because we cannot replicate a prod-like environment in our local dev environment.

But I laugh when I join teams and they say that "our CI servers" can run it but our shitty laptops cannot, and I wonder why they can't just... spend more money on dev machines? Or perhaps spend some engineering effort so they work on both?

maccard

> You don't need a full prod db, you need a facsimile mirroring the real thing but 1/50th the size.

My experience has been that the problems in CI systems come from exactly these differences: “works on my machine”, followed by “oops, I guess the build machine doesn’t have access to that random DB”, or “docker push fails in our CI environment because of credentials/permissions, but it works when I run it on my machine”.

wtcactus

> It's not an unachievable dream. It's a trade-off made by people who may or may not have made the right call.

In my experience at work, anything that demands too much thought, collaboration between teams, and enforcement of hard development rules is always an unachievable dream in a medium-to-big project.

Note that I don't think it's technically unachievable (at all). I've just accepted that it's culturally (as in work culture) unachievable.

zahlman

Sometimes the problem is that the project is bigger than it needs to be.

nottorp

Funny enough, the LLMs are allowed to run builds on your local machine. The humans, not any more.

wtcactus

But it isn't a question of security. The project would very much like the developers to be able to run the pipelines on their machines.

It's just that management don't see it as worth it, in terms of development cost and limitations it would introduce in the current workflow, to enable the developers to do that.

zdw

I tend to disagree with this as it seems like an ad for Nix/Buildkite...

If your CI invocations are anything more than running a script or a target on a build tool (make, etc.) where the real build/test steps exist and can be run locally on a dev workstation, you're making the CI system much more complex than it needs to be.

CI jobs should at most provide an environment and configuration (credentials, endpoints, etc.), as a dev would do locally.

This also makes your code CI agnostic - going between systems is fairly trivial as they contain minimal logic, just command invocations.

gchamonlive

The "just keep your CI simple" mindset doesn't work in practice. Any non-trivial project has a high chance of having to encode some form of logic in the CI, whether for situational triggers, git branching strategies, on-demand deployments, permissions, secrets, heterogeneous runners, load balancing, local testing, or component testing... these are all valid use cases, all with their own gotchas and hard-to-debug issues in every CI system I know.

It's correct to design CI pipelines to offload much of the logic to subsystems, but pipelines will eventually grow in complexity, and the CI config system should be designed not to get in the way. I don't know Buildkite, but GitLab CI is the best I know. Template and job composition work brilliantly, making the top-level object the job rather than the stage results in flat, easier-to-read config files, and the packed-in features are really good. But it's hard to debug, the conditional logic sometimes fails in unexpected ways, the predefined-variables reference is exhausting to use, and the permission system for multi-project pipelines is abysmal.

zdw

I don't think we're necessarily in disagreement - your points about reusing CI code across jobs through templating or composition are well taken.

I'd argue that this also dovetails very nicely with having common, shared invocations - if you can run "make test" in any repo and have it work, that makes CI code reuse even easier.

As for the complexity comments, that complexity has to go somewhere, and you should look for how to best factor the system so it's debuggable. Sometimes this may mean restructuring how your code is factored or deployed or has failure tolerance so it's easier to test, and this should be thought of as an architecture task early on.

mitchjj

Can 100% confirm this is not an ad (at least not for Buildkite) - it was a lovely surprise for the team to read.

jamesfinlayson

This, so much - I remember migrating from one CI system to another a few years ago. I had built all of our pipelines to pull in some secrets and call a .sh file that did all the heavy lifting, so the migration had a few pain points but was fairly easy. Meanwhile, the teams who had created their pipelines with the UI and broken them up into multiple steps were not happy at all.

anotherevan

Hey, at least you didn't pull the reflexive, "this must be AI slop!" comment that seems quite prevalent on HN lately.

heftykoo

The problem isn't CI/CD; the problem is "programming in configuration". We've somehow normalized a dev loop that involves `git commit -m "try fix"`, waiting 10 minutes, and repeating. Local reproduction of CI environments is still the missing link for most teams.

MomsAVoxell

Bingo.

These tool failures are a consequence of a failure of proper policy.

Tooling and Methodology!

Here’s the thing: build it first, then optimize it. Same goes for compile/release versus compile/debug/test/hack/compile/debug/test/test/code cycles.

That there is not a big enough distinction between a development build and a release build is a policy mistake, not a tooling ‘issue’.

Set things up properly, and anyone pushing through git into the tooling pipeline is going to get their fingers bent soon enough anyway, and learn how the machine mangles digits.

You can adopt this policy of environment isolation with any tool - it’s a method.

Tooling and Methodology!

cdaringe

Yes AND… more. He discusses your (correct) sentiment before and during his bash-temptation segment. It’s only one of the gripes, but IMHO this one is the 80% in the Pareto sense.

dschu

`act` should help most teams reproduce CI locally.

mlrtime

act is horrible if:

* you have any remote resources that are needed during build

* for some reason your company doesn't have standardized build images

burnJS

Killing engineer teams? Hyperbolic thread titles need to be killed. I find GitHub Actions to be just fine. I prefer it to Bitbucket and GitLab.

altmanaltman

Yeah I was wondering how Microsoft is okay with Github murdering people but then was let down by the article.

shwetanshu21

[flagged]

slopusila

[flagged]

internet_points

Aaand there we have Godwin's law again.

noident

I clicked the article thinking it was about GitLab. Much of the criticism held true for GitLab anyway, particularly the insanely slow feedback loops these CI/CD systems create.

dbtablesorrows

Can't blame GitLab for a team not having a local dev setup.

spockz

You can, though. GHA, GitLab CI, and all the others have a large feature set for orchestration (build matrices, triggers, etc.) that is hard to test in a local setup. Sometimes they interfere with the build because of flags, or the build fails because it got orchestrated on a different machine, or a package is missing, or the cache key was misconfigured, etc.

There are a bunch of failures of a build that have nothing to do with how your build itself works. Asking teams to rebuild all that orchestration logic into their builds is madness. We shouldn’t ask teams to have to replicate tests for features that are in the CI they use.

anttiharju

Github being less and less reliable nowadays just makes this more true.

In the past week I have seen:

- actions/checkout inexplicably failing, sometimes succeeding on 3rd retry (of the built-in retry logic)

- release ci jobs scheduling _twice_, causing failures, because ofc the release already exists

- jobs just not scheduling. Sometimes for 40m.

I have been using it actively for a few years and putting aside everything the author is saying, just the base reliability is going downhill.

I guess Zig was right. Too bad they missed Buildkite; Codeberg hasn't been that reliable or fast in my experience.

bugglebeetle

Yeah, do crons even work consistently for GitHub Actions? I tried to set one up the other day and it just randomly skipped runs. There were some docs that suggested they’re entirely unreliable as well.

habosa

Dead on. GitHub Actions is the worst CI tool I’ve ever used (maybe tied with Jenkins) and Buildkite is the best. Buildkite’s dynamic pipelines (the last item in the post) are so amazingly useful you’ll wonder how you ever did without them. You can do super cool things like have your unit test step spawn a test de-flaking step only if a test fails. Or control test parallelism based on the code changes you’re testing.

All of that on top of a rock-solid system for bringing your own runner pools which lets you use totally different machine types and configurations for each type of CI job.

Highly, highly recommend.
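The mechanism behind those dynamic pipelines is that any step can emit more pipeline YAML at runtime and hand it to `buildkite-agent pipeline upload`. A sketch of the de-flaking idea (the file name and follow-up command are illustrative, and the YAML is printed to stdout here where a real script would pipe it to the agent):

```shell
#!/bin/sh
# Pretend the unit-test step recorded the names of failing tests here.
printf 'flaky_test_1\n' > failed-tests.txt

# If anything failed, generate a follow-up step to re-run just those tests.
# In a real pipeline the heredoc would be piped to:
#   buildkite-agent pipeline upload
if [ -s failed-tests.txt ]; then
  cat <<'YAML'
steps:
  - label: "Re-run failed tests"
    command: "./scripts/run-tests --only-failed failed-tests.txt"
YAML
fi
```

Because the step list is generated by a program rather than declared up front, it can depend on anything the build has learned so far - which tests failed, which paths changed, and so on.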

tcoff91

Jenkins had a lot of issues and I’m glad to not be using it overall, but I did like defining pipelines in Groovy and I’ll take Groovy over YAML all day.

bigstrat2003

Jenkins, like many complex tools, is as good or bad as you make it. My last two employers had rock solid Jenkins environments because they were set up as close to vanilla as possible.

But yes, Groovy is a much better language for defining pipelines than YAML. Honestly pretty much any programming language at all is better than YAML. YAML is fine for config files, but not for something as complex as defining a CI pipeline.

tcoff91

What kills me is when these things add like control flow constructs to YAML.

Like just use an actual programming language!

PunchyHamster

The biggest flaw of Jenkins is that by default it runs on the builder env, as it was made pre-container era. But I do like the integration for viewing tests and benchmarks directly in the project - stuff that most CI/CD systems lack.

iberator

What's wrong with Jenkins? It's battle-tested and hardened. It works flawlessly even with thousands of tasks, and WORKS OUT OF THE BOX.

IMO it's among the top 10 best free admin/dev software written in the past 25 years.

undefined

[deleted]

liveoneggs

It's too old and easy-to-use for anyone to hype it up as the next cool thing.

mitchjj

Buildkite has been around since 2013, hardly the next hype train

Storment33

I mean all CIs work out of the box, although I have no interest in self hosting CI.

Jenkins is probably a bit like Java: technically it is fine. The problem is really where/who typically uses it, and as there is so much freedom it is really easy to make a monster. Whereas with Go it is a lot harder to write terrible, unmaintainable code than with Java.

harikb

Ian Duncan, I was imagining you on a stage delivering this as a standup comedy show on Netflix.

My pet peeve with GitHub Actions was that if I want to do simple things, like make a "release", I have to Google for and install packages from internet randos. Yes, it is possible this rando1234 is a founding GitHub employee and it is all safe. But why does something so basic need external JS packages?

computerfriend

Yeah, their "standard library" so to speak (basically everything under the actions org) is lacking. But for this specifically, you can use the gh CLI.

Storment33

This is what I've done; GitHub Actions is basically a command line as a service for my projects. It does nothing but check out the code, which means I can do all the releasing, artefact uploading, compiling & testing etc. locally.

rsyring

After troubleshooting a couple issues with the GitHub Actions Linux admin team, and their decision to not address either issue, I'm highly skeptical of investing more in GitHub Actions:

- Ubuntu useradd command causes 30s+ hang [1]

- Ubuntu: sudo -u some-user unexpectedly ends up with environment variables for the runner [2]

1: https://github.com/actions/runner-images/issues/13048

2: https://github.com/actions/runner-images/issues/13049

Storment33

I mean...

They told you why it takes so long, no? The runners come by default with loads of programming languages installed - Rust, Haskell, Node, Python, .NET, etc. - so it sets all that up per useradd.

I would also question why you're adding users on an ephemeral runner.

Marsymars

> I would also question why you're adding users on an ephemeral runner.

We use runners for things that aren't quite "CI for software source code", which involves some "weird" stuff.

For instance, we require that new developer system setup be automated - so we have a set of scripts to do that, and a CI runner that runs on those scripts.

Storment33

Fair enough if you've some development environment automation and you want the CI to run it as well so CI is consistent with local development.

I don't know exactly what you're doing, but others (myself included) are using Mise or Nix on a per-project basis to automate development environment setup, and that works well on GitHub Actions.

But I don't think useradd taking 30s on GitHub Actions is a bug or something they need to fix; they've explained why. Unsure about the sudo issue - I did not read it carefully.

rvz

> If you’re a small team with a simple app and straightforward tests, it’s probably fine. I’m not going to tell you to rip it out.

> But if you’re running a real production system, if you have a monorepo, if your builds take more than five minutes, if you care about supply chain security, if you want to actually own your CI: look at Buildkite.

Goes in line with exactly what I said in 2020 [0] about GitHub vs self-hosting. Not a big deal for individuals, but for large businesses it's a problem if you can't push that critical change when your CI is down every week.

[0] https://news.ycombinator.com/item?id=22867803

BoorishBears

I know this is off topic, but that homepage is a piece of work: https://buildkite.com

I get it's quirky, but I'm at a low energy state and just wanted to know what it does...

Right before I churned out, I happened to click "[E] Exit to classic Buildkite" and get sent to their original homepage: https://buildkite.com/platform/

It just tells you what Buildkite does! Sure, it looks like default B2B SaaS, but more importantly it's clear. "The fastest CI platform" instead of some LinkedIn-slop manifesto.

If I want to know why it's fast, I scroll down and learn it scales to lots of build agents and has unlimited parallelism!

And if I wonder if it plays nice with my stack, I scroll and there's logos for a bunch of well known testing frameworks!

And if I want to know if this isn't v0.0001 pre-alpha software by a pre-seed company spending runway on science-fair home pages, this one has social proof that isn't buried in a pseudo-intellectual rant!

-

I went down the rabbit hole of what led to this and it's... interesting, to say the least.

https://medium.com/design-bootcamp/nothing-works-until-you-m...

https://www.reddit.com/r/branding/comments/1pi6b8g/nothing_w...

https://www.reddit.com/r/devops/comments/1petsis/comment/nsm...

mitchjj

Hello mate, Head of Brand and Design at BK here. Thanks for the feedback, genuinely; the homepage experiment has been divisive, in a great way. Some folk love it, some folk hate it, some just can't be bothered with it. All fair.

Glad that the classic site hit the mark, but there's a lot of work to do to make that clearer than it is; we're working on the next iteration, which will sunset the CLI homepage into an easter egg.

Happy to take more critique, either on the execution or the rabbit hole.

BoorishBears

Great of you to accept critiques, but I don't think there's anything more I can add.

You brought up PlanetScale's markdown homepage rework in one of those posts, and I actually think it's great... but it's also clear, direct, and has no hidden information.

I'd love to see what happens to conversions once you retire this to an Easter Egg.

jjgreen

I did a BK search earlier in the article and ended up on the same page, decided I couldn't be bothered to play those sorts of games, and clicked away. The GP's link actually looks rather interesting, so I'll investigate - so take this as a hate-it-folk vote.

sevenseacat

oh wow, that's not good.

mitchjj

Would love to hear more from you on why

isoprophlex

> GitHub Actions is not good. It’s not even fine. It has market share because it’s right there in your repo

Microsoft being Microsoft, I guess. Making computing progressively less and less delightful, because your boss sees their buggy crap is right there, so why don't you use it?

kdazzle

Pretty sure someone at MS told me that Actions was rewritten by the team who wrote Azure DevOps. So bureaucracy would be a feature.

That aside, GH Actions doesn’t seem any worse than GitLab. I forget why I stopped using CircleCI. Price maybe? I do remember liking the feature where you could enter the console of the CI job and run commands. That was awesome.

I agree though that yaml is not ideal.

olafmol

Debugging with SSH [1] is still one of our (CircleCI) most loved and praised features. I really believe that these little QoL features can make a world of difference for software developers and engineers, and this remains a strong focus for us.

1: https://circleci.com/docs/guides/execution-managed/ssh-acces...

(Disclaimer: i work at CircleCI)
