Copy Fail

Daily Digest email

Get the top HN stories in your inbox every day.

ebiggers

As someone who works on the Linux kernel's cryptography code, the regularly occurring AF_ALG exploits are really frustrating. AF_ALG, which was added to the kernel many years ago without sufficient review, should not exist. It's very complex, and it exposes a massive attack surface to unprivileged userspace programs. And it's almost completely unnecessary, as userspace already has its own cryptography code to use. The kernel's cryptography code is just for in-kernel users (for example, dm-crypt).

The algorithm being used in this exploit, "authencesn", is even an IPsec implementation detail, which never should have been exposed to userspace as a general-purpose en/decryption API.

If you're in charge of the configuration for a Linux kernel, I strongly recommend disabling all CONFIG_CRYPTO_USER_API_* kconfig options. This would have made this bug, and also every past and future AF_ALG bug, unexploitable. In the unlikely event that you find that it breaks any userspace programs on your system, please help migrate them to userspace crypto code! For some it's already been done. But in general, AF_ALG has actually never been used much in the first place, other than in exploits.

I don't think there's much other option. This sort of userspace API might have been sort of okay many years ago. But it just doesn't stand up in a world with syzbot, LLM-assisted bug discovery, etc.

still_grokking

As I did not know what AF_ALG is in the first place I've searched for it and found this here:

https://www.chronox.de/libkcapi/html/ch01s02.html

It states the following:

> There are several reasons for AF_ALG:

> * The first and most important item is the access to hardware accelerators and hardware devices whose technical interface can only be accessed from the kernel mode / supervisor state of the processor. Such support cannot be used from user space except through AF_ALG.

> * When using user space libraries, all key material and other cryptographic sensitive parameters remains in the calling application's memory even when the application supplied the information to the library. When using AF_ALG, the key material and other sensitive parameters are handed to the kernel. The calling application now can reliably erase that information from its memory and just use the cipher handle to perform the cryptographic operations. If the application is cracked an attacker cannot obtain the key material.

> * On memory constrained systems like embedded systems, the additional memory footprint of a user space cryptographic library may be too much. As the kernel requires the kernel crypto API to be present, reusing existing code should reduce the memory footprint.

I can't judge whether this is a good justification, but there is one.

p_l

AF_ALG if I remember correctly predates userspace-accessible crypto acceleration and was way more important back when it meant you had actual need for "SSL accelerator" cards in servers, among other things

ryukoposting

Hi, embedded firmware engineer here. I give it a B-

There's a weird area between the workloads that fit on a microcontroller, and the stuff that demands a full-blown CPU. Think softcore processors on FPGAs, super tiny MIPS and RISC-V cores on an ASIC, etc. Typically you run something like Yocto on a core like that. Maybe MontaVista or QNX if you've got the right nerd running the show.

So you have serious compute needs, and security concerns that justify virtual memory. But you don't have infinite space to work with, so hardware acceleration is important. Having a standard API built into the kernel seems like a decent idea I guess.

And yet, I've never heard of AF_ALG. I've never seen it used. The thing is, if you have some bizzaro softcore, there's a good chance you also have a bizzaro crypto engine with no upstream kernel driver. If you're going to the trouble of rolling your own kernel with drivers for special crypto engines, why would you bother hooking it into this thing? Roll your own API that fits your needs and doesn't have a gigantic attack surface.

buckle8017

You should take note that this is written by the person that wrote the bad patch.

So grain of salt.

still_grokking

I've said I'm not sure about the validity of that reasoning.

I've liked it nevertheless for context, as augmentation to parent's post.

asveikau

I feel like it should be possible to fulfill these advantages with a minimal, not very complex API. I.e. the grandparent's comment about IPsec implementation details doesn't make the cut, but a hardware accelerated cipher implementation does.

mihaaly

But is it true or not? Whoever wrote it. (for objective truth the subjects are unimportant)

buredoranna

Please don't rely on my judgement for this being safe for production, but after blacklisting the modules, the provided python exploit failed.

Check if the following are modules

  grep CONFIG_CRYPTO_USER_API /boot/config-$(uname -r)

If they are, you can try blacklisting them

  /etc/modprobe.d/blacklist-crypto-user-api.conf
  
  """
  blacklist af_alg
  blacklist algif_hash
  blacklist algif_skcipher
  blacklist algif_rng
  blacklist algif_aead

  install af_alg /bin/false
  install algif_hash /bin/false
  install algif_skcipher /bin/false
  install algif_rng /bin/false
  install algif_aead /bin/false
  """

  update-initramfs -u

Can anyone comment on the ramifications this?

ebiggers

If iwd, or cryptsetup with certain non-default algorithms, isn't being used on the system, you should be fine. Not many programs use AF_ALG. It's possible there are others I'm not aware of, but it's quite rare.

To be clear, general-purpose Linux distros generally can't disable these kconfig options yet, due to these cases. But there are many Linux systems that simply don't need this functionality.

A good project for someone to work on would be to fix iwd and cryptsetup to always use userspace crypto, as they should.

400thecat

is CONFIG_CRYPTO_USER_API needed for hw acceleration for cryptsetup (dm-crypt) disk encryption ?

Milpotel

  zgrep CONFIG_CRYPTO_USER_API /proc/config.gz

strenholme

I can’t comment on the ramifications, except to note that elsewhere in the thread this appears to not break anything (whether it makes userspace crypto a little less safe is academic, but that doesn’t matter if we have an easy local root shell), but I can verify the above fix does protect Ubuntu 24.04 from the exploit.

Just reboot after applying this change.

undefined

[deleted]

globular-toast

Is it built as a module in most distros?

dsr_

It is built as a module in Debian.

lsmod shows it is not loaded on any of the Trixie or Bookworm machines I have checked, Intel or AMD.

alpn

For anyone wondering: AF_ALG is a Linux socket interface that exposes the kernel’s crypto API via file descriptors, using normal read(2)/write(2) calls for hashing and encryption.

dnnddidiej

I wonder can the kernel just remove it and distros put on a compatiability layer.

TheDong

It's already a configurable option in the kernel which can be fully disabled by distros if they wanted to provide their own compatibility layer, or just not ship any software that has a hard dependency on it.

l1k

It does enable address space separation of secret keys from user space, which some people love:

https://blog.cloudflare.com/the-linux-kernel-key-retention-s...

https://www.youtube.com/watch?v=7djRRjxaCKk

https://www.youtube.com/watch?v=lvZaDE578yc

So it's not as simple as "should not exist". I agree though that there doesn't seem to be a valid need to expose authencesn to user space.

Disclosure: I'm co-maintaining crypto/asymmetric_keys/ in the kernel and the author/presenter in the first two links is another co-maintainer.

ebiggers

That can be done in userspace too -- different userspace processes have different address spaces too.

The fact that the first link recommends using keyctl() for RSA private keys is also "interesting", given that the kernel's implementation of RSA isn't hardened against timing attacks (but userspace implementations of RSA typically are).

ngomez

The CloudFlare blog discusses that idea when they talk about having an "agent process" to hold cryptographic material, but they list drawbacks like having to develop two processes, implement a well-defined interface, and enforce ACLs. I'm not convinced that "developing two processes" is a reason not to do it, since the kernel is effectively just the second process now, but everything else makes sense.

It's unfortunate though since this is one thing I think Windows does decently well. The Windows crypto and TLS APIs do use a key isolation process by default (LSASS) and have a stable interface for other processes to use it [0]. I imagine systemd could implement something similar, but I also know that there are very strong opinions about adding more surface area to systemd.

[0] https://blackhat.com/docs/us-16/materials/us-16-Kambic-Cunni...

l1k

> the kernel's implementation of RSA isn't hardened against timing attacks

Cloudflare is using custom BoringSSL-based crypto code in the kernel:

https://lore.kernel.org/all/CALrw=nEyTeP=6QcdEvaeMLZEq_pYB9W...

400thecat

can you please give me a real-life example of an application, on a typical linux laptop or typical linux server, which userspace application would use this CRYPTO_USER_API ? None that I looked at seem to use it: openssl, pgp, sha256sum

l1k

As Eric has correctly stated above, we believe iwd (Intel Wireless Daemon), or rather the ell library it relies on (Embedded Linux Library) is the only relatively widespread user space application relying on it.

XorNot

Isn't the better argument to ask whether there'd be benefit if all those things did?

A lack of adoption isn't apriori a good argument against an interface, and serious bugs can happen anywhere.

My personal opinion for a while has been that crypto operations should be in the kernel so we can end the madness that is every application shipping it's own crypto and trust system which has only gotten worse since containers were invented.

SeriousM

I was completely unaware of https://syzbot.org, thanks for sharing!

> syzbot system continuously fuzzes main Linux kernel branches and automatically reports found bugs to kernel mailing lists. syzbot dashboard shows current statuses of bugs. All syzbot-reported bugs are also CCed to syzkaller-bugs mailing list. Direct all questions to syzkaller@googlegroups.com.

wolttam

I love this. I think everyone in software should be feeling a tinge of “we should trim the fat” right now - get rid of as much of the old and infrequently used/tested code as we can. Push users towards the better tested alternatives.

zbentley

But but but … wE dOnT bREaK uSErsPaCe!

eqvinox

The primary benefit of AF_ALG is IMHO when it's combined with kernel keyrings, i.e. ALG_SET_KEY_BY_KEY_SERIAL.

To steal from the sibling post:

It's even more than this: you can do crypto ops in user space without ever even having the key to begin with.

[Ed.: that said, maybe AF_ALG should be locked behind some CAP_*]

[Ed.#2: that said^2, I'm putting this one on authencesn, not AF_ALG. It's the extended sequence number juggling that went poorly, not AF_ALG at large. I bet this might even blow up in some strange hardware scenarios, "network packet on PCIe memory" or something like that - I'm speculating, though.]

ebiggers

It doesn't seem to actually get used that way in practice. ALG_SET_KEY_BY_KEY_SERIAL didn't even appear until just a few years ago. And either way, if the interface allows you to overwrite the su binary, whether it theoretically could provide some other security benefit becomes kind of irrelevant.

eqvinox

It is being used that way:

https://github.com/opensourcerouting/frr/blob/2b48e4f97fb021...

And, sure, if it breaks system security it's pointless. But so did "dirty pipe".

I do agree the number of issues in AF_ALG is annoying, which is why I suggested a CAP_* restriction. Maybe CAP_SYS_ADMIN in init_ns, that's kinda the big hammer.

angry_octet

Better implemented as another user space process than in the kernel.

eqvinox

You can't access TPMs that way.

tosti

I think it would be reasonable to deprecate af_alg in favor of a character device. It's more accessible that way. The downside is that the maintainers hate adding new ioctls. I think that's fair. But I don't think a "regular" device node would cover the functionality userland expects.

That said, elsewhere ITT it's pointed out there are only a few use cases so far.

xeeeeeeeeeeenu

It seems there was some kind of confusion during the disclosure process, because the vendors aren't treating this vulnerability as serious and it remains unpatched in many distros.

https://access.redhat.com/security/cve/cve-2026-31431 "Moderate severity", "Fix deferred"

https://security-tracker.debian.org/tracker/CVE-2026-31431

https://ubuntu.com/security/CVE-2026-31431

https://www.suse.com/security/cve/CVE-2026-31431.html

MarleTangible

Seems like distros consider it a medium risk because it doesn't involve remote code execution and requires local access. Though it allows local root privilege escalation which is considered high priority.

https://ubuntu.com/security/cves/about#priority

> Medium: A significant problem, typically exploitable for many users. Includes network daemon denial of service, cross-site scripting, and gaining user privileges.

oskarkk

Strange that it's not classified as "high", which specifically includes "local root privilege escalations".

> High: A significant problem, typically exploitable for nearly all users in a default installation of Ubuntu. Includes serious remote denial of service, local root privilege escalations, local data theft, and data loss.

amarant

It is high now, someone at canonical is paying attention it seems

markhahn

if your model is that linux is just about single-user desktops, this local exploit isn't too bad. or if your model is nothing but DB servers or the like.

mystifying to me that shared, multi-user machines are not thought of. for instance, I administer a system with 27k users - people who can login. even if only 1/10,000 of them are curious/malicious/compromised, we (Canadian national research HPC systems) are at risk. yes, this is somewhat uncommon these days, when shell access is not the norm.

but consider the very common sort of shared hosting environment: they typically provide something like plesk to interface to shared machines with no particular isolation. can you (as a website owner or 0wner) convince wordpress/etc to drop and execute a script? yep.

CGamesPlay

> if your model is that linux is just about single-user desktops, this local exploit isn't too bad.

For example, if you have passwordless sudo, you've already got a widely known LPE vulnerability lurking on your system.

AntiUSAbah

Not to bad? So we just threat linux overall as a single user system or what?

edelbitter

Ubuntu is not really targeting multi-user any more. Security update installation is deliberately delayed for all users, until at some point all unprivileged users ended all processes launched from the vulnerable snap image. (Firefox RPC breaks when you replace the binary, so having to reopen your browser to keep opening tabs simple because security upgrades were applied in the background would be inconvenient)

dwedge

Local access is a bit of a misnomer though, a vulnerable website can be tricked into running a script

daveoc64

Ubuntu seems to have updated the page to say that it's a high priority now.

undefined

[deleted]

mghackerlady

it's not like this couldn't be chained with some other exploit to get remote access to get remote root access which seems like a bit of an issue

undefined

[deleted]

staticassertion

It was already known to attackers (or basically anyone watching) weeks ago when the patch hit the kernel but it wasn't communicated by upstream as a vuln (because Linus and Greg do not believe that vulnerabilities are conceptually relevant to the kernel).

still_grokking

Will this continue like that even when the prophesied Mythos Vulnocalypse hits the Kernel?

This stance doesn't seem sustainable any more to me.

staticassertion

The response from Greg was that Mythos proved that upstream was right all along and that they'll continue to do things the same way. That's my recollection, at least - pretty sure it was something like that, could have been even worse though and I'm misremembering.

The stance was never sustainable, hence linux LPEs being constantly available. The solution is to treat your kernel as impossible to secure. Notably, gvisor users are not impacted by this CVE. Seccomp also kills this CVE.

stefanor

As far as we can tell, nobody disclosed it to the distributions, only to the kernel security team (who did not reach out to distributions). So the distributions are all scrambling now.

Good lesson in how not to do disclosure.

baggy_trough

Why wouldn't the kernel security team reach out to distributions?

stefanor

The Linux project's view is that almost all kernel bugs are security vulnerabilities. They don't treat something like this as anything special.

I can understand that PoV, but it doesn't fit with distributions' approach to security. So, in practice, one has to reach out to distributions individually, or use distros lists on openwall.org to coordinate with all distros.

wangman

RedHat has also changed it to "Important severity" and "Affected" now.

DooMMasteR

Yeah, it was also staged for release on the affected kernel branches a while ago, but almost all still had the window open and only tonight got the merged across all maintained kernel versions.

It's not good... and surely not "responsible/planned" disclosure.

AntiUSAbah

I'm schocked that ubuntu is aware of this and the prv lts is not patched yet :|

wtf

Tuna-Fish

Yeah, by ubuntu's own guidelines linked on that page, this should be priority: high, but instead it's marked as medium.

no-name-here

That was fixed, it’s now marked high.

undefined

[deleted]

Neil44

I thought that. surely people are going crazy right now owning anything with an our of date Wordpress exposed.

jeffwass

This submission is currently the main HN submission.

As of now the submission title is simply “Copy Fail”.

Given the severity of the exploit, can we edit the Title to add some context that it’s a major Linux vulnerability?

Eg the other submissions say this : “Copy Fail: 732 Bytes to Root on Every Major Linux Distribution.”

ramon156

I dont really get why you'd

- buy a domain

- vibe code a page/artifact/whatever (which, given the quality of LLM wordings, only makes an argument less strong)

- post it on HN with no further explanation in the title

Why not write a detailed report? Even a tweet makes much more sense in my head than this. Even a logo??

Sorry if this comes over as salty, I guess I'm just not getting the thought process.

petcat

> I dont really get why you'd buy a domain [...] Even a tweet makes much more sense in my head than this

I think we should be celebrating people hosting their own content on their own website instead of just posting on some social media site.

stingraycharles

I think they’re using it to promote their product, Xint Code, which was used to discover it. That’s the way I read it anyway.

otherme123

I hope they sell a lot of Xint Code licenses, so they don't have to sell their findings.

eddythompson80

Maybe it’s tradition https://news.ycombinator.com/item?id=7548991

staticassertion

Where would you have them write a detailed report if not a website?

throwaway5465

The domain is canonical.

Then it's syndicate everywhere.

But all roads lead back to the domain.

vntok

Definitely comes over as salty. Naming major flaws has been a tradition for decades. Remember Heartbleed? It had a site and a logo :) Shellshock, Meltdown, Spectre as well. A few more: https://github.com/hannob/vulns

This site though is pretty useful; first it serves as a central location to point people to with short links in chats/emails/whatever, then it has a quick visual explainer and a link to the detailed technical report for those who want more info. Pretty neat.

Last but not least, buying the domain must have taken 5 minutes, prompting the page must have taken 30 minutes and posting it on HN must have taken 1 minute. So it certainly wasn't a lot of work in the grand scheme of things and probably did not deter the team from doing other important things.

Orygin

It used to be done for fame and visibility. Give a marketable name and a website, your exploit will be talked about and your name will shine in the industry.

Now it's done by an LLM to sell more LLMs services. Disclosure is botched to have the most sensational title so more click more upsell.

huflungdung

[dead]

jcul

Yes, strongly agree.

This is HUGE news, I would have skimmed over "Copy Fail".

The blog post might be a better place to link to also, it has more details on the exploit.

https://xint.io/blog/copy-fail-linux-distributions

There are also some good threads on which distros are vulnerable and mitigations on the github page.

https://github.com/theori-io/copy-fail-CVE-2026-31431/issues

arcfour

It's unfortunate that this does not include which versions of the kernel are vulnerable/patched, especially since this is a builtin module which cannot be easily removed with rmmod...

I was wondering if I was vulnerable running Fedora 44, kernel 6.19.14, and after a few minutes of digging I was able to find the linux-cve-announce mailing list post: https://lore.kernel.org/linux-cve-announce/2026042214-CVE-20... which says:

  ...fixed in 6.18.22 with commit fafe0fa2995a0f7073c1c358d7d3145bcc9aedd8

  ...fixed in 6.19.12 with commit ce42ee423e58dffa5ec03524054c9d8bfd4f6237

  ...fixed in 7.0 with commit a664bf3d603dc3bdcf9ae47cc21e0daec706d7a5

Hope that helps.

hnarn

most distros backport fixes which does not increment that version number. i.e. they patch it, they do not ship a completely new kernel release.

noisy_boy

Thanks for this - I was wondering why I got the password prompt on my Fedora 43 with latest packages.

baggy_trough

Greg KH says more backports coming soon.

https://openwall.com/lists/oss-security/2026/04/30/12

nh2

If you want to use the suggested mitigation (disabling kernel module `algif_aead` with a modprobe config), and you do not want to run that whole obfuscated shell code to get an actual root shell, but only check if the module can be loaded, here is a readable version of its first few lines:

    python3 -c 'import socket; s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0); s.bind(("aead","authencesn(hmac(sha256),cbc(aes))")); print("algif_aead probably successfully loaded, mitigation not effective; remove again with: rmmod algif_aead")'

Similarly, when the mitigation is in place,

    modprobe algif_aead

should fail with an error.

archon810

    modprobe algif_aead
    modprobe: FATAL: Module algif_aead not found in directory /lib/modules/6.14.3-x86_64-linode168

Yet this kernel is vulnerable.

Sophira

That would suggest that CRYPTO_USER_API_AEAD=y in your kernel config. You can disable it in that case by setting that to "n", recompiling your kernel, and putting the new kernel in place.

nh2

Indeed, no modprobe.d will help when the feature is compiled into the kernel ("=y") instead of compiled into a runtime-loadable module.

hackernudes

LPE = local privilege escalation

Too many darn acronyms. This one wasn't too hard to figure out from context but I wish people would define acronyms before using them!

arcfour

LPE is a very well-known acronym within the security community, it's not purely academic or obscure or anything.

I agree that it would be a good idea to define it explicitly when writing for a broader audience, but I don't think it's particularly egregious that they didn't. It's certainly something I could see myself forgetting.

Then again, the whole writeup appears to be AI-generated, so...

1970-01-01

It is nowhere near this. There are very few acronyms in the IT world that are actually well-known outside of it. LPE is less well-known than LVAD or MCU.

https://www.acronymfinder.com/Information-Technology/MCU.htm...

https://www.acronymfinder.com/LVAD.html

https://www.acronymfinder.com/Information-Technology/LPE.htm...

dataflow

> LPE is less well-known than LVAD or MCU.

I knew what LPE stands for but not the others. (I've seen MCU mentioned and kinda had a vague feeling for what it is. Never even seen LVAD.)

globular-toast

Sure, but the target audience of copy.fail is surely not the security community but regular sysadmins who probably don't otherwise follow as closely.

a96

I would absolutely expect a sysadmin in particular to know and understand the term and acronym.

jjordan

Good writing for a broad audience requires it. Unfortunately the LLMs don't tend to adopt this guideline.

boston_clone

it’s a CVE write up; the audience for these knows what an LPE is.

acdha

That’s very optimistic. I’d bet there are an order of magnitude more people wondering how exposed they are than security researchers reading this.

hackernudes

I've read many CVEs (somehow that acronym is ok... heh) but have never seen LPE despite being familiar with the concept.

no-name-here

Content at the OP link http://copy.fail seems fairly different from any normal CVE I’ve seen.

ButlerianJihad

To be fair, I just consulted 3 cybersecurity glossaries (SANS.org, NIST CSRC, Huntress), and none of them list "LPE" nor "Local Privilege Escalation".

If you type "LPE" into English Wikipedia's search bar, and press "Enter", you'll be sent to a disambiguation page which contains a link to the relevant article.

https://en.wikipedia.org/wiki/LPE

1970-01-01

I don't know why, but newer writers have never been taught to expand their acronyms on first use. I blame the US education system.

0xmagic0

0xdf made a nice breakdown video: https://www.youtube.com/watch?v=wQ914geKOcw

jesse_dot_id

Good thing nobody is silly enough to let fully autonomous AI agents run as regular users on these affected operating systems. That could be disastrous given a zero day prompt injection technique.

chromacity

I don't see what the issue is, my agent is already running as root.

dnnddidiej

Yeah it has all the government logins and full gmail access. It will be too busy to bother rooting the local machine!

latentsea

Shouldn't be a problem, we're currently clean on OpSec.

pixel_popping

As it should for full yolo_O

ryandrake

Good thing we haven't normalized installing things with curl | sh

still_grokking

Yeah, that's great!

Imagine we would download random code from the internet and just execute it, like with NPM, PIP, Maven, Cargo etc.

om8

cargo/uv/go have lock files though

Semaphor

I don’t think that matters as it’s usually curl | sudo sh

dawnerd

Or npm being allowed to run arbitrary post install scripts

FlyThruTheSun

I literally ship an installer that runs with curl | bash... reading this thread while patching my servers is a fun experience lol

sieabahlpark

[dead]

phreack

The page itself seems vibecoded and a bit of an advertisement, but it does look like the vulnerability is real and high risk. It does explain the big security update I just got, guess I'll prioritize updating today.

2001zhaozhao

This is pretty obviously an advertisement but it's a pretty good advertisement imo, it pairs a meaningful contribution to the OSS ecosystem (discovering and patching a real bug) with selling your cybersecurity tool at the same time.

Orygin

The incentive previously was having more secure software making a name for yourself. The incentive now is finding the most noisy vulnerability so you can push FUD to sell your AI software.

AntiUSAbah

With vibe coding, html is a visualiation tool. not sure if i get your problem with that?

angry_octet

These guys don't need to advertise, they are already 100% busy with work. But who wastes their time manually creating web pages? Especially kernel devs.

tkgally

Side comment: I have recently used Claude Code to make a few sites for testing purposes. In the prompt I added "don't make it look vibe coded," and it worked pretty well: No purple gradients, bento box layouts, etc. Nothing spectacularly original, either, but probably enough to avoid accusations of vibe coding.

x4132

it's advertising their AI, not the talents of their humans :D

angry_octet

People are confusing the presentation layer with the content, just a surface layer analysis. Basically people are feeling so burnt by reading AI fluff that they make a rushed judgement.

m3nu

I wasn't able to unload algif_aead on RHEL 9/10 because it's built in, rather than a module.

So here the next-best thing I found: Disable AF_ALG via systemd. Needs drop-ins for all exposed services. Here an Ansible playbook that covers ssdh and user@, which are the main ones usually.

https://gist.github.com/m3nu/c19269ef4fd6fa53b03eb388f77464d...

byron3256

How about blacklisting algif_aead initialization function on RHEL 9/10? I added "initcall_blacklist=algif_aead_init" to the kernel boot options and rebooted. The exploit is not working anymore.

m3nu

Good idea. Added to the playbook for RHEL only.

On Debian normal unloading of the module works.

pkoiralap

I was coming up with the same intuition. However, it's like a whack-a-mole. What about cronjobs and slurmjobs and other services? Is there a way to do this directly on systemd so that all other processes inherit it rather than doing it on each one?

chucky_z

https://www.freedesktop.org/software/systemd/man/latest/syst...

`/etc/systemd/system/service.d/${...}.conf`

I think this is what you're looking for.

undefined

[deleted]

yrro

FYI RHEL's SELinux policy blocks AF_ALG socket creation for confined services out of the box. But disabling via RestrictAddressFamilies= unit option, or initcall_blacklist= kernel parameter, seems to be a good mitigation for unconfined services, users and containers.

progval

So this replaces a SUID binary, in order to run as PID 0. The website claims it can escape "Kubernetes / container clusters" and "CI runners & build farms" but I don't see anything supporting the claim it can escape a container (or specifically, a user namespace).

I ran the exploit in rootless Podman, and predictably it doesn't escape the container.

They also claim their script "roots every Linux distribution shipped since 2017.", but only tested four; and it doesn't work on Alpine

john_strinlai

>The website claims it can escape "Kubernetes / container clusters" and "CI runners & build farms" but I don't see anything supporting the claim it can escape a container

they state that the write-up is forthcoming. presumably there is some additional steps or modifications that will be detailed in the 'part 2'.

"Next: "From Pod to Host," how Copy Fail escapes every major cloud Kubernetes platform."

tjbecker

This is correct. The container escape exploit and writeup is not yet released.

dnnddidiej

Opus 4.7 it if you can't wait

Twirrim

> They also claim their script "roots every Linux distribution shipped since 2017.", but only tested four; and it doesn't work on Alpine

They've done themselves no favours at all with their write up.

It does seem legitimate (I was able to use the PoC on a 24.04 instance), and seems like it should be a big deal, but the actual number of affected distributions seems way lower, and not even remotely as per their claim every distribution since 2017.

For example with Ubuntu, if I'm reading it right there's some impact in 16.04 (EOL), but then at least as per their analysis, only the vendor specific 6.17 kernels they ship that have it (e.g. linux-gcp, linux-oracle-6.7 etc.). That's a relatively new kernel version they started shipping recently, after it was released upstream last September.

x4132

i mean, it doesn't work on any SELinux, but it's still quite severe anyhow

yrro

Have you got any info about this. 'seinfo -c' shows there is an alg_socket class. I presume this permission is required to be able to create an AF_ALG socket:

    $ sesearch -A -c alg_socket -p createallow bluetooth_t bluetooth_t:alg_socket { accept append bind connect create getattr getopt ioctl listen lock read setattr setopt shutdown write };
    allow container_device_plugin_init_t container_device_plugin_init_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_device_plugin_t container_device_plugin_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_device_t container_device_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_engine_t container_engine_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_init_t container_init_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_kvm_t container_kvm_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_logreader_t container_logreader_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_logwriter_t container_logwriter_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_t container_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_userns_t container_userns_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow openshift_app_t openshift_app_t:alg_socket { append bind connect create getattr getopt ioctl lock read setattr setopt shutdown write };
    allow openshift_t openshift_t:alg_socket { append bind connect create getattr getopt ioctl lock read setattr setopt shutdown write };
    allow spc_t unlabeled_t:alg_socket { append bind connect create getattr getopt ioctl lock read setattr setopt shutdown write };
    allow staff_t staff_t:alg_socket { append bind connect create getopt ioctl lock read setattr setopt shutdown write };
    allow sysadm_t sysadm_t:alg_socket { accept append bind connect create getopt ioctl listen lock read setattr setopt shutdown write };
    allow unconfined_domain_type domain:alg_socket { accept append bind connect create getattr getopt ioctl listen lock map name_bind read recv_msg recvfrom relabelfrom relabelto send_msg sendto setattr setopt shutdown write };
    allow user_t user_t:alg_socket { append bind connect create getopt ioctl lock read setattr setopt shutdown write };

... that's a lot of domains, including container_t and user_t; and obviously anything unconfined_t can't be expected to be restricted.

(Maybe you & others are specifically thinking of Android's policy?)

layer8

The 2017 claim is based on the vulnerability having been introduced in this commit in the second half of 2017: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

The details will depend on whether the kernel is a newer release or a maintenance version of an older release.

tardedmeme

It overwrites bytes in memory of any file you can read. It's not hard to imagine how it could escape a lot of things.

rcxdude

If you can get to real UID 0 from a rootless container, you can escape it, but you do need to take extra steps. Same with it working on Alpine: the underlying vulnerability probably still exists, but the script might need some adjusting. It's a PoC, not a full exploit for every situation.

CGamesPlay

It's worth pointing out that you cannot, definitionally, get "real UID 0" in a "rootless" container, because then it wouldn't be a rootless container. This is relevant because this exploit doesn't claim to be able to bypass user namespaces, and that getting "real UID 0" would be a different exploit.

rcxdude

The underlying exploit allows writing arbitrary values to the page cache, independent of any namespacing, so it should be assumed to allow container escapes even if the given PoC code doesn't do that.

undefined

[deleted]

amusingimpala75

Their PoC does as you say, but is built upon arbitrary modification of the page cache, which could be abused for the other things

progval

Ah indeed, it can be used to overwrite the page cache for files on read-only volumes.

CGamesPlay

Kubernetes 1.33 switches to user namespaces enabled by default, which I imagine is the same underlying mechanism that rootless Podman uses. `hostUsers: false` is the way to ensure that root in the pod is root on the host. It's trivial for a real (unmapped) root to escape a Kubernetes pod.

embedding-shape

Did you try it on systems that don't have the patch already? Seems many distributions already shipped kernels with the patch ~a month ago.

progval

Yes. Alpine in rootless Podman doesn't work (after replacing "/usr/bin/su" with "/bin/su" in the .py, running the .py just doesn't do anything) while it does in Debian in rootless Podman on the same host.

embedding-shape

For mitigation, the page currently basically just says:

> Update your distribution's kernel package to one that includes mainline commit a664bf3d603d

But it isn't very clear to me what Kernel version you can expect that to be in. For Arch/CachyOS, the patch seems to be included in 6.18.22+, 6.19.12+ and 7.0+. If you're on any of the lower versions in the same upstream stable series, you're likely vulnerable right now. Some distro kernels may include the fix in other versions, so check for your distribution.

nh2

On a git repo that has as remotes

    https://github.com/torvalds/linux.git
    https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git as remotes:

running a search for commit a664bf3d603d's commit message:

    git log --all --grep 'crypto: algif_aead - Revert to operating out-of-place' '--format=%H' | xargs -I '{}' git tag --contains '{}' | sort -u

outputs these tags as having the fix:

    v6.18.22
    v6.18.23
    v6.18.24
    v6.18.25
    v6.19.12
    v6.19.13
    v6.19.14
    v7.0
    v7.0.1
    v7.0.2
    v7.0-rc7
    v7.1-rc1

bombcar

Here's the diff if you wanna play in your source (Gentoo, looking at you):

https://github.com/torvalds/linux/commit/a664bf3d603d

6.18.25-gentoo-x86_64 has the patch for Gentoo.

zepearl

Thanks a lot!!!

I was running in Gentoo "6.18.18" (amd64) and the exploit worked (and all other shells which I PREVIOUSLY opened could then just execute "su -" without password to become "root") -> doing temporarily a "modprobe -r algif_aead" on-the-fly did not fix it as I was still able to swap to "root" from the unprivileged user by executing just "su -".

"6.18.25" fixed it (module "algif_aead" still running).

- Maybe older Kernel versions that don't contain the fix should be blacklisted?

- FYI in Gentoo I had to recompile "sys-fs/zfs-kmod" after the minor kernel upgrade (I initially skipped it, but after rebooting with the new kernel I could not mount my raidz1) -> the same might be needed for other external modules.

rcxdude

distros might also apply patches to their own packages, so this isn't a perfect signal (i.e. if you have one of those versions, you almost certainly have the fix, but if you don't, it might still be fixed but you'll need to check the distro's package information to know for sure).

kro

Major os vendors will publish pages with the fixed versions:

https://security-tracker.debian.org/tracker/CVE-2026-31431

https://ubuntu.com/security/CVE-2026-31431

Also, disabling algif_aead is suggested as mitigation

1p09gj20g8h

Where are you seeing the disabling algif_aead mitigation?

oskarkk

In TFA: https://copy.fail/#mitigation

> Before you can patch: disable the algif_aead module.

> echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf

> rmmod algif_aead 2>/dev/null || true

Edit: and I can confirm that on my system with kernel 6.19.8 the above fixes the exploit.

Daily Digest email

Get the top HN stories in your inbox every day.