Hey it's totally cool that #Microsoft #GitHub blocked access to one of the repositories in the very center of the #xz backdoor saga.
It's not like a bunch of people are scrambling to try and make sense of all this right now, or that specific commits got linked to directly from media and blogposts and the like.
Cool, cool.
@rysiek So hang on, that means that code's in copilot.
But what does that even mean?
@onepict hahaha ooooh boy.
Wait, we could totally take a page from AI Hype Brigade's book and publish breathless pieces like:
All Copilot Co-generated Code Might Now Be Backdoored
Very tempting.
@onepict@chaos.social @rysiek@mstdn.social To get the full exploit, it would have to extract the release tarballs, then extract a couple of random files in the test directory, then feed those into the LLM.
One of the key pieces of the exploit seems to be some intentionally botched checking for whether a certain type of sandboxing is available, so that might be affected.
@onepict @rysiek I thought that too - it’s not.
The exploit mechanism relied upon an unannounced binary being posted as part of the tarball that’s not supplied when you get the xz utility from GitHub.
The only part of the exploit chain reflected on GitHub is the build script that, when it trips over certain conditions, causes the binary file in the tarball to be unpacked and injected into the build process, installing the actual backdoor at build time.
@onepict @rysiek As far as I understood this LLM stuff it means that the code was tokenized into their database. That is, they treat it as text (not code).
The weaponising would need to pull it out of their database via the appropriate prompts. But even then it would not be executed, only shown as a response.
Only after the person in front of the computer includes it and runs it would it potentially compromise the machine.
At that point I would expect the code to reach out to the xz on the system, which either was never vulnerable (because the system is not running a testing or rolling release and therefore never saw it) or has been updated to a safe version by now.
The window of opportunity is about a month. If the backdoor wasn't installed during that time I would say it's not dangerous anymore.
I'm not a security researcher. No warranties on following this piece of opinion.
Doesn't surprise me, like the times when I've had a random LinkedIn invite to connect and it's a total stranger and, yes, ex-military.
@rysiek
It was a mistake for people to keep their repos on GitHub after MS bought it.
@rysiek it's almost like Microsoft has experience hiding evidence or something
Seems reasonable that they would block it. Sources can be made available to people looking into it, assuming they don't already have a local clone.
But leaving it available to the public would allow copycats or other bad actors to study the code as well.
@eric that's bull, sorry.
First of all, it's available in hundreds of other places, the cat is way out of the bag here.
But secondly, for figuring out what had happened an *authoritative* source repository is crucial. That GitHub repository is the authoritative one for this project. That's why everybody's been linking to it in the first place.
Now these links are gone, and the situation is ripe for someone to create a faux copy maliciously and try to trick people into analyzing that instead.
@eric it should have been "preserved in glass" instead, so to speak. Switched to strictly read-only, but publicly available.
Microsoft GitHub is actively interfering with the ability of researchers to figure this one out, and with the ability of the broader FLOSS community to learn from this.
Of course, I would have expected nothing less of them.
"for figuring out what had happened an authoritative source repository is crucial. That GitHub repository is the authoritative one for this project"
This makes sense, but there are ways to make the source available to those investigating without making it publicly available.
And if the repo required signed commits, wouldn't that still be preserved and auditable in clones of the repo?
I'm no fan of GitHub/Microsoft, but you can't force someone to publicly host malicious code. If they had left it up, I'm sure there would be criticism of them for still making it available.
> This makes sense, but there are ways to make the source available to those investigating without making it publicly available.
Again, what's the problem with making it publicly available?
Why force researchers to jump through additional hoops in what is already a difficult situation?
Why add the complexity of explaining to readers, policymakers, and management that yes, it's the same repo, and no, I did not make it up, because the commits are signed?
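On the question of whether a clone stays auditable, here is a rough sketch of why it can be, at least in principle. Git names every object by a hash of its content, and a commit signature covers the commit object, which in turn points at tree and file hashes. The snippet below only illustrates the file (blob) level, and it assumes a repository using git's default SHA-1 object format. The sample byte strings and the "mirror" variable names are made up for illustration; this is not a procedure anyone in this thread has described.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID git gives a file with this content.

    Assumes the default SHA-1 object format; a repository created with
    the SHA-256 object format would produce different IDs.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical example data: the same file fetched from two mirrors,
# plus a copy that someone altered by a single byte.
copy_from_mirror_a = b"hello\n"
copy_from_mirror_b = b"hello\n"
tampered_copy = b"hello!\n"

# Identical bytes give identical IDs, wherever the copy is hosted.
print(git_blob_id(copy_from_mirror_a) == git_blob_id(copy_from_mirror_b))  # True

# Any change to the bytes changes the ID.
print(git_blob_id(copy_from_mirror_a) == git_blob_id(tampered_copy))  # False
```

So anyone who noted a commit hash from the original links could check a mirror against it. That still leaves the extra step of explaining to a non-technical audience why the hashes should be trusted, which is the complaint above.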
Idk. My initial thought when I saw the repo was taken down was, "That makes sense. They don't want to host malicious code." And I really wasn't expecting my comment to be that much of a hot take.
I would imagine GitHub has their own security/legal/policy teams that are calling the shots on this, so it's probably not as simple as turning access back on in ro mode.
I suppose that I'm still of the opinion that you can't force a person or company to publicly host content they don't want to, and they're not responsible for how other people link to their site. (To be clear again, I'm not a fan of GitHub. It's more just a general statement.)
But I'm not an infosec person, and none of my systems are affected, so I really don't have a horse in this race.
Besides, it's Saturday, it's nice out, the little one is with the grandparents, and I've committed myself to cleaning up the garage.
I'll just duck out of this one.
@eric fair enough. Have a good weekend and enjoy your time with your family.
@rysiek @eric There is a reliable archival copy at https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/tukaani-project/xz&snapshot=bcdaf33e1b3864c1c5f52dca8389a8f68d679e03
Still, the same sorts of concerns apply to the availability of meta discussion on GitHub for analysis. The release tarballs seem especially crucial, though I expect these also are archived e.g. by Debian.
"the same sorts of concerns apply to the availability of meta discussion on GitHub for analysis"
Yeah, comments on issues/PRs would still need to be made available to those investigating, and those wouldn't be available in the git logs.
@LiberalArtist @rysiek @eric That doesn't help with existing links that are now broken by GitHub's actions. Besides that, the Software Heritage Archive has a lot of problems (e.g.: https://cohost.org/arborelia/post/5169338-the-software-heritag) that make it a bit suboptimal to rely on.