August 9, 2024
XZ Backdoor - A New Wave of Socio-Software Pathologies
On March 29, 2024, Andres Freund, a principal software engineer at Microsoft, noticed an unusual amount of CPU utilization while conducting a performance analysis of PostgreSQL. After deeper investigation, Andres identified what is now believed to be one of the most sophisticated, deliberate, and time-capsuled exploits in the history of open source software (OSS). This vulnerability is likely to permanently change the way open source software is developed as well as the way we detect such vulnerabilities. Moreover, we believe it belongs to a new class of vulnerabilities, which we refer to as socio-software pathologies because they rely equally on software and society to manifest. We anticipate this is the first of many.
This exploit would later be named CVE-2024-3094. We refer to it as XZ-backdoor or simply XZB. XZB appears to have been injected into the xz/liblzma repository on March 9, 2024. Andres discovered it only a few weeks later. If he hadn't, XZB could have become one of the most destructive vulnerabilities ever seen. This is because it had the potential to affect the entire Linux ecosystem. Moreover, it was essentially undetectable from a static code analysis perspective. This is due to its fragmented binary embedding inside a cryptographic file, where it is only ever fully assembled at the injection site prior to being linked into the liblzma library.
Unlike other widespread vulnerabilities, such as Log4Shell in Log4j (CVE-2021-44228) or OpenSSL Heartbleed (CVE-2014-0160), the XZ-backdoor is now believed to have been intentional, not accidental, in nature. Moreover, the bad actor (known as “Jia Tan,” which is likely a pseudonym) spent over two years building trust with Lasse Collin, the primary XZ maintainer, and the larger Linux community. It is believed that Jia may have also had assistance from a team of bad actors. This team appears to have worked in concert to create social tension on XZ’s mailing list by repeatedly harassing Lasse Collin and asking that he be replaced as the maintainer. Some believe this was done to mentally destabilize Lasse. Others believe it was to diminish his credibility as the maintainer of XZ. Whatever the case, these posts were a harbinger of Jia Tan’s escalation into the inner circle of XZ, which eventually provided him with nearly unfettered access to and control of the repository.
In this article, we briefly discuss (i) how XZB happened, (ii) why traditional code analysis tools cannot detect these sociological exploits, (iii) why we believe socio-software pathologies, like this one, are likely to be the next wave of common vulnerabilities and exposures (CVEs), and (iv) what the software development community can do to proactively defend against them. Minimally, we recommend the OSS community begin using analysis tools that include socio-software pathology detection, such as Merly Mentor. While these tools may not be able to precisely detect CVEs in code, they can provide a sociological confidence signal based on code-contributor behaviors. Such signals may serve as early indicators of future socio-software exploits. (Merly Mentor is free of cost for all open source developers and projects.)
The Heroes of Our Story: Lasse Collin and Andres Freund
Before exploring the details of XZB, we believe it’s important to remind the community that Andres Freund and, to a larger extent, Lasse Collin are two heroes of this story. According to our data, using Mentor’s lifetime contributor analysis, we’ve observed that Lasse Collin has worked regularly, and oftentimes in complete isolation, on the XZ project since 2007 (see Figure 1). He appears to have successfully maintained the XZ repository for 15 years without incident, until the recent XZB exploit led by Jia Tan. Thank you, Lasse, for your dedication to open source software and your commitment to the XZ project.
Figure 1: XZ Lifetime Contributor Overview (Actual Merly Mentor Screenshot)
In addition, had Andres Freund not deeply investigated the performance spikes he was observing in SSH, leading to his discovery of XZB, it’s unclear how much more destructive XZB could have been. It is believed that it could have easily infected millions of devices or more. Thank you, Andres.
XZ-Backdoor: How It Happened
According to multiple security reports, the XZ-backdoor appears to be both technological and sociological in nature. We discuss both aspects in this section. The data suggest that the effort behind XZB began at least in 2021 (perhaps earlier). We summarize some of the key events below and highlight key socio-software abnormalities.
(Note: we are aware of many additional details of XZB. Many of those details have been omitted here for one of two reasons: (i) brevity and (ii) the possibility that some events were accidental and not directly the work of the XZB bad actors. Our aim is to minimize the spread of misinformation and to avoid misplacing blame on those who may have been involved in XZB only tangentially, possibly by chance.)
Repository Events:
2021-11: GitHub account JiaT75 is created. This is the account that is eventually used to inject the XZ-backdoor in March 2024.
2022-02-06: JiaT75 makes first commit to the XZ repository (first socio-software abnormality).
This event is sociologically unusual for developers, but why?
Most developers who work in open source would have had a GitHub account for many years prior to contributing to a production-quality repository like XZ. This is because an account is generally needed for even basic collaborative operations on the platform, such as forking a repository or opening a pull request. A user’s first such interaction would usually happen many years before they were sophisticated enough with Git, and knew enough as a software developer, to commit changes to a production-grade open source project like XZ.
Instead, Jia Tan’s first commit happens only three months (90 days) after his GitHub account was created. This is a pathological behavior from a socio-software perspective.
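The account-age signal described above can be sketched in a few lines of Python. This is a minimal illustration, not Merly Mentor's implementation: the function names and the 365-day threshold are assumptions made for the example, and because the timeline gives only the month of account creation (2021-11), the day used below is a placeholder.

```python
from datetime import date

# Hypothetical threshold: how old an account is expected to be before its first
# commit to a mature project. This value is an assumption, not a published rule.
MIN_ACCOUNT_AGE_DAYS = 365


def account_age_at_first_commit(account_created: date, first_commit: date) -> int:
    """Return the account's age, in days, on the day of its first commit."""
    return (first_commit - account_created).days


def is_young_account(account_created: date, first_commit: date,
                     min_age_days: int = MIN_ACCOUNT_AGE_DAYS) -> bool:
    """Flag a first commit made by an account younger than the threshold."""
    return account_age_at_first_commit(account_created, first_commit) < min_age_days


# The day of account creation is a placeholder (only the month is known);
# the first-commit date is taken from the timeline above.
created = date(2021, 11, 1)
first_commit = date(2022, 2, 6)

print(account_age_at_first_commit(created, first_commit), "days")
print("flagged:", is_young_account(created, first_commit))
```

A signal like this is weak on its own, since plenty of legitimate contributors open new accounts. It is more useful when combined with the repository-specific signals discussed next.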
2022-12 – 2024-02: JiaT75 makes many commits to XZ (second socio-software abnormality).
By February 2023, one year after Jia Tan’s first commit to XZ, Jia begins making more than 50% of the repository’s commits per quarter (Figure 2). This is a second socio-software anomaly. However, unlike the first case, this pathology is context specific. Said another way: this event is only sociologically anomalous within the context of XZ. In other repositories, this behavior may be normal or less unusual. For XZ, though, this is a red alert. This is because in the 15 years prior, XZ was principally maintained by one person: Lasse Collin.
As shown in the historical repository trends in Figure 1, Lasse Collin is the primary contributor to XZ. We analyzed the commits across XZ’s entire 17-year window. From 2007 through 2021, no contributor other than Lasse Collin ever made more than ~20% of the total commits in any 3-month period (quarter). Yet, in February 2023 this changed (Figure 2).
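To make the quarterly-share measurement concrete, here is a small Python sketch. It assumes calendar quarters and a 50% threshold, and it runs on synthetic commit records (the author labels and counts are invented for illustration, not taken from the XZ history). The function names are hypothetical and do not reflect how Mentor computes its figures.

```python
from collections import Counter, defaultdict
from datetime import date


def quarter_of(day: date) -> tuple[int, int]:
    """Map a date to its (year, calendar quarter)."""
    return (day.year, (day.month - 1) // 3 + 1)


def quarterly_shares(commits):
    """commits: iterable of (author, date). Return {quarter: {author: share}}."""
    per_quarter = defaultdict(Counter)
    for author, day in commits:
        per_quarter[quarter_of(day)][author] += 1
    shares = {}
    for quarter, counts in per_quarter.items():
        total = sum(counts.values())
        shares[quarter] = {author: n / total for author, n in counts.items()}
    return shares


def flag_dominant_contributors(commits, maintainer, threshold=0.5):
    """Return sorted (quarter, author, share) for non-maintainers above threshold."""
    flags = []
    for quarter, by_author in quarterly_shares(commits).items():
        for author, share in by_author.items():
            if author != maintainer and share > threshold:
                flags.append((quarter, author, share))
    return sorted(flags)


# Synthetic data: the newcomer goes from 1 of 10 commits to 8 of 10 commits.
commits = (
    [("maintainer", date(2022, 8, 1))] * 9 + [("newcomer", date(2022, 8, 15))] * 1
    + [("maintainer", date(2023, 2, 1))] * 2 + [("newcomer", date(2023, 2, 10))] * 8
)
print(flag_dominant_contributors(commits, maintainer="maintainer"))
```

The threshold is repository-specific by design: a 50% share from a second contributor is unremarkable in many projects, but it stands out in a repository maintained almost entirely by one person.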
Given the history of XZ and its repository commit trends, shown in their entirety in Figure 1, we can visually observe two critical phenomena. Phenomenon 1: the number of commits starting in November-December 2022 exceeds the repository’s previous lifetime maximum. Phenomenon 2: around February-March 2024, the total number of commits is more than double what it had been for any similar period in the nearly 20-year history of the repository. While these commit counts alone might not be unusual in the broader context of all repositories, given the specific lifetime context of XZ, these social behaviors are highly suspicious. When we then consider that the commits were largely due to Jia Tan, not Lasse Collin, and couple this with Jia's GitHub account having existed for only 1.5 years, this is a clear social anomaly. The increasing percentage of commits in XZ by Jia Tan over the one-year period from June 2022 to June 2023 is shown in Figure 2.
Figure 2: Merly Mentor screenshots of XZ repository commits from 2022-06 through 2023-06. Jia Tan increases percentage of commits from ~10% to ~85% over the course of 12 months. According to our analysis, in the 17-year history of the XZ repository, no contributor other than Lasse Collin ever made more than 25% of the total quarterly commits. This changed for the first time in February 2023 (upper right quadrant), when another contributor (Jia Tan) commits over 50% of the changes to XZ; a sociological anomaly – the first time it had ever happened in XZ.
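The two volume observations (a period exceeding the repository's previous lifetime maximum, and a period more than doubling it) can be expressed as one check against the repository's own history. The sketch below is an assumption-laden illustration: the function name, the `min_history` warm-up parameter, and the commit counts are all invented for the example and are not XZ's actual figures.

```python
def volume_anomalies(period_counts, factor=1.0, min_history=3):
    """Flag periods whose commit count exceeds factor * the prior maximum.

    period_counts: list of (label, commit_count) in chronological order.
    min_history: number of earlier periods required before anything is flagged,
                 so that the first few periods of a young repository are ignored.
    """
    flagged = []
    seen = []
    for label, count in period_counts:
        if len(seen) >= min_history and count > factor * max(seen):
            flagged.append(label)
        seen.append(count)
    return flagged


# Synthetic commit totals per period (illustrative only).
history = [
    ("2021-Q4", 40), ("2022-Q1", 55), ("2022-Q2", 50),
    ("2022-Q4", 70), ("2023-Q1", 60), ("2024-Q1", 150),
]

print(volume_anomalies(history))               # periods above the previous maximum
print(volume_anomalies(history, factor=2.0))   # periods above double the previous maximum
```

Comparing each period only to the same repository's past is what makes the check context-specific: the same counts could be entirely normal in a busier project.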
2024-03-09: JiaT75 injects hidden and obfuscated “tests” which introduce XZ-backdoor.
These “tests” were in fact obfuscated, malicious code that was pre-compiled into binary form, fragmented into non-linear chunks, and then embedded into an encrypted tarball file. Moreover, the object code fragments were intentionally separated into non-contiguous binary chunks, with irregular ranges of separation across the file. Even if the file were decrypted and linearly assembled, the code would not have the appearance of being malign. Only when assembled from precisely sized chunks (and with several red herring chunks removed) does the code become malicious. Otherwise, it would appear innocuous.
2024-03-29: Andres Freund detects unusual delays in SSH logins. He traces the root cause to a backdoor inside xz/liblzma (a part of the XZ repo). Andres reports it to the oss-security mailing list hosted on openwall.com. XZB is discovered.
Non-Repository Social Events:
In addition to repository events, there is an emerging body of data detailing social events – entirely outside of the repository – that appear to have contributed to the successful execution of XZB. In particular, other individuals made harassing, demeaning, or disparaging comments on the XZ mailing list toward Lasse Collin, XZ’s 17-year repository maintainer, and the overall state of the repository.
However, at the time of this writing, it’s not entirely clear if these external actors were the same person, different people, or if they were even directly part of the exploit. As such, we have chosen to not include their names as the evidence is not yet definitive (in our minds) regarding these individuals’ roles in XZB. Our goal is to share the facts of XZB as best we can while minimizing the spread of misinformation. As such, these other actors will remain nameless in our written account of the matter (at least, for now).
Why XZ’s Backdoor is Undetectable Using Traditional Code Analysis
A core concern around the XZ-backdoor is its potential undetectability using traditional code analysis techniques. In our own analysis, it appears that XZB is undetectable using static code analysis techniques such as those found in tools like Snyk, Sourcegraph, and SonarQube. While these tools are outstanding in their ability to perform precise code analysis, they would not be capable of detecting XZB. This is because the malicious code at the core of the XZ-backdoor was never directly committed to the XZ repository. Given that static analysis tools can only analyze code that is directly present and readable in a code repository, it would not be possible (nor reasonable to expect) for such tools to detect a socio-software exploit.
Even the malicious code that is embedded inside XZB’s cryptographic tarball file was in a pre-compiled binary form targeting the x86_64 architecture and was distributed in fragments across a specific range of locations in the file. Thus, even if an analysis system existed on all of the client machines where XZ was being used, and even if that system could perform binary code inspection, it would lack full symbolic information once the file had been decrypted. It would therefore likely still be incapable of detecting XZB’s malicious code, given its distribution and fragmentation across the tarball file.
Based on our research, it appears that the only time the XZ-backdoor code is fully comprehensible and malign is when it has been piecewise assembled and linked into the liblzma library on an infected system where a GCC linker is present and the underlying hardware uses the x86_64 instruction set architecture (ISA). Otherwise, XZB would remain fragmented, dormant, and appear to be innocuous to any security tool analyzing it.
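A toy Python example can illustrate why fragmentation defeats a linear scan. It is only an analogy under stated assumptions: the carrier, the harmless marker string, the layout of (offset, length) pairs, and both function names are invented for this sketch, and none of it reproduces the actual XZB mechanism.

```python
def embed_fragments(carrier: bytes, payload: bytes,
                    layout: list[tuple[int, int]]) -> bytes:
    """Overwrite carrier bytes with payload pieces at the given (offset, length) slots."""
    if sum(length for _, length in layout) != len(payload):
        raise ValueError("layout lengths must add up to the payload length")
    out = bytearray(carrier)
    pos = 0
    for offset, length in layout:
        out[offset:offset + length] = payload[pos:pos + length]
        pos += length
    return bytes(out)


def reassemble(blob: bytes, layout: list[tuple[int, int]]) -> bytes:
    """Rebuild the payload by reading the slots in layout order."""
    return b"".join(blob[offset:offset + length] for offset, length in layout)


CARRIER = b"." * 64                             # stand-in for an innocuous-looking file
PAYLOAD = b"MARKER-1234"                        # harmless stand-in for hidden content
LAYOUT = [(5, 3), (20, 4), (41, 2), (50, 2)]    # irregular, non-contiguous slots

blob = embed_fragments(CARRIER, PAYLOAD, LAYOUT)

# A linear signature scan does not find the payload in the blob...
print("found by linear scan:", PAYLOAD in blob)
# ...but anyone who knows the exact layout can recover it.
print("recovered with layout:", reassemble(blob, LAYOUT) == PAYLOAD)
```

In this toy, the scanner sees only short, scattered byte runs, while the knowledge needed to reassemble them lives outside the file being scanned.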
However, there are signals, particularly those of a sociological nature, that emerge when analyzing XZ’s repository from a contributor perspective. So, while it may not be possible to detect XZB from a pure code analysis perspective, it may be possible to observe a signal of XZB’s anomalous socio-software footprint from a coder analysis perspective.
A New Hope: Confidence Signals Using Sociological Analysis
Traditional code analysis techniques appear incapable of detecting exploits like XZB. Yet, if we reframe the problem from a code to a coder analysis, we may be able to detect social abnormalities that would otherwise be missed. These social outliers may warn us of an impending attack like XZB.
In the case of Jia Tan, as discussed in prior sections, we observed at least two major social outliers in his behavior prior to the injection of the XZ backdoor. Moreover, we can visually observe a third social anomaly when analyzing the changes to the XZ repository over time (shown in Figure 2).
To demonstrate the efficacy of repository analysis from a sociological perspective, we will consider some of the features present in Merly Mentor. These features have been conceived in part through Merly’s partnership with the Cloud Native Computing Foundation (CNCF), specifically in our multi-year discussions with Taylor Dolezal, Head of Ecosystem, and Chris Aniszczyk, CNCF’s Chief Technology Officer.
Mentor Summaries:
The Mentor Summaries feature provides an overview of key repository events over the entire lifetime of the repository. This feature uses a fusion of several of Mentor’s novel AIs and programming language (PL) analyses to perform a full code analysis of the lifetime of the repository, which often includes performing inference across 10M-100M lines of code. Once the code pass is complete, Mentor performs a contributor pass. Mentor then performs a third analysis to make suppositions about code and coders based on contextual trends it has observed over the lifetime of the repository.
Figure 3: The Trend Analysis of Merly Mentor Summaries of XZ repository. The summaries provided by Mentor are fully automated based on multiple passes of analysis over a repository, each from a different perspective (e.g., code, contributor, rate of changes, etc.). Mentor Summaries observes that starting in November 2022 Jia Tan begins contributing substantially and remains highly active until the present.
The data shown in Figure 3 is an example of Mentor Summaries and the types of sociological behaviors it can observe. These findings are identified in an entirely automated way based on a holistic analysis of the repository’s history. Mentor also infers social behaviors as normal or anomalous from a broader landscape of similar repositories and within the context of normality specifically for the repository being analyzed.
In the case of XZ’s Mentor Summaries, Mentor has observed that the repository appears to take a notable shift in contributions starting around November 2022. It identifies both Lasse Collin and Jia Tan as the key contributors for this period, who remain key contributors to date (Figure 3). While Lasse Collin’s contributions are noted, given the historical trends of XZ and his nearly two decades’ worth of involvement, these observations are not abnormal. From Figure 1, Lasse can be seen contributing significantly over the lifetime of the repository. However, Jia Tan’s contributions are social anomalies. This is because Jia had not made any contributions to XZ prior to the final quarter of 2022. Viewed through this lens, we believe that the identification of such social abnormalities could help the community identify the potential threat Jia Tan presented, where he was “hiding in plain sight.”
Mentor Watch:
While providing summaries of repositories may be helpful for deeper analysis, given the volume of repositories it may be unrealistic to expect any DevOps team to have constant oversight of all repositories under its purview. This is why we created Mentor Watch.
In a nutshell, Mentor Watch enables users to flag repositories that they wish to have Mentor “watch.” Once Mentor begins watching a repository, if it detects anything unusual (either by user definition or through its own learned AI system), it will notify all users who have subscribed to that repository’s Mentor Watch events. While Merly is still in the early stages of developing Mentor Watch, it is already an experimental and functional feature of Mentor (shown in Figure 4). Within the next few months, Mentor Watch’s AI will be able to autonomously learn the normal behaviors of a given repository, entirely on its own, and then notify subscribed users whenever an anomalous event occurs, no matter what kind of event it is: code, coder, or something else entirely.
Figure 4: Screenshot of Mentor Watch for XZ. Mentor Watch is still in its early stages; in the coming months it will have its own AI to drive autonomous trend foci and alerts.
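The watch-and-notify idea can be sketched generically in Python. This is not Mentor Watch's API, which is not described in this article. The `RepoWatch` class, its methods, the event fields, and the example rule are all hypothetical, and the sketch covers only user-defined rules, not learned ones.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RepoWatch:
    """Hypothetical watcher: evaluates rules on repository events and notifies subscribers."""
    repo: str
    rules: dict[str, Callable[[dict], bool]] = field(default_factory=dict)
    subscribers: list[Callable[[str], None]] = field(default_factory=list)

    def add_rule(self, name: str, predicate: Callable[[dict], bool]) -> None:
        """Register a user-defined rule; predicate returns True for unusual events."""
        self.rules[name] = predicate

    def subscribe(self, callback: Callable[[str], None]) -> None:
        """Register a callback that receives a message for every triggered rule."""
        self.subscribers.append(callback)

    def observe(self, event: dict) -> list[str]:
        """Run every rule on the event, notify subscribers, return triggered rule names."""
        triggered = [name for name, rule in self.rules.items() if rule(event)]
        for name in triggered:
            message = f"{self.repo}: rule '{name}' triggered by {event}"
            for notify in self.subscribers:
                notify(message)
        return triggered


# Example: flag binary files added under a tests/ directory (an illustrative rule).
inbox: list[str] = []
watch = RepoWatch("example/repo")
watch.add_rule(
    "binary-test-file",
    lambda e: e.get("path", "").startswith("tests/") and e.get("is_binary", False),
)
watch.subscribe(inbox.append)

print(watch.observe({"author": "newcomer", "path": "tests/files/sample.bin", "is_binary": True}))
print(watch.observe({"author": "maintainer", "path": "src/main.c", "is_binary": False}))
print(len(inbox), "notification(s) sent")
```

Keeping rules and subscribers as plain callables means that a learned anomaly detector could later be registered in the same way as a hand-written rule.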
The Future is Open – Open Source Software
In writing this article, we believe the most important takeaway is that we should not lose our faith in the open source software (OSS) process. It is working.
OSS is changing the world and for the better. Linux is the most used operating system in the world and at both extremes: in the cloud and at the edge (and everywhere in between). This is evidence that OSS works. However, in deeply analyzing this exploit, it appears to us that a major potential goal of XZB is to erode trust in OSS.
Yes, we should raise our guard and prepare for the next wave of socio-software exploits. Yes, this could have gone catastrophically wrong. But, this doesn’t mean we should look for scapegoats, misplace blame, or question the actual viability of OSS. If we go too far down this path, the XZB bad actors may end up winning the game they may be really trying to play: to sabotage the OSS community.
Consider for a moment the data around the exploit, the rapidity of its discovery, and the ease with which it was remedied. Perhaps this exploit was less about slowly backdooring millions or billions of systems with a fast-acting infection and more about demonstrating that even our most sacred OSS system, Linux, is vulnerable. That might be enough to cast doubt and fear, and even give rise to an anti-OSS movement, among even the most ardent OSS supporters.
Andres Freund is undoubtedly a hero in this story. However, as Andres reported, there was an obvious spike in CPU utilization and a degradation of performance in his SSH logins. Yet, the team that created the XZB hack are some of the most sophisticated and patient hackers we’ve seen. They worked in a covert fashion for over two years, patiently building trust across the OSS community. They introduced one of the most sophisticated backdoors ever seen ... yet, they couldn't infect systems without introducing a serious performance spike?
Really? They aren't aware of how to use C sleep() calls to baseline CPU utilization?
If the goal was for XZB to infect millions or billions of machines, certainly “staying hidden” for as long as possible would be a primary goal. Yet, the computational footprint of XZB is not at all subtle. Given the sophistication of the attackers, there should have been dozens of ways to minimize the backdoor’s computational signature to help ensure it stayed nearly invisible for many years. Evidence of this can be found in other vulnerabilities like Log4Shell, which had such a massive impact in part because it remained virtually undetected for the better part of a decade. According to reports, the vulnerable code behind Log4Shell was introduced in 2013, but it wasn't identified until 2021. This raises some profound questions.
Whatever the answers to those questions are, it’s our strong conviction that we cannot let these bad actors sway our faith and trust in OSS. The OSS community is and will continue to be built on trust. Trust in other humans. Trust in our process. Trust that whatever hurdles we experience we can overcome so long as we continue to work together as a community. We have overwhelming evidence that OSS works and works well. Is it perfect? Of course not; very little in this world is. But, is it better than the alternative where there is no open source community?
100%, yes.
Open source is dead; long live open source!