Add more discussion with a Python 2->3 example #884

Open
wants to merge 2 commits into
base: main
2 changes: 2 additions & 0 deletions docs/Simplifying-Software-Component-Updates.md
@@ -16,6 +16,8 @@ Historically fewer software components were reused, and there were fewer layers

Backward-incompatible changes are also increasingly problematic because most software components are used **indirectly**. When there are many layers of dependencies, it takes time for each layer’s updates to trickle up, introducing a sort of “speed of light” rate limit for updating software. Any delay in updating any intermediate layer impedes updates of all transitive users. For example, [[Wetter2021](https://security.googleblog.com/2021/12/understanding-impact-of-apache-log4j.html)] found in response to the Log4Shell vulnerability that “most artifacts that depend on log4j do so indirectly. The deeper the vulnerability is in a dependency chain, the more steps are required for it to be fixed. […] For greater than 80% of the packages, the vulnerability is more than one level deep, with a majority affected five levels down (and some as many as nine levels down).”
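To make the "levels down" measure concrete, here is a minimal sketch that computes how many dependency hops separate each package from a given component. The graph, the package names, and the `dependency_depths` helper are hypothetical illustrations, not data or tooling from the cited study.

```python
from collections import deque


def dependency_depths(graph, target):
    """Return {package: shortest number of dependency hops down to target}.

    graph maps each package to the list of packages it depends on directly.
    Packages that never reach target are omitted from the result.
    """
    # Invert the edges so we can walk upward from the target to its users.
    dependents = {}
    for pkg, deps in graph.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(pkg)

    depths = {}
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        pkg, depth = queue.popleft()
        for user in dependents.get(pkg, []):
            if user not in seen:
                seen.add(user)
                depths[user] = depth + 1
                queue.append((user, depth + 1))
    return depths


# Hypothetical graph: "app" reaches "logging-lib" only through two other layers.
graph = {
    "app": ["web-framework"],
    "web-framework": ["http-client"],
    "http-client": ["logging-lib"],
    "cli-tool": ["logging-lib"],
    "logging-lib": [],
}
depths = dependency_depths(graph, "logging-lib")
```

In this made-up graph `app` sits three levels above `logging-lib`, so a fix typically has to be adopted by `http-client` and then `web-framework` before it reaches `app` through ordinary updates.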

Backward-incompatible changes are even harder to deal with today because software is larger in scale. Custom software is often larger and often depends on many other components. If a new interface must eventually be adopted, it may be possible to migrate files and components gradually over a series of releases, though this can be costly and time-consuming. Demanding that "everything change at once" is far more difficult. For example, Python 3.0 was released on 2008-12-03. This was a backward-incompatible release; transitioning from Python 2 to Python 3 required all code and libraries to change simultaneously. This transition was notoriously difficult and slow, with [even its creator finding it difficult in his organization](https://www.youtube.com/watch?v=Oiw23yfqQy8&t=1278s). [Python 2 support was sunset on 2020-01-01](https://www.python.org/doc/sunset-python-2), yet the [Python Developers Survey 2022](https://lp.jetbrains.com/python-developers-survey-2022/#PythonVersions) found that 14 years after the Python 3 release, 7% of Python users overall still used Python 2, with notable use in data analysis (29%), web development (19%), and DevOps (23%). As of 2025-05-07, more than 16 years after the release, [19.1% of websites using Python use Python 2](https://w3techs.com/technologies/details/pl-python). Backward-incompatible changes are difficult to manage.
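As a small illustration of the kind of incompatibility involved, the sketch below is meant to be run under Python 3. It uses the built-in `compile()` to check whether a few Python 2 idioms still parse; the sample source strings and the `accepted_by_this_interpreter` helper are illustrative choices, not taken from any cited source.

```python
def accepted_by_this_interpreter(source):
    """Return True if the running interpreter can parse the given source."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Python 2 idioms that Python 3 rejects at parse time.
py2_print = 'print "hello"'   # print statement
py2_octal = "mode = 0777"     # old-style octal literal

# Spellings of the same statements that Python 3 accepts.
py3_print = 'print("hello")'
py3_octal = "mode = 0o777"

results = {
    "py2_print": accepted_by_this_interpreter(py2_print),
    "py2_octal": accepted_by_this_interpreter(py2_octal),
    "py3_print": accepted_by_this_interpreter(py3_print),
    "py3_octal": accepted_by_this_interpreter(py3_octal),
}

# Some changes are silent: this line parses in both versions but means
# different things. Python 2 yields 3 (integer division); Python 3 yields 3.5.
half = 7 / 2
```

Under Python 3 the two Python 2 idioms fail to parse, while the division example shows the harder case: code that still runs but quietly changes its result.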

Developers have created mechanisms to deal with backward incompatibility, but these often create larger problems later. A developer may clone code, that is, copy it into the project. Unfortunately, these copies may include vulnerabilities, and since their origin is no longer automatically tracked, those vulnerabilities are hidden from the development process and the copies are no longer automatically updated. An alternative is shading, a “variant of cloning where entire packages are cloned and renamed.” This may be done at build time (aka “b-shading”), and some ecosystems have tools specifically to support b-shading (e.g., the Maven shade plugin). While b-shading solves an immediate problem and is trackable by tools, it also introduces longer-term risks because it tends to defer necessary updates indefinitely. Other kinds of shading are used as well [[Dietrich2023](https://arxiv.org/abs/2306.05534)]. In all cases, these alternatives create risks when compared to simply updating a given component to its current version.

In extreme cases, such as the Log4Shell vulnerability, specialized programs were created to directly hotpatch programs to perform updates [[Nalley2021](https://aws.amazon.com/blogs/opensource/hotpatch-for-apache-log4j/)]. This extreme approach is **not** reasonable to apply in “normal” circumstances and risks causing many additional problems.