glibc vs. musl

An overview of the differences between glibc and musl.

Over the years, various implementations of the C standard library — such as the GNU C Library, musl, dietlibc, μClibc, and many others — have emerged with different goals and characteristics. These implementations exist because the C standard library defines the required functionality for operating system services (such as file input/output and memory management) but does not specify implementation details. Of these, the GNU C Library (glibc) and musl are two of the most popular.

When developing Wolfi, the “undistro” on which all Chainguard Images are built, Chainguard elected to use glibc rather than an alternative implementation such as musl. This conceptual article highlights the differences between these two implementations in the context of Chainguard’s choice of glibc as the default C library for the Wolfi undistro.

Note: Several sections of this guide present data about the differences between glibc and musl across various categories. You can recreate some of these examples with the Dockerfiles and C program files hosted in the glibc-vs-musl directory of the Chainguard Academy Images Demos repository.

High-level Differences between glibc and musl

The GNU C Library (glibc), developed by the GNU Project, was first released in 1988. glibc aims to provide a consistent interface that helps developers write software that works across multiple platforms. Today, glibc is the default implementation of the C standard library across the majority of Linux distributions, including Ubuntu, Debian, Fedora, and Chainguard’s Wolfi.

musl (pronounced “muscle”) was first released in 2011 as an alternative to glibc. The goal of musl is to strive for “simplicity, resource efficiency, attention to correctness, safety, ease of deployment, and support for UTF-8/multilingual text.” While not as widely used as glibc, some Linux distributions are based on musl, the most notable being Alpine Linux.

The following table highlights some of the main differences between glibc and musl.

| Criteria | glibc | musl |
|----------|-------|------|
| First Release | 1988 | 2011 |
| License | GNU Lesser General Public License (LGPL) | MIT License (more permissive) |
| Binary Size | Larger binaries | Smaller binaries |
| Runtime Performance | Optimized for performance | Slower performance |
| Build Performance | Slower | Faster |
| Compatibility | POSIX compliant + GNU extensions | POSIX compliant |
| Memory Usage | Efficient, higher memory usage | Potential performance issues with large memory allocations (e.g. Rust) |
| Dynamic Linking | Supports lazy binding, unloads libraries | No lazy binding, libraries loaded permanently |
| Threading | Native POSIX Thread Library, robust thread safety | Simpler threading model, not fully thread-safe |
| Thread Stack Size | Varies (2-10 MB) based on resource limits | Default size is 128k; can lead to crashes in some multithreaded code |
| Portability Issues | Fewer portability issues, widely used | Potential issues due to different system call behaviors |
| Python Support | Fast build times, supports precompiled wheels | Slower build times, often requires source compilation |
| NVIDIA Support | Supported by NVIDIA for CUDA | Not supported by NVIDIA for CUDA |
| Node.js Support | Tier 1 (full support) | Experimental; may not compile or pass the test suite |
| Debug Support | Several debug features available, such as sanitizers and profilers | Does not support sanitizers; limited profiler support |
| DNS Implementation | Stable and well-supported | Historical reports of occasional DNS resolution issues |

Buffer Overflows

musl lacks default protection against buffer overflows, which can lead to undefined behavior, while glibc ships with built-in stack smashing protection. When running a vulnerable C program, glibc terminates the program with an error upon detecting an overflow, whereas musl lets the overflow proceed without warning. Even compiling with FORTIFY_SOURCE or -fstack-protector-all won’t prevent the overflow under musl.

To illustrate this difference, this section walks through running a vulnerable C program under both implementations.

Creating the necessary files

First, create a working directory and cd into it.

mkdir ~/ovrflw-bffr-example && cd $_

Within this new directory, create a C program file called vulnerable.c.

#include <stdio.h>
#include <string.h>

int main() {
  char buffer[10];

  strcpy(buffer, "This is a very long string that will overflow the buffer.");

  printf("Buffer content: %sn", buffer);

  return 0;
}

Next, create a Dockerfile named Dockerfile.musl to build an image that uses musl as its C library implementation:

FROM alpine:latest

RUN apk add --no-cache gcc musl-dev

COPY vulnerable.c /vulnerable.c

RUN gcc -o /vulnerable_musl /vulnerable.c

CMD ["vulnerable_musl"]

Then create a Dockerfile named Dockerfile.glibc for one that uses glibc:

# Build stage
FROM cgr.dev/chainguard/gcc-glibc AS build

WORKDIR /work

COPY vulnerable.c /work/vulnerable.c

RUN gcc vulnerable.c -o vulnerable_glibc

# Runtime stage
FROM cgr.dev/chainguard/glibc-dynamic

COPY --from=build /work/vulnerable_glibc /vulnerable_glibc

CMD ["/vulnerable_glibc"]

Next, you can build and test both of the new images.

Building and testing the images

First build the image that will use musl:

docker build -t musl-test -f Dockerfile.musl .

Then build the image that will use glibc:

docker build -t glibc-test -f Dockerfile.glibc .

Then you can run the containers to test them.

First run the musl-test container:

docker run --rm musl-test

Because musl does not detect the buffer overflow by default, the program runs to completion and prints the full string:

Buffer content: This is a very long string that will overflow the buffer.

Next test the glibc-test container:

docker run --rm glibc-test

glibc has built-in protection, so the output here will only let you know that the program was terminated:

*** stack smashing detected ***: terminated

Note: As mentioned previously, several of the remaining sections in this guide present data about the differences between glibc and musl across various categories. You can recreate some of these examples by following the same procedure of creating and testing images based on the Dockerfiles and program files relevant to the example you’re exploring. You can find the appropriate files in the glibc-vs-musl directory of the Chainguard Academy Images Demos repository.

Library and Binary Size

musl is significantly smaller than glibc. A primary reason for this is the two libraries’ differing approaches to the Portable Operating System Interface (POSIX). POSIX is a family of standards specified by the IEEE Computer Society to ensure consistent application behavior across different systems. musl adheres strictly to POSIX standards without incorporating additional extensions.

glibc, while adhering to the POSIX standards, includes additional GNU-specific extensions and features. These extensions provide enhanced functionality and convenience, offering developers a more comprehensive set of tools. As an example, glibc provides support for Intel Control-flow Enforcement Technology (CET) when running on compatible hardware, providing control flow security guarantees at runtime — a feature that doesn’t exist in musl. However, this extensive functionality results in a larger library size for glibc, whose function index lists over 1700 functions.
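To make the extension difference concrete, here is a minimal sketch (our own illustration, not from the Chainguard examples repository) using strfry, a GNU extension that glibc exposes when _GNU_SOURCE is defined. To our knowledge, musl does not provide it, so this program compiles and links against glibc but fails to link against musl:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>

int main(void) {
  char word[] = "portability";

  /* strfry() randomly anagrams the string in place. It is a GNU
     extension, so linking fails where the libc does not provide it. */
  strfry(word);

  printf("%s\n", word);

  return 0;
}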

You can see musl’s smaller footprint reflected in the binary size of a simple hello world program, whether linked statically or dynamically. Since musl is much smaller than glibc, the statically linked binary is much smaller on Alpine. In the case of dynamic linking, the binary size is also smaller for musl than for glibc because of musl’s simplified implementation of the dynamic linker, as outlined in the musl project’s design philosophy.

The following table shows the difference in binary size of statically and dynamically linked hello world programs:

| Distro | Static linking | Dynamic linking |
|--------|----------------|-----------------|
| Alpine (musl) binary size | 132K | 12K |
| Wolfi (glibc) binary size | 892K | 16K |

The smaller the binary, the less bloat the system carries. You can find the Dockerfiles used in this setup in the binary-bloat directory of this guide’s examples repository.
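To observe this yourself, a minimal sketch (assuming a trivial hello.c in the current directory, plus apk add gcc musl-dev on Alpine, as in the earlier Dockerfile) is to build the program both ways and compare sizes. Exact numbers will vary with compiler version and flags:

# Statically linked binary
gcc -static hello.c -o hello-static

# Dynamically linked binary (the default)
gcc hello.c -o hello-dynamic

# Compare the resulting sizes
ls -lh hello-static hello-dynamic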

Portability of Applications

The portability of an application refers to its ability to run on various hardware or software environments without requiring significant modifications. Developers can encounter portability issues when moving an application from one libc implementation to another. That said, Hyrum’s Law reminds us that achieving perfect portability is tough. Even when you design an application to be portable, it might still unintentionally depend on certain quirks of the environment or libc implementation.

One common portability issue stems from the smaller thread stack size used by musl. musl has a default thread stack size of 128k, whereas glibc’s varies with the process’s resource limits but usually ends up between 2 and 10 MB.

This can lead to crashes under musl in multithreaded code that assumes each thread has more than 2 MiB of stack available (as it would on a glibc system). Such issues can crash applications and potentially introduce new vulnerabilities, such as stack overflows.
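The following minimal sketch (our own illustration, assuming compilation with gcc -pthread) shows how this plays out: the thread touches roughly 1 MiB of stack, which fits comfortably within glibc’s default but would overrun musl’s 128k default. Requesting an explicit stack size, as shown, is the portable fix:

#include <pthread.h>
#include <stdio.h>
#include <string.h>

void *worker(void *arg) {
  /* Roughly 1 MiB of stack usage: fine under glibc's 2-10 MB default,
     but an overrun of musl's 128k default thread stack. */
  char big[1024 * 1024];
  memset(big, 0, sizeof(big));
  (void)arg;
  return NULL;
}

int main(void) {
  pthread_t t;
  pthread_attr_t attr;

  /* Portable fix: request an explicit stack size rather than relying
     on the libc default. Pass NULL instead of &attr to pthread_create
     to observe the default behavior instead. */
  pthread_attr_init(&attr);
  pthread_attr_setstacksize(&attr, 4 * 1024 * 1024);

  pthread_create(&t, &attr, worker, NULL);
  pthread_join(t, NULL);

  printf("thread completed\n");

  return 0;
}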

Building from Source Performance

We compared the build-from-source performance of individual projects using the musl-gcc compiler toolchain used in Alpine and the gcc toolchain used in Chainguard’s Wolfi.

The following table presents the results, comparing compilation times between Wolfi (glibc) and musl-gcc. The shorter the build time, the better the system’s performance.

| Repository | Wolfi compilation time | musl-gcc compilation time | Build successful with musl? |
|------------|------------------------|---------------------------|-----------------------------|
| binutils-gdb | 18m 3.11s | * | No - C++17 features unsupported |
| Little-CMS | 29.44s | 24.13s | Yes |
| zlib | 11.48s | 9.37s | Yes |
| libpcap | 8.19s | 5.61s | Yes |
| gmp | 98.91s | 99.38s | Yes |
| openssl | 849.08s | 671.92s | Yes |
| curl | 92.33s | 79.15s | Yes |
| usrsctp | 55.39s | 48.38s | Yes |

You can find the Dockerfiles used in this setup in the build-comparison directory of this guide’s examples repository.

This table shows that musl-gcc achieves lower compilation times than gcc on Wolfi for these projects whenever it can build the project successfully.

musl-gcc fails to compile binutils-gdb because musl conforms strictly to POSIX standards, while binutils-gdb uses certain code features that fall outside those standards. The binutils-gdb project’s main branch fails to configure with native musl-gcc.
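If you want to reproduce this kind of measurement, one approach (a sketch only; zlib stands in here for whichever project you want to time, and the harness behind the table above may differ) is to time a configure-and-make cycle inside each container:

# Inside an Alpine (musl) container: install a toolchain, then time the build
apk add --no-cache build-base git
git clone https://github.com/madler/zlib.git && cd zlib
time sh -c './configure && make -j4'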

Python Builds

A common way to use existing Python packages is through precompiled binary wheels distributed by the Python Package Index (PyPI). Python wheels are typically built against glibc; because musl and glibc are different implementations of the C standard library, binaries compiled against glibc may not work correctly, or at all, on systems using musl. Due to this incompatibility, pip defaults to compiling packages from source on Alpine Linux. This means compiling all the C source code required by every Python package.

This also means you must determine every system library dependency needed to build the Python package from source. For example, you have to install the dependencies beforehand, running apk add <long list of dependencies> before you perform pip install X.
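As an illustration (the cryptography package and this dependency list are examples only; the exact set varies by package and version), a from-source install on Alpine often looks something like this:

# Headers and a toolchain must be present before pip can compile C extensions
apk add --no-cache gcc musl-dev python3-dev py3-pip libffi-dev openssl-dev

# With no compatible wheel available, pip builds the package from source
pip install cryptography

Recent versions of some packages, cryptography among them, additionally require a Rust toolchain, which is exactly the kind of dependency that tends to surface only through trial and error.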

The following table shows pip install times on Alpine (musl) and Wolfi (glibc). You can find the Dockerfiles used in this setup in the python-build-comparison directory of this guide’s examples repository.

| Python Package | Alpine (musl) | Wolfi (glibc) |
|----------------|---------------|---------------|
| Matplotlib, pandas | 21m 30.196s | 0m 24.3s |
| tensorflow | 104m 21.904s | 2m 54.5s |
| pwntools | 29m 45.952s | 21.5s |

As this table shows, relying on source compilation results in long build times whenever you want to use Python-based applications with musl.

Take the example of pwntools, a Python package that allows for the construction of exploits in software. When using glibc-based distros, the installation would be in the form pip3 install pwntools. To install pwntools on a musl-based distro (such as Alpine), the Dockerfile is much more complicated:

FROM alpine:latest

# Prebuilt Alpine packages required to build from source
RUN apk add --no-cache musl-dev gcc python3 python3-dev libffi-dev libcap-dev make curl git pkgconfig openssl-dev bash alpine-sdk py3-pip
RUN python3 -m venv my-venv
RUN my-venv/bin/python -m pip install --upgrade pip

# Build from source cmake for latest version
RUN git clone https://github.com/Kitware/CMake.git && cd CMake && ./bootstrap && make && make install
ENV PATH=$PATH:/usr/local/bin

# Build from source Rust for latest version
RUN curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf > setup-rust.sh
RUN bash setup-rust.sh -y
ENV PATH=$PATH:/root/.cargo/bin

# Finally install pwntools
RUN my-venv/bin/pip install pwntools

As this Dockerfile shows, pwntools requires a set of other packages. These in turn require up-to-date versions of Rust and CMake, which are not available among Alpine’s default prebuilt packages. You would have to build both from source before installing the Python dependencies and, finally, pwntools. Such dependencies have to be identified iteratively, through trial and error, while building from source.

Runtime Performance

Runtime performance is critical for many workloads. One common bottleneck occurs when allocating large chunks of memory repeatedly, and various reports have shown musl to be slower in this respect. Here we compare memory allocation performance between Wolfi and the latest Alpine. The benchmark uses JSON dumping, which is known to be highly memory intensive.

| Runtime | Alpine (musl) | Wolfi (glibc) |
|---------|---------------|---------------|
| Memory Allocations Benchmark | 102.25 sec | 51.01 sec |

This table highlights how excessive memory allocations can cause musl (used by Alpine) to perform up to 2x slower than glibc (used by Wolfi). Memory-intensive applications should therefore be wary of performance regressions when migrating to the musl/Alpine ecosystem. Technical details on why musl’s memory allocation (malloc) is slow can be found in this musl discussion thread.

Apart from memory allocations, multithreading has also been problematic for musl, as shown in various GitHub issues and discussion threads. glibc provides a thread-safe system, while musl is not fully thread-safe: the POSIX standard only requires stream operations to be atomic and imposes no further thread-safety requirements, so musl does not provide additional thread-safety features. This means unexpected behavior or race conditions can occur when multiple threads run concurrently.

We used a Rust script (referenced from the GitHub issue mentioned above) to test single-thread and multi-thread performance on Alpine (musl) and Wolfi (glibc). The next table shows performance benchmarks for single-threaded and multi-threaded Rust applications.

| Runtime | Alpine (musl) | Wolfi (glibc) |
|---------|---------------|---------------|
| Single-thread (avg of 5 runs) | 1735 ms | 1300 ms |
| Multi-thread (avg of 5 runs) | 1178 ms | 293 ms |

Alpine (musl) has the worse performance of the two, taking around 4x more time in the multi-threaded benchmark compared to Wolfi (glibc). As discussed previously, the real source of thread contention is musl’s malloc implementation: multiple threads may allocate memory at once, or memory freed by one thread may be handed to another, so the allocator’s thread synchronization logic becomes a performance bottleneck.
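To get a feel for this kind of contention without the Rust harness, the following minimal C sketch (our own illustration, not the benchmark behind the tables above; compile with gcc -O2 -pthread and run under the time command on each image) has several threads repeatedly allocating, touching, and freeing large buffers, which stresses the allocator’s internal locking:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define THREADS 4
#define ITERATIONS 20000
#define CHUNK (256 * 1024)

void *hammer(void *arg) {
  (void)arg;
  for (int i = 0; i < ITERATIONS; i++) {
    /* Allocate, touch, and free a large buffer. With several threads
       doing this at once, time is spent in the allocator's locking. */
    char *p = malloc(CHUNK);
    if (p == NULL) abort();
    memset(p, 0xab, CHUNK);
    free(p);
  }
  return NULL;
}

int main(void) {
  pthread_t threads[THREADS];

  for (int i = 0; i < THREADS; i++)
    pthread_create(&threads[i], NULL, hammer, NULL);
  for (int i = 0; i < THREADS; i++)
    pthread_join(threads[i], NULL);

  printf("done\n");

  return 0;
}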

Experimental Warnings

Developers will most likely encounter musl through Alpine image variants, such as Node.js (node:alpine) and Go (golang:alpine). Both images carry similar warnings that they use musl libc instead of glibc, pointing users to this Hacker News comment thread for further discussion of the pros and cons of using Alpine-based images.

Additionally, Node.js mentions in its building documentation: “For production applications, run Node.js on supported platforms only.” musl and Alpine have experimental support status, whereas glibc has Tier 1 support.

The Go image also mentions that Alpine is not officially supported and experimental: “This (Alpine) variant is highly experimental, and not officially supported by the Go project (see golang/go#19938 for details).”

Unsupported Debug Features

Certain debug features that applications rely on for testing — including sanitizers (such as AddressSanitizer and ThreadSanitizer) and profilers (such as gprof) — are not supported by musl.

Sanitizers help debug and detect behaviors such as buffer overflows or dangling pointers. According to the musl wiki’s open issues, the GCC and LLVM sanitizer implementations rely on libc internals and are incompatible with musl. Feature requests for musl support have been filed in the LLVM sanitizer repository (see this issue or this one for examples), but they have not been addressed.
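As an example (a sketch reusing the vulnerable.c program from earlier), attempting an AddressSanitizer build illustrates the gap:

# On a glibc-based image such as Wolfi, this produces an instrumented binary
gcc -fsanitize=address -g vulnerable.c -o vulnerable_asan

# On Alpine (musl), the same command typically fails at link time, since
# the sanitizer runtime libraries are not available for musl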

DNS Issues

The Domain Name System (DNS) is the backbone of the internet. It can be thought of as the internet’s phonebook, mapping easy-to-remember website names to internet protocol (IP) addresses. Multiple historical sources on the web have pointed out DNS issues with musl-based distros. Some reported issues with DNS over TCP (fixed as of Alpine 3.18), while others reported intermittent DNS resolution failures.

Please refer to the following resources regarding musl’s history with DNS:

Conclusion

glibc and musl both serve well as C standard library implementations. Our goal in this article has been to explain Chainguard’s rationale for choosing glibc for Wolfi. We believe it’s the choice that made the most sense for our project, but you should do your own research to determine whether one C library implementation suits your needs better than another.

If you spot anything we’ve overlooked regarding glibc or musl, or have additional insights to contribute, please feel free to raise an issue in chainguard-dev/edu. We welcome further discussion of glibc’s weaknesses, such as its larger codebase and greater complexity compared to musl. Insights into the intricacies of compiler toolchains for cross-compilation are also welcome, especially where glibc and musl are concerned.

Finally, we encourage you to check out this additional set of articles and discussions about others’ experiences with musl:
