EuroRust 2024: Day 2 Notes
This post gathers my notes from day 2 of the EuroRust 2024 conference.
Notes: "The first six years in the development of Polonius, an improved borrow checker" by Amanda Stjerna
- Rust is a cool playground for novel programming language theory and type theory concepts.
- It's not just academic research :)
- In its early days, Rust had a "lexical" lifetime checker, which was very annoying, as lifetimes were bound to lexical code scopes.
- The current, non-lexical borrow checker is better, but still not perfect. See the infamous "problem case #3" -- keeping track of lifetimes across conditional control flow and function calls (see the sketch after this list).
- Polonius, the new Rust borrow checker that's much more advanced and lenient, is coming.
- But no ETA yet.
- Certainly not for Rust 2024 edition as originally planned.
- Polonius is VERY slow compared to the old borrow checker.
- It would render the "lifetime" concept kind of obsolete -- the borrow checker would track sets of loans ("origins") and their individual usages throughout the program.
- This raises some interesting syntax questions.
- Old code would still work; one possible way forward is a two-tiered borrow checker: the existing one operating on "lifetimes", with Polonius kicking in when the old one is not expressive enough.
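To make "problem case #3" concrete, here is a minimal sketch of the pattern (my own reconstruction of the well-known NLL example, not code from the talk). Today's borrow checker rejects it, while Polonius is expected to accept it:

```rust
use std::collections::HashMap;

// Rejected by today's borrow checker: the mutable borrow taken by `get_mut`
// is considered to last for all of `'r`, even on the `None` path, so the
// `insert` call is flagged as a second mutable borrow. Polonius reasons about
// which loans are actually live on each path and would accept this.
fn get_or_insert<'r>(map: &'r mut HashMap<u32, String>) -> &'r mut String {
    match map.get_mut(&0) {
        Some(value) => value,
        None => {
            map.insert(0, String::from("default"));
            map.get_mut(&0).unwrap()
        }
    }
}
```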
Notes: "Non-binary Rust: Between Safe and Unsafe" by Boxy Uwu
This talk's delivery was pretty original, and suffice it to say it polarized the audience.
The speaker made some good points about unsafe code, drawing on their contributions to the Bevy game engine, especially the bevy_ptr crate.
- Such software must use unsafe code for performance reasons.
- A game engine has hot loops and many allocations.
- It uses raw pointers instead of fat Rust references, and custom slice wrappers without explicit bounds checks.
- The challenge the author faced was that the unsafe code was unstructured; bevy_ptr now wraps all the "unsafe-ish" pointer types in one separate crate, with docs, tests, etc.
- Writing unsafe code doesn't mean the code is actually unsafe.
- We can wrap all the unsafe code in a safe(-ish) interface, and localize it in structs, modules, and functions (see the sketch below).
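As an illustration of that last point, here is a minimal sketch (my own toy example, not bevy_ptr code) of localizing an unsafe block behind a safe interface: the constructor establishes the invariant, and the single unsafe call relies on it.

```rust
// The invariant "the slice is non-empty" is established in the safe
// constructor, so the one unsafe block can rely on it.
struct FirstElem<'a, T> {
    slice: &'a [T],
}

impl<'a, T> FirstElem<'a, T> {
    /// Returns `None` for empty slices, so a constructed value always
    /// wraps a non-empty slice.
    fn new(slice: &'a [T]) -> Option<Self> {
        if slice.is_empty() {
            None
        } else {
            Some(Self { slice })
        }
    }

    /// Safe wrapper around an unsafe call.
    fn first(&self) -> &T {
        // SAFETY: `self.slice` is non-empty (checked in `new`),
        // so index 0 is in bounds.
        unsafe { self.slice.get_unchecked(0) }
    }
}

fn main() {
    let data = [10, 20, 30];
    let wrapper = FirstElem::new(&data).expect("non-empty slice");
    assert_eq!(*wrapper.first(), 10);
}
```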
Notes: "Writing a SD Card driver in Rust" by Jonathan Pallant
Talk delivered by one of the experts in the Rust embedded community.
I don't have many notes here, because it was full of examples and computer history lessons :)
- SD Cards are kind of complicated!
- They are block devices, and the underlying flash storage is not as straightforward as "read bytes" / "write bytes" (see the block-interface sketch after this list).
- Each SD card is a small computer: the flash memory chip needs its own controller, which provides a higher-level block interface to the underlying hardware.
- The driver hardware commands are standardized. The card knows nothing about the filesystem -- that's implemented on top of the block driver.
- The SD card spec is very complicated. There's many versions, and many interface peripherals that can actually talk to the SD card.
- There's a crate, embedded-hal, that abstracts over the various peripherals and supports many embedded target platforms.
- The driver that the author was working on is available as embedded-sdmmc.
- The FAT filesystem is full of quirks, e.g. long filenames require special file allocation table entries that are kind of like directories, but not really.
- Access time file metadata (somewhat surprisingly) is built right into FAT FS, alongside create and modification time.
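To make "block device" a bit more concrete, here is a hypothetical trait sketch (not the actual embedded-sdmmc API) for a device that only knows how to read and write fixed-size blocks; the filesystem layer sits on top of something like this:

```rust
// Hypothetical block-device interface, for illustration only.
const BLOCK_SIZE: usize = 512;

trait BlockDevice {
    type Error;

    /// Read one 512-byte block at the given block index.
    fn read_block(&mut self, index: u32, buf: &mut [u8; BLOCK_SIZE]) -> Result<(), Self::Error>;

    /// Write one 512-byte block at the given block index.
    fn write_block(&mut self, index: u32, buf: &[u8; BLOCK_SIZE]) -> Result<(), Self::Error>;
}
```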
Notes: "I/O in Rust: the whole story" by Vitaly Bragilevsky
This was an interesting dive into how Rust does I/O and filesystem operations in sync and async flavors.
- Most I/O operations in the standard library (std) wrap a C library under the hood (e.g. glibc or the standard C library for the given platform).
- tokio::fs::read "asyncifies" the underlying std call, which is synchronous, by running it on a background thread wrapped in a Future (see the sketch after this list).
- Tokio's AsyncReadExt, which promises "actual async I/O", is also layers of async abstractions wrapping synchronous std I/O calls under the hood.
- In Linux, the new hot way of doing "actual async I/O" is io_uring.
  - It's a low-level interface and needs unsafe.
  - There's a Tokio fork, tokio-uring, but it seems to be pretty young.
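As a rough sketch of the "asyncify" pattern (my own illustration of the general technique, not Tokio's internal code), a blocking std call can be handed to a blocking thread pool and awaited:

```rust
use tokio::task;

// Run a synchronous std call on a background (blocking) thread and await
// its result, so the async caller is not blocked.
async fn read_to_string_async(path: std::path::PathBuf) -> std::io::Result<String> {
    task::spawn_blocking(move || std::fs::read_to_string(path))
        .await
        .expect("the blocking task should not panic")
}
```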
Usage of inner functions inside std I/O functions
I've noticed that some standard library I/O functions like std::fs::read_to_string use an inner function that looks like this:
```rust
use std::io;
use std::path::Path;

fn some_io_operation<P: AsRef<Path>>(path: P) -> io::Result<String> {
    // 👇 Note this `inner` function that takes `&Path` and not `P`.
    fn inner(path: &Path) -> io::Result<String> {
        // I/O-heavy operations...
        let string = std::fs::read_to_string(path)?;
        Ok(string)
    }
    inner(path.as_ref())
}
```
I really started to wonder why that is, since it's a pretty common pattern in the standard library.
I asked the speaker about this after the talk, but they didn’t quite know the answer. I then got approached by an audience member (whoever you were, another warm thank you for the tip!) and they said that it was about optimizing code monomorphization.
I started digging into this after the conference and found some answers:
- "Why does std::fs::write(...) use an inner function?" from StackOverflow
- "Non-Generic Inner Functions" from Possible Rust blog
So, to recap, the pattern looks like this:
- The inner function only takes a non-generic &Path parameter and is only compiled once.
- The outer (public) function is generic and only calls the inner function. It is "monomorphized" for each type passed in.
It's a compile-time optimization. "Monomorphization" works by essentially creating a separate version of the generic code for each type passed in throughout the codebase. Without this inner function, the compiler would have to monomorphize the entire I/O function, with all the heavy-lifting code being copied into each monomorphized version. Using this pattern, the inner function only gets compiled once, for the concrete non-generic type. The outer function only contains the type-specific conversion into &Path and a function call. This results in less work for the compiler, and faster compile times.
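Continuing the hypothetical some_io_operation from above, a quick usage sketch: both calls below get their own monomorphized copy of the thin outer wrapper, while inner is compiled exactly once.

```rust
use std::path::PathBuf;

fn main() -> std::io::Result<()> {
    // P = &str and P = PathBuf: two monomorphized copies of the thin wrapper,
    // but the non-generic `inner` doing the actual I/O is compiled only once.
    let _a = some_io_operation("Cargo.toml")?;
    let _b = some_io_operation(PathBuf::from("Cargo.toml"))?;
    Ok(())
}
```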
Notes: "Fast and efficient network protocols in Rust" by Nikita Lapkov
The author walked through the high-level implementation of elfo, a new async actor runtime.
Some interesting features of elfo:
- Uses a custom network protocol; gRPC was rejected as it wasn't fit for purpose here.
- Telemetry / metrics are built-in.
Notes: "Code to contract to code: making ironclad APIs" by Adam Chalmers
The talk was a high-level overview of how to expose OpenAPI specs out of Rust code, and good practices learned while working on zoo.dev.
- bump.sh was used to render the OpenAPI specs. I thought it looked quite nice.
- It's best for the API server to generate the OpenAPI specs (code = spec); see the rough sketch after this list.
- For axum, there's utoipa-axum.
- There's a new Rust web framework called dropshot which has tight OpenAPI integration.
- Given the spec, a client library can (and should) be generated automatically.
- But it's common that the client needs custom hand-written domain-specific code.
- For generating Rust clients, the author's team ended up writing their own generator called openapitor.
- cargo-llvm-lines was mentioned as a way of determining which functions are the heaviest in terms of generated LLVM IR code.
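As a rough sketch of the "code = spec" idea using utoipa (plain utoipa here rather than utoipa-axum, and the exact attribute syntax may differ between versions -- treat this as an approximation, not the setup from the talk):

```rust
use serde::Serialize;
use utoipa::{OpenApi, ToSchema};

#[derive(Serialize, ToSchema)]
struct Pet {
    id: u64,
    name: String,
}

/// Get a pet by id.
#[utoipa::path(
    get,
    path = "/pets/{id}",
    params(("id" = u64, Path, description = "Pet id")),
    responses((status = 200, description = "Pet found", body = Pet))
)]
async fn get_pet() {
    // handler body omitted
}

// The spec is derived from the annotations on the handler and the types.
#[derive(OpenApi)]
#[openapi(paths(get_pet), components(schemas(Pet)))]
struct ApiDoc;

fn main() {
    // Serialize the generated spec to JSON (code = spec).
    println!("{}", ApiDoc::openapi().to_pretty_json().unwrap());
}
```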
Notes: "Rust Irgendwie, Irgendwo, Irgendwann" (e.g. Taking Rust Everywhere) by Henk Oordt
This talk was about taking a step back and reflecting on how empowering Rust is.
We can write system software, web APIs, CLI programs, target embedded devices, WASM and all sorts of things with a single programming language and even a single codebase.
- For WASM, there's a new tool/bundler called Trunk. It seems like a viable alternative to wasm-pack.
- The heapless crate was mentioned for embedded (it provides standard-library-like collections with static allocation); see the sketch after this list.
- embassy was mentioned as a next-generation async framework for embedded devices. It definitely is an interesting usage of Rust's async.
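A tiny sketch of what heapless gives you (my own example): collections with a fixed, compile-time capacity that never touch the heap, so push can fail instead of reallocating.

```rust
use heapless::Vec;

fn main() {
    // A Vec with a fixed capacity of 4 elements, allocated inline (no heap).
    let mut buf: Vec<u8, 4> = Vec::new();
    for byte in [1u8, 2, 3, 4, 5] {
        if buf.push(byte).is_err() {
            // Capacity exhausted; no reallocation happens.
            break;
        }
    }
    assert_eq!(buf.len(), 4);
}
```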
Notes: "Linting with Dylint" by Samuel Moelius
This talk described Dylint, a utility to create and run custom lints in Rust.
- This is useful for library- and project-specific internal lints, as clippy is a very generic linter with no library-specific lints.
- The talk outlined adding a new lint to warn if an io::Result was returned out of a function returning anyhow::Result without attaching any context (see the sketch after this list).
  - It's actually a very common pitfall not to know which file failed to open, etc.; by default, Rust's io::Error carries no file path in its message.
  - Yoshua Wuyts' Error Handling Survey blog post was mentioned.
- It's actually not that hard to add new lints! Dylint is definitely worth checking out.
- clippy_utils provides a lot of helpful functions to produce lints.
- clippy::author is a facility to automatically create a clippy lint that detects the offending pattern. It's not perfect, but it gives a good starting point.
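To illustrate the pitfall that lint targets (my own example, not the one from the talk), compare propagating an io::Result into an anyhow::Result with and without context:

```rust
use std::path::Path;

use anyhow::{Context, Result};

// The pitfall: `?` converts the io::Error into anyhow::Error, but the message
// is just "No such file or directory (os error 2)" -- no file name.
fn read_config_bad(path: &Path) -> Result<String> {
    let contents = std::fs::read_to_string(path)?;
    Ok(contents)
}

// The fix: attach the path as context, which is what such a lint would
// nudge you towards.
fn read_config_good(path: &Path) -> Result<String> {
    let contents = std::fs::read_to_string(path)
        .with_context(|| format!("failed to read {}", path.display()))?;
    Ok(contents)
}

fn main() {
    if let Err(e) = read_config_good(Path::new("does-not-exist.toml")) {
        // Prints something like:
        // "failed to read does-not-exist.toml: No such file or directory (os error 2)"
        eprintln!("{e:#}");
    }
}
```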
Notes: "Building an extremely fast Python package manager, in Rust" by Charlie Marsh
This provided some interesting insights into uv, a Python package manager written in Rust.
- uv uses a single Tokio I/O thread. It was proven to be the fastest for this use case.
- Their cache keeps all packages unzipped and reflinks/hard-links the files from the global cache into project-specific environments.
- They use PubGrub SAT solver to resolve dependency constraints.
- Python packaging dates back 25 years.
- Old packages use setup.py, which provides no static metadata.
- The versioning scheme is very complex. An interesting optimization can be done for the "most common" versions, squeezing them into a single u64 which is trivially sortable (see the sketch after this list).
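A sketch of that packing idea (my own illustration; uv's actual encoding is more involved): for simple x.y.z versions, the components can be packed into a single u64 so that comparing versions is a plain integer comparison.

```rust
// Pack a "common" major.minor.patch version into one u64, preserving
// lexicographic ordering: major in the high bits, patch in the low bits.
fn pack_version(major: u16, minor: u16, patch: u16) -> u64 {
    ((major as u64) << 32) | ((minor as u64) << 16) | (patch as u64)
}

fn main() {
    assert!(pack_version(1, 2, 3) < pack_version(1, 10, 0));
    assert!(pack_version(2, 0, 0) > pack_version(1, 99, 99));
}
```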
It looks like uv is pretty mature these days. I will take a proper look at it, and hopefully migrate some of my Python projects. The speed benefits are a game changer.
One thing I should check is whether uv's Docker story is good. For instance, Poetry was a nightmare in Docker builds for Python projects, and I ended up using plain pip over Poetry everywhere.
More Notes!
This concludes my notes from day 2 of the conference. I have more notes: