Unsafe Rust

    2024-12-19 (last edit: 2024-12-18)

    Unsafe Rust Alter Ego

    So far, no operation in Rust that we performed could trigger UB (Undefined Behaviour):

    • data races were prevented by the borrow checker's "sharing XOR mutability" rule;
    • use-after-free, dangling references, etc. were prevented by lifetimes & ownership.

    But no respectable, powerful programming language can stand being constrained that much, in such a cage!

    In Rust, the unsafe keyword unleashes the hidden superpowers.

    Unsafe code superpowers

    Inside an unsafe { ... } block (or, for traits, with unsafe impl), you can do what you normally can't:

    • Dereference a raw pointer,
    • Call an unsafe function or method,
    • Access or modify a mutable static variable,
    • Implement an unsafe trait,
    • Access fields of a union.

    The first superpower is the most important. An (efficient) implementation of many data structures would be impossible without raw pointers, because references don't allow circular dependencies, among other limitations.
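
    As a minimal sketch of that limitation, the snippet below links two heap-allocated nodes to each other, which plain references and Box cannot express. The Node type and the link_and_read function are hypothetical names chosen for illustration; a real list would wrap this in a safe API.

```rust
use std::ptr;

// Hypothetical node type: each node points to both of its neighbours.
struct Node {
    value: i32,
    prev: *mut Node,
    next: *mut Node,
}

// Builds the cycle a <-> b, reads through it in both directions, then frees it.
fn link_and_read() -> (i32, i32) {
    // Move both nodes to the heap and take over responsibility for freeing them.
    let a = Box::into_raw(Box::new(Node { value: 1, prev: ptr::null_mut(), next: ptr::null_mut() }));
    let b = Box::into_raw(Box::new(Node { value: 2, prev: ptr::null_mut(), next: ptr::null_mut() }));

    // SAFETY: `a` and `b` come from `Box::into_raw`, so they are valid and
    // uniquely owned here; each is freed exactly once and not used afterwards.
    unsafe {
        (*a).next = b;
        (*b).prev = a;

        let forward = (*(*a).next).value; // value stored in `b`
        let backward = (*(*b).prev).value; // value stored in `a`

        drop(Box::from_raw(a));
        drop(Box::from_raw(b));

        (forward, backward)
    }
}
```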

    In the following code sample, we show all superpowers of unsafe code:

    #![allow(unused_assignments)]
    #![allow(unused_variables)]
    #![allow(dead_code)]
    
    /* unsafe superpower 1: dereferencing pointers. */
    fn superpower_1() {
        let x = 42;
    
        // Implicit &T -> *const T conversion.
        let raw_ptr: *const i32 = &x;
    
        // An old way to directly create a pointer.
        let raw_ptr: *const i32 = std::ptr::addr_of!(x);
    
        // The new way to directly create a pointer.
        let raw_ptr: *const i32 = &raw const x;
    
        // Dereferencing a raw pointer requires an `unsafe` block.
        println!("Value: {}", unsafe { *raw_ptr });
    }
    
    /* unsafe superpower 2: calling an unsafe function. */
    unsafe fn unsafe_function() {
        println!("This is an unsafe function!");
    }
    
    fn superpower_2() {
        unsafe {
            // Calling an unsafe function.
            unsafe_function();
        }
    }
    
    /* unsafe superpower 3: Accessing or modifying mutable static variable.
     * It is unsafe because it can lead to data races if accessed concurrently.
     * */
    
    static mut COUNTER: i32 = 0;
    
    fn increment_counter() {
        unsafe {
            // Accessing and modifying a mutable static variable
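            COUNTER += 1;
            // Copy the value out first: passing COUNTER to println! directly would
            // create a reference to a `static mut`, which newer editions reject.
            let value = COUNTER;
            println!("Counter: {}", value);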
        }
    }
    
    fn superpower_3() {
        // This would cause UB: a data race.
        // std::thread::spawn(increment_counter);
        increment_counter();
    }
    
    /* unsafe superpower 4: Implementing unsafe traits.
     * It is unsafe because safe code is permitted to cause UB if an unsafe trait
     * is implemented for a type that should not implement it (think Send/Sync).
     * */
    
    unsafe trait CanBeAtomic {
        fn safe_method_of_unsafe_trait(&self);
    }
    
    struct MyStruct {
        i: i32,
    }
    
    unsafe impl CanBeAtomic for MyStruct {
        fn safe_method_of_unsafe_trait(&self) {
            println!("Method called!");
        }
    }
    
    fn superpower_4() {
        let my_struct = MyStruct { i: 42 };
    
        // Calling a safe method from an unsafe trait
        my_struct.safe_method_of_unsafe_trait();
    }
    
    /* unsafe superpower 5: Accessing fields of a union.
     * It is unsafe because the union may hold a different variant than the one we try to read,
     * so we could read some rubbish value.
     * */
    
    union MyUnion {
        int_value: i32,
        bool_value: bool,
    }
    
    fn main() {
        let u = MyUnion { int_value: 42 };
    
        unsafe {
            // Accessing a field of a union
            println!("Union value as int: {}", u.int_value);
    
            // Would result in UB, as the compiler may assume that bool is either 0 or 1 underneath.
            // println!("Union value as bool: {}", u.bool_value);
        }
    }
    
    

    (Download the source code for this example: unsafe_superpowers.rs)
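
    For comparison with superpower 3, here is a hedged sketch of a counter that needs no unsafe at all. It uses an atomic static instead of static mut, so concurrent increments are not a data race. SAFE_COUNTER, increment_safe_counter and count_from_threads are illustrative names, not part of the example above.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

// A shared counter that safe code may modify from any thread.
static SAFE_COUNTER: AtomicI32 = AtomicI32::new(0);

// Atomically adds 1 and returns the new value.
fn increment_safe_counter() -> i32 {
    // `fetch_add` returns the previous value, hence the `+ 1`.
    SAFE_COUNTER.fetch_add(1, Ordering::Relaxed) + 1
}

// Spawns `n` threads that each increment the counter once and
// returns by how much the counter grew.
fn count_from_threads(n: usize) -> i32 {
    let before = SAFE_COUNTER.load(Ordering::Relaxed);

    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(increment_safe_counter))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }

    // Joining the threads makes all their increments visible here.
    SAFE_COUNTER.load(Ordering::Relaxed) - before
}
```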

    Safe code guarantees

    The single fundamental property of Safe Rust, the soundness property:

    No matter what, Safe Rust can't cause Undefined Behavior.

    This is valid, sound code: a safe encapsulation of an unsafe interior.

    fn index(idx: usize, arr: &[u8]) -> Option<u8> {
        if idx < arr.len() {
            unsafe {
                Some(*arr.get_unchecked(idx))
            }
        } else {
            None
        }
    }
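
    To see why the wrapper is sound, it helps to probe the boundary values: no argument that safe code can pass reaches get_unchecked out of bounds. The sketch below repeats the function so that it is self-contained and adds index_safe, a hypothetical all-safe equivalent based on slice::get, for comparison.

```rust
fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        // SAFETY: `idx < arr.len()` was checked just above.
        unsafe { Some(*arr.get_unchecked(idx)) }
    } else {
        None
    }
}

// Hypothetical equivalent without `unsafe`: `get` performs the bounds check itself.
fn index_safe(idx: usize, arr: &[u8]) -> Option<u8> {
    arr.get(idx).copied()
}

// Returns true if both versions agree for every index from 0 to arr.len() + 1,
// which covers the last valid index and the first two invalid ones.
fn agree_on_boundaries(arr: &[u8]) -> bool {
    (0..=arr.len() + 1).all(|idx| index(idx, arr) == index_safe(idx, arr))
}
```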
    

    Unsoundness means that there exists a way to trigger UB using only safe code. The following code is unsound (why? what has changed?):

    fn index(idx: usize, arr: &[u8]) -> Option<u8> {
        if idx <= arr.len() {
            unsafe {
                Some(*arr.get_unchecked(idx))
            }
        } else {
            None
        }
    }
    

    But we only changed safe code! This shows that unsafe is unfortunately not perfectly scoped and isolated. We need to be extra careful when writing unsafe code.
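
    A common way to limit how far this effect reaches is to keep the invariant behind module privacy, so that only code inside one module can break it. The sketch below assumes a hypothetical fixed-capacity buffer, SmallBuf; the names and the capacity of 4 are illustrative only.

```rust
mod small_buf {
    /// Hypothetical fixed-capacity buffer. Invariant: `len <= 4`.
    pub struct SmallBuf {
        data: [u8; 4],
        // Private on purpose: code outside this module cannot set `len`
        // to a value that would make `last` read out of bounds.
        len: usize,
    }

    impl SmallBuf {
        pub fn new() -> Self {
            SmallBuf { data: [0; 4], len: 0 }
        }

        // Entirely safe code, yet `last` depends on it being correct:
        // changing `<` to `<=` here would make the whole type unsound.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len < self.data.len() {
                self.data[self.len] = byte;
                self.len += 1;
                true
            } else {
                false
            }
        }

        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                None
            } else {
                // SAFETY: `push` is the only code that changes `len` and keeps
                // it at most 4, so `len - 1` is a valid index into `data`.
                Some(unsafe { *self.data.get_unchecked(self.len - 1) })
            }
        }
    }
}
```

    With this layout, auditing the unsafe block means auditing the small_buf module rather than the whole program.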

    Reading

    • The Book, Chapter 19.1

    • The Rustonomicon, especially chapter 1 (Meet Safe and Unsafe)

    • How unpleasant is Unsafe Rust?

    • RUDRA: Finding Memory Safety Bugs in Rust at the Ecosystem Scale - an automatic static analyzer that finds the 3 most frequent kinds of subtle bugs in unsafe code:

      1. panic (unwind) safety bug (analogous to exception-handling guarantees in C++),
      2. higher-order safety invariant (assuming certain properties of the type that the generic is instantiated with that are not guaranteed by the type system, e.g., purity),
      3. propagating Send/Sync in Generic Types (implementing Send/Sync unconditionally for T, even if T contains non-Send/non-Sync types inside).

      RUDRA found 264 previously unknown memory-safety bugs in 145 packages on crates.io!!!

      Is Rust really a safe language...?

      Only transitively. Safe Rust is sound iff the unsafe code it calls is sound too.
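
    The third bug class in the list above (propagating Send/Sync in generic types) can be sketched as follows. MyBox is a hypothetical owning pointer wrapper written for illustration; the point is the T: Send bound on the unsafe impl, with the unsound unconditional variant left in a comment.

```rust
use std::thread;

// Hypothetical owning wrapper around a raw pointer. Raw pointers are neither
// Send nor Sync, so MyBox<T> is not Send unless we say so ourselves.
struct MyBox<T> {
    ptr: *mut T,
}

impl<T> MyBox<T> {
    fn new(value: T) -> Self {
        MyBox { ptr: Box::into_raw(Box::new(value)) }
    }

    fn get(&self) -> &T {
        // SAFETY: `ptr` comes from `Box::into_raw` in `new` and is freed only in `drop`.
        unsafe { &*self.ptr }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` was created by `Box::into_raw` and is freed exactly once, here.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// Sound: MyBox<T> may move to another thread only if T itself may.
unsafe impl<T: Send> Send for MyBox<T> {}

// Unsound variant (the bug pattern described above), left commented out:
// unsafe impl<T> Send for MyBox<T> {}
// It would let safe code send, for example, a MyBox<Rc<i32>> to another thread.

// Moves the box into a new thread and reads the value there.
fn read_in_other_thread(boxed: MyBox<i32>) -> i32 {
    thread::spawn(move || *boxed.get()).join().unwrap()
}
```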