2024-11-14 (last edit: 2024-11-14)
Concurrency is when tasks can make progress independently of each other.
Parallelism is when multiple tasks make progress at the same time.
Nothing unusual here.
Threads can be created with the thread::spawn
function docs - please read them!.
This method returns a JoinHandle<T>
which can be used to wait for the thread to finish. T
is the type of the thread's return value.
Another way to spawn threads is using scope
. Threads created in such way are mandatorily joined at the end of the scope, which guarantees that they will borrow items for no longer that the lifetime of the scope. Hence, they can borrow non-'static
items!
In Rust a panic of one thread doesn't affect the other threads (similar to how Java handles exceptions in threads).
Closures which are used to create threads must take ownership of any values they use. It can be forced with the move
keyword.
use std::thread;
fn main() {
let v = vec![1, 2, 3];
let handle = thread::spawn(move || {
println!("Here's a vector: {:?}", v);
});
handle.join().unwrap();
}
Normal ownership rules still apply. It means that we cannot mutate the vector in the spawned thread from the main thread!
But what if we need to share some state?
They are marker traits used to indicate that a type or a reference to it can be used across threads. See the nomicon for more information.
- A type is
Send
if it is safe to move it (send it) to another thread.- A type is
Sync
if it is safe to share (sync) between threads (T
isSync
if and only if&T
isSend
).
This makes sense, because Sync
is about sharing object between threads, and &
is the shared reference.
There is also a great answer on Rust forum, listing + explaining example types that are !Send
or !Sync
.
For more convenient analysis, examples are listed here:
Send + !Sync
:UnsafeCell
(=> Cell
, RefCell
);!Send + !Sync
:Rc
*const T
, *mut T
(raw pointers)!Send + Sync
:MutexGuard
(hint: !Send
for POSIX reasons)Exercise for the reader: explain reasons for all limitations of the above types.
One possible way is to use message passing. We can use a blocking queue (called mpsc
- "multi producer single consumer FIFO queue") to do it.
We talked about blocking queues in the Concurrent programming class. In Rust, they are strongly-typed. Sending and receiving ends have different types.
In Rust, a mutex wraps a value and makes it thread-safe.
Because it becomes a part of the type, it's impossible to access the underlying value in an unsynchronized manner. It is conceptually similar to the RefCell
type.
Arc
is a smart pointer like Rc
but it can be shared between threads.
Please read more about them in the book.
The docs also mention poisoning
.
RwLocks are similar to mutexes, but they distinguish between read and write locks.
Atomic types are described in the docs.
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::{hint, thread};
fn main() {
let spinlock = Arc::new(AtomicUsize::new(1));
let spinlock_clone = Arc::clone(&spinlock);
let thread = thread::spawn(move|| {
spinlock_clone.store(0, Ordering::SeqCst);
});
// Wait for the other thread to release the lock
while spinlock.load(Ordering::SeqCst) != 0 {
hint::spin_loop();
}
if let Err(panic) = thread.join() {
println!("Thread had an error: {:?}", panic);
}
}
Note that atomic
values don't have to be wrapped in a mutex when shared across threads.
If most types are Sync + Send
, then what stops us from using a standard, non-atomic integer in the example above?
let spinlock = Arc::new(1);
let spinlock_clone = Arc::clone(&spinlock);
let thread = thread::spawn(move|| {
*spinlock_clone += 1;
});
while *spinlock != 0 {
hint::spin_loop();
}
error[E0594]: cannot assign to data in an `Arc`
--> src/main.rs:9:9
|
9 | *spinlock_clone += 1;
| ^^^^^^^^^^^^^^^^^^^^ cannot assign
|
= help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `Arc<i32>`
...so we would have to use a RefCell
to be able to modify the value through a shared reference...
let spinlock = Arc::new(RefCell::new(1));
let spinlock_clone = Arc::clone(&spinlock);
let thread = thread::spawn(move|| {
*spinlock_clone.borrow_mut() += 1;
});
// Wait for the other thread to release the lock
while *spinlock.borrow() != 0 {
hint::spin_loop();
}
...but RefCell
isn't Sync
:
error[E0277]: `RefCell<i32>` cannot be shared between threads safely
--> src/main.rs:9:18
|
9 | let thread = thread::spawn(move|| {
| ^^^^^^^^^^^^^ `RefCell<i32>` cannot be shared between threads safely
|
= help: the trait `Sync` is not implemented for `RefCell<i32>`
= note: required because of the requirements on the impl of `Send` for `Arc<RefCell<i32>>`
= note: required because it appears within the type `[closure@src/main.rs:9:32: 11:6]`
note: required by a bound in `spawn`
And that bound mentioned in the last line looks like this:
pub fn spawn<F, T>(f: F) -> JoinHandle<T> where
F: FnOnce() -> T,
F: Send + 'static,
T: Send + 'static,
Why is it impossible to share a reference to a Mutex
between threads spawned with std::thread::spawn
?
Rayon is a library for parallelization of data processing.
It can be used to parallelize the execution of functions over a collection of data by switching the standard Iterator
to a ParallelIterator
.
It works very similar to Java's parallel streams.
Why do that? Because thread synchronization is hard! Rust prevents data races, but logical races and deadlocks are impossible to prevent!!
Rayon's FAQ is worth reading.
Please work on the first iteration of the big project instead.