Salsa overview

This page contains a brief overview of the pieces of a Salsa program. For a more detailed look, check out the tutorial, which walks through the creation of an entire project end-to-end.

Goal of Salsa

The goal of Salsa is to support efficient incremental recomputation. Salsa is used in rust-analyzer, for example, to help it re-analyze your program quickly as you type.

The basic idea of a Salsa program is like this:


let mut input = ...;
loop {
    let output = your_program(&input);
    modify(&mut input);
}

You start out with an input that has some value. You invoke your program to get back a result. Some time later, you modify the input and invoke your program again. Our goal is to make this second call faster by re-using some of the results from the first call.

In reality, of course, you can have many inputs and "your program" may be many different methods and functions defined on those inputs. But this picture still conveys a few important concepts:

  • Salsa separates out the "incremental computation" (the function your_program) from some outer loop that is defining the inputs.
  • Salsa gives you the tools to define your_program.
  • Salsa assumes that your_program is a purely deterministic function of its inputs, or else this whole setup makes no sense.
  • The mutation of inputs always happens outside of your_program, as part of this outer loop.
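
As a rough illustration of the loop above, here is a minimal sketch in plain Rust that does not use Salsa at all. The names Session, run, and recomputations are invented for this example, and your_program is a stand-in that counts words. It only reuses a result when the whole input is unchanged, which is far coarser than what Salsa does, but it shows the split between a pure computation and an outer driver that owns the input.

```rust
/// Hypothetical stand-in for `your_program`: a pure function of its input.
fn your_program(input: &str) -> usize {
    input.split_whitespace().count()
}

/// Remembers the last input and its result so that an unchanged input
/// does not trigger a recomputation. (Invented for this sketch.)
struct Session {
    last: Option<(String, usize)>,
    recomputations: u32,
}

impl Session {
    fn new() -> Self {
        Session { last: None, recomputations: 0 }
    }

    /// Called from the outer loop; the input is only read here, never modified.
    fn run(&mut self, input: &str) -> usize {
        if let Some((old_input, old_output)) = &self.last {
            if old_input == input {
                return *old_output; // reuse the result of the previous call
            }
        }
        let output = your_program(input);
        self.recomputations += 1;
        self.last = Some((input.to_string(), output));
        output
    }
}
```

Calling run twice with the same text executes your_program once; changing the text between calls forces a second execution.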

Database

Each time you run your program, Salsa remembers the values of each computation in a database. When the inputs change, it consults this database to look for values that can be reused. The database is also used to implement interning (making a canonical version of a value that can be copied around and cheaply compared for equality) and other convenient Salsa features.

Inputs

Every Salsa program begins with an input. Inputs are special structs that define the starting point of your program. Everything else in your program is ultimately a deterministic function of these inputs.

For example, in a compiler, there might be an input defining the contents of a file on disk:


#[salsa::input]
pub struct ProgramFile {
    pub path: PathBuf,
    pub contents: String,
}

You create an input by using the new method. Because the values of input fields are stored in the database, you also give an &-reference to the database:


let file: ProgramFile = ProgramFile::new(
    &db,
    PathBuf::from("some_path.txt"),
    String::from("fn foo() { }"),
);

Mutable access is not needed since creating a new input cannot affect existing tracked data in the database.

Salsa structs are just integers

The ProgramFile struct generated by the salsa::input macro doesn't actually store any data. It's just a newtyped integer id:


// Generated by the `#[salsa::input]` macro:
#[derive(Copy, Clone, PartialEq, Eq, Hash)]
pub struct ProgramFile(salsa::Id);

This means that, when you have a ProgramFile, you can easily copy it around and put it wherever you like. To actually read any of its fields, however, you will need to use the database and a getter method.
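
To make the "id plus database" split concrete, the following sketch models it with the standard library only. FileId and Storage are hypothetical names, not Salsa types, and new_file takes &mut self for simplicity, whereas Salsa's new only needs a &-reference to the database.

```rust
/// Hypothetical id type: a small `Copy` handle that stores no field data.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct FileId(u32);

/// Hypothetical storage that owns the actual field values.
#[derive(Default)]
struct Storage {
    contents: Vec<String>,
}

impl Storage {
    /// Stores the field data and hands back an id pointing at it.
    fn new_file(&mut self, contents: String) -> FileId {
        let id = FileId(self.contents.len() as u32);
        self.contents.push(contents);
        id
    }

    /// Reading a field requires both the id and the storage.
    fn contents(&self, file: FileId) -> &str {
        &self.contents[file.0 as usize]
    }
}
```

Because FileId is Copy, it can be passed around freely; the data itself never leaves Storage.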

Reading fields and returns(mode)

You can access the value of an input's fields by using the getter method. As this is only reading the field, it just needs a &-reference to the database:


let contents: &str = file.contents(&db);

Field getters return a reference into the database by default. Use #[returns(copy)] for Copy fields or #[returns(clone)] to return an owned clone instead. #[returns(deref)] borrows through Deref, so a String field returns a &str. For optional and fallible values, #[returns(as_ref)] converts an Option<T> or Result<T, E> into references, while #[returns(as_deref)] also borrows through Deref (for example, converting Option<String> into Option<&str>).


#[salsa::input]
pub struct ProgramFile {
    #[returns(clone)]
    pub path: PathBuf,
    #[returns(deref)]
    pub contents: String,
}

Now file.path(&db) returns a PathBuf, while file.contents(&db) returns a &str.
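
For readers who want to see what each mode corresponds to in ordinary Rust, here is a hand-written analogy using only the standard library. FileData and its methods are invented for illustration; the getters Salsa generates also take the database as an argument, which this sketch leaves out.

```rust
use std::path::PathBuf;

/// Hypothetical plain struct standing in for the data stored in the database.
struct FileData {
    path: PathBuf,
    contents: String,
    size: u64,
    owner: Option<String>,
}

impl FileData {
    /// Analogous to the default mode: a reference to the stored value.
    fn path_ref(&self) -> &PathBuf {
        &self.path
    }

    /// Analogous to `returns(clone)`: an owned clone.
    fn path_cloned(&self) -> PathBuf {
        self.path.clone()
    }

    /// Analogous to `returns(copy)`: a copy of a `Copy` value.
    fn size_copied(&self) -> u64 {
        self.size
    }

    /// Analogous to `returns(deref)`: borrow through `Deref` (`String` to `str`).
    fn contents_deref(&self) -> &str {
        &self.contents
    }

    /// Analogous to `returns(as_ref)`: `Option<String>` becomes `Option<&String>`.
    fn owner_as_ref(&self) -> Option<&String> {
        self.owner.as_ref()
    }

    /// Analogous to `returns(as_deref)`: `Option<String>` becomes `Option<&str>`.
    fn owner_as_deref(&self) -> Option<&str> {
        self.owner.as_deref()
    }
}
```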

Writing input fields

Finally, you can also modify the value of an input field by using the setter method. Since this is modifying the input, and potentially invalidating data derived from it, the setter takes an &mut-reference to the database:


use salsa::Setter as _;

file.set_contents(&mut db).to(String::from("fn foo() { /* add a comment */ }"));

Note that the setter method set_contents returns a "builder". The builder lets you configure the durability and other advanced options of the write.
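
The builder shape can be sketched in plain Rust as follows. This is not Salsa's implementation: Input, ContentsSetter, the two-level Durability enum, the revision counter, and the choice to return the previous value from to are all assumptions made for this example.

```rust
/// Hypothetical durability levels (invented for this sketch).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Durability {
    Low,
    High,
}

/// Hypothetical input record with a revision counter bumped on each write.
struct Input {
    contents: String,
    durability: Durability,
    revision: u64,
}

/// The "builder" returned by the setter; nothing is written until `to` is called.
struct ContentsSetter<'a> {
    input: &'a mut Input,
    durability: Durability,
}

impl Input {
    fn new(contents: &str) -> Self {
        Input { contents: contents.to_string(), durability: Durability::Low, revision: 0 }
    }

    /// Requires `&mut self` because the write may invalidate derived data.
    fn set_contents(&mut self) -> ContentsSetter<'_> {
        let durability = self.durability;
        ContentsSetter { input: self, durability }
    }
}

impl ContentsSetter<'_> {
    /// Optional configuration step before the write.
    fn with_durability(mut self, durability: Durability) -> Self {
        self.durability = durability;
        self
    }

    /// Performs the write and returns the previous value (a choice of this sketch).
    fn to(self, value: String) -> String {
        let ContentsSetter { input, durability } = self;
        input.durability = durability;
        input.revision += 1;
        std::mem::replace(&mut input.contents, value)
    }
}
```

Splitting the write into set_contents and to leaves room for optional steps such as with_durability without needing a separate setter for every combination of options.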

Tracked functions

Once you've defined your inputs, the next thing to define is your tracked functions:


#[salsa::tracked]
fn parse_file<'db>(db: &'db dyn crate::Db, file: ProgramFile) -> Ast<'db> {
    let contents: &str = file.contents(db);
    ...
}

When you call a tracked function, Salsa will track which inputs it accesses (in this example, file.contents(db)). It will also memoize the return value (the Ast, in this case). If you call a tracked function twice, Salsa checks if the inputs have changed; if not, it can return the memoized value. The algorithm Salsa uses to decide when a tracked function needs to be re-executed is called the red-green algorithm, and it's where the name Salsa comes from.
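
The following is a heavily simplified model of revision-based memoization, written in the spirit of the check described above rather than as a description of Salsa's real algorithm. Everything in it is hypothetical: Db, Memo, the two hard-coded queries word_count and is_long, and the executions log. Each memo records the revision in which its value last changed and the revision in which it was last verified; a query re-executes only if something it depends on changed after that verification.

```rust
/// A memoized value plus revision bookkeeping (invented for this sketch).
struct Memo<T> {
    value: T,
    changed_at: u64,  // revision in which `value` last changed
    verified_at: u64, // revision in which `value` was last known to be current
}

struct Db {
    revision: u64,
    text: String,
    text_changed_at: u64,
    word_count: Option<Memo<usize>>,
    is_long: Option<Memo<bool>>,
    executions: Vec<&'static str>, // log of which query bodies actually ran
}

impl Db {
    fn new(text: &str) -> Self {
        Db {
            revision: 1,
            text: text.to_string(),
            text_changed_at: 1,
            word_count: None,
            is_long: None,
            executions: Vec::new(),
        }
    }

    /// Changing the input starts a new revision.
    fn set_text(&mut self, text: &str) {
        self.revision += 1;
        self.text = text.to_string();
        self.text_changed_at = self.revision;
    }

    /// Depends on the input; returns the value and when it last changed.
    fn word_count(&mut self) -> (usize, u64) {
        if let Some(memo) = &mut self.word_count {
            if self.text_changed_at <= memo.verified_at {
                memo.verified_at = self.revision;
                return (memo.value, memo.changed_at);
            }
        }
        self.executions.push("word_count");
        let value = self.text.split_whitespace().count();
        // If the recomputed value equals the old one, keep the old `changed_at`
        // so that dependents can still reuse their memoized results.
        let changed_at = match &self.word_count {
            Some(old) if old.value == value => old.changed_at,
            _ => self.revision,
        };
        self.word_count = Some(Memo { value, changed_at, verified_at: self.revision });
        (value, changed_at)
    }

    /// Depends only on `word_count`, not on the text directly.
    fn is_long(&mut self) -> bool {
        let (count, count_changed_at) = self.word_count();
        if let Some(memo) = &mut self.is_long {
            if count_changed_at <= memo.verified_at {
                memo.verified_at = self.revision;
                return memo.value;
            }
        }
        self.executions.push("is_long");
        let value = count > 3;
        self.is_long = Some(Memo { value, changed_at: self.revision, verified_at: self.revision });
        value
    }
}
```

In this model, replacing the text "a b" with "c d" re-runs word_count (the input changed) but not is_long, because the word count is still 2.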

Tracked functions have to follow a particular structure:

  • They must take a &-reference to the database as their first argument.
    • Note that because this is an &-reference, it is not possible to modify inputs during a tracked function!
  • They may take no other arguments, one Salsa struct, or multiple arguments that implement Eq and Hash. A single Salsa struct can be used directly as the query key, whereas multiple arguments are interned together to create a key.

Tracked functions return a reference to their memoized value by default, so callers of parse_file receive an &Ast<'_>. Use #[salsa::tracked(returns(clone))] to clone the value out of the database instead.

Tracked structs

Tracked structs are intermediate structs created during your computation. Like inputs, their fields are stored inside the database, and the struct itself just wraps an id. Unlike inputs, they can only be created inside a tracked function, and their fields can never change once they are created (until the next revision, at least). Getter methods are provided to read the fields, but there are no setter methods. Example:


#[salsa::tracked]
struct Ast<'db> {
    #[tracked]
    #[returns(deref)]
    top_level_items: Vec<Item<'db>>,
}

Just as with an input, new values are created by invoking Ast::new. The new function on a tracked struct only requires a &-reference to the database:


#[salsa::tracked]
fn parse_file<'db>(db: &'db dyn crate::Db, file: ProgramFile) -> Ast<'db> {
    let contents: &str = file.contents(db);
    let mut parser = Parser::new(contents);
    let mut top_level_items = vec![];
    while let Some(item) = parser.parse_top_level_item() {
        top_level_items.push(item);
    }
    Ast::new(db, top_level_items) // <-- create an Ast!
}

Identity fields and #[tracked] fields

By default, a field is part of the tracked struct's identity. Fields annotated with #[tracked] are not part of its identity; their values can be updated when Salsa matches the struct across executions. If a tracked field's value has not changed, then other tracked functions that only read that field will not be re-executed.

For example, imagine that we had a tracked struct for items in the file:


#[salsa::tracked]
struct Item<'db> {
    name: Word<'db>, // we'll define Word in a second!
    #[tracked]
    body: Ast<'db>,
}

Here name is part of the item's identity, while body can change without changing the item's identity.
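
One way to picture the distinction is the toy model below, which matches items across revisions by name and records when each body last changed. It is only a sketch: ItemStore, ItemId, and the revision bookkeeping are invented, and Salsa's actual matching of tracked structs across executions is more involved than a lookup by name.

```rust
use std::collections::HashMap;

#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct ItemId(u32);

/// Hypothetical store: `name` acts as the identity, `body` as a tracked field.
#[derive(Default)]
struct ItemStore {
    ids_by_name: HashMap<String, ItemId>,
    bodies: Vec<String>,
    body_changed_at: Vec<u64>,
    revision: u64,
}

impl ItemStore {
    fn next_revision(&mut self) {
        self.revision += 1;
    }

    /// Creating an item whose identity was seen before yields the same id;
    /// only the tracked field is updated, and only if its value differs.
    fn new_item(&mut self, name: &str, body: &str) -> ItemId {
        if let Some(&id) = self.ids_by_name.get(name) {
            let slot = id.0 as usize;
            if self.bodies[slot] != body {
                self.bodies[slot] = body.to_string();
                self.body_changed_at[slot] = self.revision;
            }
            return id;
        }
        let id = ItemId(self.bodies.len() as u32);
        self.ids_by_name.insert(name.to_string(), id);
        self.bodies.push(body.to_string());
        self.body_changed_at.push(self.revision);
        id
    }

    /// Code that only reads `body` can skip work if this revision is old enough.
    fn body_changed_at(&self, id: ItemId) -> u64 {
        self.body_changed_at[id.0 as usize]
    }
}
```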

Specify the result of tracked functions for particular structs

Sometimes it is useful to define a tracked function but supply its value directly for some particular struct. For example, maybe the default way to compute the representation for a function is to read the AST, but you also have some built-in functions in your language and you want to hard-code their results. This can also be used to simulate a field that is initialized after the tracked struct is created.

To support this use case, you can use the specify method associated with tracked functions. To enable this method, you need to add the specify flag to the function to alert users that its value may sometimes be specified externally.


#[salsa::tracked(specify)] // <-- specify flag required
fn representation<'db>(db: &'db dyn crate::Db, item: Item<'db>) -> Representation {
    // read the user's input AST by default
    let ast = ast(db, item);
    // ...
}

#[salsa::tracked]
fn create_builtin_item<'db>(db: &'db dyn crate::Db) -> Item<'db> {
    let i = Item::new(db, ...);
    let r = hardcoded_representation();
    representation::specify(db, i, r); // <-- use the method!
    i
}

Specifying is only possible for tracked functions that take a single tracked struct as an argument (besides the database). The call to specify must occur in the same tracked-function invocation that created that struct.
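
Conceptually, specifying a value amounts to filling in the memo for one key before the function is ever called with it. The sketch below models that idea with a plain HashMap; Db, ItemId, specify_representation, and the "compiled(...)" output format are all hypothetical and do not reflect how Salsa stores specified values.

```rust
use std::collections::HashMap;

#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct ItemId(u32);

/// Hypothetical database with a memo table for one function.
#[derive(Default)]
struct Db {
    sources: HashMap<ItemId, String>,
    representation_memo: HashMap<ItemId, String>,
    default_runs: u32, // how often the default computation actually ran
}

impl Db {
    /// Pre-populates the memo for `item`, bypassing the default computation.
    fn specify_representation(&mut self, item: ItemId, value: String) {
        self.representation_memo.insert(item, value);
    }

    /// Returns the specified or memoized value, computing the default otherwise.
    fn representation(&mut self, item: ItemId) -> String {
        if let Some(value) = self.representation_memo.get(&item) {
            return value.clone();
        }
        self.default_runs += 1;
        let source = self.sources.get(&item).cloned().unwrap_or_default();
        let value = format!("compiled({})", source);
        self.representation_memo.insert(item, value.clone());
        value
    }
}
```

Callers of representation cannot tell whether a value was computed or specified, which is why the function is marked with the specify flag in the real API.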

Interned structs

The final kind of Salsa struct is the interned struct. Interned structs are useful for quick equality comparison. They are commonly used to represent strings or other primitive values.

Most compilers, for example, will define a type to represent a user identifier:


#[salsa::interned]
struct Word<'db> {
    #[returns(deref)]
    pub text: String,
}

As with input and tracked structs, the Word struct itself is just a newtyped integer, and the actual data is stored in the database.

You can create a new interned struct using new, just like with input and tracked structs:


let w1 = Word::new(db, "foo".to_string());
let w2 = Word::new(db, "bar".to_string());
let w3 = Word::new(db, "foo".to_string());

When you create two interned structs with the same field values, you are guaranteed to get back the same integer id. So here, both assert_eq!(w1, w3) and assert_ne!(w1, w2) hold.

You can access the fields of an interned struct using a getter, like word.text(db), which returns a &str because of the #[returns(deref)] annotation. The fields of interned structs are immutable.
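
Interning itself is a general technique, and a minimal version fits in a few lines of standard-library Rust. Interner and WordId below are hypothetical names for this sketch; Salsa's interned structs are backed by the database and only need a &-reference to create, unlike the &mut self used here.

```rust
use std::collections::HashMap;

/// Hypothetical interned handle: cheap to copy and compare.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
struct WordId(u32);

#[derive(Default)]
struct Interner {
    ids: HashMap<String, WordId>, // text -> id, used to find existing entries
    texts: Vec<String>,           // id -> text, used by the getter
}

impl Interner {
    /// Returns the existing id for `text`, or allocates a new one.
    fn intern(&mut self, text: &str) -> WordId {
        if let Some(&id) = self.ids.get(text) {
            return id;
        }
        let id = WordId(self.texts.len() as u32);
        self.ids.insert(text.to_string(), id);
        self.texts.push(text.to_string());
        id
    }

    /// Reads the interned data back out.
    fn text(&self, word: WordId) -> &str {
        &self.texts[word.0 as usize]
    }
}
```

Comparing two WordId values is a single integer comparison, regardless of how long the underlying strings are.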

Accumulators

The final Salsa concept is the accumulator. Accumulators are a way to report errors or other "side channel" information that is separate from the main return value of your function.

To create an accumulator, you declare a type as an accumulator:


#[salsa::accumulator]
pub struct Diagnostics(String);

Now, during a tracked function's execution, you can accumulate values of this type:


use salsa::Accumulator as _;

Diagnostics("some_string".to_string()).accumulate(db);

Then later, from outside the execution, you can ask for the set of diagnostics that were accumulated by some particular tracked function. For example, imagine that we have a type-checker and, during type-checking, it reports some diagnostics:


#[salsa::tracked]
fn type_check<'db>(db: &'db dyn crate::Db, item: Item<'db>) {
    // ...
    Diagnostics("some error message".to_string()).accumulate(db);
    // ...
}

We can then later invoke the associated accumulated function to get references to all the Diagnostics values that were accumulated:


let diagnostics: Vec<&Diagnostics> = type_check::accumulated::<Diagnostics>(db, item);
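
The side-channel idea can be modeled without Salsa as a table of diagnostics kept next to, rather than inside, the function's return value. In the sketch below, SideChannel, Diagnostic, the integer item key, and the toy "every line must end with a semicolon" rule are all invented for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Diagnostic(String);

/// Hypothetical side channel: diagnostics grouped by the item being checked.
#[derive(Default)]
struct SideChannel {
    by_item: HashMap<u32, Vec<Diagnostic>>,
}

/// Toy checker: the main result is a bool, errors go to the side channel.
fn type_check(channel: &mut SideChannel, item: u32, source: &str) -> bool {
    let reported = channel.by_item.entry(item).or_default();
    reported.clear(); // a re-run replaces what the previous run reported
    let mut ok = true;
    for (index, line) in source.lines().enumerate() {
        let line = line.trim();
        if !line.is_empty() && !line.ends_with(';') {
            reported.push(Diagnostic(format!("line {}: missing `;`", index + 1)));
            ok = false;
        }
    }
    ok
}

/// Read back, from outside the checker, what was reported for `item`.
fn accumulated(channel: &SideChannel, item: u32) -> Vec<&Diagnostic> {
    match channel.by_item.get(&item) {
        Some(reported) => reported.iter().collect(),
        None => Vec::new(),
    }
}
```

Keeping diagnostics out of the return type means callers that only need the main result are not forced to thread error lists through every function.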