Introduction

This book contains a collection of Rust Exercises, written by Ferrous Systems.

We use these exercises as part of our Rust Training, but you are welcome to try them for yourself as well.

Source Code

The source code for this book can be found at https://github.com/ferrous-systems/rust-exercises. It is open sourced as a contribution to the growth of the Rust language.

If you wish to fund further development of the course, why not book a training with us?

Icons and Formatting we use

We use icons to mark different kinds of information in the book:

  • ✅ Call for action
  • ❗️ Warnings and details that require special attention
  • 🔎 Knowledge that gets you deeper into the subject, but that you do not have to understand completely to proceed
  • 💬 Descriptions for accessibility

Note: Notes like this one contain helpful information

Course Material

We have attempted to make our material as inclusive as possible. This means that some information is available in several forms, for example as a picture and as a text description. We also use icons so that different kinds of information are visually distinguishable at first glance. If you are on a course and have accessibility needs that are not covered, please let us know.

License

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

We encourage the use of this material, under the terms of the above license, in the production and/or delivery of commercial or open-source Rust training programmes.

Copyright (c) Ferrous Systems, 2023

Fizzbuzz

In this exercise, you will implement your first tiny program in Rust: FizzBuzz. FizzBuzz is easy to implement, but allows for application of Rust patterns in a very clean fashion. If you have never written Rust before, use the cheat sheet for help on syntax.

After completing this exercise you will be able to

  • write a simple Rust program
  • create and return owned Strings
  • use conditionals
  • format strings with and without printing them to the system console
  • write a function with a parameter and return type.

Prerequisites

For completing this exercise you need to have

  • basic programming skills in other languages
  • the Rust Syntax Cheat Sheet

Task

  • Create a new project called fizzbuzz

  • Define a function fn fizzbuzz that implements the following rules:

    • If i is divisible by 3, return the String "Fizz"
    • If i is divisible by 5, return the String "Buzz"
    • If i is divisible by both 3 and 5, return the String "FizzBuzz"
    • If neither of them is true return the number as a String
  • Write a main function that implements the following:

    • Iterate from 1 to 100 inclusive.
    • On each iteration the integer is tested with fn fizzbuzz
    • print the returned value.

If you need it, we have provided a complete solution for this exercise.

Knowledge

Printing to console

The recommended way to print to the console in this exercise is println!. println! always needs a format string - it uses {} as a placeholder to mean print the next argument, like Python 3 or C#.

#![allow(unused)]
fn main() {
let s = String::from("Fizz");
println!("The value of s is {}. That's nice.", s);
}

Creating Strings

The two recommended ways to get a String type for this exercise are:

#![allow(unused)]
fn main() {
// 1.
let string = String::from("Fizz");

let i = 4;
let string = i.to_string();

// 2. 
let string = format!("Buzz");

let i = 4;
let string = format!("{}", i);
}

Returning data

If you have issues returning data from multiple branches of your solution, liberally use return.

#![allow(unused)]
fn main() {
fn returner() -> String {
    let x = 10;
    if x % 5 == 0 {
        return String::from("Buzz");
    }
    String::from("Fizz")
}

}
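
Once early returns feel comfortable, the same logic can also be written without return at all, because if/else is an expression in Rust. The following is a minimal sketch; the x parameter is an assumption added here so that both branches can be exercised.

```rust
// Hypothetical variant of `returner` that takes `x` as a parameter.
fn returner(x: u32) -> String {
    // The whole if/else is the last expression of the function,
    // so the value of the chosen branch becomes the return value.
    if x % 5 == 0 {
        String::from("Buzz")
    } else {
        String::from("Fizz")
    }
}

fn main() {
    println!("{}", returner(10));
    println!("{}", returner(3));
}
```

Note that neither branch ends with a semicolon. Both branches must produce the same type, here String.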

Step-by-Step-Solution

In general, we also recommend using the Rust documentation to look up things you are missing; this also helps you become familiar with it. If you ever feel completely stuck or that you haven't understood something, please hail the trainers quickly.

Step 1: New Project

Create a new binary Cargo project, check the build and see if it runs.

Solution
cargo new fizzbuzz 
cd fizzbuzz 
cargo run

Step 2: Counting from 1 to 100 in fn main()

Print the numbers from 1 to 100 (inclusive) to console. Use a for loop. Running this code should print the numbers from 1 to 100.

Solution
fn main() {
    for i in 1..=100 {
        println!("{}", i);
    }
}

Step 3: The function fn fizzbuzz

✅ Function Signature

Create the function with the name fizzbuzz. It takes an unsigned 32-bit integer as an argument and returns a String type.

Solution
#![allow(unused)]
fn main() {
fn fizzbuzz(i: u32) -> String {
    unimplemented!()
}
}

✅ Function Body

Use if statements with math operators to implement the following rules:

  • If i is divisible by 3, return the String "Fizz"
  • If i is divisible by 5, return the String "Buzz"
  • If i is divisible by both 3 and 5, return the String "FizzBuzz"
  • If neither of them is true return the number as a String

Running this code should still only print the numbers from 1 to 100.

Solution
#![allow(unused)]
fn main() {
fn fizzbuzz(i: u32) -> String {
    if i % 3 == 0 && i % 5 == 0 {
        format!("FizzBuzz")
    } else if i % 3 == 0 {
        format!("Fizz")
    } else if i % 5 == 0 {
        format!("Buzz")
    } else {
        format!("{}", i)
    }
}
}

Step 4: Call the function

Add the function call to fn fizzbuzz() to the formatted string in the println!() statement.

Running this code should print numbers, interlaced with Fizz, Buzz and FizzBuzz according to the rules mentioned above.

Solution
fn fizzbuzz(i: u32) -> String {
    if i % 3 == 0 && i % 5 == 0 {
        format!("FizzBuzz")
    } else if i % 3 == 0 {
        format!("Fizz")
    } else if i % 5 == 0 {
        format!("Buzz")
    } else {
        format!("{}", i)
    }
}

fn main() {
    for i in 1..=100 {
        println!("{}", fizzbuzz(i));
    }
}

Fizzbuzz Cheat Sheet

This is a syntax cheat sheet to be used with the Fizzbuzz exercise.

Variables

#![allow(unused)]
fn main() {
let thing = 42; // an immutable variable
let mut thing = 43; // a mutable variable
}

Functions

// a function with one argument, no return.
fn number_crunch(input: u32) {
    // function body
}

// a function with two arguments and a return type.
fn division_machine(dividend: f32, divisor: f32) -> f32 {
    // function body
    let quotient = dividend / divisor;

    // return line does not have a semi-colon!
    quotient
}

fn main() {
    
    let cookies = 1000.0_f32;
    let cookie_monsters = 1.0_f32;

    // calling a function 
    let number = division_machine(cookies, cookie_monsters);
}

for loops and ranges

#![allow(unused)]
fn main() {
// for loop with end-exclusive range
for i in 0..10 {
    // do this
}

// for loop with end-inclusive range
for j in 0..=10 {
    // do that 
}
}

if statements

#![allow(unused)]
fn main() {
let number = 4;

if number == 4 {
    println!("This happens");
} else if number == 5 {
    println!("Something else happens");
} else {
    println!("Or this happens");
}

// condition can be anything that evaluates to a bool

}

Operators (Selection)

Operator  Example        Explanation
!=        expr != expr   Nonequality comparison
==        expr == expr   Equality comparison
&&        expr && expr   Short-circuiting logical AND
||        expr || expr   Short-circuiting logical OR
%         expr % expr    Arithmetic remainder
/         expr / expr    Arithmetic division
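
To see these operators in action, here is a small sketch. The helper is_divisible_by is a hypothetical name introduced only for this illustration; it is not part of the exercise.

```rust
// Hypothetical helper: true when `n` is evenly divisible by `d`.
fn is_divisible_by(n: u32, d: u32) -> bool {
    n % d == 0
}

fn main() {
    let n = 15;

    // `/` on integers discards the fractional part; `%` yields the remainder.
    println!("{} / 4 = {}", n, n / 4);
    println!("{} % 4 = {}", n, n % 4);

    // `&&` only evaluates its right-hand side if the left-hand side is true.
    if is_divisible_by(n, 3) && is_divisible_by(n, 5) {
        println!("{} is divisible by both 3 and 5", n);
    }

    // `||` only evaluates its right-hand side if the left-hand side is false.
    if is_divisible_by(n, 7) || is_divisible_by(n, 5) {
        println!("{} is divisible by 7 or by 5", n);
    }
}
```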

Fizzbuzz with match

In this exercise you will modify your previously written fizzbuzz to use match statements instead of if statements.

After completing this exercise you will be able to

  • use match statements
  • define a tuple

Prerequisites

For completing this exercise you need to have

  • a working fizzbuzz

Task

Rewrite the body of fn fizzbuzz() so the different cases are not distinguished with if statements, but with pattern matching of a tuple containing the remainders.

If you need it, we have provided a complete solution for this exercise.

Knowledge

Tuple

A tuple is a collection of values of different types. Tuples are constructed using parentheses (), and each tuple itself is a value with type signature (T1, T2, ...), where T1, T2 are the types of its members. Functions can use tuples to return multiple values, as tuples can hold any number of values. When a tuple is matched against a pattern, the _ placeholder can stand in for any of its members.

#![allow(unused)]
fn main() {
// A tuple with a bunch of different types.
let long_tuple = (1u8, 2u16, 3u32, 4u64,
                      -1i8, -2i16, -3i32, -4i64,
                      0.1f32, 0.2f64,
                      'a', true);
}
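
As a sketch of how a function can return multiple values through a tuple, consider the following. The function div_rem is a hypothetical helper made up for this illustration.

```rust
// Hypothetical helper that returns two values at once as a tuple.
fn div_rem(dividend: u32, divisor: u32) -> (u32, u32) {
    (dividend / divisor, dividend % divisor)
}

fn main() {
    let pair = div_rem(17, 5);
    // Members of a tuple are accessed by position...
    println!("quotient: {}, remainder: {}", pair.0, pair.1);

    // ...or by destructuring. `_` ignores a member we do not need.
    let (_, remainder) = div_rem(17, 5);
    println!("remainder only: {}", remainder);
}
```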

Step-by-Step-Solution

We assume you have deleted the entire function body of fn fizzbuzz() before you get started.

Step 1: The Tuple

Define a tuple that consists of the remainder of the integer i divided by 3 and the remainder of i divided by 5.

Solution
#![allow(unused)]
fn main() {
let i = 10;
let remainders = (i%3, i%5);
}

Step 2: Add the match statement with its arms

The tuple patterns that are relevant for us are combinations of 0 and the placeholder _ (underscore). _ stands for any value. Think about which combinations of 0 and _ represent which rules. Add the match arms accordingly.

Solution
#![allow(unused)]
fn main() {
fn fizzbuzz(i: u32) -> String {
    let remainders = (i%3, i%5);

    match remainders {
        (0, 0) => format!("FizzBuzz"),
        (0, _) => format!("Fizz"),
        (_, 0) => format!("Buzz"),
        (_, _) => format!("{}", i),
    }
}
}

Rustlatin

In this exercise we will implement a Rust-y, simpler variant of Pig Latin: depending on whether a word starts with a vowel or not, either a prefix or a suffix is added to the word.

Learning Goals

You will learn how to

  • create a Rust library
  • split a &str at specified char
  • get single char out of a &str
  • iterate over a &str
  • define Globals
  • compare a value to the content of an array
  • use the Rust compiler's type inference to your advantage
  • concatenate &str
  • return the content of a Vec<String> as String.

Prerequisites

You must be able to

  • define variables as mutable
  • use for loop
  • use an if/else construction
  • read Rust documentation
  • define a function with signature and return type
  • define arrays and vectors
  • distinguish between String and &str

For this exercise we define

  • the vowels of the English alphabet → ['a', 'e', 'i', 'o', 'u']
  • a sentence is a collection of Unicode characters with words that are separated by a space character (U+0020)

Task

✅ Implement a function that splits a sentence into its words, and adds a suffix or prefix to them according to the following rules:

  • If the word begins with a vowel add prefix “sr” to the word.

  • If the word does not begin with a vowel add suffix “rs” to the word.

The function returns a String containing the modified words.

In order to learn as much as possible we recommend following the step-by-step solution.

Getting started

Find the exercise template in ../../exercise-templates/rustlatin

The folder contains each step as its own numbered project, containing a lib.rs file. Each lib.rs contains starter code and a test that needs to pass in order for the step to be considered complete.

Complete solutions are available in ../../exercise-solutions/rustlatin

Knowledge

Rust Analyzer

Part of this exercise is seeing type inference in action and using it to determine the type the function is going to return. To make sure the file can be indexed by Rust Analyzer, open the relevant step by itself, e.g. exercise-templates/rustlatin/step1. You can close each step when complete and open the next one.

Step-by-step-Solution

Step 1: Splitting a sentence and pushing its words into a vector.

✅ Iterate over the sentence to split it into words, using the space character as the separator. This can be done with the .split() method, where the separator character ' ' goes into the parentheses. This method returns an iterator over substrings of the string slice.

In Rust, iterators are lazy: just calling .split() on a &str doesn't do anything by itself. It needs to be combined with something that advances the iteration, such as a for loop or a manual call to the .next() method. These yield the actual items you want to use.

Push each word into the vector collection_of_words. Add the correct return type to the function signature.

✅ Run the test to see if it passes.

Solution
#![allow(unused)]
fn main() {
fn rustlatin(sentence: &str) -> Vec<String> {
    let mut collection_of_words = Vec::new();

    for word in sentence.split(' ') {
        collection_of_words.push(word.to_string())
    }
    collection_of_words
}
}

Step 2: Concatenating String types.

✅ After iterating over the sentence to split it into words, add the suffix "rs" to each word before pushing it to the vector.

✅ To concatenate two &str the first needs to be turned into the owned type with .to_owned(). Then String and &str can be added using +.

✅ Add the correct return type to the function signature.

✅ Run the test to see if it passes.

Solution
#![allow(unused)]
fn main() {
fn rustlatin(sentence: &str) -> Vec<String> {
    let mut collection_of_words = Vec::new();

    for word in sentence.split(' ') {
        collection_of_words.push(word.to_owned() + "rs");
    }
    collection_of_words
}
}

Step 3: Iterating over a word to return the first character.

✅ After iterating over the sentence to split it into words, add the first character of each word to the vector.

✅ Check the Rust documentation on the primitive str type for a method that returns an iterator over the chars of a &str. The char type holds a Unicode Scalar Value that represents a single character (although be aware that the definition of a character is complex when it comes to emojis and other non-English text).

Since iterators don't do anything by themselves, the iterator needs to be advanced first, with the .next() method. This method returns an Option<Self::Item>, where Self::Item is the char in this case. You don't need to handle it with pattern matching here; a simple unwrap() will do, as a None is not expected to happen.

✅ Add the correct return type to the function signature. Run the test to see if it passes.

Solution
#![allow(unused)]
fn main() {
fn rustlatin(sentence: &str) -> Vec<char> {
    let mut collection_of_chars = Vec::new();

    for word in sentence.split(' ') {
        let first_char = word.chars().next().unwrap();
        collection_of_chars.push(first_char);
    };
    collection_of_chars
}
}
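
The unwrap() above relies on every word having at least one character. If the input could contain two consecutive spaces, .split(' ') would yield an empty word and unwrap() would panic. The following sketch shows one possible way to skip such words instead; the function name first_chars is hypothetical, and this variant is not required to pass the tests of this step.

```rust
// Hypothetical variant that skips empty words instead of panicking.
fn first_chars(sentence: &str) -> Vec<char> {
    let mut collection_of_chars = Vec::new();

    for word in sentence.split(' ') {
        // `.next()` returns None for an empty word; in that case
        // the body of the `if let` is simply not executed.
        if let Some(first_char) = word.chars().next() {
            collection_of_chars.push(first_char);
        }
    }
    collection_of_chars
}

fn main() {
    // Note the two spaces between the words.
    println!("{:?}", first_chars("do  you"));
}
```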

Step 4: Putting everything together: Comparing values and returning the content of the vector as String.

✅ Add another function that checks if the first character of each word is a vowel. contains() is the method to help you with this. It adds the prefix or suffix to the word according to the rules above.

Call the function in each iteration.

In fn rustlatin return the content of the vector as String. Run the tests to see if they pass.

Solution
#![allow(unused)]
fn main() {
const VOWELS: [char; 5] = ['a', 'e', 'i', 'o', 'u'];

fn latinize(word: &str) -> String {
    let first_char_of_word = word.chars().next().unwrap();
    if VOWELS.contains(&first_char_of_word) {
        "sr".to_string() + word
    } else {
        word.to_string() + "rs"
    }
}
}
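
The solution above shows only the helper function. As a sketch of how the remaining instructions of this step could fit together, the following calls latinize in each iteration and joins the vector into a single String. Joining with a space is an assumption based on the sentence definition at the start of this exercise.

```rust
const VOWELS: [char; 5] = ['a', 'e', 'i', 'o', 'u'];

fn latinize(word: &str) -> String {
    let first_char_of_word = word.chars().next().unwrap();
    if VOWELS.contains(&first_char_of_word) {
        "sr".to_string() + word
    } else {
        word.to_string() + "rs"
    }
}

fn rustlatin(sentence: &str) -> String {
    let mut collection_of_words = Vec::new();

    for word in sentence.split(' ') {
        // Call the helper in each iteration.
        collection_of_words.push(latinize(word));
    }
    // Turn the Vec<String> into one String, separated by spaces.
    collection_of_words.join(" ")
}

fn main() {
    println!("{}", rustlatin("implement for the trait"));
}
```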

Step 5 (optional)

If not already done, use functional techniques (i.e. methods on iterators) to write the same function. Test this new function as well.

Solution
#![allow(unused)]
fn main() {
const VOWELS: [char; 5] = ['a', 'e', 'i', 'o', 'u'];

fn rustlatin_match(sentence: &str) -> String {
    // transform each incoming word into its rustlatin form
    let new_words: Vec<_> = sentence
        .split(' ')
        .map(|word| {
            let first_char_of_word = word.chars().next().unwrap();
            if VOWELS.contains(&first_char_of_word) {
                "sr".to_string() + word
            } else {
                word.to_string() + "rs"
            }
        })
        .collect();

    new_words.join(" ")
}
}

URLs, Match and Result

In this exercise you will complete a number of mini exercises to learn about error handling. The final result will be a URL parser that reads lines from a text file and can distinguish between URLs and non-URLs.

In this exercise, you will learn how to

  • handle occurring Result-types with match for basic error handling.

  • decide when to use the .unwrap() method.

  • propagate an error with the ? operator

  • return the Option-type.

  • do some elementary file processing (opening, reading to buffer, counting, reading line by line).

  • navigate the Rust stdlib documentation

  • add external dependencies to your project

Task

Find the exercise template here ../../exercise-templates/urls-match-result

Find the solution to the exercise here ../../exercise-solutions/urls-match-result. You can run them with the following command: cargo run --example step_x, where x is the number of the step.

  1. Fix the runtime error in the template code by correcting the file path. Then, handle the Result type that is returned from the std::fs::read_to_string() with a match block, instead of using .unwrap().

  2. Take the code from Step 1 and instead of using a match, propagate the Error with ? out of fn main(). Note that your main function will now need to return something when it reaches the end.

  3. Take the code from Step 2, and split the String into lines using the lines() method. Use this to count how many lines there are.

  4. Change the code from Step 3 to filter out empty lines using is_empty and print the non-empty ones.

  5. Take your code from Step 4 and write a function like fn parse_url(input: &str) -> Option<url::Url> which checks if the given input: &str is a Url, or not. The function should return Some(url) where url is of type Url, which is from the url crate. Use this function to convert each line and use the returned value to print either Is a URL: <url> or Not a URL.

    The url crate has already been added as a dependency so you can just use url::Url::parse

Knowledge

Option and Result

Both Option and Result are similar in a way. Both have two variants, and depending on what those variants are, the program may continue in a different way.

The Option type can have the variant Some(T) or None. T is a type parameter that means some type should go here, we'll decide which one later. The Option type is used when you have to handle optional values. For example if you want to be able to leave a field of a struct empty, you use the Option type for that field. If the field has a value, it is Some(<value>), if it is empty, it is None.

The variants of the Result type are Ok(t) and Err(e). It is used to handle errors. If an operation was successful, Ok(t) is returned. In Ok(t), t can be the empty tuple or some other value. In Err(e), e is an error value that can usually be printed with println!("Err: {:?}", e);.

Both types can be used with the match keyword. The received value is matched on patterns, each leads to the execution of one of a number of different expressions depending on which arm matches first.
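
The following sketch illustrates both types. The User struct and the display_name function are hypothetical names made up for this illustration; str::parse is a standard library method that returns a Result.

```rust
// Hypothetical struct with an optional field.
struct User {
    name: String,
    nickname: Option<String>,
}

// Returns the nickname if there is one, and the name otherwise.
fn display_name(user: &User) -> String {
    match &user.nickname {
        Some(nick) => nick.clone(),
        None => user.name.clone(),
    }
}

fn main() {
    let user = User {
        name: String::from("Ferris"),
        nickname: None,
    };
    println!("{}", display_name(&user));

    // Parsing text into a number can fail, so `parse` returns a Result.
    let parsed: Result<u32, std::num::ParseIntError> = "42".parse();
    match parsed {
        Ok(number) => println!("Parsed: {}", number),
        Err(e) => println!("Err: {:?}", e),
    }
}
```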

How to use match

match is a way of control flow based on pattern matching. A pattern on the left results in the expression on the right side.

#![allow(unused)]
fn main() {
let value = true;

match value {
   true => println!("This is true!"),
   false => println!("This is false!"),
}
}

Unlike with if/else, every case has to be handled explicitly, at least with a last catch-all arm that uses a placeholder:

#![allow(unused)]
fn main() {
let value = 50_u32;

match value {
    1 => println!("This is one."),
    50 => println!("This is fifty!"),
    _ => println!("This is any other number from 0 to 4,294,967,295."),
}
}

There are different ways to use match:

The return values of the expression can be bound to a variable:

#![allow(unused)]
fn main() {
enum Season {
    Spring,
    Summer,
    Fall,
    Winter
}

fn which_season_is_now(season: Season) -> String {

    let return_value = match season {
        Season::Spring => String::from("It's spring!"),
        Season::Summer => String::from("It's summer!"),
        Season::Fall => String::from("It's Fall!"),
        Season::Winter => String::from("Brrr. It's Winter."),
    };

    return_value
}
}

In case of a Result<T, E>, match statements can be used to get to the inner value.

use std::fs::File;

fn main() {
    let file_result = File::open("hello.txt");

    let _file_itself = match file_result {
        Ok(file) => file,
        Err(error) => panic!("Error opening the file: {:?}", error),
    };
}

All arms of the match tree have to either result in the same type, or they have to diverge (that is, panic the program or return early from the function)!

Template

Start your VSCode in the proper root folder to have Rust-Analyzer working properly.

../../exercise-templates/urls-match-result/

The template builds, but has a runtime error, as the location of the file is wrong. This is intentional.

Your code will use the example data found in

../../exercise-templates/urls-match-result/src/data

Step-by-Step Solution

Step 1: Handle the Result instead of unwrapping it

std::fs::read_to_string returns a Result<T, E>. A quick way to get to the inner type T is to call the .unwrap() method on the Result<T, E>. The cost is that the program panics if the Err variant occurs, and the error cannot be propagated. It should only be used when the error does not need to be propagated and would result in a panic anyway. It's often used as a quick fix before implementing proper error handling.

✅ Check the documentation for the exact type std::fs::read_to_string returns.

✅ Handle the Result using match to get to the inner type. Link the two possible patterns, Ok(some_string) and Err(e), to an appropriate code block, for example: println!("File opened and read") and println!("Problem opening the file: {:?}", e).

✅ Fix the path of the file so that the program no longer prints an error.

Solution
fn main() {
    let read_result = std::fs::read_to_string("src/data/content.txt");

    match read_result {
        Ok(_str) => println!("File opened and read"),
        Err(e) => panic!("Problem opening and reading the file: {:?}", e),
    };
}

TIP: IDEs often provide a "quick fix" to roll out all match arms quickly

Step 2: Returning a Result from main

✅ Add Result<(), std::io::Error> as the return type of fn main() and Ok(()) as the last line of fn main().

✅ Delete the existing match block and add a ? after the call to std::fs::read_to_string(...).

✅ Print something after the std::fs::read_to_string but before the Ok(()) so you can see that your program did run. Try changing the file path back to the wrong value to see what happens if there is an error.

Solution
fn main() -> Result<(), std::io::Error> {
    let _file_contents = std::fs::read_to_string("src/data/content.txt")?;
    println!("File opened and read");
    Ok(())
}
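
The ? operator is not limited to fn main(). It works in any function whose return type can carry the error. The following sketch uses number parsing instead of file access so that it runs without any files; parse_and_double is a hypothetical helper.

```rust
use std::num::ParseIntError;

// Hypothetical helper: on a parse failure, `?` returns the error
// to the caller immediately; otherwise it yields the parsed number.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let number = input.parse::<i32>()?;
    Ok(number * 2)
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    println!("{:?}", parse_and_double("abc"));
}
```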

Step 3: Count the number of lines

✅ Take a look at the documentation of str::lines. It returns a struct Lines which is an iterator.

✅ Add a block like for line in my_contents.lines() { }

✅ Declare a mutable integer, initialized to zero. Increment that integer inside the for loop.

✅ Print the number of lines the file contains.

Solution
fn main() -> Result<(), std::io::Error> {
    let file_contents = std::fs::read_to_string("src/data/content.txt")?;
    println!("File opened and read");

    let mut number = 0;

    for _line in file_contents.lines() {
        number += 1;
    }

    println!("{}", number);

    Ok(())
}
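
Because Lines is an iterator, the manual counter can also be replaced by the iterator method .count(). This is an optional alternative, sketched here on a string literal rather than on the file contents; count_lines is a hypothetical helper.

```rust
// Hypothetical helper: counts lines without a mutable counter.
fn count_lines(contents: &str) -> usize {
    // `.count()` consumes the iterator and returns the number of items.
    contents.lines().count()
}

fn main() {
    let contents = "first\nsecond\n\nfourth";
    println!("{}", count_lines(contents));
}
```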

Step 4: Filter out empty lines

✅ Filter out the empty lines, and only print the others. The is_empty method can help you here.

Solution
fn main() -> Result<(), std::io::Error> {
    let file_contents = std::fs::read_to_string("src/data/content.txt")?;
    println!("File opened and read");

    for line in file_contents.lines() {
        if !line.is_empty() {
            println!("{}", line)
        }
    }

    Ok(())
}
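
The same filtering can be expressed with the iterator method .filter() instead of an if inside the loop. The following is a sketch on a string literal; non_empty_lines is a hypothetical helper, and collecting into a Vec is only done here to make the result easy to inspect.

```rust
// Hypothetical helper: keeps only the lines that are not empty.
fn non_empty_lines(contents: &str) -> Vec<&str> {
    contents
        .lines()
        .filter(|line| !line.is_empty())
        .collect()
}

fn main() {
    for line in non_empty_lines("first\n\nthird\n") {
        println!("{}", line);
    }
}
```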

Step 5: Check if a string is a URL, and return with Option<T>

✅ Write a function that takes (input: &str), parses each line and returns Option<url::Url> (using url::Url). Search the docs for a method for this!

✅ If a line can be parsed successfully, return Some(url), and return None otherwise.

✅ In the main function, use your new function to only print valid URLs.

✅ Test the fn parse_url().

Solution
fn parse_url(line: &str) -> Option<url::Url> {
    match url::Url::parse(line) {
        Ok(u) => Some(u),
        Err(_e) => None,
    }
}

fn main() -> Result<(), std::io::Error> {
    let file_contents = std::fs::read_to_string("src/data/content.txt")?;
    println!("File opened and read");

    for line in file_contents.lines() {
        match parse_url(line) {
            Some(url) => {
                println!("Is a URL: {}", url);
            }
            None => {
                println!("Not a URL");
            }
        }
    }

    Ok(())
}

#[test]
fn correct_url() {
    assert!(parse_url("https://example.com").is_some())
}

#[test]
fn no_url() {
    assert!(parse_url("abcdf").is_none())
}
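
The match in parse_url turns a Result into an Option and discards the error. The standard library offers the .ok() method on Result for exactly this conversion. To keep the sketch free of external crates, it parses numbers instead of URLs; parse_number is a hypothetical helper. With the url crate, the body of parse_url could be written the same way as url::Url::parse(line).ok().

```rust
// Hypothetical helper: `.ok()` maps Ok(v) to Some(v) and Err(_) to None.
fn parse_number(line: &str) -> Option<u32> {
    line.parse::<u32>().ok()
}

fn main() {
    println!("{:?}", parse_number("42"));
    println!("{:?}", parse_number("abcdf"));
}
```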

Help

Typing variables

Variables can be typed by using : and a type.

#![allow(unused)]
fn main() {
let my_value: String = String::from("test");
}

SimpleDB Exercise

In this exercise, we will implement a toy parser for a simple database query protocol. We call it simpleDB. The protocol has two commands, one of which can be sent with a payload of additional data. Your parser parses the incoming data strings, makes sure the commands are formatted correctly, and returns errors for the different ways the formatting can go wrong.

After completing this exercise you will be able to

  • write a simple Rust library from scratch

  • interact with borrowed and owned memory, especially how to take ownership

  • handle complex cases using the match and if let syntax

  • create a safe protocol parser in Rust manually

Prerequisites

  • basic pattern matching with match

  • control flow with if/else

  • familiarity with Result<T, E>, Option<T>

Tasks

  1. Create a library project called simple-db.
  2. Implement appropriate data structures for Command and Error.
  3. Read the documentation for str, especially split_once() and splitn(). Pay attention to their return type. Use the result value of split_once() and splitn() to guide your logic. The Step-by-Step-Solution contains a proposal.
  4. Implement the following function so that it implements the protocol specifications to parse the messages. Use the provided tests to help you with the case handling.
pub fn parse(input: &str) -> Result<Command, Error> {
    todo!()
}

The Step-by-Step-Solution contains steps 4a-e that explain a possible way to handle the cases in detail.

Optional Tasks:

  • Run clippy on your codebase.
  • Run rustfmt on your codebase.

If you need it, we have provided solutions for every step for this exercise.

Protocol Specification

The protocol has two commands that are sent as messages in the following form:

  • PUBLISH <payload>\n

  • RETRIEVE\n

With the additional properties:

  1. The payload cannot contain newlines.

  2. A missing newline at the end of the command is an error.

  3. Data after the first newline is an error.

  4. Empty payloads are allowed. In this case, the command is PUBLISH \n.

Violations against the form of the messages and the properties are handled with the following error codes:

  • TrailingData (bytes found after newline)

  • IncompleteMessage (no newline)

  • EmptyMessage (empty string instead of a command)

  • UnknownCommand (string is not empty, but neither PUBLISH nor RETRIEVE)

  • UnexpectedPayload (message contains a payload, when it should not)

  • MissingPayload (message is missing a payload)

Testing

Below are the tests your protocol parser needs to pass. You can copy them to the bottom of your lib.rs.

#[cfg(test)]
mod tests {
    use super::*;

    // Tests placement of \n
    #[test]
    fn test_missing_nl() {
        let line = "RETRIEVE";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }
    #[test]
    fn test_trailing_data() {
        let line = "PUBLISH The message\n is wrong \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::TrailingData);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_string() {
        let line = "";
        let result = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }

    // Tests for empty messages and unknown commands

    #[test]
    fn test_only_nl() {
        let line = "\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::EmptyMessage);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_unknown_command() {
        let line = "SERVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnknownCommand);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of RETRIEVE command

    #[test]
    fn test_retrieve_w_whitespace() {
        let line = "RETRIEVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve_payload() {
        let line = "RETRIEVE this has a payload\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve() {
        let line = "RETRIEVE\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Retrieve);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of PUBLISH command

    #[test]
    fn test_publish() {
        let line = "PUBLISH TestMessage\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("TestMessage".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_publish() {
        let line = "PUBLISH \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_missing_payload() {
        let line = "PUBLISH\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::MissingPayload);
        assert_eq!(result, expected);
    }
}

Knowledge

This section explains concepts necessary to solve the simpleDB exercise.

In general, we also recommend using the Rust documentation to look up things you are missing; this also helps you become familiar with it. If you ever feel completely stuck or that you haven't understood something, please hail the trainers quickly.

Derives

#[derive(PartialEq, Eq)]

This enables comparison between two instances of the type, by comparing every field/variant. This enables the assert_eq! macro, which relies on equality being defined. Eq for total equality isn't strictly necessary for this example, but it is good practice to derive it if it applies.

#[derive(Debug)]

This enables automatic debug output for the type. The assert_eq! macro requires this for testing.
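
As a small sketch of what the derives provide, consider the enum below. Token is a hypothetical type made up for this illustration and is not part of the exercise.

```rust
// Hypothetical enum, used only to demonstrate the derives.
#[derive(Debug, PartialEq, Eq)]
enum Token {
    Word(String),
    End,
}

fn main() {
    let a = Token::Word(String::from("hello"));
    let b = Token::Word(String::from("hello"));

    // PartialEq makes `==` available, which assert_eq! uses.
    // If the assertion fails, Debug is used to print both values.
    assert_eq!(a, b);
    assert_ne!(a, Token::End);

    // Debug also enables the `{:?}` placeholder.
    println!("{:?}", a);
}
```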

Control flow and pattern matching, returning values

This exercise involves handling a number of cases. You are already familiar with if/else and a basic form of match. Here, we'll introduce you to if let.

    if let Some(payload) = substrings.next() {
        // executes only if `substrings.next()` returned `Some`; the inner value is bound to `payload`
    }

When to use what?

if let is like a pattern-matching match block with only one arm. So, if your match only has one arm of interest, consider an if let instead.

match can be used to handle more fine-grained and complex pattern matching, especially when there are several, equally ranked possibilities. The match arms may have to include a catch-all _ => arm, for every possible case that is not explicitly spelled out. The order of the match arms matters: the catch-all arm needs to be last, otherwise it catches everything and the arms after it are never reached.
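
To make the difference concrete, here is a short sketch. The functions payload_len and describe are made up for this example and are not part of the exercise.

```rust
/// Returns the length of the payload, or 0 when there is none.
/// Only one pattern is of interest, so `if let` is enough.
fn payload_len(payload: Option<&str>) -> usize {
    if let Some(text) = payload {
        text.len()
    } else {
        0
    }
}

/// Describes a keyword. There are several equally ranked cases, so `match` reads better.
/// The catch-all `_` arm comes last; placed first, it would match every input.
fn describe(keyword: &str) -> &'static str {
    match keyword {
        "PUBLISH" => "stores a message",
        "RETRIEVE" => "fetches a message",
        "" => "empty input",
        _ => "unknown keyword",
    }
}

fn main() {
    println!("{}", payload_len(Some("hello")));
    println!("{}", describe("RETRIEVE"));
}
```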

Returning Values from branches and match arms

All match arms must produce a value of the same type (or diverge, for example with a return statement).
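
A minimal sketch of this rule, using a hypothetical parse_port function: the Ok arm produces a u16, while the Err arm never produces a value because it leaves the function early.

```rust
/// Parses a port number, returning an error message if the text is not a valid `u16`.
fn parse_port(text: &str) -> Result<u16, String> {
    let port: u16 = match text.trim().parse() {
        // This arm produces the `u16` that is bound to `port`.
        Ok(value) => value,
        // This arm diverges: `return` leaves the function, so no `u16` is needed here.
        Err(_) => return Err(format!("{text:?} is not a port number")),
    };
    Ok(port)
}

fn main() {
    println!("{:?}", parse_port("7878"));
    println!("{:?}", parse_port("not a port"));
}
```

The same shape appears in the parse function of this exercise, where the error arms return early and the remaining arm yields the message.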

Step-by-Step Solution

Step 1: Creating a library project with cargo

Create a new Cargo project, check the build and the test setup:

Solution
cargo new --lib simple-db
cd simple-db
cargo build
cargo test

Step 2: Appropriate data structures

Define two enums, one is called Command and one is called Error. Command has 2 variants for the two possible commands. Publish carries data (the message), Retrieve does not. Error is just a list of error kinds. Use #[derive(Eq,PartialEq,Debug)] for both enums.

Solution
#[derive(Eq, PartialEq, Debug)]
pub enum Command {
    Publish(String),
    Retrieve,
}

#[derive(Eq, PartialEq, Debug)]
pub enum Error {
    TrailingData,
    IncompleteMessage,
    EmptyMessage,
    UnknownCommand,
    UnexpectedPayload,
    MissingPayload,
}

// Tests go here!

Step 3: Read the documentation for str, especially splitn(), split_once() to build your logic

tl;dr

  • split_once() splits a str into 2 parts at the first occurrence of a delimiter.
  • splitn() splits a str into at most n substrings at occurrences of a delimiter; the last substring contains the rest of the str.

The proposed logic

Split the input with split_once() using \n as the delimiter. This lets you distinguish 3 cases:

  • a command where \n is the last part, and the second substring is "" -> some kind of command
  • a command with trailing data (i.e. data after a newline) -> Error::TrailingData
  • a command with no \n -> Error::IncompleteMessage

After that, split the input with splitn() using ' ' as delimiter and 2 as the max number of substrings. The method returns an iterator over the substrings, and the iterator produces Some(...), or None when there are no substrings left. Note that even an empty str "" is a substring.
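
If you want to see how these two methods behave before building the parser, the following sketch can be run on its own. The helper keyword_and_payload is a made-up name for this example; the assertions show what the standard library methods return for a few inputs.

```rust
/// Splits a message into its keyword and an optional payload.
fn keyword_and_payload(message: &str) -> (&str, Option<&str>) {
    let mut parts = message.splitn(2, ' ');
    // `splitn` always yields at least one substring, so this cannot fail.
    let keyword = parts.next().unwrap();
    (keyword, parts.next())
}

fn main() {
    // `split_once` yields `Some((before, after))` at the first delimiter, else `None`.
    assert_eq!("RETRIEVE\n".split_once('\n'), Some(("RETRIEVE", "")));
    assert_eq!("a\nb\n".split_once('\n'), Some(("a", "b\n")));
    assert_eq!("RETRIEVE".split_once('\n'), None);

    // `splitn(2, ' ')` yields at most two substrings; the second keeps any further spaces.
    assert_eq!(
        keyword_and_payload("PUBLISH hello world"),
        ("PUBLISH", Some("hello world"))
    );
    // A trailing space produces an empty second substring, not `None`.
    assert_eq!(keyword_and_payload("PUBLISH "), ("PUBLISH", Some("")));
    // Without a space there is no second substring.
    assert_eq!(keyword_and_payload("RETRIEVE"), ("RETRIEVE", None));
    // Even an empty input yields one (empty) substring.
    assert_eq!(keyword_and_payload(""), ("", None));
}
```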

From here, the actual command cases need to be distinguished with pattern matching:

  • RETRIEVE has no whitespace and no payload
  • PUBLISH <payload> always has whitespace and an optional payload

Step 4: Implement fn parse()

Step 4a: Sorting out wrongly placed and absent newlines

A missing \n, a wrongly placed \n and more than one \n are errors that occur independently of other errors, so it makes sense to handle these cases first. Split the incoming message at the first \n using split_once(). This operation yields Some((&str, &str)) if at least one \n is present, and None if there is none. If the \n is not the last item in the message, the second &str in Some((&str, &str)) is not "".

Tip: Introduce a generic variant Command::Command that temporarily stands for a valid command.

Handle the cases with match, checking whether the second part is "". Return Err(Error::TrailingData) for a wrongly placed \n, Err(Error::IncompleteMessage) for an absent \n, and Ok(Command::Command) if the \n is placed correctly.

Solution
pub fn parse(input: &str) -> Result<Command, Error> {
    match input.split_once('\n') {
        Some((_message, "")) => Ok(Command::Command),
        Some(_) => return Err(Error::TrailingData),
        None => return Err(Error::IncompleteMessage),
    }
}

Step 4b: if let: sorting Some() from None

In 4a, we produce an Ok(Command::Command) if the newlines all check out. Instead of doing that, we want to capture the message - that is the input, without the newline on the end, and we know it has no newlines within it.

Use .splitn() to split the message into 2 parts maximum, use a space as delimiter (' '). This method yields an iterator over the substrings.

Use .next() to access the first substring, which is the command keyword. You will always get Some(value) - the splitn method never returns None the first time around. We can unwrap this first value because splitn always returns at least one string - but add yourself a comment to remind yourself why this unwrap() is never going to fail!

Solution
pub fn parse(input: &str) -> Result<Command, Error> {
    let message = match input.split_once('\n') {
        Some((message, "")) => message,
        Some(_) => return Err(Error::TrailingData),
        None => return Err(Error::IncompleteMessage),
    };

    let mut substrings = message.splitn(2, ' ');

    let _command = substrings.next().unwrap();

    Ok(Command::Command)
}

Step 4c: Pattern matching for the command keywords

Remove the Ok(Command::Command) and the enum variant. Use match to pattern match the command instead. Next, implement two necessary match arms: "" for empty messages, _ for any other string, currently evaluated to be an unknown command.

Solution
pub fn parse(input: &str) -> Result<Command, Error> {
    let message = match input.split_once('\n') {
        Some((message, "")) => message,
        Some(_) => return Err(Error::TrailingData),
        None => return Err(Error::IncompleteMessage),
    };

    let mut substrings = message.splitn(2, ' ');

    // Note: `splitn` *always* returns at least one value
    let command = substrings.next().unwrap();
    match command {
        "" => Err(Error::EmptyMessage),
        _ => Err(Error::UnknownCommand),
    }
}

Step 4d: Add Retrieve Case

Add a match arm to check if the command substring is equal to "RETRIEVE". It's not enough to return Ok(Command::Retrieve) just yet. The Retrieve command cannot have a payload, and this includes whitespace! To check for this, add an if/else expression that checks whether the next iteration over the substrings returns None. If it does, return Ok(Command::Retrieve); if it does not, return Err(Error::UnexpectedPayload).

Solution
pub fn parse(input: &str) -> Result<Command, Error> {
    let message = match input.split_once('\n') {
        Some((message, "")) => message,
        Some(_) => return Err(Error::TrailingData),
        None => return Err(Error::IncompleteMessage),
    };

    let mut substrings = message.splitn(2, ' ');

    // Note: `splitn` *always* returns at least one value
    let command = substrings.next().unwrap();
    match command {
        "RETRIEVE" => {
            if substrings.next().is_none() {
                Ok(Command::Retrieve)
            } else {
                Err(Error::UnexpectedPayload)
            }
        }
        "" => Err(Error::EmptyMessage),
        _ => Err(Error::UnknownCommand),
    }
}

Step 4e: Add Publish Case and finish

Add a match arm to check if the command substring is equal to "PUBLISH". Just like with the Retrieve command, we need to add a distinction, but the other way round: Publish needs a payload or whitespace for an empty payload to be valid.

Use if let to check if the next iteration into the substrings returns Some(). If it does, return Ok(Command::Publish(payload)), where payload is an owned version (a String) of the payload. Otherwise return Err(Error::MissingPayload).

Solution
pub fn parse(input: &str) -> Result<Command, Error> {
    let message = match input.split_once('\n') {
        Some((message, "")) => message,
        Some(_) => return Err(Error::TrailingData),
        None => return Err(Error::IncompleteMessage),
    };

    let mut substrings = message.splitn(2, ' ');

    // Note: `splitn` *always* returns at least one value
    let command = substrings.next().unwrap();
    match command {
        "RETRIEVE" => {
            if substrings.next().is_none() {
                Ok(Command::Retrieve)
            } else {
                Err(Error::UnexpectedPayload)
            }
        }
        "PUBLISH" => {
            if let Some(payload) = substrings.next() {
                Ok(Command::Publish(String::from(payload)))
            } else {
                Err(Error::MissingPayload)
            }
        }
        "" => Err(Error::EmptyMessage),
        _ => Err(Error::UnknownCommand),
    }
}

Full source code

If all else fails, feel free to copy this solution to play around with it.

Solution
#![allow(unused)]
fn main() {
#[derive(Eq, PartialEq, Debug)]
pub enum Command {
    Publish(String),
    Retrieve,
}

#[derive(Eq, PartialEq, Debug)]
pub enum Error {
    TrailingData,
    IncompleteMessage,
    EmptyMessage,
    UnknownCommand,
    UnexpectedPayload,
    MissingPayload,
}

pub fn parse(input: &str) -> Result<Command, Error> {
    let message = match input.split_once('\n') {
        Some((message, "")) => message,
        Some(_) => return Err(Error::TrailingData),
        None => return Err(Error::IncompleteMessage),
    };

    let mut substrings = message.splitn(2, ' ');

    // Note: `splitn` *always* returns at least one value
    let command = substrings.next().unwrap();
    match command {
        "RETRIEVE" => {
            if substrings.next().is_none() {
                Ok(Command::Retrieve)
            } else {
                Err(Error::UnexpectedPayload)
            }
        }
        "PUBLISH" => {
            if let Some(payload) = substrings.next() {
                Ok(Command::Publish(String::from(payload)))
            } else {
                Err(Error::MissingPayload)
            }
        }
        "" => Err(Error::EmptyMessage),
        _ => Err(Error::UnknownCommand),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Tests placement of \n
    #[test]
    fn test_missing_nl() {
        let line = "RETRIEVE";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }
    #[test]
    fn test_trailing_data() {
        let line = "PUBLISH The message\n is wrong \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::TrailingData);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_string() {
        let line = "";
        let result = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }

    // Tests for empty messages and unknown commands

    #[test]
    fn test_only_nl() {
        let line = "\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::EmptyMessage);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_unknown_command() {
        let line = "SERVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnknownCommand);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of RETRIEVE command

    #[test]
    fn test_retrieve_w_whitespace() {
        let line = "RETRIEVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve_payload() {
        let line = "RETRIEVE this has a payload\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve() {
        let line = "RETRIEVE\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Retrieve);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of PUBLISH command

    #[test]
    fn test_publish() {
        let line = "PUBLISH TestMessage\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("TestMessage".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_publish() {
        let line = "PUBLISH \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_missing_payload() {
        let line = "PUBLISH\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::MissingPayload);
        assert_eq!(result, expected);
    }
}
}

Green and Yellow Game

In this assignment we will implement the game "Green and Yellow". It's like Wordle, but with numerical digits instead of letters. But for legal reasons it's also entirely unlike Wordle, nor remotely similar to the 1970s board-game "Mastermind".

After completing this exercise you are able to

  • Work with rust slices and vectors
  • Accept input from stdin
  • Iterate through arrays and slices
  • Generate random numbers

Prerequisites

For completing this exercise you need to have:

  • basic Rust programming skills
  • the Rust Syntax Cheat Sheet

Task

  1. Create a new binary crate called green-yellow
  2. Copy all the test cases into your main.rs
  3. Define a function fn calc_green_and_yellow(guess: &[u8; 4], secret: &[u8; 4]) -> String that implements the following rules:
    • Return a string containing four Unicode characters
    • For every item in guess, if guess[i] == secret[i], then position i in the output String should be a green block (🟩)
    • Then, for every item in guess, if guess[i] is in secret somewhere, and hasn't already been matched, then position i in the output String should be a yellow block (🟨)
    • If any of the guesses do not appear in the secret, then that position in the output String should be a grey block (⬜)
  4. Ensure all the test cases pass!
  5. Write a main function that implements the following:
    • Generate 4 random digits - our 'secret'
    • Go into a loop
    • Read a string from Standard In and trim the whitespace off it
    • Parse that string into a guess, containing four digits (give an error if the user makes a mistake)
    • Run the calculation routine above and print the coloured blocks
    • Exit if all the blocks are green
  6. Play the game

If you need it, we have provided a complete solution for this exercise.

Your test cases are:

#![allow(unused)]
fn main() {
#[test]
fn all_wrong() {
    assert_eq!(
        &calc_green_and_yellow(&[5, 6, 7, 8], &[1, 2, 3, 4]),
        "⬜⬜⬜⬜"
    );
}

#[test]
fn all_green() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 2, 3, 4], &[1, 2, 3, 4]),
        "🟩🟩🟩🟩"
    );
}

#[test]
fn one_wrong() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 2, 3, 5], &[1, 2, 3, 4]),
        "🟩🟩🟩⬜"
    );
}

#[test]
fn all_yellow() {
    assert_eq!(
        &calc_green_and_yellow(&[4, 3, 2, 1], &[1, 2, 3, 4]),
        "🟨🟨🟨🟨"
    );
}

#[test]
fn one_wrong_but_duplicate() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 2, 3, 1], &[1, 2, 3, 4]),
        "🟩🟩🟩⬜"
    );
}

#[test]
fn one_right_others_duplicate() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 1, 1, 1], &[1, 2, 3, 4]),
        "🟩⬜⬜⬜"
    );
}

#[test]
fn two_right_two_swapped() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 2, 2, 2], &[2, 2, 2, 1]),
        "🟨🟩🟩🟨"
    );
}

#[test]
fn two_wrong_two_swapped() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 3, 3, 2], &[2, 2, 2, 1]),
        "🟨⬜⬜🟨"
    );
}

#[test]
fn a_bit_of_everything() {
    assert_eq!(
        &calc_green_and_yellow(&[1, 9, 4, 3], &[1, 2, 3, 4]),
        "🟩⬜🟨🟨"
    );
}
}

Knowledge

Generating Random Numbers

There are no random number generators in the standard library - you have to use the rand crate.

You will need to change Cargo.toml to depend on the rand crate - we suggest version 0.8.

You need a random number generator (call rand::thread_rng()), and using that you can generate a number out of a given range with gen_range. See https://docs.rs/rand for more details.

Reading from the Console

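You need to grab a standard input handle with std::io::stdin(). It has a read_line(&mut some_string) method, which appends one line of text (including the trailing newline) to your some_string: String variable. The handle also implements the std::io::Read trait, but its read_to_string() method keeps reading until end-of-file rather than stopping after one line.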
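
A minimal sketch of reading one line follows. The helper read_trimmed_line is a made-up name, and it is written against the std::io::BufRead trait (an assumption made for this example) so that it can also be tried with in-memory input instead of a terminal.

```rust
use std::io::BufRead;

/// Reads one line from `input` and returns it without surrounding whitespace.
/// Returns `Ok(None)` when the input has reached end-of-file.
fn read_trimmed_line(input: &mut impl BufRead) -> std::io::Result<Option<String>> {
    let mut line = String::new();
    // `read_line` appends to the buffer and returns the number of bytes read.
    let bytes_read = input.read_line(&mut line)?;
    if bytes_read == 0 {
        return Ok(None);
    }
    Ok(Some(line.trim().to_string()))
}

fn main() -> std::io::Result<()> {
    // Locking standard input gives a handle that implements `BufRead`.
    let mut stdin = std::io::stdin().lock();
    if let Some(line) = read_trimmed_line(&mut stdin)? {
        println!("You typed: {line:?}");
    }
    Ok(())
}
```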

Parsing Strings into Integers

Strings have a parse() method, which returns a Result, because of course the user may not have typed in a proper digit. The parse() function works out what you are trying to create based on context - so if you want a u8, try let x: u8 = my_str.parse().unwrap(). Or you can say let x = my_str.parse::<u8>().unwrap(). Of course, try and do something better than unwrap!
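
One way to do better than unwrap is to match on the Result and turn each failure into a message. The sketch below uses a hypothetical parse_digit function; the 1 to 9 range is taken from the solution further down.

```rust
/// Parses a single digit between 1 and 9, returning a message on failure.
fn parse_digit(text: &str) -> Result<u8, String> {
    match text.parse::<u8>() {
        Ok(digit) if (1..=9).contains(&digit) => Ok(digit),
        Ok(digit) => Err(format!("{digit} is out of range")),
        Err(error) => Err(format!("{text:?} is not a number: {error}")),
    }
}

fn main() {
    println!("{:?}", parse_digit("7"));
    println!("{:?}", parse_digit("12"));
    println!("{:?}", parse_digit("seven"));
}
```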

Step-by-Step-Solution

In general, we also recommend using the Rust documentation to look up anything you are missing; this also helps you become familiar with it. If you ever feel completely stuck, or feel that you haven't understood something, please hail the trainers quickly.

Step 1: New Project

Create a new binary Cargo project, check the build and see if it runs.

Solution
cargo new green-yellow
cd green-yellow
cargo run

Step 2: Generate some squares

Get calc_green_and_yellow to just generate grey blocks. We put them in an array first, as that's easier to index than a string.

Call the function from main() to avoid the warning about it being unused.

Solution
fn calc_green_and_yellow(_guess: &[u8; 4], _secret: &[u8; 4]) -> String {
    let result = ["⬜"; 4];

    result.join("")
}

Step 3: Check for green squares

You need to go through every pair of items in the input arrays and check if they are the same. If so, set the output square to be green.

Solution
fn calc_green_and_yellow(guess: &[u8; 4], secret: &[u8; 4]) -> String {
    let mut result = ["⬜"; 4];

    for i in 0..guess.len() {
        if guess[i] == secret[i] {
            result[i] = "🟩";
        }
    }

    result.join("")
}

Step 4: Check for yellow squares

This gets a little more tricky.

We need to loop through every item in the guess array and compare it to every item in the secret array. But! We must make sure we ignore any values we already 'used up' when we produced the green squares.

Let's do this by copying the input, so we can make it mutable, and mark off any values used in the green-square-loop by setting them to zero.

Solution
fn calc_green_and_yellow(guess: &[u8; 4], secret: &[u8; 4]) -> String {
    let mut result = ["⬜"; 4];
    let mut guess = *guess;
    let mut secret = *secret;

    for i in 0..guess.len() {
        if guess[i] == secret[i] {
            result[i] = "🟩";
            secret[i] = 0;
            guess[i] = 0;
        }
    }

    for i in 0..guess.len() {
        for j in 0..secret.len() {
            if guess[i] == secret[j] && secret[j] != 0 && guess[i] != 0 {
                result[i] = "🟨";
                // Mark this secret digit as used, so it cannot match a second guess digit
                secret[j] = 0;
                break;
            }
        }
    }

    result.join("")
}

Step 5: Get some random numbers

Add rand = "0.8" to the [dependencies] section of your Cargo.toml, and make a random number generator (RNG) with rand::thread_rng(). You will also have to add use rand::Rng; to bring the trait into scope.

Call your_rng.gen_range() in a loop.

Solution
use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    let mut secret = [0u8; 4];
    for digit in secret.iter_mut() {
        *digit = rng.gen_range(1..=9);
    }
    println!("{:?}", secret);

    println!("{}", calc_green_and_yellow(&[1, 2, 3, 4], &secret));
}

Step 6: Make the game loop

We need a loop to handle each guess the user makes.

For each guess we need to read from Standard Input (using std::io::stdin() and its read_line() method).

You will need to trim and then split the input, then parse each piece into a digit.

  • If the digit doesn't parse, continue the loop.
  • If the digit parses but is out of range, continue the loop.
  • If you get the wrong number of digits, continue the loop.
  • If the guess matches the secret, then break out of the loop and congratulate the winner.
  • Otherwise run the guess through our calculation function and print the squares.

Solution
    let stdin = std::io::stdin();
    loop {
        let mut line = String::new();
        println!("Enter guess:");
        stdin.read_line(&mut line).unwrap();
        let mut guess = [0u8; 4];
        let mut idx = 0;
        for piece in line.trim().split(' ') {
            let Ok(digit) = piece.parse::<u8>() else {
                println!("{:?} wasn't a number", piece);
                continue;
            };
            if digit < 1 || digit > 9 {
                println!("{} is out of range", digit);
                continue;
            }
            if idx >= guess.len() {
                println!("Too many numbers, I only want {}", guess.len());
                continue;
            }
            guess[idx] = digit;
            idx += 1;
        }
        if idx < guess.len() {
            println!("Not enough numbers, I want {}", guess.len());
            continue;
        }
        println!("Your guess is {:?}", guess);
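        if guess == secret {
            println!("Well done!!");
            break;
        }
        // Not solved yet: show the green, yellow and grey squares for this guess
        println!("{}", calc_green_and_yellow(&guess, &secret));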
    }
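
The parsing rules from Step 6 could also be moved into a function of their own, which makes them testable without typing into a terminal. This is only a sketch: parse_guess is a made-up name, and it uses split_whitespace() rather than split(' '), so repeated spaces are tolerated.

```rust
/// Turns a line such as "1 2 3 4" into a guess of exactly four digits (1 to 9).
fn parse_guess(line: &str) -> Result<[u8; 4], String> {
    let mut guess = [0u8; 4];
    let mut count = 0;
    for piece in line.split_whitespace() {
        let digit: u8 = piece
            .parse()
            .map_err(|_| format!("{piece:?} wasn't a number"))?;
        if !(1..=9).contains(&digit) {
            return Err(format!("{digit} is out of range"));
        }
        if count >= guess.len() {
            return Err(format!("Too many numbers, I only want {}", guess.len()));
        }
        guess[count] = digit;
        count += 1;
    }
    if count < guess.len() {
        return Err(format!("Not enough numbers, I want {}", guess.len()));
    }
    Ok(guess)
}

fn main() {
    println!("{:?}", parse_guess("1 2 3 4"));
    println!("{:?}", parse_guess("1 2 x 4"));
}
```

In the game loop you could then match on the result, print the message for an Err and continue with the next iteration of the loop.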

Shapes

In this exercise we're going to define methods for a struct, define and implement a trait, and look into how to make these generic.

Learning Goals

You will learn how to:

  • implement methods for a struct
  • decide when to use Self, self, &self and &mut self in methods
  • define a trait with required methods
  • make a type generic over T
  • constrain T

Tasks

Part 1: Defining Methods for Types

You can find a complete solution

  1. Make a new library project called shapes

  2. Make two structs, Circle with field radius and Square with field side to use as types. Decide on appropriate types for radius and side.

  3. Make an impl block and implement the following methods for each type. Consider when to use self, &self, &mut self and Self.

    • fn new(...) -> ...

      • creates an instance of the shape with a certain size (radius or side length).
    • fn area(...) -> ...

      • calculates the area of the shape.
    • fn scale(...)

      • changes the size of an instance of the shape.
    • fn destroy(...) -> ...

      • destroys the instance of a shape and returns the value of its field.

Part 2: Defining and Implementing a Trait

You can find a complete solution

  1. Define a Trait HasArea with a mandatory method: fn area(&self) -> f32.
  2. Implement HasArea for Square and Circle. You can defer to the existing method but may need to cast the return type.
  3. Abstract over Circle and Square by defining an enum Shape that contains both as variants.
  4. Implement HasArea for Shape.

Part 3: Making Square generic over T

You can find a complete solution

We want to make Square and Circle generic over T, so we can use other numeric types and not just u32 and f32.

  1. Add the generic type parameter <T> to Square. You can temporarily remove enum Shape to make this easier.

  2. Import the num crate, version 0.4.0, in order to be able to use the num::Num trait as a bound for the generic type <T>. This ensures that whatever type is used for T is a numeric type, and it also makes some guarantees about the operations that can be performed.

  3. Add a where clause on the methods of Square, as required, e.g.:

    where T: num::Num 
  4. Depending on the operations performed in that function, you may need to add further trait bounds, such as Copy and std::ops::MulAssign. You can add them to the where clause with a + sign, like T: num::Num + Copy.

  5. Add the generic type parameter <T> to Circle and then appropriate where clauses.

  6. Re-introduce Shape but with the generic type parameter <T>, and then add appropriate where clauses.

Help

This section gives partial solutions to look at or refer to.

In general, we also recommend to use the Rust documentation to figure out things you are missing to familiarize yourself with it. If you ever feel completely stuck or that you haven’t understood something, please hail the trainers quickly.

Getting Started

Create a new library Cargo project, then check that it builds and that the tests run (a library has no binary, so cargo run does not apply here):

$ cargo new --lib shapes
$ cd shapes
$ cargo test

Creating a Type

Each of your shape types (Square, Circle, etc.) will need some fields (or properties) to identify its geometry. Use /// to add documentation to each field.

/// Describes a human individual
struct Person {
    /// How old this person is
    age: u8
}

Functions that take arguments: self, &self, &mut self

Does your function need to take ownership of the shape in order to calculate its area? Or is it sufficient to merely take a read-only look at the shape for a short period of time?

You can pass arguments by reference in Rust by making your function take x: &MyShape, and passing them with &my_shape.

You can also associate your function with a specific type by placing it inside a block like impl MyShape { ... }

impl Pentagon {
    fn area(&self) -> u32 {
        // calculate the area of the pentagon here...
        todo!()
    }
}
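
To show the different kinds of receiver side by side, here is a sketch with a hypothetical Rectangle type, chosen so that Square and Circle are still left for you to write.

```rust
/// A hypothetical shape, used only for illustration.
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    /// No `self`: an associated function that builds a new value; `Self` names the type.
    fn new(width: u32, height: u32) -> Self {
        Self { width, height }
    }

    /// `&self`: read-only access for the duration of the call.
    fn area(&self) -> u32 {
        self.width * self.height
    }

    /// `&mut self`: temporary exclusive access, so fields can be changed.
    fn scale(&mut self, factor: u32) {
        self.width *= factor;
        self.height *= factor;
    }

    /// `self`: takes ownership, so the value cannot be used afterwards.
    fn destroy(self) -> (u32, u32) {
        (self.width, self.height)
    }
}

fn main() {
    let mut rect = Rectangle::new(2, 3);
    println!("area: {}", rect.area());
    rect.scale(2);
    let (width, height) = rect.destroy();
    println!("{width} x {height}");
    // `rect` has been moved into `destroy` and is no longer usable here.
}
```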

A Shape of many geometries

You can use an enum to provide a single type that can be any of your supported shapes. If we were working with fruit, we might say:

struct Banana { ... }
struct Apple { ... }

enum Fruit {
    Banana(Banana),
    Apple(Apple),
}

If you wanted to count the pips in a piece of Fruit, you might just call to the num_pips() method on the appropriate constituent fruit. This might look like:

impl Fruit {
    fn num_pips(&self) -> u8 {
        match self {
            Fruit::Apple(apple) => apple.num_pips(),
            Fruit::Banana(banana) => banana.num_pips(),
        }
    }
}

I need a π

The f32 type also has its own module in the standard library called std::f32. If you look at the docs, you will find a defined constant for π: std::f32::consts::PI.

I need a π, of type T

If you want to convert a Pi constant to some type T, you need a where bound like:

where T: num::Num + From<f32>

This restricts T to types that can be created from an f32 (that is, types you can convert an f32 into). You can then call let my_pi: T = my_f32_pi.into(); to convert your f32 value into a T value.
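
Here is a sketch of that conversion in a generic function. To keep it free of external crates, it swaps the num::Num bound for std::ops::Mul and Copy; that substitution, and the function name circle_area, are assumptions made for this example only.

```rust
use std::ops::Mul;

/// Computes a circle's area for any `T` that can be built from an `f32`
/// and multiplied with itself.
fn circle_area<T>(radius: T) -> T
where
    T: From<f32> + Mul<Output = T> + Copy,
{
    // Convert the `f32` constant into whatever `T` the caller picked.
    let pi: T = std::f32::consts::PI.into();
    pi * radius * radius
}

fn main() {
    let small: f32 = circle_area(2.0_f32);
    let large: f64 = circle_area(2.0_f64);
    println!("{small} {large}");
}
```

Note that integer types such as u32 do not implement From<f32>, so this bound limits T to types like f32 and f64.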

Defining a Trait

A trait has a name, and lists function definitions that make guarantees about the name of a method, its arguments and return types.

#![allow(unused)]
fn main() {
pub trait Color {
    fn red() -> u8;
}
}
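
To show a trait being implemented and used, here is a sketch with a hypothetical Describe trait and two made-up types. None of these names come from the exercise.

```rust
/// A hypothetical trait with one required method.
pub trait Describe {
    /// Implementors must provide this method.
    fn describe(&self) -> String;
}

struct Dog;

struct Robot {
    model: u32,
}

impl Describe for Dog {
    fn describe(&self) -> String {
        "a dog".to_string()
    }
}

impl Describe for Robot {
    fn describe(&self) -> String {
        format!("robot model {}", self.model)
    }
}

/// Accepts any type that implements `Describe`.
fn announce(item: &impl Describe) -> String {
    format!("This is {}", item.describe())
}

fn main() {
    println!("{}", announce(&Dog));
    println!("{}", announce(&Robot { model: 7 }));
}
```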

Adding generic Type parameters

#![allow(unused)]
fn main() {
pub struct Square<T> {
    /// The length of one side of the square
    side: T,
}

impl<T> Square<T> {
    // ...
}
}
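
The sketch below shows where clauses on individual methods, using a hypothetical Cube<T> type. It uses bounds from std::ops instead of num::Num so that it compiles without external crates; in the exercise you would add num::Num to the bounds as described in the tasks.

```rust
use std::ops::{Mul, MulAssign};

/// A hypothetical generic type with one field of type `T`.
pub struct Cube<T> {
    edge: T,
}

impl<T> Cube<T> {
    /// No bounds needed: storing a `T` works for any type.
    pub fn new(edge: T) -> Self {
        Cube { edge }
    }

    /// Multiplying needs `Mul`, and reading `self.edge` three times needs `Copy`.
    pub fn volume(&self) -> T
    where
        T: Mul<Output = T> + Copy,
    {
        self.edge * self.edge * self.edge
    }

    /// The `*=` operator needs `MulAssign`.
    pub fn scale(&mut self, factor: T)
    where
        T: MulAssign,
    {
        self.edge *= factor;
    }
}

fn main() {
    let mut cube = Cube::new(2u32);
    cube.scale(3);
    println!("{}", cube.volume());
}
```

Putting the bounds on each method, rather than on the whole impl block, means each method only asks for what it actually uses.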

Connected Mailbox Exercise

In this exercise, we will take our "SimpleDB" protocol parser and turn it into a network-connected data storage service. When a user sends a "PUBLISH" we will push the data into a queue, and when the user sends a "RETRIEVE" we will pop some data off the queue (if any is available). The user will connect via TCP to port 7878.

After completing this exercise you are able to

  • write a Rust binary that uses a Rust library

  • combine two Rust packages into a Cargo Workspace

  • open a TCP port and perform an action when each user connects

  • use I/O traits to read/write from a TCP socket

Prerequisites

  • creating and running binary crates with cargo

  • using match to pattern-match on an enum, capturing any inner values

  • using Rust's Read and Write I/O traits

  • familiarity with TCP socket listening and accepting

Tasks

  1. Create an empty folder called connected-mailbox. Copy in the simple-db project from before and create a new binary crate called tcp-server, and put them both into a Cargo Workspace.

    📂 connected-mailbox
    ┣ 📄 Cargo.toml
    ┃
    ┣ 📂 simple-db
    ┃  ┣ 📄 Cargo.toml
    ┃  ┗ ...
    ┃
    ┗ 📂 tcp-server
       ┣ 📄 Cargo.toml
       ┗ ...
    
  2. Write a basic TCP Server which can listen for TCP connections on 127.0.0.1:7878. For each incoming connection, read all of the input as a string, and send it back to the client.

  3. Change the TCP Server to depend upon the simple-db crate, using a relative path.

  4. Change your TCP Server to use your simple-db crate to parse the input, and provide an appropriate canned response.

  5. Set up a VecDeque and either push or pop from that queue, depending on the command you have received.

At every step, try out your program using a command-line TCP Client: you can either use nc, or netcat, or our supplied tools/tcp-client program.

Optional Tasks:

  • Run cargo clippy on your codebase.
  • Run cargo fmt on your codebase.
  • Wrap your VecDeque into a struct Application with a method that takes a simple_db::Command and returns an Option<String>. Write some tests for it.
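
As a rough sketch of the last optional task: the Command enum below is a local stand-in for simple_db::Command so that the example compiles on its own, and the method name handle and the first-in, first-out order are assumptions.

```rust
use std::collections::VecDeque;

/// Stand-in for `simple_db::Command`, so that this sketch compiles on its own.
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Publish(String),
    Retrieve,
}

/// Owns the message queue.
#[derive(Default)]
pub struct Application {
    queue: VecDeque<String>,
}

impl Application {
    /// Applies a command; returns the oldest message for `Retrieve`, if there is one.
    pub fn handle(&mut self, command: Command) -> Option<String> {
        match command {
            Command::Publish(message) => {
                self.queue.push_back(message);
                None
            }
            Command::Retrieve => self.queue.pop_front(),
        }
    }
}

fn main() {
    let mut app = Application::default();
    app.handle(Command::Publish("hello".to_string()));
    println!("{:?}", app.handle(Command::Retrieve));
}
```

Because the method does not touch the network, you can test it with plain unit tests.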

Help

Connecting over TCP/IP

Using nc, netcat or ncat

The nc, netcat, or ncat tools may be available on your macOS or Linux machine. They all work in a similar fashion.

$ echo "PUBLISH 1234" | nc 127.0.0.1 7878

The echo command adds a new-line character automatically. Use echo -n if you don't want it to add a new-line character.

Using our TCP Client

We have written a basic TCP Client which should work on any platform.

$ cd tools/tcp-client
$ cargo run -- "PUBLISH hello"
$ cargo run -- "RETRIEVE"

It automatically adds a newline character on to the end of every message you send. It is hard-coded to connect to a server at 127.0.0.1:7878.

Writing to a stream

If you want to write to an object that implements std::io::Write, you could use writeln!.

Solution
#![allow(unused)]
fn main() {
use std::io::prelude::*;
use std::net::{TcpStream};

fn handle_client(mut stream: TcpStream) -> Result<(), std::io::Error> {
    let mut buffer = String::new();
    stream.read_to_string(&mut buffer)?;
    println!("Received: {:?}", buffer);
    writeln!(stream, "Thank you for {buffer:?}!")?;
    Ok(())
}
}

Writing a TCP Server

If you need a working example of a basic TCP Echo server, you can start with our template.

Solution
use std::io::prelude::*;
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

const DEFAULT_TIMEOUT: Option<Duration> = Some(Duration::from_millis(1000));

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:7878")?;

    // accept connections and process them one at a time
    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                println!("Got client {:?}", stream.peer_addr());
                if let Err(e) = handle_client(stream) {
                    println!("Error handling client: {:?}", e);
                }
            }
            Err(e) => {
                println!("Error connecting: {:?}", e);
            }
        }
    }
    Ok(())
}

/// Process a single connection from a single client.
///
/// Drops the stream when it has finished.
fn handle_client(mut stream: TcpStream) -> Result<(), std::io::Error> {
    stream.set_read_timeout(DEFAULT_TIMEOUT)?;
    stream.set_write_timeout(DEFAULT_TIMEOUT)?;

    let mut buffer = String::new();
    stream.read_to_string(&mut buffer)?;
    println!("Received: {:?}", buffer);
    writeln!(stream, "Thank you for {buffer:?}!")?;
    Ok(())
}

Making a Workspace

Solution

A workspace file looks like:

[workspace]
resolver = "2"
members = ["simple-db", "tcp-server"]

Each member is a folder containing a Cargo package (i.e. that contains a Cargo.toml file).

Handling Errors

Solution

In a binary program anyhow is a good way to handle top-level errors.

use std::io::Read;

fn handle_client(stream: &mut std::net::TcpStream) -> Result<(), anyhow::Error> {
    let mut buffer = String::new();
    // This returns a `Result<usize, std::io::Error>`, and the `std::io::Error` will auto-convert into an `anyhow::Error`.
    stream.read_to_string(&mut buffer)?;
    // ... etc
    Ok(())
}

You could also write an enum Error which has a variant for std::io::Error and a variant for simple_db::Error, and suitable impl From<...> for Error blocks.
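
A sketch of such an enum follows. ParseError and parse are local stand-ins for simple_db::Error and simple_db::parse (simplified so the example compiles without the crate), and handle_client takes any Read so that it can be tried without a socket.

```rust
use std::io::Read;

/// Stand-in for `simple_db::Error`, so that this sketch compiles on its own.
#[derive(Debug)]
pub enum ParseError {
    IncompleteMessage,
}

/// One variant per kind of failure the server can run into.
#[derive(Debug)]
pub enum Error {
    Io(std::io::Error),
    Parse(ParseError),
}

impl From<std::io::Error> for Error {
    fn from(error: std::io::Error) -> Self {
        Error::Io(error)
    }
}

impl From<ParseError> for Error {
    fn from(error: ParseError) -> Self {
        Error::Parse(error)
    }
}

/// Stand-in for `simple_db::parse`: only checks for a trailing newline.
fn parse(input: &str) -> Result<&str, ParseError> {
    input.strip_suffix('\n').ok_or(ParseError::IncompleteMessage)
}

/// Both `?` operators convert their error into `Error` through the `From` impls.
fn handle_client(stream: &mut impl Read) -> Result<String, Error> {
    let mut buffer = String::new();
    stream.read_to_string(&mut buffer)?;
    let message = parse(&buffer)?;
    Ok(message.to_string())
}

fn main() {
    let mut input = std::io::Cursor::new("RETRIEVE\n");
    println!("{:?}", handle_client(&mut input));
}
```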

When handling a client, you could .unwrap() the function which handles the client, but do you want to quit the server if the client sends a malformed message? Perhaps you should catch the result with a match, and print an error to the console before moving on to the next client.

Solution

If you need it, we have provided a complete solution for this exercise.

Multi-Threaded Mailbox Exercise

In this exercise, we will take our "Connected Mailbox" and make it multi-threaded. A new thread should be spawned for every incoming connection, and that thread should take ownership of the TcpStream and drive it to completion.

After completing this exercise you are able to

  • spawn threads

  • convert a non-thread-safe type into a thread-safe type

  • lock a Mutex to access the data within

Prerequisites

  • A completed "Connected Mailbox" solution

Tasks

  1. Use the std::thread::spawn API to start a new thread when your main loop produces a new connection to a client. The handle_client function should be executed within that spawned thread. Note how Rust doesn't let you pass &mut VecDeque<String> into the spawned thread, both because you have multiple &mut references (not allowed) and because the thread might live longer than the VecDeque (which only lives whilst the main() function is running, and main() might quit at any time with an early return or a break out of the connection loop).

  2. Convert the VecDeque into an Arc<Mutex<VecDeque>> (use std::sync::Mutex). Change the handle_client function to take a &Mutex<VecDeque>. Clone the Arc handle with .clone() and move that cloned handle into the new thread. Change the handle_client function to call let mut queue = your_mutex.lock().unwrap(); whenever you want to access the queue inside the Mutex.

  3. Convert the Arc<Mutex<VecDeque>> into a Mutex<VecDeque> and introduce scoped threads with std::thread::scope. The Mutex<VecDeque> should be created outside of the scope (ensure it lives longer than any of the scoped threads), but the connection loop should be inside the scope. Change std::thread::spawn to be s.spawn, where s is the name of the argument to the scope closure.

At every step (noting that Step 1 won't actually work...), try out your program using a command-line TCP Client: you can either use nc, or netcat, or our supplied tools/tcp-client program.

Optional Tasks:

  • Run cargo clippy on your codebase.
  • Run cargo fmt on your codebase.

Help

Making an Arc, containing a Mutex, containing a VecDeque

You can just nest the calls to SomeType::new()...

Solution
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

fn main() {
    // This type annotation isn't required if you actually push something into the queue...
    let queue_handle: Arc<Mutex<VecDeque<String>>> = Arc::new(Mutex::new(VecDeque::new()));
}

Spawning Threads

The std::thread::spawn function takes a closure. Rust will automatically try and borrow any local variables that the closure refers to but that were declared outside the closure. You can put move in front of the closure bars (e.g. move ||) to make Rust try and take ownership of variables instead of borrowing them.

You will want to clone the Arc and move the clone into the thread.

Solution
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

fn main() {
    let queue_handle = Arc::new(Mutex::new(VecDeque::new()));

    for _ in 0..10 {
        // Clone the handle and move it into a new thread
        let thread_queue_handle = queue_handle.clone();
        std::thread::spawn(move || {
            handle_client(&thread_queue_handle);
        });

        // This is the same, but fancier. It stops you passing the wrong Arc handle
        // into the thread.
        std::thread::spawn({ // this is a block expression
            // This is declared inside the block, so it shadows the one from the
            // outer scope.
            let queue_handle = queue_handle.clone();
            // this is the closure produced by the block expression
            move || {
                handle_client(&queue_handle);
            }
        });
    }

    // This doesn't need to know it's in an Arc, just that it's in a Mutex.
    fn handle_client(locked_queue: &Mutex<VecDeque<String>>) {
        todo!();
    }
}

Locking a Mutex

A value of type Mutex<T> has a lock() method, but this method can fail if the Mutex has been poisoned (i.e. a thread panicked whilst holding the lock). We generally don't worry about handling the poisoned case (because one of your threads has already panicked, so the program is in a fairly bad state already), so we just use unwrap() to make this thread panic as well.

Solution
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

fn main() {
    let queue_handle = Arc::new(Mutex::new(VecDeque::new()));

    let mut inner_q = queue_handle.lock().unwrap();
    inner_q.push_back("Hello".to_string());
    println!("{:?}", inner_q.pop_front());
    println!("{:?}", inner_q.pop_front());
}
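To make the poisoned case concrete, here is a small sketch that poisons a Mutex on purpose and then recovers the data anyway through PoisonError::into_inner. The function name poison_and_recover is made up for this example, and whether recovering is sensible depends on your program; for this exercise, calling unwrap() as described above is usually fine.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Poisons a mutex by panicking in a thread that holds the lock, then
/// recovers the queue regardless. Returns whether the mutex was poisoned
/// and the first item in the queue.
fn poison_and_recover() -> (bool, Option<String>) {
    let queue = Arc::new(Mutex::new(VecDeque::new()));
    queue.lock().unwrap().push_back("Hello".to_string());

    let thread_queue = Arc::clone(&queue);
    let result = std::thread::spawn(move || {
        // Hold the lock and panic: this marks the mutex as poisoned.
        let _guard = thread_queue.lock().unwrap();
        panic!("deliberate panic while holding the lock");
    })
    .join();
    // `join` reports the panic as an `Err`.
    assert!(result.is_err());

    let was_poisoned = queue.is_poisoned();

    // `lock()` now returns `Err`, but the error still carries the guard.
    let mut guard = match queue.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    };
    (was_poisoned, guard.pop_front())
}
```

Running this prints the panic message from the spawned thread to stderr; the calling thread keeps going and still gets "Hello" out of the queue.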

Creating a thread scope

Recall, the purpose of a thread scope is to satisfy the compiler that it is safe for a thread to borrow an item that is on the current function's stack. It does this by ensuring that all threads created with the scope terminate before the thread scope ends (after which, the remainder of the function is executed including perhaps destruction or transfer of the variables that were borrowed).

Use std::thread::scope to create a scope, and pass it a closure containing the bulk of your main function. Any variables you want to borrow should be created before the thread scope is created, but you should wait for incoming connections inside the thread scope (think about what happens to any spawned threads that are still executing at the point you try and leave the thread scope).

Solution
use std::collections::VecDeque;
use std::sync::Mutex;

fn main() {
    let locked_queue = Mutex::new(VecDeque::new());

    std::thread::scope(|s| {
        for i in 0..10 {
            let locked_queue = &locked_queue;
            s.spawn(move || {
                let mut inner_q = locked_queue.lock().unwrap();
                inner_q.push_back(i.to_string());
                println!("Pop {:?}", inner_q.pop_front());
            });
        }
    });
}
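The following sketch illustrates the other half of the guarantee: once std::thread::scope returns, every scoped thread has finished, so nothing borrows the Mutex any more and it can be consumed with into_inner. The function name collect_from_threads is an assumption made for this example and is not part of the exercise.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Spawns `n` scoped threads that each push their index into a shared queue,
/// then takes the queue back out of the `Mutex` once the scope has ended.
fn collect_from_threads(n: usize) -> Vec<usize> {
    let locked_queue = Mutex::new(VecDeque::new());

    std::thread::scope(|s| {
        for i in 0..n {
            // Shadow with a reference so that `move` copies the reference
            // into the thread rather than moving the Mutex itself.
            let locked_queue = &locked_queue;
            s.spawn(move || {
                locked_queue.lock().unwrap().push_back(i);
            });
        }
    });

    // All threads have been joined here, so no borrows remain and we can
    // take ownership of the queue without locking.
    let mut items: Vec<usize> = locked_queue.into_inner().unwrap().into_iter().collect();
    // The threads run in an unspecified order, so sort for a stable result.
    items.sort();
    items
}
```

With std::thread::spawn and an Arc, you would have to keep and join every JoinHandle yourself before you could be sure the queue was no longer shared.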

Solution

If you need it, we have provided a complete solution for this exercise.

Self-Check Project

This exercise is intended for you to check your Rust knowledge. It is based on our other exercises, so you can follow those one by one instead of attempting to do everything in one go if you prefer.

In this exercise you will create a small in-memory message queue that is accessible over a TCP connection and uses a plain-text format for its protocol. The protocol has two commands: one to put a message into the back of the queue and one to read a message from the front of the queue. When a user sends a "PUBLISH" you will push the data into the queue, and when the user sends a "RETRIEVE" you will pop some data off the queue (if any is available). The user will connect via TCP to port 7878. You should handle multiple clients adding or removing messages from the queue at the same time.

Goals

After completing this exercise you will have demonstrated that you can:

  • write a Rust binary that uses a Rust library

  • combine two Rust packages into a Cargo workspace

  • open a TCP port and perform an action when each user connects

  • use I/O traits to read/write from a TCP socket

  • create a safe protocol parser in Rust manually

  • interact with borrowed and owned memory, especially how to take ownership

  • handle complex cases using the match and if let syntax

  • handle errors using Result and custom error types

  • spawn threads

  • convert a non-thread-safe type into a thread-safe type

  • lock a Mutex to access the data within

Tasks

  1. Create a Cargo workspace for your project.

  2. Create a binary package inside your workspace for your TCP server.

  3. Implement a simple TCP Server that listens for connections on 127.0.0.1:7878. For each incoming connection, read all of the input as a string, and send it back to the client. Disconnect the client if they send input that is not valid UTF-8.

  4. Create a package with a library crate inside your workspace for the message protocol parser. Make your TCP server depend on that library using a relative path.

  5. Inside your library implement the following function so that it implements the protocol specifications to parse the messages. Use the provided tests to help you with the case handling.

    pub fn parse(input: &str) -> Result<Command, Error> {
        todo!()
    }
  6. Change your TCP Server to use your parser crate to parse the input, and provide an appropriate canned response.

  7. Set up a VecDeque queue and either push or pop from that queue, depending on the command you have received.

  8. Add support for multiple simultaneous client connections using threads. Make sure all clients read and write to the same shared queue.
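For step 7, a minimal sketch of applying a parsed command to the queue might look like the block below. The Command enum mirrors the one used in the tests later in this chapter; the execute function and the response strings ("OK", "Queue is empty") are assumptions made for illustration, and your server is free to reply differently.

```rust
use std::collections::VecDeque;

/// The two commands of the protocol (same shape as used by the parser tests).
#[derive(Debug, PartialEq)]
pub enum Command {
    Publish(String),
    Retrieve,
}

/// Applies a command to the queue and returns the text to send back.
fn execute(command: Command, queue: &mut VecDeque<String>) -> String {
    match command {
        Command::Publish(message) => {
            // New messages go on the back of the queue...
            queue.push_back(message);
            "OK".to_string()
        }
        // ...and are read from the front, so the queue is first-in, first-out.
        Command::Retrieve => match queue.pop_front() {
            Some(message) => message,
            None => "Queue is empty".to_string(),
        },
    }
}
```

Taking &mut VecDeque<String> keeps this function unaware of any locking, so in step 8 the caller can lock a Mutex and pass in the guarded queue.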

Optional Tasks

  • Allow each connection to read input line by line as a sequence of commands and execute them in the same order as they come in. This way you should be able to open several connections in terminal and type commands in them one by one.
  • Handle slow clients by disconnecting them if the input isn't received within some timeout.
  • Run cargo fmt on your codebase.
  • Run cargo clippy on your codebase.
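For the line-by-line optional task, one possible approach is to wrap the reading side in a std::io::BufReader and call read_line in a loop. The sketch below is generic over BufRead and Write so that it can be tried without a socket; the function name process_lines and the "Got ..." reply are invented for this example. With a real TcpStream you could, for instance, pass BufReader::new(stream.try_clone()?) as the reader and the original stream as the writer.

```rust
use std::io::{BufRead, Write};

/// Reads input one line at a time and writes one reply per line.
/// Returns the number of lines handled.
fn process_lines(mut reader: impl BufRead, mut writer: impl Write) -> std::io::Result<usize> {
    let mut line = String::new();
    let mut count = 0;
    loop {
        line.clear();
        // `read_line` keeps the trailing '\n' (if there was one), which is
        // what the protocol parser expects. It returns 0 at end of input.
        if reader.read_line(&mut line)? == 0 {
            break;
        }
        // A real server would parse `line` and execute the command here.
        writeln!(writer, "Got {line:?}")?;
        count += 1;
    }
    Ok(count)
}
```

Unlike read_to_string, this replies as soon as each line arrives, so a client can keep its connection open and send several commands.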

Protocol Specification

The protocol has two commands that are sent as messages in the following form:

  • PUBLISH <payload>\n

  • RETRIEVE\n

With the additional properties:

  1. The payload cannot contain newlines.

  2. A missing newline at the end of the command is an error.

  3. Data after the first newline is an error.

  4. Empty payloads are allowed. In this case, the command is PUBLISH \n.

Violations against the form of the messages and the properties are handled with the following error codes:

  • TrailingData (bytes found after newline)

  • IncompleteMessage (no newline)

  • EmptyMessage (empty string instead of a command)

  • UnknownCommand (string is not empty, but neither PUBLISH nor RETRIEVE)

  • UnexpectedPayload (message contains a payload, when it should not)

  • MissingPayload (message is missing a payload)
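To show how these rules fit together, here is one possible sketch of the types and the parse function. It is not necessarily the reference solution. The names Command, Error and their variants are taken from this chapter; the Debug and PartialEq derives are there because assert_eq! needs them. Try the exercise yourself before reading it.

```rust
/// A successfully parsed command.
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Publish(String),
    Retrieve,
}

/// The ways in which a message can violate the protocol.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    TrailingData,
    IncompleteMessage,
    EmptyMessage,
    UnknownCommand,
    UnexpectedPayload,
    MissingPayload,
}

pub fn parse(input: &str) -> Result<Command, Error> {
    // Split at the first newline. No newline at all means the message is incomplete.
    let Some((message, rest)) = input.split_once('\n') else {
        return Err(Error::IncompleteMessage);
    };
    // Anything after the first newline is trailing data.
    if !rest.is_empty() {
        return Err(Error::TrailingData);
    }
    // Split the command word from the payload at the first space.
    // `None` means there was no space, and therefore no payload at all.
    let (command, payload) = match message.split_once(' ') {
        Some((command, payload)) => (command, Some(payload)),
        None => (message, None),
    };
    match (command, payload) {
        ("", None) => Err(Error::EmptyMessage),
        ("RETRIEVE", None) => Ok(Command::Retrieve),
        ("RETRIEVE", Some(_)) => Err(Error::UnexpectedPayload),
        // An empty payload ("PUBLISH \n") is allowed.
        ("PUBLISH", Some(payload)) => Ok(Command::Publish(payload.to_string())),
        ("PUBLISH", None) => Err(Error::MissingPayload),
        _ => Err(Error::UnknownCommand),
    }
}
```

Matching on the (command, payload) tuple keeps every rule of the specification on its own line, which makes it easy to compare the code against the list of error codes above.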

Testing

Below are the tests your protocol parser needs to pass. You can copy them to the bottom of your lib.rs.

#[cfg(test)]
mod tests {
    use super::*;

    // Tests placement of \n
    #[test]
    fn test_missing_nl() {
        let line = "RETRIEVE";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }
    #[test]
    fn test_trailing_data() {
        let line = "PUBLISH The message\n is wrong \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::TrailingData);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_string() {
        let line = "";
        let result = parse(line);
        let expected = Err(Error::IncompleteMessage);
        assert_eq!(result, expected);
    }

    // Tests for empty messages and unknown commands

    #[test]
    fn test_only_nl() {
        let line = "\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::EmptyMessage);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_unknown_command() {
        let line = "SERVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnknownCommand);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of RETRIEVE command

    #[test]
    fn test_retrieve_w_whitespace() {
        let line = "RETRIEVE \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve_payload() {
        let line = "RETRIEVE this has a payload\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::UnexpectedPayload);
        assert_eq!(result, expected);
    }

    #[test]
    fn test_retrieve() {
        let line = "RETRIEVE\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Retrieve);
        assert_eq!(result, expected);
    }

    // Tests correct formatting of PUBLISH command

    #[test]
    fn test_publish() {
        let line = "PUBLISH TestMessage\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("TestMessage".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_empty_publish() {
        let line = "PUBLISH \n";
        let result: Result<Command, Error> = parse(line);
        let expected = Ok(Command::Publish("".into()));
        assert_eq!(result, expected);
    }

    #[test]
    fn test_missing_payload() {
        let line = "PUBLISH\n";
        let result: Result<Command, Error> = parse(line);
        let expected = Err(Error::MissingPayload);
        assert_eq!(result, expected);
    }
}

Help

Connecting over TCP/IP

Using nc, netcat or ncat

The nc, netcat, or ncat tools may be available on your macOS or Linux machine, or on WSL on Windows. They all work in a similar fashion.

$ echo "PUBLISH 1234" | nc 127.0.0.1 7878

The echo command adds a new-line character automatically. Use echo -n if you don't want it to add a new-line character.

Using our TCP Client

We have written a basic TCP Client which should work on any platform.

$ cd tools/tcp-client
$ cargo run -- "PUBLISH hello"
$ cargo run -- "RETRIEVE"

It automatically adds a newline character on to the end of every message you send. It is hard-coded to connect to a server at 127.0.0.1:7878.

Solution

This exercise is based on three other exercises. Check their solutions below:

nRF52 Preparation

This chapter contains information about the nRF52-based exercises, the required hardware and an installation guide.

Required Hardware

  • nRF52840 Development Kit (DK)
  • nRF52840 Dongle
  • 2 micro-USB cables
    • ❗️ make sure you're using micro-USB cables which can transmit data (some are charging-only; these are not suitable for these exercises)
  • 2 available USB-A ports on your laptop / PC (you can use a USB hub if you don't have enough ports)

In our nRF52-focussed exercises we will use both the nRF52840 Development Kit (DK) and the nRF52840 Dongle. We'll mainly develop programs for the DK and use the Dongle to assist with some exercises.

For the span of these exercises keep the nRF52840 DK connected to your PC using a micro-USB cable. Connect the USB cable to the J2 port on the nRF52840 DK. Instructions to identify the USB ports on the nRF52840 board can be found in the top level README file.

Starter code

Project templates and starter code for this workshop can be found in this repo.

Required tools

Please install the required tools before the lesson starts.

nRF52 Code Organization

Workshop Materials

You will need a local copy of the workshop materials. We recommend the GitHub release as it contains pre-compiled HTML docs and pre-compiled dongle firmware, but you can clone the repo with git and check out the appropriate tag instead if you prefer.

Ask your trainer which release/tag you should be using.

GitHub Release

Download the latest release from the rust-exercises GitHub release area. Unpack the zip file somewhere you can work on the contents.

Git checkout

Clone and change into the rust-exercises git repository:

git clone https://github.com/ferrous-systems/rust-exercises.git
cd rust-exercises

The git repository contains all workshop materials, i.e. code snippets, custom tools and the source for this handbook, but not the pre-compiled dongle firmware.

Firmware

The target firmware for the nRF52 for this exercise lives in ./nrf52-code:

$ tree -L 2
.
├── boards
│   ├── dk
│   ├── dk-solution
│   ├── dongle
│   └── dongle-fw
├── consts
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── hal-app
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── loopback-fw
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── puzzle-fw
│   ├── Cargo.lock
│   ├── Cargo.toml
│   ├── build.rs
│   └── src
├── radio-app
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── usb-app
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
├── usb-app-solutions
│   ├── Cargo.lock
│   ├── Cargo.toml
│   ├── src
│   └── traces
├── usb-lib
│   ├── Cargo.lock
│   ├── Cargo.toml
│   └── src
└── usb-lib-solutions
    ├── get-descriptor-config
    ├── get-device
    └── set-config

27 directories, 17 files

boards/dk

Contains a Board Support Package for the nRF52840 Development Kit.

boards/dk-solution

Contains a Board Support Package for the nRF52840 Development Kit, with a solution to the BSP exercise.

boards/dongle

Contains a Board Support Package for the nRF52840 USB Dongle. You won't be using this.

boards/dongle-fw

Contains pre-compiled firmware for the nRF52 USB Dongle. Used in the nRF52 Radio Exercise.

consts

Contains constants (e.g. USB Vendor IDs) shared by multiple crates.

hal-app

Contains template and solution binary crates for the nRF BSP exercise.

loopback-fw

Source code for the USB Dongle firmware to implement loopback mode.

puzzle-fw

Source code for the USB Dongle firmware to implement puzzle mode. No, you won't find the solution to the puzzle in this source directory - nice try!

radio-app

Contains template and solution binary crates for the nRF Radio exercise.

usb-app

Contains template binary crates for the nRF USB exercise.

usb-app-solutions

Contains solution binary crates for the nRF USB exercise.

usb-lib

Contains a template library crate for the nRF USB exercise. This library can parse USB descriptor information.

usb-lib-solutions/get-descriptor-config

Contains a solution library crate for the nRF USB exercise.

usb-lib-solutions/get-device

Contains a solution library crate for the nRF USB exercise.

usb-lib-solutions/set-config

Contains a solution library crate for the nRF USB exercise.

nRF52 Hardware

nRF52840 Dongle

Connect the Dongle to your PC/laptop. Its red LED should start oscillating in intensity. The device will also show up as:

Windows: a USB Serial Device (COM port) in the Device Manager under the Ports section

Linux: a USB device under lsusb. The device will have a VID of 0x1915 and a PID of 0x521f -- the 0x prefix will be omitted in the output of lsusb:

$ lsusb
(..)
Bus 001 Device 023: ID 1915:521f Nordic Semiconductor ASA 4-Port USB 2.0 Hub

The device will also show up in the /dev directory as a ttyACM device:

$ ls /dev/ttyACM*
/dev/ttyACM0

macOS: a USB device when executing ioreg -p IOUSB -b -n "Open DFU Bootloader". The device will have a vendor ID ("idVendor") of 6421 and a product ID ("idProduct") of 21023:

$ ioreg -p IOUSB -b -n "Open DFU Bootloader"
(...)
| +-o Open DFU Bootloader@14300000  <class AppleUSBDevice, id 0x100005d5b, registered, matched, ac$
  |     {
  |       (...)
  |       "idProduct" = 21023
  |       (...)
  |       "USB Product Name" = "Open DFU Bootloader"
  |       (...)
  |       "USB Vendor Name" = "Nordic Semiconductor"
  |       "idVendor" = 6421
  |       (...)
  |       "USB Serial Number" = "CA1781C8A1EE"
  |       (...)
  |     }
  |

The device will show up in the /dev directory as tty.usbmodem<USB Serial Number>:

$ ls /dev/tty.usbmodem*
/dev/tty.usbmodemCA1781C8A1EE1

nRF52840 Development Kit (DK)

Connect one end of a micro USB cable to the USB connector J2 of the board and the other end to your PC.

💬 These directions assume you are holding the board "horizontally" with components (switches, buttons and pins) facing up. In this position, rotate the board, so that its convex shaped short side faces right. You'll find one USB connector (J2) on the left edge, another USB connector (J3) on the bottom edge and 4 buttons on the bottom right corner.

Labeled Diagram of the nRF52840 Development Kit (DK)

After connecting the DK to your PC/laptop it will show up as:

Windows: a removable USB flash drive (named JLINK) and also as a USB Serial Device (COM port) in the Device Manager under the Ports section

Linux: a USB device under lsusb. The device will have a VID of 0x1366 and a PID of 0x10?? or 0x01?? (? is a hex digit) -- the 0x prefix will be omitted in the output of lsusb:

$ lsusb
(..)
Bus 001 Device 014: ID 1366:1015 SEGGER 4-Port USB 2.0 Hub

The device will also show up in the /dev directory as a ttyACM device:

$ ls /dev/ttyACM*
/dev/ttyACM0

macOS: a removable USB flash drive (named JLINK) in Finder and also a USB device named "J-Link" when executing ioreg -p IOUSB -b -n "J-Link".

$ ioreg -p IOUSB -b -n "J-Link"
(...)
  | +-o J-Link@14300000  <class AppleUSBDevice, id 0x10000606a, registered, matched, active, busy 0 $
  |     {
  |       (...)
  |       "idProduct" = 4117
  |       (...)
  |       "USB Product Name" = "J-Link"
  |       (...)
  |       "USB Vendor Name" = "SEGGER"
  |       "idVendor" = 4966
  |       (...)
  |       "USB Serial Number" = "000683420803"
  |       (...)
  |     }
  |

The device will also show up in the /dev directory as tty.usbmodem<USB Serial Number>:

$ ls /dev/tty.usbmodem*
/dev/tty.usbmodem0006834208031
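Note that lsusb prints the vendor and product IDs in hexadecimal, whereas ioreg prints the same IDs in decimal. The short sketch below converts the decimal values shown in the ioreg listings into the vendor:product form that lsusb uses; the helper name lsusb_id is made up for this example.

```rust
/// Formats a USB vendor/product ID pair the way `lsusb` prints it:
/// two four-digit lowercase hex numbers separated by a colon.
fn lsusb_id(vendor: u16, product: u16) -> String {
    format!("{vendor:04x}:{product:04x}")
}
```

For example, lsusb_id(6421, 21023) gives "1915:521f" (the Dongle in bootloader mode) and lsusb_id(4966, 4117) gives "1366:1015" (the J-Link on the DK).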

The board has several switches to configure its behavior. The out of the box configuration is the one we want. If the above instructions didn't work for you, check the position of the following switches:

  • SW6 is set to the DEFAULT position (to the right - nRF = DEFAULT).
  • SW7 (protected by Kapton tape) is set to the Def. position (to the right - TRACE = Def.).
  • SW8 is set to the ON (to the left) position (Power = ON)
  • SW9 is set to the VDD position (center - nRF power source = VDD)
  • SW10 (protected by Kapton tape) is set to the OFF position (to the left - VEXT -> nRF = OFF).

For reference, here's the board picture again:

Labeled Diagram of the nRF52840 Development Kit (DK)

nRF52 Tools

VS Code

Windows: Go to https://code.visualstudio.com and run the installer.

Linux: Follow the instructions for your distribution on https://code.visualstudio.com/docs/setup/linux.

macOS: Go to https://code.visualstudio.com and click on "Download for Mac"

OS specific dependencies

Linux only: USB

Some of our tools depend on pkg-config and libudev.pc. Ensure you have the proper packages installed; on Debian based distributions you can use:

sudo apt-get install libudev-dev libusb-1.0-0-dev

To access the USB devices as a non-root user, follow these steps:

  1. (Optional) Connect the dongle and check its permissions with these commands:

    $ lsusb -d 1915:521f
    Bus 001 Device 016: ID 1915:521f Nordic Semiconductor ASA USB Billboard
    $ #   ^         ^^
    
    $ # take note of the bus and device numbers that appear for you when you run the next command
    $ ls -l /dev/bus/usb/001/016
    crw-rw-r-- 1 root root 189, 15 May 20 12:00 /dev/bus/usb/001/016
    

    The root root part in crw-rw-r-- 1 root root indicates the device can only be accessed by the root user.

  2. Create the following file with the displayed contents. You'll need root permissions to create the file.

    $ cat /etc/udev/rules.d/50-ferrous-training.rules
    # udev rules to allow access to USB devices as a non-root user
    
    # nRF52840 Dongle in bootloader mode
    ATTRS{idVendor}=="1915", ATTRS{idProduct}=="521f", TAG+="uaccess"
    
    # nRF52840 Dongle applications
    ATTRS{idVendor}=="1209", TAG+="uaccess"
    
    # nRF52840 Development Kit
    ATTRS{idVendor}=="1366", ENV{ID_MM_DEVICE_IGNORE}="1", TAG+="uaccess"
    
  3. Run the following command to make the new udev rules effective

    sudo udevadm control --reload-rules
    
  4. (Optional) Disconnect and reconnect the dongle. Then check its permissions again.

    $ lsusb
    Bus 001 Device 017: ID 1915:521f Nordic Semiconductor ASA 4-Port USB 2.0 Hub
    
    $ ls -l /dev/bus/usb/001/017
    crw-rw-r--+ 1 root root 189, 16 May 20 12:11 /dev/bus/usb/001/017
    

    The + part in crw-rw-r--+ indicates the device can be accessed without root permissions.

On Windows you'll need to associate the nRF52840 Development Kit's USB device to the WinUSB driver.

To do that connect the nRF52840 DK to your PC using micro-USB port J2 (as done before) then download and run the Zadig tool.

In Zadig's graphical user interface,

  1. Select the 'List all devices' option from the Options menu at the top.

  2. From the device (top) drop down menu select "BULK interface (Interface nnn)"

  3. Once that device is selected, 1366 1015 should be displayed in the USB ID field. That's the Vendor ID - Product ID pair.

  4. Select 'WinUSB' as the target driver (right side)

  5. Click "Install Driver". The process may take a few minutes to complete and might not appear to do anything right away. Click it once and wait.

You do not need to do anything for the nRF52840 Dongle device.

Rust and tooling

Base Rust installation

Go to https://rustup.rs and follow the instructions.

Windows: Be sure to select the optional "Desktop development with C++" part of the C++ build tools package. The installation may take up to 5.7 GB of disk space.

Rust Analyzer

All: Open VS Code and look for Rust Analyzer in the marketplace (bottom icon in the left panel). Then install it.

Windows: It's OK to ignore the message about git not being installed, if you get one!

Better TOML

All: For better handling of Cargo.toml files, we recommend you install Better TOML if you're using VS Code.

Rust Cross compilation support

All: Run this command in a terminal:

rustup +stable target add thumbv7em-none-eabihf

ELF analysis tools

All: Run these commands in a terminal:

cargo install cargo-binutils
rustup +stable component add llvm-tools

General purpose tools

Install the flip-link and nrfdfu tools from source using the following Cargo commands:

$ cargo install flip-link
(..)
Installed package `flip-link v0.1.7` (..)

$ cargo install nrfdfu
(..)
Installed package `nrfdfu v0.1.3` (..)

Install probe-rs 0.24 pre-compiled binaries on Linux or macOS with:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/probe-rs/probe-rs/releases/download/v0.24.0/probe-rs-tools-installer.sh | sh

Install probe-rs 0.24 pre-compiled binaries on Windows with:

powershell -c "irm https://github.com/probe-rs/probe-rs/releases/download/v0.24.0/probe-rs-tools-installer.ps1 | iex"

Setup check

✅ Let's check that you have installed all the tools listed in the previous section.

$ cargo size --version
cargo-size 0.3.6

✅ Connect the nRF52840-DK to your computer by plugging the USB cable into the J2 connector on the DK (the USB connector on the short side of the board).

✅ In the terminal run the following command from the nrf52-code/radio-app folder. This will build and run a simple program on the DK to test the set-up.

cargo run --bin hello -- --allow-erase-all

The -- --allow-erase-all option gives the --allow-erase-all argument to probe-rs, which gives it permission to clear out the pre-installed Nordic bootloader code. You only need that the first time you try and program the nRF52840-DK with cargo run.

References and Resources

Radio Project

USB Project

Tooltips

Besides the ones covered in this workshop, there are many more tools that make embedded development easier. Here, we'd like to introduce you to some of these tools and encourage you to play around with them and adopt them if you find them helpful!

cargo-bloat

cargo-bloat is a useful tool to analyze the binary size of a program. You can install it through cargo:

$ cargo install cargo-bloat
(..)
Installed package `cargo-bloat v0.10.0` (..)

Let's inspect our radio workshop's hello program with it:

$ cd nrf52-code/radio-app
$ cargo bloat --bin hello
File  .text   Size      Crate Name
0.7%  13.5% 1.3KiB        std <char as core::fmt::Debug>::fmt
0.5%   9.6%   928B      hello hello::__cortex_m_rt_main
0.4%   8.4%   804B        std core::str::slice_error_fail
0.4%   8.0%   768B        std core::fmt::Formatter::pad
0.3%   6.4%   614B        std core::fmt::num::<impl core::fmt::Debug for usize>::fmt
(..)
5.1% 100.0% 9.4KiB            .text section size, the file size is 184.5KiB

This breaks down the size of the .text section by function. This breakdown can be used to identify the largest functions in the program; those could then be modified to make them smaller.

Using probe-rs VS Code plugin

The probe-rs team have produced a VS Code plugin. It uses the probe-rs library to talk directly to your supported Debug Probe (J-Link, ST-Link, CMSIS-DAP, or whatever) and supports both single-stepping and defmt logging.

Install the probe-rs.probe-rs-debugger extension in VS Code. When you open the nrf52-code/radio-app folder in VS Code, the .vscode/launch.json file we supply should give you a Run with probe-rs entry in the Run and Debug panel. Press the green triangle and it will build the code, flash the device, set up defmt and then start the chip running. You can set breakpoints in the usual way (by clicking to the left of your source code to place a red dot).

Using gdb and probe-rs

The CLI probe-rs command has an option for opening a GDB server. We have found the command-line version of GDB to be a little buggy though, so the VS Code plugin above is preferred.

$ probe-rs gdb --chip nRF52840_xxAA
# In another window
$ arm-none-eabi-gdb ./target/thumbv7em-none-eabihf/debug/blinky
gdb> target extended-remote :1337
gdb> monitor reset halt
gdb> break main
gdb> continue
Breakpoint 1, blinky::__cortex_m_rt_main_trampoline () at src/bin/blinky.rs:10

Using gdb and openocd

You can also debug a Rust program using gdb and openocd. However, this isn't recommended because it requires significant extra set-up, especially to get the RTT data piped out of a socket and into defmt-print (this function is built into probe-rs).

If you are familiar with OpenOCD and GDB, and want to try this anyway, then do pretty much what you would do with a C program.

The only change is that if you want defmt output, you need these OpenOCD commands to enable RTT:

rtt setup 0x20000000 0x40000 "SEGGER RTT"
rtt start
rtt server start 9090 0

You can then use nc to connect to localhost:9090, and pipe the output into defmt-print:

nc localhost 9090 | defmt-print ./target/thumbv7em-none-eabihf/debug/blinky

Troubleshooting

If you have issues with any of the tools used in this workshop check out the sections in this chapter.

cargo-size is not working

$ cargo size --bin hello
Failed to execute tool: size
No such file or directory (os error 2)

llvm-tools is not installed. Install it with rustup component add llvm-tools

▶ Run button, type annotations and syntax highlighting missing / Rust-Analyzer is not working

If you get no type annotations, no "Run" button and no syntax highlighting this means Rust-Analyzer isn't at work yet.

Try the following:

  • add something to the file you're currently looking at, delete it again and save. This triggers a re-run. (you can also touch the file in question)

  • check that you have a single folder open in VS Code; this is different from a single-folder VS Code workspace. First close all the currently open folders then open a single folder using the 'File > Open Folder' menu. The open folder should be the nrf52-code/radio-app folder for the Radio workshop, the nrf52-code/hal-app folder for the HAL workshop, or the nrf52-code/usb-app folder for the USB workshop.

  • use the latest version of the Rust-Analyzer plugin. If you get a prompt to update the Rust-Analyzer extension when you start VS Code, accept it. You may also get a prompt about updating the Rust-Analyzer binary; accept that one too. The extension should restart automatically after the update. If it doesn't, close and re-open VS Code.

  • You may need to wait a little while Rust-Analyzer analyzes all the crates in the dependency graph. Then you may need to modify and save the currently open file to force Rust-Analyzer to analyze it.

cargo build fails to link

If you have configured Cargo to use sccache then you'll need to disable sccache support. Unset the RUSTC_WRAPPER variable in your environment before opening VS Code. Run cargo clean from the Cargo workspace you are working from (nrf52-code/radio-app or nrf52-code/usb-app). Then open VS Code.

If you are on Windows and get linking errors like LNK1201: error writing to program database, then something in your target folder has become corrupt. A cargo clean should fix it.

Dongle USB functionality is not working

NOTE: this section only applies to the Beginner workshop

If you don't get any output from cargo xtask serial-term, it may simply be that the first line got lost when the device re-enumerated from bootloader mode to the loopback application.

Run cargo xtask serial-term in one console window. Leave this window open.

In another window, run these two commands:

$ cargo xtask change-channel 20
requested channel change to channel 20

$ cargo xtask change-channel 20
requested channel change to channel 20

If you get two lines of output in cargo xtask serial-term like this, you are good to go:

$ cargo xtask serial-term
now listening on channel 20
now listening on channel 20

Return to the "Interference" section.

🔎 cargo xtask serial-term shows you the log output that the Dongle is sending to your computer via the serial interface (not over the wireless network!). After you've run cargo xtask change-channel, it tells you that it is now listening for network traffic on channel 20. This is helpful for debugging, but not mission-critical.

If you only get one line of output then your OS may be losing some serial data -- we have seen this behavior on some macOS machines. You will still be able to work through the exercises but will miss log data every now and then. Return to the "Interference" section.

If you don't get any output from cargo xtask serial-term and/or the cargo xtask change-channel command fails then the Dongle's USB functionality is not working correctly.

In this case you should flash one of the loopback-nousb* programs:

Put the device in bootloader mode again. Now, run:

nrfdfu nrf52-code/boards/dongle/loopback-nousb21  # you can pick 11, 16, 21 or 26

❗️ The number in the loopback-nousb* file name is the radio channel the Dongle will listen on. This means that when you program the Development Kit to send data to the Dongle, you need to ensure they are communicating on the same channel by setting

/* make sure to pass the channel number of the loopback-nousb* program you picked */
radio.set_channel(Channel::_21);

Note that the loopback-nousb* programs do not send you any logs via cargo xtask serial-term for debugging, but you will be able to do the exercises nonetheless. For your debugging convenience, the Dongle will toggle the state of its green LED when it receives a packet. When you're done, return to the "Interference" section.

cargo run errors

You may get one of these errors:

  • "Access denied (insufficient permissions)" (seen on macOS)
  • "USB error while taking control over USB device: Resource busy" (seen on Linux)

$ cargo run --bin usb-4
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/usb-4`
Error: An error specific to a probe type occured: USB error while taking control over USB device: Access denied (insufficient permissions)

Caused by:
    USB error while taking control over USB device: Access denied (insufficient permissions)

$ cargo run --bin usb-4
Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/usb-4`
Error: An error specific to a probe type occured: USB error while taking control over USB device: Resource busy

Caused by:
    USB error while taking control over USB device: Resource busy

Both errors have the same root cause: you have another instance of the cargo run process running.

It is not possible to have two or more instances of cargo run running at the same time. Terminate the old instance before executing cargo run. If you are using VS Code, click the garbage icon ("Kill Terminal") in the top right corner of the terminal output window (located at the bottom of the screen).

no probe was found error

You may encounter this error:

Running probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello
Error: no probe was found

  • It may be caused by the micro-USB cable being plugged into the port on the long side of the board, instead of the one on the short side.
  • Check that the board is powered on.
  • Check that your cable is a data cable and not power-only.

location info is incomplete error

Problem: Using cargo run --bin hello from within the nrf52-code/radio-app folder finishes compiling and starts up probe-rs. But then the following error is returned:

Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello`
(HOST) WARN  (BUG) location info is incomplete; it will be omitted from the output
Error: AP ApAddress { dp: Default, ap: 0 } is not a memory AP

The LED5 next to the FTDI chip on the DK goes off for a split second but no program is flashed.

Solution: Some nRF52840-DK boards appear to be shipped with the MCU in a protected state. Running nrfjprog --recover (nrfjprog is part of the nRF command line tools) makes the MCU exit this state, after which programming with probe-rs works again.

Untested: using nrf-recover may also work.

nRF52 Radio Workbook

In this workshop you'll get familiar with:

  • the structure of embedded Rust programs,
  • the existing embedded Rust tooling, and
  • embedded application development using a Board Support Package (BSP).

To put these concepts in practice you'll write applications that use the radio functionality of the nRF52840 microcontroller.

You have received two development boards for this workshop. We'll use both in this radio workshop.

The nRF52840 Development Kit

This is the larger development board.

The board has two USB ports: J2 and J3 and an on-board J-Link programmer / debugger -- there are instructions to identify the ports in a previous section. USB port J2 is the J-Link's USB port. USB port J3 is the nRF52840's USB port. Connect the Development Kit to your computer using the J2 port.

The nRF52840 Dongle

This is the smaller development board.

The board has the form factor of a USB stick and can be directly connected to one of the USB ports of your PC / laptop. Do not connect it just yet.

The nRF52840

Both development boards have an nRF52840 microcontroller. Here are some details about it that are relevant to this workshop.

  • single core ARM Cortex-M4 processor clocked at 64 MHz
  • 1 MB of Flash (at address 0x0000_0000)
  • 256 KB of RAM (at address 0x2000_0000)
  • IEEE 802.15.4 and BLE (Bluetooth Low Energy) compatible radio
  • USB controller (device function)

Parts of an Embedded Program

We will look at the elements that distinguish an embedded Rust program from a desktop program.

✅ Open the nrf52-code/radio-app folder in VS Code.

# or use "File > Open Folder" in VS Code
code nrf52-code/radio-app

✅ Then open the nrf52-code/radio-app/src/bin/hello.rs file.

Attributes

In the file, you will find the following attributes:

#![no_std]

The #![no_std] language attribute indicates that the program will not make use of the standard library, the std crate. Instead it will use the core library, a subset of the standard library that does not depend on an underlying operating system (OS).

#![no_main]

The #![no_main] language attribute indicates that the program will use a custom entry point instead of the default fn main() { .. } one.

#[entry]

The #[entry] macro attribute marks the custom entry point of the program. The entry point must be a divergent function whose return type is the never type !. The function is not allowed to return; therefore the program is not allowed to terminate. The macro comes from the cortex-m-rt crate and is not part of the Rust language.

Building an Embedded Program

The default in a Cargo project is to compile for the host (native compilation). The nrf52-code/radio-app project has been configured for cross compilation to the ARM Cortex-M4 architecture. This configuration can be seen in the Cargo configuration file (.cargo/config):

# .cargo/config
[build]
target = "thumbv7em-none-eabihf" # = ARM Cortex-M4

The target thumbv7em-none-eabihf can be broken down as:

  • thumbv7em - we generate instructions for the Armv7E-M architecture running in Thumb-2 mode (actually the only supported mode on this architecture)
  • none - there is no Operating System
  • eabihf - use the ARM Embedded Application Binary Interface, with Hard Float support
    • f32 and f64 can be passed to functions in FPU registers (like S0), instead of in integer registers (like R0)

✅ Inside the folder nrf52-code/radio-app, use the following command to cross compile the program to the ARM Cortex-M4 architecture.

cargo build --bin hello

The output of the compilation process will be an ELF (Executable and Linkable Format) file. The file will be placed in the target/thumbv7em-none-eabihf/debug directory.

✅ Run file target/thumbv7em-none-eabihf/debug/hello and check that your output matches the expected output.

Expected output:

$ file target/thumbv7em-none-eabihf/debug/hello
hello: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped

Binary Size

ELF files contain metadata like debug information so their size on disk is not a good indication of the amount of Flash the program will use once it's loaded on the target device's memory.

To display the amount of Flash the program will occupy on the target device use the cargo-size tool, which is part of the cargo-binutils package.

✅ Use the following command to print the binary's size in System V format.

cargo size --bin hello -- -A

Expected output: The breakdown of the program's static memory usage per linker section.

$ cargo size --bin hello -- -A
   Compiling radio v0.0.0 (/Users/jonathan/Documents/rust-exercises/nrf52-code/radio-app)
    Finished dev [optimized + debuginfo] target(s) in 0.92s
hello  :
section               size        addr
.vector_table          256         0x0
.text                 4992       0x100
.rodata               1108      0x1480
.data                   48  0x2003fbc0
.gnu.sgstubs             0      0x1920
.bss                    12  0x2003fbf0
.uninit               1024  0x2003fbfc
.defmt                   6         0x0
.debug_loc            3822         0x0
.debug_abbrev         3184         0x0
.debug_info         109677         0x0
.debug_aranges        2896         0x0
.debug_ranges         4480         0x0
.debug_str          108868         0x0
.debug_pubnames      40295         0x0
.debug_pubtypes      33582         0x0
.ARM.attributes         56         0x0
.debug_frame          2688         0x0
.debug_line          18098         0x0
.comment                19         0x0
Total               335111

🔎 More details about each linker section:

The first three sections are contiguously located in Flash memory -- on the nRF52840, flash memory spans from address 0x0000_0000 to 0x0010_0000 (i.e. 1 MiB of flash).

  • The .vector_table section contains the vector table, a data structure required by the Armv7E-M specification
  • The .text section contains the instructions the program will execute
  • The .rodata section contains constants like string literals

Skipping .gnu.sgstubs (which is empty), the next few sections - .data, .bss and .uninit - are located in RAM. Our RAM spans the address range 0x2000_0000 - 0x2004_0000 (256 KB). These sections contain statically allocated variables (static variables), which are either initialised with a value kept in flash, with zero, or with nothing at all.

The remaining sections are debug information, which we ignore for now. But your debugger might refer to them when debugging!
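As a rough cross-check of the table above, the following host-side sketch adds up the sections that occupy Flash and RAM. It assumes the section sizes from the example output (yours will differ) and counts .data in both totals, because its initial values are kept in Flash while the variables themselves live in RAM. The function and constant names are made up for this illustration.

```rust
/// (section name, size in bytes), copied from the example `cargo size` output.
const SECTIONS: [(&str, u32); 6] = [
    (".vector_table", 256),
    (".text", 4992),
    (".rodata", 1108),
    (".data", 48),
    (".bss", 12),
    (".uninit", 1024),
];

/// Sums the sizes of all sections whose names appear in `wanted`.
fn total_size(wanted: &[&str]) -> u32 {
    SECTIONS
        .iter()
        .filter(|(name, _)| wanted.contains(name))
        .map(|(_, size)| size)
        .sum()
}

/// Bytes of Flash used: code, constants and the initial values of `.data`.
fn flash_usage() -> u32 {
    total_size(&[".vector_table", ".text", ".rodata", ".data"])
}

/// Bytes of RAM used by statically allocated variables.
fn ram_usage() -> u32 {
    total_size(&[".data", ".bss", ".uninit"])
}

fn main() {
    println!("Flash: {} bytes", flash_usage());
    println!("RAM:   {} bytes", ram_usage());
}
```

With the example numbers this reports a few kilobytes of Flash and about one kilobyte of RAM, far less than the size of the ELF file on disk.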

Running the Program

Setting the log level

Enter the appropriate command into the terminal you're using. This will set the log level for this session.

macOS & Linux

export DEFMT_LOG=warn

PowerShell

$Env:DEFMT_LOG = "warn"

Windows Command Prompt

set DEFMT_LOG=warn

Inside VS Code

To get VS Code to pick up the environment variable, you can either:

  • set it as above and then open VS Code from inside the terminal (ensuring it wasn't already open and hence just getting you a new window on the existing process), or

  • add it to your rust-analyzer configuration, by placing this in your settings.json file:

    "rust-analyzer.runnables.extraEnv": {
        "DEFMT_LOG": "warn"
    }
    

    This will ensure the variable is set whenever rust-analyzer executes cargo run for you.

Running from VS Code

✅ Open the nrf52-code/radio-app/src/bin/hello.rs file, go to the "Run and Debug" button on the left, and then click the "Run" triangle next to Debug Microcontroller.

Note: you will get the "Run" button if Rust-Analyzer's workspace is set to the nrf52-code/radio-app folder. This will be the case if the current folder in VS Code (left side panel) is set to nrf52-code/radio-app.

Running from the console

If you are not using VS Code, you can run the program from your console. Enter the command cargo run --bin hello from within the nrf52-code/radio-app folder. Rust-Analyzer's "Run" button is a shortcut for that command.

Expected output

NOTE: Recent versions of the nRF52840-DK have flash-read-out protection to stop people dumping the contents of flash on an nRF52 they received pre-programmed, so if you have problems immediately after first plugging your board in, see this page.

If you run into an error along the lines of "Debug power request failed", retry the operation and the error should disappear.

$ cargo run --bin hello
   Compiling radio_app v0.0.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app)
    Finished dev [optimized + debuginfo] target(s) in 0.28s
     Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/hello`
     Erasing sectors ✔ [00:00:00] [######################################################] 8.00 KiB/8.00 KiB @ 26.71 KiB/s (eta 0s )
 Programming pages   ✔ [00:00:00] [######################################################] 8.00 KiB/8.00 KiB @ 29.70 KiB/s (eta 0s )    Finished in 0.59s
Hello, world!
`dk::exit()` called; exiting ...

What just happened?

cargo run will compile the application and then invoke the probe-rs tool with its final argument set to the path of the output ELF file.

The probe-rs tool will

  • flash (load) the program on the microcontroller
  • reset the microcontroller to make it execute the new program
  • collect logs from the microcontroller and print them to the console
  • print a backtrace of the program if the halt was due to an error.

Should you need to configure the probe-rs invocation to e.g. flash a different microcontroller you can do that in the .cargo/config.toml file.

[target.thumbv7em-none-eabihf]
runner = "probe-rs run --chip nRF52840_xxAA" # <- add/remove/modify flags here
# ..

🔎 How does flashing work?

The flashing process consists of the PC communicating with a second microcontroller on the nRF52840 DK over USB (J2 port). This second microcontroller, which is a J-Link Arm Debug Probe, is connected to the nRF52840 through an electrical interface known as SWD (Serial Wire Debug). The SWD protocol specifies procedures for reading memory, writing to memory, halting the target processor, reading the target processor registers, etc.

🔎 How does logging work?

Logging is implemented using the Real Time Transfer (RTT) protocol. Under this protocol the target device writes log messages to a ring buffer stored in RAM; the PC communicates with the J-Link to read out log messages from this ring buffer. This logging approach is non-blocking in the sense that the target device does not have to wait for physical IO (USB comm, serial interface, etc.) to complete while logging messages since they are written to memory. It is possible, however, for the target device to run out of space in its logging ring buffer; this causes old log messages to be overwritten or the microcontroller to pause whilst waiting for the PC to catch up with reading messages (depending on configuration).
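The two "buffer full" behaviours described above can be modelled on a host machine with a small queue. This is only a sketch of the idea, not RTT's actual implementation: LogBuffer, FullPolicy and their methods are hypothetical names, and the real ring buffer lives in the target's RAM and is read by the debug probe.

```rust
use std::collections::VecDeque;

/// What to do when the buffer is full (the two behaviours described above).
#[derive(Clone, Copy, Debug, PartialEq)]
enum FullPolicy {
    /// Drop the oldest byte to make room for the new one.
    OverwriteOldest,
    /// Refuse the write; a real target would wait for the host to catch up.
    Block,
}

/// A toy log buffer: the "target" pushes bytes, the "host" drains them.
struct LogBuffer {
    data: VecDeque<u8>,
    capacity: usize,
    policy: FullPolicy,
}

impl LogBuffer {
    fn new(capacity: usize, policy: FullPolicy) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { data: VecDeque::with_capacity(capacity), capacity, policy }
    }

    /// Target side: returns `false` if the byte could not be stored right now.
    fn push(&mut self, byte: u8) -> bool {
        if self.data.len() == self.capacity {
            match self.policy {
                FullPolicy::OverwriteOldest => {
                    self.data.pop_front();
                }
                FullPolicy::Block => return false,
            }
        }
        self.data.push_back(byte);
        true
    }

    /// Host side: reads out everything that is currently buffered.
    fn drain(&mut self) -> Vec<u8> {
        self.data.drain(..).collect()
    }
}

fn main() {
    // Five bytes are written into a four-byte buffer before the host reads.
    let mut log = LogBuffer::new(4, FullPolicy::OverwriteOldest);
    for byte in b"Hello" {
        log.push(*byte);
    }
    println!("{:?}", String::from_utf8(log.drain()));
}
```

In overwrite mode the oldest byte is lost when the host reads too slowly; in blocking mode nothing is lost, but the writer has to wait.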

Panicking

✅ Open the nrf52-code/radio-app/src/bin/panic.rs file and click the "Run" button (or run with cargo run --bin panic).

This program attempts to index an array beyond its length and this results in a panic.

$ cargo run --bin panic
   Compiling defmt-macros v0.3.6
   Compiling defmt v0.3.5
   Compiling defmt-rtt v0.4.0
   Compiling panic-probe v0.3.1
   Compiling dk v0.0.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/boards/dk)
   Compiling radio_app v0.0.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app)
    Finished dev [optimized + debuginfo] target(s) in 1.27s
     Running `probe-rs run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/debug/panic`
      Erasing ✔ [00:00:00] [#######################################################################################################################################] 16.00 KiB/16.00 KiB @ 32.26 KiB/s (eta 0s )
  Programming ✔ [00:00:00] [#######################################################################################################################################] 16.00 KiB/16.00 KiB @ 41.48 KiB/s (eta 0s )    Finished in 0.904s
ERROR panicked at src/bin/panic.rs:30:13:
index out of bounds: the len is 3 but the index is 3
└─ panic_probe::print_defmt::print @ /Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:104
`dk::fail()` called; exiting ...
Frame 0: fail @ 0x00001308
       /Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cortex-m-semihosting-0.5.0/src/lib.rs:201:13
Frame 1: __cortex_m_rt_HardFault @ 0x000016a6 inline
       /Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/lib.rs:12:5
Frame 2: __cortex_m_rt_HardFault_trampoline @ 0x00000000000016a2
       /Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/lib.rs:10:1
Frame 3: "HardFault handler. Cause: Escalated UsageFault (Undefined instruction)." @ 0x000016a6
Frame 4: __udf @ 0x00001530 inline
       ./asm/lib.rs:48:1
Frame 5: __udf @ 0x0000000000001530
       ./asm/lib.rs:51:17
Frame 6: udf @ 0x0000151c
       /Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cortex-m-0.7.7/src/asm.rs:43:5
Frame 7: hard_fault @ 0x0000150e
       /Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:86:5
Frame 8: panic @ 0x000014dc
       /Users/jonathan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:54:9
Frame 9: panic_fmt @ 0x0000034a
       /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
Frame 10: panic_bounds_check @ 0x000003fe
       /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:190:5
Frame 11: bar @ 0x00000180
       /Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:30:13
Frame 12: foo @ 0x00000176
       /Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:24:2
Frame 13: __cortex_m_rt_main @ 0x000002de
       /Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:13:5
Frame 14: __cortex_m_rt_main_trampoline @ 0x0000018a
       /Users/jonathan/Documents/ferrous-systems/rust-exercises/nrf52-code/radio-app/src/bin/panic.rs:9:1
Frame 15: memmove @ 0x0000013c
Frame 16: memmove @ 0x0000013c
Error: Semihosting indicates exit with failure code: 0x020023 (131107)

In no_std programs the behavior of panic is defined using the #[panic_handler] attribute. In the example, the panic handler is defined in the panic-probe crate but we can also implement a custom one in our binary:

✅ Change radio-app/src/lib.rs: remove the use panic_probe as _; line and add a custom panic handler, like:

#[panic_handler]
fn panic(info: &core::panic::PanicInfo) -> ! {
    defmt::error!("Oops!! {}", defmt::Debug2Format(info));
    dk::fail();
}

Now run the program again. Try again, but without printing the info variable. Can you print info without defmt::Debug2Format(..) wrapped around it? Why not?

Using a Hardware Abstraction Layer

✅ Open the nrf52-code/radio-app/src/bin/led.rs file.

You'll see that it initializes your board using the dk crate:

let board = dk::init().unwrap();

This grants you access to the board's peripherals, like its LEDs.

The dk crate / library is a Board Support Package (BSP) tailored to this workshop to make accessing the peripherals used in this workshop extra seamless. You can find its source code at nrf52-code/boards/dk/src/.

dk is based on the nrf52840-hal crate, which is a Hardware Abstraction Layer (HAL) over the nRF52840 System on Chip. The purpose of a HAL is to abstract away the device-specific details of the hardware, for example registers, and instead expose a higher level API more suitable for application development.

The dk::init function we have been calling in all programs initializes a few of the nRF52840 peripherals and returns a Board structure that provides access to those peripherals. We'll first look at the Leds API.

✅ Run the led program. Two of the green LEDs on the board should turn on; the other two should stay off.

NOTE: this program will not terminate itself. Within VS Code you need to click "Kill terminal" (garbage bin icon) in the bottom panel to terminate it.

✅ Open the documentation for the dk crate by running the following command from the nrf52-code/radio-app folder:

cargo doc -p dk --open

✅ Check the API docs of the Led abstraction. Change the led program so that the bottom two LEDs are turned on, and the top two are turned off.

🔎 If you want to see logs from the Led API of the dk Board Support Package, flash the DK with the following environment variable set:

DEFMT_LOG=trace cargo run --bin led

The logs will appear on your console, as the output of cargo run. Among the logs you'll find the line "I/O pins have been configured for digital output". At this point the electrical pins of the nRF52840 microcontroller have been configured to drive the 4 LEDs on the board.

After the dk::init logs you'll find logs about the Led API. As the logs indicate, an LED becomes active when the output of the pin is a logical zero, which is also referred to as the "low" state. This "active low" configuration does not apply to all boards: it depends on how the pins have been wired to the LEDs. You should refer to the board documentation to find out which pins are connected to LEDs and whether "active low" or "active high" applies to them.
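To make the "active low" idea concrete, here is a minimal host-side sketch of the mapping between "LED on/off" and the electrical level of the pin. The Polarity and Level types and the pin_level function are hypothetical and are not part of the dk crate.

```rust
/// How an LED is wired to its pin.
#[derive(Clone, Copy)]
enum Polarity {
    ActiveLow,
    ActiveHigh,
}

/// Electrical level driven on a pin.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    Low,
    High,
}

/// Returns the level the pin must be driven to for the requested LED state.
fn pin_level(led_on: bool, polarity: Polarity) -> Level {
    match (polarity, led_on) {
        (Polarity::ActiveLow, true) | (Polarity::ActiveHigh, false) => Level::Low,
        (Polarity::ActiveLow, false) | (Polarity::ActiveHigh, true) => Level::High,
    }
}

fn main() {
    // On an active-low board, turning the LED on means driving the pin low.
    println!("{:?}", pin_level(true, Polarity::ActiveLow));
    println!("{:?}", pin_level(true, Polarity::ActiveHigh));
}
```

A BSP typically hides this detail behind on and off style methods, so application code does not need to know how the board is wired.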

🔎 When writing your own embedded project, you can implement your own BSP similar to dk, or use the matching HAL crate for your chip directly. Check out awesome-embedded-rust to see if there's a BSP for the board you want to use, or a HAL crate for the chip you'd like to use.

Timers and Time

Next we'll look into the time-related APIs exposed by the dk crate.

✅ Open the nrf52-code/radio-app/src/bin/blinky.rs file.

This program will blink (turn on and off) one of the LEDs on the board. The time interval between each toggle operation is one second. This wait time between consecutive operations is generated by the blocking timer.wait operation. This function call will block the program execution for the specified Duration argument.
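One full blink (on, then off) takes two toggles, so the blink frequency is half the toggle rate. The following host-side sketch works this out for a given wait time; it ignores the time spent toggling the LED and logging, and blink_frequency_hz is a name invented for this example.

```rust
use std::time::Duration;

/// Blink frequency in Hz when the LED is toggled once per `wait`.
/// One full on/off cycle needs two toggles, hence the factor of two.
fn blink_frequency_hz(wait: Duration) -> f64 {
    1.0 / (2.0 * wait.as_secs_f64())
}

fn main() {
    for millis in [1000, 250, 20] {
        let wait = Duration::from_millis(millis);
        println!("wait = {:?} -> {:.1} Hz", wait, blink_frequency_hz(wait));
    }
}
```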

The other time-related API exposed by the dk crate is the dk::uptime function. This function returns the time that has elapsed since the call to the dk::init function. This function is used in the program to log the time of each LED toggle operation.

✅ Try changing the Duration value passed to timer.wait. Try values larger than one second and smaller than one second. What values of Duration make the blinking imperceptible?

❗ If you set the duration to below 2ms, try removing the defmt::println! command in the loop. Too much logging will fill the logging buffer and cause the loop to slow down, so the blink frequency drops after a while.

nRF52840 Dongle

Next, we'll look into the radio API exposed by the dk crate. But before that we'll need to set up the nRF52840 Dongle.

From this section on, we'll use the nRF52840 Dongle in addition to the nRF52840 DK. We'll run some pre-compiled programs on the Dongle and write programs for the DK that will interact with the Dongle over a radio link.

💬 How to find the buttons on the Dongle: Put the Dongle in front of you, so that the side with the parts mounted on it faces up. Rotate it, so that the narrower part of the board, the surface USB connector, faces away from you. The Dongle has two buttons. They are next to each other in the lower left corner of the Dongle. The reset button (RESET) is mounted sideways; its square-shaped button faces you. Further away from you is the round-ish user button (SW1), which faces up.

Unlike the DK, the Dongle does not contain an on-board debugger, so we cannot use probe-rs tools to write programs into it. Instead, the Dongle's stock firmware comes with a bootloader.

When put in bootloader mode the Dongle will run a bootloader program instead of the last application that was flashed into it. This bootloader program will make the Dongle show up as a USB CDC ACM device (AKA Serial over USB device) that accepts new application images over this interface. We'll use the nrfdfu tool to communicate with the bootloader-mode Dongle and flash new images into it.

✅ Connect the Dongle to your computer. Put the Dongle in bootloader mode by pressing its reset button.

When the Dongle is in bootloader mode its red LED will pulsate. The Dongle will also appear as a USB CDC ACM device with vendor ID 0x1915 and product ID 0x521f.

You can also use our cargo xtask usb-list tool, a minimal cross-platform version of the lsusb tool, to check out the status of the Dongle.

✅ Run cargo xtask usb-list in the root of the rust-exercises checkout to list all USB devices; the Dongle will be highlighted in the output, along with a note if in bootloader mode.

Output should look like this:

radio-app/ $ cd ../..
rust-exercises/ $ cargo xtask usb-list
(..)
Bus 001 Device 016: ID 1915:521f <- nRF52840 Dongle (in bootloader mode)

🔎 cargo xtask lets us extend cargo with custom commands which are installed as you run them for the first time. We've used it to add some helper tools to our workshop materials while keeping the preparation installations as minimal as possible.

Now that the device is in bootloader mode, browse to the nrf52-code/boards/dongle-fw directory. You'll find some ELF files (without a file extension) there. These are pre-compiled Rust programs to be flashed onto your Dongle.

For the next section you'll need to flash the loopback file onto the Dongle.

✅ Run the following command:

nrfdfu nrf52-code/boards/dongle-fw/loopback-fw

If the file is missing, you might be in a git checkout instead of a GitHub release tarball. Grab the GitHub release tarball and find the binary in there.

Expected output:

[INFO  nrfdfu] Sending init packet...
[INFO  nrfdfu] Sending firmware image of size 37328...
[INFO  nrfdfu] Done.

After the device has been programmed it will automatically reset and start running the new application.

🔎 Alternatively, you can use Nordic's own nrfutil tool to convert a .hex file and flash it for you, among many other things. nrfutil is a very powerful tool, but also unstable at times, which is why we replaced the parts we needed from it with nrfdfu.

🔎 The loopback application will make the Dongle enumerate itself as a CDC ACM device.

✅ Run the cargo xtask usb-list tool to see the newly enumerated Dongle in the output:

$ cargo xtask usb-list
(..)
Bus 001 Device 020: ID 1209:0309 <- nRF52840 Dongle (loopback-fw)

The loopback app will log messages over the USB interface. To display these messages on the host we have provided a cross-platform tool: cargo xtask serial-term.

❗ Do not use serial terminal emulators like minicom or screen. They use the USB TTY ACM interface in a slightly different manner and may result in data loss.

βœ… Run cargo xtask serial-term. It shows you the logging output the Dongle is sending on its serial interface to your computer. This helps you monitor what's going on at the Dongle and debug connection issues. Start with the Dongle unplugged and you should see the following output:

$ cargo xtask serial-term
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `xtask/target/debug/xtask serial-term`
(waiting for the Dongle to be connected)
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw

This line is printed by the loopback app on boot. It contains the device ID of the dongle, a 64-bit unique identifier (so everyone will see a different number); the radio channel that the device will use to communicate; and the transmission power of the radio in dBm.
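If you ever want to process this line in a script of your own, its key=value layout is easy to split. The sketch below is a host-side illustration only: parse_status_line is a hypothetical helper and not part of the xtask tools, and it assumes the fields stay separated by spaces as in the output above.

```rust
use std::collections::HashMap;

/// Splits a `key=value key=value ...` line into a map.
/// Fields that do not contain `=` are skipped.
fn parse_status_line(line: &str) -> HashMap<&str, &str> {
    line.split_whitespace()
        .filter_map(|field| field.split_once('='))
        .collect()
}

fn main() {
    let line = "deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw";
    let fields = parse_status_line(line);
    println!("channel = {:?}", fields.get("channel"));
    println!("app     = {:?}", fields.get("app"));
}
```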

If you don't get any output from cargo xtask serial-term, check the USB dongle troubleshooting section.

Interference

At this point you should not get more output from cargo xtask serial-term.

❗If you get "received N bytes" lines in output like this:

$ cargo xtask serial-term
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
received 7 bytes (CRC=Ok(0x2459), LQI=0)
received 5 bytes (CRC=Ok(0xdad9), LQI=0)
received 6 bytes (CRC=Ok(0x72bb), LQI=0)

That means the device is observing interference traffic, likely from 2.4 GHz WiFi or Bluetooth. In this scenario you should switch the listening channel to one where you don't observe interference. Use the cargo xtask change-channel tool to do this in a second window. The tool takes a single argument: the new listening channel which must be in the range 11-26.

$ cargo xtask change-channel 11
requested channel change to channel 11

Then you should see new output from cargo xtask serial-term:

deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
(..)
now listening on channel 11

Leave the Dongle connected and cargo xtask serial-term running. Now we'll switch back to the Development Kit. Note that if you remove and re-insert the dongle, it goes back to its default channel of 20.

Radio Out

In this section you'll send radio packets from the DK to the Dongle and get familiar with the different settings of the radio API.

Radio Setup

✅ Open the nrf52-code/radio-app/src/bin/radio-send.rs file.

✅ First run the program radio-send.rs as it is. You should see new output from cargo xtask serial-term, if you left your Dongle on channel 20. If you changed your Dongle's channel to avoid interference, change the channel in radio-send.rs to match before you run it.

$ cargo xtask serial-term
deviceid=588c06af0877c8f2 channel=20 TxPower=+8dBm app=loopback-fw
received 5 bytes (CRC=Ok(0xdad9), LQI=53)

The program broadcasts a radio packet that contains the 5-byte string Hello over channel 20 (which has a center frequency of 2450 MHz). The loopback program running on the Dongle is listening to all packets sent over channel 20; every time it receives a new packet it reports its length and the Link Quality Indicator (LQI) metric of the transmission over the USB/serial interface. As the name implies the LQI metric indicates how good the connection between the sender and the receiver is (a higher number means better quality).
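The relationship between channel number and center frequency follows the IEEE 802.15.4 channel plan for the 2.4 GHz band: channel 11 sits at 2405 MHz and each subsequent channel is 5 MHz higher. A small host-side sketch (channel_to_mhz is a name chosen for this example, not part of the radio API):

```rust
/// Center frequency in MHz of an IEEE 802.15.4 channel in the 2.4 GHz band.
/// Returns `None` for channels outside the valid range 11-26.
fn channel_to_mhz(channel: u8) -> Option<u32> {
    if (11..=26).contains(&channel) {
        Some(2405 + 5 * (u32::from(channel) - 11))
    } else {
        None
    }
}

fn main() {
    for channel in [11, 20, 26] {
        println!("channel {} -> {:?} MHz", channel, channel_to_mhz(channel));
    }
}
```

This also shows why interference from WiFi and Bluetooth is possible: all sixteen channels lie inside the same 2.4 GHz band.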

Our firmware generates a semihosting exception to tell our flashing tool (probe-rs) when it has finished running. As a consequence, if you load the radio-send firmware and then power-cycle the nRF52840-DK, the firmware will enter a reboot loop and repeatedly send a packet. This is because nothing catches the semihosting exception, so the CPU reboots, sends a packet, and then tries another semihosting exception.

Messages

In radio-send.rs we propose three different ways to define the bytes we want to send to the radio:

#![allow(unused)]
fn main() {
let msg: &[u8; 5] = &[72, 101, 108, 108, 111];
let msg: &[u8; 5] = &[b'H', b'e', b'l', b'l', b'o'];
let msg: &[u8; 5] = b"Hello";
}

Here, we explain the different types.

Slices

The send method takes a reference -- in Rust, a reference (&) is a non-null pointer that's compile-time known to point into valid (e.g. non-freed) memory -- to a Packet as argument. A Packet is a stack-allocated, fixed-size buffer. You can fill the Packet (buffer) with data using the copy_from_slice method -- this will overwrite previously stored data.

This copy_from_slice method takes a slice of bytes (&[u8]). A slice is a reference into a list of elements stored in contiguous memory. One way to create a slice is to take a reference to an array, a fixed-size list of elements stored in contiguous memory.

#![allow(unused)]
fn main() {
// stack allocated array
let array: [u8; 3] = [0, 1, 2];

let ref_to_array: &[u8; 3] = &array;
let slice: &[u8] = &array;
}

slice and ref_to_array are constructed in the same way but have different types. ref_to_array is represented in memory as a single pointer (1 word / 4 bytes); slice is represented as a pointer + length (2 words, or 8 bytes).
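You can check the size difference yourself with core::mem::size_of. The sketch below is written to run on your host machine, where a word is probably 8 bytes rather than the 4 bytes of the nRF52840, so it compares against the size of usize instead of hard-coding a number. The helper function name is made up for this example.

```rust
use core::mem::size_of;

/// Returns (size of a reference to an array, size of a slice), in bytes.
fn reference_sizes() -> (usize, usize) {
    (size_of::<&[u8; 3]>(), size_of::<&[u8]>())
}

fn main() {
    let word = size_of::<usize>();
    let (ref_to_array, slice) = reference_sizes();

    // a reference to an array is a single pointer
    assert_eq!(ref_to_array, word);
    // a slice is a pointer plus a length
    assert_eq!(slice, 2 * word);

    println!("word = {} bytes, &[u8; 3] = {} bytes, &[u8] = {} bytes", word, ref_to_array, slice);
}
```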

Because slices track length at runtime rather than in their type they can point to chunks of memory of any length.

let array1: [u8; 3] = [0, 1, 2];
let array2: [u8; 4] = [0, 1, 2, 3];

let mut slice: &[u8] = &array1;
defmt::println!("{:?}", slice); // length = 3

// now point to the other array
slice = &array2;
defmt::println!("{:?}", slice); // length = 4

Byte literals

In the example we sent the list of bytes: [72, 101, 108, 108, 111], which can be interpreted as the string "Hello". To see why this is the case check this list of printable ASCII characters. You'll see that letter H is represented by the (single-byte) value 72, e by 101, etc.

Rust provides a more convenient way to write ASCII characters: byte literals. b'H' is syntactic sugar for the literal 72u8, b'e' is equivalent to 101u8, etc. So we can rewrite [72, 101, 108, 108, 111] as [b'H', b'e', b'l', b'l', b'o']. Note that byte literals can also represent u8 values that are not printable ASCII characters: those values are written using escape sequences like b'\x7F', which is equivalent to 0x7F.

Byte string literals

[b'H', b'e', b'l', b'l', b'o'] can be further rewritten as b"Hello". This is called a byte string literal (note that unlike a string literal like "Hello" this one has a b before the opening double quote). A byte string literal is a series of byte literals (u8 values); these literals have type &[u8; N] where N is the number of byte literals in the string.

Because byte string literals are references you need to dereference them to get an array type.

#![allow(unused)]
fn main() {
let reftoarray: &[u8; 2] = b"Hi";

// these two are equivalent
let array1:  [u8; 2] = [b'H', b'i'];
let array2:  [u8; 2] = *b"Hi";
//          ^          ^ dereference
}

Or if you want to go the other way around: you need to take a reference to an array to get the same type as a byte string literal.

#![allow(unused)]
fn main() {
// these two are equivalent
let reftoarray1: &[u8; 2] = b"Hi";
let reftoarray2: &[u8; 2] = &[b'H', b'i'];
//               ^          ^
}

Character constraints in byte string vs. string literals

You can encode text as b"Hello" or as "Hello".

b"Hello" is by definition a string (series) of byte literals so each character has to be a byte literal like b'A' or b'\x7f'. You cannot use "Unicode characters" (char type) like emoji or CJK (Chinese Japanese Korean) in byte string literals.

On the other hand, "Hello" is a string literal with type &str. str strings in Rust contain UTF-8 data so these string literals can contain CJK characters, emoji, Greek letters, Cyrillic script, etc.
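The difference is easy to see on a host machine. The sketch below (plain std Rust, with a helper name made up for this example) shows that the two literal forms hold the same bytes for ASCII text, and that a non-ASCII character occupies more than one byte once encoded as UTF-8.

```rust
/// Returns (number of UTF-8 bytes, number of `char`s) in a string.
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    // for pure ASCII text both literal forms contain the same bytes
    assert_eq!("Hello".as_bytes(), b"Hello");

    // U+00E9 (e with acute accent) is one `char` but two bytes in UTF-8
    println!("{:?}", byte_and_char_counts("h\u{e9}llo"));

    // in a byte string literal those two bytes must be written as escapes
    assert_eq!("\u{e9}".as_bytes(), b"\xC3\xA9");
}
```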

Printing strings and characters

In this workshop we'll work with ASCII strings so byte string literals that contain no escaped characters are OK to use as packet payloads.

You'll note that defmt::println!("{:?}", b"Hello") will print [72, 101, 108, 108, 111] rather than "Hello" and that the {} format specifier (Display) does not work. This is because the type of the literal is &[u8; N] and in Rust this type means "bytes"; those bytes could be ASCII data, UTF-8 data or something else.

To print this you'll need to convert the slice &[u8] into a string (&str) using the core::str::from_utf8 function. This function will verify that the slice contains well formed UTF-8 data and interpret it as a UTF-8 string (&str). We were careful to ensure that our three example messages were the same, and were all valid UTF-8, so we expect the conversion to always succeed. Why not try and see which bytes cause this conversion to fail?

Something similar will happen with byte literals: defmt::println!("{}", b'A') will print 65 rather than A. To get the A output you can cast the byte literal (u8 value) to the char type: defmt::println!("{}", b'A' as char).
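As a starting point for that experiment, here is a minimal host-side sketch of the two conversions described above. It uses println! and String from std rather than defmt, and the describe function is made up for this example.

```rust
/// Renders a payload as text if it is valid UTF-8,
/// otherwise reports how many leading bytes were valid.
fn describe(payload: &[u8]) -> String {
    match core::str::from_utf8(payload) {
        Ok(text) => format!("text: {}", text),
        Err(error) => format!("not UTF-8 (valid up to byte {})", error.valid_up_to()),
    }
}

fn main() {
    // ASCII is a subset of UTF-8, so this conversion succeeds
    println!("{}", describe(b"Hello"));

    // the byte 0xFF never appears in well-formed UTF-8
    println!("{}", describe(&[b'H', b'i', 0xFF]));

    // a single byte literal can be cast to `char` for printing
    println!("{}", b'A' as char);
}
```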

Link Quality Indicator (LQI)

received 7 bytes (CRC=Ok(0x2459), LQI=60)

✅ Now run the radio-send program several times with different variations to explore how LQI can be influenced

  • change the distance between the Dongle and the DK -- move the DK closer to or further away from the Dongle.
  • change the transmit power
  • change the channel
  • change the length of the packet
  • different combinations of all of the above

Take note of how LQI changes with these changes. Does packet loss occur in any of these configurations?

NOTE if you decide to send many packets in a single program then you should use the Timer API to insert a delay of at least five milliseconds between the transmissions. This is required because the Dongle will use the radio medium right after it receives a packet. Not including the delay will result in the Dongle missing packets.

802.15.4 radios are often used in mesh networks like Wireless Sensor Networks (WSN). The devices, or nodes, in these networks can be mobile so the distance between nodes can change over time. To prevent a link between two nodes getting broken due to mobility the LQI metric is used to decide the transmission power -- if the metric degrades power should be increased, etc. At the same time, the nodes in these networks often need to be power efficient (e.g. are battery powered) so the transmission power is often set as low as possible -- again the LQI metric is used to pick an adequate transmission power.
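To make the idea of LQI-driven power control concrete, here is a deliberately simple sketch. Everything in it is an assumption made for illustration: the list of power steps, the two LQI thresholds and the function name are all hypothetical, and real networks use more careful policies (averaging, hysteresis and so on).

```rust
/// Hypothetical transmit power steps in dBm, lowest first.
const POWER_STEPS_DBM: [i8; 5] = [-8, -4, 0, 4, 8];

/// Hypothetical thresholds: below the first the link is considered at risk,
/// above the second there is enough margin to save power.
const LQI_TOO_LOW: u8 = 30;
const LQI_COMFORTABLE: u8 = 60;

/// Picks the index into `POWER_STEPS_DBM` to use for the next transmission.
fn next_power_step(current: usize, lqi: u8) -> usize {
    if lqi < LQI_TOO_LOW {
        // link is degrading: step the power up, but not past the maximum
        (current + 1).min(POWER_STEPS_DBM.len() - 1)
    } else if lqi > LQI_COMFORTABLE {
        // plenty of margin: step the power down, but not below the minimum
        current.saturating_sub(1)
    } else {
        current
    }
}

fn main() {
    let mut step = 2;
    for lqi in [20u8, 25, 45, 80] {
        step = next_power_step(step, lqi);
        println!("LQI {} -> {} dBm", lqi, POWER_STEPS_DBM[step]);
    }
}
```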

🔎 802.15.4 compatibility

The radio API we are using follows the PHY layer of the IEEE 802.15.4 specification, but it's missing MAC level features like addressing (each device gets its own address) and opt-in acknowledgment (a transmitted packet must be acknowledged with a response acknowledgment packet; the packet is re-transmitted if it is not acknowledged in time). These MAC level features are not implemented in hardware (in the nRF52840 Radio peripheral) so they would need to be implemented in software to be fully IEEE 802.15.4 compliant.

This is not an issue for the workshop exercises but it's something to consider if you would like to continue from here and build an 802.15.4 compliant network API.

Radio In

In this section we'll explore the recv_timeout method of the Radio API. As the name implies, this is used to listen for packets. The method will block the program execution until a packet is received or the specified timeout has expired. We'll continue to use the Dongle in this section; it should be running the loopback application; and cargo xtask serial-term should also be running in the background.

The loopback application running on the Dongle will broadcast a radio packet after receiving one over channel 20. The contents of this outgoing packet will be the contents of the received one but reversed.

✅ Open the nrf52-code/radio-app/src/bin/radio-recv.rs file. Make sure that the Dongle and the Radio are set to the same channel. Click the "Run" button.

The Dongle does not inspect the contents of your packet and does not require them to be ASCII, or UTF-8. It will simply send a packet back containing the same bytes it received, except the bytes will be in reverse order to how you sent it.

That is, if you send b"olleh", it will send back b"hello".
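The reply described above can be modelled on a host machine in a couple of lines. This is only a model of the behaviour as described here, not the loopback-fw source, and the function name is made up.

```rust
/// Models the loopback reply: the same bytes, in reverse order.
fn loopback_reply(received: &[u8]) -> Vec<u8> {
    received.iter().rev().copied().collect()
}

fn main() {
    let reply = loopback_reply(b"olleh");
    println!("{:?}", core::str::from_utf8(&reply));

    // the bytes do not have to be text at all
    println!("{:?}", loopback_reply(&[0xFF, 0x00]));
}
```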

The Dongle will respond as soon as it receives a packet. If you insert a delay between the send operation and the recv operation in the radio-recv program this will result in the DK not seeing the Dongle's response. So try this:

✅ Add a timer.wait(x) call before the recv_timeout call, where x is a core::time::Duration; try different lengths of time for x and observe what happens.

Having log statements between send and recv_timeout can also cause packets to be missed so try to keep those two calls as close to each other as possible and with as little code in between as possible.

NOTE Packet loss can always occur in wireless networks, even if the radios are close to each other. The Radio API we are using will not detect lost packets because it does not implement IEEE 802.15.4 Acknowledgement Requests. For the next step in the workshop, we will use a new function to handle this for us. For the sake of other radio users, please do ensure you never call send() in a tight loop!

Radio Puzzle

illustration showing that you send plaintext and the dongle responds with ciphertext

Your task in this section is to decrypt an ASCII string stored in the Dongle, using one of the stack-allocated maps in the heapless crate. The string has been encrypted with a simple substitution cipher.

Preparing the Dongle

✅ Flash the puzzle-fw program on the Dongle. Follow the instructions from the "nRF52840 Dongle" section but flash the puzzle-fw program instead of the loopback-fw one -- don't forget to put the Dongle in bootloader mode (pressing the reset button) before invoking nrfdfu.

Note: If you experienced USB issues with loopback-fw you can use the older puzzle-nousb*.hex variants.

Like in the previous sections the Dongle will listen for radio packets -- this time over channel 25 -- while also logging messages over a USB/serial interface. It also prints a . periodically so you know it's still alive.

Sending Messages and Receiving the Dongle's Responses

✅ Open the nrf52-code/radio-app folder in VS Code; then open the src/bin/radio-puzzle.rs file. Run the program.

This will send a zero-sized packet (let msg = b"") to the Dongle. It does this using a special function called dk::send_recv. This function will:

  1. Determine a unique address for your nRF52840 (Nordic helpfully bake a different random address into every nRF52 chip they make)
  2. Construct a packet where the first six bytes are the unique address, and the remainder are the ones you passed to the send_recv() function
  3. Use the Radio::send() method to wait for the channel to be clear (using a Clear Channel Assessment) before actually sending the packet
  4. Use the Radio::recv_timeout() method to wait for a reply, up to the given number of microseconds specified
  5. Check that the first six bytes in the reply match our six byte address
     a. If so, the remainder of the reply is returned as the Ok variant
     b. Otherwise, increment a retry counter and, if we have run out of retry attempts, we return the Err variant
     c. Otherwise, we go back to step 2 and try again.
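The steps above can be sketched on a host machine as follows. This is not the implementation of dk::send_recv: the radio is replaced by a closure (exchange) that stands in for the send and recv_timeout calls, Vec is used instead of a Packet, the packet is built once rather than on every retry, and the error type and parameter names are made up for this illustration.

```rust
const ADDR_LEN: usize = 6;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct OutOfRetries;

/// `exchange` stands in for "send the packet, then wait for a reply";
/// it returns `None` when the wait times out.
fn send_recv<F>(
    address: [u8; ADDR_LEN],
    payload: &[u8],
    max_attempts: u32,
    mut exchange: F,
) -> Result<Vec<u8>, OutOfRetries>
where
    F: FnMut(&[u8]) -> Option<Vec<u8>>,
{
    // step 2: the address goes first, the payload after it
    let mut packet = address.to_vec();
    packet.extend_from_slice(payload);

    for _ in 0..max_attempts {
        // steps 3 and 4: send, then wait for a reply
        if let Some(reply) = exchange(&packet) {
            // step 5: only accept replies that start with our address
            if reply.len() >= ADDR_LEN && reply[..ADDR_LEN] == address[..] {
                return Ok(reply[ADDR_LEN..].to_vec());
            }
        }
    }
    Err(OutOfRetries)
}

fn main() {
    let address = [1, 2, 3, 4, 5, 6];
    // a stand-in peer that echoes every packet back unchanged
    let result = send_recv(address, b"A", 3, |packet| Some(packet.to_vec()));
    println!("{:?}", result);
}
```

Passing the transport in as a closure keeps the retry logic testable without any radio hardware.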

This function allows communication with the USB dongle to be relatively robust, even in the presence of other devices on the same channel. However, it's not perfect and sometimes you will run out of retry attempts and your program will need to be restarted.

❗ The Dongle responds to the DK's requests wirelessly (i.e. by sending back radio packets) as well. You'll see the dongle responses printed by the DK. This means you don't have to worry if serial-term doesn't work on your machine.

✅ Try sending one-byte sized packets.

✅ Try sending longer packets.

What happens?

Answer

The Dongle will respond differently depending on the length of the payload in the incoming packet:

  • On zero-sized payloads (i.e. packets that only contain the device address and nothing else) it will respond with the encrypted string.
  • On one-byte sized payloads it will respond with the direct mapping from the given plaintext letter (single u8 value) to the corresponding ciphertext letter (another u8 value).
  • On payloads of any other length the Dongle will respond with the string correct if it received the correct secret string, otherwise it will respond with the string incorrect.

The Dongle will always respond with payloads that are valid UTF-8 so you can use str::from_utf8 on the response packets. However, do not attempt to look inside the raw packet, as it will contain six random address bytes at the start, and they will not be valid UTF-8. Only look at the &[u8] that the send_recv() function returns, and treat the Packet as just a storage area that you don't look inside.

This step is illustrated in src/bin/radio-puzzle-1.rs

From here on, the exercise can be solved in multiple ways. If you have an idea on how to go from here and what tools to use, you can work on your own. If you don't have an idea what to do next or what tools to use, we'll provide a guide on the next page.

Help

Use a dictionary

Our suggestion is to use a dictionary / map. std::collections::HashMap is not available in no_std code (it requires a secure random number generator to prevent collision attacks) but you can use one of the stack-allocated maps in the heapless crate. The crate also supplies a stack-allocated, fixed-capacity version of the std::vec::Vec type, which will come in handy to store byte arrays. To store character mappings we recommend using a heapless::LinearMap.

heapless is already declared as a dependency in the Cargo.toml of the project so you can directly import it into the application code using a use statement.

use heapless::Vec;       // like `std::Vec` but stack-allocated
use heapless::LinearMap; // a dictionary / map

fn main() {
    // A map (linear search, no hashing) with a capacity of 16 `(u8, u8)` key-value pairs allocated on the stack
    let mut my_map = LinearMap::<u8, u8, 16>::new();
    my_map.insert(b'A', b'~').unwrap();

    // A vector with a fixed capacity of 8 `u8` elements allocated on the stack
    let mut my_vec = Vec::<u8, 8>::new();
    my_vec.push(b'A').unwrap();
}

If you haven't used a stack-allocated collection before note that you'll need to specify the capacity of the collection as a const-generic parameter. The larger the value, the more memory the collection takes up on the stack. The heapless::LinearMap documentation of the heapless crate has some usage examples, as does the heapless::Vec documentation.
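If the idea of a fixed-capacity map is new to you, the hand-rolled sketch below may help build intuition. It is not how heapless implements LinearMap and it does not mirror that crate's exact signatures (check the heapless documentation for those); TinyMap and its methods are made up for this illustration. The point it shows is that the capacity is part of the type and that inserting into a full map is reported as an error instead of allocating more memory.

```rust
/// A hand-rolled fixed-capacity map with linear search (illustration only).
struct TinyMap<const N: usize> {
    entries: [(u8, u8); N],
    len: usize,
}

impl<const N: usize> TinyMap<N> {
    fn new() -> Self {
        TinyMap { entries: [(0, 0); N], len: 0 }
    }

    /// Inserts or updates a mapping; hands the pair back if the map is full.
    fn insert(&mut self, key: u8, value: u8) -> Result<(), (u8, u8)> {
        if let Some(entry) = self.entries[..self.len].iter_mut().find(|e| e.0 == key) {
            entry.1 = value;
            return Ok(());
        }
        if self.len == N {
            return Err((key, value));
        }
        self.entries[self.len] = (key, value);
        self.len += 1;
        Ok(())
    }

    /// Looks a key up by scanning the stored pairs one by one.
    fn get(&self, key: u8) -> Option<u8> {
        self.entries[..self.len].iter().find(|e| e.0 == key).map(|e| e.1)
    }
}

fn main() {
    let mut map = TinyMap::<2>::new();
    println!("{:?}", map.insert(b'A', b'~'));
    println!("{:?}", map.insert(b'B', b'!'));
    // the map only has room for two pairs, so this one is rejected
    println!("{:?}", map.insert(b'C', b'?'));
    println!("{:?}", map.get(b'A'));
}
```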

Note the difference between character literals and byte literals!

Something you will likely run into while solving this exercise are character literals ('c') and byte literals (b'c'). The former has type char and represents a single Unicode "scalar value". The latter has type u8 (1-byte integer) and it's mainly a convenience for getting the value of ASCII characters, for instance b'A' is the same as the 65u8 literal.

IMPORTANT you do not need to use the str or char API to solve this problem, other than for printing purposes. Work directly with slices of bytes ([u8]) and bytes (u8); and only convert those to str or char when you are about to print them.

Note: The plaintext secret string is not stored in puzzle-fw so running strings on it will not give you the answer. Nice try.

Make sure not to flood the log buffer

When you log more messages than can be moved from the target to the probe, the log buffer will get overwritten, resulting in data loss. This can easily happen when you repeatedly poll the dongle and log the result. The quickest solution to this is to wait a short while before you send the next packet so that the logs can be processed in the meantime.

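use core::time::Duration;
use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    // initialise the board, as in the other radio-app examples
    let board = dk::init().unwrap();
    let mut timer = board.timer;

    for plainletter in 0..=127 {
        /* ... send letter to dongle ... */
        defmt::println!("got response");
        /* ... store output ... */

        // give the logs time to drain before sending the next packet
        timer.wait(Duration::from_millis(20));
    }

    // `main` never returns, so end the program explicitly
    dk::exit()
}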

Each step is demonstrated in a separate example so if for example you only need a quick reference of how to use the map API you can look at step / example number 2 and ignore the others.

  1. Send a one letter packet (e.g. A) to the radio to get a feel for how the mapping works. Then do a few more letters. See src/bin/radio-puzzle-1.rs.

  2. Get familiar with the dictionary API. Do some insertions and look ups. What happens if the dictionary gets full? See src/bin/radio-puzzle-2.rs.

  3. Next, get mappings from the radio and insert them into the dictionary. See src/bin/radio-puzzle-3.rs.

  4. You'll probably want a buffer to place the plaintext in. We suggest using heapless::Vec for this. See src/bin/radio-puzzle-4.rs for a starting-point (NB It is also possible to decrypt the packet in place).

  5. Simulate decryption: fetch the encrypted string and "process" each of its bytes. See src/bin/radio-puzzle-5.rs.

  6. Now merge steps 3 and 5: build a dictionary, retrieve the secret string and do the reverse mapping to decrypt the message. See src/bin/radio-puzzle-6.rs.

  7. As a final step, send the decrypted string to the Dongle and check if it was correct or not. See src/bin/radio-puzzle-7.rs.

For your reference, we have provided a complete solution in the src/bin/radio-puzzle-solution.rs file. That solution is based on the seven steps outlined above. Did you solve the puzzle in a different way?
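One different way, sketched here for a host machine, is to skip the map type entirely and use a 256-entry lookup table indexed by the ciphertext byte. This is not the provided solution: the function names are made up, the mappings in main are invented placeholders rather than the Dongle's real cipher, and Vec is the std one (on the DK you would reach for heapless::Vec or decrypt in place).

```rust
/// Builds a ciphertext -> plaintext table from (plaintext, ciphertext) pairs.
fn build_reverse_table(pairs: &[(u8, u8)]) -> [Option<u8>; 256] {
    let mut table = [None; 256];
    for &(plain, cipher) in pairs {
        table[usize::from(cipher)] = Some(plain);
    }
    table
}

/// Decrypts a message; returns `None` if any byte has no known mapping.
fn decrypt(ciphertext: &[u8], table: &[Option<u8>; 256]) -> Option<Vec<u8>> {
    ciphertext.iter().map(|&byte| table[usize::from(byte)]).collect()
}

fn main() {
    // invented mappings, standing in for the ones you collect over the radio
    let pairs = [(b'H', b'x'), (b'i', b'q')];
    let table = build_reverse_table(&pairs);

    println!("{:?}", decrypt(b"xq", &table));
    // 'z' was never seen as a ciphertext byte, so decryption fails
    println!("{:?}", decrypt(b"xz", &table));
}
```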

All finished? See the next steps.

Next Steps

If you've already completed the main workshop tasks or would like to explore more on your own this section has some suggestions.

Alternative containers

Modify-in-place

If you solved the puzzle using a Vec buffer you can try solving it without the buffer as a stretch goal. You may find the slice methods that let you mutate a Packet's data useful, but remember that the first six bytes of your Packet will be the random device address - you can't decrypt those! A solution that does not use a heapless::Vec buffer can be found in the src/bin/radio-puzzle-solution-2.rs file.
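The core of the in-place idea can be tried on a host machine with a plain mutable slice. This sketch is not taken from radio-puzzle-solution-2.rs: the function name is made up, and the per-byte mapping is passed in as a closure so that the example stays independent of whichever map type you used.

```rust
const ADDR_LEN: usize = 6;

/// Applies `decrypt_byte` to every byte after the address prefix, in place.
/// Buffers shorter than the address are left untouched.
fn decrypt_in_place(packet: &mut [u8], decrypt_byte: impl Fn(u8) -> u8) {
    for byte in packet.iter_mut().skip(ADDR_LEN) {
        *byte = decrypt_byte(*byte);
    }
}

fn main() {
    // six address bytes followed by a two-byte "ciphertext"
    let mut packet = [9, 9, 9, 9, 9, 9, b'b', b'c'];

    // an invented mapping: every ciphertext byte is one above its plaintext
    decrypt_in_place(&mut packet, |byte| byte - 1);

    println!("{:?}", packet);
}
```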

Using liballoc::BTreeMap

If you solved the puzzle using a heapless::Vec buffer and a heapless::LinearMap and you still need something else to try, you could look at the Vec and BTreeMap types contained within liballoc. This will require you to set up a global memory allocator, like embedded-alloc.

Collision avoidance

In this section you'll test the collision avoidance feature of the IEEE 802.15.4 radio used by the Dongle and DK.

If you check the API documentation of the Radio abstraction we have been using you'll notice that we haven't used these methods: energy_detection_scan(), set_cca() and try_send().

The first method scans the currently selected channel (see set_channel()), measures the energy level of ongoing radio communication in this channel and returns the maximum energy observed over a span of time. This method can be used to determine what the idle energy level of a channel is. If there's non-IEEE 802.15.4 traffic on this channel the method will return a high value.

Under the 802.15.4 specification, before sending a data packet devices must first check if there's communication going on in the channel. This process is known as Clear Channel Assessment (CCA). The send method we have been using performs CCA in a loop and sends the packet only when the channel appears to be idle. The try_send method performs CCA once and returns the Err variant if the channel appears to be busy. In this failure scenario the device does not send any packet.

The Radio abstraction supports 2 CCA modes: CarrierSense and EnergyDetection. CarrierSense is the default CCA mode and what we have been using in this workshop. CarrierSense will only look for ongoing 802.15.4 traffic in the channel but ignore other traffic like 2.4 GHz WiFi and Bluetooth. The EnergyDetection method is able to detect ongoing non-802.15.4 traffic.
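The difference between the two modes can be modelled on a host machine. The sketch below is a toy model, not the Radio API: ChannelState, ChannelBusy, the plain u8 energy scale and the rule "busy when the measured energy exceeds the threshold" are all assumptions made for this illustration.

```rust
/// The two CCA modes named above, modelled for this sketch.
#[derive(Clone, Copy)]
enum CcaMode {
    CarrierSense,
    EnergyDetection { ed_threshold: u8 },
}

/// Hypothetical snapshot of what is happening on the channel.
struct ChannelState {
    ieee802154_traffic: bool,
    energy_level: u8,
}

/// Hypothetical error for a failed Clear Channel Assessment.
#[derive(Debug, PartialEq)]
struct ChannelBusy;

fn channel_is_clear(mode: CcaMode, state: &ChannelState) -> bool {
    match mode {
        // only 802.15.4 traffic counts; WiFi or Bluetooth energy is ignored
        CcaMode::CarrierSense => !state.ieee802154_traffic,
        // any energy above the threshold counts, whatever its source
        CcaMode::EnergyDetection { ed_threshold } => state.energy_level <= ed_threshold,
    }
}

/// Like `try_send`: a single assessment, no retry loop.
fn try_send(mode: CcaMode, state: &ChannelState) -> Result<(), ChannelBusy> {
    if channel_is_clear(mode, state) { Ok(()) } else { Err(ChannelBusy) }
}

fn main() {
    // a channel with strong WiFi activity but no 802.15.4 traffic
    let wifi_only = ChannelState { ieee802154_traffic: false, energy_level: 80 };

    println!("{:?}", try_send(CcaMode::CarrierSense, &wifi_only));
    println!("{:?}", try_send(CcaMode::EnergyDetection { ed_threshold: 20 }, &wifi_only));
}
```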

Here are some things for you to try out:

  • First, read section 6.20.12.4 of the nRF52840 Product Specification, which covers the nRF52840's implementation of CCA.

  • Disconnect the Dongle. Write a program for the DK that scans and reports the energy levels of all valid 802.15.4 channels. In your location which channels have high energy levels when there's no ongoing 802.15.4 traffic? If you can, use an application like WiFi Analyzer to see which WiFi channels are in use in your location. Compare the output of WiFiAnalyzer to the values you got from energy_detection_scan. Is there a correspondence? Note that WiFi channels don't match in frequency with 802.15.4 channels; some mapping is required to convert between them -- check this illustration for more details about co-existence of 802.15.4 and WiFi.

  • Choose the channel with the highest idle energy. Now write a program on the DK that sets the CCA mode to EnergyDetection and then sends a packet over this channel using try_send. The EnergyDetection CCA mode requires an Energy Detection (ED) "threshold" value. Try different threshold values. What threshold value makes the try_send succeed?

  • Repeat the previous experiment but use the channel with the lowest idle energy.

  • Pick the channel with the lowest idle energy. Run the loopback app on the Dongle and set its listening channel to the chosen channel. Modify the DK program to perform a send operation immediately followed by a try_send operation. The try_send operation will collide with the response of the Dongle (remember: the Dongle responds to all incoming packets after a 5ms delay - see the loopback-fw program for details). Find an ED threshold that detects this collision and makes try_send return the Err variant.
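For the WiFi comparison in the list above, a small host-side sketch can do the channel mapping for you. It rests on a few assumptions you should double check for your setup: 2.4 GHz WiFi channels 1 to 13 centered at 2407 + 5n MHz and treated as 22 MHz wide, and 802.15.4 channels centered at 2405 + 5(k - 11) MHz and treated as 2 MHz wide. All names are made up for this example.

```rust
/// Center frequency of an 802.15.4 channel (11..=26) in MHz.
fn ieee802154_center_mhz(channel: u8) -> i32 {
    2405 + 5 * (i32::from(channel) - 11)
}

/// Center frequency of a 2.4 GHz WiFi channel (1..=13) in MHz.
fn wifi_center_mhz(channel: u8) -> i32 {
    2407 + 5 * i32::from(channel)
}

/// Assumed half-widths of the two kinds of channel, in MHz.
const WIFI_HALF_WIDTH_MHZ: i32 = 11;
const IEEE802154_HALF_WIDTH_MHZ: i32 = 1;

/// Lists the 802.15.4 channels whose band overlaps the given WiFi channel.
fn overlapping_ieee802154_channels(wifi_channel: u8) -> Vec<u8> {
    let wifi = wifi_center_mhz(wifi_channel);
    (11u8..=26)
        .filter(|&ch| {
            let distance = (ieee802154_center_mhz(ch) - wifi).abs();
            distance < WIFI_HALF_WIDTH_MHZ + IEEE802154_HALF_WIDTH_MHZ
        })
        .collect()
}

fn main() {
    for wifi_channel in [1u8, 6, 11] {
        println!(
            "WiFi channel {} overlaps 802.15.4 channels {:?}",
            wifi_channel,
            overlapping_ieee802154_channels(wifi_channel)
        );
    }
}
```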

Interrupt handling

We haven't covered interrupt handling in the workshop but the cortex-m-rt crate provides attributes to declare exception and interrupt handlers: #[exception] and #[interrupt]. You can find documentation about these attributes and how to safely share data with interrupt handlers using Mutexes in the "Concurrency" chapter of the Embedded Rust book.

Another way to deal with interrupts is to use a framework like Real-Time Interrupt-driven Concurrency (RTIC); this framework has a book that explains how you can build reactive applications using interrupts. We use this framework in the "USB" workshop.

Starting a Project from Scratch

So far we have been using a pre-made Cargo project to work with the nRF52840 DK. In this section we'll see how to create a new embedded project for any microcontroller.

Identify the microcontroller

The first step is to identify the microcontroller you'll be working with. The information about the microcontroller you'll need is:

1. Its processor architecture and sub-architecture

This information should be in the device's data sheet or manual. In the case of the nRF52840, the processor is an ARM Cortex-M4 core. With this information you'll need to select a compatible compilation target. rustup target list will show all the supported compilation targets.

$ rustup target list
(..)
thumbv6m-none-eabi
thumbv7em-none-eabi
thumbv7em-none-eabihf
thumbv7m-none-eabi
thumbv8m.base-none-eabi
thumbv8m.main-none-eabi
thumbv8m.main-none-eabihf

The compilation targets will usually be named using the following format: $ARCHITECTURE-$VENDOR-$OS-$ABI, where the $VENDOR field is sometimes omitted. Bare metal and no_std targets, like microcontrollers, will often use none for the $OS field. When the $ABI field ends in hf it indicates that the output ELF uses the hardfloat Application Binary Interface (ABI).

The thumb targets listed above are all the currently supported ARM Cortex-M targets. The table below shows the mapping between compilation targets and ARM Cortex-M processors.

Compilation target           Processor
thumbv6m-none-eabi           ARM Cortex-M0, ARM Cortex-M0+
thumbv7m-none-eabi           ARM Cortex-M3
thumbv7em-none-eabi          ARM Cortex-M4, ARM Cortex-M7
thumbv7em-none-eabihf        ARM Cortex-M4F, ARM Cortex-M7F
thumbv8m.base-none-eabi      ARM Cortex-M23
thumbv8m.main-none-eabi      ARM Cortex-M33, ARM Cortex-M35P
thumbv8m.main-none-eabihf    ARM Cortex-M33F, ARM Cortex-M35PF
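The table can also be written down as a lookup function, which is handy if you script your project setup. The sketch below is a direct transcription of the table; the function name and the way an FPU is expressed (a bool instead of the F suffix) are choices made for this example.

```rust
/// Returns the compilation target listed in the table above for a given
/// core, or `None` for combinations the table does not list.
fn target_for(core: &str, has_fpu: bool) -> Option<&'static str> {
    match (core, has_fpu) {
        ("Cortex-M0", false) | ("Cortex-M0+", false) => Some("thumbv6m-none-eabi"),
        ("Cortex-M3", false) => Some("thumbv7m-none-eabi"),
        ("Cortex-M4", false) | ("Cortex-M7", false) => Some("thumbv7em-none-eabi"),
        ("Cortex-M4", true) | ("Cortex-M7", true) => Some("thumbv7em-none-eabihf"),
        ("Cortex-M23", false) => Some("thumbv8m.base-none-eabi"),
        ("Cortex-M33", false) | ("Cortex-M35P", false) => Some("thumbv8m.main-none-eabi"),
        ("Cortex-M33", true) | ("Cortex-M35P", true) => Some("thumbv8m.main-none-eabihf"),
        _ => None,
    }
}

fn main() {
    // the nRF52840's core is a Cortex-M4 with an FPU
    println!("{:?}", target_for("Cortex-M4", true));
    println!("{:?}", target_for("Cortex-M0", true));
}
```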

The ARM Cortex-M ISA is backwards compatible so for example you could compile a program using the thumbv6m-none-eabi target and run it on an ARM Cortex-M4 microcontroller. This will work but using the thumbv7em-none-eabi target results in better performance (ARMv7-M instructions will be emitted by the compiler) so it should be preferred. The older ISAs may also be limited in terms of the maximum number of interrupts you can define, which may be fewer than your newer microcontroller actually has.

2. Its memory layout

In particular, you need to identify how much Flash and RAM memory the device has and at which address the memory is exposed. You'll find this information in the device's data sheet or reference manual.

In the case of the nRF52840, this information is in section 4.2 (Figure 2) of its Product Specification. It has:

  • 1 MB of Flash that spans the address range: 0x0000_0000 - 0x0010_0000.
  • 256 KB of RAM that spans the address range: 0x2000_0000 - 0x2004_0000.
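If your data sheet only gives address ranges, the region sizes follow from simple subtraction. The sketch below checks the two ranges quoted above, treating the end address as exclusive; the helper name is made up for this example.

```rust
const KIB: u32 = 1024;

/// Length in KiB of the address range `start..end` (end exclusive).
fn length_kib(start: u32, end: u32) -> u32 {
    (end - start) / KIB
}

fn main() {
    // Flash: 0x0000_0000 - 0x0010_0000
    println!("FLASH: {} KiB", length_kib(0x0000_0000, 0x0010_0000));
    // RAM: 0x2000_0000 - 0x2004_0000
    println!("RAM:   {} KiB", length_kib(0x2000_0000, 0x2004_0000));
}
```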

The cortex-m-quickstart project template

With all this information you'll be able to build programs for the target device. The cortex-m-quickstart project template provides the most frictionless way to start a new project for the ARM Cortex-M architecture -- for other architectures check out other project templates by the rust-embedded organization.

The recommended way to use the quickstart template is through the cargo-generate tool:

cargo generate --git https://github.com/rust-embedded/cortex-m-quickstart

But it may be difficult to install the cargo-generate tool on Windows due to its libgit2 (C library) dependency. Another option is to download a snapshot of the quickstart template from GitHub and then fill in the placeholders in Cargo.toml of the snapshot.

Once you have instantiated a project using the template you'll need to fill in the device-specific information you collected in the two previous steps:

1. Change the default compilation target in .cargo/config

[build]
target = "thumbv7em-none-eabi"

For the nRF52840 you can choose either thumbv7em-none-eabi or thumbv7em-none-eabihf. If you are going to use the FPU then select the hf variant.

2. Enter the memory layout of the chip in memory.x

MEMORY
{
  /* NOTE 1 K = 1 KiB = 1024 bytes */
  FLASH : ORIGIN = 0x00000000, LENGTH = 1M
  RAM : ORIGIN = 0x20000000, LENGTH = 256K
}

3. cargo build now will cross compile programs for your target device

If there's no template or signs of support for a particular architecture under the rust-embedded organization then you can follow the embedonomicon to bootstrap support for the new architecture by yourself.

Flashing the program

To flash the program on the target device you'll need to identify the on-board debugger, if the development board has one. Or choose an external debugger, if the development board exposes a JTAG or SWD interface via some connector.

If the hardware debugger is supported by the probe-rs project -- for example J-Link, ST-Link or CMSIS-DAP -- then you'll be able to use probe-rs-based tools like probe-rs and cargo-embed. This is the case of the nRF52840 DK: it has an on-board J-Link probe.

If the debugger is not supported by probe-rs then you'll need to use OpenOCD or vendor provided software to flash programs on the board.

If the board does not expose a JTAG, SWD or similar interface then the microcontroller probably comes with a bootloader as part of its stock firmware. In that case you'll need to use dfu-util or a vendor specific tool like nrfdfu or nrfutil to flash programs onto the chip. This is the case of the nRF52840 Dongle.

Getting output

If you are using one of the probes supported by probe-rs then you can use the rtt-target library to get text output on cargo-embed. The logging functionality we used in the examples is implemented using the rtt-target crate.

If that's not the case or there's no debugger on board then you'll need to add a HAL before you can get text output from the board.

Adding a Hardware Abstraction Layer (HAL)

Now you can hopefully run programs and get output from them. To use the hardware features of the device you'll need to add a HAL to your list of dependencies. crates.io, lib.rs and awesome embedded Rust are good places to search for HALs.

After you find a HAL you'll want to get familiar with its API through its API docs and examples. HALs do not always expose the exact same API, especially when it comes to initialization and configuration of peripherals. However, most HALs will implement the embedded-hal traits. These traits allow inter-operation between the HAL and driver crates. These driver crates provide functionality to interface external devices like sensors, actuators and radios over interfaces like I2C and SPI.

If no HAL is available for your device then you'll need to build one yourself. This is usually done by first generating a Peripheral Access Crate (PAC) from a System View Description (SVD) file using the svd2rust tool. The PAC exposes a low level, but type safe, API to modify the registers on the device. Once you have a PAC you can use one of the many HALs on crates.io as a reference; most of them are implemented on top of svd2rust-generated PACs.

Hello, 💡

Now that you've set up your own project from scratch, you could start playing around with it by turning on one of the DK's on-board LEDs using only the HAL. Some hints that might be helpful there:

nRF52 HAL Workbook

In this workshop you'll learn to:

  • use a HAL to provide features in a BSP
  • configure GPIO pins using the nRF52 HAL

To test your BSP changes, you will modify a small example: hal-app/src/bin/blinky.rs

You will need an nRF52840 Development Kit for this exercise, but not the nRF USB dongle.

If you haven't completed the Radio Workbook, you should start there, and go at least as far as completing the "Timers and Time" section.

The nRF52840 Development Kit

This is the larger development board.

The board has two USB ports: J2 and J3 and an on-board J-Link programmer / debugger -- there are instructions to identify the ports in a previous section. USB port J2 is the J-Link's USB port. USB port J3 is the nRF52840's USB port. Connect the Development Kit to your computer using the J2 port.

Adding Buttons

To practice using a HAL to provide functionality through a Board Support Package, you will now modify the dk crate to add support for Buttons.

Change the demo app

✅ Change the hal-app/src/bin/buttons.rs file as described within, so it looks for button presses.

It should now fail to compile, because the dk crate doesn't have support for buttons. You will now fix that!

Define a Button

✅ Open up the dk crate in VS Code (nrf52-code/board/dk) and open src/lib.rs.

✅ Add a struct Button which represents a single button.

It should be similar to struct Led, except the inner type must be Pin<Input<PullUp>>. You will need to import those types - look where Output and PushPull types were imported from for clues! Think about where it makes sense to add this new type. At the top? At the bottom? Maybe just after the LED related types?

🔎 The pins must be configured with pull-ups because each button connects a GPIO pin to ground, but the pins float when the button is not pressed. Enabling the pull-ups inside the SoC ensures that the GPIO pin is weakly connected to 3.3V through a resistor, giving it a 'default' value of 'high'. Pressing the button then makes the pin go 'low'.
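Because of the pull-up, the buttons are "active low": the logic is inverted compared to what you might first expect. The host-side sketch below captures just that inversion; the Level enum and the is_pressed function are made up for this illustration and are not the HAL's types.

```rust
/// The electrical level read from a GPIO pin (illustrative type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    Low,
    High,
}

/// With a pull-up and a button wired to ground, a low level means "pressed".
fn is_pressed(pin_level: Level) -> bool {
    pin_level == Level::Low
}

fn main() {
    // button released: the pull-up holds the pin high
    println!("released -> pressed = {}", is_pressed(Level::High));
    // button pressed: the pin is connected to ground
    println!("pressed  -> pressed = {}", is_pressed(Level::Low));
}
```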

Define all the Buttons

✅ Add a struct Buttons which contains four buttons.

Use struct Leds for guidance. Add a buttons field to struct Board which is of type Buttons. Again, think about where it makes sense to insert this new field.

Set up the buttons

Now the Board struct initialiser is complaining you didn't initialise the new buttons field.

✅ Take pins from the HAL, configure them as inputs with pull-ups, and install them into the Buttons structure.

The mapping is:

  • Button 1: P0.11
  • Button 2: P0.12
  • Button 3: P0.24
  • Button 4: P0.25

You can verify this in the User Guide.

Run your program

✅ Run the buttons demo:

cd nrf52-code/hal-app
cargo run --bin buttons

Now when you press the button, the LED should illuminate. If it does the opposite, check your working!

Write a more interesting demo program for the BSP

✅ You've got four buttons and four LEDs. Make up a demo!

If you're stuck for ideas, you could have the LEDs do some kind of animation. The buttons might then stop or start the animation, or make it go faster or slower. Try setting up a loop with a 20ms delay inside it, to give yourself a basic 50 Hz "game tick". You can look at the blinky demo for help with the timer.
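
To make the "game tick" idea concrete, here is one possible shape for the animation logic, written so that it runs on a host without any hardware. It is only a sketch: the `Chaser` type, the two period constants and the choice of which button does what are all made up for illustration. On the board you would call `tick` once per 20 ms loop iteration, pass in the real button states, and light the LED whose index it returns.

```rust
/// Steps between LED changes, counted in 20 ms ticks (illustrative values).
const NORMAL_PERIOD: u32 = 10;
const FAST_PERIOD: u32 = 5;

/// State of a simple "chaser" animation across four LEDs.
struct Chaser {
    /// Index (0..4) of the LED that is currently lit.
    lit: usize,
    /// Ticks elapsed since the last step.
    ticks: u32,
}

impl Chaser {
    fn new() -> Self {
        Chaser { lit: 0, ticks: 0 }
    }

    /// Call once per tick. `buttons[i]` is true while button i+1 is held.
    /// Button 1 pauses the animation, button 2 makes it run faster.
    /// Returns the index of the LED that should be lit.
    fn tick(&mut self, buttons: [bool; 4]) -> usize {
        if buttons[0] {
            return self.lit; // paused: nothing advances
        }
        let period = if buttons[1] { FAST_PERIOD } else { NORMAL_PERIOD };
        self.ticks += 1;
        if self.ticks >= period {
            self.ticks = 0;
            self.lit = (self.lit + 1) % 4;
        }
        self.lit
    }
}

fn main() {
    let mut chaser = Chaser::new();
    let mut previous = chaser.lit;
    // Simulate 40 ticks (0.8 s at 50 Hz) with no buttons held.
    for tick in 1..=40 {
        let lit = chaser.tick([false; 4]);
        if lit != previous {
            println!("tick {}: LED {} on", tick, lit + 1);
            previous = lit;
        }
    }
}
```

Keeping the animation state separate from the hardware access makes it easy to test the logic on your development machine.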

Troubleshooting

🔎 If you get totally stuck, ask for help! If all else fails, you could peek in boards/dk-solution, which has a complete set of the required BSP changes.

nRF52 USB Workbook

In this workshop you'll learn to:

  • work with registers and peripherals from Rust
  • handle external events in embedded Rust applications using RTIC
  • debug event driven applications
  • test no_std code

To put these concepts and techniques in practice you'll write a toy USB device application that gets enumerated and configured by the host. This embedded application will run in a fully event driven fashion: only doing work when the host asks for it.

You will need an nRF52840 Development Kit for this exercise, but not the nRF USB dongle.

The nRF52840 Development Kit

The board has two USB ports: J2 and J3 and an on-board J-Link programmer / debugger -- there are instructions to identify the ports in a previous section. USB port J2 is the J-Link's USB port. USB port J3 is the nRF52840's USB port. Connect the Development Kit to your computer using both ports.

Workshop Steps

You will need to complete the workshop steps in order. It's OK if you don't get them all finished, but you must complete one before starting the next one. You can look at the solution for each step if you get stuck.

If you are reading the book view, the steps are listed on the left in the sidebar (use the hamburger if that is hidden). If you are reading the source on GitHub, go back to the SUMMARY.md file to see the steps.

Listing USB Devices

✅ To list all USB devices, run cargo xtask usb-list from the top-level checkout.

$ cargo xtask usb-list
(...) random other USB devices will be listed
Bus 001 Device 010: ID 1366:1015 <- J-Link on the nRF52840 Development Kit

The goal of this workshop is to get the nRF52840 SoC to show in this list. The embedded application will use the USB Vendor ID (VID) and USB Product ID (PID) defined in nrf52-code/consts; cargo xtask usb-list will highlight the USB device that matches that VID/PID pair, like this:

$ cargo xtask usb-list
(...) random other USB devices will be listed
Bus 001 Device 010: ID 1366:1015 <- J-Link on the nRF52840 Development Kit
Bus 001 Device 059: ID 1209:0717 <- nRF52840 on the nRF52840 Development Kit

Hello, world!

In this section, we'll set up the integration in VS Code and run the first program.

✅ Open the nrf52-code/usb-app folder in VS Code and open the src/bin/hello.rs file.

Note: To ensure full rust-analyzer support, do not open the whole rust-exercises folder.

Give rust-analyzer some time to analyze the file and its dependency graph. When it's done, a "Run" button will appear over the main function. If it doesn't appear on its own, type something in the file, delete and save. This should trigger a re-load.

✅ Click the "Run" button to run the application on the microcontroller.

If you are not using VS Code, run the cargo run --bin hello command from the nrf52-code/usb-app folder.

NOTE: Recent versions of the nRF52840-DK have flash read-out protection to stop people dumping the contents of flash on an nRF52 they received pre-programmed, so if you have problems immediately after first plugging your board in, see this page.

If you run into an error along the lines of "Debug power request failed" retry the operation and the error should disappear.

The usb-app package has been configured to cross-compile applications to the ARM Cortex-M architecture and then run them using the probe-rs custom Cargo runner. The probe-rs tool will load and run the embedded application on the microcontroller and collect logs from the microcontroller.

The probe-rs process will terminate when the microcontroller enters the "halted" state. From the embedded application, one can enter the "halted" state by performing a CPU breakpoint with a special argument that indicates 'success'. For convenience, an exit function is provided in the dk Board Support Package (BSP). This function is divergent like std::process::exit (fn() -> !) and can be used to halt the device and terminate the probe-rs process.

Checking the API documentation

We'll be using the dk Board Support Package. It's good to have its API documentation handy. You can generate the documentation for that crate from the command line:

✅ Run the following command from within the nrf52-code/usb-app folder. It will open the generated documentation in your default web browser. Note that if you run it from inside the nrf52-code/boards/dk folder, you will find a bunch of USB-related documentation missing, because we disable that particular feature by default.

cargo doc --open

NOTE: If you are using Safari and the documentation is hard to read due to missing CSS, try opening it in a different browser.

✅ Browse to the documentation for the dk crate, and look at what is available within the usbd module. Some of these functions will be useful later.

RTIC hello

RTIC, Real-Time Interrupt-driven Concurrency, is a framework for building event-driven, time-sensitive applications.

✅ Open the nrf52-code/usb-app/src/bin/rtic-hello.rs file.

RTIC applications are written in RTIC's Domain Specific Language (DSL). The DSL extends Rust syntax with custom attributes like #[init] and #[idle].

RTIC makes a clearer distinction between the application's initialization phase, the #[init] function, and the application's main loop or main logic, the #[idle] function. The initialization phase runs with interrupts disabled and interrupts are re-enabled before the idle function is executed.

rtic::app is a procedural macro that generates extra Rust code, in addition to the user's functions. The fully expanded version of the macro can be found in the file target/rtic-expansion.rs. This file will contain the expansion of the procedural macro for the last compiled RTIC application.

✅ Build the rtic-hello example and look at the generated rtic-expansion.rs file.

You can use rustfmt on target/rtic-expansion.rs to make the generated code easier to read. Among other things, the file should contain the following lines. Note that interrupts are disabled during the execution of the init function:

unsafe extern "C" fn main() -> ! {
    rtic::export::interrupt::disable();
    let mut core: rtic::export::Peripherals = rtic::export::Peripherals::steal().into();
    #[inline(never)]
    fn __rtic_init_resources<F>(f: F)
    where
        F: FnOnce(),
    {
        f();
    }
    __rtic_init_resources(|| {
        let (shared_resources, local_resources, mut monotonics) =
            init(init::Context::new(core.into()));
        rtic::export::interrupt::enable();
    });
    idle(idle::Context::new(&rtic::export::Priority::new(0)))
}

Dealing with Registers

In this and the next section we'll look into RTIC's event handling features. To explore these features we'll use the action of connecting a USB cable to the DK's port J3 as the event we'd like to handle.

✅ Open the nrf52-code/usb-app/src/bin/events.rs file.

We'll read the code and explain what it does.

The example application enables the signaling of this "USB power" event in the init function. This is done using the low level register API generated by the svd2rust tool. The register API was generated from a SVD (System View Description) file, a file that describes all the peripherals and registers, and their memory layout, on a device. In our case the device was the nRF52840; a sample SVD file for this microcontroller can be found here.

In the svd2rust API, peripherals are represented as structs. The fields of each peripheral struct are the registers associated to that peripheral. Each register field exposes methods to read and write to the register in a single memory operation.

The read and write methods take closure arguments. These closures in turn grant access to a "constructor" value, usually named r or w, which provides methods to modify the bitfields of a register. At the same time the API of these "constructors" prevents you from modifying the reserved parts of the register: you cannot write arbitrary values into registers; you can only write valid values into registers.

Apart from the read and write methods there's a modify method that performs a read-modify-write operation on the register; this API is also closure-based. The svd2rust-generated API is documented in detail in the svd2rust crate starting at the Peripheral API section.
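
The closure-based style is easier to follow with a small model. The sketch below imitates the shape of such an API on a host machine; it is not the code svd2rust generates. The `Register`, `R` and `W` types, the field names and the bit positions are illustrative assumptions, and the real generated API uses per-field proxies (for example `w.field().set_bit()`) rather than the simplified `bool` setters shown here.

```rust
use std::cell::Cell;

// Bit positions are illustrative only, not taken from the datasheet.
const USBDETECTED: u32 = 1 << 0;
const USBREMOVED: u32 = 1 << 1;

/// Host-side stand-in for one 32-bit register.
struct Register {
    bits: Cell<u32>,
}

/// Read proxy: a snapshot of the register contents.
struct R {
    bits: u32,
}

/// Write proxy: only the named fields can be changed through it.
struct W {
    bits: u32,
}

impl R {
    fn usbdetected(&self) -> bool {
        self.bits & USBDETECTED != 0
    }
    fn usbremoved(&self) -> bool {
        self.bits & USBREMOVED != 0
    }
}

impl W {
    fn usbdetected(&mut self, on: bool) -> &mut Self {
        self.set(USBDETECTED, on)
    }
    fn usbremoved(&mut self, on: bool) -> &mut Self {
        self.set(USBREMOVED, on)
    }
    fn set(&mut self, mask: u32, on: bool) -> &mut Self {
        if on {
            self.bits |= mask;
        } else {
            self.bits &= !mask;
        }
        self
    }
}

impl Register {
    fn new() -> Self {
        Register { bits: Cell::new(0) }
    }
    /// One read of the whole register.
    fn read(&self) -> R {
        R { bits: self.bits.get() }
    }
    /// One write: starts from the reset value (0 in this model).
    fn write<F>(&self, f: F)
    where
        F: FnOnce(&mut W) -> &mut W,
    {
        let mut w = W { bits: 0 };
        f(&mut w);
        self.bits.set(w.bits);
    }
    /// Read-modify-write: starts from the current contents.
    fn modify<F>(&self, f: F)
    where
        F: for<'w> FnOnce(&R, &'w mut W) -> &'w mut W,
    {
        let bits = self.bits.get();
        let r = R { bits };
        let mut w = W { bits };
        f(&r, &mut w);
        self.bits.set(w.bits);
    }
}

fn main() {
    let reg = Register::new();
    reg.write(|w| w.usbdetected(true));
    reg.modify(|_r, w| w.usbremoved(true));
    let r = reg.read();
    println!("usbdetected={} usbremoved={}", r.usbdetected(), r.usbremoved());
}
```

Note the difference between the two: `write` discards whatever was in the register before, while `modify` preserves the fields the closure does not touch.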

In Cortex-M devices interrupt handling needs to be enabled on two sides: on the peripheral side and on the core side. The register operations done in init take care of the peripheral side. The core side of the operation involves writing to the registers of the Nested Vectored Interrupt Controller (NVIC) peripheral. This second part doesn't need to be done by the user in RTIC applications because the framework takes care of it.

Event Handling

Below the idle function you'll see a #[task] handler, a function. This task is bound to the POWER_CLOCK interrupt signal and will be executed, function-call style, every time the interrupt signal is raised by the hardware.

✅ Run the events application. Then connect one end of a micro-USB cable to your PC/laptop and the other end to the DK (port J3). You'll see the "POWER event occurred" message after the cable is connected.

Note that all tasks will be prioritized over the idle function so the execution of idle will be interrupted (paused) by the on_power_event task. When the on_power_event task finishes (returns) the execution of the idle will be resumed. This will become more obvious in the next section.

Try this: add an infinite loop to the end of init so that it never returns. Now run the program and connect the USB cable. What behavior do you observe? How would you explain this behavior? (hint: look at the rtic-expansion.rs file: under what conditions is the init function executed?)

Task State

Now let's say we want to change the previous program to count how many times the USB cable (port J3) has been connected and disconnected.

✅ Open the nrf52-code/usb-app/src/bin/task-state.rs file.

Tasks run from start to finish, like functions, in response to events. To preserve some state between the different executions of a task we can add a resource to the task. In RTIC, resources are the mechanism used to share data between different tasks in a memory safe manner but they can also be used to hold task state.

To get the desired behavior we'll want to store some counter in the state of the on_power_event task.
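
As a mental model only, a task with state behaves like a function that receives a mutable reference to data that outlives each call. The sketch below shows that idea in plain Rust on a host; `PowerEventState` and this `on_power_event` signature are hypothetical and do not use RTIC's resource syntax.

```rust
/// State owned by the task; it survives between invocations.
struct PowerEventState {
    count: u32,
}

/// Model of a task body: it runs from start to finish on every event and
/// only keeps what it stores in `state`.
fn on_power_event(state: &mut PowerEventState) -> u32 {
    state.count += 1;
    state.count
}

fn main() {
    let mut state = PowerEventState { count: 0 };
    // Simulate three separate "cable connected" events.
    for _ in 0..3 {
        let n = on_power_event(&mut state);
        println!("event handled, count = {}", n);
    }
}
```

In an RTIC application the framework, not `main`, owns the state and hands the task its reference through the Context argument.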

The starter code shows the syntax to declare a resource, the Resources struct, and the syntax to associate a resource to a task, the resources list in the #[task] attribute.

In the starter code a resource is used to move (by value) the POWER peripheral from init to the on_power_event task. The POWER peripheral then becomes part of the state of the on_power_event task and can be persistently accessed throughout calls to on_power_event() through a reference. The resources of a task are available via the Context argument of the task.

To elaborate more on this move action: in the svd2rust API, peripheral types like POWER are singletons (only a single value of that type can ever exist). The consequence of this design is that holding a peripheral instance, like POWER, by value means that the function (or task) has exclusive access, or ownership, over the peripheral. This is the case of the init function: it owns the POWER peripheral but then transfers ownership over it to a task using the resource initialization mechanism.

We have moved the POWER peripheral into the task because we want to clear the USBDETECTED interrupt flag after it has been set by the hardware. If we miss this step the on_power_event task (function) will be called again once it returns and then again and again and again (ad infinitum).

Also note that in the starter code the idle function has been modified. Pay attention to the logs when you run the starter code.

✅ Modify the program so that it prints the number of times the USB cable has been connected to the DK every time the cable is connected, as shown below.

USBDETECTED interrupt enabled
idle: going to sleep
on_power_event: cable connected 1 time
idle: woke up
idle: going to sleep
on_power_event: cable connected 2 times
idle: woke up
idle: going to sleep
on_power_event: cable connected 3 times

You can find a solution to this exercise in the nrf52-code/usb-app-solutions/src/bin/task-state.rs file.

USB Enumeration

A USB device, like the nRF52840, can be in one of these three states:

  • Default
  • Address
  • Configured

After being powered the device will start in the Default state. The enumeration process will take the device from the Default state to the Address state. As a result of the enumeration process the device will be assigned an address, in the range 1..=127, by the host.
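
One way to picture these states in Rust is as an enum with transition functions. This is a simplified sketch under stated assumptions: the names `State`, `on_reset` and `on_set_address` are made up for illustration, and only the transitions mentioned above are modelled (the USB specification defines more).

```rust
/// The three device states discussed above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Default,
    /// The host has assigned this address.
    Address(u8),
    /// The host has also selected a configuration.
    Configured { address: u8 },
}

/// A USB reset always returns the device to the Default state.
fn on_reset(_current: State) -> State {
    State::Default
}

/// SET_ADDRESS moves a device from Default to Address.
/// Only addresses in 1..=127 are accepted in this model.
fn on_set_address(current: State, address: u8) -> Option<State> {
    match (current, address) {
        (State::Default, 1..=127) => Some(State::Address(address)),
        _ => None,
    }
}

fn main() {
    let state = on_reset(State::Configured { address: 5 });
    println!("after reset: {:?}", state);
    println!("after SET_ADDRESS(10): {:?}", on_set_address(state, 10));
    println!("after SET_ADDRESS(200): {:?}", on_set_address(state, 200));
}
```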

The USB protocol is complex so we'll leave out many details and focus only on the concepts required to get enumeration and configuration working. There are also several USB specific terms so we recommend checking chapter 2, "Terms and Abbreviations", of the USB specification (linked at the bottom of this document) every now and then.

Each OS may perform the enumeration process slightly differently but the process will always involve these host actions:

  • A USB reset, to put the device in the Default state, regardless of what state it was in.
  • A GET_DESCRIPTOR request, to get the device descriptor.
  • A SET_ADDRESS request, to assign an address to the device.

These host actions will be perceived as events by the nRF52840 and these events will cause some bits to be set in the relevant register, and then an interrupt to be fired. During this workshop, we will gradually parse and handle these events and learn more about Embedded Rust along the way.

There are more USB concepts involved that we'll need to cover, like descriptors, configurations, interfaces and endpoints but for now let's see how to handle USB events.

For each step of the course, we've prepared a usb-<n>.rs file that gives you a base structure and hints on how to proceed. The matching usb-<n>.rs in usb-app-solutions contains a sample solution should you need it. Switch from usb-<n>.rs to usb-<n+1>.rs when instructed and continue working from there. Please keep the USB cable plugged into J3 through all these exercises.

USB-1: Dealing with USB Events

The USBD peripheral on the nRF52840 contains a series of registers, called EVENTS registers, that indicate the reason for entering the USBD interrupt handler. These events must be handled by the application to complete the enumeration process.

✅ Open the nrf52-code/usb-app/src/bin/usb-1.rs file.

In this starter code the USBD peripheral is initialized in init and a task, named main, is bound to the interrupt signal called USBD. This task will be called every time a new USBD event needs to be handled. The main task uses usbd::next_event() to check all the event registers; if any event is set (i.e. that event just occurred) then the function returns the event, represented by the Event enum, wrapped in the Some variant. This Event is then passed to the on_event function for further processing.
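
The polling pattern described above can be modelled on a host as follows. This is a sketch, not the dk crate's implementation: the `EventRegisters` struct stands in for the hardware EVENTS registers, and both the order in which the flags are checked and the fact that a flag is cleared when it is reported are assumptions made for the example.

```rust
/// The events handled in this workshop.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Event {
    UsbReset,
    UsbEp0Setup,
    UsbEp0DataDone,
}

/// Hypothetical model of the EVENTS registers: one flag per event.
#[derive(Default)]
struct EventRegisters {
    usbreset: bool,
    ep0setup: bool,
    ep0datadone: bool,
}

/// Returns the next pending event, if any, and clears its flag.
fn next_event(regs: &mut EventRegisters) -> Option<Event> {
    if regs.usbreset {
        regs.usbreset = false;
        return Some(Event::UsbReset);
    }
    if regs.ep0setup {
        regs.ep0setup = false;
        return Some(Event::UsbEp0Setup);
    }
    if regs.ep0datadone {
        regs.ep0datadone = false;
        return Some(Event::UsbEp0DataDone);
    }
    None
}

fn on_event(event: Event) {
    println!("USB: {:?}", event);
}

fn main() {
    // Pretend the hardware raised two events before the handler ran.
    let mut regs = EventRegisters { usbreset: true, ep0setup: true, ..Default::default() };
    // Body of the interrupt handler: drain every pending event.
    while let Some(event) = next_event(&mut regs) {
        on_event(event);
    }
}
```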

✅ Connect the USB cable to the port J3 then run the starter code.

❗️ Keep the cable connected to the J3 port for the rest of the workshop

This code will panic because Event::UsbReset is not handled yet - it has a todo!() on the relevant match arm.

✅ Go to fn on_event(...), line 48. You'll need to handle the Event::UsbReset case - for now, just print the log message returning to the Default state.

✅ Now handle the Event::UsbEp0Setup case - for now, just print the log message usb-1 exercise complete and then execute dk::exit() to shut down the microcontroller.

Your logs should look like:

USBD initialized
USB: UsbReset
returning to the Default state
USB: UsbEp0Setup
usb-1 exercise complete

You can ignore the Event::UsbEp0DataDone event for now because we don't yet get far enough when talking to the host computer for this event to come up.

USB Knowledge

USBRESET (indicated by Event::UsbReset)

This event indicates that the host issued a USB reset signal - the first step in the enumeration process. According to the USB specification this will move the device from any state to the Default state. Since we are currently not dealing with any other state, for now we just log that we received this event and move on.

EP0SETUP (indicated by Event::UsbEp0Setup)

The USBD peripheral has detected the SETUP stage of a control transfer. For now, we just print a log message and exit the application.

EP0DATADONE (indicated by Event::UsbEp0DataDone)

The USBD peripheral is signaling the end of the DATA stage of a control transfer. Since you won't encounter this event just yet, you can leave it as it is.

Help

You can find the solution in the nrf52-code/usb-app-solutions/src/bin/usb-1.rs file.

USB Endpoints

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram all the endpoint rectangles are highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT'. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangles there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

Under the USB protocol data transfers occur over endpoints.

Endpoints are similar to UDP or TCP ports in that they allow logical multiplexing of data over a single physical USB bus. USB endpoints, however, have directions: an endpoint can either be an IN endpoint or an OUT endpoint. The direction is always from the perspective of the host so at an IN endpoint data travels from the device to the host and at an OUT endpoint data travels from the host to the device.

Endpoints are identified by their address, a zero-based index, and direction. There are four types of endpoints: control endpoints, bulk endpoints, interrupt endpoints and isochronous endpoints. Each endpoint type has different properties: reliability, latency, etc. In this workshop we'll only need to deal with control endpoints.
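
An endpoint's number and direction are commonly packed into a single address byte. The sketch below shows that encoding as used by the standard endpoint descriptor in the USB 2.0 specification: bit 7 holds the direction (1 for IN) and bits 3..0 hold the endpoint number. The `Endpoint` and `Direction` types are hypothetical names chosen for this example.

```rust
/// Transfer direction, always from the host's point of view.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction {
    /// Host to device.
    Out,
    /// Device to host.
    In,
}

/// An endpoint is identified by its number and its direction.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Endpoint {
    number: u8,
    direction: Direction,
}

impl Endpoint {
    /// Packs number and direction into one byte:
    /// bit 7 = direction (1 = IN), bits 3..0 = endpoint number.
    fn address_byte(&self) -> u8 {
        let direction_bit = match self.direction {
            Direction::In => 0x80,
            Direction::Out => 0x00,
        };
        direction_bit | (self.number & 0x0F)
    }
}

fn main() {
    let endpoints = [
        Endpoint { number: 0, direction: Direction::Out },
        Endpoint { number: 0, direction: Direction::In },
        Endpoint { number: 1, direction: Direction::In },
        Endpoint { number: 2, direction: Direction::Out },
    ];
    for ep in endpoints.iter() {
        println!("{:?} -> {:#04x}", ep, ep.address_byte());
    }
}
```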

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the 'control endpoint' rectangle is highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT'. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangles there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

All USB devices must use "endpoint 0" as the default control endpoint. "Endpoint 0" actually refers to two endpoints: endpoint 0 IN and endpoint 0 OUT. This endpoint pair is used to establish a control pipe, a bidirectional communication channel between the host and device where data is exchanged using a predefined format. The default control pipe over endpoint 0 is mandatory: it must always be present and must always be active.

Going back to our enumeration steps, we are expecting the host to request our Device Descriptor using a GET_DESCRIPTOR request sent over the control pipe. Later, we will expect the host to send us a SET_ADDRESS request, giving us our new USB address - again, over the control pipe.

For detailed information about endpoints check Section 5.3.1, Device Endpoints, in the USB 2.0 specification. Or you can look at Chapter 3 of USB In a Nutshell.

USB Control Transfers

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the 'control endpoint' rectangle is highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT'. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangles there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

Before we continue we need to discuss how data transfers work under the USB protocol.

The control pipe handles control transfers, a special kind of data transfer used by the host to issue requests. A control transfer is a data transfer that occurs in three stages: a SETUP stage, an optional DATA stage and a STATUS stage. The device must handle these requests by either supplying the requested data, or performing the requested action.

During the SETUP stage the host sends 8 bytes of data that identify the control request. Depending on the issued request there may be a DATA stage or not; during the DATA stage data is transferred either from the device to the host or the other way around. During the STATUS stage the device acknowledges, or not, the whole control request.
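
To illustrate what those 8 bytes contain, the sketch below splits a raw SETUP packet into its five fields. The field layout follows the USB 2.0 specification, where multi-byte fields are little-endian; the `SetupPacket` struct and the `parse_setup` function are hypothetical names used only in this example.

```rust
/// The five fields carried by the 8-byte SETUP packet.
#[derive(Debug, PartialEq)]
struct SetupPacket {
    bm_request_type: u8,
    b_request: u8,
    w_value: u16,
    w_index: u16,
    w_length: u16,
}

/// Splits the raw bytes into fields. 16-bit fields are sent low byte first.
fn parse_setup(bytes: [u8; 8]) -> SetupPacket {
    SetupPacket {
        bm_request_type: bytes[0],
        b_request: bytes[1],
        w_value: u16::from_le_bytes([bytes[2], bytes[3]]),
        w_index: u16::from_le_bytes([bytes[4], bytes[5]]),
        w_length: u16::from_le_bytes([bytes[6], bytes[7]]),
    }
}

fn main() {
    // A GET_DESCRIPTOR request for the device descriptor, asking for 18 bytes.
    let raw = [0x80, 0x06, 0x00, 0x01, 0x00, 0x00, 0x12, 0x00];
    println!("{:?}", parse_setup(raw));
}
```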

For detailed information about control transfers see Chapter 4 of USB In a Nutshell.

In this workshop, we expect the host to perform a control transfer to find out what kind of device we are.

USB-2: SETUP Stage

At the end of program usb-1 we received an EP0SETUP event. This event signals the end of the SETUP stage of a control transfer. The nRF52840 USBD peripheral will automatically receive the SETUP data and store it in the registers BMREQUESTTYPE, BREQUEST, WVALUE{L,H}, WINDEX{L,H} and WLENGTH{L,H}.

In nrf52-code/usb-app/src/bin/usb-2.rs, you will find a short description of each register above the variable into which it should be read. But before we read those registers, we need to write some parsing code and get it unit tested.

For in-depth register documentation, refer to Sections 6.35.13.31 to 6.35.13.38 of the nRF52840 Product Specification.

Writing a parser for the data of this SETUP stage

We could parse the SETUP data inside our application, but it makes more sense to put the code in a library where we can test it, and where we can share it with other applications.

We have provided just such a library in nrf52-code/usb-lib. But it's missing some important parts that you need to complete. The definition of Descriptor::Configuration as well as the associated test has been "commented out" using a #[cfg(TODO)] attribute because it is not handled by the firmware yet - leave those disabled for the time being.

✅ Run cargo test in the nrf52-code/usb-lib directory.

When you need to write some no_std code that does not involve device-specific I/O you should consider writing it as a separate crate. This way, you can test it on your development machine (e.g. x86_64) using the standard cargo test functionality.

So that's what we'll do here. In nrf52-code/usb-lib/src/lib.rs you'll find starter code for writing a no_std SETUP data parser. The starter code contains some unit tests; you can run them with cargo test (from within the usb-lib folder) or you can use Rust Analyzer's "Test" button in VS Code.

You should see:

running 2 tests
test tests::set_address ... ok
test tests::get_descriptor_device ... FAILED

failures:

---- tests::get_descriptor_device stdout ----
thread 'tests::get_descriptor_device' panicked at src/lib.rs:119:9:
assertion `left == right` failed
  left: Err(UnknownRequest)
 right: Ok(GetDescriptor { descriptor: Device, length: 18 })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::get_descriptor_device

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass `--lib`

✅ Fix the tests by parsing GET_DESCRIPTOR requests for DEVICE descriptors.

Modify Request::parse() in nrf52-code/usb-lib/src/lib.rs to recognize a GET_DESCRIPTOR request of type DEVICE so that the get_descriptor_device test passes. Note that the parser already handles SET_ADDRESS requests.

Description of GET_DESCRIPTOR request

We can recognize a GET_DESCRIPTOR request by the following properties:

  • bmRequestType is 0b10000000
  • bRequest is 6 (i.e. the GET_DESCRIPTOR Request Code, defined in table 9-4 in the USB spec)
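
The value 0b10000000 is not arbitrary: bmRequestType is a bitfield. The sketch below decodes it using the layout from the USB 2.0 specification (bit 7 is the direction, bits 6..5 the request type, bits 4..0 the recipient). The helper function names are made up for this example and are not part of usb-lib.

```rust
/// Bit 7: 1 means the DATA stage goes from the device to the host.
fn is_device_to_host(bm_request_type: u8) -> bool {
    bm_request_type & 0b1000_0000 != 0
}

/// Bits 6..5: 0 = standard, 1 = class, 2 = vendor.
fn request_kind(bm_request_type: u8) -> u8 {
    (bm_request_type >> 5) & 0b11
}

/// Bits 4..0: 0 = device, 1 = interface, 2 = endpoint, 3 = other.
fn recipient(bm_request_type: u8) -> u8 {
    bm_request_type & 0b1_1111
}

fn main() {
    // GET_DESCRIPTOR uses 0b10000000; SET_ADDRESS uses 0b00000000.
    for bm in [0b1000_0000u8, 0b0000_0000u8].iter() {
        println!(
            "{:#010b}: device-to-host={}, kind={}, recipient={}",
            bm,
            is_device_to_host(*bm),
            request_kind(*bm),
            recipient(*bm)
        );
    }
}
```

So 0b10000000 reads as "a standard request, addressed to the device, with data flowing from the device to the host".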

Description of GET_DESCRIPTOR requests for DEVICE descriptors

In this step of the exercise, we only need to parse DEVICE descriptor requests. They have the following properties:

  • the descriptor type is 1 (i.e. DEVICE, defined in table 9-5 of the USB spec)
  • the descriptor index is 0
  • the wIndex is 0 for our purposes
  • ❗️you need to fetch the descriptor type from the high byte of wValue, and the descriptor index from the low byte of wValue

Check Section 9.4.3 of the USB specification for a very detailed description of the requests. All the constants we'll be using are also described in Tables 9-3, 9-4 and 9-5 of the same document. Or, you can refer to Chapter 6 of USB In a Nutshell.

You should return Err(Error::xxx) if the properties aren't met.

🔎 Remember that you can:

  • define binary literals by prefixing them with 0b
  • use bit shifts (>>) and casts (as u8) to get the high/low bytes of wValue

You will also find this information in the // TODO implement ... comment in the Request::parse() function of the lib.rs file.
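
As a small illustration of the shift-and-cast hint, the sketch below splits a wValue into its two bytes. The `split_wvalue` helper is a hypothetical name; it only shows the bit manipulation and leaves the rest of Request::parse() to you.

```rust
/// Splits wValue into (descriptor type, descriptor index).
fn split_wvalue(wvalue: u16) -> (u8, u8) {
    // High byte: shift it down, then truncate to 8 bits.
    let descriptor_type = (wvalue >> 8) as u8;
    // Low byte: the cast keeps only the bottom 8 bits.
    let descriptor_index = wvalue as u8;
    (descriptor_type, descriptor_index)
}

fn main() {
    // 0x0100 (256 in decimal) means descriptor type 1 (DEVICE), index 0.
    let (descriptor_type, descriptor_index) = split_wvalue(0x0100);
    println!("type = {}, index = {}", descriptor_type, descriptor_index);
}
```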

See nrf52-code/usb-lib-solutions/get-device/src/lib.rs for a solution.

Using our new parser

✅ Read incoming request information and pass it to the parser:

Modify nrf52-code/usb-app/src/bin/usb-2.rs to read the appropriate USBD registers and parse them when an EP0SETUP event is received.

Getting Started:

  • for a mapping of register names to the USBD API, check the entry for nrf52840_hal::target::usbd in the documentation you created using cargo doc

  • Try let value = usbd.register_name.read().bits() as u8; if you just want the bottom eight bits of a register.

  • remember that we've learned how to read registers in events.rs.

  • you will need to put together the higher and lower bits of wlength, windex and wvalue to get the whole field, or use a library function to do it for you. Can the dk crate help?

  • Note: If you're using a Mac, you need to catch SET_ADDRESS requests returned by the parser as these are sent before the first GET_DESCRIPTOR request. We added an empty handler for you already so there's nothing further to do (we're just explaining why it's there).
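
For the hint about putting the higher and lower bits together, here is a minimal sketch of joining two 8-bit halves into one 16-bit field. The `combine` function is a hypothetical helper written for this example; whether the dk crate offers an equivalent is left for you to find out in its documentation.

```rust
/// Joins the high and low halves of a 16-bit field.
fn combine(high: u8, low: u8) -> u16 {
    (u16::from(high) << 8) | u16::from(low)
}

fn main() {
    // Example: WVALUEH = 0x01 and WVALUEL = 0x00 give wvalue = 256.
    let wvalue = combine(0x01, 0x00);
    println!("wvalue = {}", wvalue);
    // The standard library can do the same job.
    assert_eq!(wvalue, u16::from_be_bytes([0x01, 0x00]));
}
```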

Expected Result:

When you have successfully received a GET_DESCRIPTOR request for a Device descriptor you are done. You should see an output like this:

USB: UsbReset @ Duration { secs: 0, nanos: 361145018 }
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 402465820 }
SETUP: bmrequesttype: 0, brequest: 5, wlength: 0, windex: 0, wvalue: 10
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 404754637 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 8, windex: 0, wvalue: 256
GET_DESCRIPTOR Device [length=8]
Goal reached; move to the next section
`dk::exit()` called; exiting ...

Note: wlength / length can vary depending on the OS, USB port (USB 2.0 vs USB 3.0) or the presence of a USB hub so you may see a different value.

You can find a solution to this step in nrf52-code/usb-app-solutions/src/bin/usb-2.rs.

USB Device Descriptors

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the outermost 'device' rectangle and the 'bNumConfigurations' label are highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT'. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangles there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

After receiving a GET_DESCRIPTOR request during the SETUP stage, the device needs to respond with the actual descriptor data during the DATA stage. In our Rust application, this descriptor will be generated using some library code and serialised into an array of bytes which we can give to the USBD peripheral.

A descriptor is a binary encoded data structure sent by the device to the host. The device descriptor, in particular, contains information about the device, like its product and vendor identifiers and how many configurations it has. The format of the device descriptor is specified in Section 9.6.1 of the USB specification.

As far as the enumeration process goes, the most relevant fields of the device descriptor are the number of configurations and bcdUSB, the version of the USB specification the device adheres to. In bcdUSB you should report compatibility with USB 2.0.

What about (the number of) configurations?

A configuration is akin to an operation mode. USB devices usually have a single configuration that will be the only mode in which they'll operate, for example a USB mouse will always act as a USB mouse. Some devices, though, may provide a second configuration for the purpose of firmware upgrades. For example a printer may enter DFU (Device Firmware Upgrade) mode, a second configuration, so that a user can update its firmware; while in DFU mode the printer will not provide printing functionality.

The specification mandates that a device must have at least one available configuration so we can report a single configuration in the device descriptor.

You can read more about Device Descriptors in Chapter 5 of USB In a Nutshell.

USB-3: DATA Stage

The next step is to respond to the GET_DESCRIPTOR request for our device descriptor, with an actual device descriptor that describes our USB Device.

Handle the request

✅ Open the nrf52-code/usb-app/src/bin/usb-3.rs file

Part of this response is already implemented. We'll go through this.

We'll use the dk::usb::Ep0In abstraction. An instance of it is available in the board value (inside the #[init] function). The first step is to make this Ep0In instance available to the on_event function.

The Ep0In API has two methods: start and end. start is used to start a DATA stage; this method takes a slice of bytes ([u8]) as argument; this argument is the response data. The end method needs to be called after start, when the EP0DATADONE event is raised, to complete the control transfer. Ep0In will automatically issue the STATUS stage that must follow the DATA stage.

✅ Handle the EP0DATADONE event

Do this by calling the end method on the Ep0In instance.

✅ Implement the response to the GET_DESCRIPTOR request for device descriptors.

Extend nrf52-code/usb-app/src/bin/usb-3.rs so that it uses Ep0In to respond to the GET_DESCRIPTOR request (but only for device descriptors - no other kind of descriptor).

Values of the device descriptor

The raw values you need to pack into the descriptor are as follows. Note, we won't be doing this by hand, so read on before you start typing!

  • bLength = 18, the size of the descriptor (must always be this value)
  • bDescriptorType = 1, device descriptor type (must always be this value)
  • bDeviceClass = bDeviceSubClass = bDeviceProtocol = 0, these are unimportant for enumeration
  • bMaxPacketSize0 = 64, this is the most performant option (minimizes exchanges between the device and the host) and it's assumed by the Ep0In abstraction
  • idVendor = consts::VID, value expected by cargo xtask usb-list (*)
  • idProduct = consts::PID, value expected by cargo xtask usb-list (*)
  • bcdDevice = 0x0100, this means version 1.0 but any value should do
  • iManufacturer = iProduct = iSerialNumber = None, string descriptors not supported
  • bNumConfigurations = 1, must be at least 1 so this is the minimum value

(*) the consts crate refers to the crate in the nrf52-code/consts folder. It is already part of the usb-app crate dependencies.
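For illustration only, here is a minimal sketch of how the values above would pack into 18 bytes. It is not how the usb2 crate builds the descriptor. The VID and PID constants are stand-ins for consts::VID and consts::PID, taken from the cargo xtask usb-list output shown later in this chapter, and the function name is hypothetical. Multi-byte fields are stored little-endian, and bcdUSB is 0x0200 because we report USB 2.0.

```rust
/// Stand-ins for `consts::VID` and `consts::PID` (assumed values, see the
/// `usb-list` output later in this chapter).
const VID: u16 = 0x1209;
const PID: u16 = 0x0717;

/// Hand-packed device descriptor (hypothetical helper, for illustration).
fn device_descriptor() -> [u8; 18] {
    let bcd_usb = 0x0200u16.to_le_bytes(); // USB 2.0
    let vid = VID.to_le_bytes();
    let pid = PID.to_le_bytes();
    let bcd_device = 0x0100u16.to_le_bytes(); // version 1.0

    [
        18,            // bLength
        1,             // bDescriptorType (DEVICE)
        bcd_usb[0],    // bcdUSB, low byte
        bcd_usb[1],    // bcdUSB, high byte
        0,             // bDeviceClass
        0,             // bDeviceSubClass
        0,             // bDeviceProtocol
        64,            // bMaxPacketSize0
        vid[0],        // idVendor, low byte
        vid[1],        // idVendor, high byte
        pid[0],        // idProduct, low byte
        pid[1],        // idProduct, high byte
        bcd_device[0], // bcdDevice, low byte
        bcd_device[1], // bcdDevice, high byte
        0,             // iManufacturer (no string descriptor)
        0,             // iProduct (no string descriptor)
        0,             // iSerialNumber (no string descriptor)
        1,             // bNumConfigurations
    ]
}
```

Seeing the raw layout once makes it easier to read a USB trace, but in the exercise itself the abstraction described next saves you from counting bytes by hand.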

Use the usb2::device::Descriptor abstraction

Although you can create the device descriptor by hand as an array filled with magic values we strongly recommend you use the usb2::device::Descriptor abstraction. The crate is already in the dependency list of the project; browse to the usb2 crate in the cargo doc output you opened earlier.

The length of the device descriptor

The usb2::device::Descriptor struct does not have bLength and bDescriptorType fields. Those fields have fixed values according to the USB spec so you cannot modify or set them. When bytes() is called on the Descriptor value the returned array, the binary representation of the descriptor, will contain those fields set to their correct value.

The device descriptor is 18 bytes long but the host may ask for fewer bytes (see the wlength field in the SETUP data). In that case you must respond with exactly the number of bytes the host asked for. The opposite may also happen: wlength may be larger than the size of the device descriptor; in this case your answer must be 18 bytes long (do not pad the response with zeroes).
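The length rule boils down to taking the smaller of the two lengths. A minimal sketch, assuming the descriptor is already available as a byte slice (the function name is hypothetical and not part of the dk or usb2 crates):

```rust
/// Returns the part of `descriptor` that should be sent in the DATA stage:
/// never more than the host asked for, never more than the descriptor holds.
fn clamp_response(descriptor: &[u8], wlength: u16) -> &[u8] {
    let len = descriptor.len().min(usize::from(wlength));
    &descriptor[..len]
}
```

With an 18-byte descriptor, a wlength of 8 yields the first 8 bytes, while a wlength of 64 yields all 18 bytes and nothing more.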

Expected log output

Once you have successfully responded to the GET_DESCRIPTOR Device request you should get logs like these (if you are logging like our solution does):

USB: UsbReset @ Duration { secs: 0, nanos: 211334227 }
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 252380370 }
SETUP: bmrequesttype: 0, brequest: 5, wlength: 0, windex: 0, wvalue: 52
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 254577635 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 8, windex: 0, wvalue: 256
GET_DESCRIPTOR Device [length=8]
EP0IN: start 8B transfer
USB: UsbEp0DataDone @ Duration { secs: 0, nanos: 254852293 }
EP0IN: transfer done
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 257568358 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 18, windex: 0, wvalue: 256
GET_DESCRIPTOR Device [length=18]
EP0IN: start 18B transfer
USB: UsbEp0DataDone @ Duration { secs: 0, nanos: 257843016 }
EP0IN: transfer done
USB: UsbEp0Setup @ Duration { secs: 0, nanos: 259674071 }
SETUP: bmrequesttype: 128, brequest: 6, wlength: 9, windex: 0, wvalue: 512
ERROR unknown request (goal achieved if GET_DESCRIPTOR Device was handled before)
`dk::exit()` called; exiting ...

A solution to this exercise can be found in nrf52-code/usb-app-solutions/src/bin/usb-3.rs.

Configuration descriptor

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the 'configuration 1' rectangle and the 'bNumInterfaces' label are highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT'. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangles there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

The configuration descriptor describes one of the device configurations to the host. The descriptor contains the following information about a particular configuration:

  • the total length of the configuration: this is the number of bytes required to transfer this configuration descriptor and the interface and endpoint descriptors associated to it
  • its number of interfaces -- must be >= 1
  • its configuration value -- this is not an index and can be any non-zero value
  • whether the configuration is self-powered
  • whether the configuration supports remote wakeup
  • its maximum power consumption

The full format of the configuration descriptor is specified in section 9.6.3, Configuration, of the USB specification.

USB-4: Supporting more Standard Requests

After responding to the GET_DESCRIPTOR Device request the host will start sending different requests. Let's identify those, and then handle them.

Update the parser

The starter nrf52-code/usb-lib package contains unit tests for everything we need. Some of them have been commented out using a #[cfg(TODO)] attribute.

✅ Remove all #[cfg(TODO)] attributes so that everything is enabled.

✅ Update the parser in nrf52-code/usb-lib to handle GET_DESCRIPTOR requests for Configuration Descriptors.

When the host issues a GET_DESCRIPTOR Configuration request the device needs to respond with the requested configuration descriptor plus all the interface and endpoint descriptors associated to that configuration descriptor during the DATA stage.

As a reminder, all GET_DESCRIPTOR request types share the following properties:

  • bmRequestType is 0b10000000
  • bRequest is 6 (i.e. the GET_DESCRIPTOR Request Code, defined in Table 9-4 of the USB specification)

A GET_DESCRIPTOR Configuration request is determined by the high byte of its wValue field:

  • The high byte of wValue is 2 (i.e. the CONFIGURATION descriptor type, defined in Table 9-5 of the USB specification)
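The field checks above can be sketched as a small standalone function. This is not the usb-lib solution: the type and function names are hypothetical (they mirror the names that appear in the log output of this chapter), and rejecting a non-zero wIndex or a non-zero device descriptor index is an assumption of this sketch.

```rust
#[derive(Debug, PartialEq)]
enum Descriptor {
    Device,
    Configuration { index: u8 },
}

#[derive(Debug, PartialEq)]
enum Request {
    GetDescriptor { descriptor: Descriptor, length: u16 },
}

/// Parses the SETUP fields of a GET_DESCRIPTOR request (illustrative sketch).
fn parse_get_descriptor(
    bm_request_type: u8,
    b_request: u8,
    w_value: u16,
    w_index: u16,
    w_length: u16,
) -> Result<Request, ()> {
    const GET_DESCRIPTOR: u8 = 6; // request code
    const DEVICE: u8 = 1; // descriptor types
    const CONFIGURATION: u8 = 2;

    if bm_request_type != 0b1000_0000 || b_request != GET_DESCRIPTOR {
        return Err(());
    }

    // wValue: the low byte is the descriptor index, the high byte is the type
    let [desc_index, desc_type] = w_value.to_le_bytes();

    let descriptor = match (desc_type, desc_index, w_index) {
        (DEVICE, 0, 0) => Descriptor::Device,
        (CONFIGURATION, index, 0) => Descriptor::Configuration { index },
        // anything else is left unsupported in this sketch
        _ => return Err(()),
    };

    Ok(Request::GetDescriptor { descriptor, length: w_length })
}
```

For example, the SETUP data `wvalue: 512, wlength: 9` from the earlier log has a high byte of 2 and a low byte of 0, so it parses as a request for configuration descriptor index 0 with a length of 9.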

✅ Update the parser in nrf52-code/usb-lib to handle SET_CONFIGURATION requests.

See the section on SET_CONFIGURATION for details on how to do this.

Once you've completed this, all your test cases should pass. If not, fix the code until they do!

Help

If you need a reference, you can find solutions to parsing GET_DESCRIPTOR Configuration and SET_CONFIGURATION requests in the following files:

Each file contains just enough code to parse the request in its name and the GET_DESCRIPTOR Device and SET_ADDRESS requests. So you can refer to nrf52-code/usb-lib-solutions/get-descriptor-config without getting "spoiled" about how to parse the SET_CONFIGURATION request.

Update the application

We're now going to be using nrf52-code/usb-app/src/bin/usb-4.rs.

Since the logic of the EP0SETUP event handling is getting more complex with each added event, you can see that usb-4.rs was refactored to add error handling: the event handling now happens in a separate function that returns a Result. When it encounters an invalid host request, it returns the Err variant which can be handled by stalling the endpoint:

fn on_event(/* parameters */) {
    match event {
        /* ... */
        Event::UsbEp0Setup => {
            if ep0setup(/* arguments */).is_err() {
                // unsupported or invalid request:
                // TODO add code to stall the endpoint
                defmt::warn!("EP0IN: unexpected request; stalling the endpoint");
            }
        }
    }
}

fn ep0setup(/* parameters */) -> Result<(), ()> {
    let req = Request::parse(/* arguments */)?;
    //                                       ^ early returns an `Err` if it occurs

    // TODO respond to the `req`; return `Err` if the request was invalid in this state

    Ok(())
}

Note that there's a difference between the error handling done here and the error handling commonly done in std programs. std programs usually bubble up errors to the top main function (using the ? operator), report the error (or chain of errors) and then exit the application with a non-zero exit code. This approach is usually not appropriate for embedded programs as

  1. main cannot return,
  2. there may not be a console to print the error to and/or
  3. stopping the program, and e.g. requiring the user to reset it to make it work again, may not be desirable behavior.

For these reasons in embedded software errors tend to be handled as early as possible rather than propagated all the way up.

This does not preclude error reporting. The above snippet includes error reporting in the form of a defmt::warn! statement. This log statement may not be included in the final release of the program as it may not be useful, or even visible, to an end user but it is useful during development.

✅ For each green test, extend usb-4.rs to handle the new requests your parser is now able to recognize.

If that's all the information you need - go ahead! If you'd like some more detail, read on.

Dealing with unknown requests: Stalling the endpoint

You may come across host requests other than the ones listed in previous sections.

For this situation, the USB specification defines a device-side procedure for "stalling an endpoint", which amounts to the device telling the host that it doesn't support some request.

This procedure should be used to deal with invalid requests, requests whose SETUP stage doesn't match any USB 2.0 standard request, and requests not supported by the device; for instance, the SET_DESCRIPTOR request is not mandatory.

✅ Use the dk::usbd::ep0stall() helper function to stall endpoint 0 in nrf52-code/usb-app/src/bin/usb-4.rs if an invalid request is received.

Updating Device State

At some point during the initialization you'll receive a SET_ADDRESS request that will move the device from the Default state to the Address state. If you are working on Linux, you'll also receive a SET_CONFIGURATION request that will move the device from the Address state to the Configured state. Additionally, some requests are only valid in certain states; for example, SET_CONFIGURATION is not permitted while the device is in the Default state. For this reason usb-4.rs will need to keep track of the device's current state.

The device state should be tracked using a resource so that it's preserved across multiple executions of the USBD event handler. The usb2 crate has a State enum with the 3 possible USB states: Default, Address and Configured. You can use that enum or roll your own.

✅ Start tracking and updating the device state to move your request handling forward.

Update the handling of the USBRESET event

Instead of ignoring it, we now want it to change the state of the USB device. See section 9.1 USB Device States of the USB specification for details on what to do. Note that fn on_event() was given state: &mut State.

Update the handling of SET_ADDRESS requests

This request should come right after the GET_DESCRIPTOR Device request if you're using Linux, or be the first request sent to the device by macOS.

A SET_ADDRESS request has the following fields as defined by Section 9.4.6 Set Address of the USB spec:

  • bmrequesttype is 0b00000000
  • brequest is 5 (i.e. the SET_ADDRESS Request Code, see table 9-4 in the USB spec)
  • wValue contains the address to be used for all subsequent accesses
  • wIndex and wLength are 0, there is no wData

It should be handled as follows:

  • If the device is in the Default state, then

    • if the requested address stored in wValue was 0 (None in the usb API) then the device should stay in the Default state
    • otherwise the device should move to the Address state
  • If the device is in the Address state, then

    • if the requested address stored in wValue was 0 (None in the usb API) then the device should return to the Default state
    • otherwise the device should remain in the Address state but start using the new address
  • If the device is in the Configured state this request results in "unspecified" behavior according to the USB specification. You should stall the endpoint in this case.
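The rules above form a small state machine. Here is a minimal sketch of it, assuming a hand-rolled State enum (the one in the usb2 crate may be shaped differently) and a hypothetical set_address function where Err(()) means "stall the endpoint":

```rust
use core::num::NonZeroU8;

/// Hand-rolled device state for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Default,
    Address,
    Configured,
}

/// Returns the state the device should move to after a SET_ADDRESS request.
/// `address` is `None` when wValue was 0.
fn set_address(state: State, address: Option<NonZeroU8>) -> Result<State, ()> {
    match (state, address) {
        // address 0 while in Default: nothing changes
        (State::Default, None) => Ok(State::Default),
        // a real address moves the device to Address
        (State::Default, Some(_)) => Ok(State::Address),
        // address 0 while in Address: back to Default
        (State::Address, None) => Ok(State::Default),
        // a new address: stay in Address (the peripheral switches addresses)
        (State::Address, Some(_)) => Ok(State::Address),
        // unspecified by the USB specification: stall
        (State::Configured, _) => Err(()),
    }
}
```

Writing the transitions as one match on (state, address) lets the compiler check that every combination is handled.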

Note: According to the USB specification the device needs to respond to this request with a STATUS stage -- the DATA stage is omitted. The nRF52840 USBD peripheral will automatically issue the STATUS stage and switch to listening to the requested address (see the USBADDR register) so no interaction with the USBD peripheral is required for this request.

For more details, read the introduction of section 6.35.9 of the nRF52840 Product Specification 1.0.

Implement the handling of GET_DESCRIPTOR Configuration requests

So how should we respond to the host when it wants our Configuration Descriptor? As our only goal is to be enumerated we'll respond with the minimum amount of information possible.

✅ First, check the request

Configuration descriptors are requested by index, not by their configuration value. Since we reported a single configuration in our device descriptor the index in the request must be zero. Any other value should be rejected by stalling the endpoint (see section Dealing with unknown requests: Stalling the endpoint for more information).

✅ Next, create and send a response

The response should consist of the configuration descriptor, followed by interface descriptors and then by (optional) endpoint descriptors. We'll include a minimal single interface descriptor in the response. Since endpoints are optional we will include none.

The configuration descriptor and one interface descriptor will be concatenated in a single packet so this response should be completed in a single DATA stage.

The configuration descriptor in the response should contain these fields:

  • bLength = 9, the size of this descriptor (must always be this value)
  • bDescriptorType = 2, configuration descriptor type (must always be this value)
  • wTotalLength = 18 = one configuration descriptor (9 bytes) and one interface descriptor (9 bytes)
  • bNumInterfaces = 1, a single interface (the minimum value)
  • bConfigurationValue = 42, any non-zero value will do
  • iConfiguration = 0, string descriptors are not supported
  • bmAttributes { self_powered: true, remote_wakeup: false }, self-powered due to the debugger connection
  • bMaxPower = 250 (500 mA), this is the maximum allowed value but any (non-zero?) value should do

The interface descriptor in the response should contain these fields:

  • bLength = 9, the size of this descriptor (must always be this value)
  • bDescriptorType = 4, interface descriptor type (must always be this value)
  • bInterfaceNumber = 0, this is the first, and only, interface
  • bAlternateSetting = 0, alternate settings are not supported
  • bNumEndpoints = 0, no endpoint associated to this interface (other than the control endpoint)
  • bInterfaceClass = bInterfaceSubClass = bInterfaceProtocol = 0, does not adhere to any specified USB interface
  • iInterface = 0, string descriptors are not supported

Again, we strongly recommend that you use the usb2::configuration::Descriptor and usb2::interface::Descriptor abstractions here. Each descriptor instance can be transformed into its byte representation using the bytes method -- the method returns an array. To concatenate both arrays you can use a stack-allocated heapless::Vec buffer. If you haven't used the heapless crate before you can find example usage in the src/bin/vec.rs file.

NOTE: the usb2::configuration::Descriptor and usb2::interface::Descriptor structs do not have bLength and bDescriptorType fields. Those fields have fixed values according to the USB spec so you cannot modify or set them. When bytes() is called on the Descriptor value, the returned array (which contains a binary representation of the descriptor, packed according to the USB 2.0 standard) will contain those fields set to their correct value.
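To show what the concatenated response looks like on the wire, here is a hand-packed sketch of the 18 bytes. It is an illustration only: it uses a plain fixed-size array instead of heapless::Vec and does not use the usb2 abstractions, and the function name is hypothetical. In bmAttributes, bit 7 is reserved and must be set, bit 6 means self-powered and bit 5 means remote wakeup.

```rust
/// Hand-packed GET_DESCRIPTOR Configuration response (illustrative sketch).
fn configuration_response() -> [u8; 18] {
    let total_length = 18u16.to_le_bytes();

    let configuration: [u8; 9] = [
        9,               // bLength
        2,               // bDescriptorType (CONFIGURATION)
        total_length[0], // wTotalLength, low byte
        total_length[1], // wTotalLength, high byte
        1,               // bNumInterfaces
        42,              // bConfigurationValue
        0,               // iConfiguration (no string descriptor)
        0b1100_0000,     // bmAttributes: reserved bit 7 + self-powered
        250,             // bMaxPower, in units of 2 mA (500 mA)
    ];

    let interface: [u8; 9] = [
        9, // bLength
        4, // bDescriptorType (INTERFACE)
        0, // bInterfaceNumber
        0, // bAlternateSetting
        0, // bNumEndpoints
        0, // bInterfaceClass
        0, // bInterfaceSubClass
        0, // bInterfaceProtocol
        0, // iInterface (no string descriptor)
    ];

    // configuration descriptor first, interface descriptor right after it
    let mut response = [0u8; 18];
    response[..9].copy_from_slice(&configuration);
    response[9..].copy_from_slice(&interface);
    response
}
```

Note that the host typically asks for the first 9 bytes, reads wTotalLength from them, and then asks again for all 18 bytes, so the same wlength clamping as for the device descriptor applies here.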

Getting it Configured

At this stage the device will be in the Address state. It has been identified and enumerated by the host but cannot yet be used by host applications. The device must first move to the Configured state before the host can start, for example, HID communication or send non-standard requests over the control endpoint.

There is no template for this step - start with your solution to USB-4.

Windows will enumerate the device but not automatically configure it after enumeration. Here's what you should do to force the host to configure the device.

Linux and macOS

Nothing extra needs to be done if you're working on a Linux or macOS host. The host will automatically send a SET_CONFIGURATION request so proceed to the SET_CONFIGURATION section to see how to handle the request.

Windows

After getting the device enumerated and into the idle state, open the Zadig tool (covered in the setup instructions; see the top README) and use it to associate the nRF52840 USB device to the WinUSB driver. The nRF52840 will appear as an "unknown device" with a VID and PID that match the ones defined in the consts crate.

Now modify the usb-descriptors command within the xtask package to "open" the device -- this operation is commented out in the source code. With this modification usb-descriptors will cause Windows to send a SET_CONFIGURATION request to configure the device. You'll need to run cargo xtask usb-descriptors to test out the correct handling of the SET_CONFIGURATION request.

SET_CONFIGURATION

The SET_CONFIGURATION request is sent by the host to configure the device. Its fields, as defined in Section 9.4.7 of the USB specification, are:

  • bmrequesttype is 0b00000000
  • brequest is 9 (i.e. the SET_CONFIGURATION Request Code, see table 9-4 in the USB spec)
  • wValue contains the requested configuration value
  • wIndex and wLength are 0, there is no wData

✅ To handle a SET_CONFIGURATION, do the following:

  • If the device is in the Default state, you should stall the endpoint because the operation is not permitted in that state.

  • If the device is in the Address state, then

    • if wValue is 0 (None in the usb API) then stay in the Address state
    • if wValue is non-zero and valid (was previously reported in a configuration descriptor) then move to the Configured state
    • if wValue is not valid then stall the endpoint
  • If the device is in the Configured state, then read the requested configuration value from wValue

    • if wValue is 0 (None in the usb API) then return to the Address state
    • if wValue is non-zero and valid (was previously reported in a configuration descriptor) then move to the Configured state with the new configuration value
    • if wValue is not valid then stall the endpoint
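These rules can be sketched as a pure function, separate from any peripheral access. The State enum below is hand-rolled for this sketch (it is not the usb2 one), CONFIG_VALUE is the bConfigurationValue we reported in the configuration descriptor, and Err(()) again stands for "stall the endpoint". The function name is hypothetical.

```rust
use core::num::NonZeroU8;

/// The only configuration value this device reported (bConfigurationValue).
const CONFIG_VALUE: u8 = 42;

/// Hand-rolled device state for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Default,
    Address,
    Configured { value: NonZeroU8 },
}

/// Returns the state to move to after a SET_CONFIGURATION request.
/// `value` is `None` when wValue was 0.
fn set_configuration(state: State, value: Option<NonZeroU8>) -> Result<State, ()> {
    match state {
        // not permitted in the Default state
        State::Default => Err(()),
        // Address and Configured follow the same rules
        State::Address | State::Configured { .. } => match value {
            None => Ok(State::Address),
            Some(v) if v.get() == CONFIG_VALUE => Ok(State::Configured { value: v }),
            // a value we never reported in a configuration descriptor
            Some(_) => Err(()),
        },
    }
}
```

Keeping the decision logic free of register accesses means it can be unit tested on the host, in the same way as the parser in usb-lib.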

In all the cases where you did not stall the endpoint (by returning Err) you'll need to acknowledge the request by starting a STATUS stage.

✅ This is done by writing 1 to the TASKS_EP0STATUS register.

NOTE: On Windows, you may get a GET_STATUS request before the SET_CONFIGURATION request and although you should respond to it, stalling the GET_STATUS request seems sufficient to get the device to the Configured state.

Expected output

✅ Run the program and check the log output.

Once you are correctly handling the SET_CONFIGURATION request you should get logs like these:

INFO:usb_5 -- USB: UsbReset @ 397.15576ms
INFO:usb_5 -- USB reset condition detected
INFO:usb_5 -- USB: UsbEp0Setup @ 470.00122ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: Device, length: 64 }
INFO:dk::usbd -- EP0IN: start 18B transfer
INFO:usb_5 -- USB: UsbEp0DataDone @ 470.306395ms
INFO:usb_5 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_5 -- USB: UsbReset @ 520.721433ms
INFO:usb_5 -- USB reset condition detected
INFO:usb_5 -- USB: UsbEp0Setup @ 593.292235ms
INFO:usb_5 -- EP0: SetAddress { address: Some(21) }
INFO:usb_5 -- USB: UsbEp0Setup @ 609.954832ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: Device, length: 18 }
INFO:dk::usbd -- EP0IN: start 18B transfer
INFO:usb_5 -- USB: UsbEp0DataDone @ 610.260008ms
INFO:usb_5 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_5 -- USB: UsbEp0Setup @ 610.443113ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: DeviceQualifier, length: 10 }
WARN:usb_5 -- EP0IN: stalled
INFO:usb_5 -- USB: UsbEp0Setup @ 610.809325ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: DeviceQualifier, length: 10 }
WARN:usb_5 -- EP0IN: stalled
INFO:usb_5 -- USB: UsbEp0Setup @ 611.175535ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: DeviceQualifier, length: 10 }
WARN:usb_5 -- EP0IN: stalled
INFO:usb_5 -- USB: UsbEp0Setup @ 611.511228ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 9 }
INFO:dk::usbd -- EP0IN: start 9B transfer
INFO:usb_5 -- USB: UsbEp0DataDone @ 611.846922ms
INFO:usb_5 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_5 -- USB: UsbEp0Setup @ 612.030027ms
INFO:usb_5 -- EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 18 }
INFO:dk::usbd -- EP0IN: start 18B transfer
INFO:usb_5 -- USB: UsbEp0DataDone @ 612.365721ms
INFO:usb_5 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_5 -- USB: UsbEp0Setup @ 612.640378ms
INFO:usb_5 -- EP0: SetConfiguration { value: Some(42) }
INFO:usb_5 -- entering the configured state

These logs are from a Linux host. You can find traces for other OSes in these files (they are in the nrf52-code/usb-app-solutions/traces folder):

  • linux-configured.txt (same logs as the ones shown above)
  • win-configured.txt, this file only contains the logs produced by running cargo xtask usb-descriptors
  • macos-configured.txt

You can find a solution to this part of the exercise in nrf52-code/usb-app-solutions/src/bin/usb-5.rs.

Idle State

Once you have handled all the previously covered requests the device should be enumerated and remain idle, waiting for a new host request. Your logs may look like this:

INFO:usb_4 -- USB: UsbReset @ 318.66455ms
INFO:usb_4 -- USB reset condition detected
INFO:usb_4 -- USB: UsbEp0Setup @ 391.418456ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: Device, length: 64 }
INFO:dk::usbd -- EP0IN: start 18B transfer
INFO:usb_4 -- USB: UsbEp0DataDone @ 391.723632ms
INFO:usb_4 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_4 -- USB: UsbReset @ 442.016601ms
INFO:usb_4 -- USB reset condition detected
INFO:usb_4 -- USB: UsbEp0Setup @ 514.709471ms
INFO:usb_4 -- EP0: SetAddress { address: Some(17) }
INFO:usb_4 -- USB: UsbEp0Setup @ 531.37207ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: Device, length: 18 }
INFO:dk::usbd -- EP0IN: start 18B transfer
INFO:usb_4 -- USB: UsbEp0DataDone @ 531.646727ms
INFO:usb_4 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_4 -- USB: UsbEp0Setup @ 531.829832ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: DeviceQualifier, length: 10 }
ERROR:usb_4 -- EP0IN: unexpected request; stalling the endpoint
INFO:usb_4 -- USB: UsbEp0Setup @ 532.226562ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: DeviceQualifier, length: 10 }
ERROR:usb_4 -- EP0IN: unexpected request; stalling the endpoint
INFO:usb_4 -- USB: UsbEp0Setup @ 532.592772ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: DeviceQualifier, length: 10 }
ERROR:usb_4 -- EP0IN: unexpected request; stalling the endpoint
INFO:usb_4 -- USB: UsbEp0Setup @ 533.020018ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 9 }
INFO:dk::usbd -- EP0IN: start 9B transfer
INFO:usb_4 -- USB: UsbEp0DataDone @ 533.386228ms
INFO:usb_4 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_4 -- USB: UsbEp0Setup @ 533.569335ms
INFO:usb_4 -- EP0: GetDescriptor { descriptor: Configuration { index: 0 }, length: 18 }
INFO:dk::usbd -- EP0IN: start 18B transfer
INFO:usb_4 -- USB: UsbEp0DataDone @ 533.935546ms
INFO:usb_4 -- EP0IN: transfer complete
INFO:dk::usbd -- EP0IN: transfer done
INFO:usb_4 -- USB: UsbEp0Setup @ 534.118651ms
INFO:usb_4 -- EP0: SetConfiguration { value: Some(42) }
ERROR:usb_4 -- EP0IN: unexpected request; stalling the endpoint

Note that these logs are from a Linux host where a SET_CONFIGURATION request is sent after the SET_ADDRESS request. On other OSes you may not get that request before the bus goes idle. Also note that there are some GET_DESCRIPTOR DeviceQualifier requests in this case; you do not need to parse them in the usb crate as they'll be rejected (stalled) anyway.

You can find traces for other OSes in these files (they are in the nrf52-code/usb-app-solutions/traces folder):

  • linux-enumeration.txt (same logs as the ones shown above)
  • macos-enumeration.txt
  • win-enumeration.txt

✅ Double check that the enumeration works by running cargo xtask usb-list while usb-4.rs is running.

Bus 001 Device 013: ID 1366:1015 <- J-Link on the nRF52840 Development Kit
(..)
Bus 001 Device 016: ID 1209:0717 <- nRF52840 on the nRF52840 Development Kit

Both the J-Link and the device implemented by your firmware should appear in the list.

You can find a working solution up to this point in nrf52-code/usb-app-solutions/src/bin/usb-4.rs. Note that the solution uses the usb2 crate to parse SETUP packets and that crate supports parsing all standard requests.

Next Steps

String descriptors

If you'd like to continue working on your workshop project, we recommend adding String Descriptors support to the USB firmware. To do this, follow these steps:

✅ Read through section 9.6.7 of the USB spec, which covers string descriptors.

✅ Change your configuration descriptor to use string descriptors. You'll want to change the iConfiguration field to a non-zero value. Note that this change will likely break enumeration.

✅ Re-run the program to see what new control requests you get from the host.

✅ Update the usb parser to handle the new requests.

✅ Extend the logic of ep0setup to handle these new requests.

Eventually, you'll need to send a string descriptor to the host. Note here that Rust string literals are UTF-8 encoded but the USB protocol uses UTF-16 strings. You'll need to convert between these formats.
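As a starting point, here is a minimal host-side sketch of that conversion. It uses str::encode_utf16 from the standard library and a std Vec for brevity; on the firmware side you would likely want a fixed-size buffer such as heapless::Vec instead. The function name is hypothetical. A string descriptor starts with bLength and bDescriptorType = 3 (STRING), followed by the UTF-16 code units in little-endian order.

```rust
/// Builds a USB string descriptor from a Rust string (illustrative sketch).
/// Returns `None` if the result would not fit the one-byte bLength field.
fn string_descriptor(s: &str) -> Option<Vec<u8>> {
    // bLength is filled in at the end; bDescriptorType = 3 (STRING)
    let mut bytes = vec![0u8, 3];

    // Rust strings are UTF-8; re-encode as UTF-16, little-endian
    for unit in s.encode_utf16() {
        bytes.extend_from_slice(&unit.to_le_bytes());
    }

    if bytes.len() > 255 {
        return None;
    }
    bytes[0] = bytes.len() as u8;
    Some(bytes)
}
```

For example, "Hi" becomes the six bytes 6, 3, 0x48, 0x00, 0x69, 0x00: two header bytes followed by two bytes per UTF-16 code unit.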

✅ If this works, add strings to other descriptors like the device descriptor e.g. its iProduct field.

✅ To verify that string descriptors are working in a cross-platform way, extend the cargo xtask usb-descriptors program to also print the device's string descriptors. See the read_string_descriptor method but note that this must be called on a "device handle", which is what the commented out open operation does.

Explore more RTIC features

We have covered only a few of the core features of the RTIC framework but the framework has many more features like software tasks, tasks that can be spawned by the software; message passing between tasks; and task scheduling, which allows the creation of periodic tasks. We encourage you to check the RTIC book which describes the features we haven't covered here.

usb-device

usb-device is a library for building USB devices. It has been built using traits (the pillar of Rust's generics) such that USB interfaces like HID and TTY ACM can be implemented in a device agnostic manner. The device details then are limited to a trait implementation. There's an implementation of the usb-device trait for the nRF52840 device in the nrf-hal and there are many usb-device "classes" like HID and TTY ACM that can be used with that trait implementation. We encourage you to check out that implementation, test it on different OSes and report issues, or contribute fixes, to the usb-device ecosystem.

Extra Info

The following chapters contain extra detail about DMA on the nRF52, the USB stack, and how we protect against stack overflows. You do not require them to complete the exercises, but you may find them interesting reading.

The USB Specification

The USB 2.0 specification is available free of charge from https://www.usb.org/document-library/usb-20-specification. On the right, you will see a link like usb_20_yyyymmdd.zip. Download and unpack the zip file, and the core specification can be found within as a file called usb_20.pdf (alongside a bunch of errata and additional specifications). Note that the date on the cover page is April 27, 2000 - and actually, the portions of the specification we are implementing are unchanged from the earlier USB 1.1 specification.

Direct Memory Access

🔎 This section covers the implementation of the Ep0In abstraction; it's not necessary to fully understand this section to continue working on the workshop.

Let's zoom into the Ep0In abstraction we used in usb-3.rs.

✅ Open the file. Use VSCode's "Go to Definition" to see the implementation of the Ep0In.start() method.

This is how data transfers over USB work on the nRF52840: for each endpoint there's a buffer in the USBD peripheral. Data sent by the host over USB to a particular endpoint will be stored in the corresponding endpoint buffer. Likewise, data stored in one of these endpoint buffers can be sent to the host over USB from that particular endpoint. These buffers are not directly accessible by the CPU but data stored in RAM can be copied into these buffers; likewise, the contents of an endpoint buffer can be copied into RAM. A second peripheral, the Direct Memory Access (DMA) peripheral, can copy data between these endpoint buffers and RAM. The process of copying data in either direction is referred to as "a DMA transfer".

What the start method does is start a DMA transfer to copy bytes into endpoint buffer IN 0; this makes the USBD peripheral send data to the host from endpoint IN 0. The data (bytes), which may be located in Flash or RAM, is first copied into an internal buffer, allocated in RAM, and then the DMA is configured to move the data from this internal buffer to endpoint buffer IN 0, which is part of the USBD peripheral.

The signature of the start() method does not ensure that:

  • bytes won't be deallocated before the DMA transfer is over (e.g. bytes could be pointing into the stack), or that
  • bytes won't be modified right after the DMA transfer starts (this would be a data race in the general case).

For these two safety reasons the API is implemented using an internal buffer called buffer. The internal buffer has a 'static lifetime so it's guaranteed to never be deallocated -- this prevents the first issue. The busy flag prevents any further modification to the internal buffer -- from the public API -- while the DMA transfer is in progress; this prevents the second issue.

Apart from thinking about lifetimes and explicit data races in the surface API one must internally use memory fences to prevent reordering of memory operations (e.g. by the compiler), which can also cause data races. DMA transfers run in parallel to the instructions performed by the processor and are "invisible" to the compiler.

In the implementation of the start method, data is copied from bytes to the internal buffer (a memcpy() operation) and then the DMA transfer is started with a write to the TASKS_STARTEPIN0 register. The compiler sees the start of the DMA transfer (register write) as an unrelated memory operation so it may move the memcpy() to after the DMA transfer has started. This reordering results in a data race: the processor modifies the internal buffer while the DMA is reading data out from it.

To avoid this reordering a memory fence, dma_start(), is used. The fence pairs with the store operation (register write) that starts the DMA transfer and prevents the previous memcpy(), and any other memory operation, from being moved to after the store operation.

Another memory fence, dma_end(), is needed at the end of the DMA transfer. In the general case, this prevents instruction reordering that would result in the processor accessing the internal buffer before the DMA transfer has finished. This is particularly problematic with DMA transfers that modify a region of memory which the processor intends to read after the transfer.
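The ordering described above can be sketched on a host machine as follows. This is a simplified model and not the workshop's implementation: the Ep0In struct shown here, its task_start field (a stand-in for the TASKS_STARTEPIN0 register), the 64-byte buffer size and the choice of atomic fences are all assumptions made for illustration. The point is only where the two fences sit relative to the copy and the register write.

```rust
use core::ptr;
use core::sync::atomic::{fence, Ordering};

/// Fence placed *before* the register write that starts a transfer:
/// earlier memory operations (the copy) may not be moved past it.
fn dma_start() {
    fence(Ordering::Release);
}

/// Fence placed *after* the transfer is known to be complete:
/// later memory operations may not be moved before it.
fn dma_end() {
    fence(Ordering::Acquire);
}

/// Hypothetical host-side model of the endpoint 0 IN abstraction.
struct Ep0In {
    buffer: [u8; 64], // stand-in for the `'static` internal buffer
    busy: bool,       // set while a transfer is in flight
    task_start: u32,  // stand-in for the TASKS_STARTEPIN0 register
}

impl Ep0In {
    fn new() -> Self {
        Ep0In { buffer: [0; 64], busy: false, task_start: 0 }
    }

    fn start(&mut self, bytes: &[u8]) {
        assert!(!self.busy, "a transfer is already in progress");
        assert!(bytes.len() <= self.buffer.len());

        // 1. copy the caller's data into the internal buffer
        self.buffer[..bytes.len()].copy_from_slice(bytes);
        // 2. keep the copy from being reordered after the "register" write
        dma_start();
        // 3. volatile write that, on real hardware, would start the DMA
        unsafe { ptr::write_volatile(&mut self.task_start, 1) };
        self.busy = true;
    }

    fn end(&mut self) {
        assert!(self.busy, "no transfer in progress");
        // on real hardware this point is reached only after the
        // peripheral has signalled that the transfer is done
        dma_end();
        self.busy = false;
    }
}
```

Calling start(b"hi") followed by end() on a fresh Ep0In leaves the two bytes in the internal buffer and clears the busy flag again.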

Note: Not relevant to the DMA operation but relevant to the USB specification, the start() method sets a shortcut in the USBD peripheral to issue a STATUS stage right after the DATA stage is finished. Thanks to this it is not necessary to manually start a STATUS stage after calling the end method.

SET_CONFIGURATION (Linux & macOS)

On Linux and macOS, the host will likely send a SET_CONFIGURATION request right after enumeration to put the device in the Configured state. For now you can stall the request. It is not necessary at this stage because the device has already been enumerated.

Interface

We have covered configurations and endpoints but what is an interface?

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the 'interface 0' rectangle and the 'bNumEndpoints' label are highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangle there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

An interface is closest to a USB device's function. For example, a USB mouse may expose a single HID (Human Interface Device) interface to report user input to the host. USB devices can expose multiple interfaces within a configuration. For example, the nRF52840 Dongle could expose both a CDC ACM interface (AKA virtual serial port) and a HID interface; the first interface could be used for (defmt::println!-style) logs; and the second one could provide a RPC (Remote Procedure Call) interface to the host for controlling the nRF52840's radio.

An interface is made up of one or more endpoints. To give an example, a HID interface can use two (interrupt) endpoints, one IN and one OUT, for bidirectional communication with the host. A single endpoint cannot be used by more than one interface with the exception of the special "endpoint 0", which can be (and usually is) shared by all interfaces.

For detailed information about interfaces check section 9.6.5, Interface, of the USB specification.

Interface descriptor

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the 'interface 0' rectangle and the 'bNumEndpoints' label are highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangle there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

The interface descriptor describes one of the device interfaces to the host. The descriptor contains the following information about a particular interface:

  • its interface number -- this is a zero-based index
  • its alternate setting -- this allows configuring the interface
  • its number of endpoints
  • class, subclass and protocol -- these define the interface (HID, or TTY ACM, or DFU, etc.) according to the USB specification

The number of endpoints can be zero, and endpoint zero must not be counted when counting endpoints.

The full format of the interface descriptor is specified in section 9.6.5, Interface, of the USB specification.
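As a rough illustration of that format, the sketch below serializes the nine single-byte fields of an interface descriptor in the order given in section 9.6.5. The struct, its field names and the to_bytes method are hypothetical and need not match the types used in the workshop code.

```rust
/// Hypothetical representation of a USB interface descriptor.
/// Field order follows section 9.6.5 of the USB 2.0 specification.
#[derive(Clone, Copy, Debug)]
struct InterfaceDescriptor {
    interface_number: u8,   // bInterfaceNumber, zero-based
    alternate_setting: u8,  // bAlternateSetting
    num_endpoints: u8,      // bNumEndpoints, excluding endpoint 0
    interface_class: u8,    // bInterfaceClass
    interface_subclass: u8, // bInterfaceSubClass
    interface_protocol: u8, // bInterfaceProtocol
    interface_string: u8,   // iInterface, 0 means "no string"
}

impl InterfaceDescriptor {
    /// bLength: the descriptor is always 9 bytes long.
    const SIZE: u8 = 9;
    /// bDescriptorType: 4 identifies an interface descriptor.
    const DESCRIPTOR_TYPE: u8 = 4;

    /// Returns the wire representation of the descriptor.
    fn to_bytes(&self) -> [u8; 9] {
        [
            Self::SIZE,
            Self::DESCRIPTOR_TYPE,
            self.interface_number,
            self.alternate_setting,
            self.num_endpoints,
            self.interface_class,
            self.interface_subclass,
            self.interface_protocol,
            self.interface_string,
        ]
    }
}
```

A descriptor with every field set to zero serializes to the bytes 9, 4, 0, 0, 0, 0, 0, 0, 0.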

Endpoint descriptor

USB hierarchy diagram showing the relationship between configurations, interfaces and endpoints. The diagram consists of nested rectangles. In this version of the diagram the endpoint rectangles inside the 'interface 1' rectangle are highlighted in blue. The outermost rectangle is labeled 'device' and represents the complete USB device. Inside the 'device' rectangle there is one rectangle labeled 'configuration 1'; this rectangle has a 'parallel lines' symbol that indicates there may be more than one configuration instance; the symbol is labeled 'bNumConfigurations=1' indicating that this device has only one configuration. Inside the 'configuration 1' rectangle there are two rectangles labeled 'control endpoint' and 'interface 0'. Inside the 'control endpoint' rectangle there are two rectangles labeled 'endpoint 0 IN' and 'endpoint 0 OUT. The 'interface 0' rectangle has a 'parallel lines' symbol that indicates there may be more than one interface instance; the symbol is labeled 'bNumInterfaces=1' indicating that this configuration has only one interface. Inside the 'interface 0' rectangle there are three rectangles labeled 'endpoint 1 IN', 'endpoint 2 IN' and 'endpoint 2 OUT'. Between these three rectangle there is a label that says 'bNumEndpoints=3'; it indicates that this interface has only three endpoints.

We will not need to deal with endpoint descriptors in this workshop but they are specified in section 9.6.6, Endpoint, of the USB specification.

Inspecting the Descriptors

There's a tool built into our cargo xtask called usb-descriptors. It prints all the descriptors reported by your application.

✅ Run this tool

Your output should look like this:

$ cargo xtask usb-descriptors
DeviceDescriptor {
    bLength: 18,
    bDescriptorType: 1,
    bcdUSB: 512,
    bDeviceClass: 0,
    bDeviceSubClass: 0,
    bDeviceProtocol: 0,
    bMaxPacketSize: 64,
    idVendor: 8224,
    idProduct: 1815,
    bcdDevice: 256,
    iManufacturer: 0,
    iProduct: 0,
    iSerialNumber: 0,
    bNumConfigurations: 1,
}
address: 22
config0: ConfigDescriptor {
    bLength: 9,
    bDescriptorType: 2,
    wTotalLength: 18,
    bNumInterfaces: 1,
    bConfigurationValue: 42,
    iConfiguration: 0,
    bmAttributes: 192,
    bMaxPower: 250,
    extra: None,
}
iface0: [
    InterfaceDescriptor {
        bLength: 9,
        bDescriptorType: 4,
        bInterfaceNumber: 0,
        bAlternateSetting: 0,
        bNumEndpoints: 0,
        bInterfaceClass: 0,
        bInterfaceSubClass: 0,
        bInterfaceProtocol: 0,
        iInterface: 0,
    },
]

The output above corresponds to the descriptor values we suggested. If you used different values, e.g. for bMaxPower, you'll see slightly different output.
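The tool prints every field in decimal, whereas the USB specification usually writes them in hexadecimal. The helper functions below are a small sketch (their names are made up for this example) that converts a few of the values shown above: bcdUSB 512 is 0x0200 (USB 2.0), idVendor 8224 is 0x2020, idProduct 1815 is 0x0717, bMaxPower is counted in units of 2 mA, and bit 6 of bmAttributes marks a self-powered configuration.

```rust
/// Formats a 16-bit descriptor field as a four-digit hexadecimal number.
fn hex16(value: u16) -> String {
    format!("{:#06x}", value)
}

/// bMaxPower is expressed in units of 2 mA.
fn max_power_milliamps(b_max_power: u8) -> u16 {
    u16::from(b_max_power) * 2
}

/// Bit 6 of bmAttributes is set for a self-powered configuration.
fn is_self_powered(bm_attributes: u8) -> bool {
    bm_attributes & (1 << 6) != 0
}
```

With the values from the listing, max_power_milliamps(250) gives 500 and is_self_powered(192) is true.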

Stack Overflow Protection

The usb-app crate in which we developed our advanced workshop solutions (i.e. nrf52-code/usb-app) uses our open-source flip-link tool for zero-cost stack overflow protection.

This means that your application will warn you by crashing if you accidentally overreach the boundaries of your application's stack instead of running into undefined behavior and behaving erratically in irreproducible ways. This memory protection mechanism comes at no additional computational or memory-usage cost.

🔎 For a detailed description of how flip-link works, and of stack overflows in bare-metal Rust in general, please refer to the flip-link README.

You can see this in action in the stack_overflow.rs file that can be found in nrf52-code/usb-app/src/bin/stack_overflow.rs:

#![no_main]
#![no_std]

use cortex_m::asm;
use cortex_m_rt::entry;
// this imports `src/lib.rs` to retrieve our global logger + panicking-behavior
use usb_app as _;

#[entry]
fn main() -> ! {
    // board initialization
    dk::init().unwrap();

    fib(100);

    loop {
        asm::bkpt();
    }
}

#[inline(never)]
fn fib(n: u32) -> u32 {
    // allocate and initialize one kilobyte of stack memory to provoke stack overflow
    let use_stack = [0xAA; 1024];
    defmt::println!("allocating [{}; 1024]; round #{}", use_stack[1023], n);

    if n < 2 {
        1
    } else {
        fib(n - 1) + fib(n - 2) // recursion
    }
}

The fib() function allocates data on the stack until the stack boundaries are reached.

✅ Run stack_overflow.rs

You should see output similar to this (the program output between the horizontal bars might be missing):

(HOST) INFO  flashing program (35.25 KiB)
(HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
INFO:stack_overflow -- provoking stack overflow...
INFO:stack_overflow -- address of current `use_stack` at recursion depth 0: 0x2003aec0
INFO:stack_overflow -- address of current `use_stack` at recursion depth 1: 0x20039e50
(...)
INFO:stack_overflow -- address of current `use_stack` at recursion depth 10: 0x20030a60
INFO:stack_overflow -- address of current `use_stack` at recursion
────────────────────────────────────────────────────────────────────────────────
stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
(HOST) WARN  call stack was corrupted; unwinding could not be completed
(HOST) ERROR the program has overflowed its stack

❗️ flip-link is a third-party tool, so make sure you've installed it through cargo install flip-link

To see how we've activated flip-link, take a look at nrf52-code/usb-app/.cargo/config.toml:

rustflags = [
  "-C", "linker=flip-link", # adds stack overflow protection
  #
]

There, we've configured flip-link as the linker to be used for all ARM targets. If you'd like to use flip-link in your own projects, this is all you need to add!

🔎 Note: if you try to run stack_overflow.rs without flip-link enabled, you might see varying behavior depending on the rustc version you're using, timing and pure chance. This is because undefined behavior triggered by the program may change between rustc releases.

Working without the Standard Library

This section has some exercises which introduce ways to move away from libstd and write applications which only use libcore (or liballoc). This is important when writing safety-critical systems.

Replacing println!

In this exercise, we will write a basic "Hello, World!" application, but without using println!. This will introduce some of the concepts we will need for writing safety-critical Rust code that runs on certified OSes like QNX, where the Rust Standard Library is not available.

However, to keep things easy to deploy, you can use your normal Windows, macOS or Linux system to complete this exercise.

Task 1 - Make a program

Use cargo new to make a package containing the default binary crate - a Hello, World example that uses println!

Solution
$ cargo new testbin
     Created binary (application) `testbin` package
$ cd testbin
$ cargo run
   Compiling testbin v0.1.0 (/Users/jonathan/Documents/clients/training/oxidze-2024/testbin)
    Finished dev [unoptimized + debuginfo] target(s) in 0.32s
     Running `target/debug/testbin`
Hello, world!

Task 2 - Lock the Standard Out

The println! expands to some code which:

  1. Grabs a lock on standard out
  2. Formats the arguments into the locked standard out

We can do these two steps manually, using std::io::stdout() and the writeln! macro (which is actually from libcore).

Replace the call to println! with a call to writeln! that uses a locked standard out. Work out how best to handle the fact that writeln! returns an error. Think about why println! doesn't return an error. How does it handle a possible failure?

If you get an error about the write_fmt method not being available, make sure you have brought the std::io::Write trait into scope. Recall that trait methods are not available on types unless the trait is in scope - otherwise how would the compiler know which traits to look for the method in? If we were on a no-std system, the same method is available in the core::fmt::Write trait - the writeln! macro is happy with either as long as the method exists.
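To illustrate the last point, here is a minimal sketch of a type that only implements core::fmt::Write and can still be used with writeln!. The FixedBuf type and its 32-byte capacity are invented for this example; nothing here needs the standard library or an allocator.

```rust
use core::fmt::Write;

/// Hypothetical fixed-capacity text buffer, usable without an allocator.
struct FixedBuf {
    data: [u8; 32],
    len: usize,
}

impl FixedBuf {
    fn new() -> Self {
        FixedBuf { data: [0; 32], len: 0 }
    }

    /// Returns the text written so far.
    fn as_str(&self) -> &str {
        // only whole `&str` values are ever copied in, so this is valid UTF-8
        core::str::from_utf8(&self.data[..self.len]).unwrap_or("")
    }
}

impl Write for FixedBuf {
    /// The one required method; `write_fmt` is provided by the trait.
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        let end = self.len + s.len();
        if end > self.data.len() {
            // not enough room: report the failure to the caller
            return Err(core::fmt::Error);
        }
        self.data[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}
```

With this in scope, writeln!(buf, "Hello, {}!", "World") fills a FixedBuf instead of standard out, and returns an error once the buffer is full.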

Solution
use std::io::Write;

fn main() {
    let mut stdout = std::io::stdout();
    writeln!(stdout, "Hello, World!").expect("writing to stdout");
}

The writeln! call can fail because it can get an error from the object it is writing to. What if you are writing to a file on disk, and the disk is full? Or the USB Thumb Drive it is on is unplugged? The println! macro knows it only writes to Standard Out, and if that is broken, there isn't much you can do about it (you probably can't even print an error), so it just panics.
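If you would rather hand the error to the caller than panic, one possible variation is sketched below. It takes the lock explicitly with lock() and returns the io::Result. The function names greet and greet_stdout are made up for this example.

```rust
use std::io::{self, Write};

/// Writes the greeting to any `Write` implementation and
/// reports a failure to the caller instead of panicking.
fn greet(out: &mut impl Write) -> io::Result<()> {
    writeln!(out, "Hello, World!")
}

/// Locks standard out once, then writes through the lock.
fn greet_stdout() -> io::Result<()> {
    let mut locked = io::stdout().lock();
    greet(&mut locked)
}
```

Because greet accepts anything that implements Write, the same function can also fill a Vec<u8> or a file, which is handy for testing.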

Task 3 - Call write_fmt

The writeln! call expands to some code which:

  1. Generates a value of type std::fmt::Arguments, using a macro called format_args!.
  2. Passes that to the write_fmt method on whatever we're writing into.

You can do these two steps manually - but that's as far as we can go! The format_args! macro is special, and we are unable to replicate its functions by writing regular Rust code.

Replace the call to writeln! with a call to format_args!, passing the result to the write_fmt method on the locked standard output. Note that Rust won't let you store the result of format_args! in a variable - you need to call it inside the call to write_fmt. Try it for yourself!

Solution
use std::io::Write;

fn main() {
    let mut stdout = std::io::stdout();
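    // write_fmt returns a Result, just like writeln! did; unlike writeln!,
    // it does not append a newline, so we add one ourselves
    stdout.write_fmt(format_args!("Hello, World!\n")).expect("writing to stdout");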
}

Task 4 - Ditch the standard output object

Rather than throw bytes into this mysterious Standard Out object, let's try and talk to our Operating System directly. We're going to do this using the libc crate, which provides raw access to the APIs typically found in most C Standard Libraries.

  • Step 1 - Run cargo add libc to add it as a dependency

  • Step 2 - Store your message in a local variable, as a string slice

    let message = "Hello, World!";
  • Step 3 - Unsafely call the libc::write method, passing:

    • 1 as the file descriptor (the standard output has this value, by default)
    • A pointer to the start of your string slice
    • The length of the string in bytes

You can make a pointer from a slice using the as_ptr() method, but this will give you *const u8 and libc::write might want *const c_void. You can use message.as_ptr() as _ to get Rust to cast the pointer into an automatically determined type (the _ means 'work this out for me').

You might also find the length of the string needs casting from the default usize to whatever libc wants on your platform.

Solution
fn main() {
    let message = "Hello, world";
    unsafe {
        libc::write(1, message.as_ptr() as _, message.len() as _);
    }
}

Bare-Metal Firmware on Cortex-R52 - Preparation

This chapter contains information about the QEMU-based exercises, the required software and an installation guide.

Required Software

QEMU, version 9

Available for Windows, macOS or Linux from https://www.qemu.org/download/

Note that version 8 or lower will not work. It must be version 9 or higher to support the Cortex-R52.

Ensure that once installed you have qemu-system-arm on your path.

Ferrocene or Rust

If you use Ferrocene, you will need pre-rolling-2024-05-21 or newer. A criticalup.toml file is provided, so you can just run criticalup install in the example directory and an appropriate toolchain will be provided.

If you use Rust, you will need a version that supports armv8r-none-eabihf. This should be included in Rust 1.78 or newer, or a nightly from around March 2024 or newer. You will also need to compile the standard library from source - see the README for more details.

Bare-Metal Firmware on Cortex-R52 - Writing a UART Driver

We have supplied a small Rust no-std application, which is designed to run inside a QEMU emulation of an Armv8-R Cortex-R52 system. We build the code using the armv8r-none-eabihf target.

The application lives in ./qemu-code/uart-driver.

The application talks to the outside world through a UART driver. We have provided two - a working one, and a template one that doesn't work which you need to fix.

Task 1 - Get UART TX working

Modify the template driver and complete the missing code sections as commented. You can peek at the complete driver if you really need to!

This will involve reading and writing to the given registers. You have been given the base-address of the UART peripheral as a const generic, and you have been given constants for the offset of each register from the base address (assuming you are working with a *mut u32).

You'll want to write a private method to read/write each register, and use write_volatile and read_volatile to access them.
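The general shape of such helper methods is sketched below. To keep the example runnable on a host machine it differs from the exercise in two labelled ways: the base address is passed in at run time rather than as a const generic, and the two register offsets are placeholders, not the real offsets of the UART in the emulated system.

```rust
use core::ptr;

// Placeholder register offsets, counted in 32-bit words from the base address.
const DATA_OFFSET: usize = 0;
const STATUS_OFFSET: usize = 1;

/// Hypothetical driver whose base address is supplied at run time, so that
/// it can be pointed at ordinary memory for experimentation.
struct Uart {
    base: *mut u32,
}

impl Uart {
    /// # Safety
    /// `base` must point to at least two readable and writable `u32`
    /// values that stay valid for as long as the `Uart` is in use.
    unsafe fn new(base: *mut u32) -> Self {
        Uart { base }
    }

    /// Volatile read of the register at `offset` words from the base.
    fn read_reg(&self, offset: usize) -> u32 {
        // SAFETY: covered by the contract of `new`
        unsafe { ptr::read_volatile(self.base.add(offset)) }
    }

    /// Volatile write of `value` to the register at `offset` words from the base.
    fn write_reg(&mut self, offset: usize, value: u32) {
        // SAFETY: covered by the contract of `new`
        unsafe { ptr::write_volatile(self.base.add(offset), value) }
    }

    /// Puts one byte into the (placeholder) data register.
    fn write_byte(&mut self, byte: u8) {
        self.write_reg(DATA_OFFSET, u32::from(byte));
    }

    /// Returns the raw contents of the (placeholder) status register.
    fn status(&self) -> u32 {
        self.read_reg(STATUS_OFFSET)
    }
}
```

Volatile accesses tell the compiler that every read and write has a side effect, so it will neither remove them nor merge them with neighbouring accesses.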

Task 2 - Get UART RX working

Continue modifying the UART driver so that you can read data. You'll need to enable the RX bit in the configuration register, and add an appropriate method to read a single byte, returning Option<u8>. Now modify the main loop to echo back received characters.

You'll need to look in the Cortex-M SDK UART documentation to see which bit in the status register indicates that the 1-byte long RX FIFO has data in it.

Running the code

You will need QEMU 9 installed and in your $PATH for cargo run to work. This was the first version with Arm Cortex-R52 emulation.

With the template unfinished:

$ cargo run
   Compiling uart-exercise v0.1.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/qemu-code/uart-driver)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
     Running `qemu-system-arm -machine mps3-an536 -cpu cortex-r52 -semihosting -nographic -kernel target/armv8r-none-eabihf/debug/uart-exercise`
PANIC: PanicInfo { payload: Any { .. }, message: Some(I am a panic), location: Location { file: "src/main.rs", line: 43, col: 5 }, can_unwind: true, force_no_backtrace: false }

With the Task 1 completed:

$ cargo run
   Compiling uart-exercise v0.1.0 (/Users/jonathan/Documents/ferrous-systems/rust-exercises/qemu-code/uart-driver)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
     Running `qemu-system-arm -machine mps3-an536 -cpu cortex-r52 -semihosting -nographic -kernel target/armv8r-none-eabihf/debug/uart-exercise`
Hello, this is Rust!
    1.00     2.00     3.00     4.00     5.00     6.00     7.00     8.00     9.00    10.00
    2.00     4.00     6.00     8.00    10.00    12.00    14.00    16.00    18.00    20.00
    3.00     6.00     9.00    12.00    15.00    18.00    21.00    24.00    27.00    30.00
    4.00     8.00    12.00    16.00    20.00    24.00    28.00    32.00    36.00    40.00
    5.00    10.00    15.00    20.00    25.00    30.00    35.00    40.00    45.00    50.00
    6.00    12.00    18.00    24.00    30.00    36.00    42.00    48.00    54.00    60.00
    7.00    14.00    21.00    28.00    35.00    42.00    49.00    56.00    63.00    70.00
    8.00    16.00    24.00    32.00    40.00    48.00    56.00    64.00    72.00    80.00
    9.00    18.00    27.00    36.00    45.00    54.00    63.00    72.00    81.00    90.00
   10.00    20.00    30.00    40.00    50.00    60.00    70.00    80.00    90.00   100.00
PANIC: PanicInfo { payload: Any { .. }, message: Some(I am a panic), location: Location { file: "src/main.rs", line: 43, col: 5 }, can_unwind: true, force_no_backtrace: false }

Interactive TCP Echo Server

In this exercise, we will make a simple TCP "echo" server using APIs in Rust's Standard Library.

Here's what an interaction with it looks like from a client's point of view. You connect to it using nc, for example:

nc localhost 7878

and type in one line of text. As soon as you hit enter, the server sends the line back but keeps the connection open. You can type another line and get it back, and so on.

Here's an example interaction with the server. Notice that after typing a single line the connection is not closed and we receive the line back. All inputs and outputs should be separated by new line characters (\n).

$ nc localhost 7878
hello
> hello
world
> world

(> denotes the text that is sent back to you)

After completing this exercise you are able to

  • open a TCP port and react to TCP clients connecting

  • use I/O traits to read/write from a TCP socket

  • use threads to support multiple connections

Tasks

  1. Create a new binary project tcp-server
  2. Implement a basic TCP server that listens for connections on a given port (you can use 127.0.0.1:7878 or any other port that you like).
  3. Implement a loop that would read data from a TcpStream one line at a time. We assume that lines are separated by a \n character.
  4. Add writing the received line back to the stream. Resolve potential borrow checker issues using standard library APIs.
  5. Use Rust's thread API to add support for multiple connections.

Here's a bit of code to get you started:

use std::{io, net::{TcpListener, TcpStream}};

fn handle_client(mut stream: TcpStream) -> Result<(), io::Error> {
    todo!("read stream line by line, write lines back to the stream");
    // for line in stream {
    //   write line back to the stream
    // }
    Ok(())
}

fn main() -> Result<(), io::Error> {
    let listener = todo!("bind a listener to 127.0.0.1:7878");
    for stream in todo!("accept incoming connections") {
        // todo!("support multiple connections in parallel");
        handle_client(stream)?;
    }
    Ok(())
}

Help

Reading line by line

Rust by Example has a chapter showing examples of reading files line by line that can be adapted to TcpStream, too.

Solving borrow checker issues

At some point you may run into borrow checker issues because you are essentially trying to write into a stream as you read from it.

The solution is to end up with two separate owned variables that perform reading and writing respectively.

There are two general approaches to do so:

  1. Simply clone the stream. TcpStream has a try_clone() method. This will not clone the stream itself: on the Operating System level there will still be a single connection. But from Rust's perspective this underlying OS resource will now be represented by two distinct variables.
  2. Use the fact that Read and Write traits are implemented not only for TcpStream but also for &TcpStream. For example, you can create a pair of BufReader and BufWriter by passing &stream as an argument.
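The second approach could look roughly like the sketch below. The function names are illustrative, and echo_lines is written against the Read and Write traits rather than TcpStream itself, so the same code also works with in-memory buffers. It also shows where a flush() call fits when a BufWriter is used.

```rust
use std::io::{self, BufRead, BufReader, BufWriter, Read, Write};
use std::net::TcpStream;

/// Reads `reader` line by line and writes every line back to `writer`.
fn echo_lines<R: Read, W: Write>(reader: R, writer: W) -> io::Result<()> {
    let reader = BufReader::new(reader);
    let mut writer = BufWriter::new(writer);
    for line in reader.lines() {
        let line = line?; // `lines()` strips the trailing newline
        writeln!(writer, "{line}")?;
        writer.flush()?; // send the buffered line out right away
    }
    Ok(())
}

/// `&TcpStream` implements both `Read` and `Write`, so two shared
/// references to the same stream can act as reader and writer.
fn handle_client(stream: TcpStream) -> io::Result<()> {
    echo_lines(&stream, &stream)
}
```

Feeding echo_lines the bytes "hello\nworld\n" and a Vec<u8> as the writer produces the same two lines in the vector.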

Troubleshooting I/O operations

If you decide to use BufWriter to handle writes you may not see any text echoed back in the terminal when using nc. As the name implies, the output is buffered, and you need to explicitly call the flush() method for text to be sent out over the TCP socket.

Running nc on Windows

Windows doesn't come with a TCP client out of the box. You have a number of options:

  1. Git-for-Windows comes with Git-Bash - a minimal Unix emulation layer. It has Windows ports of many popular UNIX command-line utilities, including nc.
  2. If you have WSL set up, your Linux environment has nc (or it is available as a package). You may either run the exercise in your Linux environment, too, or connect from the Linux guest to your host.
  3. There's a Windows-native version of ncat from the Nmap project that is available as a separate portable download.
  4. If you have access to a remote Linux server you can use SSH tunnelling to connect remote nc to a TCP server running on your local machine. ssh -L 7878:<remote_host>:8888 <user>@<remote_host> -p <ssh_port> will let you run nc 0.0.0.0 8888 on your Linux box and talk to a locally run TCP Echo server example.
  5. If you have friends that can run nc you can let them connect to your developer machine and play a role of your client. It's often possible if you share the same local network with them, but you can always rely on ngrok or cloudflared to expose a specific TCP port to anyone on the internet.

Share data between connections

In this exercise we will take our interactive server and add a common log for lengths of messages that each client sends us. We will explore synchronization primitives that Rust offers in its Standard Library.

After completing this exercise you are able to

  • share data between threads using Mutexes

  • use reference-counting to ensure data stays available across multiple threads

  • use scoped threads to avoid runtime reference counting

  • use channels and message passing to share data among threads by communicating

Tasks

Part 1

  1. Add a log to store length of messages: let mut log: Vec<usize> = vec![];
  2. Pass it to a handle_client function and record the length of each incoming line of text:
    log.push(line.len());
  3. Resolve lifetime issues by using a reference-counting pointer.
  4. Resolve mutability issues by using a mutex

Part 2

  1. Use the thread::scope function to get rid of reference counting for log vector

Part 3

  1. Instead of sharing log vector use a mpsc::channel to send length of lines from worker threads.
  2. Create a separate thread that listens for new channel messages and updates the vector accordingly.
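The three techniques named in the parts above can be tried out in isolation before wiring them into the server. The sketch below uses plain threads and a list of strings instead of TCP connections; the function names are invented for this example, and the order of the recorded lengths depends on thread scheduling.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Part 1: reference counting (`Arc`) plus a `Mutex`.
fn log_with_arc_mutex(lines: Vec<String>) -> Vec<usize> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = lines
        .into_iter()
        .map(|line| {
            let log = Arc::clone(&log); // each thread owns its own handle
            thread::spawn(move || {
                log.lock().unwrap().push(line.len());
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = log.lock().unwrap().clone();
    result
}

/// Part 2: scoped threads may borrow `log` directly, no `Arc` needed.
fn log_with_scope(lines: &[String]) -> Vec<usize> {
    let log = Mutex::new(Vec::new());
    thread::scope(|s| {
        for line in lines {
            let log = &log;
            s.spawn(move || {
                log.lock().unwrap().push(line.len());
            });
        }
    }); // all scoped threads are joined here
    log.into_inner().unwrap()
}

/// Part 3: workers send lengths over a channel; one thread owns the log.
fn log_with_channel(lines: Vec<String>) -> Vec<usize> {
    let (tx, rx) = mpsc::channel();
    let collector = thread::spawn(move || rx.into_iter().collect::<Vec<usize>>());
    let workers: Vec<_> = lines
        .into_iter()
        .map(|line| {
            let tx = tx.clone();
            thread::spawn(move || tx.send(line.len()).unwrap())
        })
        .collect();
    drop(tx); // the channel closes once every sender is gone
    for worker in workers {
        worker.join().unwrap();
    }
    collector.join().unwrap()
}
```

All three functions return the same set of lengths; only the way the log is shared differs.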

Writing an async chat

Nothing is simpler than creating a chat server, right? Not quite, chat servers expose you to all the fun of asynchronous programming:

How will the server handle clients connecting concurrently?

How will it handle them disconnecting?

How will it distribute the messages?

This tutorial explains how to write a chat server in tokio.

Specification and Getting Started

Specification

The chat uses a simple text protocol over TCP. The protocol consists of utf-8 messages, separated by \n.

The client connects to the server and sends login as a first line. After that, the client can send messages to other clients using the following syntax:

login1, login2, ... loginN: message

Each of the specified clients then receives a from login: message message.

A possible session might look like this

On Alice's computer:   |   On Bob's computer:

> alice                |   > bob
> bob: hello           |   < from alice: hello
                       |   > alice, bob: hi!
                       |   < from bob: hi!
< from bob: hi!        |
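The message syntax above can be parsed with a few string operations. The function below is a sketch with a made-up name, parse_line; it splits at the first colon, treats everything before it as a comma-separated list of logins and everything after it as the message.

```rust
/// Splits `login1, login2, ... loginN: message` into the list of
/// destination logins and the message text.
/// Returns `None` if the line contains no `:` separator.
fn parse_line(line: &str) -> Option<(Vec<String>, String)> {
    let (dest, msg) = line.split_once(':')?;
    let dest = dest
        .split(',')
        .map(|name| name.trim().to_string())
        .collect();
    Some((dest, msg.trim().to_string()))
}
```

For the line "alice, bob: hi!" this yields the logins alice and bob together with the message hi!.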

The main challenge for the chat server is keeping track of many concurrent connections. The main challenge for the chat client is managing concurrent outgoing messages, incoming messages and user's typing.

Getting Started

Let's create a new Cargo project:

$ cargo new a-chat
$ cd a-chat

Add the following lines to Cargo.toml:

[dependencies]
tokio = { version = "1", features = ["full"] }

Writing an Accept Loop

Let's implement the scaffold of the server: a loop that binds a TCP socket to an address and starts accepting connections.

First of all, let's add required import boilerplate:

extern crate tokio;
use std::future::Future; // 1
use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader}, // 1
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs}, // 3
    sync::{mpsc, oneshot},
    task, // 2
};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>; // 4
  1. Import some traits required to work with futures and streams.
  2. The task module roughly corresponds to the std::thread module, but tasks are much lighter weight. A single thread can run many tasks.
  3. For the socket type, we use TcpListener from tokio, which is similar to the sync std::net::TcpListener, but is non-blocking and uses async API.
  4. We will skip implementing detailed error handling in this example. To propagate the errors, we will use a boxed error trait object. Did you know that there's a From<&'_ str> for Box<dyn Error> implementation in stdlib, which allows you to use strings with the ? operator?
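Here is a small, self-contained illustration of that last point. The first_line function is invented for this example; the ? operator converts the string slice into the boxed error through the From implementation mentioned above.

```rust
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

/// Returns the first line of `input`, or a boxed error built from a string.
fn first_line(input: &str) -> Result<&str> {
    // `ok_or` produces a `Result<&str, &str>`; `?` turns the `&str`
    // error into a `Box<dyn Error + Send + Sync>`
    let line = input.lines().next().ok_or("peer disconnected immediately")?;
    Ok(line)
}
```

Calling first_line("") returns an error whose message is the original string.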

Now we can write the server's accept loop:

extern crate tokio;
use tokio::net::{TcpListener, ToSocketAddrs};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> { // 1
    let listener = TcpListener::bind(addr).await?; // 2

    loop { // 3
        let (stream, _) = listener.accept().await?;
        // TODO
    }

    Ok(())
}
  1. We mark the accept_loop function as async, which allows us to use .await syntax inside.
  2. TcpListener::bind call returns a future, which we .await to extract the Result, and then ? to get a TcpListener. Note how .await and ? work nicely together. This is exactly how std::net::TcpListener works, but with .await added.
  3. We generally use loop and break for looping in Futures, as that makes things easier down the line.

Finally, let's add main:

extern crate tokio;
use tokio::net::{ToSocketAddrs};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
Ok(())
}

#[tokio::main]
pub(crate) async fn main() -> Result<()> {
    accept_loop("127.0.0.1:8080").await
}

The crucial thing to realise is that in Rust, unlike in other languages, calling an async function does not run any code. Async functions only construct futures, which are inert state machines. To start stepping through the future's state machine from within an async function, you use .await. In a non-async function, the way to execute a future is to hand it to an executor.
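
To make the "inert state machine" point concrete, here is a small sketch that uses only the standard library. It polls a future by hand with a do-nothing waker, which stands in for what an executor such as tokio does for you; NoopWaker, set_flag and flag_before_and_after_poll are hypothetical names chosen for this illustration.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough to poll a future that never has to wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

async fn set_flag(flag: Arc<AtomicBool>) {
    flag.store(true, Ordering::SeqCst);
}

// Returns the flag's value after *calling* the async fn and after *polling* it.
fn flag_before_and_after_poll() -> (bool, bool) {
    let flag = Arc::new(AtomicBool::new(false));

    // Calling the async function only constructs the future.
    let fut = set_flag(flag.clone());
    let before = flag.load(Ordering::SeqCst);

    // Polling drives the state machine; this is the executor's job.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let poll = fut.as_mut().poll(&mut cx);
    assert!(matches!(poll, Poll::Ready(())));

    (before, flag.load(Ordering::SeqCst))
}
```

The flag is still false after the call and only becomes true once the future has been polled.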

Receiving messages

Let's implement the receiving part of the protocol. We need to:

  1. split incoming TcpStream on \n and decode bytes as utf-8
  2. interpret the first line as a login
  3. parse the rest of the lines as a login: message

We highly recommend moving past this quickly, as it is mostly protocol minutiae.

extern crate tokio;
use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
};
use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::{mpsc, oneshot},
    task,
};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
    let listener = TcpListener::bind(addr).await?;
    loop {
        let (stream, _socket_addr) = listener.accept().await?;
        println!("Accepting from: {}", stream.peer_addr()?);
        let _handle = task::spawn(connection_loop(stream)); // 1
    }
    Ok(())
}

async fn connection_loop(stream: TcpStream) -> Result<()> {
    let reader = BufReader::new(stream);
    let mut lines = reader.lines(); // 2

    // 3
    let name = match lines.next_line().await? {
        None => Err("peer disconnected immediately")?,
        Some(line) => line,
    };
    println!("name = {}", name);

    // 4
    loop {
        if let Some(line) = lines.next_line().await? {
            // 5
            let (dest, msg) = match line.find(':') {
                None => continue,
                Some(idx) => (&line[..idx], line[idx + 1..].trim()),
            };
            let dest = dest
                .split(',')
                .map(|name| name.trim().to_string())
                .collect::<Vec<_>>();
            let msg = msg.to_string();
            // TODO: this is temporary
            println!("Received message: {}", msg);
        } else {
            break
        }
    }
    Ok(())
}
  1. We use task::spawn function to spawn an independent task for working with each client. That is, after accepting the client the accept_loop immediately starts waiting for the next one. This is the core benefit of event-driven architecture: we serve many clients concurrently, without spending many hardware threads.

  2. Luckily, the "split byte stream into lines" functionality is already implemented. The .lines() call returns a stream of Strings.

  3. We get the first line -- login

  4. And, once again, we implement a manual async loop.

  5. Finally, we parse each line into a list of destination logins and the message itself.
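
The parsing step from note 5 can also be looked at in isolation. The sketch below pulls it into a free function using split_once; parse_line is a hypothetical name and the function is not part of the server code above, which performs the same steps inline.

```rust
// Hypothetical helper: splits "alice, bob: hello" into (["alice", "bob"], "hello").
// Returns None for lines without a ':' separator, which the server simply skips.
fn parse_line(line: &str) -> Option<(Vec<String>, String)> {
    // Everything before the first ':' is the destination list.
    let (dest, msg) = line.split_once(':')?;
    let dest = dest
        .split(',')
        .map(|name| name.trim().to_string())
        .collect();
    Some((dest, msg.trim().to_string()))
}
```

Because only the first ':' is treated as the separator, a message body may itself contain colons.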

Managing Errors

One serious problem in the above solution is that, while we correctly propagate errors in the connection_loop, we just drop the error on the floor afterwards! That is, task::spawn does not return an error immediately (it can't, it needs to run the future to completion first), only after it is joined. We can "fix" it by waiting for the task to be joined, like this:

extern crate tokio;
use tokio::{
    net::TcpStream,
    task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
async fn connection_loop(stream: TcpStream) -> Result<()> {
Ok(())
}

async fn accept_loop(stream: TcpStream) -> Result<()> {
let handle = task::spawn(connection_loop(stream));
handle.await?
}

The .await waits until the client finishes, and ? propagates the result.

There are two problems with this solution however! First, because we immediately await the client, we can only handle one client at a time, and that completely defeats the purpose of async! Second, if a client encounters an IO error, the whole server immediately exits. That is, a flaky internet connection of one peer brings down the whole chat room!

A correct way to handle client errors in this case is to log them and continue serving other clients. So let's use a helper function for this:

extern crate tokio;
use std::future::Future;
use tokio::task;
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    task::spawn(async move {
        if let Err(e) = fut.await {
            eprintln!("{}", e)
        }
    })
}
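
For comparison, the same "log and carry on" idea can be sketched with plain threads from the standard library. This is an analogy under stated assumptions, not the tokio code: spawn_and_log_error_thread is a hypothetical name, and unlike the helper above it returns a bool so that a caller can see whether the work succeeded.

```rust
use std::thread;

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// Hypothetical thread-based counterpart of `spawn_and_log_error`.
// The error is logged inside the worker, so it never reaches the spawner.
fn spawn_and_log_error_thread<F>(work: F) -> thread::JoinHandle<bool>
where
    F: FnOnce() -> Result<()> + Send + 'static,
{
    thread::spawn(move || match work() {
        Ok(()) => true,
        Err(e) => {
            eprintln!("{}", e);
            false
        }
    })
}
```

A failing worker prints its error and finishes normally, so the thread that spawned it keeps running.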

Sending Messages

Now it's time to implement the other half: sending messages. As a rule of thumb, only a single task should write to each TcpStream. This compartmentalises that activity and automatically serializes all outgoing messages. So let's create a connection_writer_loop task which receives messages over a channel and writes them to the socket. If Alice and Charley send two messages to Bob at the same time, Bob will see the messages in the same order as they arrive in the channel.

extern crate tokio;
use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
};

use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::oneshot,
    task,
};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
use tokio::sync::mpsc; // 1

type Sender<T> = mpsc::UnboundedSender<T>; // 2
type Receiver<T> = mpsc::UnboundedReceiver<T>;

async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf // 3
) -> Result<()> {
    loop {
        let msg = messages.recv().await;
        match msg {
            Some(msg) => stream.write_all(msg.as_bytes()).await?,
            None => break,
        }
    }
    Ok(())
}
  1. We will use mpsc channels from tokio.

  2. For simplicity, we will use unbounded channels, and won't be discussing backpressure in this tutorial.

  3. As connection_loop and connection_writer_loop share the same TcpStream, we split it into an owned read half and an owned write half. We'll glue this together later.

    extern crate tokio;
    use tokio::net::TcpStream;
    async fn connection_loop(stream: TcpStream) {
    
    use tokio::net::tcp;
    let (reader, writer): (tcp::OwnedReadHalf, tcp::OwnedWriteHalf) = stream.into_split();
    }
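
The ordering guarantee described above comes from the channel having a single consumer. A minimal sketch with the standard library's std::sync::mpsc (used here instead of tokio's channels only to keep the example dependency-free; bob_sees is a hypothetical name):

```rust
use std::sync::mpsc;

// Two senders feed one inbox; the single reader sees messages in arrival order.
fn bob_sees() -> Vec<String> {
    let (alice, bob_inbox) = mpsc::channel::<String>();
    let charley = alice.clone();

    // Charley's message enters the channel first, Alice's second.
    charley.send("from charley: hi\n".to_string()).unwrap();
    alice.send("from alice: hello\n".to_string()).unwrap();

    // Dropping every sender closes the channel, which ends the iteration below.
    drop(alice);
    drop(charley);

    bob_inbox.iter().collect()
}
```

Whatever order the messages reach the channel in is the order the writer puts them on the socket.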

A broker as a connection point

So how do we make sure that messages read in connection_loop flow into the relevant connection_writer_loop? We should somehow maintain a peers: HashMap<String, Sender<String>> map which allows a client to find destination channels. However, this map would be a bit of shared mutable state, so we'll have to wrap it in an RwLock and answer tough questions about what should happen if a client joins at the same moment as it receives a message.

One trick to make reasoning about state simpler is to take inspiration from the actor model. We can create a dedicated broker task which owns the peers map and communicates with other tasks using channels. The broker reacts to events and appropriately informs the peers. By hiding peer handling inside such an "actor" task, we remove the need for mutexes and also make the serialization point explicit. The order of events "Bob sends message to Alice" and "Alice joins" is determined by the order of the corresponding events in the broker's event queue.

extern crate tokio;
use std::future::Future;
use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::{mpsc, oneshot},
    task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;

async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf,
) -> Result<()> {
Ok(())
}

fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    unimplemented!()
}

use std::collections::hash_map::{Entry, HashMap};

#[derive(Debug)]
enum Event { // 1
    NewPeer {
        name: String,
        stream: OwnedWriteHalf,
    },
    Message {
        from: String,
        to: Vec<String>,
        msg: String,
    },
}

async fn broker_loop(mut events: Receiver<Event>) {
    let mut peers: HashMap<String, Sender<String>> = HashMap::new(); // 2

    loop {
        let event = match events.recv().await {
            Some(event) => event,
            None => break,
        };

        match event {
            Event::Message { from, to, msg } => { // 3
                for addr in to {
                    if let Some(peer) = peers.get_mut(&addr) {
                        let msg = format!("from {from}: {msg}\n");
                        peer.send(msg).unwrap();
                    }
                }
            }
            Event::NewPeer { name, mut stream } => match peers.entry(name.clone()) {
                Entry::Occupied(..) => (),
                Entry::Vacant(entry) => {
                    let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
                    entry.insert(client_sender); // 4
                    spawn_and_log_error(async move {
                        connection_writer_loop(&mut client_receiver, &mut stream).await
                    }); // 5
                }
            },
        }
    }
}
  1. The broker task should handle two types of events: a message or an arrival of a new peer.
  2. The internal state of the broker is a HashMap. Note how we don't need a Mutex here and can confidently say, at each iteration of the broker's loop, what the current set of peers is.
  3. To handle a message, we send it over a channel to each destination.
  4. To handle a new peer, we first register it in the peers map ...
  5. ... and then spawn a dedicated task to actually write the messages to the socket.
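
The effect of the event queue on ordering can be tried out without any networking. The sketch below is a simplified, synchronous stand-in for the broker, built on std::sync::mpsc rather than tokio; the outbox field and the demo function are assumptions made for this illustration, and the real broker stores a write half and spawns a writer task instead.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};

enum Event {
    NewPeer { name: String, outbox: Sender<String> },
    Message { from: String, to: Vec<String>, msg: String },
}

// The broker owns `peers` exclusively, so no lock is needed.
fn broker_loop(events: Receiver<Event>) {
    let mut peers: HashMap<String, Sender<String>> = HashMap::new();
    for event in events {
        match event {
            Event::NewPeer { name, outbox } => {
                // Keep the first registration if the name is already taken.
                peers.entry(name).or_insert(outbox);
            }
            Event::Message { from, to, msg } => {
                for addr in to {
                    if let Some(peer) = peers.get(&addr) {
                        // Ignore peers whose receiving side is already gone.
                        let _ = peer.send(format!("from {from}: {msg}\n"));
                    }
                }
            }
        }
    }
}

// Queues three events, runs the broker to completion, returns what Bob received.
fn demo() -> Vec<String> {
    let (events_tx, events_rx) = mpsc::channel();
    let (bob_tx, bob_rx) = mpsc::channel();
    let hello = |msg: &str| Event::Message {
        from: "alice".into(),
        to: vec!["bob".into()],
        msg: msg.into(),
    };
    events_tx.send(hello("too early")).unwrap(); // queued before Bob joins
    events_tx.send(Event::NewPeer { name: "bob".into(), outbox: bob_tx }).unwrap();
    events_tx.send(hello("hi")).unwrap();
    drop(events_tx); // closing the queue lets the broker loop finish
    broker_loop(events_rx);
    bob_rx.try_iter().collect()
}
```

The message queued before Bob's NewPeer event is not delivered, while the one queued after it is: the queue order alone decides the outcome.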

Gluing all together

At this point, we only need to start the broker to get a fully-functioning (in the happy case!) chat.

Scroll past the example to find a list of all changes.

extern crate tokio;
use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
};

use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::mpsc,
    task,
};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;

#[tokio::main]
pub(crate) async fn main() -> Result<()> {
    accept_loop("127.0.0.1:8080").await
}

async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
    let listener = TcpListener::bind(addr).await?;

    let (broker_sender, broker_receiver) = mpsc::unbounded_channel(); // 1
    let _broker = task::spawn(broker_loop(broker_receiver));

    while let Ok((stream, _socket_addr)) = listener.accept().await {
        println!("Accepting from: {}", stream.peer_addr()?);
        spawn_and_log_error(connection_loop(broker_sender.clone(), stream));
    }
    Ok(())
}

async fn connection_loop(broker: Sender<Event>, stream: TcpStream) -> Result<()> { // 2
    let (reader, writer) = stream.into_split(); // 3
    let reader = BufReader::new(reader);
    let mut lines = reader.lines();

    let name = match lines.next_line().await {
        Ok(Some(line)) => line,
        Ok(None) => return Err("peer disconnected immediately".into()),
        Err(e) => return Err(Box::new(e)),
    };

    println!("user {} connected", name);

    broker
        .send(Event::NewPeer {
            name: name.clone(),
            stream: writer,
        })
        .unwrap(); // 4

    loop {
        if let Some(line) = lines.next_line().await? {
            let (dest, msg) = match line.find(':') {
                None => continue,
                Some(idx) => (&line[..idx], line[idx + 1..].trim()),
            };
            let dest: Vec<String> = dest
                .split(',')
                .map(|name| name.trim().to_string())
                .collect();
            let msg: String = msg.trim().to_string();

            broker
                .send(Event::Message { // 5
                    from: name.clone(),
                    to: dest,
                    msg,
                })
                .unwrap();
        } else {
            break;
        }
    }

    Ok(())
}

async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf,
) -> Result<()> {
    loop {
        let msg = messages.recv().await;
        match msg {
            Some(msg) => stream.write_all(msg.as_bytes()).await?,
            None => break,
        }
    }
    Ok(())
}

#[derive(Debug)]
enum Event {
    NewPeer {
        name: String,
        stream: OwnedWriteHalf,
    },
    Message {
        from: String,
        to: Vec<String>,
        msg: String,
    },
}

async fn broker_loop(mut events: Receiver<Event>) {
    let mut peers: HashMap<String, Sender<String>> = HashMap::new();

    loop {
        let event = match events.recv().await {
            Some(event) => event,
            None => break,
        };
        match event {
            Event::Message { from, to, msg } => {
                for addr in to {
                    if let Some(peer) = peers.get_mut(&addr) {
                        let msg = format!("from {from}: {msg}\n");
                        peer.send(msg).unwrap();
                    }
                }
            }
            Event::NewPeer { name, mut stream } => match peers.entry(name.clone()) {
                Entry::Occupied(..) => (),
                Entry::Vacant(entry) => {
                    let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
                    entry.insert(client_sender);
                    spawn_and_log_error(async move {
                        connection_writer_loop(&mut client_receiver, &mut stream).await
                    });
                }
            },
        }
    }
}

fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    task::spawn(async move {
        if let Err(e) = fut.await {
            eprintln!("{}", e)
        }
    })
}
  1. Inside the accept_loop, we create the broker's channel and task.
  2. We need the connection_loop to accept a handle to the broker.
  3. Inside connection_loop, we need to split the TcpStream, to be able to share it with the connection_writer_loop.
  4. On login, we notify the broker. Note that we .unwrap on send: the broker should outlive all the clients, and if that's not the case the broker has probably panicked, so we can escalate the panic as well.
  5. Similarly, we forward parsed messages to the broker, assuming that it is alive.

Clean Shutdown

One of the problems of the current implementation is that it doesn't handle graceful shutdown. If we break from the accept loop for some reason, all in-flight tasks are just dropped on the floor.

We will intercept Ctrl-C.

A more correct shutdown sequence would be:

  1. Stop accepting new clients
  2. Notify the readers we're not accepting new messages
  3. Deliver all pending messages
  4. Exit the process

A clean shutdown in a channel-based architecture is easy, although it can appear to be a magic trick at first. In Rust, the receiver side of a channel is closed as soon as all senders are dropped. That is, as soon as producers exit and drop their senders, the rest of the system shuts down naturally. In tokio this translates to two rules:

  1. Make sure that channels form an acyclic graph.
  2. Take care to wait, in the correct order, until intermediate layers of the system process pending messages.
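
Both rules can be seen in a few lines of standard-library code. This is a sketch using threads and std::sync::mpsc rather than tokio tasks and channels, and run_pipeline is a hypothetical name; the closing behaviour it relies on is the one described above.

```rust
use std::sync::mpsc;
use std::thread;

// Producers -> consumer: an acyclic flow that shuts down once all senders are gone.
fn run_pipeline() -> Vec<u32> {
    let (tx, rx) = mpsc::channel::<u32>();

    // Each producer owns a clone of the sender and drops it when it exits.
    let producers: Vec<_> = (0..3)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || tx.send(id).unwrap())
        })
        .collect();
    // Drop the original sender too, otherwise the channel would never close.
    drop(tx);

    // The consumer's loop ends on its own once every sender has been dropped.
    let consumer = thread::spawn(move || {
        let mut seen: Vec<u32> = rx.iter().collect();
        seen.sort();
        seen
    });

    // Wait in flow order: producers first, then the layer that drains them.
    for producer in producers {
        producer.join().unwrap();
    }
    consumer.join().unwrap()
}
```

No explicit "stop" message is needed, and joining the consumer last ensures that every pending message has been processed.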

In a-chat, we already have a unidirectional flow of messages: reader -> broker -> writer. However, we never wait for the broker and the writers, which might cause some messages to get dropped.

We also need to notify all readers that we are going to stop accepting messages. Here, we use tokio::sync::Notify.

Let's first add the notification feature to the readers. We have to start using select! here, so that each reader can wait for the next line and for the shutdown notification at the same time.

async fn connection_loop(broker: Sender<Event>, stream: TcpStream, shutdown: Arc<Notify>) -> Result<()> {
    // ...
    loop {
        tokio::select! {
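            line = lines.next_line() => {
                // Stop reading on EOF, and propagate I/O errors instead of ignoring them.
                let Some(line) = line? else { break };
                let (dest, msg) = match line.split_once(':') {
                    None => continue,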
                    Some((dest, msg)) => (dest, msg.trim()),
                };
                let dest: Vec<String> = dest
                    .split(',')
                    .map(|name| name.trim().to_string())
                    .collect();
                let msg: String = msg.trim().to_string();
        
                broker
                    .send(Event::Message {
                        from: name.clone(),
                        to: dest,
                        msg,
                    })
                    .unwrap();
            },
            _ = shutdown.notified() => break,
        }
    }
    Ok(())
}

Let's add Ctrl-C handling and waiting to the server.

extern crate tokio;
use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
    sync::Arc,
};
use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::{mpsc, oneshot, Notify},
    task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
enum Event {
    NewPeer {
        name: String,
        stream: OwnedWriteHalf,
        shutdown: oneshot::Receiver<()>,
    },
    Message {
        from: String,
        to: Vec<String>,
        msg: String,
    },
}
async fn broker_loop(mut events: Receiver<Event>) {}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream, shutdown: Arc<Notify>) -> Result<()> {
    Ok(())
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    unimplemented!()
}

async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
    let listener = TcpListener::bind(addr).await?;

    let (broker_sender, broker_receiver) = mpsc::unbounded_channel();
    let broker = task::spawn(broker_loop(broker_receiver));
    let shutdown_notification = Arc::new(Notify::new());

    loop {
        tokio::select!{
            Ok((stream, _socket_addr)) = listener.accept() => {
                println!("Accepting from: {}", stream.peer_addr()?);
                spawn_and_log_error(connection_loop(broker_sender.clone(), stream, shutdown_notification.clone()));
            },
            _ = tokio::signal::ctrl_c() => break,
        }
    }
    println!("Shutting down server!");
    shutdown_notification.notify_waiters(); // 1
    drop(broker_sender); // 2
    broker.await?; // 6
    Ok(())
}

And to the broker:

extern crate tokio;
use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
};
use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::{mpsc, oneshot},
    task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;
enum Event {
    NewPeer {
        name: String,
        stream: OwnedWriteHalf,
    },
    Message {
        from: String,
        to: Vec<String>,
        msg: String,
    },
}
async fn connection_loop(broker: Sender<Event>, stream: TcpStream) -> Result<()> {
    Ok(())
}
fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    unimplemented!()
}
async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf,
) -> Result<()> {
    Ok(())
}

async fn broker_loop(mut events: Receiver<Event>) {
    let mut peers: HashMap<String, Sender<String>> = HashMap::new();

    loop {
        let event = match events.recv().await {
            Some(event) => event,
            None => break,
        };        
        match event {
            Event::Message { from, to, msg } => {
                // ...
            }
            Event::NewPeer {
                name,
                mut stream,
            } => match peers.entry(name.clone()) {
                Entry::Occupied(..) => (),
                Entry::Vacant(entry) => {
                    let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
                    entry.insert(client_sender);
                    spawn_and_log_error(async move {
                        connection_writer_loop(&mut client_receiver, &mut stream).await
                    });
                }
            },
        }
    }

    drop(peers); // 4
}

Notice what happens with all of the channels once we exit the accept loop:

  1. We notify all readers to stop accepting messages.
  2. We drop the main broker's sender. That way when the readers are done, there's no sender for the broker's channel, and the channel closes.
  3. Next, the broker's events.recv().await returns None, and the broker exits its loop.
  4. It's crucial that, at this stage, we drop the peers map. This drops the writers' senders.
  5. Tokio will automatically wait for all finishing futures.
  6. Finally, we join the broker, which also guarantees that all the writes have terminated.

Handling Disconnections

Currently, we only ever add new peers to the map. This is clearly wrong: if a peer closes its connection to the chat, we should not try to send any more messages to it.

One subtlety with handling disconnection is that we can detect it either in the reader's task, or in the writer's task. The most obvious solution here is to just remove the peer from the peers map in both cases, but this would be wrong. If both read and write fail, we'll remove the peer twice, but it can be the case that the peer reconnected between the two failures! To fix this, we will only remove the peer when the write side finishes. If the read side finishes we will notify the write side that it should stop as well. That is, we need to add an ability to signal shutdown for the writer task.

One way to approach this is a shutdown: Receiver<()> channel. There's a more minimal solution however, which makes clever use of RAII. Closing a channel is a synchronization event, so we don't need to send a shutdown message, we can just drop the sender. This way, we statically guarantee that we issue shutdown exactly once, even if we early return via ? or panic.
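
A minimal sketch of this RAII trick, again with std::sync::mpsc standing in for tokio's oneshot channel so that it runs without extra crates; reader and shutdown_observed are hypothetical names for this illustration.

```rust
use std::sync::mpsc::{self, Receiver, Sender};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// The guard is never used to send anything; it only exists to be dropped
// when `reader` returns, whether normally or through an early error return.
fn reader(_shutdown_guard: Sender<()>, fail: bool) -> Result<()> {
    if fail {
        return Err("connection reset".into());
    }
    Ok(())
}

// Returns true if the receiving side observed the shutdown signal.
fn shutdown_observed(fail: bool) -> bool {
    let (guard, shutdown): (Sender<()>, Receiver<()>) = mpsc::channel();
    let _ = reader(guard, fail);
    // With the only sender dropped, `recv` reports the channel as closed.
    shutdown.recv().is_err()
}
```

The receiver sees the channel close on both the success path and the error path, without any explicit "send shutdown" call that could be forgotten.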

First, let's add a shutdown channel to the connection_loop:

extern crate tokio;
use std::future::Future;
use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::{mpsc, oneshot},
    task,
};
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;

async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf,
) -> Result<()> {
Ok(())
}

fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    unimplemented!()
}


#[derive(Debug)]
enum Event {
    NewPeer {
        name: String,
        stream: OwnedWriteHalf,
        shutdown: oneshot::Receiver<()>, // 1
    },
    Message {
        from: String,
        to: Vec<String>,
        msg: String,
    },
}

async fn connection_loop(broker: Sender<Event>, stream: TcpStream) -> Result<()> {
    let (reader, writer) = stream.into_split();
    let reader = BufReader::new(reader);
    let mut lines = reader.lines();
    let name: String = String::new();
    // ...
    let (_shutdown_sender, shutdown_receiver) = oneshot::channel::<()>(); // 3
    broker
        .send(Event::NewPeer {
            name: name.clone(),
            stream: writer,
            shutdown: shutdown_receiver, // 2
        })
        .unwrap();
    // ...
  unimplemented!()
}
  1. To signal shutdown, we use a oneshot channel. We never send a value over it; only the closing of the channel matters.
  2. We pass the shutdown channel to the writer task.
  3. In the reader, we create a _shutdown_sender whose only purpose is to get dropped.

In the connection_writer_loop, we now need to choose between shutdown and message channels. We use the select macro for this purpose:

async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf,
    mut shutdown: oneshot::Receiver<()>, // 1
) -> Result<()> {
    loop { // 2
        tokio::select! {
            msg = messages.recv() => match msg {
                Some(msg) => stream.write_all(msg.as_bytes()).await?,
                None => break,
            },
            _ = &mut shutdown => break, // 3
        }
    }

    println!("Closing connection_writer loop!");

    Ok(())
}
  1. We add shutdown channel as an argument.
  2. Because of select, we can't use a while let loop, so we desugar it further into a loop.
  3. In the shutdown case break the loop.

The final thing to handle is actually cleaning up our peers map. Here, we need to establish a communication channel back to the broker. However, we can handle that completely within the broker's scope, so as not to infect the writer loop with this concern.

Another problem is that between the moment we detect disconnection in connection_writer_loop and the moment when we actually remove the peer from the peers map, new messages might be pushed into the peer's channel.

To not lose these messages completely, we'll return the writer's message receiver back to the broker. This also allows us to establish a useful invariant that the message channel strictly outlives the peer in the peers map, and makes the broker itself infallible.

async fn broker_loop(mut events: Receiver<Event>) {
    let (disconnect_sender, mut disconnect_receiver) =
        mpsc::unbounded_channel::<(String, Receiver<String>)>(); // 1
    let mut peers: HashMap<String, Sender<String>> = HashMap::new();

    loop {
        let event = tokio::select! {
            event = events.recv() => match event {
                None => break,
                Some(event) => event,
            },
            disconnect = disconnect_receiver.recv() => {
                let (name, _pending_messages) = disconnect.unwrap();
                assert!(peers.remove(&name).is_some());
                println!("user {} disconnected", name);
                continue;
            },
        };
        match event {
            Event::Message { from, to, msg } => {
                // ...
            }
            Event::NewPeer {
                name,
                mut stream,
                shutdown,
            } => match peers.entry(name.clone()) {
                Entry::Occupied(..) => (),
                Entry::Vacant(entry) => {
                    // ...
                    let disconnect_sender = disconnect_sender.clone();
                    spawn_and_log_error(async move {
                        let res =
                            connection_writer_loop(&mut client_receiver, &mut stream, shutdown)
                                .await;
                        println!("user {} disconnected", name);
                        disconnect_sender.send((name, client_receiver)).unwrap(); // 2
                        res
                    });
                }
            },
        }
    }
    drop(peers);
    drop(disconnect_sender);
    while let Some((_name, _pending_messages)) = disconnect_receiver.recv().await {}
}

Final Server Code

The final code looks like this:

use std::{
    collections::hash_map::{Entry, HashMap},
    future::Future,
    sync::Arc,
};

use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{tcp::OwnedWriteHalf, TcpListener, TcpStream, ToSocketAddrs},
    sync::{mpsc, oneshot, Notify},
    task,
};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
type Sender<T> = mpsc::UnboundedSender<T>;
type Receiver<T> = mpsc::UnboundedReceiver<T>;

#[tokio::main]
pub(crate) async fn main() -> Result<()> {
    accept_loop("127.0.0.1:8080").await
}

async fn accept_loop(addr: impl ToSocketAddrs) -> Result<()> {
    let listener = TcpListener::bind(addr).await?;

    let (broker_sender, broker_receiver) = mpsc::unbounded_channel();
    let broker = task::spawn(broker_loop(broker_receiver));
    let shutdown_notification = Arc::new(Notify::new());

    loop {
        tokio::select!{
            Ok((stream, _socket_addr)) = listener.accept() => {
                println!("Accepting from: {}", stream.peer_addr()?);
                spawn_and_log_error(connection_loop(broker_sender.clone(), stream, shutdown_notification.clone()));
            },
            _ = tokio::signal::ctrl_c() => break,
        }
    }
    println!("Shutting down!");
    shutdown_notification.notify_waiters();
    drop(broker_sender);
    broker.await?;
    Ok(())
}

async fn connection_loop(broker: Sender<Event>, stream: TcpStream, shutdown: Arc<Notify>) -> Result<()> {
    let (reader, writer) = stream.into_split();
    let reader = BufReader::new(reader);
    let mut lines = reader.lines();
    let (shutdown_sender, shutdown_receiver) = oneshot::channel::<()>();

    let name = match lines.next_line().await {
        Ok(Some(line)) => line,
        Ok(None) => return Err("peer disconnected immediately".into()),
        Err(e) => return Err(Box::new(e)),
    };

    println!("user {} connected", name);

    broker
        .send(Event::NewPeer {
            name: name.clone(),
            stream: writer,
            shutdown: shutdown_receiver,
        })
        .unwrap();
    
    loop {
        tokio::select! {
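            line = lines.next_line() => {
                // Stop reading on EOF, and propagate I/O errors instead of ignoring them.
                let Some(line) = line? else { break };
                let (dest, msg) = match line.split_once(':') {
                    None => continue,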
                    Some((dest, msg)) => (dest, msg.trim()),
                };
                let dest: Vec<String> = dest
                    .split(',')
                    .map(|name| name.trim().to_string())
                    .collect();
                let msg: String = msg.trim().to_string();
        
                broker
                    .send(Event::Message {
                        from: name.clone(),
                        to: dest,
                        msg,
                    })
                    .unwrap();
            },
            _ = shutdown.notified() => break,
        }
    }
    println!("Closing connection loop!");
    drop(shutdown_sender);

    Ok(())
}

async fn connection_writer_loop(
    messages: &mut Receiver<String>,
    stream: &mut OwnedWriteHalf,
    mut shutdown: oneshot::Receiver<()>,
) -> Result<()> {
    loop {
        tokio::select! {
            msg = messages.recv() => match msg {
                Some(msg) => stream.write_all(msg.as_bytes()).await?,
                None => break,
            },
            _ = &mut shutdown => break
        }
    }

    println!("Closing connection_writer loop!");

    Ok(())
}

#[derive(Debug)]
enum Event {
    NewPeer {
        name: String,
        stream: OwnedWriteHalf,
        shutdown: oneshot::Receiver<()>,
    },
    Message {
        from: String,
        to: Vec<String>,
        msg: String,
    },
}

async fn broker_loop(mut events: Receiver<Event>) {
    let (disconnect_sender, mut disconnect_receiver) =
        mpsc::unbounded_channel::<(String, Receiver<String>)>();
    let mut peers: HashMap<String, Sender<String>> = HashMap::new();

    loop {
        let event = tokio::select! {
            event = events.recv() => match event {
                None => break,
                Some(event) => event,
            },
            disconnect = disconnect_receiver.recv() => {
                let (name, _pending_messages) = disconnect.unwrap();
                assert!(peers.remove(&name).is_some());
                println!("user {} disconnected", name);
                continue;
            },
        };
        match event {
            Event::Message { from, to, msg } => {
                for addr in to {
                    if let Some(peer) = peers.get_mut(&addr) {
                        let msg = format!("from {}: {}\n", from, msg);
                        peer.send(msg).unwrap();
                    }
                }
            }
            Event::NewPeer {
                name,
                mut stream,
                shutdown,
            } => match peers.entry(name.clone()) {
                Entry::Occupied(..) => (),
                Entry::Vacant(entry) => {
                    let (client_sender, mut client_receiver) = mpsc::unbounded_channel();
                    entry.insert(client_sender);
                    let disconnect_sender = disconnect_sender.clone();
                    spawn_and_log_error(async move {
                        let res =
                            connection_writer_loop(&mut client_receiver, &mut stream, shutdown)
                                .await;
                        println!("user {} disconnected", name);
                        disconnect_sender.send((name, client_receiver)).unwrap();
                        res
                    });
                }
            },
        }
    }
    drop(peers);
    drop(disconnect_sender);
    while let Some((_name, _pending_messages)) = disconnect_receiver.recv().await {}
}

fn spawn_and_log_error<F>(fut: F) -> task::JoinHandle<()>
where
    F: Future<Output = Result<()>> + Send + 'static,
{
    task::spawn(async move {
        if let Err(e) = fut.await {
            eprintln!("{}", e)
        }
    })
}
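
The line format that the read loop above accepts (a comma-separated list of recipients, a colon, then the message) can be tried out in isolation. The following is a minimal sketch that uses only the standard library; the function name parse_line is hypothetical and does not appear in the server code.

```rust
/// Splits a chat line of the form `dest1, dest2: message` into its
/// recipients and the message body.
/// Returns `None` when the line contains no `:` separator.
/// NOTE: `parse_line` is a hypothetical helper for illustration only.
fn parse_line(line: &str) -> Option<(Vec<String>, String)> {
    // Split at the first `:`; everything after it belongs to the message.
    let (dest, msg) = line.split_once(':')?;
    // Recipients are comma-separated; surrounding whitespace is ignored.
    let dest = dest
        .split(',')
        .map(|name| name.trim().to_string())
        .collect();
    Some((dest, msg.trim().to_string()))
}
```

Because the split happens at the first colon, a message body may itself contain colons. A line without any colon yields None, which corresponds to the continue branch in the server loop.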

Implementing a client

Since the protocol is line-based, implementing a client for the chat is straightforward:

  • Lines read from stdin should be sent over the socket.
  • Lines read from the socket should be echoed to stdout.

Async does not significantly affect the client's performance: unlike the server, the client interacts with a single user and needs only limited concurrency. It is still useful for managing that concurrency.

The client has to read from stdin and the socket simultaneously. Programming this with threads is cumbersome, especially when implementing a clean shutdown. With async, the select! macro is all that is needed.
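
To make the comparison concrete, here is a rough sketch of what a thread-based approach might look like, using only the standard library. The names Input and merge_lines are hypothetical, and generic readers stand in for the socket and stdin so that the sketch can be exercised without a network connection.

```rust
use std::io::BufRead;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Which input a line came from (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Input {
    Server(String),
    Stdin(String),
}

/// Spawns one thread per reader and forwards every line into one channel.
/// The channel closes once both readers reach end-of-file or fail.
fn merge_lines<S, I>(server: S, stdin: I) -> Receiver<Input>
where
    S: BufRead + Send + 'static,
    I: BufRead + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let tx_stdin = tx.clone();
    thread::spawn(move || {
        // Stop at the first read error or when the receiver is gone.
        for line in server.lines().map_while(Result::ok) {
            if tx.send(Input::Server(line)).is_err() {
                break;
            }
        }
    });
    thread::spawn(move || {
        for line in stdin.lines().map_while(Result::ok) {
            if tx_stdin.send(Input::Stdin(line)).is_err() {
                break;
            }
        }
    });
    rx
}
```

Note what the sketch leaves out: if the server side ends first, the second thread stays blocked in its read until another line arrives, and the order in which lines from the two sources are received is not deterministic. Handling that cleanly is the cumbersome part. The async client below avoids it with a single select! loop.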

extern crate tokio;
use tokio::{
    io::{stdin, AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::{TcpStream, ToSocketAddrs},
};

type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// main
async fn run() -> Result<()> {
    try_main("127.0.0.1:8080").await
}

async fn try_main(addr: impl ToSocketAddrs) -> Result<()> {
    let stream = TcpStream::connect(addr).await?;
    let (reader, mut writer) = stream.into_split(); // 1

    let mut lines_from_server = BufReader::new(reader).lines(); // 2
    let mut lines_from_stdin = BufReader::new(stdin()).lines(); // 2

    loop {
        tokio::select! { // 3
            line = lines_from_server.next_line() => match line {
                Ok(Some(line)) => {
                    println!("{}", line);
                },
                Ok(None) => break,
                Err(e) => eprintln!("Error: {:?}", e),
            },
            line = lines_from_stdin.next_line() => match line {
                Ok(Some(line)) => {
                    writer.write_all(line.as_bytes()).await?;
                    writer.write_all(b"\n").await?;
                },
                Ok(None) => break,
                Err(e) => eprintln!("Error: {:?}", e),
            }
        }
    }
    Ok(())
}

  1. Here we split the TcpStream into read and write halves.
  2. We create a stream of lines for both the socket and stdin.
  3. In the main select loop, we print the lines we receive from the server and send the lines we read from the console.