
Arrow Rust Crash Course

Through these projects you will learn what you need to contribute to the Arrow Rust codebase.

Attempt these steps without looking them up first! You'll only learn by trying them yourself.

You're meant to struggle a bit - don't worry! The compiler will help :)

When you're ready for help, many of the instructions link to helpful resources or Rust docs.

Level 0 - Libraries, Tests, Refs

1) Compile the Hello, Cargo! project

2) Create src/lib.rs and add it as a library to Cargo.toml

Solution

# in Cargo.toml
[lib]
name = "adder"
path = "src/lib.rs"
test = true

3) Add a function to src/lib.rs which takes an unsigned integer as an argument and returns that integer plus one.

fn add_one(x: u32) -> u32 {
    // what are two ways to return a value from this function?
    // ...
}

Solution

fn add_one(x: u32) -> u32 {
    x + 1

    // Could also return with:
    // return x + 1;
}

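Notice that there is no semicolon (;) after x + 1. The return keyword is not needed when the returned value is the last expression in the function. A trailing semicolon would turn that expression into a statement, and the function body would then evaluate to () instead of a u32.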

Try adding a semicolon and see the compiler explanation!

$ cargo build
error[E0308]: mismatched types
--> src/main.rs:1:23
|
1 | fn add_one(x: u32) -> u32 {
| ------- ^^^ expected `u32`, found `()`
| |
| implicitly returns `()` as its body has no tail or `return` expression
2 | x + 1;
| - help: remove this semicolon
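
To make the expression-versus-statement distinction concrete, here is a minimal sketch showing both return styles side by side. The names add_one_tail and add_one_explicit are hypothetical, chosen only so that both versions can live in the same file.

```rust
// Style 1: the final expression (no trailing semicolon) is the return value.
pub fn add_one_tail(x: u32) -> u32 {
    x + 1
}

// Style 2: an explicit `return` statement; the semicolon is fine here.
pub fn add_one_explicit(x: u32) -> u32 {
    return x + 1;
}
```

Both functions behave identically. The first style is the one usually seen in Rust code; an explicit return is mostly reserved for leaving a function early.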

4) Add a unit test to the bottom of lib.rs for add_one

  • Unit tests validate the internal workings of a module
  • Use assert_eq! to compare the result with the expected result
Solution

// Bottom of lib.rs
#[cfg(test)]
mod tests {
    // `mod tests` is its own module, so items must be brought
    // into scope from the parent module even though the tests
    // are in the same file

    // Brings every item from the parent module into scope,
    // including `add_one`
    use super::*;

    #[test]
    fn ut_add_one() {
        let x: u32 = 5;
        let result = add_one(x);
        assert_eq!(x + 1, result);
    }
}

5) Add an integration test for add_one

  • This should go into a new file: tests/integration_test.rs
Solution

// Top of tests/integration_test.rs
#[test]
fn it_add_one() {
    let x: u32 = 5;

    // No `use` statement is needed: the library crate is
    // reachable by its name (`adder`), but only its public
    // items are visible to integration tests
    assert_eq!(adder::add_one(x), x + 1);
}

  • You may get the following compilation error:
error[E0603]: function `add_one` is private
--> tests/integration_test.rs:4:23
|
4 | assert_eq!(adder::add_one(x), x+1);
| ^^^^^^^ private function
  • To test this function in integration_test.rs, we need to make add_one public.
    • pub fn add_one(x: u32) -> u32
  • This emphasizes what integration tests are for!
    • Integration tests should validate the public interfaces (inputs and outputs) of a module
    • These tests provide confidence that other modules can interact with this module's public interfaces without causing an error
    • By proving this, we verify module integration, the inter-operation of multiple modules as a successful whole
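
The public/private boundary can be sketched in a single file. In the example below, the inline module adder is a hypothetical stand-in for the library crate, and increment and call_from_outside are made-up names used only for illustration.

```rust
// Hypothetical stand-in for the `adder` library crate, written as an
// inline module so the example fits in one file.
mod adder {
    // Public: visible to code outside the module (and, in a real
    // crate, to integration tests).
    pub fn add_one(x: u32) -> u32 {
        increment(x)
    }

    // Private: an implementation detail that only code inside
    // `adder` may call.
    fn increment(x: u32) -> u32 {
        x + 1
    }
}

// Code outside the module can only go through the public interface.
pub fn call_from_outside() -> u32 {
    // Calling adder::increment(5) here would fail with error[E0603].
    adder::add_one(5)
}
```

An integration test sits in the same position as call_from_outside: it can exercise add_one, but it never sees increment.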

6) Modify add_one to instead update x in place, with no return value.

Solution

// in src/lib.rs
// Take a mutable (editable) reference to the variable: &mut
pub fn add_one(x: &mut u32) {
    // Since x is now a mutable reference (&mut), dereference it
    // with * to modify the value it points to
    *x += 1;
}

#[cfg(test)]
mod tests {

    use super::*;

    #[test]
    fn ut_add_one() {
        let mut x: u32 = 5;
        let orig = x;

        // add_one(x) would not compile:
        // add_one only accepts a mutable reference.
        // `&mut x` lends x to the function so it can modify it
        add_one(&mut x); // no return value!
        assert_eq!(orig + 1, x);
    }
}

// in tests/integration_test.rs
#[test]
fn it_add_one() {
    let start: u32 = 5;
    let mut x: u32 = start;
    adder::add_one(&mut x);
    assert_eq!(x, start + 1);
}
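
The same &mut idea extends from a single value to a whole collection. The sketch below goes slightly beyond the exercise: add_one_all and peek_plus_one are hypothetical helpers, included only to contrast a mutable reference with a shared one.

```rust
// Hypothetical helper (not part of the exercise): add one to every
// element of a slice in place.
pub fn add_one_all(xs: &mut [u32]) {
    // iter_mut() yields a `&mut u32` for each element
    for x in xs.iter_mut() {
        *x += 1;
    }
}

// A shared reference (&) only allows reading, so this function
// cannot modify the value it is given.
pub fn peek_plus_one(x: &u32) -> u32 {
    *x + 1
}
```

A caller would write add_one_all(&mut v) for a mutable vector v, and peek_plus_one(&x) for any u32, mutable or not.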

Level 1 - Generics, Traits, Imports

1) Write a function which returns the largest element of a list of u32 values

  • fn largest(xs: &[u32]) -> u32
  • Why should largest take a reference to the list (&[u32]) instead of taking the list itself (Vec<u32>)?
    • Consider the size of a reference versus the size of the list, and who owns the list after the call
  • Why not a mutable reference (&mut)?
Solution

// There are several possible implementations.
// largest borrows the elements through a slice reference (&[u32])
// instead of taking ownership of the vector or cloning it.
// A slice reference is just a pointer plus a length (16 bytes on
// a 64-bit target). The vector's contents might be hundreds of bytes.
// The reference is not mutable (&mut [u32]) because we do not want
// the list to be modified. We only want to find the largest
// element.
// This is a powerful protection, similar to const arguments in
// C/C++, but by default.
// Note: list[0] panics if the list is empty.
fn largest(list: &[u32]) -> u32 {
    let mut largest: u32 = list[0];

    // Iterating over a slice yields references (&u32). The `&item`
    // pattern dereferences each one, so `item` is a copy of the u32
    for &item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

#[cfg(test)]
mod tests {

    use super::*;

    #[test]
    fn ut_largest() {
        let max = 100;
        // max itself must be in the list for the assertion to hold
        let xs = vec![max - 1, max, max - 10, max - 50];
        assert_eq!(max, largest(&xs));
    }
}
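
To put numbers on the "size of a reference versus size of the list" question, the sketch below measures both with the standard library. The function names slice_ref_size and contents_size are hypothetical.

```rust
use std::mem::{size_of, size_of_val};

// Size in bytes of the reference that `largest` receives.
// A slice reference stores a pointer and a length.
pub fn slice_ref_size() -> usize {
    size_of::<&[u32]>()
}

// Size in bytes of the elements the reference points to.
pub fn contents_size(xs: &[u32]) -> usize {
    size_of_val(xs)
}
```

On a 64-bit target, slice_ref_size() is 16 regardless of how long the list is, while contents_size grows by 4 bytes per u32 element.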

2) Modify largest to return the largest element of a vector of any type

  • This one is tricky!
  • You'll learn about Generics and traits, specifically PartialOrd and Copy
Solution

// How to read:
// return the largest item from a list of items of type T,
// where T must implement PartialOrd (comparable) and Copy (copyable)
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
    // Copies an element of type T out of the list,
    // therefore T must implement Copy
    let mut largest: T = list[0];

    for &item in list {
        // Compares items of type T.
        // The greater-than (>) operator requires two values
        // that can be compared, therefore the PartialOrd trait
        if item > largest {
            largest = item;
        }
    }

    largest
}

  • Run the same unit tests again; they should still pass!
  • Try changing the unit tests to find the largest element of other types: f32, u8, i16, etc.
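
The Copy bound exists only because the solution copies elements out of the list. As an alternative sketch (not the tutorial's solution), the hypothetical largest_ref below returns a reference instead, so it also works for non-Copy types such as String. It also returns an Option so that an empty list yields None rather than a panic.

```rust
// Hypothetical variant of `largest` (the name `largest_ref` is an
// assumption): returns a reference, so T only needs PartialOrd,
// and returns None for an empty list instead of panicking.
pub fn largest_ref<T: PartialOrd>(list: &[T]) -> Option<&T> {
    // first() is None when the slice is empty; `?` returns early
    let mut largest = list.first()?;

    for item in list {
        // Comparing two references compares the values they point to
        if item > largest {
            largest = item;
        }
    }

    Some(largest)
}
```

The trade-off is that the caller has to handle the Option and receives a borrowed value rather than an owned copy.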

3) Add a data structure called Ticket.

  • Ticket should have two fields:
    • id: u32
    • timestamp: DateTime<Utc>
  • We need an external crate, chrono: add it under [dependencies] in Cargo.toml, then import it
Solution

use chrono::{DateTime, Utc};

struct Ticket {
    id: u32,
    timestamp: DateTime<Utc>,
}

4) Create a unit test ut_largest_ticket which does the following:

  • Creates a vector tickets with the following two Ticket elements:
    • Ticket { id: 1, timestamp: Utc::now() }
    • Ticket { id: 2, timestamp: Utc::now() + Duration::minutes(30) }
  • Passes this Ticket vector to the largest function.
  • assert_eq! that the ID of the returned element is 2.
Solution

#[cfg(test)]
mod tests {

    use super::*;
    // Duration is used in the unit tests but not in the library,
    // so import it here
    use chrono::Duration;

    #[test]
    fn ut_largest_ticket() {
        let tickets = vec![
            Ticket { id: 1, timestamp: Utc::now() },
            Ticket { id: 2, timestamp: Utc::now() + Duration::minutes(30) },
        ];
        assert_eq!(2, largest(&tickets).id);
    }
}

  • Should fail to compile! Ticket does not yet implement the PartialOrd and Copy traits that largest requires.

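5) The compiler doesn't know how to copy or compare two Ticket objects yet. Start with the traits that can be derived automatically.

Solution

// Eq also requires PartialEq, which is implemented by hand in the next step
#[derive(Debug, Eq, Copy, Clone)]
struct Ticket {
    id: u32,
    timestamp: DateTime<Utc>,
}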

6) The compiler needs to be told how to determine if one Ticket is "larger" than another.

Solution

use std::cmp::Ordering;

#[derive(Debug, Eq, Copy, Clone)]
struct Ticket {
    id: u32,
    timestamp: DateTime<Utc>,
}

// A Ticket is "larger" when its timestamp is later
impl Ord for Ticket {
    fn cmp(&self, other: &Self) -> Ordering {
        self.timestamp.cmp(&other.timestamp)
    }
}

// This is the trait that `largest` actually requires; delegate to Ord
impl PartialOrd for Ticket {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality also looks only at the timestamp, so == agrees with cmp
impl PartialEq for Ticket {
    fn eq(&self, other: &Self) -> bool {
        self.timestamp == other.timestamp
    }
}

7) Run the unit tests again.

  • Should pass!
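
Once Ticket implements Ord, the standard library can do the comparison work as well. The sketch below is a dependency-free approximation: it assumes a plain u64 timestamp (seconds) in place of chrono's DateTime<Utc>, and the function name latest is hypothetical.

```rust
use std::cmp::Ordering;

// Stand-in for the tutorial's Ticket: `timestamp` is a plain u64
// (seconds) instead of chrono's DateTime<Utc>. This is an assumption
// made only to keep the example dependency-free.
#[derive(Debug, Eq, Copy, Clone)]
pub struct Ticket {
    pub id: u32,
    pub timestamp: u64,
}

impl Ord for Ticket {
    fn cmp(&self, other: &Self) -> Ordering {
        self.timestamp.cmp(&other.timestamp)
    }
}

impl PartialOrd for Ticket {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Ticket {
    fn eq(&self, other: &Self) -> bool {
        self.timestamp == other.timestamp
    }
}

// Because Ticket implements Ord, the iterator method max() can pick
// the latest ticket without a hand-written loop.
pub fn latest(tickets: &[Ticket]) -> Option<Ticket> {
    tickets.iter().copied().max()
}
```

Note that two tickets with the same timestamp compare as equal even when their ids differ. That is a consequence of comparing on timestamp alone, and it keeps == consistent with cmp.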

Level 2

  • Coming Soon
  • Option, trait, enum