David Beck

Thoughts

Follow on GitHub

Learning Rust: Iterator

17 Apr 2016 by David Beck on [LinkedIn] / [Feed]

The second episode of my Rust experiments lightly scratches the surface of iterators. There are a lot of good and better articles about them. At the end of this post I put some references. There are a few points that was not immediately obvious about the iterators, hence this blogpost.

In the previous post I started with the development of a circular buffer and implemented a put method for adding elements into the buffer. First I implemented a get_range function that was complicated and I didn’t like it. Then I dove into iterators and I am very happy that I did.

iterator chaining

My use case

My circular buffer owns its elements in a preallocated fixed size vector. All items are identified by a sequence number. When the buffer is full the writer starts overwriting past elements. The writer doesn’t wait for the reader to gather elements.

The reader can get at most as many elements as the buffer’s size.

Iterator

This is very well explained in other Rust documents, but please allow me to reiterate.

The iterator is an object that implements the Iterator trait. It requires to have a next() method which will be called by the iterator adaptors or consumers:

  fn next(&mut self) -> Option<Self::Item>

The return value tells the users if there are more items to be iterated over or returns None if not.

Iterators can be chained together with adaptors and a consumer. This allows chaining up a data source, a number of transformations and a destination. An example from Herman J. Radtke III looks like this:

fn use_names_for_something_else(_names: Vec<&str>) {
}

fn main() {
    let names = vec!["Jane", "Jill", "Jack", "John"];

    let total_bytes = names
        .iter()
        .map(|name: &&str| name.len())
        .fold(0, |acc, len| acc + len );

    assert_eq!(total_bytes, 16);
    use_names_for_something_else(names);
}

Notice how similar the iterator chain is to Elixir pipes.

Generator example

Rust iterators are impressive. I especially like them being lazily evaluated. One example that I am particularly keen on is when the iterator is actually a generator of a sequence that is fed into a chain of other functions.

This generator example is copied from the ‘A Journey into Iterators’ post at hoverbear.org, Andrew Hobden’s blog:

struct CountUp {
    current: usize,
}

impl Iterator for CountUp {
    type Item = usize;
    // The only fn we need to provide for a basic iterator.
    fn next(&mut self) -> Option<usize> {
        self.current += 1;
        Some(self.current)
    }
}

fn main() {
    // In more sophisticated code use `::new()` from `impl CountUp`
    let iterator = CountUp { current: 0 };
    // This is an infinite iterator, only take so many.
    let output = iterator.take(20).collect::<Vec<_>>();
    println!("{:?}", output);
}
// Outputs [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

Another generator type iterator is available for the Fibonacci sequence at the Rust by example website.

Implementation

Since my circular buffer is planned to be a single publisher-single consumer queue, I could have chosen to modify the buffer so it has a read position too. But, I didn’t like it that way, because I wanted a separate object to be responsible for the reading of the data.

Doing that I need to reference the container form the iterator. Unfortunately there was not so many examples for this, but it was fairly simple.

My circular buffer looks like this:

struct CircularBuffer<T : Copy> {
  seqno  : usize,
  data   : Vec<T>,
}

The iterator holds a reference to the CB and two variables for the range it iterates over. This is the first time in my short Rust career that I need to take care of lifetimes. Later I realized that adding a lifetime parameter creates a separate type:

struct CircularBufferIterator<'a, T: 'a + Copy> {
  buffer    : &'a CircularBuffer<T>,
  position  : usize,
  limit     : usize,
}

position is where it starts iterating from and limit tells where the end is. The iterator is generated by the iter() functions:

impl <T : Copy> CircularBuffer<T> {
  fn min_pos(&self) -> usize {
    if self.seqno < self.data.len() {
      0
    } else {
      (self.seqno - self.data.len()) as usize
    }
  }

  fn iter(&self) -> CircularBufferIterator<T> {
    CircularBufferIterator{
      buffer    : self,
      position  : self.min_pos(),
      limit     : self.seqno
     }
  }
}

The important point here is the CircularBufferIterator must depend on the lifetime of the CircularBuffer. It is very nice, that Rust knows this lifetime dependency and doesn’t complicate my life in the iter() function.

The final step is to make sure that the new CircularBufferIterator implements what is needed to conform to the Iterator trait:

impl <'_, T: '_ + Copy> Iterator for CircularBufferIterator<'_, T> {
  type Item = T;

  fn next(&mut self) -> Option<T> {
    if self.position >= self.limit {
      None
    } else {
      let at = self.position % self.buffer.data.len();
      self.position += 1;
      match self.buffer.data.get(at) {
        Some(v) => Some(*v),
        None => None
      }
    }
  }
}

The iterator implementation needs to match a CircularBufferIterator that has a lifetime parameter. I didn’t add this at first, since the next() function doesn’t seem to need it. This results in this error message:

src/simple/mod.rs:68:29: 68:54 error: wrong number of lifetime parameters: expected 1, found 0 [E0107]
src/simple/mod.rs:68 impl <T: Copy> Iterator for CircularBufferIterator<T> {
                                                 ^~~~~~~~~~~~~~~~~~~~~~~~~
src/simple/mod.rs:68:29: 68:54 help: run `rustc --explain E0107` to see a detailed explanation
error: aborting due to previous error

Fair enough. Then I tried this one:

impl <'a, T: 'a + Copy> Iterator for CircularBufferIterator<'a, T> {

This compiled, but since I don’t use the 'a lifetime anywhere I preferred the one with the wildcard above.

Usage

When I started writing tests for my original get_range function I realized that it would be a lot easier to test the CircularBuffer with the builtin adaptors:

  #[test]
  fn sum_available() {
    let mut x = CircularBuffer::new(4, 0 as i32);
    x.put(|v| *v = 2);
    x.put(|v| *v = 4);
    x.put(|v| *v = 6);
    x.put(|v| *v = 8);
    x.put(|v| *v = 10);
    assert_eq!(x.iter().count(), 4);
    let sum = x.iter().take(3).fold(0, |acc, num| acc + num);
    assert_eq!(sum, 18);
  }

Finally I got rid of get_range function in favor of the CircularBufferIterator.

Rust iterator resources

Rust version

$ rustc --version
rustc 1.7.0 (a5d1e7a59 2016-02-29)

Git repo

I opened a github repo for this experiment series.

Episodes of this series

  1. Closures
  2. Iterator
  3. Yet Another Lock-Free Queue
  4. Sharing My Queue Between Threads