Simple Rust, Part Two

Arrays, Memory and the &

Posted by Ian Whitney on November 14, 2015

Last time we wrote half of our anagrams_for function signature. We looked at the different ways we can declare a string, and how to decide which type our function would accept. Exciting!

We have one more parameter to handle, though. The tests we have from Exercism have us calling anagrams_for like this:

  let inputs = ["tan", "stand", "at"];
  anagram::anagrams_for("ant", &inputs)

And our current anagrams_for accepts just the “ant” part of that.

fn anagrams_for(s: &str) {
  //Amazing logic TBD
}

We have two problems to solve:

  • What is inputs type?
  • What is that & that keeps showing up?

Turns out these two things have much in common, and they reveal an important part of Rust’s memory management story. More excitement awaits!

What is inputs type?

A Rubyist like myself looks at the line

let inputs = ["tan", "stand", "at"];

And says, “Well, obviously inputs is an Array.” Of course, in the last article I also said that "ant" was obviously a String, which it wasn’t. My track record isn’t great.

Remember that we chose &str over String because the latter was mutable and we didn’t need the overhead that comes with mutability.

In Rust, the Array primitive is immutable. And our anagrams_for function won’t have to mutate inputs. Since there’s no reason to mutate inputs let’s make it immutable and call it an Array.

You may be wondering why the clearly-named String primitive is mutable, while the similarly-clearly-named Array primitive is immutable. I have no idea. I’m sure there’s a great reason for it.

Yes, there are mutable collections, but I’ll get into them later. They aren’t relevant right now.

Anyway, now that we know our inputs parameter will be an Array, we can finish our function’s type signature, right?

fn anagrams_for(s: &str, inputs: Array) {
  //Amazing logic TBD
}

fn main() {
  let inputs = ["tan", "stand", "at"];
  anagrams_for("ant", inputs);
}

This nets us a nice compiler error: error: use of undeclared type name Array

The Array documentation shows us the error of our ways. There is no one Array type. Instead, when we create an Array its type is its elements’ type and their number. This is easier to see with some examples:

let number_array = [1, 2];
//number_array contains 2 i32s, so has the type [i32; 2];

let str_array = ["String"];
// str_array contains 1 &str, so has the type [&str; 1]

Knowing that, we could adjust our anagrams_for signature to

fn anagrams_for(s: &str, inputs: [&str; 3])

And our code compiles! Success. Well, a limited form of success as our function now only accepts Arrays with 3 string elements. If we try:

anagrams_for("tan", ["a", "b", "c", "d"]);

We get a compiler error: (expected an array with a fixed size of 3 elements, found one with 4 elements)

Our anagrams_for function must work with Arrays of any size, so we have to take a different approach. My first instinct is to try leaving out the size.

fn anagrams_for(s: &str, inputs: [&str]) {
  //Amazing logic TBD
}

fn main() {
  let inputs = ["tan", "stand", "at"];
  anagrams_for("ant", inputs);
}

First we get a warning: [&str] does not have a constant size known at compile-time all local variables must have a statically known size

Then we get an error: mismatched types: expected [&str], found [&str; 3]

There is a solution. But before we discover it, we must first dig into that & symbol that keeps popping up.

What is that & that keeps showing up?

We’ve seen & show up a lot. It’s part of &str and it’s in the test provided by Exercism:

anagram::anagrams_for("ant", &inputs)

But our most recent code leaves it out:

anagrams_for("ant", inputs);

The difference between these two pieces of code (as I understand it) is that &inputs is Borrowed by anagrams_for while inputs is Owned by anagrams_for

I’m still struggling through learning these concepts, so I’m bound to get some of this wrong. But my understanding is as follows:

When our function owns the array, the array’s data is copied to a second location in memory.

When our function borrows the array, the array is not copied. Instead, the function gets a reference to the location where the array’s data is stored.

One of Rust’s main goals is memory safety. Think about how Rust is going to approach these two situations when it comes to memory. When Rust is going to copy data, it has to set aside memory at compilation. That means Rust has to know exactly how much memory to set aside. In this code:

fn anagrams_for(s: &str, inputs: [&str; 3])

Rust can say “Oh, I need to set aside enough space for an array that contains 3 str”. But this code:

fn anagrams_for(s: &str, inputs: [&str])

Doesn’t tell Rust how much memory to set aside for copying. Hence that warning about, “all local variables must have a statically known size” and our program’s subsequent failure.

So, if we want to have our function accept arrays of arbitrary size, it must borrow them. Again, if we knew what we were looking for, the code from Exercism would have shown us the way.

let inputs = ["tan", "stand", "at"];
anagram::anagrams_for("ant", &inputs)

That &inputs says that we are borrowing inputs to the function. Therefore our function must also borrow:

fn anagrams_for(s: &str, inputs: &[&str]) {
  //Amazing logic TBD
}

fn main() {
  let inputs = ["tan", "stand", "at"];
  anagrams_for("ant", &inputs);
}

This compiles! Neat. We’ve finished our entire method signature while learning something about & and borrowing in the process.

But I’m still a little confused. Why does &str always contain a &? Rust’s documentation comes to my rescue again

Rust’s str type is one of the core primitive types of the language. &str is the borrowed string type.

When I write code like:

let s = "String";

s has the type &str, meaning it’s just borrowing data that is owned by a str somewhere deeper in the system, probably beyond my reach. I can’t take ownership of something I’m just borrowing, nor can I give ownership to someone else. So code like this:

fn take_ownership(s: str) {
  //do stuff
}

fn main() {
  let borrowed_str = "you don't own me!";
  take_ownership(borrowed_str);
}

Blows up with the same mismatched type error we got before. borrowed_str is a &str, and take_ownership only wants a str. We have to play nice and borrow what we can’t own.

fn borrow(s: &str) {
  //do stuff
}

fn main() {
  let borrowed_str = "you don't own me!";
  borrow(&borrowed_str);
}

Ok. That is quite enough Rust for this post. We still need to write the actual logic in anagrams_for, but that will have to wait for another day. I am heading to RubyConf this weekend, and I expect to talk quite a lot about that (and maybe some Rust) in my newsletter. Newsletter? Yes. You can read previous newsletters, or sign up for free. Comments/feedback/&c. welcome on twitter, at ian@ianwhitney.com, or leave comments on the pull request for this post.