As a new Rustacean, I had heard about how awesome the Rust borrow system is, but I had not quite understood why. One thing I learned about myself recently is that if I want to be an efficient learner, it is important for me to understand the ‘why’ part.
So, that brings me to the question: why is Rust’s borrow system considered great? Why do seasoned Rustaceans strongly prefer borrowing over copying or cloning variables?
In order to understand the answers to those questions, I had to understand what’s happening when a variable is moved, cloned, or copied. My background is genetics, not computer science, so to do that, I first had to learn the basic anatomy of computer memory: the stack and the heap.
This blog post is a review/note to myself about these concepts in the context of Rust: what happens in the stack and the heap when a variable is moved, cloned or copied.
Computer memory is like the human memory in the brain; it stores information. The heap and the stack are separate regions in the memory with different functions like how different parts of the human brain have different responsibilities.
The stack is like, as Liene described, Pringles! Instead of a stack of potato chips, the computer’s stack is a stack of information! Like how there is a limited amount of space in a Pringles can, the stack has a limited amount of space in memory. The stack also has a set of strict rules:
You can only remove data at the top of the stack. Can you pull out the chip in the middle of the Pringles can without removing all the chips above it? No.
Similarly, you can only add a new piece of data to the top of the stack. With Pringles, how do you add a new chip to the middle of the can without altering the whole can of chips? You can’t. This also means that the data at the top of the stack is younger than the data at the bottom.
Any piece of data that will be stored in the stack must have a size that is known at compile time. I can store an array in the stack if I know that it has a fixed length and will not grow or shrink. What if I don’t know how long my data will be until runtime? If I were to store it in the stack and then tried to increase its length, it would violate the stack rules, because I can’t insert new data in the middle of the stack. That is why we have the heap.
The heap is like a large space, where any data can claim a free spot as long as it fits. For example, I can have a vector on the stack that has an address pointing to the heap, where its elements reside. That way, I can grow or shrink the number of elements in the vector as much as I want without violating the stack rules.
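Here is a minimal sketch of that difference. The function name `grow_example` and the values are made up for illustration: a fixed-size array whose length is part of its type, next to a vector whose elements live in the heap and can therefore grow.

```rust
// Returns (length of the fixed array, length of the vector after growing).
fn grow_example() -> (usize, usize) {
    // The length (3) is part of the array's type, so its size is known at
    // compile time and it can be stored in the stack.
    let fixed: [u8; 3] = [1, 2, 3];

    // The vector's handle (pointer, length, capacity) sits in the stack,
    // while its elements live in a heap allocation that can be resized.
    let mut growable: Vec<u8> = vec![1, 2, 3];
    growable.push(4); // the heap buffer grows if it needs to

    (fixed.len(), growable.len())
}

fn main() {
    let (fixed_len, growable_len) = grow_example();
    println!("array: {} elements, vector: {} elements", fixed_len, growable_len);
}
```

There is no `push` for the array: its length cannot change after it is created.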
There is one cost to the heap’s flexibility: extra cleanup. With the stack, data is cleaned up as it goes out of scope. With the heap, whoever owns the data has to clean it up. For example, when a vector is no longer in use, it must take care to also destroy its elements residing in the heap.
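To watch that cleanup happen, here is a small sketch with a hypothetical `Page` type (not part of the coloring book example) that counts how many pages have been destroyed. When the vector goes out of scope, it destroys every element it owns.

```rust
use std::cell::Cell;

// Hypothetical page type that counts how many pages have been destroyed.
struct Page<'a> {
    destroyed: &'a Cell<u32>,
}

impl<'a> Drop for Page<'a> {
    fn drop(&mut self) {
        self.destroyed.set(self.destroyed.get() + 1);
    }
}

fn pages_destroyed_at_scope_end() -> u32 {
    let destroyed = Cell::new(0);
    {
        let _book = vec![
            Page { destroyed: &destroyed },
            Page { destroyed: &destroyed },
        ];
        // Nothing is destroyed while `_book` is still in scope.
        assert_eq!(destroyed.get(), 0);
    } // `_book` goes out of scope here and destroys both of its pages.
    destroyed.get()
}

fn main() {
    println!("pages destroyed: {}", pages_destroyed_at_scope_end());
}
```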
So what do the stack and the heap look like when I create a variable binding in Rust? Say there is a coloring book named free_coloring_book where each page is dedicated to a planet of our solar system! And it’s up for grabs!
I used a vector to represent the coloring book. free_coloring_book is a vector of &str string literals. The data in the vector (references to our planet strings, in our example) lives in the heap, and the vector can grow if need be. What does this look like in the memory?
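A sketch of what creating such a coloring book could look like. The helper name `new_coloring_book` and the exact spelling of the planet strings are my assumptions; the text only tells us that there are 8 planet pages stored as string literals.

```rust
// Hypothetical helper that builds the coloring book described above.
fn new_coloring_book() -> Vec<&'static str> {
    vec![
        "mercury", "venus", "earth", "mars",
        "jupiter", "saturn", "uranus", "neptune",
    ]
}

fn main() {
    let free_coloring_book = new_coloring_book();
    println!("{:?}", free_coloring_book);
}
```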
The stack contains metadata about the data on the heap. length shows that free_coloring_book is a vector that owns 8 elements, and capacity shows how much room is reserved for this vector in case it grows. In my diagram, it’s about twice as big as length, although the capacity a real vector gets depends on how it was created. pointer contains the address in the heap where the actual elements live. Each element is a pair of a pointer to a str and its length, known as a slice in Rust’s parlance. For example, "mercury" has a length of 7.
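These three pieces of metadata can be read back from a real vector. The sketch below uses hypothetical helpers `planets` and `metadata`. One caveat: a vector built with `vec!` typically reports a capacity equal to its length, so the spare room does not show up until the vector has grown or was created with extra capacity.

```rust
use std::mem::size_of;

fn planets() -> Vec<&'static str> {
    vec![
        "mercury", "venus", "earth", "mars",
        "jupiter", "saturn", "uranus", "neptune",
    ]
}

// Returns (length, capacity, length of the first element).
fn metadata(book: &Vec<&'static str>) -> (usize, usize, usize) {
    (book.len(), book.capacity(), book[0].len())
}

fn main() {
    let free_coloring_book = planets();
    let (length, capacity, first_len) = metadata(&free_coloring_book);
    println!("length   = {}", length);
    println!("capacity = {}", capacity);
    // Address of the first element in the heap.
    println!("pointer  = {:p}", free_coloring_book.as_ptr());
    println!("first element has length {}", first_len);
    // Each &str element is itself a pointer plus a length.
    println!("one &str element takes {} bytes", size_of::<&str>());
}
```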
Now, let’s say a friend claims ownership of this coloring book. Our friend will make it mutable, so that they can add pages if they want. What would this look like in the memory?
free_coloring_book is no longer accessible, and Rust’s borrow checker will tell us if we try to access it.
It’s because the value of free_coloring_book moved to friends_coloring_book. And we can see that friends_coloring_book is totally accessible!
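A minimal sketch of the move, with the planet strings assumed as before. It also checks that the move only hands over the metadata in the stack: the elements stay at the same address in the heap. The line that would trigger the compiler error is left commented out.

```rust
// `mut` goes unused in this sketch because no page is added yet.
#[allow(unused_mut)]
fn move_keeps_heap_buffer() -> bool {
    let free_coloring_book = vec![
        "mercury", "venus", "earth", "mars",
        "jupiter", "saturn", "uranus", "neptune",
    ];
    let heap_address_before = free_coloring_book.as_ptr();

    // Ownership moves to the friend. Only the pointer, length and capacity
    // are handed over; the elements in the heap are not touched.
    let mut friends_coloring_book = free_coloring_book;

    // Uncommenting the next line fails to compile with
    // error[E0382]: borrow of moved value: `free_coloring_book`
    // println!("{:?}", free_coloring_book);

    println!("{:?}", friends_coloring_book);
    heap_address_before == friends_coloring_book.as_ptr()
}

fn main() {
    println!("same heap buffer after the move: {}", move_keeps_heap_buffer());
}
```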
Upon flipping through the pages of the coloring book, our friend gets disappointed that the coloring book missed out on Pluto. So they decide to add Pluto to their coloring book. Because the coloring book is a vector, they can add a page pretty easily:
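Adding the page could be sketched like this. The function name is hypothetical and the planet strings are assumed as before.

```rust
fn friends_book_with_pluto() -> Vec<&'static str> {
    let free_coloring_book = vec![
        "mercury", "venus", "earth", "mars",
        "jupiter", "saturn", "uranus", "neptune",
    ];
    // The friend takes ownership and makes the binding mutable ...
    let mut friends_coloring_book = free_coloring_book;
    // ... which is what allows adding a new page at the end.
    friends_coloring_book.push("pluto");
    friends_coloring_book
}

fn main() {
    println!("{:?}", friends_book_with_pluto());
}
```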
What would happen in the memory?
capacity remains the same, because the vector still has enough room in case it grows. length, however, has now changed to 9. This tells us that the vector holds 9 elements, starting at the address stored in pointer.
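To reproduce this picture in code, the sketch below reserves room for 16 pages up front with `Vec::with_capacity`. That number is my assumption, chosen to match "about twice as big". With spare room available, pushing a ninth page changes the length but not the capacity. A vector built with `vec!` usually has no spare room, so the same push would make it reallocate and report a larger capacity.

```rust
// Returns (length before, length after, capacity before, capacity after).
fn push_with_spare_room() -> (usize, usize, usize, usize) {
    // Assumption: room for 16 pages is reserved up front.
    let mut book: Vec<&'static str> = Vec::with_capacity(16);
    book.extend_from_slice(&[
        "mercury", "venus", "earth", "mars",
        "jupiter", "saturn", "uranus", "neptune",
    ]);
    let (len_before, cap_before) = (book.len(), book.capacity());

    book.push("pluto"); // fits in the reserved space, no reallocation

    (len_before, book.len(), cap_before, book.capacity())
}

fn main() {
    let (len_before, len_after, cap_before, cap_after) = push_with_spare_room();
    println!("length:   {} -> {}", len_before, len_after);
    println!("capacity: {} -> {}", cap_before, cap_after);
}
```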
So far, our friend claimed ownership of the coloring book, and they modified it by adding "pluto" to the book. What if you didn’t like this change? You decide to make a clone of our friend’s coloring book so that you can make your own version.
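A sketch of making the clone. The name `your_coloring_book` is my placeholder for the cloned book. The check at the end shows that the clone has the same contents but its own separate buffer in the heap.

```rust
// Returns (same contents?, same heap buffer?).
fn clone_is_independent() -> (bool, bool) {
    let friends_coloring_book = vec![
        "mercury", "venus", "earth", "mars", "jupiter",
        "saturn", "uranus", "neptune", "pluto",
    ];

    // `clone` allocates a new heap buffer and copies the elements into it.
    let your_coloring_book = friends_coloring_book.clone();

    let same_contents = your_coloring_book == friends_coloring_book;
    let same_heap_buffer =
        your_coloring_book.as_ptr() == friends_coloring_book.as_ptr();
    (same_contents, same_heap_buffer)
}

fn main() {
    let (same_contents, same_heap_buffer) = clone_is_independent();
    println!("same contents: {}", same_contents);
    println!("same heap buffer: {}", same_heap_buffer);
}
```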
When the data is cloned, it creates an exactly identical copy of the data that is independent of the original data [0]. You argue that Pluto is a dwarf planet, and should not be on a coloring book of planets. So you remove the last element ("pluto") from your copy.
Your clone and friends_coloring_book are different! They are independent of each other, and that’s why they are both accessible by println!. What would our coloring books look like in the memory?
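The clone-then-remove step could be sketched as follows, again using the placeholder name `your_coloring_book` for the clone. Both books can be printed afterwards, and only yours has lost its last page.

```rust
// Returns (friend's book, your book) after you remove the last page.
fn two_books() -> (Vec<&'static str>, Vec<&'static str>) {
    let friends_coloring_book = vec![
        "mercury", "venus", "earth", "mars", "jupiter",
        "saturn", "uranus", "neptune", "pluto",
    ];
    let mut your_coloring_book = friends_coloring_book.clone();
    your_coloring_book.pop(); // removes "pluto" from your copy only
    (friends_coloring_book, your_coloring_book)
}

fn main() {
    let (friends_coloring_book, your_coloring_book) = two_books();
    println!("friend's book: {:?}", friends_coloring_book);
    println!("your book:     {:?}", your_coloring_book);
}
```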
Now, the data in the heap still includes "pluto", but the length on the stack has decreased by one. This way, the computer can be lazy about actually truncating data, which would be computationally expensive.
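One way to see this laziness from safe code is to compare length and capacity around the removal. In this sketch (the function name is hypothetical), `pop` shortens the length, but the reserved space in the heap stays the same because a vector does not shrink its allocation on its own.

```rust
// Returns (length before, length after, capacity before, capacity after).
fn pop_keeps_capacity() -> (usize, usize, usize, usize) {
    let mut your_coloring_book = vec![
        "mercury", "venus", "earth", "mars", "jupiter",
        "saturn", "uranus", "neptune", "pluto",
    ];
    let (len_before, cap_before) =
        (your_coloring_book.len(), your_coloring_book.capacity());

    let removed = your_coloring_book.pop();
    assert_eq!(removed, Some("pluto"));

    (
        len_before,
        your_coloring_book.len(),
        cap_before,
        your_coloring_book.capacity(),
    )
}

fn main() {
    let (len_before, len_after, cap_before, cap_after) = pop_keeps_capacity();
    println!("length:   {} -> {}", len_before, len_after);
    println!("capacity: {} -> {}", cap_before, cap_after);
}
```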
Things work slightly differently with types that don’t require storing descendent data in the heap, such as a number or a character. These special types don’t own other elements on the heap like a vector does. These types implement the Copy trait, which means when the data is duplicated, it is copied bit-by-bit at the surface level. If this were to happen with a vector, the copied object would result in a duplicate pointer to the heap, and it would not be clear which copy is responsible for destroying the heap elements. But, a surface level copy is sufficient for Copy types because their data is simple and owns nothing in the heap.
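Your own types can opt in to this behaviour too, as long as all of their fields are Copy. Here is a sketch with a hypothetical `Coordinates` struct, which is not part of the coloring book example. A struct with a vector field could not derive Copy, because the vector owns data in the heap and the compiler would reject it.

```rust
// Hypothetical type made only of numbers, so it is allowed to be Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Coordinates {
    x: i32,
    y: i32,
}

fn copy_leaves_original_usable() -> (Coordinates, Coordinates) {
    let original = Coordinates { x: 1, y: 2 };
    // A bit-by-bit copy: `original` stays valid after this line.
    let mut duplicate = original;
    duplicate.x = 10;
    (original, duplicate)
}

fn main() {
    let (original, duplicate) = copy_leaves_original_usable();
    println!("original:  {:?}", original);
    println!("duplicate: {:?}", duplicate);
}
```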
Let’s see what this means. In my previous blog post, I wrote:
The reason the compiler was able to print both x and y is that x and y are different values on the stack. Unlike with vectors, in which case the value moved, when the compiler sees let mut y = x;, it creates y as a separate copy of x, because x is a Copy type.
What does this look like in the stack? Let’s visualize it:
5 doesn’t own other data types, so x and y are able to be stored only in the stack. When y is incremented, only y’s copy changes; x keeps its original value.
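A minimal sketch of this copy behaviour. The starting value 5 and the line `let mut y = x;` come from the text; incrementing by 1 and the function name are my assumptions.

```rust
// Returns (x, y) after y has been incremented.
fn copy_example() -> (i32, i32) {
    let x = 5;
    let mut y = x; // `i32` is Copy, so `y` gets its own 5 in the stack
    y += 1; // changes only `y`
    (x, y)
}

fn main() {
    let (x, y) = copy_example();
    println!("x = {}, y = {}", x, y);
}
```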
When we create a reference, like we did in the previous blog post:
So when you make a change like
*y += 1, it would look like:
Interesting! By incrementing *y, x’s data was modified.
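The reference version can be sketched like this, assuming x starts at 5 as in the copy example. Here y does not hold its own number. It holds the address of x, so writing through *y changes x itself.

```rust
// Returns the value of x after it was incremented through a reference.
fn increment_through_reference() -> i32 {
    let mut x = 5;
    let y = &mut x; // `y` borrows `x` mutably; no new number is created
    *y += 1; // follows the reference and modifies `x` in place
    x
}

fn main() {
    println!("x = {}", increment_through_reference());
}
```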
Let’s go back to the question I started with: why do seasoned Rustaceans strongly prefer borrowing over cloning or copying variables? It’s because by borrowing instead of cloning or copying, you can increase performance and potentially save huge amounts of memory!
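As a rough illustration with hypothetical function names: counting the pages of a book only needs to look at the book, so borrowing is enough. A function that insists on owning the book forces the caller to either give the book away or clone it, and the clone duplicates the whole heap buffer.

```rust
// Borrows the book: only a pointer and a length are passed along.
fn count_pages(book: &[&str]) -> usize {
    book.len()
}

// Takes ownership: the caller must give the book away or clone it first.
fn count_pages_owned(book: Vec<&str>) -> usize {
    book.len()
}

fn borrow_versus_clone() -> (usize, usize) {
    let friends_coloring_book = vec![
        "mercury", "venus", "earth", "mars", "jupiter",
        "saturn", "uranus", "neptune", "pluto",
    ];

    // No extra allocation, and the book stays usable afterwards.
    let borrowed_count = count_pages(&friends_coloring_book);

    // Keeping the book usable here requires a second heap allocation.
    let cloned_count = count_pages_owned(friends_coloring_book.clone());

    println!("{:?}", friends_coloring_book);
    (borrowed_count, cloned_count)
}

fn main() {
    println!("{:?}", borrow_versus_clone());
}
```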
Also, if you follow data ownership all the way back to the owning variable, when that variable goes out of scope, the whole tree of owned data is cleaned up. This removes the need for garbage collection, which can be a computationally expensive process. This makes Rust super fast, and is another reason Rust’s borrow system is awesome.
I’ve been really enjoying learning Rust, because it has encouraged me to think about what’s happening inside the computer when I type something. By copying and cloning, is my program taking up more space in memory than necessary? Can I make a reference instead? How is memory managed differently in other languages, like Python?
So much to learn!
[0] CUViper on the Rust subreddit kindly pointed out an error in my diagram. Planets are &'static str, and when they are cloned, only the &str itself (a pointer and a length) is copied. The actual &'static str data are not cloned. If the planets were String (which is internally a vector of UTF-8 bytes), then the actual data would be cloned as well. My diagram makes it look like the string data are cloned as well, but that is not what actually would happen. I didn't know about this! Thank you u/CUViper!
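The difference described in this note can be checked with a small sketch (the function and variable names are hypothetical). It compares the address of the first planet’s text in the original and in the clone: with &'static str the two point at the same text, while with String each book gets its own copy of the text.

```rust
// Returns (do the &str clones share text?, do the String clones share text?).
fn clone_pointer_comparison() -> (bool, bool) {
    let literal_book: Vec<&'static str> = vec!["mercury", "venus"];
    let literal_clone = literal_book.clone();
    // Only the (pointer, length) pairs were copied.
    let literals_share_text =
        literal_book[0].as_ptr() == literal_clone[0].as_ptr();

    let owned_book: Vec<String> =
        vec![String::from("mercury"), String::from("venus")];
    let owned_clone = owned_book.clone();
    // Each String was cloned into its own heap allocation.
    let strings_share_text =
        owned_book[0].as_ptr() == owned_clone[0].as_ptr();

    (literals_share_text, strings_share_text)
}

fn main() {
    let (literals_share_text, strings_share_text) = clone_pointer_comparison();
    println!("&str clones share text:   {}", literals_share_text);
    println!("String clones share text: {}", strings_share_text);
}
```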