Good work! You’re almost through the book—there are only five chapters left. For this chapter and the next, we are going to sit back and relax and go on a short tour of the standard library, including further details on some of the types we already know. You will certainly end up encountering these modules and methods as you continue to use Rust, so we might as well learn them now so that they are already familiar to you. Nothing in this chapter will be particularly difficult to learn, and we’ll keep things pretty brief and run through one type per section.
Arrays have become easier to work with over time, as we saw in the chapter on const generics. Some other nice changes have taken place that we’ll take a look at now.
In the past (before Rust 1.53), arrays didn’t implement IntoIterator
, and you needed to use methods like .iter()
on them in for
loops. (Another method was to use &
to get a slice in for
loops). So, the following code didn’t work in the past:
fn main() {
    let my_cities = ["Beirut", "Tel Aviv", "Nicosia"];
    for city in my_cities {
        println!("{}", city);
    }
}
The compiler used to give the following message:
error[E0277]: `[&str; 3]` is not an iterator
--> src\main.rs:5:17
|
| ^^^^^^^^^ borrow the array with `&` or call `.iter()`
➥ on it to iterate over it
Luckily, that isn’t a problem anymore! If you see any old Rust tutorials that mention that arrays can’t be used as iterators, remember that this isn’t the case anymore. So all three of these work:
fn main() {
    let my_cities = ["Beirut", "Tel Aviv", "Nicosia"];
    for city in my_cities {
        println!("{city}");
    }
    for city in &my_cities {
        println!("{city}");
    }
    for city in my_cities.iter() {
        println!("{city}");
    }
}
Beirut
Tel Aviv
Nicosia
Beirut
Tel Aviv
Nicosia
Beirut
Tel Aviv
Nicosia
Destructuring works with arrays as well. To pull out variables from an array, you can put their names inside []
to destructure it in the same way as in a tuple or a named struct. This is the same as using a tuple in match
statements or to get variables from a struct:
fn main() {
    let my_cities = ["Beirut", "Tel Aviv", "Nicosia"];
    let [city1, _city2, _city3] = my_cities;
    println!("{city1}");
}
Here’s an example of some more complex destructuring, which pulls out the first and last variable in an array:
fn main() {
    let my_cities = [
        "Beirut", "Tel Aviv", "Calgary", "Nicosia", "Seoul", "Kurume",
    ];
    let [first, .., last] = my_cities;
    println!("{first}, {last}");
}
The output this time will be Beirut, Kurume
.
Arrays have a .map()
method as well that lets you return an array of the same size but of a different type (or the same type, if you wish). It’s like the .map()
method for iterators, except you don’t have to call .collect()
because it already knows the array length and type. Here is a quick example:
fn main() {
    let int_array = [1, 5, 9, 13, 17, 21, 25, 29];
    let string_array = int_array.map(|i| i.to_string());
    println!("{int_array:?}");
    println!("{string_array:?}");
}
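The output is no surprise, but note that the original array is still available after calling .map(). The method takes self by value, but an array of integers implements Copy, so the original is copied rather than moved:
[1, 5, 9, 13, 17, 21, 25, 29]
["1", "5", "9", "13", "17", "21", "25", "29"]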
And here is an example of the same method that is a bit more interesting. We’ll make an Hours
enum that implements From<u32>
to determine whether an hour is a working hour, a non-working hour, or an error (an hour greater than 24):
#[derive(Debug)] enum Hours { Working(u32), NotWorking(u32), Error(u32), } impl From<u32> for Hours { fn from(value: u32) -> Self { match value { hour if (8..17).contains(&hour) => ➥Hours::Working(value), ① hour if (0..=24).contains(&hour) => ➥Hours::NotWorking(value), ② wrong_hour => Hours::Error(wrong_hour), } } } fn main() { let int_array = [1, 5, 9, 13, 17, 21, 25, 29]; let hours_array = int_array.map(Hours::from); println!("{hours_array:?}"); }
① Here, we will use an exclusive range (up to, but not including, 17) because if you work until 5 pm, and it’s 5 pm, you’re already going home and not working anymore.
② For the rest of the numbers, we will make the range inclusive. We already checked for working hours, so we are safe to match on anything between 0 and 24.
[NotWorking(1), NotWorking(5), Working(9), Working(13), NotWorking(17),
➥NotWorking(21), Error(25), Error(29)]
Knowing this .map()
method will come in handy for the next method, called from_fn()
.
The from_fn()
method was released fairly recently in the summer of 2022 with Rust 1.63; it allows you to construct an array on the spot. The from_fn()
method was introduced with the following code sample. Don’t worry if it doesn’t make much sense because a lot of people felt the same way when they first saw it:
fn main() {
    let array = std::array::from_fn(|i| i);
    assert_eq!(array, [0, 1, 2, 3, 4]);
}
You can imagine that there was a lot of discussion about this sample. How does it even work? How can you just write (|i| i)
and get [0, 1, 2, 3, 4]
? This sample was later improved to reduce confusion, but let’s take a look on our own to see why the code works. First, we’ll look at the code inside from_fn()
:
pub fn from_fn<T, const N: usize, F>(mut cb: F) -> [T; N]
where
    F: FnMut(usize) -> T,
{
    let mut idx = 0;
    [(); N].map(|_| {
        let res = cb(idx);
        idx += 1;
        res
    })
}
The first lines tell us that this method makes an array of type T
and a length of N
and that it takes a closure. The closure is called cb
(for callback), but it could be called anything: f, my_closure
, and so on. Then, inside the function, it starts with a variable called idx
(the index), which starts at 0. Then it quickly makes an array of unit types (the ()
type) of the same length as N
and uses .map()
to make the new array. For each item, it carries out the instructions inside, which include increasing the index by 1 each time before returning the value under the variable name res
.
In other words, when you call from_fn
, you have the option to use the index number. If you don’t want to, you can write |_|
instead. Here’s an example:
fn main() { ①
let array = std::array::from_fn(|_| "Don't care about the index");
assert_eq!(
array,
[
"Don't care about the index",
"Don't care about the index",
"Don't care about the index",
"Don't care about the index",
"Don't care about the index"
]
);
}
① We could take the index for the array we are creating, but we don’t care about it.
So far, so good. But how did it know the length? Here, this is because of type inference. An array can only be compared to an array of the same type and length, so when you add an assert_eq!
, the compiler will know that the array to compare will also have to be the same type and length. And that means that if you take out the assert_eq!
, the code won’t compile!
fn main() {
    let array = std::array::from_fn(|_| "Don't care about the index");
}
The error message shows us that the compiler was able to determine the type of the array but not its length:
error[E0282]: type annotations needed for `[&str; _]`
 --> src\main.rs:2:9
  |
2 |     let array = std::array::from_fn(|_| "Don't care about the index");
  |         ^^^^^
  |
help: consider giving `array` an explicit type, where the the value of
➥const parameter `N` is specified
  |
2 |     let array: [&str; _] = std::array::from_fn(|_| "Don't care about
➥the index");
  |              +++++++++++
And because it was able to determine the type, we can either write [&str; 5]
or [_; 5]
, and that will be enough information. So the next two arrays will work just fine:
fn main() {
    let array: [_; 5] = std::array::from_fn(|_| "Don't need the index");
    let array: [&str; 5] = std::array::from_fn(|_| "Don't need the index");
}
When using from_fn()
for an array, you can pull in the index of each item if you want to use it or use |_|
if you don’t need it.
Most of the time, you will have to tell the compiler the length of the array.
If you are comparing one array to another, you won’t need to tell the compiler the length. But you might want to write out the length anyway for the benefit of anyone else reading your code.
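To see the index in action, here is a minimal sketch that uses it to build an array of square numbers. The squares() helper is hypothetical (it is not part of the standard library); it relies on a const generic N, so the length comes either from a type annotation or from a turbofish:

```rust
use std::array;

// Hypothetical helper: builds the first N square numbers.
// N is a const generic, so the caller decides the length.
fn squares<const N: usize>() -> [usize; N] {
    // `i` is the index that from_fn() passes to the closure for each item.
    array::from_fn(|i| i * i)
}

fn main() {
    // The annotation supplies the length; the closure supplies each item.
    let five: [usize; 5] = squares();
    println!("{five:?}");

    // A turbofish works too if you prefer not to annotate the variable.
    let three = squares::<3>();
    println!("{three:?}");
}
```

This prints [0, 1, 4, 9, 16] and then [0, 1, 4]. In both calls, the compiler learns the length from something we wrote, just as it did from the assert_eq! in the earlier sample.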
Our old friend char
is pretty familiar by now, but let’s take a look at a few neat things that we might have missed.
You can use the .escape_unicode()
method to get the Unicode number for a char
:
fn main() {
let korean_word = "청춘예찬";
for character in korean_word.chars() {
print!("{} ", character.escape_unicode());
}
}
This prints \u{ccad} \u{cd98} \u{c608} \u{cc2c}
.
You can get a char
from u8
using the From
trait. However, to make a char
from a u32
, you have to use TryFrom
because it might not work. There are many more numbers in u32
than characters in Unicode. We can see this with a simple demonstration. We will first print a char
from a random u8
, and then try 100,000 times to make a char
from a random u32
:
use rand::random;
fn main() {
println!("This will always work: {}", char::from(100)); ①
println!("So will this: {}", char::from(random::<u8>()));
for _ in 0..100_000 {
if let Ok(successful_character) = char::try_from(random::<u32>()) {
print!("{successful_character}");
}
}
}
① The only implementation of From for char is From<u8>, so Rust will automatically choose a u8. It won’t compile if the number is too large for a u8.
The output will be different every time, but even after 100,000 tries, the number of successful characters will be very small. And most of them will end up being Chinese characters, because there are so many of them:
This will always work: d
So will this: Ñ
艴 薪 뙨 聍 掾
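This makes sense because a char must be a Unicode scalar value: a number from 0 to 0x10FFFF, excluding the surrogate range 0xD800 to 0xDFFF. That leaves 1,112,064 valid values (of which, at present, Unicode has assigned 149,186 characters), while a u32 can go up to 4,294,967,295. So, the chance that a random u32 is a valid char is extremely low: about 0.026%, or roughly 26 successes in 100,000 tries. There is also a high chance that the character won’t show on your screen, either because no character has been assigned to that number yet or because you don’t have the fonts installed for the language of the character.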
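If you would rather not rely on random numbers, you can probe the boundaries directly. A char holds a Unicode scalar value, which is any number from 0 to 0x10FFFF except the surrogate range 0xD800 to 0xDFFF. The following sketch checks a few values on either side of those limits; the is_valid_char() helper is a hypothetical name used only for this illustration:

```rust
// Hypothetical helper: true if the u32 can be turned into a char.
fn is_valid_char(n: u32) -> bool {
    char::try_from(n).is_ok()
}

fn main() {
    let candidates = [
        0x41,      // the letter 'A'
        0xD7FF,    // last value before the surrogate range
        0xD800,    // first surrogate: not a char
        0xDFFF,    // last surrogate: not a char
        0xE000,    // first value after the surrogate range
        0x10FFFF,  // the same value as char::MAX
        0x110000,  // one past char::MAX
        u32::MAX,
    ];
    for n in candidates {
        println!("{n:#x}: {}", is_valid_char(n));
    }
}
```

Only 0xD800, 0xDFFF, 0x110000, and u32::MAX come back as false. Everything else converts, even when no character has been assigned to that number yet.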
We learned near the beginning of the book that all chars are 4 bytes in length. If you want to know how many bytes a char
would be if it were a &str
, you can use the .len_utf8()
method. Let’s put some greetings in and see how many bytes each character would be:
fn main() { "Hi, привіт, 안녕, 𓋹 𓍑 𓋴" .chars() .for_each(|c| println!("{c}: {}", c.len_utf8())); }
H: 1 i: 1 ,: 1 : 1 п: 2 р: 2 и: 2 в: 2 і: 2 т: 2 ,: 1 : 1 안: 3 녕: 3 ,: 1 : 1 𓋹: 4 𓍑: 4 𓋴: 4
There are a ton of convenience methods for char
that are pretty easy to understand by their name, such as .is_alphanumeric(), .is_whitespace()
, and .make_ascii_uppercase()
. There’s a good chance that a convenience method already exists if you need to validate or modify a char
in your code.
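Here is a small sketch that combines a few of these methods. The clean_up() function is a hypothetical helper for this example; the char methods it calls (.is_alphanumeric(), .to_ascii_uppercase(), .make_ascii_uppercase(), and .to_digit()) all come from the standard library:

```rust
// Hypothetical helper: keeps only letters and digits,
// and uppercases any ASCII letters along the way.
fn clean_up(input: &str) -> String {
    input
        .chars()
        .filter(|c| c.is_alphanumeric())
        .map(|c| c.to_ascii_uppercase())
        .collect()
}

fn main() {
    println!("{}", clean_up("rust 2021, edition!"));

    let mut letter = 'q';
    letter.make_ascii_uppercase(); // changes the char in place
    println!("{letter}");

    // to_digit() returns an Option because not every char is a digit.
    println!("{:?} {:?}", '7'.to_digit(10), 'x'.to_digit(10));
}
```

Note the difference between .to_ascii_uppercase(), which returns a new char, and .make_ascii_uppercase(), which needs a mutable char and changes it in place.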
There are a lot of math methods for these types, like multiplying by powers, Euclidean modulo, logarithms, and so on, that we don’t need to look at here. But there are some other methods that are useful in our day-to-day work.
Integers all have the methods .checked_add(), .checked_sub(), .checked_mul()
, and .checked_div()
. These are good to use if you think you might produce a number that will overflow or underflow (i.e., be greater than the type’s maximum value or less than its minimum value). They return an Option
so you can safely check that your math works without making the program panic.
You might be wondering why Rust would even compile if a number overflows. It’s true that the compiler won’t compile if it knows at compile time that a number will overflow—for example:
fn main() {
    let some_number = 200_u8;
    println!("{}", some_number + 200);
}
It is pretty obvious (even to us) that the result would be 400, which won’t fit into a u8
, and the compiler knows this as well:
error: this arithmetic operation will overflow
--> src/main.rs:3:20
|
3 | println!("{}", some_number + 200);
| ^^^^^^^^^^^^^^^^^ attempt to compute `200_u8 +
➥200_u8`, which would overflow
|
= note: `#[deny(arithmetic_overflow)]` on by default
However, if a number isn’t known at compile time, the compiler can’t catch the overflow, and what happens at run time depends on the build mode.
Let’s trick the compiler into making this happen. First, we will make a u8
with a value of 255
, the highest value for a u8
. Then we will use the rand
crate to add 10 to it:
use rand::{thread_rng, Rng};
fn main() {
let mut rng = thread_rng();
let some_number = 255_u8;
println!("{}", some_number + rng.gen_range(10..=10)); ①
}
① We know that a range of 10..=10 will only return 10, but the Rust compiler doesn’t know this at compile time, so it will let us run the program.
In Release
mode, the number will overflow and wrap around, and the program will print 9 (265 - 256 = 9)
without panicking. But in Debug
mode, we will see this:
     Running `target/debug/playground`
thread 'main' panicked at 'attempt to add with overflow', src/main.rs:6:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
We certainly don’t want to panic, and we also don’t want to add 10 to 255 and get 9. So let’s use .checked_add()
instead. Now we will never overflow or panic:
use rand::random; fn add_numbers(one: u8, two: u8) { match one.checked_add(two) { Some(num) => println!("Added {one} to {two}: {num}"), None => println!("Error: couldn't add {one} to {two}"), } } fn main() { for _ in 0..3 { let some_number = random::<u8>(); let other_number = random::<u8>(); add_numbers(some_number, other_number); } }
The output will be different every time, but it will look something like this:
Error: couldn't add 199 to 236 Added 34 to 97: 131 Added 61 to 109: 170
Environments that silently ignore integer overflows have been to blame for all kinds of crashes and security problems over the years, which is what makes methods like .checked_add()
particularly nice for a systems programming language. Be sure to use the .checked_
methods whenever you think an overflow could take place! And if you are often working with numbers that are larger than any integer in the standard library, take a look at the num_bigint
crate (https://docs.rs/num-bigint/latest/num_bigint/).
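The .checked_ methods are not the only option. Integers also have .wrapping_ and .saturating_ methods for when you want a specific overflow behavior instead of an Option. The sketch below compares the three using a hypothetical add_three_ways() helper, and it also shows that .checked_div() returns None when dividing by zero:

```rust
// Hypothetical helper: adds two u8 values in three different ways.
fn add_three_ways(one: u8, two: u8) -> (Option<u8>, u8, u8) {
    (
        one.checked_add(two),    // None on overflow
        one.wrapping_add(two),   // wraps around past 255, in Debug and Release
        one.saturating_add(two), // stops at 255
    )
}

fn main() {
    println!("{:?}", add_three_ways(255, 10));
    println!("{:?}", add_three_ways(34, 97));

    // The other checked methods behave the same way.
    println!("{:?}", 0_u8.checked_sub(1));
    println!("{:?}", 16_u8.checked_mul(16));
    println!("{:?}", 10_u8.checked_div(0));
}
```

The first line of output is (None, 9, 255): the checked version refuses, the wrapping version gives the same 9 that Release mode would, and the saturating version stays at the maximum. The last three lines all print None.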
You might have noticed that the methods for integers use the variable name rhs
a lot. For example, the documentation on the method .checked_add()
starts with this:
pub const fn checked_add(self, rhs: i8) -> Option<i8>
Checked integer addition. Computes self + rhs, returning None if overflow
➥occurred.
The term rhs means “right-hand side”— in other words, the right-hand side when you do some math. For example, in 5 + 6
, the number 5
is on the left and 6
is on the right, so 6
is the rhs
. It is not a keyword, but you will see rhs
a lot in the standard library, so it’s good to know.
While we are on the subject, let’s learn how to implement Add
, which is the trait used for the +
operator in Rust. In other words, after you implement Add
, you can use +
on a type that you create. You need to implement Add
yourself (you can’t just use #[derive(Add)]
) because it’s impossible to guess how you might want to add one type to another type. Here’s the example from the page in the standard library:
use std::ops::Add; ① #[derive(Debug, Copy, Clone, PartialEq)] ② struct Point { x: i32, y: i32, } impl Add for Point { type Output = Self; ③ fn add(self, other: Self) -> Self { Self { x: self.x + other.x, y: self.y + other.y, } } }
① Add is found inside the std::ops module, which has all the traits used for operations. You can probably guess that the other traits have names like Sub, Mul, and so on.
② PartialEq is probably the most important part here. You want to be able to compare numbers.
③ Remember, this is called an associated type—a type that “goes together” with a trait. In this case, it’s another Point.
Now let’s implement Add
for our own type just for fun. Let’s imagine that we have a Country
struct that we’d like to add to another Country
. As long as we tell Rust how we want to add one to the other, Rust will cooperate, and then we will be able to use +
to add them. It looks like this:
use std::fmt; use std::ops::Add; #[derive(Clone)] struct Country { name: String, population: u32, gdp: u32, ① } impl Country { fn new(name: &str, population: u32, gdp: u32) -> Self { Self { name: name.to_string(), population, gdp, } } } impl Add for Country { type Output = Self; fn add(self, other: Self) -> Self { Self { name: format!("{} and {}", self.name, other.name), ② population: self.population + other.population, gdp: self.gdp + other.gdp, } } } impl fmt::Display for Country { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { write!( f, "In {} are {} people and a GDP of ${}", self.name, self.population, self.gdp ) } } fn main() { let nauru = Country::new("Nauru", 12_511, 133_200_000); let vanuatu = Country::new("Vanuatu", 219_137, 956_300_000); let micronesia = Country::new("Micronesia", 113_131, 404_000_000); println!("{}", nauru); let nauru_and_vanuatu = nauru + vanuatu; println!("{nauru_and_vanuatu}"); println!("{}", nauru_and_vanuatu + micronesia); }
② We decide that add means to concatenate the names, combine the population, and combine the GDP. It’s entirely up to us what we want Add to mean.
In Nauru are 12511 people and a GDP of $133200000
In Nauru and Vanuatu are 231648 people and a GDP of $1089500000
In Nauru and Vanuatu and Micronesia are 344779 people and a GDP of
➥$1493500000
The three others are called Sub, Mul
, and Div
, and they are basically the same to implement. There are quite a few other operators in the same module, such as +=, -=, *=
, and /=
, which use traits that start with the name Assign
: AddAssign, SubAssign, MulAssign
, and DivAssign
. You can see the full list of such traits here: http://mng.bz/468j. They are all named in a pretty predictable fashion. For example, %
is called Rem, -
is called Neg
, and so on.
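As a quick illustration of how similar these traits are, here is a sketch that implements AddAssign (for +=) and Neg (for a leading -) on a Point struct like the one from the standard library example. The choice of what += and - mean for a Point is our own assumption for this example:

```rust
use std::ops::{AddAssign, Neg};

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// AddAssign takes &mut self and changes the value in place,
// so there is no associated Output type.
impl AddAssign for Point {
    fn add_assign(&mut self, other: Self) {
        self.x += other.x;
        self.y += other.y;
    }
}

// Neg takes only self because the - sign in front has no right-hand side.
impl Neg for Point {
    type Output = Self;

    fn neg(self) -> Self {
        Self {
            x: -self.x,
            y: -self.y,
        }
    }
}

fn main() {
    let mut point = Point { x: 1, y: 2 };
    point += Point { x: 10, y: 20 };
    println!("{point:?}");
    println!("{:?}", -point);
}
```

This prints Point { x: 11, y: 22 } followed by Point { x: -11, y: -22 }.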
Two other convenient traits, PartialEq
(http://mng.bz/JdwK) and PartialOrd
(http://mng.bz/PRwY), are used to compare and order one variable with another. After these traits are implemented, you will be able to use signs like <
and ==
for your type in the same way that implementing Add
lets you use the +
sign.
Because comparing for equality and order are done among variables of the same type, these traits are easier to implement and are usually done using #[derive]
, as we saw in chapter 13. But you can also manually implement them if you want. As always, the standard library contains some simple examples implementing these traits that you can copy and paste and then change to suit your own type if you want to manually implement them.
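One reason to implement these traits by hand is when you want some fields to be ignored. The following sketch uses a hypothetical Version struct in which the label field plays no part in comparisons, something that #[derive] can’t express:

```rust
use std::cmp::Ordering;

// Hypothetical type: only major and minor matter when comparing.
#[derive(Debug)]
struct Version {
    major: u32,
    minor: u32,
    label: String,
}

impl Version {
    fn new(major: u32, minor: u32, label: &str) -> Self {
        Self {
            major,
            minor,
            label: label.to_string(),
        }
    }
}

impl PartialEq for Version {
    fn eq(&self, other: &Self) -> bool {
        self.major == other.major && self.minor == other.minor
    }
}

impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        // Tuples compare item by item from left to right,
        // so major is checked first and minor breaks ties.
        (self.major, self.minor).partial_cmp(&(other.major, other.minor))
    }
}

fn main() {
    let stable = Version::new(1, 70, "stable");
    let nightly = Version::new(1, 70, "nightly");
    let old = Version::new(1, 53, "old");

    println!("{} == {}: {}", stable.label, nightly.label, stable == nightly);
    println!("{} < {}: {}", old.label, stable.label, old < stable);
}
```

Both lines end with true. Keeping eq() and partial_cmp() consistent with each other (both ignore label here) is important; otherwise == and < could disagree about the same pair of values.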
f32
and f64
have a very large number of methods that you use when doing math. We won’t look at those, but here are some methods that you might use. They are: .floor(), .ceil(), .round()
, and .trunc()
. All of these return an f32
or an f64
that is like an integer (i.e., a whole number). They do the following:
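.floor()
—Gives you the next lowest integer (rounds toward negative infinity).
.ceil()
—Gives you the next highest integer (rounds toward positive infinity).
.round()
—Gives you the nearest integer, with halfway cases such as 0.5 rounded away from zero. This is called rounding because it gives you a “round” number (a number that has a short, simple form).
.trunc()
—Cuts off the part after the period. Truncate means “to cut off.”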
Here is a simple sample that prints them:
fn four_operations(input: f64) { println!( "For the number {}: floor: {} ceiling: {} rounded: {} truncated: {}\n", input, input.floor(), input.ceil(), input.round(), input.trunc() ); } fn main() { four_operations(9.1); four_operations(100.7); four_operations(-1.1); four_operations(-19.9); }
For the number 9.1: floor: 9 ceiling: 10 rounded: 9 ① truncated: 9 For the number 100.7: floor: 100 ceiling: 101 rounded: 101 ② truncated: 100 For the number -1.1: floor: -2 ceiling: -1 rounded: -1 truncated: -1 For the number -19.9: floor: -20 ceiling: -19 rounded: -20 truncated: -19
① Because it’s less than 9.5
② Because it’s more than 100.5
f32
and f64
have methods called .max()
and .min()
that give you the higher or the lower of two numbers. (For other types, you can use the std::cmp::max()
and std::cmp::min()
functions.)
These .max()
and .min()
methods are a good opportunity to show again that the .fold()
method for iterators isn’t just for adding numbers. In this case, you can use .fold()
to return the highest or lowest number in a Vec
or anything else that can be turned into an iterator
:
fn main() { let nums = vec![8.0_f64, 7.6, 9.4, 10.0, 22.0, 77.345, -7.77, -10.0]; let max = nums .iter() .fold(f64::MIN, |num, next_num| num.max(*next_num)); ① let min = nums .iter() .fold(f64::MAX, |num, next_num| num.min(*next_num)); ② println!("{max}, {min}"); }
① To get the highest number, start with the lowest possible f64 value.
② Conversely, start with the highest possible f64 value to get the lowest number.
With this, we get the highest and the lowest values: 77.345 and −10.0.
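One thing to keep in mind with the .fold() approach is that an empty Vec would give back f64::MIN and f64::MAX, which are not really answers. As an alternative sketch, you can use the iterator methods .max_by() and .min_by() together with f64::total_cmp() (available since Rust 1.62), which return an Option instead. The max_and_min() function here is a hypothetical helper for this example:

```rust
// Hypothetical helper: returns (max, min), or None for an empty slice.
fn max_and_min(nums: &[f64]) -> Option<(f64, f64)> {
    // f64 doesn't implement Ord, so plain .max() isn't available on the
    // iterator. total_cmp() provides a total ordering that max_by() can use.
    let max = nums.iter().copied().max_by(|a, b| a.total_cmp(b))?;
    let min = nums.iter().copied().min_by(|a, b| a.total_cmp(b))?;
    Some((max, min))
}

fn main() {
    let nums = vec![8.0_f64, 7.6, 9.4, 10.0, 22.0, 77.345, -7.77, -10.0];
    println!("{:?}", max_and_min(&nums));
    println!("{:?}", max_and_min(&[]));
}
```

The first call finds the same two values as before, wrapped in Some, while the second returns None because there is nothing to compare.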
On the left side of the documentation for Rust’s float types, you might notice that there are a lot of consts, known as “associated constants”: DIGITS, EPSILON, INFINITY, MANTISSA_DIGITS
, and so on. Plus, in the previous sample, we’ve used MIN
and MAX
, which we’ve also used with other types such as integers. How are these consts made anyway? Let’s take a quick look at that.
Rust has three types of associated items. We are already familiar with the first two and are now going to learn the third one, so this is a good time to sum up all three. Associated items are connected to the type or trait they are associated with by the ::
double colon. Let’s start with the first one, which we know very well: functions.
When you implement a method on a type or a trait, you are giving it an associated function. Most of the time, we see it in variable_name.function()
format when there is a self
parameter. But this is just a convenience instead of using forms like TypeName::function(&variable_name)
or TypeName::function(&mut variable_name)
. When you use the dot operator (a period) to call a method, Rust is actually just using the ::
syntax, unseen to you, to call the function. Let’s look at a quick example:
struct MyStruct(String); impl MyStruct { ① fn print_self(&self) { println!("{}", self.0); } fn add_exclamation(&mut self) { self.0.push('!') } } fn main() { let mut my_struct = MyStruct("Hi".to_string()); my_struct.print_self(); ② MyStruct::print_self(&my_struct); my_struct.add_exclamation(); ③ MyStruct::add_exclamation(&mut my_struct); MyStruct::print_self(&my_struct); }
① MyStruct has two methods; 99.9% of the time, we would use the dot operator
② We are calling .print_self(). On this line, we use the dot operator, but on the following line, we use the associated item syntax. It’s exactly the same thing!
③ The same thing happens here, too. my_struct.add_exclamation() takes a &mut my_struct without us needing to specify that. But if we want, we can use the full associated item syntax like we do on the next line.
This sample is pretty easy, with an output of Hi, Hi
, and Hi!!
.
The next item we’ve seen is an associated type, which is the type you define when implementing a trait. We saw this most recently with the Add
trait:
pub trait Add<Rhs = Self> {
type Output;
fn add(self, rhs: Rhs) -> Self::Output; ①
}
Here, type Output
is defined when you implement the trait, and this also gets attached to the type with the ::
double colon. Here, as well, we can use the full associated type signature. Let’s use a really simple example: adding 10 to 10. This time, we will start with the full signature and work backward:
use std::ops::Add; fn main() { let num1 = 10; let num2 = 10; print!("{} ", i32::add(num1, num2)); ① print!("{} ", num1.add(num2)); ② print!("{}", num1 + num2); ③ }
① The i32 type implements Add, which gives it the add function: i32::add(). This function takes self plus another number.
② Since we have a self parameter, we can use the dot operator as well.
③ This last step is built into the language: if you implement Add, you can use + to add. This makes sense: nobody would want to use Rust if they had to type use std::ops::Add and 10.add(10) all the time just to add 10 and 10 together.
On each line, we are doing the same operation, so the output is just 20 20 20
.
Now let’s look at a simple example of our own. This time, we’ll have a trait that just requires that a type destroy itself and turn into another form. This is defined by whoever implements the trait and can be anything:
trait ChangeForm { type SomethingElse; ① fn change_form(self) -> Self::SomethingElse; ② } impl ChangeForm for String { ③ type SomethingElse = char; fn change_form(self) -> Self::SomethingElse { self.chars().next().unwrap_or(' ') } } impl ChangeForm for i32 { type SomethingElse = i64; fn change_form(self) -> Self::SomethingElse { println!("i32 just got really big!"); i64::MAX } } fn main() { let string1 = "Hello there!".to_string(); ④ println!("{}", string1.change_form()); let string2 = "I'm back!".to_string(); println!("{}", String::change_form(string2)); let small_num = 1; println!("{}", small_num.change_form()); let also_small_num = 0; println!("{}", i32::change_form(also_small_num)); }
① The type is called SomethingElse and can be anything.
② Note the signature here: it’s associated with Self and attached with the :: double colon.
③ We’ll implement it for String and i32. It’s our own trait, so we can implement it on external types, too.
④ Here, as well, there are two ways to call the function: the method signature with the dot operator or the full associated type signature.
H I i32 just got really big! 9223372036854775807 i32 just got really big! 9223372036854775807
The associated function and type signature with the ::
should look pretty familiar by now!
And with that, we are now at the last associated item: associated consts.
Associated consts are actually incredibly easy to use. Just start an impl
block, type const CONST_NAME: type_name = value
, and you’re done! Here’s a quick example:
struct SizeFiveString(String);

impl SizeFiveString {
    const SIZE: usize = 5;
}

fn main() {
    println!("{}", SizeFiveString::SIZE);
}
With this associated const, our SizeFiveString
can pass on this SIZE
const to whatever needs it.
Here is a longer yet still simple example of this associated const. In this example, we can use the associated const to ensure that this type will always be 10 characters in length:
#[derive(Debug)] struct SizeTenString(String); impl SizeTenString { const SIZE: usize = 10; } impl TryFrom<&'static str> for SizeTenString { type Error = String; fn try_from(input: &str) -> Result<Self, Self::Error> { if input.chars().count() == Self::SIZE { Ok(Self(input.to_string())) } else { Err(format!("Length must be {} characters!", Self::SIZE)) } } } fn main() { println!("{:?}", SizeTenString::try_from("This one's long")); println!("{:?}", SizeTenString::try_from("Too short")); println!("{:?}", SizeTenString::try_from("Just right")); }
An associated const can be used with traits, too, in a similar way to functions on traits. A type can override these associated consts, too, in the same way that you can write your own trait method even if there is a default method:
trait HasNumbers { const SET_NUMBER: usize = 10; ① const EXTRA_NUMBER: usize; ② // fn set_number() -> usize { 10 } ③ // fn extra_number() -> usize; } struct NothingSpecial; impl HasNumbers for NothingSpecial { const EXTRA_NUMBER: usize = 10; // const SET_NUMBER: usize = 20; ④ } fn main() { print!("{} ", NothingSpecial::SET_NUMBER); print!("{}", NothingSpecial::EXTRA_NUMBER); }
① The value of the const SET_NUMBER is 10, so you don’t need to decide the value when implementing the trait.
② This other const, however, is unknown. You have to choose its value when implementing this trait.
③ These two commented-out functions are similar in behavior to the consts. One has a default implementation, while the other only shows the return type and has to be written out by anyone implementing the trait.
④ If you uncommented this, the struct NothingSpecial would have a value of 20 for SET_NUMBER instead of 10.
So this code will print 10 10
, but if you uncomment the commented-out line inside the impl block, it will print 20 10
.
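Associated consts on traits become especially handy in generic code because a function can read the const from whatever type it is given. Here is a minimal sketch that reuses the HasNumbers trait; the Overrider struct and the total() function are hypothetical additions for this example:

```rust
trait HasNumbers {
    const SET_NUMBER: usize = 10;
    const EXTRA_NUMBER: usize;
}

struct NothingSpecial;
struct Overrider; // Hypothetical second type that overrides the default

impl HasNumbers for NothingSpecial {
    const EXTRA_NUMBER: usize = 10;
}

impl HasNumbers for Overrider {
    const SET_NUMBER: usize = 20;
    const EXTRA_NUMBER: usize = 1;
}

// Hypothetical helper: works for any type that implements HasNumbers.
// The consts are reached through the generic type with the :: syntax.
fn total<T: HasNumbers>() -> usize {
    T::SET_NUMBER + T::EXTRA_NUMBER
}

fn main() {
    println!("{}", total::<NothingSpecial>());
    println!("{}", total::<Overrider>());
}
```

This prints 20 and then 21. No value of either type is ever created; the consts belong to the types themselves.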
That was a long enough detour, so let’s get on to our next standard library type!
Booleans are pretty simple in Rust but are quite robust compared to some other languages. (For comparison, one example of the difficulties of working with booleans in C can be found at http://mng.bz/1J51.) There are a few ways to use a bool
that we haven’t come across yet, so let’s look at them now.
In Rust, you can turn a bool
into an integer if you want because it’s safe to do that. But you can’t do it the other way around. As you can see, true
turns to 1, and false
turns to 0:
fn main() { let true_false = (true, false); println!("{} {}", true_false.0 as u8, true_false.1 as i32); }
This prints 1 0
. Or you can use .into()
if you tell the compiler the type:
fn main() { let true_false: (i128, u16) = (true.into(), false.into()); println!("{} {}", true_false.0, true_false.1); }
As of Rust 1.50 and 1.62, there are two methods, .then()
and .then_some()
, that turn a bool
into an Option
. With .then()
, you write a closure, and the closure is called if the item is true
. Whatever is returned from the closure gets wrapped in an Option
. Here’s a small example:
fn main() { let (tru, fals) = (true.then(|| 8), false.then(|| 8)); println!("{:?}, {:?}", tru, fals); }
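The sample above prints Some(8), None. The .then_some() method is the same idea without the closure: you hand it a value directly. The sketch below shows it in use and also shows the one difference worth remembering, which is that the argument to .then_some() is always evaluated, while the closure given to .then() only runs when the bool is true. The even_half() and expensive() functions are hypothetical helpers:

```rust
// Hypothetical helper: half of n, but only if n is even.
fn even_half(n: u32) -> Option<u32> {
    (n % 2 == 0).then_some(n / 2)
}

// Hypothetical function that announces when it gets called.
fn expensive() -> u32 {
    println!("expensive() was called");
    8
}

fn main() {
    println!("{:?} {:?}", even_half(10), even_half(7));

    // then_some() takes a value, so expensive() runs even though
    // the bool is false and the result is thrown away.
    let eager = false.then_some(expensive());

    // then() takes a closure, which is never called here.
    let lazy = false.then(|| expensive());

    println!("{eager:?} {lazy:?}");
}
```

The message from expensive() shows up only once, even though both eager and lazy end up as None. So .then_some() is fine for cheap values, and .then() is the better choice when producing the value takes work or has side effects.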
These methods can be pretty nice for error handling. The following code shows how a simple Vec<bool>
can be turned into a Vec
of Result
s with some extra info as it is handled.
use std::time::{SystemTime, UNIX_EPOCH}; fn timestamp() -> f64 { ① SystemTime::now() .duration_since(UNIX_EPOCH) .unwrap() .as_secs_f64() } fn send_data_to_user() {} ② fn main() { let bool_vec = vec![true, false, true, false, false]; let result_vec = bool_vec .into_iter() .enumerate() .map(|(index, b)| { b.then(|| { let timestamp = timestamp(); ③ send_data_to_user(); timestamp }) .ok_or_else(|| { ④ let time = timestamp(); format!("Error with item {index} at {time}") }) }) .collect::<Vec<_>>(); println!("{result_vec:#?}"); }
① A small function to generate a timestamp as an f64 to make the following code easier to read
② This function is empty, but pretend that it sends the users of our system some data in case it comes across as true.
③ We turn the bool into an Option<f64> (the timestamp), sending the user the data before passing it on.
④ With ok_or_else(), we turn the Option into a Result and add some error info (the index number that failed).
The output at the end will look something like this:
[
    Ok(
        1685149117.2468076,
    ),
    Err(
        "Error with item 1 at 1685149117.246808",
    ),
    Ok(
        1685149117.246833,
    ),
    Err(
        "Error with item 3 at 1685149117.2468333",
    ),
    Err(
        "Error with item 4 at 1685149117.2468338",
    ),
]
Vec
has a lot of methods that we haven’t looked at yet. Let’s start with .sort()
. The .sort()
method is not surprising at all. It uses a &mut self
to sort a vector in place (nothing is returned):
fn main() { let mut my_vec = vec![100, 90, 80, 0, 0, 0, 0, 0]; my_vec.sort(); println!("{:?}", my_vec); }
This prints [0, 0, 0, 0, 0, 80, 90, 100]
. But there is one more interesting way to sort called .sort_unstable()
, and it is usually faster. It can be faster because it doesn’t care about the order of items if they are the same value. In regular .sort()
, you know that the last 0, 0, 0, 0, 0
will be in the same order after .sort()
is performed. But .sort_unstable()
might move the last zero to index 0, then the third last zero to index 2, and so on. The documentation in the standard library explains it pretty well:
It is typically faster than stable sorting, except in a few special cases, e.g., when the slice consists of several concatenated sorted sequences.
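With plain zeros, you can’t tell which zero ended up where, so the difference only becomes visible when items that compare as equal can still be told apart. Here is a sketch that sorts (name, score) tuples by score alone using the stable .sort_by_key() method; the sort_by_score() function and the player data are made up for this example:

```rust
// Hypothetical helper: sorts players by score using a stable sort.
fn sort_by_score(mut players: Vec<(&str, u32)>) -> Vec<(&str, u32)> {
    // Only the score is used as the key, so players with the same score
    // count as equal. A stable sort keeps them in their original order.
    players.sort_by_key(|(_, score)| *score);
    players
}

fn main() {
    let players = vec![("Ana", 20), ("Bo", 10), ("Cy", 20), ("Di", 10)];
    println!("{:?}", sort_by_score(players));
}
```

This prints [("Bo", 10), ("Di", 10), ("Ana", 20), ("Cy", 20)]: Bo stays ahead of Di, and Ana stays ahead of Cy. If you swap in .sort_unstable_by_key(), the scores will still be in order, but the order within each tie is no longer guaranteed.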
.dedup()
means “de-duplicate.” It will remove items that are the same in a vector, but only if they are next to each other. This next code will not just print "sun", "moon"
:
fn main() {
let mut my_vec = vec!["sun", "sun", "moon", "moon", "sun", "moon",
➥"moon"];
my_vec.dedup();
println!("{:?}", my_vec);
}
Instead, it only gets rid of "sun"
next to the other "sun"
, then "moon"
next to one "moon"
, and again with "moon"
next to another "moon"
. The result is: ["sun", "moon", "sun", "moon"]
.
So, if you want to use .dedup()
to remove every duplicate, just .sort()
first:
fn main() {
let mut my_vec = vec!["sun", "sun", "moon", "moon", "sun", "moon",
➥"moon"];
my_vec.sort();
my_vec.dedup();
println!("{:?}", my_vec);
}
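Sorting first changes the order of the items, of course (here you would get "moon" before "sun"). If you need to remove every duplicate while keeping the first appearance of each item in place, one possible approach is to track what you have already seen in a HashSet. This is a sketch of that idea, not a method on Vec; dedup_keep_order() is a hypothetical helper:

```rust
use std::collections::HashSet;

// Hypothetical helper: removes all duplicates, keeping first appearances.
fn dedup_keep_order(items: Vec<&str>) -> Vec<&str> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        // insert() returns true only if the item wasn't in the set yet,
        // so each distinct item passes the filter exactly once.
        .filter(|item| seen.insert(*item))
        .collect()
}

fn main() {
    let my_vec = vec!["sun", "sun", "moon", "moon", "sun", "moon", "moon"];
    println!("{:?}", dedup_keep_order(my_vec));
}
```

This prints ["sun", "moon"]. The tradeoff is the extra memory for the HashSet, and the items need to implement Hash and Eq.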
You can split a Vec with .split_at(), while .split_at_mut() lets you do the same if you need to change the values. These give you two slices while leaving the original Vec intact:
fn main() {
    let mut big_vec = vec![0; 6];
    let (first, second) = big_vec.split_at_mut(3);
    std::thread::scope(|s| {
        s.spawn(|| {
            for num in first {
                *num += 1;
            }
        });
        s.spawn(|| {
            for num in second {
                *num -= 5;
            }
        });
    });
    println!("{big_vec:?}");
}
The output is [1, 1, 1, -5, -5, -5].
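The read-only .split_at() works the same way. Here is a minimal sketch with a made-up helper called halves_sum that splits a slice in the middle and sums each half. Note that .split_at() panics if the index is larger than the length, which can’t happen here because the index is always half the length:

```rust
// Hypothetical helper: sums the left and right halves of a slice.
fn halves_sum(numbers: &[i32]) -> (i32, i32) {
    // len() / 2 is never greater than len(), so split_at() can't panic here.
    let mid = numbers.len() / 2;
    let (left, right) = numbers.split_at(mid);
    (left.iter().sum(), right.iter().sum())
}

fn main() {
    let my_vec = vec![1, 2, 3, 4, 5];
    // The left half is [1, 2] and the right half is [3, 4, 5].
    println!("{:?}", halves_sum(&my_vec)); // (3, 12)
}
```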
The .drain() method lets you pull a range of values out of a Vec, giving you an iterator. This iterator keeps a mutable borrow on the original Vec, so you can only access the original Vec again once the iterator is gone, for example by collecting it into another Vec or by passing it to the drop() function:
fn main() {
    let mut original_vec = ('A'..'K').collect::<Vec<_>>();
    println!("{original_vec:?}");
    let drain = original_vec.drain(2..=5);
    println!("Pulled these chars out: {drain:?}");
    drop(drain);
    println!("Here's what's left: {original_vec:?}");
    let drain_two = original_vec.drain(2..=4).collect::<Vec<_>>();
    println!("Original vec: {original_vec:?}\nSecond drain: {drain_two:?}");
}
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
Pulled these chars out: Drain(['C', 'D', 'E', 'F'])
Here's what's left: ['A', 'B', 'G', 'H', 'I', 'J']
Original vec: ['A', 'B', 'J']
Second drain: ['G', 'H', 'I']
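One place where .drain() comes in handy is taking items off the front of a Vec in batches. The following is only a sketch, and the take_batch function and its parameters are invented for the example. Because .collect() consumes the iterator right away, the mutable borrow is over by the end of the line:

```rust
// Hypothetical helper: removes up to `batch_size` items from the front of the queue.
fn take_batch(queue: &mut Vec<u32>, batch_size: usize) -> Vec<u32> {
    // Clamp the end of the range so that drain() never goes past the length.
    let end = batch_size.min(queue.len());
    queue.drain(..end).collect()
}

fn main() {
    let mut queue = vec![1, 2, 3, 4, 5];

    let first_batch = take_batch(&mut queue, 2);
    println!("Batch: {first_batch:?}, left in queue: {queue:?}");

    // Asking for more than is left just returns whatever remains.
    let second_batch = take_batch(&mut queue, 10);
    println!("Batch: {second_batch:?}, left in queue: {queue:?}");
}
```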
We learned before that a String is kind of like a Vec, because it holds one (a Vec<u8>). A String isn’t just a simple smart pointer over a Vec<u8>, but sometimes it almost feels like one because so many of the methods are exactly the same.
One of these is String::with_capacity(). This method can help avoid too many allocations if you are pushing chars to it with .push() or pushing &strs to it with .push_str(). Here’s an example of a String that has too many allocations:
fn main() {
    let mut push_string = String::new();
    for _ in 0..100_000 {
        let capacity_before = push_string.capacity(); ①
        push_string.push_str("I'm getting pushed into the string!");
        let capacity_after = push_string.capacity();
        if capacity_before != capacity_after {
            println!("Capacity raised to {capacity_after}");
        }
    }
}
① We check the capacity before and after the &str is pushed and print out the new capacity if it has changed.
Capacity raised to 35
Capacity raised to 70
Capacity raised to 140
Capacity raised to 280
Capacity raised to 560
Capacity raised to 1120
Capacity raised to 2240
Capacity raised to 4480
Capacity raised to 8960
Capacity raised to 17920
Capacity raised to 35840
Capacity raised to 71680
Capacity raised to 143360
Capacity raised to 286720
Capacity raised to 573440
Capacity raised to 1146880
Capacity raised to 2293760
Capacity raised to 4587520
We had to allocate 18 times, and each reallocation can mean copying everything over. But now we know the final capacity, so we’ll give it that capacity right away. Then we don’t need to reallocate, because a single allocation is enough:
fn main() {
    let mut push_string = String::with_capacity(4587520); ①
    for _ in 0..100_000 {
        let capacity_before = push_string.capacity();
        push_string.push_str("I'm getting pushed into the string!");
        let capacity_after = push_string.capacity();
        if capacity_before != capacity_after {
            println!("Capacity raised to {capacity_after}");
        }
    }
}
① We know the exact number in this case. Even if you only have a general idea (like “at least 10,000”), you could still use with_capacity() to avoid too many allocations.
And this prints nothing. Perfect! We never had to reallocate.
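In this example we could also work out the capacity ourselves instead of copying the last number from the earlier output, because we know the length of the &str and how many times it gets pushed. Here is a sketch of that idea. The build_repeated function is a made-up helper that also counts how many times the capacity changed:

```rust
// Hypothetical helper: pushes `piece` into a String `times` times and
// returns the String along with the number of capacity changes it saw.
fn build_repeated(piece: &str, times: usize) -> (String, usize) {
    // .len() is the length in bytes, which is the unit that capacity uses too.
    let mut output = String::with_capacity(piece.len() * times);
    let mut capacity_changes = 0;
    for _ in 0..times {
        let capacity_before = output.capacity();
        output.push_str(piece);
        if output.capacity() != capacity_before {
            capacity_changes += 1;
        }
    }
    (output, capacity_changes)
}

fn main() {
    let (push_string, capacity_changes) =
        build_repeated("I'm getting pushed into the string!", 100_000);
    println!("Length: {}", push_string.len()); // Length: 3500000
    println!("Capacity changes: {capacity_changes}"); // Capacity changes: 0
}
```

String::with_capacity() promises at least the capacity you ask for, so the String never needs to grow while we push exactly that many bytes.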
Of course, the actual length is certainly smaller than the final 4,587,520, which is simply a doubling of the previous capacity when it was 2,293,760. We can shrink it, though, with .shrink_to_fit(), which is another method that String shares with Vec. But only do this once you are sure of the final length because the capacity can double again even if you push a single extra char to the String:
fn main() {
    let mut push_string = String::with_capacity(4587520);
    for _ in 0..100_000 {
        push_string.push_str("I'm getting pushed into the string!");
    }
    println!("Current capacity as expected: {}", push_string.capacity());
    push_string.shrink_to_fit();
    println!("Actual needed capacity: {}", push_string.capacity());
    push_string.push('a');
    println!("Whoops, it doubled again: {}", push_string.capacity());
    push_string.shrink_to_fit();
    println!("Shrunk back to actual needed capacity: {}", push_string.capacity());
}
Current capacity as expected: 4587520
Actual needed capacity: 3500000
Whoops, it doubled again: 7000000
Shrunk back to actual needed capacity: 3500001
The .pop() method works for a String, just like for a Vec:
fn main() {
    let mut my_string = String::from(".daer ot drah tib elttil a si gnirts sihT");
    while let Some(c) = my_string.pop() {
        print!("{c}");
    }
}
Try reading the String backward to see what the output will be for this sample.
By the way, look at how readable the .pop() method is: there’s no magic to it. At this point in the book, you could easily write this method yourself!
pub fn pop(&mut self) -> Option<char> {
    let ch = self.chars().rev().next()?;
    let newlen = self.len() - ch.len_utf8();
    unsafe {
        self.vec.set_len(newlen);
    }
    Some(ch)
}
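The ch.len_utf8() call in that source code is the reason .pop() always removes a whole char and never half of one. The following sketch shows this with characters that take one, two, and three bytes. The pop_with_len helper is made up for the example:

```rust
// Hypothetical helper: pops one char and reports how many bytes are left.
fn pop_with_len(text: &mut String) -> Option<(char, usize)> {
    let ch = text.pop()?;
    Some((ch, text.len()))
}

fn main() {
    // 'a' takes 1 byte, 'ß' takes 2 bytes, and '한' takes 3 bytes in UTF-8.
    let mut text = String::from("aß한");
    println!("Starting length in bytes: {}", text.len()); // 6

    while let Some((ch, bytes_left)) = pop_with_len(&mut text) {
        // Prints '한' with 3 left, then 'ß' with 1 left, then 'a' with 0 left.
        println!("Popped {ch:?}, bytes left: {bytes_left}");
    }
}
```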
One convenient method for String is .retain(), which is a little bit like the .filter() method we know for iterators. This method takes a closure that decides whether to keep each character or not. The following code keeps only the characters inside a String that are letters or spaces:
fn main() {
    let mut my_string = String::from("Age: 20 Height: 194 Weight: 80");
    my_string.retain(|ch| ch.is_alphabetic() || ch == ' ');
    dbg!(my_string);
}
[src\main.rs:4] my_string = "Age  Height  Weight "
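To see how close .retain() is to .filter(), here is a sketch of the same check written with an iterator instead. The difference is that .retain() changes the String in place, while this version leaves the input alone and builds a new String. The function name letters_and_spaces is made up for the example:

```rust
// Hypothetical helper: the iterator version of the .retain() example.
fn letters_and_spaces(input: &str) -> String {
    input
        .chars()
        // .filter() hands the closure a &char, so we dereference to compare.
        .filter(|ch| ch.is_alphabetic() || *ch == ' ')
        .collect()
}

fn main() {
    let original = "Age: 20 Height: 194 Weight: 80";
    let filtered = letters_and_spaces(original);

    // The spaces on both sides of each removed number are kept.
    println!("{filtered:?}"); // "Age  Height  Weight "
    println!("{original:?}"); // The original is unchanged.
}
```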
The std::ffi module of the standard library is the one that helps you use Rust with other languages or operating systems. This module includes types like OsString and CString, which are like String for the operating system or String for the language C. They each have their own &str type, too: OsStr and CStr. The three letters ffi stand for foreign function interface.
You can use OsString when you have to work with an operating system that doesn’t use UTF-8. All Rust strings are UTF-8, but certain operating systems express strings in different ways. Here is a simplified version of the page in the standard library on why we have OsString:
A string on Unix (Linux, etc.) might be a sequence of nonzero bytes, which is often (but not always) read as UTF-8.
A string on Windows might be a sequence of nonzero 16-bit values, read as UTF-16 when that is valid.
In Rust, strings are always valid UTF-8, which may contain zeros.
So an OsString is made to be read by all of them.
You can do all the regular things with an OsString like OsString::from("Write something here"). It also has an interesting method called .into_string() that tries to make it into a regular String. It returns a Result, but the Err part is just the original OsString:
pub fn into_string(self) -> Result<String, OsString>
So if it doesn’t work, you just get the previous OsString back. Calling .unwrap() is not a good idea because it will panic on an Err, but you can use match to get the OsString back. We can quickly prove that the Err value is an OsString by calling methods that don’t exist:
use std::ffi::OsString;

fn main() {
    let os_string = OsString::from("This string works for your OS too.");
    match os_string.into_string() {
        Ok(valid) => valid.thth(),
        Err(not_valid) => not_valid.occg(),
    }
}
Then the compiler tells us exactly what we want to know:
error[E0599]: no method named `thth` found for struct `std::string::String`
➥in the current scope
 --> src/main.rs:6:28
  |
6 |         Ok(valid) => valid.thth(),
  |                            ^^^^ method not found in `std::string::String`

error[E0599]: no method named `occg` found for struct `std::ffi::OsString`
➥in the current scope
 --> src/main.rs:7:37
  |
7 |         Err(not_valid) => not_valid.occg(),
  |                                     ^^^^ method not found in
➥`std::ffi::OsString`
This book doesn’t get into any FFI for Rust, but this module is a good place to start.
And with that, we are halfway through the tour! Hopefully, it has been pretty relaxing and enlightening so far, with nothing particularly difficult. The tour will finish up in the next chapter as we learn a lot of the methods related to memory, how to set up panic hooks and view backtraces, and some of the other convenient macros that we haven’t learned yet.
Even everyday types like bool and char have new methods added to them all the time, so keep an eye on the release notes for every new version of Rust to see what has been made available.
Be sure to use checked operations if you ever think any of your numeric types may overflow. They require a bit more typing, but the extra guarantees are worth it.
With associated consts, we now know all three associated items. The other two are associated functions and associated types.
Despite the long name, associated items are not that intimidating: associated functions are just functions, associated types are just types declared inside a trait, and associated constants are just const values on a type or a trait.
Try doing your own tour as well by taking a look at the methods and traits for the types you use the most in Rust. There is a lot in the standard library that we have only scratched the surface of.