20 A tour of the standard library

This chapter covers

Good work! You’re almost through the book—there are only five chapters left. For this chapter and the next, we are going to sit back and relax and go on a short tour of the standard library, including further details on some of the types we already know. You will certainly end up encountering these modules and methods as you continue to use Rust, so we might as well learn them now so that they are already familiar to you. Nothing in this chapter will be particularly difficult to learn, and we’ll keep things pretty brief and run through one type per section.

20.2 char

Our old friend char is pretty familiar by now, but let’s take a look at a few neat things that we might have missed.

You can use the .escape_unicode() method to get the Unicode number for a char:

fn main() {
    let korean_word = "청춘예찬";
    for character in korean_word.chars() {
        print!("{} ", character.escape_unicode());
    }
}

This prints \u{ccad} \u{cd98} \u{c608} \u{cc2c}.

You can get a char from u8 using the From trait. However, to make a char from a u32, you have to use TryFrom because it might not work. There are many more numbers in u32 than characters in Unicode. We can see this with a simple demonstration. We will first print a char from a random u8, and then try 100,000 times to make a char from a random u32:

use rand::random;
 
fn main() {
    println!("This will always work: {}", char::from(100));     ①
    println!("So will this: {}", char::from(random::<u8>()));
    
    for _ in 0..100_000 {
        if let Ok(successful_character) = char::try_from(random::<u32>()) {
            print!("{successful_character}");
        }        
    }
}

① The only implementation of From for char is From<u8>, so Rust will automatically choose a u8. It won’t compile if the number is too large for a u8.

The output will be different every time, but even after 100,000 tries, the number of successful characters will be very small. And most of them will end up being Chinese characters, because there are so many of them:

This will always work: d
So will this: Ñ
  艴 薪             뙨   聍 掾

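This makes sense because a char must be a valid Unicode scalar value: a number from 0 to 0x10FFFF (1,114,111), excluding the surrogate range 0xD800 to 0xDFFF. That leaves 1,112,064 possible values, of which Unicode has assigned 149,186 characters at present, while a u32 can go up to 4,294,967,295. So, the chance of a random u32 falling inside the valid range is extremely low (about 0.03%). There is also a high chance that the character won’t show on your screen, either because no character has been assigned to that number yet or because you don’t have the fonts installed for the language of the character.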
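To see where the boundaries are without relying on luck, here is a small sketch that tries a few hand-picked numbers instead of random ones. The helper name try_char and the list of numbers are our own choices for illustration:

```rust
// Hypothetical helper: turn a u32 into a char if it is a valid
// Unicode scalar value, or return None if it is not.
fn try_char(code: u32) -> Option<char> {
    char::try_from(code).ok()
}

fn main() {
    // Hand-picked values (our own choice, not part of the random sample).
    let codes = [0x64, 0xCCAD, 0xD800, 0x10FFFF, 0x110000, u32::MAX];
    for code in codes {
        match try_char(code) {
            Some(c) => println!("{code:#x} is a char: {}", c.escape_unicode()),
            None => println!("{code:#x} is not a char"),
        }
    }
}
```

The numbers 0xD800 (a surrogate) and anything above 0x10FFFF fail, while the others succeed.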

We learned near the beginning of the book that all chars are 4 bytes in length. If you want to know how many bytes a char would be if it were a &str, you can use the .len_utf8() method. Let’s put some greetings in and see how many bytes each character would be:

fn main() {
    "Hi, привіт, 안녕, 𓋹 𓍑 𓋴"
        .chars()
        .for_each(|c| println!("{c}: {}", c.len_utf8()));
}

Here is the output:

H: 1
i: 1
,: 1
 : 1
п: 2
р: 2
и: 2
в: 2
і: 2
т: 2
,: 1
 : 1
안: 3
녕: 3
,: 1
 : 1
𓋹: 4
𓍑: 4
𓋴: 4

There are a ton of convenience methods for char that are pretty easy to understand by their name, such as .is_alphanumeric(), .is_whitespace(), and .make_ascii_uppercase(). There’s a good chance that a convenience method already exists if you need to validate or modify a char in your code.
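Here is a quick sketch that puts the three methods just mentioned to work. The function name clean_and_shout is made up for this example: it keeps only letters, numbers, and whitespace, and then uppercases any ASCII letters.

```rust
// Hypothetical helper: drop punctuation, then uppercase the ASCII letters.
fn clean_and_shout(input: &str) -> String {
    input
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .map(|mut c| {
            // Changes the char in place; non-ASCII chars are left alone.
            c.make_ascii_uppercase();
            c
        })
        .collect()
}

fn main() {
    println!("{}", clean_and_shout("hi, there! 안녕 2u"));
}
```

Note that .make_ascii_uppercase() takes a &mut self, which is why the closure parameter is declared as mut.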

20.3 Integers

There are a lot of math methods for these types, like multiplying by powers, Euclidean modulo, logarithms, and so on, that we don’t need to look at here. But there are some other methods that are useful in our day-to-day work.

20.3.1 Checked operations

Integers all have the methods .checked_add(), .checked_sub(), .checked_mul(), and .checked_div(). These are good to use if you think you might produce a number that will overflow or underflow (i.e., be greater than the type’s maximum value or less than its minimum value). They return an Option so you can safely check that your math works without making the program panic.

You might be wondering why Rust would even compile if a number overflows. It’s true that the compiler won’t compile if it knows at compile time that a number will overflow—for example:

fn main() {
    let some_number = 200_u8;
    println!("{}", some_number + 200);
}

It is pretty obvious (even to us) that the result will be 400, which won’t fit into a u8, and the compiler knows this as well:

error: this arithmetic operation will overflow
 --> src/main.rs:3:20
  |
3 |     println!("{}", some_number + 200);
  |                    ^^^^^^^^^^^^^^^^^ attempt to compute `200_u8 +
  ➥200_u8`, which would overflow
  |
  = note: `#[deny(arithmetic_overflow)]` on by default

However, if a number isn’t known at compile time, the behavior will be different:

Debug mode—The program will panic.

Release mode—The number will overflow.

Let’s trick the compiler into making this happen. First, we will make a u8 with a value of 255, the highest value for a u8. Then we will use the rand crate to add 10 to it:

use rand::{thread_rng, Rng};
 
fn main() {
    let mut rng = thread_rng();
    let some_number = 255_u8;
    println!("{}", some_number + rng.gen_range(10..=10));    ①
}

① We know that a range of 10..=10 will only return 10, but the Rust compiler doesn’t know this at compile time, so it will let us run the program.

In Release mode, the number will wrap around, and the program will print 9 (265 minus 256) without panicking. But in Debug mode, we will see this:

     Running `target/debug/playground`
thread 'main' panicked at 'attempt to add with overflow', src/main.rs:6:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

We certainly don’t want to panic, and we also don’t want to add 10 to 255 and get 9. So let’s use .checked_add() instead. Now we will never overflow or panic:

use rand::random;
 
fn add_numbers(one: u8, two: u8) {
    match one.checked_add(two) {
        Some(num) => println!("Added {one} to {two}: {num}"),
        None => println!("Error: couldn't add {one} to {two}"),
    }
}
 
fn main() {
    for _ in 0..3 {
        let some_number = random::<u8>();
        let other_number = random::<u8>();
        add_numbers(some_number, other_number);
    }
}

The output will be different every time, but it will look something like this:

Error: couldn't add 199 to 236
Added 34 to 97: 131
Added 61 to 109: 170

Environments that silently ignore integer overflows have been to blame for all kinds of crashes and security problems over the years, which is what makes methods like .checked_add() particularly nice for a systems programming language. Be sure to use the .checked_ methods whenever you think an overflow could take place! And if you are often working with numbers that are larger than any integer in the standard library, take a look at the num_bigint crate (https://docs.rs/num-bigint/latest/num_bigint/).
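The sample above only used .checked_add(), so here is a small sketch that runs all four checked methods at once. The function name checked_all is our own. The last two lines in main() use .wrapping_add() and .saturating_add(), two related methods not covered in the text above, in case wrapping or clamping is what you actually want:

```rust
// Hypothetical helper: runs all four checked operations on two u8 values.
// Each slot is None if that operation cannot produce a valid u8.
fn checked_all(a: u8, b: u8) -> [Option<u8>; 4] {
    [
        a.checked_add(b),
        a.checked_sub(b),
        a.checked_mul(b),
        a.checked_div(b), // None when b is 0
    ]
}

fn main() {
    println!("{:?}", checked_all(200, 100));
    println!("{:?}", checked_all(10, 0));

    // Asking for the behavior by name gives the same result in both
    // Debug and Release mode.
    println!("{}", 255_u8.wrapping_add(10)); // wraps around to 9
    println!("{}", 255_u8.saturating_add(10)); // stays at 255
}
```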

20.3.2 The Add trait and other similar traits

You might have noticed that the methods for integers use the variable name rhs a lot. For example, the documentation on the method .checked_add() starts with this:

pub const fn checked_add(self, rhs: i8) -> Option<i8>
Checked integer addition. Computes self + rhs, returning None if overflow
➥occurred.

The term rhs means “right-hand side”: in other words, the value on the right-hand side when you do some math. For example, in 5 + 6, the number 5 is on the left and 6 is on the right, so 6 is the rhs. It is not a keyword, but you will see rhs a lot in the standard library, so it’s good to know.

While we are on the subject, let’s learn how to implement Add, which is the trait used for the + operator in Rust. In other words, after you implement Add, you can use + on a type that you create. You need to implement Add yourself (you can’t just use #[derive(Add)]) because it’s impossible to guess how you might want to add one type to another type. Here’s the example from the page in the standard library:

use std::ops::Add;                          ①
 
#[derive(Debug, Copy, Clone, PartialEq)]    ②
struct Point {
    x: i32,
    y: i32,
}
 
impl Add for Point {
    type Output = Self;                     ③
 
    fn add(self, other: Self) -> Self {
        Self {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

① Add is found inside the std::ops module, which has all the traits used for operations. You can probably guess that the other traits have names like Sub, Mul, and so on.

② PartialEq is probably the most important part here. You want to be able to compare numbers.

③ Remember, this is called an associated type—a type that “goes together” with a trait. In this case, it’s another Point.

Now let’s implement Add for our own type just for fun. Let’s imagine that we have a Country struct that we’d like to add to another Country. As long as we tell Rust how we want to add one to the other, Rust will cooperate, and then we will be able to use + to add them. It looks like this:

use std::fmt;
use std::ops::Add;
 
#[derive(Clone)]
struct Country {
    name: String,
    population: u32,
    gdp: u32,                                                    ①
}
 
impl Country {
    fn new(name: &str, population: u32, gdp: u32) -> Self {
        Self {
            name: name.to_string(),
            population,
            gdp,
        }
    }
}
 
impl Add for Country {
    type Output = Self;
 
    fn add(self, other: Self) -> Self {
        Self {
            name: format!("{} and {}", self.name, other.name),   ②
            population: self.population + other.population,
            gdp: self.gdp + other.gdp,
        }
    }
}
 
impl fmt::Display for Country {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "In {} are {} people and a GDP of ${}",
            self.name, self.population, self.gdp
        )
    }
}
 
fn main() {
    let nauru = Country::new("Nauru", 12_511, 133_200_000);
    let vanuatu = Country::new("Vanuatu", 219_137, 956_300_000);
    let micronesia = Country::new("Micronesia", 113_131, 404_000_000);
 
    println!("{}", nauru);
    let nauru_and_vanuatu = nauru + vanuatu;
    println!("{nauru_and_vanuatu}");
    println!("{}", nauru_and_vanuatu + micronesia);
}

① Size of the economy

② We decide that add means to concatenate the names, combine the population, and combine the GDP. It’s entirely up to us what we want Add to mean.

This prints

In Nauru are 12511 people and a GDP of $133200000
In Nauru and Vanuatu are 231648 people and a GDP of $1089500000
In Nauru and Vanuatu and Micronesia are 344779 people and a GDP of
➥$1493500000

The three others are called Sub, Mul, and Div, and they are basically the same to implement. There are quite a few other operators in the same module, such as +=, -=, *=, and /=, which use traits that end with the name Assign: AddAssign, SubAssign, MulAssign, and DivAssign. You can see the full list of such traits here: http://mng.bz/468j. They are all named in a pretty predictable fashion. For example, % is called Rem, the unary - (as in -x) is called Neg, and so on.
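To show how similar these traits are, here is a sketch that implements Sub, AddAssign, and Neg for the same Point struct used in the earlier sample. The method bodies are our own guess at what subtracting and negating a Point should mean:

```rust
use std::ops::{AddAssign, Neg, Sub};

#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Sub for Point {
    type Output = Self;

    fn sub(self, other: Self) -> Self {
        Self {
            x: self.x - other.x,
            y: self.y - other.y,
        }
    }
}

impl AddAssign for Point {
    // No Output type here: += changes the left-hand side in place.
    fn add_assign(&mut self, other: Self) {
        self.x += other.x;
        self.y += other.y;
    }
}

impl Neg for Point {
    type Output = Self;

    // Neg only takes self because there is no right-hand side.
    fn neg(self) -> Self {
        Self {
            x: -self.x,
            y: -self.y,
        }
    }
}

fn main() {
    let mut point = Point { x: 1, y: 2 };
    point += Point { x: 3, y: 4 };
    println!("{:?}", point);
    println!("{:?}", point - Point { x: 1, y: 1 });
    println!("{:?}", -point);
}
```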

Two other convenient traits, PartialEq (http://mng.bz/JdwK) and PartialOrd (http://mng.bz/PRwY), are used to compare and order one variable with another. After these traits are implemented, you will be able to use signs like < and == for your type in the same way that implementing Add lets you use the + sign.

Because comparisons for equality and order are usually made between variables of the same type, these traits are easier to implement and are usually done using #[derive], as we saw in chapter 13. But you can also manually implement them if you want. As always, the standard library contains some simple examples implementing these traits that you can copy and paste and then change to suit your own type if you want to manually implement them.
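As a rough sketch of a manual implementation, imagine a smaller Country struct that we decide to compare by population only. This is just one possible design, and the function larger_name is a made-up helper to show the > sign in use:

```rust
use std::cmp::Ordering;

struct Country {
    name: String,
    population: u32,
}

// Our own decision: two countries are "equal" if their populations match.
impl PartialEq for Country {
    fn eq(&self, other: &Self) -> bool {
        self.population == other.population
    }
}

// Ordering follows the same rule, so == and < agree with each other.
impl PartialOrd for Country {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.population.partial_cmp(&other.population)
    }
}

// Hypothetical helper: returns the name of the more populous country.
fn larger_name<'a>(one: &'a Country, two: &'a Country) -> &'a str {
    if one > two {
        &one.name
    } else {
        &two.name
    }
}

fn main() {
    let nauru = Country { name: "Nauru".to_string(), population: 12_511 };
    let vanuatu = Country { name: "Vanuatu".to_string(), population: 219_137 };
    println!("{}", nauru < vanuatu);
    println!("{}", larger_name(&nauru, &vanuatu));
}
```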

20.4 Floats

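f32 and f64 have a very large number of methods that you use when doing math. We won’t look at those, but here are four methods that you might use: .floor(), .ceil(), .round(), and .trunc(). All of these return an f32 or an f64 that is like an integer (i.e., a whole number). They do the following:

.floor(): Gives you the next lowest whole number (rounds down).

.ceil(): Gives you the next highest whole number (rounds up).

.round(): Gives you the nearest whole number. A fractional part of exactly 0.5 rounds away from zero.

.trunc(): Cuts off the part after the decimal point (rounds toward zero).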

Here is a simple sample that prints them:

fn four_operations(input: f64) {
    println!(
"For the number {}:
floor: {}
ceiling: {}
rounded: {}
truncated: {}\n",
        input,
        input.floor(),
        input.ceil(),
        input.round(),
        input.trunc()
    );
}
 
fn main() {
    four_operations(9.1);
    four_operations(100.7);
    four_operations(-1.1);
    four_operations(-19.9);
}

This prints

For the number 9.1:
floor: 9
ceiling: 10
rounded: 9        ①
truncated: 9
 
For the number 100.7:
floor: 100
ceiling: 101
rounded: 101      ②
truncated: 100
 
For the number -1.1:
floor: -2
ceiling: -1
rounded: -1
truncated: -1
 
For the number -19.9:
floor: -20
ceiling: -19
rounded: -20
truncated: -19

① Because it’s less than 9.5

② Because it’s more than 100.5

f32 and f64 have methods called .max() and .min() that give you the higher or the lower of two numbers. (For other types, you can use the std::cmp::max() and std::cmp::min() functions.)

These .max() and .min() methods are a good opportunity to show again that the .fold() method for iterators isn’t just for adding numbers. In this case, you can use .fold() to return the highest or lowest number in a Vec or anything else that implements Iterator:

fn main() {
    let nums = vec![8.0_f64, 7.6, 9.4, 10.0, 22.0, 77.345, -7.77, -10.0];
    let max = nums
        .iter()
        .fold(f64::MIN, |num, next_num| num.max(*next_num));    ①
    let min = nums
        .iter()
        .fold(f64::MAX, |num, next_num| num.min(*next_num));    ②
    println!("{max}, {min}");
}

① To get the highest number, start with the lowest possible f64 value.

② Conversely, start with the highest possible f64 value to get the lowest number.

With this, we get the highest and the lowest values: 77.345 and −10.0.
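One thing to watch out for: with .fold(), an empty Vec would give back the starting value (f64::MIN or f64::MAX), which might be confused with a real result. As an alternative sketch, the .reduce() method for iterators returns an Option instead. The function names highest and lowest are our own:

```rust
// Hypothetical helpers: return None for an empty slice instead of
// a placeholder like f64::MIN.
fn highest(nums: &[f64]) -> Option<f64> {
    nums.iter().copied().reduce(f64::max)
}

fn lowest(nums: &[f64]) -> Option<f64> {
    nums.iter().copied().reduce(f64::min)
}

fn main() {
    let nums = vec![8.0_f64, 7.6, 9.4, 10.0, 22.0, 77.345, -7.77, -10.0];
    println!("{:?} {:?}", highest(&nums), lowest(&nums));
    println!("{:?}", highest(&[]));
}
```

Here, f64::max is passed in by name using the :: syntax instead of writing a closure.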

On the left side of the documentation for Rust’s float types, you might notice that there are a lot of consts, known as “associated constants”: DIGITS, EPSILON, INFINITY, MANTISSA_DIGITS, and so on. Plus, in the previous sample, we’ve used MIN and MAX, which we’ve also used with other types such as integers. How are these consts made anyway? Let’s take a quick look at that.

20.5 Associated items and associated constants

Rust has three types of associated items. We are already familiar with the first two and are now going to learn the third one, so this is a good time to sum up all three. Associated items are connected to the type or trait they are associated with by the :: double colon. Let’s start with the first one, which we know very well: functions.

20.5.1 Associated functions

When you implement a method on a type or a trait, you are giving it an associated function. Most of the time, we see it in variable_name.function() format when there is a self parameter. But this is just a convenient shorthand for forms like TypeName::function(&variable_name) or TypeName::function(&mut variable_name). When you use the dot operator (a period) to call a method, Rust is actually just using the :: syntax, unseen to you, to call the function. Let’s look at a quick example:

struct MyStruct(String);
 
impl MyStruct {                                    ①
    fn print_self(&self) {
        println!("{}", self.0);
    }
    fn add_exclamation(&mut self) {
        self.0.push('!')
    }
}
 
fn main() {
    let mut my_struct = MyStruct("Hi".to_string());
 
    my_struct.print_self();                        ②
    MyStruct::print_self(&my_struct);
 
    my_struct.add_exclamation();                   ③
    MyStruct::add_exclamation(&mut my_struct);
 
    MyStruct::print_self(&my_struct);
}

① MyStruct has two methods; 99.9% of the time, we would use the dot operator

② We are calling .print_self(). On this line, we use the dot operator, but on the following line, we use the associated item syntax. It’s exactly the same thing!

③ The same thing happens here, too. my_struct.add_exclamation() takes a &mut my_struct without us needing to specify that. But if we want, we can use the full associated item syntax like we do on the next line.

This sample is pretty easy, with an output of Hi, Hi, and Hi!!.
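The :: form isn’t only a curiosity. One place where it comes in handy is when passing a method to an iterator adapter, because the path can be used in place of a closure. Here is a minimal sketch, with a made-up .shout() method and shout_all() function:

```rust
struct MyStruct(String);

impl MyStruct {
    // Hypothetical method for this example.
    fn shout(&self) -> String {
        format!("{}!", self.0)
    }
}

fn shout_all(items: &[MyStruct]) -> Vec<String> {
    // MyStruct::shout is a function that takes a &MyStruct, which is
    // exactly what .iter() hands to .map(), so no closure is needed.
    items.iter().map(MyStruct::shout).collect()
}

fn main() {
    let items = [MyStruct("Hi".to_string()), MyStruct("Yo".to_string())];
    println!("{:?}", shout_all(&items));
}
```

Writing .map(|item| item.shout()) would do the same thing.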

20.5.2 Associated types

The next item we’ve seen is an associated type, which is the type you define when implementing a trait. We saw this most recently with the Add trait:

pub trait Add<Rhs = Self> {
    type Output;
 
    fn add(self, rhs: Rhs) -> Self::Output;    ①
}

① Required method

Here, type Output is defined when you implement the trait, and this also gets attached to the type with the :: double colon. Here, as well, we can use the full associated type signature. Let’s use a really simple example: adding 10 to 10. This time, we will start with the full signature and work backward:

use std::ops::Add;
 
fn main() {
    let num1 = 10;
    let num2 = 10;
    
    print!("{} ", i32::add(num1, num2));    ①
    print!("{} ", num1.add(num2));          ②
    print!("{}", num1 + num2);              ③
}

① The i32 type implements Add, which gives it the add function: i32::add(). This function takes self plus another number.

② Since we have a self parameter, we can use the dot operator as well.

③ This last step is built into the language: if you implement Add, you can use + to add. This makes sense: nobody would want to use Rust if they had to type use std::ops::Add and 10.add(10) all the time just to add 10 and 10 together.

On each line, we are doing the same operation, so the output is just 20 20 20.
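The Rhs = Self part of the trait signature is only a default, so the right-hand side can be a different type. Here is a small sketch with two made-up types, Meters and Centimeters, where the associated type Output tells Rust what comes out of the addition:

```rust
use std::ops::Add;

// Both types are hypothetical and exist only for this example.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Centimeters(f64);

// Rhs is Centimeters here instead of the default Self.
impl Add<Centimeters> for Meters {
    type Output = Meters;

    fn add(self, rhs: Centimeters) -> Self::Output {
        Meters(self.0 + rhs.0 / 100.0)
    }
}

fn main() {
    let total = Meters(1.5) + Centimeters(50.0);
    println!("{total:?}");
}
```

Note that this only lets us write Meters + Centimeters. Adding them the other way around would need its own impl Add<Meters> for Centimeters.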

Now let’s look at a simple example of our own. This time, we’ll have a trait that just requires that a type destroy itself and turn into another form. This is defined by whoever implements the trait and can be anything:

trait ChangeForm {
    type SomethingElse;                             ①
    fn change_form(self) -> Self::SomethingElse;    ②
}
 
impl ChangeForm for String {                        ③
    type SomethingElse = char;
    fn change_form(self) -> Self::SomethingElse {
        self.chars().next().unwrap_or(' ')
    }
}
 
impl ChangeForm for i32 {
    type SomethingElse = i64;
    fn change_form(self) -> Self::SomethingElse {
        println!("i32 just got really big!");
        i64::MAX
    }
}
 
fn main() {
    let string1 = "Hello there!".to_string();       ④
    println!("{}", string1.change_form());
 
    let string2 = "I'm back!".to_string();
    println!("{}", String::change_form(string2));
 
    let small_num = 1;
    println!("{}", small_num.change_form());
 
    let also_small_num = 0;
    println!("{}", i32::change_form(also_small_num));
}

① The type is called SomethingElse and can be anything.

② Note the signature here: it’s associated with Self and attached with the :: double colon.

③ We’ll implement it for String and i32. It’s our own trait, so we can implement it on external types, too.

④ Here, as well, there are two ways to call the function: the method signature with the dot operator or the full associated type signature.

Here’s the output:

H
I
i32 just got really big!
9223372036854775807
i32 just got really big!
9223372036854775807

The associated function and type signature with the :: should look pretty familiar by now!
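Associated types also show up in generic code, where they are written as T::SomethingElse. The following sketch repeats the trait so that it can run on its own. The impl for bool and the function describe_change are our own additions for illustration:

```rust
use std::fmt::Display;

trait ChangeForm {
    type SomethingElse;
    fn change_form(self) -> Self::SomethingElse;
}

impl ChangeForm for String {
    type SomethingElse = char;
    fn change_form(self) -> Self::SomethingElse {
        self.chars().next().unwrap_or(' ')
    }
}

// A made-up extra implementation: a bool turns into a 1 or a 0.
impl ChangeForm for bool {
    type SomethingElse = u8;
    fn change_form(self) -> Self::SomethingElse {
        u8::from(self)
    }
}

// Works for any T that implements ChangeForm, as long as the type it
// turns into can be displayed.
fn describe_change<T>(input: T) -> String
where
    T: ChangeForm,
    T::SomethingElse: Display,
{
    format!("Changed into: {}", input.change_form())
}

fn main() {
    println!("{}", describe_change("Hello there!".to_string()));
    println!("{}", describe_change(true));
}
```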

And with that, we are now at the last associated item: associated consts.

20.5.3 Associated consts

Associated consts are actually incredibly easy to use. Just start an impl block, type const CONST_NAME: type_name = value, and you’re done! Here’s a quick example:

struct SizeFiveString(String);
 
impl SizeFiveString {
    const SIZE: usize = 5;
}
 
fn main() {
    println!("{}", SizeFiveString::SIZE);
}

With this associated const, our SizeFiveString can pass on this SIZE const to whatever needs it.

Here is a longer yet still simple example of this associated const. In this example, we can use the associated const to ensure that this type will always be 10 characters in length:

#[derive(Debug)]
struct SizeTenString(String);
 
impl SizeTenString {
    const SIZE: usize = 10;
}
 
impl TryFrom<&'static str> for SizeTenString {
    type Error = String;
    fn try_from(input: &str) -> Result<Self, Self::Error> {
        if input.chars().count() == Self::SIZE {
            Ok(Self(input.to_string()))
        } else {
            Err(format!("Length must be {} characters!", Self::SIZE))
        }
    }
}
 
fn main() {
    println!("{:?}", SizeTenString::try_from("This one's long"));
    println!("{:?}", SizeTenString::try_from("Too short"));
    println!("{:?}", SizeTenString::try_from("Just right"));
}

This prints

Err("Length must be 10 characters!")
Err("Length must be 10 characters!")
Ok(SizeTenString("Just right"))

An associated const can be used with traits, too, in a similar way to functions on traits. A type can also override these associated consts in the same way that you can write your own trait method even if there is a default method:

trait HasNumbers {
    const SET_NUMBER: usize = 10;           ①
    const EXTRA_NUMBER: usize;              ②
    // fn set_number() -> usize { 10 }      ③
    // fn extra_number() -> usize;
}
 
struct NothingSpecial;
 
impl HasNumbers for NothingSpecial {
    const EXTRA_NUMBER: usize = 10;
    // const SET_NUMBER: usize = 20;        ④
}
 
fn main() {
  print!("{} ", NothingSpecial::SET_NUMBER);
  print!("{}", NothingSpecial::EXTRA_NUMBER);
}

① The value of the const SET_NUMBER is 10, so you don’t need to decide the value when implementing the trait.

② This other const, however, is unknown. You have to choose its value when implementing this trait.

③ These two commented-out functions are similar in behavior to the consts. One has a default implementation, while the other only shows the return type and has to be written out by anyone implementing the trait.

④ If you uncommented this, the struct NothingSpecial would have a value of 20 for SET_NUMBER instead of 10.

So this code will print 10 10, but if you uncomment that one line, it will print 20 10.
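Associated consts on traits become more useful in generic functions, where T::CONST_NAME picks up whichever value the type has. Here is a minimal sketch with a made-up Shape trait and two made-up types:

```rust
// Everything here is hypothetical and exists only for this example.
trait Shape {
    const SIDES: u32; // must be chosen by the implementor
    const NAME: &'static str = "shape"; // default value, can be overridden
}

struct Triangle;
struct Square;

impl Shape for Triangle {
    const SIDES: u32 = 3;
    const NAME: &'static str = "triangle";
}

impl Shape for Square {
    const SIDES: u32 = 4; // keeps the default NAME
}

fn describe<T: Shape>() -> String {
    format!("A {} has {} sides", T::NAME, T::SIDES)
}

fn main() {
    println!("{}", describe::<Triangle>());
    println!("{}", describe::<Square>());
}
```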

That was a long enough detour, so let’s get on to our next standard library type!

20.6 bool

Booleans are pretty simple in Rust but are quite robust compared to some other languages. (For comparison, one example of the difficulties of working with booleans in C can be found at http://mng.bz/1J51.) There are a few ways to use a bool that we haven’t come across yet, so let’s look at them now.

In Rust, you can turn a bool into an integer if you want because it’s safe to do that. But you can’t do it the other way around. As you can see, true turns to 1, and false turns to 0:

fn main() {
    let true_false = (true, false);
    println!("{} {}", true_false.0 as u8, true_false.1 as i32);
}

This prints 1 0. Or you can use .into() if you tell the compiler the type:

fn main() {
    let true_false: (i128, u16) = (true.into(), false.into());
    println!("{} {}", true_false.0, true_false.1);
}

This prints the same thing.
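One practical use for this is counting how many items are true. And while there is no conversion from an integer to a bool, a comparison does the job. Both function names in this sketch are our own:

```rust
// Hypothetical helper: each true becomes 1 and each false becomes 0,
// so the sum is the number of true values.
fn count_true(flags: &[bool]) -> usize {
    flags.iter().map(|&flag| usize::from(flag)).sum()
}

// Going the other way: compare instead of convert.
fn is_nonzero(number: i32) -> bool {
    number != 0
}

fn main() {
    println!("{}", count_true(&[true, false, true, false, false]));
    println!("{} {}", is_nonzero(0), is_nonzero(-5));
}
```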

As of Rust 1.50 and 1.62, there are two methods, .then() and .then_some(), that turn a bool into an Option. With .then(), you write a closure, and the closure is called if the item is true. Whatever is returned from the closure gets wrapped in an Option. Here’s a small example:

fn main() {
    let (tru, fals) = (true.then(|| 8), false.then(|| 8));
    println!("{:?}, {:?}", tru, fals);
}

This prints Some(8), None.
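The .then_some() method is similar but takes a value directly instead of a closure. That means the value is always computed, even when the bool is false. Here is a sketch of when each one fits, using two made-up functions:

```rust
// The value is cheap and always safe to compute, so .then_some() is fine.
fn half_if_even(number: u32) -> Option<u32> {
    (number % 2 == 0).then_some(number / 2)
}

// Dividing by zero would panic, so the division goes inside a closure
// that only runs when the check passes.
fn safe_divide(top: u32, bottom: u32) -> Option<u32> {
    (bottom != 0).then(|| top / bottom)
}

fn main() {
    println!("{:?} {:?}", half_if_even(8), half_if_even(7));
    println!("{:?} {:?}", safe_divide(10, 2), safe_divide(10, 0));
}
```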

These methods can be pretty nice for error handling. The following code shows how a simple Vec<bool> can be turned into a Vec of Results with some extra info as it is handled.

use std::time::{SystemTime, UNIX_EPOCH};
 
fn timestamp() -> f64 {                          ①
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_secs_f64()
}
 
fn send_data_to_user() {}                        ②
 
fn main() {
    let bool_vec = vec![true, false, true, false, false];
 
    let result_vec = bool_vec
        .into_iter()
        .enumerate()
        .map(|(index, b)| {
            b.then(|| {
                let timestamp = timestamp();     ③
                send_data_to_user();
                timestamp
            })
            .ok_or_else(|| {                     ④
                let time = timestamp();
                format!("Error with item {index} at {time}")
            })
        })
        .collect::<Vec<_>>();
    println!("{result_vec:#?}");
}

① A small function to generate a timestamp as an f64 to make the following code easier to read

② This function is empty, but pretend that it sends the users of our system some data in case it comes across as true.

③ We turn the bool into an Option<f64> (the timestamp), sending the user the data before passing it on.

④ With ok_or_else(), we turn the Option into a Result and add some error info (the index number that failed).

The output at the end will look something like this:

[
    Ok(
        1685149117.2468076,
    ),
    Err(
        "Error with item 1 at 1685149117.246808",
    ),
    Ok(
        1685149117.246833,
    ),
    Err(
        "Error with item 3 at 1685149117.2468333",
    ),
    Err(
        "Error with item 4 at 1685149117.2468338",
    ),
]

20.7 Vec

Vec has a lot of methods that we haven’t looked at yet. Let’s start with .sort(). The .sort() method is not surprising at all. It uses a &mut self to sort a vector in place (nothing is returned):

fn main() {
    let mut my_vec = vec![100, 90, 80, 0, 0, 0, 0, 0];
    my_vec.sort();
    println!("{:?}", my_vec);
}

This prints [0, 0, 0, 0, 0, 80, 90, 100]. But there is one more interesting way to sort called .sort_unstable(), and it is usually faster. It can be faster because it doesn’t care about the order of items if they are the same value. In regular .sort(), you know that the last 0, 0, 0, 0, 0 will be in the same order after .sort() is performed. But .sort_unstable() might move the last zero to index 0, then the third last zero to index 2, and so on. The documentation in the standard library explains it pretty well:

It is typically faster than stable sorting, except in a few special cases, e.g., when the slice consists of several concatenated sorted sequences.
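With a Vec of plain zeros, you can’t actually tell which zero ended up where. Stability starts to matter when items that compare as equal still differ in some other way. In this sketch (the player data and function name are made up), .sort_by_key() is a stable sort, so players with the same score keep their original order:

```rust
// Hypothetical helper: sorts by score only, ignoring the name.
fn sort_by_score(mut players: Vec<(&str, u32)>) -> Vec<(&str, u32)> {
    players.sort_by_key(|player| player.1);
    players
}

fn main() {
    let players = vec![("Ana", 50), ("Bo", 20), ("Cy", 50), ("Di", 20)];
    // Bo stays ahead of Di, and Ana stays ahead of Cy.
    println!("{:?}", sort_by_score(players));
}
```

With .sort_unstable_by_key() instead, the scores would still be in order, but there would be no guarantee about which of the tied players comes first.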

.dedup() means “de-duplicate.” It will remove items that are the same in a vector, but only if they are next to each other. This next code will not just print "sun", "moon":

fn main() {
    let mut my_vec = vec!["sun", "sun", "moon", "moon", "sun", "moon",
    ➥"moon"];
    my_vec.dedup();
    println!("{:?}", my_vec);
}

Instead, it only gets rid of "sun" next to the other "sun", then "moon" next to one "moon", and again with "moon" next to another "moon". The result is: ["sun", "moon", "sun", "moon"].

So, if you want to use .dedup() to remove every duplicate, just .sort() first:

fn main() {
    let mut my_vec = vec!["sun", "sun", "moon", "moon", "sun", "moon",
    ➥"moon"];
    my_vec.sort();
    my_vec.dedup();
    println!("{:?}", my_vec);
}

The result is ["moon", "sun"].
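Sorting first does change the order of the items, though. If you want to remove every duplicate but keep the items in the order they first appeared, one possible approach is to combine .retain() with a HashSet. This is a sketch of a different technique from .dedup(), and the function name is our own:

```rust
use std::collections::HashSet;

// Hypothetical helper: keeps only the first appearance of each item.
fn unique_in_order(mut items: Vec<&str>) -> Vec<&str> {
    let mut seen = HashSet::new();
    // HashSet::insert() returns false if the value was already there,
    // so .retain() drops every repeat.
    items.retain(|item| seen.insert(*item));
    items
}

fn main() {
    let my_vec = vec!["sun", "sun", "moon", "moon", "sun", "moon", "moon"];
    println!("{:?}", unique_in_order(my_vec));
}
```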

You can split a Vec with .split_at(), while .split_at_mut() lets you do the same if you need to change the values. These give you two slices while leaving the original Vec intact:

fn main() {
    let mut big_vec = vec![0; 6];
    let (first, second) = big_vec.split_at_mut(3);
 
    std::thread::scope(|s| {
        s.spawn(|| {
            for num in first {
                *num += 1;
            }
        });
        s.spawn(|| {
            for num in second {
                *num -= 5;
            }
        });
    });
    println!("{big_vec:?}");
}

The output is [1, 1, 1, -5, -5, -5].
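The read-only .split_at() works the same way when you only need to look at the two halves. Here is a small sketch with a made-up function that sums each half separately:

```rust
// Hypothetical helper: splits the slice in the middle and sums each side.
fn sum_halves(numbers: &[i32]) -> (i32, i32) {
    let middle = numbers.len() / 2;
    // .split_at() panics if the index is greater than the length,
    // but len() / 2 is always in range.
    let (first, second) = numbers.split_at(middle);
    (first.iter().sum(), second.iter().sum())
}

fn main() {
    println!("{:?}", sum_halves(&[1, 2, 3, 4, 5, 6]));
    println!("{:?}", sum_halves(&[1, 2, 3]));
}
```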

The .drain() method lets you pull a range of values out of a Vec, giving you an iterator. This iterator keeps a mutable borrow on the original Vec, so you need to finish with it first: collecting it into another Vec or outright dropping it with the drop() function will let you access the original Vec again:

fn main() {
    let mut original_vec = ('A'..'K').collect::<Vec<_>>();
    println!("{original_vec:?}");
 
    let drain = original_vec.drain(2..=5);
    println!("Pulled these chars out: {drain:?}");
    drop(drain);
    println!("Here's what's left: {original_vec:?}");
 
    let drain_two = original_vec.drain(2..=4).collect::<Vec<_>>();
    println!("Original vec: {original_vec:?}\nSecond drain: {drain_two:?}");
}

Here’s the output:

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
Pulled these chars out: Drain(['C', 'D', 'E', 'F'])
Here's what's left: ['A', 'B', 'G', 'H', 'I', 'J']
Original vec: ['A', 'B', 'J']
Second drain: ['G', 'H', 'I']
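Keep in mind that .drain() will panic if the range goes past the end of the Vec. The following sketch, with a made-up function called take_batch, pulls items off the front in batches and shrinks the range if there aren’t enough items left:

```rust
// Hypothetical helper: removes up to `size` items from the front.
fn take_batch(queue: &mut Vec<u32>, size: usize) -> Vec<u32> {
    let end = size.min(queue.len()); // avoid a panic if the range is too long
    queue.drain(..end).collect()
}

fn main() {
    let mut queue = vec![1, 2, 3, 4, 5];
    println!("{:?}", take_batch(&mut queue, 2));
    println!("{:?}", take_batch(&mut queue, 10));
    println!("Left over: {queue:?}");
}
```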

20.8 String

We learned before that a String is kind of like a Vec, because it holds one (a Vec<u8>). A String isn’t just a simple smart pointer over a Vec<u8>, but sometimes it almost feels like one because so many of the methods are exactly the same.

One of these is String::with_capacity(). This method can help avoid too many allocations if you are pushing chars to it with .push() or pushing &strs to it with .push_str(). Here’s an example of a String that has too many allocations:

fn main() {
    let mut push_string = String::new();
 
    for _ in 0..100_000 {
        let capacity_before = push_string.capacity();                ①
        push_string.push_str("I'm getting pushed into the string!");
        let capacity_after = push_string.capacity();
        if capacity_before != capacity_after {
            println!("Capacity raised to {capacity_after}");
        }
    }
}

① We check the capacity before and after the &str is pushed and print out the new capacity if it has changed.

This prints

Capacity raised to 35
Capacity raised to 70
Capacity raised to 140
Capacity raised to 280
Capacity raised to 560
Capacity raised to 1120
Capacity raised to 2240
Capacity raised to 4480
Capacity raised to 8960
Capacity raised to 17920
Capacity raised to 35840
Capacity raised to 71680
Capacity raised to 143360
Capacity raised to 286720
Capacity raised to 573440
Capacity raised to 1146880
Capacity raised to 2293760
Capacity raised to 4587520

We had to reallocate (copy everything over) 18 times. But now we know the final capacity. So we’ll give it the capacity right away. Then we don’t need to reallocate, because a single allocation is enough:

fn main() {
    let mut push_string = String::with_capacity(4587520);     ①
 
    for _ in 0..100_000 {
        let capacity_before = push_string.capacity();
        push_string.push_str("I'm getting pushed into the string!");
        let capacity_after = push_string.capacity();
        if capacity_before != capacity_after {
            println!("Capacity raised to {capacity_after}");
        }
    }
}

① We know the exact number in this case. Even if you only have a general idea (like “at least 10,000”), you could still use with_capacity() to avoid too many allocations.

And this prints nothing. Perfect! We never had to reallocate.

Of course, the actual length is certainly smaller than the final 4,587,520, which is simply a doubling of the previous capacity when it was 2,293,760. We can shrink it, though, with .shrink_to_fit(), which is another method that String shares with Vec. But only do this once you are sure of the final length because the capacity will double again even if you push a single extra char to the String:

fn main() {
    let mut push_string = String::with_capacity(4587520);
 
    for _ in 0..100_000 {
        push_string.push_str("I'm getting pushed into the string!");
    }
    println!("Current capacity as expected: {}", push_string.capacity());
    push_string.shrink_to_fit();
    println!("Actual needed capacity: {}", push_string.capacity());
    push_string.push('a');
    println!("Whoops, it doubled again: {}", push_string.capacity());
    push_string.shrink_to_fit();
    println!("Shrunk back to actual needed capacity: {}", push_string.capacity());
}

This prints

Current capacity as expected: 4587520
Actual needed capacity: 3500000
Whoops, it doubled again: 7000000
Shrunk back to actual needed capacity: 3500001
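In this example, we could also have worked out the exact number of bytes ahead of time: the length of the &str multiplied by the number of pushes. Here is a sketch of that idea. The function repeat_text is our own, and it also counts reallocations so that we can check that none took place:

```rust
// Hypothetical helper: returns the finished String and the number of
// times its capacity changed while pushing.
fn repeat_text(text: &str, times: usize) -> (String, usize) {
    // .len() on a &str is its length in bytes, which is what capacity uses.
    let mut output = String::with_capacity(text.len() * times);
    let mut reallocations = 0;
    for _ in 0..times {
        let capacity_before = output.capacity();
        output.push_str(text);
        if output.capacity() != capacity_before {
            reallocations += 1;
        }
    }
    (output, reallocations)
}

fn main() {
    let (output, reallocations) =
        repeat_text("I'm getting pushed into the string!", 100_000);
    println!("Length: {}, reallocations: {reallocations}", output.len());
}
```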

The .pop() method works for a String, just like for a Vec:

fn main() {
    let mut my_string = String::from(".daer ot drah tib elttil a si gnirts
    ➥sihT");
    while let Some(c) = my_string.pop() {
        print!("{c}");
    }
}

Try reading the String backward to see what the output will be for this sample.

By the way, look at how readable the .pop() method is: there’s no magic to it. At this point in the book, you could easily write this method yourself!

    pub fn pop(&mut self) -> Option<char> {
        let ch = self.chars().rev().next()?;
        let newlen = self.len() - ch.len_utf8();
        unsafe {
            self.vec.set_len(newlen);
        }
        Some(ch)
    }
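
As a rough sketch of how you might write it yourself without unsafe, here is a free function (pop_last is our own name, not the standard library’s) that uses .truncate() in place of the unsafe set_len() call:

```rust
// Hypothetical safe version of the same idea as String::pop().
fn pop_last(input: &mut String) -> Option<char> {
    // Returns None early if the String is empty.
    let last = input.chars().next_back()?;
    // Cut off exactly as many bytes as the last char takes up.
    input.truncate(input.len() - last.len_utf8());
    Some(last)
}

fn main() {
    let mut my_string = String::from("hi안");
    while let Some(c) = pop_last(&mut my_string) {
        println!("Popped {c}, {} bytes left", my_string.len());
    }
}
```

This is also a reminder of why .len_utf8() matters: the last char here takes up 3 bytes, not 1.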

One convenient method for String is .retain(), which is a little bit like the .filter() method we know for iterators. This method passes in a closure that we can use to evaluate whether to keep each character or not. The following code keeps only the characters inside a String that are letters or spaces:

fn main() {
    let mut my_string = String::from("Age: 20 Height: 194 Weight: 80");
    my_string.retain(|ch| ch.is_alphabetic() || ch == ' ');
    dbg!(my_string);
}

This prints

[src\main.rs:4] my_string = "Age  Height  Weight "

20.9 OsString and CString

The std::ffi module of the standard library is the one that helps you use Rust with other languages or operating systems. This module includes types like OsString and CString, which are like String for the operating system or String for the language C. They each have their own &str type, too: OsStr and CStr. The three letters ffi stand for foreign function interface.

You can use OsString when you have to work with an operating system that doesn’t use UTF-8. All Rust strings are UTF-8, but certain operating systems express strings in different ways. Here is a simplified version of the page in the standard library on why we have OsString:

A string on Unix (Linux, etc.) might be a sequence of bytes together that don’t have zeros, and sometimes you read them as Unicode UTF-8.

A string on Windows might be made of sequences of 16-bit values that don’t have zeros.

In Rust, strings are always valid UTF-8, which may contain zeros.

So an OsString is made to be read by all of them.
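The point about zeros matters for CString as well, because C uses a zero byte to mark the end of a string. A Rust &str with a zero in the middle can’t be turned into a CString, as this small sketch shows. The function name can_be_c_string is our own:

```rust
use std::ffi::CString;

// Hypothetical helper: CString::new() returns an Err if the input
// has a zero byte anywhere inside it.
fn can_be_c_string(input: &str) -> bool {
    CString::new(input).is_ok()
}

fn main() {
    println!("{}", can_be_c_string("No zeros here"));
    println!("{}", can_be_c_string("A zero \0 in the middle"));

    // A valid CString gets one zero byte added at the end.
    let c_string = CString::new("hi").unwrap();
    println!("{:?}", c_string.as_bytes_with_nul());
}
```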

You can do all the regular things with an OsString like OsString::from("Write something here"). It also has an interesting method called .into_string() that tries to make it into a regular String. It returns a Result, but the Err part is just the original OsString:

pub fn into_string(self) -> Result<String, OsString>

So if it doesn’t work, you just get the previous OsString back. Calling .unwrap() here would panic if the conversion failed, but you can use match to get the OsString back instead. We can quickly prove that the Err value is an OsString by calling methods that don’t exist:

use std::ffi::OsString;
 
fn main() {
    let os_string = OsString::from("This string works for your OS too.");
    match os_string.into_string() {
        Ok(valid) => valid.thth(),
        Err(not_valid) => not_valid.occg(),
    }
}

Then the compiler tells us exactly what we want to know:

error[E0599]: no method named `thth` found for struct `std::string::String`
➥in the current scope
 --> src/main.rs:6:28
  |
6 |         Ok(valid) => valid.thth(),
  |                            ^^^^ method not found in `std::string::String`
 
error[E0599]: no method named `occg` found for struct `std::ffi::OsString`
➥in the current scope
 --> src/main.rs:7:37
  |
7 |         Err(not_valid) => not_valid.occg(),
  |                                     ^^^^ method not found in
  ➥`std::ffi::OsString`
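
Now that we know what each arm holds, here is a sketch of one way to handle both cases for real. The function to_display_string is our own. If the conversion fails, it falls back to .to_string_lossy(), which replaces any invalid parts with the Unicode replacement character:

```rust
use std::ffi::OsString;

// Hypothetical helper: always gives back a String, even if some of the
// original data can't be represented exactly.
fn to_display_string(input: OsString) -> String {
    match input.into_string() {
        Ok(valid) => valid,
        // We get the original OsString back here, so nothing is lost
        // until we choose the lossy conversion.
        Err(original) => original.to_string_lossy().into_owned(),
    }
}

fn main() {
    let os_string = OsString::from("This string works for your OS too.");
    println!("{}", to_display_string(os_string));
}
```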