8 Networking

This chapter covers

Implementing a networking stack
Handling multiple error types within local scope
When to use trait objects
Implementing state machines in Rust

This chapter describes how to make HTTP requests multiple times, stripping away a layer of abstraction each time. We start by using a user-friendly library, then boil that away until we’re left with manipulating raw TCP packets. When we’re finished, you’ll be able to distinguish an IP address from a MAC address. And you’ll learn why we went straight from IPv4 to IPv6.

You’ll also learn lots of Rust in this chapter. Most of it relates to the advanced error handling techniques that become essential when incorporating upstream crates. Several pages are devoted to that topic, including a thorough introduction to trait objects.

Networking is a difficult subject to cover in a single chapter. Each layer is a fractal of complexity. Networking experts will hopefully overlook my lack of depth in treating such a diverse topic.

Figure 8.1 provides an overview of the topics that the chapter covers. The projects include implementing DNS resolution, generating standards-compliant MAC addresses, and several ways of generating HTTP requests. A bit of a role-playing game is added for light relief.

Figure 8.1 Networking chapter map. The chapter incorporates a healthy mix of theory and practical exercises.

8.1 All of networking in seven paragraphs

Rather than trying to learn the whole networking stack, let’s focus on something that’s of practical use. Most readers of this book will have encountered web programming. Most web programming involves interacting with some sort of framework. Let’s look there.

HTTP is the protocol that web frameworks understand. Learning more about HTTP enables us to extract the most performance out of our web frameworks. It can also help us to more easily diagnose any problems that occur. Figure 8.2 shows networking protocols for content delivery over the internet.

Figure 8.2 Several layers of networking protocols involved with delivering content over the internet. The figure compares some common models, including the seven-layer OSI model and the four-layer TCP/IP model.

Networking is composed of layers. If you’re new to the field, don’t be intimidated by a flood of acronyms. The most important thing to remember is that lower levels are unaware of what’s happening above them, and higher levels are agnostic to what’s happening below them. Lower levels receive a stream of bytes and pass it on. Higher levels don’t care how messages are sent; they just want them sent.

Let’s consider one example: HTTP. HTTP is known as an application-level protocol. Its job is to transport content like HTML, CSS, JavaScript, WebAssembly modules, images, video, and other formats. These formats often include other embedded formats via compression and encoding standards. HTTP itself often redundantly includes information provided by one of the layers below it, TCP. Between HTTP and TCP sits TLS. TLS (Transport Layer Security), which has replaced SSL (Secure Sockets Layer), adds the S to HTTPS.

TLS provides encrypted messaging over an unencrypted connection. TLS is implemented on top of TCP. TCP sits upon many other protocols. These go all the way down to specifying how voltages should be interpreted as 0s and 1s. And yet, as complicated as this story is so far, it gets worse. These layers, as you have probably seen in your dealings with them as a computer user, bleed together like watercolor paint.

HTML includes a mechanism to supplement or overwrite directives omitted or specified within HTTP: the <meta> tag’s http-equiv attribute. HTTP can make adjustments downwards to TCP. The “Connection: keep-alive” HTTP header instructs TCP to maintain its connection after this HTTP message has been received. These sorts of interactions occur all through the stack. Figure 8.2 provides one view of the networking stack. It is more complicated than most attempts. And even that complicated picture is highly simplified.

Despite all of that, we’re going to try to implement as many layers as possible within a single chapter. By the end of it, you will be sending HTTP requests with a virtual networking device and a minimal TCP implementation that you created yourself, using a DNS resolver that you also created yourself.

8.2 Generating an HTTP GET request with reqwest

Our first implementation will be with a high-level library that is focused on HTTP. We’ll use the reqwest library because its focus is primarily on making it easy for Rust programmers to create an HTTP request.

Although it’s the shortest, the reqwest implementation is the most feature-complete. As well as being able to correctly interpret HTTP headers, it also handles cases like content redirects. Most importantly, it understands how to handle TLS properly.

In addition to expanded networking capabilities, reqwest also validates the content’s encoding and ensures that it is sent to your application as a valid String. None of our lower-level implementations do any of that. The following shows the project structure for listing 8.2:

ch8-simple/
├── src
│   └── main.rs
└── Cargo.toml

The following listing shows the metadata for listing 8.2. The source code for this listing is in ch8/ch8-simple/Cargo.toml.

Listing 8.1 Crate metadata for listing 8.2

[package]
name = "ch8-simple"
version = "0.1.0"
authors = ["Tim McNamara <author@rustinaction.com>"]
edition = "2018"
 
[dependencies]
reqwest = "0.9"

The following listing illustrates how to make an HTTP request with the reqwest library. You’ll find the source in ch8/ch8-simple/src/main.rs.

Listing 8.2 Making an HTTP request with reqwest

 1 use std::error::Error;
 2  
 3 use reqwest;
 4  
 5 fn main() -> Result<(), Box<dyn Error>> {       ①
 6   let url = "http://www.rustinaction.com/";
 7   let mut response = reqwest::get(url)?;
 8  
 9   let content = response.text()?;
10   print!("{}", content);
11  
12   Ok(())
13 }

① Box<dyn Error> represents a trait object, which we’ll cover in section 8.3.

If you’ve ever done any web programming, listing 8.2 should be straightforward. reqwest::get() issues an HTTP GET request to the URL represented by url. The response variable holds a struct representing the server’s response. The response.text() method returns a Result that provides access to the HTTP body after validating that the contents are a legal String.

One question, though: What on earth is the error side of the Result return type, Box<dyn std::error::Error>? This is an example of a trait object, which enables Rust to support polymorphism at runtime. Trait objects are proxies for concrete types. The syntax Box<dyn std::error::Error> means a Box (a pointer) to any type that implements std::error::Error.
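To make that concrete, here is a minimal sketch that uses only the standard library. The parse_port() helper and its inputs are hypothetical and are not part of listing 8.2. The point is that two unrelated error types, std::str::Utf8Error and std::num::ParseIntError, can both travel through the same Box<dyn Error> return type.

```rust
use std::error::Error;

// Hypothetical helper: interprets raw bytes as text, then as a port number.
// Two different concrete error types can occur, yet the signature names
// only one error type: Box<dyn Error>.
fn parse_port(raw: &[u8]) -> Result<u16, Box<dyn Error>> {
    let text = std::str::from_utf8(raw)?;   // may fail with Utf8Error
    let port = text.trim().parse::<u16>()?; // may fail with ParseIntError
    Ok(port)
}

fn main() {
    println!("{:?}", parse_port(b"8080"));
    println!("{:?}", parse_port(b"not a number"));
    println!("{:?}", parse_port(&[0xff, 0xfe]));
}
```

In each failing case, the ? operator boxes the concrete error before returning it, so the caller sees only "something that implements Error."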

Using a library that knows about HTTP allows our programs to omit many details. For example:

Knowing when to close the connection. HTTP has rules for telling each of the parties when the connection ends. This isn’t available to us when manually making requests. Instead, we keep the connection open for as long as possible and hope that the server will close.
Converting the byte stream to content. Rules for translating the message body from [u8] to String (or perhaps an image, video, or some other content) are handled as part of the protocol. This can be tedious to handle manually, as HTTP allows content to be compressed with several methods and encoded in several plain text formats.
Inserting or omitting port numbers. HTTP defaults to port 80. A library that is tailored for HTTP, such as reqwest, allows you to omit port numbers. When we’re building requests by hand with generic TCP crates, however, we need to be explicit.
Resolving the IP addresses. The TCP protocol doesn’t actually know about domain names like www.rustinaction.com, for example. The library resolves the IP address for www.rustinaction.com on our behalf.

8.3 Trait objects

This section describes trait objects in detail. You will also develop the world’s next best-selling fantasy role-playing game—the rpg project. If you would like to focus on networking, feel free to skip ahead to section 8.4.

There is a reasonable amount of jargon in the next several paragraphs. Brace yourself. You’ll do fine. Let’s start by introducing trait objects by what they achieve and what they do, rather than focusing on what they are.

8.3.1 What do trait objects enable?

While trait objects have several uses, they are immediately helpful by allowing you to create containers of multiple types. Although players of our role-playing game can choose different races, and each race is defined in its own struct, you’ll want to treat those as a single type. A Vec<T> won’t work here because we can’t easily have types T, U, and V wedged into Vec<T> without introducing some type of wrapper object.

8.3.2 What is a trait object?

Trait objects add a form of polymorphism—the ability to share an interface between types—to Rust via dynamic dispatch. Trait objects are similar to generic objects. Generics offer polymorphism via static dispatch. Choosing between generics and trait objects typically involves a trade-off between disk space and time:

Generics use more disk space with faster runtimes.
Trait objects use less disk space but incur a small runtime overhead caused by pointer indirection.

Trait objects are dynamically-sized types, which means that these are always seen in the wild behind a pointer. Trait objects appear in three forms: &dyn Trait, &mut dyn Trait, and Box<dyn Trait>.¹ The primary difference between the three forms is that Box<dyn Trait> is an owned trait object, whereas the other two are borrowed.
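The following sketch contrasts static dispatch with the three trait object forms. The Shape trait and the Square and Rect structs are invented for illustration and do not appear elsewhere in the chapter.

```rust
trait Shape {
    fn area(&self) -> f64;
    fn scale(&mut self, factor: f64);
}

struct Square { side: f64 }
struct Rect { w: f64, h: f64 }

impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
    fn scale(&mut self, factor: f64) { self.side *= factor; }
}

impl Shape for Rect {
    fn area(&self) -> f64 { self.w * self.h }
    fn scale(&mut self, factor: f64) { self.w *= factor; self.h *= factor; }
}

// Static dispatch: the compiler generates one copy of this function
// for every concrete T that it is called with.
fn area_static<T: Shape>(shape: &T) -> f64 { shape.area() }

// Dynamic dispatch via a borrowed trait object (&dyn Trait).
fn area_dyn(shape: &dyn Shape) -> f64 { shape.area() }

// A mutably borrowed trait object (&mut dyn Trait).
fn double(shape: &mut dyn Shape) { shape.scale(2.0); }

// Owned trait objects (Box<dyn Trait>) stored in a single collection.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let mut sq = Square { side: 2.0 };
    println!("static:  {}", area_static(&sq));
    println!("dynamic: {}", area_dyn(&sq));

    double(&mut sq);
    println!("doubled: {}", area_dyn(&sq));

    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Square { side: 2.0 }),
        Box::new(Rect { w: 2.0, h: 3.0 }),
    ];
    println!("total:   {}", total_area(&shapes));
}
```

Only area_static() is duplicated per type in the compiled binary. The other three functions exist once and find the right method at runtime through the pointer.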

8.3.3 Creating a tiny role-playing game: The rpg project

Listing 8.4 is the start of our game. Characters in the game can be one of three races: humans, elves, and dwarves. These are represented by the Human, Elf, and Dwarf structs, respectively.

Characters interact with things. Things are represented by the Thing type.² Thing is an enum that currently represents swords and trinkets. There’s only one form of interaction right now: enchantment. Enchanting a thing involves calling the enchant() method:

character.enchant(&mut thing)

When enchantment is successful, thing glows brightly. When a mistake occurs, thing is transformed into a trinket. Within listing 8.4, we create a party of characters with the following syntax:

58 let d = Dwarf {};
59 let e = Elf {};
60 let h = Human {};
61  
62 let party: Vec<&dyn Enchanter> = vec![&d, &h, &e];     ①

① Although d, e, and h are different types, using the type hint &dyn Enchanter tells the compiler to treat each value as a trait object. These now all have the same type.

Casting the spell involves choosing a spellcaster. We make use of the rand crate for that:

63 let spellcaster = party.choose(&mut rand::thread_rng()).unwrap();
64  
65 spellcaster.enchant(&mut it);

The choose() method originates from the rand::seq::SliceRandom trait that is brought into scope in listing 8.4. One member of the party is chosen at random. That character then attempts to enchant the object it. Compiling and running listing 8.4 results in a variation of this:

$ cargo run 
...
   Compiling rpg v0.1.0 (/rust-in-action/code/ch8/ch8-rpg)
    Finished dev [unoptimized + debuginfo] target(s) in 2.13s
     Running `target/debug/rpg`
Human mutters incoherently. The Sword glows brightly.
 
$ target/debug/rpg                                                      ①
Elf mutters incoherently. The Sword fizzes, then turns into a worthless trinket.

① Re-executes the command without recompiling

The following listing shows the metadata for our fantasy role-playing game. The source code for the rpg project is in ch8/ch8-rpg/Cargo.toml.

Listing 8.3 Crate metadata for the rpg project

[package]
name = "rpg"
version = "0.1.0"
authors = ["Tim McNamara <author@rustinaction.com>"]
edition = "2018"
 
[dependencies]
rand = "0.7"

Listing 8.4 provides an example of using a trait object to enable a container to hold several types. You’ll find its source in ch8/ch8-rpg/src/main.rs.

Listing 8.4 Using the trait object &dyn Enchanter

 1 use rand;
 2 use rand::seq::SliceRandom;
 3 use rand::Rng;
 4  
 5 #[derive(Debug)]
 6 struct Dwarf {}
 7  
 8 #[derive(Debug)]
 9 struct Elf {}
10  
11 #[derive(Debug)]
12 struct Human {}
13  
14 #[derive(Debug)]
15 enum Thing {
16   Sword,
17   Trinket,
18 }
19  
20 trait Enchanter: std::fmt::Debug {
21   fn competency(&self) -> f64;
22  
23   fn enchant(&self, thing: &mut Thing) {
24     let probability_of_success = self.competency();
25     let spell_is_successful = rand::thread_rng()
26       .gen_bool(probability_of_success);                       ①
27  
28     print!("{:?} mutters incoherently. ", self);
29     if spell_is_successful {
30       println!("The {:?} glows brightly.", thing);
31     } else {
32       println!("The {:?} fizzes, \
33              then turns into a worthless trinket.", thing);
34       *thing = Thing::Trinket;
35     }
36   }
37 }
38  
39 impl Enchanter for Dwarf {
40   fn competency(&self) -> f64 {
41     0.5                                                        ②
42   }
43 }
44 impl Enchanter for Elf {
45   fn competency(&self) -> f64 {
46     0.95                                                       ③
47   }
48 }
49 impl Enchanter for Human {
50   fn competency(&self) -> f64 {
51     0.8                                                        ④
52   }
53 }
54  
55 fn main() {
56   let mut it = Thing::Sword;
57  
58   let d = Dwarf {};
59   let e = Elf {};
60   let h = Human {};
61  
62   let party: Vec<&dyn Enchanter> = vec![&d, &h, &e];           ⑤
63   let spellcaster = party.choose(&mut rand::thread_rng()).unwrap();
64  
65   spellcaster.enchant(&mut it);
66 }

① gen_bool() generates a Boolean value, where true occurs in proportion to its argument. For example, a value of 0.5 returns true 50% of the time.

② Dwarves are poor spellcasters, and their spells regularly fail.

③ Spells cast by elves rarely fail.

④ Humans are proficient at enchanting things. Mistakes are uncommon.

⑤ We can hold members of different types within the same Vec as all these implement the Enchanter trait.

Trait objects are a powerful construct in the language. In a sense, they provide a way to navigate Rust’s rigid type system. As you learn about this feature in more detail, you will encounter some jargon. For example, trait objects are a form of type erasure. The compiler does not have access to the original type during the call to enchant().

One of the frustrating things about Rust’s syntax for beginners is that trait objects and type parameters look similar. But types and traits are used in different places. For example, consider these two lines:

use rand::Rng;
use rand::rngs::ThreadRng;

Although these both have something to do with random number generators, they’re quite different. rand::Rng is a trait; rand::rngs::ThreadRng is a struct. Trait objects make this distinction harder.

When used as a function argument and in similar places, the form &dyn Rng is a reference to something that implements the Rng trait, whereas &ThreadRng is a reference to a value of ThreadRng. With time, the distinction between traits and types becomes easier to grasp. Here are some common use cases for trait objects:

Creating collections of heterogeneous objects.
Returning a value. Trait objects enable functions to return multiple concrete types.
Supporting dynamic dispatch, whereby the function that is called is determined at runtime rather than at compile time.
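As a small illustration of the second use case, the sketch below returns one of two different concrete iterator types behind the same Box<dyn Iterator>. The counter() function is hypothetical and relies only on the standard library.

```rust
// The two branches produce different concrete types
// (Rev<RangeInclusive<u32>> and RangeInclusive<u32>). Boxing each one
// as a trait object lets a single return type cover both.
fn counter(descending: bool) -> Box<dyn Iterator<Item = u32>> {
    if descending {
        Box::new((1..=3).rev())
    } else {
        Box::new(1..=3)
    }
}

fn main() {
    let up: Vec<u32> = counter(false).collect();
    let down: Vec<u32> = counter(true).collect();
    println!("{:?} {:?}", up, down);
}
```

Which next() implementation runs is decided at runtime, based on the value that happens to be inside the Box.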

Before the Rust 2018 edition, the situation was even more confusing. The dyn keyword did not exist. This meant that context was needed to decide between &Rng and &ThreadRng.

Trait objects are not objects in the sense that an object-oriented programmer would understand. They’re perhaps closer to a mixin class. Trait objects don’t exist on their own; they are agents of some other type.

An alternative analogy would be a singleton object that has been delegated some authority by another concrete type. In listing 8.4, the &dyn Enchanter is delegated to act on behalf of three concrete types.

8.4 TCP

Dropping down from HTTP, we encounter TCP (Transmission Control Protocol). Rust’s standard library provides us with cross-platform tools for making TCP requests. Let’s use those. The file structure for listing 8.6, which creates an HTTP GET request, is provided here:

ch8-stdlib
├── src
│   └── main.rs
└── Cargo.toml

The following listing shows the metadata for listing 8.6. You’ll find the source for this listing in ch8/ch8-stdlib/Cargo.toml.

Listing 8.5 Project metadata for listing 8.6

[package]
name = "ch8-stdlib"
version = "0.1.0"
authors = ["Tim McNamara <author@rustinaction.com>"]
edition = "2018"
 
[dependencies]

The next listing shows how to use the Rust standard library to construct an HTTP GET request with std::net::TcpStream. The source for this listing is in ch8/ch8-stdlib/src/main.rs.

Listing 8.6 Constructing an HTTP GET request

 1 use std::io::prelude::*;
 2 use std::net::TcpStream;
 3  
 4 fn main() -> std::io::Result<()> {
 5   let host = "www.rustinaction.com:80";      ①
 6  
 7   let mut conn =
 8     TcpStream::connect(host)?;
 9  
10   conn.write_all(b"GET / HTTP/1.0")?;
11   conn.write_all(b"\r\n")?;                  ②
12  
13   conn.write_all(b"Host: www.rustinaction.com")?;
14   conn.write_all(b"\r\n\r\n")?;              ③
15  
16   std::io::copy(                             ④
17     &mut conn,                               ④
18     &mut std::io::stdout()                   ④
19   )?;                                        ④
20  
21   Ok(())
22 }

① Explicitly specifying the port (80) is required. TcpStream does not know that this will become an HTTP request.

② In many networking protocols, \r\n signifies a new line.

③ Two consecutive new lines (that is, a blank line) signify the end of the request.

④ std::io::copy() streams bytes from a Reader to a Writer.

Some remarks about listing 8.6:

On line 10, we specify HTTP 1.0. Using this version of HTTP ensures that the connection is closed when the server sends its response. HTTP 1.0, however, does not support “keep alive” requests. Specifying HTTP 1.1 actually confuses this code as the server will refuse to close the connection until it has received another request, and the client will never send one.
On line 13, we include the hostname. This may feel redundant given that we used that exact hostname when we connected on lines 7–8. However, one should remember that the connection is established over IP, which does not have host names. When TcpStream::connect() connects to the server, it only uses an IP address. Adding the Host HTTP header allows us to inject that information back into the context.
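To see exactly which bytes listing 8.6 puts on the wire, it can help to assemble them in one place. The build_get_request() helper below is a hypothetical convenience and is not part of the listing. It produces the same byte sequence that the four write_all() calls send.

```rust
// Hypothetical helper: builds the request that listing 8.6 writes
// piece by piece. Every line ends with \r\n, and an empty line
// (a second \r\n) marks the end of the request.
fn build_get_request(host: &str, path: &str) -> Vec<u8> {
    format!("GET {} HTTP/1.0\r\nHost: {}\r\n\r\n", path, host).into_bytes()
}

fn main() {
    let request = build_get_request("www.rustinaction.com", "/");
    print!("{}", String::from_utf8_lossy(&request));
}
```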

8.4.1 What is a port number?

Port numbers are purely virtual. They are simply u16 values. Port numbers allow a single IP address to host multiple services.
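The standard library reflects this directly: std::net::SocketAddr pairs an IP address with a u16 port. The short sketch below uses a hypothetical port_of() helper to show two services sharing one IP address and to show that a port outside the u16 range is rejected.

```rust
use std::net::SocketAddr;

// Hypothetical helper: extracts the port from an "ip:port" string,
// or returns None if the string is not a valid socket address.
fn port_of(addr: &str) -> Option<u16> {
    addr.parse::<SocketAddr>().ok().map(|a| a.port())
}

fn main() {
    // The same IP address hosts two services, told apart by port alone.
    let web: SocketAddr = "127.0.0.1:80".parse().unwrap();
    let dns: SocketAddr = "127.0.0.1:53".parse().unwrap();
    println!("same IP: {}", web.ip() == dns.ip());
    println!("ports: {} and {}", web.port(), dns.port());

    // 65536 does not fit in a u16, so parsing fails.
    println!("{:?}", port_of("127.0.0.1:65536"));
}
```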

8.4.2 Converting a hostname to an IP address

So far, we’ve provided the hostname www.rustinaction.com to Rust. But to send messages over the internet, the IP (internet protocol) address is required. TCP knows nothing about domain names. To convert a domain name to an IP address, we rely on the Domain Name System (DNS) and its process called domain name resolution.
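Rust’s type system makes the distinction visible. In this small sketch, which uses only the standard library, a domain name fails to parse as a std::net::IpAddr, whereas IPv4 and IPv6 literals succeed. The is_ip_address() function is a hypothetical helper.

```rust
use std::net::IpAddr;

// Hypothetical helper: true only for IPv4 or IPv6 literals.
// No network access is involved; this is purely a parsing check.
fn is_ip_address(s: &str) -> bool {
    s.parse::<IpAddr>().is_ok()
}

fn main() {
    for candidate in &["35.185.44.232", "::1", "www.rustinaction.com"] {
        println!("{:>22} -> {}", candidate, is_ip_address(candidate));
    }
}
```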

We’re able to resolve names by asking a server, which can recursively ask other servers. DNS requests can be made over TCP, including encryption with TLS, but are also sent over UDP (User Datagram Protocol). We’ll use UDP here because it’s more useful for learning purposes.

To explain how the translation from a domain name to an IP address works, we’ll create a small application that does the translation. We’ll call the application resolve. You’ll find its source code in listing 8.9. The application makes use of public DNS services, but you can easily add your own with the -s argument.

Our resolve application only understands a small portion of the DNS protocol, but that portion is sufficient for our purposes. The project makes use of an external crate, trust-dns, to perform the hard work. The trust-dns crate implements RFC 1035, which defines DNS, along with several later RFCs. It does so quite faithfully and uses terminology derived from them. Table 8.1 outlines some of the terms that are useful to understand.

Table 8.1 Terms that are used in RFC 1035, the trust_dns crate, and listing 8.9, and how these interlink

Term	Definition	Representation in code
Domain name	A domain name is almost what you probably think of when you use the term domain name in your everyday language. The technical definition includes some special cases such as the root domain, which is encoded as a single dot, and domain names that need to be case-insensitive.	Defined in trust_dns::rr::domain::Name pub struct Name { is_fqdn: bool, ① labels: Vec<Label>, } ① fqdn stands for fully-qualified domain name.
Message	A message is a container for both requests to DNS servers (called queries) and responses back to clients (called answers). Messages must contain a header, but other fields are not required. A `Message` struct represents this and includes several `Vec<T>` fields. These do not need to be wrapped in `Option` to represent missing values as their length can be 0.	Defined in trust_dns::op::Message struct Message { header: Header, queries: Vec<Query>, answers: Vec<Record>, name_servers: Vec<Record>, additionals: Vec<Record>, sig0: Vec<Record>, ① edns: Option<Edns>, ② } ① sig0, a cryptographically signed record, verifies the message’s integrity. It is defined in RFC 2535. ② edns indicates whether the message includes extended DNS.
Message type	A message type identifies the message as a query or as an answer. Queries can also be updates, which are functionality that our code ignores.	Defined in trust_dns::op::MessageType pub enum MessageType { Query, Response, }
Message ID	A number that is used for senders to link queries and answers.	`u16`
Resource record type	The resource record type refers to the DNS codes that you’ve probably encountered if you’ve ever configured a domain name. Of note is how trust_dns handles invalid codes. The `RecordType` enum contains an `Unknown(u16)` variant that can be used for codes that it doesn’t understand.	Defined in trust_dns::rr::record_type::RecordType pub enum RecordType { A, AAAA, ANAME, ANY, // ... Unknown(u16), ZERO, }
Query	A `Query` struct holds the domain name and the record type that we’re seeking the DNS details for. The struct also describes the DNS class, which allows queries to distinguish messages sent over the internet from those sent over other transport protocols.	Defined in trust_dns::op::Query pub struct Query { name: Name, query_type: RecordType, query_class: DNSClass, }
Opcode	An `OpCode` enum is, in some sense, a subtype of `MessageType`. This is an extensibility mechanism that allows future functionality. For example, RFC 1035 defines the `Query` and `Status` opcodes but others were defined later. The `Notify` and `Update` opcodes are defined by RFC 1996 and RFC 2136, respectively.	Defined in trust_dns::op::OpCode pub enum OpCode { Query, Status, Notify, Update, }

An unfortunate consequence of the protocol, which I suppose is a consequence of reality, is that there are many options, types, and subtypes involved. Listing 8.7, an excerpt from listing 8.9, shows the process of constructing a message that asks, “Dear DNS server, what is the IPv4 address for domain_name?” The listing constructs the DNS message, whereas the trust-dns crate requests an IPv4 address for domain_name.

Listing 8.7 Constructing a DNS message in Rust

35 let mut msg = Message::new();                      ①
36 msg
37   .set_id(rand::random::<u16>())                   ②
38   .set_message_type(MessageType::Query)
39   .add_query(                                      ③
40       Query::query(domain_name, RecordType::A)     ④
41   )
42   .set_op_code(OpCode::Query)
43   .set_recursion_desired(true);                    ⑤

① A Message is a container for queries (or answers).

② Generates a random u16 number

③ Multiple queries can be included in the same message.

④ The equivalent type for IPv6 addresses is AAAA.

⑤ Requests that the DNS server asks other DNS servers if it doesn’t know the answer

We’re now in a position where we can meaningfully inspect the code. It has the following structure:

Parses command-line arguments
Builds a DNS message using trust_dns types
Converts the structured data into a stream of bytes
Sends those bytes across the wire

After that, we need to accept the response from the server, decode the incoming bytes, and print the result. Error handling remains relatively ugly, with many calls to unwrap() and expect(). We’ll address that problem shortly in section 8.5. The end product is a command-line application that’s quite simple.
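To demystify the “converts the structured data into a stream of bytes” step, the following sketch encodes a query for an A record by hand, following the wire format in RFC 1035. It is not how trust-dns is implemented. The encode_a_query() function is hypothetical, skips validation (it assumes ASCII labels of at most 63 bytes), and covers only this one kind of message.

```rust
// Encodes a minimal DNS query for an A record (RFC 1035, section 4).
// Assumption: every label in `domain` is ASCII and at most 63 bytes long.
fn encode_a_query(id: u16, domain: &str) -> Vec<u8> {
    let mut buf = Vec::with_capacity(512);

    // Header: six big-endian u16 fields, 12 bytes in total.
    buf.extend_from_slice(&id.to_be_bytes());        // message ID
    buf.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: query, recursion desired
    buf.extend_from_slice(&1u16.to_be_bytes());      // one question
    buf.extend_from_slice(&0u16.to_be_bytes());      // no answers
    buf.extend_from_slice(&0u16.to_be_bytes());      // no name server records
    buf.extend_from_slice(&0u16.to_be_bytes());      // no additional records

    // Question: the name as length-prefixed labels, ended by a zero byte.
    for label in domain.split('.').filter(|l| !l.is_empty()) {
        buf.push(label.len() as u8);
        buf.extend_from_slice(label.as_bytes());
    }
    buf.push(0);

    buf.extend_from_slice(&1u16.to_be_bytes()); // record type: A
    buf.extend_from_slice(&1u16.to_be_bytes()); // class: IN (internet)
    buf
}

fn main() {
    let bytes = encode_a_query(0x1234, "www.rustinaction.com");
    println!("{} bytes: {:02x?}", bytes.len(), bytes);
}
```

The message ID, message type, opcode, recursion flag, and query from listing 8.7 each correspond to a few bits or bytes in this buffer.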

Running our resolve application involves little ceremony. Given a domain name, it provides an IP address:

$ resolve www.rustinaction.com
35.185.44.232

Listings 8.8 and 8.9 are the project’s source code. While you are experimenting with the project, you may want to use some features of cargo run to speed up your process:

$ cargo run -q -- www.rustinaction.com       ①
35.185.44.232

① Sends arguments to the right of -- to the executable it compiles. The -q option mutes any intermediate output.

To compile the resolve application from the official source code repository, execute these commands in the console:

$ git clone https://github.com/rust-in-action/code rust-in-action 
Cloning into 'rust-in-action'...
 
$ cd rust-in-action/ch8/ch8-resolve 
  
$ cargo run -q -- www.rustinaction.com      ①
35.185.44.232

① It may take a while to download the project’s dependencies and compile the code. The -q flag mutes intermediate output. Adding two dashes (--) sends further arguments to the compiled executable.

To compile and build from scratch, follow these instructions to establish the project structure:

At the command-line, enter these commands:

$ cargo new resolve 
     Created binary (application) `resolve` package
 
$ cargo install cargo-edit 
...
 
$ cd resolve 
 
$ cargo add rand@0.6 
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding rand v0.6 to dependencies
 
$ cargo add clap@2 
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding clap v2 to dependencies
 
$ cargo add trust-dns@0.16 --no-default-features 
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding trust-dns v0.16 to dependencies

Once the structure has been established, check that your Cargo.toml matches listing 8.8, available in ch8/ch8-resolve/Cargo.toml.
Replace the contents of src/main.rs with listing 8.9. It is available from ch8/ch8-resolve/src/main.rs.

The following snippet provides a view of how the files of the project and the listings are interlinked:

ch8-resolve
├── Cargo.toml      ①
└── src
    └── main.rs     ②

① See listing 8.8

② See listing 8.9

Listing 8.8 Crate metadata for the resolve app

[package]
name = "resolve"
version = "0.1.0"
authors = ["Tim McNamara <author@rustinaction.com>"]
edition = "2018"
 
[dependencies]
rand = "0.6"
clap = "2.33"
trust-dns = { version = "0.16", default-features = false }

Listing 8.9 A command-line utility to resolve IP addresses from hostnames

 1 use std::net::{SocketAddr, UdpSocket};
 2 use std::time::Duration;
 3  
 4 use clap::{App, Arg};
 5 use rand;
 6 use trust_dns::op::{Message, MessageType, OpCode, Query};
 7 use trust_dns::rr::domain::Name;
 8 use trust_dns::rr::record_type::RecordType;
 9 use trust_dns::serialize::binary::*;
10  
11 fn main() {
12   let app = App::new("resolve")
13     .about("A simple to use DNS resolver")
14     .arg(Arg::with_name("dns-server").short("s").default_value("1.1.1.1"))
15     .arg(Arg::with_name("domain-name").required(true))
16     .get_matches();
17  
18   let domain_name_raw = app                            ①
19     .value_of("domain-name").unwrap();                 ①
20   let domain_name =                                    ①
21     Name::from_ascii(&domain_name_raw).unwrap();       ①
22  
23   let dns_server_raw = app                             ②
24     .value_of("dns-server").unwrap();                  ②
25   let dns_server: SocketAddr =                         ②
26     format!("{}:53", dns_server_raw)                   ②
27     .parse()                                           ②
28     .expect("invalid address");                        ②
29  
30   let mut request_as_bytes: Vec<u8> =                  ③
31     Vec::with_capacity(512);                           ③
32   let mut response_as_bytes: Vec<u8> =                 ③
33     vec![0; 512];                                      ③
34  
35   let mut msg = Message::new();                        ④
36   msg
37     .set_id(rand::random::<u16>())
38     .set_message_type(MessageType::Query)              ⑤
39     .add_query(Query::query(domain_name, RecordType::A))
40     .set_op_code(OpCode::Query)
41     .set_recursion_desired(true);
42  
43   let mut encoder =
44     BinEncoder::new(&mut request_as_bytes);            ⑥
45   msg.emit(&mut encoder).unwrap();
46  
47   let localhost = UdpSocket::bind("0.0.0.0:0")         ⑦
48     .expect("cannot bind to local socket");
49   let timeout = Duration::from_secs(3);
50   localhost.set_read_timeout(Some(timeout)).unwrap();
51   localhost.set_nonblocking(false).unwrap();
52  
53   let _amt = localhost
54     .send_to(&request_as_bytes, dns_server)
55     .expect("socket misconfigured");
56  
57   let (_amt, _remote) = localhost
58     .recv_from(&mut response_as_bytes)
59     .expect("timeout reached");
60  
61   let dns_message = Message::from_vec(&response_as_bytes)
62     .expect("unable to parse response");
63  
64   for answer in dns_message.answers() {
65     if answer.record_type() == RecordType::A {
66       let resource = answer.rdata();
67       let ip = resource
68         .to_ip_addr()
69         .expect("invalid IP address received");
70       println!("{}", ip.to_string());
71     }
72   }
73 }

① Converts the command-line argument to a typed domain name

② Converts the command-line argument to a typed DNS server

③ An explanation of why two forms of initializing are used is provided after the listing.

④ Message represents a DNS message, which is a container for queries and other information such as answers.

⑤ Specifies that this is a DNS query, not a DNS answer. Both have the same representation over the wire, but not in Rust’s type system.

⑥ Converts the Message type into raw bytes with BinEncoder

⑦ 0.0.0.0:0 means listen to all addresses on a random port. The OS selects the actual port.

Listing 8.9 includes some business logic that deserves explaining. Lines 30–33, repeated here, use two forms of initializing a Vec<u8>. Why?

30   let mut request_as_bytes: Vec<u8> =
31     Vec::with_capacity(512);
32   let mut response_as_bytes: Vec<u8> =
33     vec![0; 512];

Each form creates a subtly different outcome:

Vec::with_capacity(512) creates a Vec<T> with length 0 and capacity 512.
vec![0; 512] creates a Vec<T> with length 512 and capacity 512.

The underlying array looks the same, but the difference in length is significant. The call to recv_from() at line 58 receives response_as_bytes as a slice, and a slice exposes only the elements counted by the vector’s length, not its spare capacity. A buffer created with Vec::with_capacity(512) therefore offers no room for the reply, and the program crashes. Knowing how to wriggle around with initialization can be handy for satisfying an API’s expectations.
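The difference is easy to observe without a network. In the sketch below, a byte slice stands in for the incoming datagram, and std::io::Read::read() plays the role of recv_from(). That substitution is an assumption made for illustration. Both calls write into a &mut [u8], so both are limited by the buffer’s length.

```rust
use std::io::Read;

// Hypothetical stand-in for receiving a reply: copies as much of
// "reply" as fits into the existing elements of `buf`.
fn bytes_received(buf: &mut Vec<u8>) -> usize {
    let mut incoming: &[u8] = b"reply";
    incoming.read(buf).unwrap()
}

fn main() {
    let mut reserved: Vec<u8> = Vec::with_capacity(512); // length 0
    let mut zeroed: Vec<u8> = vec![0; 512];              // length 512

    println!("reserved: len = {}", reserved.len());
    println!("zeroed:   len = {}", zeroed.len());

    // Spare capacity is invisible to the reader; only length counts.
    println!("into reserved: {} bytes", bytes_received(&mut reserved));
    println!("into zeroed:   {} bytes", bytes_received(&mut zeroed));
}
```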

UDP does not have a notion of long-lived connections. Unlike TCP, all messages are short-lived and one-way. Put another way, UDP does not support two-way (duplex) communications. But DNS requires a response to be sent from the DNS server back to the client.

To enable two-way communications within UDP, both parties must act as clients and servers, depending on context. That context is defined by the protocol built on top of UDP. Within DNS, the client becomes a UDP server to receive the DNS server’s reply. The following table outlines the process.

Stage	DNS client role	DNS server role
Request sent from DNS client	UDP client	UDP server
Reply sent from DNS server	UDP server	UDP client
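To make the role swap concrete, here is a minimal sketch that uses two sockets on the loopback interface. It is not a DNS exchange: the payload is plain text, and the function name round_trip() and the reply format are assumptions made for illustration. What it shows is that each socket both sends and receives.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Hypothetical helper: sends `query` from one loopback socket to another
/// and returns the reply that comes back.
fn round_trip(query: &[u8]) -> io::Result<Vec<u8>> {
    // Port 0 asks the OS to pick any free port.
    let server = UdpSocket::bind("127.0.0.1:0")?;
    let client = UdpSocket::bind("127.0.0.1:0")?;
    // Avoid blocking forever if a datagram goes missing.
    server.set_read_timeout(Some(Duration::from_secs(2)))?;
    client.set_read_timeout(Some(Duration::from_secs(2)))?;

    // Stage 1: the "DNS client" acts as a UDP client.
    client.send_to(query, server.local_addr()?)?;

    // The "DNS server" receives the request and learns the return address.
    let mut buf = [0u8; 512];
    let (n, reply_to) = server.recv_from(&mut buf)?;

    // Stage 2: the roles flip. The "DNS server" now sends...
    let mut reply = b"reply to: ".to_vec();
    reply.extend_from_slice(&buf[..n]);
    server.send_to(&reply, reply_to)?;

    // ...and the "DNS client" listens, as a UDP server would.
    let (n, _from) = client.recv_from(&mut buf)?;
    Ok(buf[..n].to_vec())
}

fn main() -> io::Result<()> {
    let reply = round_trip(b"www.rustinaction.com")?;
    println!("{}", String::from_utf8_lossy(&reply));
    Ok(())
}
```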

It’s time to recap. Our overall task in this section was to make HTTP requests. HTTP is built on TCP. Because we only had a domain name (www.rustinaction.com) when we made the request, we needed to use DNS. DNS is primarily delivered over UDP, so we needed to take a diversion and learn about UDP.

Now it’s almost time to return to TCP. Before we’re able to do that, though, we need to learn how to combine error types that emerge from multiple dependencies.

8.5 Ergonomic error handling for libraries

Rust’s error handling is safe and sophisticated. However, it offers a few challenges. When a function incorporates Result types from two upstream crates, the ? operator no longer works because it only understands a single type. This proves to be important when we refactor our domain resolution code to work alongside our TCP code. This section discusses some of those challenges as well as strategies for managing them.

8.5.1 Issue: Unable to return multiple error types

Returning a Result<T, E> works great when there is a single error type E. But things become more complicated when we want to work with multiple error types.

Tip For single files, compile the code with rustc <filename> rather than using cargo build. For example, if a file is named io-error.rs, then the shell command is rustc io-error.rs && ./io-error[.exe].

To start, let’s look at a small example that covers the easy case of a single error type. We’ll try to open a file that does not exist. When run, listing 8.10 prints a short message in Rust syntax:

$ rustc ch8/misc/io-error.rs && ./io-error 
Error: Os { code: 2, kind: NotFound, message: "No such file or directory" }

We won’t win any awards for user experience here, but we get a chance to learn a new language feature. The following listing provides the code that produces a single error type. You’ll find its source in ch8/misc/io-error.rs.

Listing 8.10 A Rust program that always produces an I/O error

use std::fs::File;

fn main() -> Result<(), std::io::Error> {
    let _f = File::open("invisible.txt")?;

    Ok(())
}

Now, let’s introduce another error type into main(). The next listing produces a compiler error, but we’ll work through some options to get the code to compile. The code for this listing is in ch8/misc/multierror.rs.

Listing 8.11 A function that attempts to return multiple Result types

 1 use std::fs::File;
 2 use std::net::Ipv6Addr;
 3  
 4 fn main() -> Result<(), std::io::Error> {
 5   let _f = File::open("invisible.txt")?;    ①
 6  
 7   let _localhost = "::1"                    ②
 8     .parse::<Ipv6Addr>()?;                  ②
 9  
10   Ok(())
11 }

① File::open() returns Result<File, std::io::Error>.

② "::1".parse::<Ipv6Addr>() returns Result<Ipv6Addr, std::net::AddrParseError>.

To compile listing 8.11, enter the ch8/misc directory and use rustc. This produces quite a stern, yet helpful, error message:

$ rustc multierror.rs 
error[E0277]: `?` couldn't convert the error to `std::io::Error`
 --> multierror.rs:8:25
  |
4 | fn main() -> Result<(), std::io::Error> {
  |              -------------------------- expected `std::io::Error`
                                            because of this
...
8 |     .parse::<Ipv6Addr>()?;
  |                         ^ the trait `From<AddrParseError>`
                              is not implemented for `std::io::Error`
  |
  = note: the question mark operation (`?`) implicitly performs a
          conversion on the error value using the `From` trait
  = help: the following implementations were found:
            <std::io::Error as From<ErrorKind>>
            <std::io::Error as From<IntoInnerError<W>>>
            <std::io::Error as From<NulError>>
  = note: required by `from`
 
error: aborting due to previous error
 
For more information about this error, try `rustc --explain E0277`.

The error message can be difficult to interpret if you don’t know what the question mark operator (?) is doing. Why are there multiple messages about std::convert::From? Well, the ? operator is syntactic sugar for the try! macro. try! performs two functions:

When it detects Ok(value), the expression evaluates to value.
When Err(err) occurs, try!/? returns early after attempting to convert err to the error type defined in the calling function.

In Rust-like pseudocode, the try! macro could be defined as

macro try {
  match expression {
    Result::Ok(val) => val,                        ①
    Result::Err(err) => {
      let converted = convert::From::from(err);    ②
      return Result::Err(converted);               ③
    }
  }
}

① Uses val when an expression matches Result::Ok(val)

② Converts err to the outer function’s error type when it matches Result::Err(err) and then returns early

③ Returns from the calling function, not the try! macro itself
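The pseudocode can be turned into a working macro. The sketch below is an approximation rather than the standard library’s definition. Because try is a reserved word in the 2018 edition, the macro is named my_try!, and AppError and the two double_* functions are invented for the example.

```rust
use std::num::ParseIntError;

/// Hypothetical error type for this sketch.
#[derive(Debug)]
struct AppError(String);

// Both `?` and my_try! rely on a conversion like this one existing.
impl From<ParseIntError> for AppError {
    fn from(err: ParseIntError) -> Self {
        AppError(format!("not a number: {}", err))
    }
}

// A stand-in for the try! macro described in the text.
macro_rules! my_try {
    ($expression:expr) => {
        match $expression {
            Ok(val) => val,
            // Converts the error, then returns from the *calling* function.
            Err(err) => return Err(From::from(err)),
        }
    };
}

fn double_with_macro(text: &str) -> Result<i32, AppError> {
    let n = my_try!(text.parse::<i32>());
    Ok(n * 2)
}

fn double_with_question_mark(text: &str) -> Result<i32, AppError> {
    let n = text.parse::<i32>()?;
    Ok(n * 2)
}

fn main() {
    println!("{:?}", double_with_macro("21"));
    println!("{:?}", double_with_question_mark("21"));

    if let Err(AppError(msg)) = double_with_macro("twenty-one") {
        println!("{}", msg);
    }
}
```

Removing the From<ParseIntError> implementation makes both functions fail to compile, with the same kind of complaint about a missing From conversion.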

Looking at listing 8.11 again, we can see the try! macro in action as ?:

 4 fn main() -> Result<(), std::io::Error> {
 5   let _f = File::open("invisible.txt")?;     ①
 6  
 7   let _localhost = "::1"                     ②
 8     .parse::<Ipv6Addr>()?;                   ②
 9  
10   Ok(())
11 }

① File::open() returns std::io::Error, so no conversion is necessary.

② "::1".parse() presents ? with a std::net::AddrParseError. We don’t define how to convert std::net::AddrParseError to std::io::Error, so the program fails to compile.

In addition to saving you from needing to use explicit pattern matching to extract the value or return an error, the ? operator also attempts to convert its argument into an error type if required. Because the signature of main is main() -> Result<(), std::io::Error>, Rust attempts to convert the std::net::AddrParseError produced by parse::<Ipv6Addr>() into a std::io::Error. Don’t worry, though; we can fix this! Earlier, in section 8.3, we introduced trait objects. Now we’ll be able to put those to good use.

Using Box<dyn Error> as the error variant in the main() function allows us to progress. The dyn keyword is short for dynamic, implying that there is a runtime cost for this flexibility. Running listing 8.12 produces this output:

$ rustc ch8/misc/traiterror.rs && ./traiterror
Error: Os { code: 2, kind: NotFound, message: "No such file or directory" }

I suppose it’s a limited form of progress, but progress nonetheless. We’ve circled back to the error we started with. But we’ve passed through the compiler error, which is what we wanted.

Going forward, let’s look at listing 8.12. It implements a trait object in a return value to simplify error handling when errors originate from multiple upstream crates. You can find the source for this listing in ch8/misc/traiterror.rs.

Listing 8.12 Using a trait object in a return value

use std::fs::File;
use std::error::Error;
use std::net::Ipv6Addr;

fn main() -> Result<(), Box<dyn Error>> {      ①

  let _f = File::open("invisible.txt")?;       ②

  let _localhost = "::1"
    .parse::<Ipv6Addr>()?;                     ③

  Ok(())
}

① A trait object, Box<dyn Error>, represents any type that implements Error.

② Error type is std::io::Error

③ Error type is std::net::AddrParseError

Wrapping trait objects in Box is necessary because their size (in bytes on the stack) is unknown at compile time. In the case of listing 8.12, the trait object might originate from either File::open() or "::1".parse(). What actually happens depends on the circumstances encountered at runtime. A Box has a known size on the stack. Its raison d’être is to point to things that don’t, such as trait objects.

8.5.2 Wrapping upstream errors by defining our own error type

The problem that we are attempting to solve is that each of our dependencies defines its own error type. A function that encounters several error types cannot return them through a Result that has a single concrete error type. The first strategy we looked at was to use trait objects, but trait objects have a potentially significant downside.

Using trait objects is also known as type erasure. Rust is no longer aware that an error has originated upstream. Using Box<dyn Error> as the error variant of a Result means that the upstream error types are, in a sense, lost. The original errors are now converted to exactly the same type.
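To see what type erasure means in practice, consider the following sketch. The functions open_and_parse() and describe() are invented for the example. Once an error is behind Box<dyn Error>, the only way to learn its concrete type is to guess at runtime with downcast_ref(), and the compiler cannot tell us if we forget a case.

```rust
use std::error::Error;
use std::fs::File;
use std::io;
use std::net::{AddrParseError, Ipv6Addr};

/// Hypothetical helper that can fail in two unrelated ways.
fn open_and_parse(path: &str, addr: &str) -> Result<Ipv6Addr, Box<dyn Error>> {
    let _f = File::open(path)?;
    let ip = addr.parse::<Ipv6Addr>()?;
    Ok(ip)
}

/// Recovers the concrete type at runtime, one guess at a time.
fn describe(err: &(dyn Error + 'static)) -> &'static str {
    if err.downcast_ref::<io::Error>().is_some() {
        "I/O error"
    } else if err.downcast_ref::<AddrParseError>().is_some() {
        "address parsing error"
    } else {
        "unknown error"
    }
}

fn main() {
    if let Err(err) = open_and_parse("invisible.txt", "::1") {
        // &*err borrows the trait object inside the Box.
        println!("{}: {}", describe(&*err), err);
    }
}
```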

It is possible to retain the upstream errors, but this requires more work on our behalf. We need to bundle upstream errors in our own type. When the upstream errors are needed later (say, for reporting errors to the user), it’s possible to extract these with pattern matching. Here is the process:

Define an enum that includes the upstream errors as variants.
Annotate the enum with #[derive(Debug)].
Implement Display.
Implement Error, which almost comes for free because we have implemented Debug and Display.
Use map_err() in your code to convert the upstream error to your omnibus error type.

Note You haven’t previously encountered the map_err() function. We’ll explain what it does when we get there later in this section.

It’s possible to stop with the previous steps, but there’s an optional extra step that improves the ergonomics. We can implement std::convert::From to remove the need to call map_err(). To begin, let’s start back with listing 8.11, where we know that the code fails:

use std::fs::File;
use std::net::Ipv6Addr;
 
fn main() -> Result<(), std::io::Error> {
  let _f = File::open("invisible.txt")?;
 
  let _localhost = "::1"
    .parse::<Ipv6Addr>()?;
 
  Ok(())
}

This code fails because "::1".parse::<Ipv6Addr>() does not return a std::io::Error. What we want to end up with is code that looks a little more like the following listing.

Listing 8.13 Hypothetical example of the kind of code we want to write

use std::fs::File;
use std::io::Error;              ①
use std::net::AddrParseError;    ①
use std::net::Ipv6Addr;

enum UpstreamError{
  IO(std::io::Error),
  Parsing(AddrParseError),
}

fn main() -> Result<(), UpstreamError> {
  let _f = File::open("invisible.txt")?
    .maybe_convert_to(UpstreamError);

  let _localhost = "::1"
    .parse::<Ipv6Addr>()?
    .maybe_convert_to(UpstreamError);

  Ok(())
}

① Brings upstream errors into local scope

Define an enum that includes the upstream errors as variants

The first thing to do is to return a type that can hold the upstream error types. In Rust, an enum works well. Listing 8.13 does not compile, but does do this step. We’ll tidy up the imports slightly, though:

use std::io;
use std::net;
 
enum UpstreamError{
  IO(io::Error),
  Parsing(net::AddrParseError),
}

Annotate the enum with #[derive(Debug)]

The next change is easy. It’s a single-line change—the best kind of change. To annotate the enum, we’ll add #[derive(Debug)], as the following shows:

use std::io;
use std::net;
 
#[derive(Debug)]
enum UpstreamError{
  IO(io::Error),
  Parsing(net::AddrParseError),
}

Implement std::fmt::Display

We’ll cheat slightly and implement Display by simply using Debug. We know that this is available to us because errors must define Debug. Here’s the updated code:

use std::fmt;
use std::io;
use std::net;
 
#[derive(Debug)]
enum UpstreamError{
  IO(io::Error),
  Parsing(net::AddrParseError),
}
 
impl fmt::Display for UpstreamError {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    write!(f, "{:?}", self)                                   ①
  }
}

① Implements Display in terms of Debug via the "{:?}" syntax

Implement std::error::Error

Here’s another easy change. To end up with the kind of code that we’d like to write, let’s make the following change:

use std::error;                            ①
use std::fmt;
use std::io;
use std::net;
 
#[derive(Debug)]
enum UpstreamError{
  IO(io::Error),
  Parsing(net::AddrParseError),
}
 
impl fmt::Display for UpstreamError {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    write!(f, "{:?}", self)
  }
}
 
impl error::Error for UpstreamError { }    ②

① Brings the std::error::Error trait into local scope

② Defers to default method implementations. The compiler will fill in the blanks.

The impl block is especially terse. Because std::error::Error provides default implementations of every method that it defines, we can leave the block empty and let the compiler fill in the rest.
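As a quick check of what the empty impl block buys us, the sketch below calls one of the defaulted methods, source(), and relies on ? to box our type as a trait object. The always_fails() function is invented for the example.

```rust
use std::error::Error;
use std::fmt;
use std::io;
use std::net;

#[derive(Debug)]
#[allow(dead_code)] // Parsing is not constructed in this short sketch
enum UpstreamError {
    IO(io::Error),
    Parsing(net::AddrParseError),
}

impl fmt::Display for UpstreamError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}

impl Error for UpstreamError {} // empty: every method has a default

/// Hypothetical function that always fails with our error type.
fn always_fails() -> Result<(), Box<dyn Error>> {
    let cause = io::Error::new(io::ErrorKind::NotFound, "missing");
    let result: Result<(), UpstreamError> = Err(UpstreamError::IO(cause));
    result?; // `?` can box UpstreamError because it implements Error
    Ok(())
}

fn main() {
    let cause = io::Error::new(io::ErrorKind::NotFound, "missing");
    let err = UpstreamError::IO(cause);

    // source() is a defaulted method; by default it reports no cause.
    println!("source is none: {}", err.source().is_none());
    println!("boxed: {:?}", always_fails());
}
```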

Use map_err()

The next fix is to add map_err() to our code to convert the upstream error to the omnibus error type. Back at listing 8.13, we wanted to have a main() that looks like this:

fn main() -> Result<(), UpstreamError> {
  let _f = File::open("invisible.txt")?
    .maybe_convert_to(UpstreamError);
 
  let _localhost = "::1"
    .parse::<Ipv6Addr>()?
    .maybe_convert_to(UpstreamError);
 
  Ok(())
}

I can’t offer you that. I can, however, give you this:

fn main() -> Result<(), UpstreamError> {
  let _f = File::open("invisible.txt")
    .map_err(UpstreamError::IO)?;
 
  let _localhost = "::1"
    .parse::<Ipv6Addr>()
    .map_err(UpstreamError::Parsing)?;
 
  Ok(())
}

This new code works! Here’s how. The map_err() method applies a function to the error inside an Err and leaves an Ok untouched. (Variants of our UpstreamError enum can be used as functions here.) Note that the ? operator needs to be at the end. Otherwise, the function can return before the code has a chance to convert the error.

Listing 8.14 provides the new code. When run, it produces this message to the console:

$ rustc ch8/misc/wraperror.rs && ./wraperror 
Error: IO(Os { code: 2, kind: NotFound, message: "No such file or directory" })

To retain type safety, we can use the new code in the following listing. You’ll find its source in ch8/misc/wraperror.rs.

Listing 8.14 Wrapping upstream errors in our own type

use std::error;
use std::io;
use std::fmt;
use std::net;
use std::fs::File;
use std::net::Ipv6Addr;

#[derive(Debug)]
enum UpstreamError{
  IO(io::Error),
  Parsing(net::AddrParseError),
}

impl fmt::Display for UpstreamError {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    write!(f, "{:?}", self)
  }
}

impl error::Error for UpstreamError { }

fn main() -> Result<(), UpstreamError> {
  let _f = File::open("invisible.txt")
    .map_err(UpstreamError::IO)?;

  let _localhost = "::1"
    .parse::<Ipv6Addr>()
    .map_err(UpstreamError::Parsing)?;

  Ok(())
}

It’s also possible to remove the calls to map_err(). But to enable that, we need to implement From.

Implement std::convert::From to remove the need to call map_err()

The std::convert::From trait has a single required method, from(). We need two impl blocks to enable our two upstream error types to be convertible. The following snippet shows how:

impl From<io::Error> for UpstreamError {
  fn from(error: io::Error) -> Self {
    UpstreamError::IO(error)
  }
}
 
impl From<net::AddrParseError> for UpstreamError {
  fn from(error: net::AddrParseError) -> Self {
    UpstreamError::Parsing(error)
  }
}

Now the main() function returns to a simple form of itself:

fn main() -> Result<(), UpstreamError> {
  let _f = File::open("invisible.txt")?;
  let _localhost = "::1".parse::<Ipv6Addr>()?;
 
  Ok(())
}

The full code listing is provided in listing 8.15. Implementing From places the burden of extra syntax on the library writer. It results in a much easier experience when using your crate, simplifying its use by downstream programmers. You’ll find the source for this listing in ch8/misc/wraperror2.rs.

Listing 8.15 Implementing std::convert::From for our wrapper error type

use std::error;
use std::io;
use std::fmt;
use std::net;
use std::fs::File;
use std::net::Ipv6Addr;

#[derive(Debug)]
enum UpstreamError{
  IO(io::Error),
  Parsing(net::AddrParseError),
}

impl fmt::Display for UpstreamError {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    write!(f, "{:?}", self)
  }
}

impl error::Error for UpstreamError { }

impl From<io::Error> for UpstreamError {
  fn from(error: io::Error) -> Self {
    UpstreamError::IO(error)
  }
}

impl From<net::AddrParseError> for UpstreamError {
  fn from(error: net::AddrParseError) -> Self {
    UpstreamError::Parsing(error)
  }
}

fn main() -> Result<(), UpstreamError> {
  let _f = File::open("invisible.txt")?;
  let _localhost = "::1".parse::<Ipv6Addr>()?;

  Ok(())
}
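The payoff for wrapping errors in an enum is that callers can recover the specific upstream error with pattern matching. The following sketch illustrates this. The open_and_parse() and report() functions are invented for the example, and the Display and Error implementations are omitted for brevity.

```rust
use std::fs::File;
use std::io;
use std::net::{self, Ipv6Addr};

#[derive(Debug)]
enum UpstreamError {
    IO(io::Error),
    Parsing(net::AddrParseError),
}

impl From<io::Error> for UpstreamError {
    fn from(error: io::Error) -> Self {
        UpstreamError::IO(error)
    }
}

impl From<net::AddrParseError> for UpstreamError {
    fn from(error: net::AddrParseError) -> Self {
        UpstreamError::Parsing(error)
    }
}

/// Hypothetical helper that can fail in either of the two ways.
fn open_and_parse(path: &str, addr: &str) -> Result<Ipv6Addr, UpstreamError> {
    let _f = File::open(path)?;
    let ip = addr.parse::<Ipv6Addr>()?;
    Ok(ip)
}

/// Hypothetical reporting function. The variants keep the upstream types
/// intact, so each arm can call methods specific to that type.
fn report(err: &UpstreamError) -> String {
    match err {
        UpstreamError::IO(e) if e.kind() == io::ErrorKind::NotFound => {
            String::from("file is missing")
        }
        UpstreamError::IO(e) => format!("I/O problem: {}", e),
        UpstreamError::Parsing(e) => format!("bad address: {}", e),
    }
}

fn main() {
    match open_and_parse("invisible.txt", "::1") {
        Ok(ip) => println!("parsed {}", ip),
        Err(err) => println!("{}", report(&err)),
    }
}
```

Unlike a trait object, the enum lets the compiler check that every variant is handled.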

8.5.3 Cheating with unwrap() and expect()

The final approach for dealing with multiple error types is to use unwrap() and expect(). Now that we have the tools to handle multiple error types in a function, we can continue our journey.

Note This is a reasonable approach when writing a main() function, but it isn’t recommended for library authors. Your users don’t want their programs to crash because of things outside of their control.
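As a brief illustration, expect() behaves like unwrap() but attaches a message to the panic. The function name localhost() is invented for this sketch.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: parsing a hard-coded literal cannot reasonably fail,
/// so a panic here would signal a programming bug rather than bad input.
fn localhost() -> Ipv6Addr {
    "::1"
        .parse::<Ipv6Addr>()
        .expect("\"::1\" should be a valid IPv6 address")
}

fn main() {
    let addr = localhost();
    println!("{}", addr);

    // unwrap() also panics on Err, but without a custom message.
    let port: u16 = "8080".parse().unwrap();
    println!("{}", port);
}
```

Neither call requires the enclosing function to return a Result, which is why the approach sidesteps the question of combining error types.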

8.6 MAC addresses

Several pages ago in listing 8.9, you implemented a DNS resolver. That enabled conversions from a host name such as www.rustinaction.com to an IP address. Now we have an IP address to connect to.

The internet protocol enables devices to contact each other via their IP addresses. But that’s not all. Every hardware device also includes a unique identifier that’s independent of the network it’s connected to. Why a second number? The answer is partially technical and partially historical.

Ethernet networking and the internet started life independently. Ethernet’s focus was on local area networks (LANs). The internet was developed to enable communication between networks, and Ethernet is the addressing system understood by devices that share a physical link (or a radio link in the case of WiFi, Bluetooth, and other wireless technologies).

Perhaps a better way to express this is that MAC (short for media access control) addresses are used by devices that share electrons (figure 8.3). There are a few differences between MAC addresses and IP addresses:

IP addresses are hierarchical, but MAC addresses are not. Addresses appearing close together numerically are not close together physically, or organizationally.
MAC addresses are 48 bits (6 bytes) wide. IP addresses are 32 bits (4 bytes) wide for IPv4 and 128 bits (16 bytes) for IPv6.
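The widths are easy to confirm with the standard library’s address types. In this small sketch, the function name widths_in_bits() and the sample addresses are arbitrary choices. The standard library has no MAC address type, so the MAC address is modeled as a bare array of six bytes.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Returns the width in bits of a MAC, an IPv4, and an IPv6 address.
fn widths_in_bits() -> (usize, usize, usize) {
    let mac: [u8; 6] = [0x02, 0x00, 0x00, 0x00, 0x00, 0x01]; // arbitrary example
    let v4 = Ipv4Addr::new(192, 168, 42, 100);
    let v6 = Ipv6Addr::LOCALHOST; // ::1

    (
        mac.len() * 8,
        v4.octets().len() * 8,
        v6.octets().len() * 8,
    )
}

fn main() {
    let (mac, v4, v6) = widths_in_bits();
    println!("MAC: {} bits, IPv4: {} bits, IPv6: {} bits", mac, v4, v6);
}
```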

Figure 8.3 In-memory layout for MAC addresses

There are two forms of MAC addresses:

Universally administered (or universal) addresses are set when devices are manufactured. Manufacturers use a prefix assigned by the IEEE Registration Authority and a scheme of their choosing for the remaining bits.
Locally administered (or local) addresses allow devices to create their own MAC addresses without registration. When setting a device’s MAC address yourself in software, you should make sure that your address is set to the local form.

MAC addresses have two modes: unicast and multicast. The transmission behavior for these forms is identical. The distinction is made when a device makes a decision about whether to accept a frame. A frame is a term used by the Ethernet protocol for a byte slice at this level. Analogies to frame include a packet, wrapper, and envelope. Figure 8.4 shows this distinction.

Figure 8.4 The differences between multicast and unicast MAC addresses

Unicast addresses are intended to transport information between two points that are in direct contact (say, between a laptop and a router). Wireless access points complicate matters somewhat but don’t change the fundamentals. A multicast address can be accepted by multiple recipients, whereas unicast has a single recipient. The term unicast is somewhat misleading, though. Sending an Ethernet packet involves more than two devices. Using a unicast address alters what devices do when they receive packets but not which data is transmitted over the wire (or through the radio waves).
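Both distinctions are encoded in the first byte of the address. The sketch below follows the IEEE 802 convention: in the first octet, the least significant bit is 0 for unicast and 1 for multicast, and the next bit is 1 for locally administered addresses. The Mac type and the sample addresses are invented for the example.

```rust
/// Hypothetical wrapper around the six bytes of a MAC address.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Mac([u8; 6]);

impl Mac {
    /// Second-lowest bit of the first octet: 1 = local, 0 = universal.
    fn is_local(&self) -> bool {
        (self.0[0] & 0b_0000_0010) != 0
    }

    /// Lowest bit of the first octet: 0 = unicast, 1 = multicast.
    fn is_unicast(&self) -> bool {
        (self.0[0] & 0b_0000_0001) == 0
    }
}

fn main() {
    let samples = [
        Mac([0x02, 0x00, 0x00, 0x00, 0x00, 0x01]), // local, unicast
        Mac([0x00, 0x11, 0x22, 0x33, 0x44, 0x55]), // universal, unicast
        Mac([0x01, 0x00, 0x5e, 0x00, 0x00, 0x01]), // universal, multicast
        Mac([0xff, 0xff, 0xff, 0xff, 0xff, 0xff]), // broadcast
    ];

    for mac in samples.iter() {
        println!(
            "{:02x?} local={} unicast={}",
            mac.0,
            mac.is_local(),
            mac.is_unicast()
        );
    }
}
```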

8.6.1 Generating MAC addresses

When we begin talking about raw TCP in section 8.8, we’ll create a virtual hardware device in listing 8.22. To convince anything to talk to us, we need to learn how to assign our virtual device a MAC address. The macgen project in listing 8.17 generates the MAC addresses for us. The following listing shows the metadata for that project. You can find its source in ch8/ch8-mac/Cargo.toml.

Listing 8.16 Crate metadata for the macgen project

[package]
name = "ch8-macgen"
version = "0.1.0"
authors = ["Tim McNamara <author@rustinaction.com>"]
edition = "2018"
 
[dependencies]
rand = "0.7"

The following listing shows the macgen project, our MAC address generator. The source code for this project is in the ch8/ch8-mac/src/main.rs file.

Listing 8.17 Creating macgen, a MAC address generator

 1 extern crate rand;
 2  
 3 use rand::RngCore;
 4 use std::fmt;
 5 use std::fmt::Display;
 6  
 7 #[derive(Debug)]
 8 struct MacAddress([u8; 6]);                           ①
 9  
10 impl Display for MacAddress {
11   fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
12     let octet = &self.0;
13     write!(
14       f,
15       "{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}",    ②
16       octet[0], octet[1], octet[2],                   ②
17       octet[3], octet[4], octet[5]                    ②
18     )
19   }
20 }
21  
22 impl MacAddress {
23   fn new() -> MacAddress {
24     let mut octets: [u8; 6] = [0; 6];
25     rand::thread_rng().fill_bytes(&mut octets);
26     octets[0] |= 0b_0000_0010;                        ③
27     octets[0] &= 0b_1111_1110;                        ④
28     MacAddress { 0: octets }
29   }
30  
31   fn is_local(&self) -> bool {
32     (self.0[0] & 0b_0000_0010) == 0b_0000_0010
33   }
34  
35   fn is_unicast(&self) -> bool {
36     (self.0[0] & 0b_0000_0001) == 0b_0000_0000
37   }
38 }
39  
40 fn main() {
41   let mac = MacAddress::new();
42   assert!(mac.is_local());
43   assert!(mac.is_unicast());
44   println!("mac: {}", mac);
45 }

① Uses the newtype pattern to wrap a bare array without any extra overhead

② Converts each byte to hexadecimal notation

③ Sets the local bit to 1, marking the MAC address as locally administered

④ Clears the multicast bit to 0, marking the MAC address as unicast

The code from listing 8.17 should feel legible. Lines 26 and 27 contain some relatively obscure syntax, though. octets[0] |= 0b_0000_0010 coerces the local flag bit to a state of 1, and octets[0] &= 0b_1111_1110 coerces the multicast flag bit to a state of 0. These are the two flag bits described at figure 8.3, and the same two operations reappear in listing 8.21. Together, they designate every MAC address we generate as locally assigned and unicast.

8.7 Implementing state machines with Rust’s enums

Another prerequisite for handling network messages is being able to define a state machine. Our code needs to adapt to changes in connectivity.

Listing 8.22 contains a state machine, implemented with a loop, a match, and a Rust enum. Because of Rust’s expression-based nature, control flow operators also return values. Every time around the loop, the state is mutated in place. The following listing shows pseudocode for how a loop and a repeated match on an enum work together.

Listing 8.18 Pseudocode for a state machine implementation

enum HttpState {
    Connect,
    Request,
    Response,
}
 
loop {
    state = match state {
        HttpState::Connect if !socket.is_active() => {
            socket.connect();
            HttpState::Request
        }
 
        HttpState::Request if socket.may_send() => {
            socket.send(data);
            HttpState::Response
        }
 
        HttpState::Response if socket.can_recv() => {
            received = socket.recv();
            HttpState::Response
        }
 
        HttpState::Response if !socket.may_recv() => {
            break;
        }
        _ => state,
    }
}

More advanced methods to implement finite state machines do exist. This is the simplest, however. We’ll make use of it in listing 8.22. Making use of an enum embeds the state machine’s transitions into the type system itself.
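To watch the pattern run without any networking, the sketch below swaps the real socket for an in-memory stand-in. FakeSocket, its methods, and the run() function are all invented for illustration and only mimic the method names used in the pseudocode. They are not the smoltcp API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum HttpState {
    Connect,
    Request,
    Response,
}

/// Made-up stand-in for a socket: it "receives" pre-loaded chunks of text.
struct FakeSocket {
    active: bool,
    sent: String,
    incoming: Vec<&'static str>,
}

impl FakeSocket {
    fn new(incoming: Vec<&'static str>) -> Self {
        FakeSocket { active: false, sent: String::new(), incoming }
    }
    fn is_active(&self) -> bool { self.active }
    fn connect(&mut self) { self.active = true; }
    fn may_send(&self) -> bool { self.active }
    fn send(&mut self, data: &str) { self.sent.push_str(data); }
    // In this mock, the peer "closes" once every chunk has been delivered,
    // so can_recv() and may_recv() happen to agree.
    fn can_recv(&self) -> bool { self.active && !self.incoming.is_empty() }
    fn may_recv(&self) -> bool { self.active && !self.incoming.is_empty() }
    fn recv(&mut self) -> &'static str { self.incoming.remove(0) }
}

/// Drives the state machine to completion and returns everything received.
fn run(socket: &mut FakeSocket) -> String {
    let mut state = HttpState::Connect;
    let mut received = String::new();

    loop {
        state = match state {
            HttpState::Connect if !socket.is_active() => {
                socket.connect();
                HttpState::Request
            }
            HttpState::Request if socket.may_send() => {
                socket.send("GET / HTTP/1.0\r\n\r\n");
                HttpState::Response
            }
            HttpState::Response if socket.can_recv() => {
                received.push_str(socket.recv());
                HttpState::Response
            }
            HttpState::Response if !socket.may_recv() => break,
            _ => state,
        };
        println!("now in state {:?}", state);
    }

    received
}

fn main() {
    let mut socket = FakeSocket::new(vec!["HTTP/1.0 200 OK\r\n\r\n", "hello"]);
    let response = run(&mut socket);
    println!("sent {} bytes, received {} bytes", socket.sent.len(), response.len());
}
```

Because the match produces the next state as a value, each arm either names a transition, stays put, or breaks out of the loop.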

But we’re still at a level that is far too high! To dig deeper, we’re going to need to get some assistance from the OS.

8.8 Raw TCP

Working with raw TCP packets typically requires root/superuser access. The OS starts to get quite grumpy when an unauthorized user asks to make raw network requests. We can get around this (on Linux) by creating a proxy device that non-superusers are allowed to communicate with directly.

If you’re running another OS, there are many virtualization options available. Here are a few:

The Multipass project (https://multipass.run/) provides fast Ubuntu virtual machines on macOS and Windows hosts.
WSL, the Windows Subsystem for Linux (https://docs.microsoft.com/en-us/windows/wsl/about), is another option to look into.
Oracle VirtualBox (https://www.virtualbox.org/) is an open source project with excellent support for many host operating systems.

8.9 Creating a virtual networking device

To proceed with this section, you will need to create virtual networking hardware. Using virtual hardware provides more control to freely assign IP and MAC addresses. It also avoids changing your hardware settings, which could affect its ability to connect to the network. To create a TAP device called tap-rust, execute the following command in your Linux console:

$ sudo \                ①
>  ip tuntap \          ②
>    add \              ③
>    mode tap \         ④
>    name tap-rust \    ⑤
>    user $USER         ⑥

① Executes as the root user

② Tells ip that we’re managing TUN/TAP devices

③ Uses the add subcommand

④ Uses the TAP tunnelling mode

⑤ Gives your device a unique name

⑥ Grants access to your non-root user account

When successful, ip prints no output. To confirm that our tap-rust device was added, we can use the ip tuntap list subcommand as in the following snippet. When executed, you should see the tap-rust device in the list of devices in the output:

$ ip tuntap list 
tap-rust: tap persist user

Now that we have created a networking device, we also need to allocate an IP address for it and tell our system to forward packets to it. The following shows the commands to enable this functionality:

$ sudo ip link set tap-rust up                        ①
$ sudo ip addr add 192.168.42.100/24 dev tap-rust     ②
 
$ sudo iptables \                                     ③
>   -t nat \                                          ③
>   -A POSTROUTING \                                  ③
>   -s 192.168.42.0/24 \                              ③
>   -j MASQUERADE                                     ③
 
 
$ sudo sysctl net.ipv4.ip_forward=1                   ④

① Activates the tap-rust network device that we created earlier

② Assigns the IP address 192.168.42.100 to the device

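③ Enables replies from the internet to reach the device by appending a rule (-A POSTROUTING) to the NAT table (-t nat) that masquerades (-j MASQUERADE) packets whose source address is within the range 192.168.42.0/24 (-s 192.168.42.0/24). Masquerading rewrites the source address of outgoing packets to that of the outgoing interface.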

④ Instructs the kernel to enable IPv4 packet forwarding

The following shows how to remove the device (once you have completed this chapter) by using del rather than add:

$ sudo ip tuntap del mode tap name tap-rust

8.10 “Raw” HTTP

We should now have all the knowledge we need to take on the challenge of using HTTP at the TCP level. The mget project (mget is short for manually get) spans listings 8.20–8.23. It is a large project, but you’ll find it immensely satisfying to understand and build. Each file plays a different role:

main.rs (listing 8.20)—Handles command-line parsing and weaves together the functionality provided by its peer files. This is where we combine the error types using the process outlined in section 8.5.2.
ethernet.rs (listing 8.21)—Generates a MAC address using the logic from listing 8.17 and converts between MAC address types (defined by the smoltcp crate) and our own.
http.rs (listing 8.22)—Carries out the work of interacting with the server to make the HTTP request.
dns.rs (listing 8.23)—Performs DNS resolution, which converts a domain name to an IP address.

Note The source code for these listings (and every code listing in the book) is available from https://github.com/rust-in-action/code or https://www.manning.com/books/rust-in-action.

It’s important to acknowledge that listing 8.22 was derived from the HTTP client example within the smoltcp crate itself. whitequark (https://whitequark.org/) has built an absolutely fantastic networking library. Here’s the file structure for the mget project:

ch8-mget
├── Cargo.toml          ①
└── src
    ├── main.rs         ②
    ├── ethernet.rs     ③
    ├── http.rs         ④
    └── dns.rs          ⑤

① See listing 8.19.

② See listing 8.20.

③ See listing 8.21.

④ See listing 8.22.

⑤ See listing 8.23.

To download and run the mget project from source control, execute these commands at the command line:

$ git clone https://github.com/rust-in-action/code rust-in-action 
Cloning into 'rust-in-action'...
 
$ cd rust-in-action/ch8/ch8-mget

Here are the project setup instructions for those readers who enjoy doing things step by step (with the output omitted).

Enter these commands at the command line:

$ cargo new mget
 
$ cd mget
 
$ cargo install cargo-edit
 
$ cargo add clap@2
 
$ cargo add url@2
 
$ cargo add rand@0.7
 
$ cargo add trust-dns@0.16 --no-default-features
 
$ cargo add smoltcp@0.6 --features='proto-igmp proto-ipv4 verbose log'

Check that your project’s Cargo.toml matches listing 8.19.
Within the src directory, listing 8.20 becomes main.rs, listing 8.21 becomes ethernet.rs, listing 8.22 becomes http.rs, and listing 8.23 becomes dns.rs.

The following listing shows the metadata for mget. You’ll find its source code in the ch8/ch8-mget/Cargo.toml file.

Listing 8.19 Crate metadata for mget

[package]
name = "mget"
version = "0.1.0"
authors = ["Tim McNamara <author@rustinaction.com>"]
edition = "2018"
 
[dependencies]
clap = "2"                        ①
rand = "0.7"                      ②
smoltcp = { version = "0.6", features = ["proto-igmp", "proto-ipv4", "verbose", "log"] }    ③
trust-dns = { version = "0.16", default-features = false }    ④
url = "2"                         ⑤

① Provides command-line argument parsing

② Selects a random port number

③ Provides a TCP implementation

④ Enables connecting to a DNS server

⑤ Parses and validates URLs

The following listing shows the command-line parsing for our project. You’ll find this source in ch8/ch8-mget/src/main.rs.

Listing 8.20 mget command-line parsing and overall coordination

use clap::{App, Arg};
use smoltcp::phy::TapInterface;
use url::Url;

mod dns;
mod ethernet;
mod http;

fn main() {
  let app = App::new("mget")
    .about("GET a webpage, manually")
    .arg(Arg::with_name("url").required(true))           ①
    .arg(Arg::with_name("tap-device").required(true))    ②
    .arg(
      Arg::with_name("dns-server")
        .default_value("1.1.1.1"),                       ③
    )
    .get_matches();                                      ④

  let url_text = app.value_of("url").unwrap();
  let dns_server_text =
    app.value_of("dns-server").unwrap();
  let tap_text = app.value_of("tap-device").unwrap();

  let url = Url::parse(url_text)                         ⑤
    .expect("error: unable to parse <url> as a URL");

  if url.scheme() != "http" {                            ⑤
    eprintln!("error: only HTTP protocol supported");
    return;
  }

  let tap = TapInterface::new(&tap_text)                 ⑤
    .expect(
      "error: unable to use <tap-device> as a \
       network interface",
    );

  let domain_name =
    url.host_str()                                       ⑤
      .expect("domain name required");

  let _dns_server: std::net::Ipv4Addr =
    dns_server_text
      .parse()                                           ⑤
      .expect(
        "error: unable to parse <dns-server> as an \
         IPv4 address",
      );

  let addr =
    dns::resolve(dns_server_text, domain_name)           ⑥
      .unwrap()
      .unwrap();

  let mac = ethernet::MacAddress::new().into();          ⑦

  http::get(tap, mac, addr, url).unwrap();               ⑧

}

① Requires a URL to download data from

② Requires a TAP networking device to connect with

③ Makes it possible for the user to select which DNS server to use

④ Parses the command-line arguments

⑤ Validates the command-line arguments

⑥ Converts the URL’s domain name into an IP address that we can connect to

⑦ Generates a random unicast MAC address

⑧ Makes the HTTP GET request

The following listing generates our MAC address and converts between MAC address types defined by the smoltcp crate and our own. The code for this listing is in ch8/ch8-mget/src/ethernet.rs.

Listing 8.21 Ethernet type conversion and MAC address generation

use std::fmt;
use std::fmt::Display;

use rand::RngCore;
use smoltcp::wire;

#[derive(Debug)]
pub struct MacAddress([u8; 6]);

impl Display for MacAddress {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    let octet = self.0;
    write!(
      f,
      "{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}",
      octet[0], octet[1], octet[2],
      octet[3], octet[4], octet[5]
    )
  }
}

impl MacAddress {
  pub fn new() -> MacAddress {
    let mut octets: [u8; 6] = [0; 6];
    rand::thread_rng().fill_bytes(&mut octets);    ①
    octets[0] |= 0b_0000_0010;                     ②
    octets[0] &= 0b_1111_1110;                     ③
    MacAddress(octets)
  }
}

impl From<MacAddress> for wire::EthernetAddress {
  fn from(mac: MacAddress) -> Self {
    wire::EthernetAddress(mac.0)
  }
}

① Generates a random number

② Ensures that the local address bit is set to 1

③ Ensures the unicast bit is set to 0
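
The two masks applied to the first octet in listing 8.21 can be checked in isolation. The following is a minimal sketch rather than part of the mget project: the helper names make_local_unicast, is_local, and is_unicast are hypothetical, and the function takes fixed bytes instead of random ones so that the result is reproducible.

```rust
/// Applies the same two masks as listing 8.21 to the first octet.
/// `make_local_unicast` is a hypothetical helper name.
fn make_local_unicast(mut octets: [u8; 6]) -> [u8; 6] {
    octets[0] |= 0b_0000_0010; // set the locally administered bit
    octets[0] &= 0b_1111_1110; // clear the multicast bit (unicast)
    octets
}

/// True when the locally administered bit of the first octet is 1.
fn is_local(octets: &[u8; 6]) -> bool {
    (octets[0] & 0b_0000_0010) != 0
}

/// True when the multicast bit of the first octet is 0.
fn is_unicast(octets: &[u8; 6]) -> bool {
    (octets[0] & 0b_0000_0001) == 0
}

fn main() {
    // Worst case input: every bit of the first octet is set.
    let mac = make_local_unicast([0xff, 0x10, 0x20, 0x30, 0x40, 0x50]);
    assert!(is_local(&mac));
    assert!(is_unicast(&mac));
    println!("first octet: {:02x}", mac[0]);
}
```

Whatever the random bytes happen to be, the second-lowest bit of the first octet ends up as 1 and the lowest bit ends up as 0. The other 46 bits keep their random values.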

The following listing shows how to interact with the server to make the HTTP request. The code for this listing is in ch8/ch8-mget/src/http.rs.

Listing 8.22 Manually creating an HTTP request using TCP primitives

use std::collections::BTreeMap;
use std::fmt;
use std::net::IpAddr;
use std::os::unix::io::AsRawFd;

use smoltcp::iface::{EthernetInterfaceBuilder, NeighborCache, Routes};
use smoltcp::phy::{wait as phy_wait, TapInterface};
use smoltcp::socket::{SocketSet, TcpSocket, TcpSocketBuffer};
use smoltcp::time::Instant;
use smoltcp::wire::{EthernetAddress, IpAddress, IpCidr, Ipv4Address};
use url::Url;

// States of the request; each loop iteration returns the next state.
#[derive(Debug)]
enum HttpState {
  Connect,
  Request,
  Response,
}

#[derive(Debug)]
pub enum UpstreamError {
  Network(smoltcp::Error),
  InvalidUrl,
  Content(std::str::Utf8Error),
}

impl fmt::Display for UpstreamError {
  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    write!(f, "{:?}", self)
  }
}

impl From<smoltcp::Error> for UpstreamError {
  fn from(error: smoltcp::Error) -> Self {
    UpstreamError::Network(error)
  }
}

impl From<std::str::Utf8Error> for UpstreamError {
  fn from(error: std::str::Utf8Error) -> Self {
    UpstreamError::Content(error)
  }
}

// Picks a source port from the dynamic range, 49152-65535.
fn random_port() -> u16 {
  49152 + rand::random::<u16>() % 16384
}

pub fn get(
  tap: TapInterface,
  mac: EthernetAddress,
  addr: IpAddr,
  url: Url,
) -> Result<(), UpstreamError> {
  let domain_name = url.host_str().ok_or(UpstreamError::InvalidUrl)?;

  let neighbor_cache = NeighborCache::new(BTreeMap::new());

  let tcp_rx_buffer = TcpSocketBuffer::new(vec![0; 1024]);
  let tcp_tx_buffer = TcpSocketBuffer::new(vec![0; 1024]);
  let tcp_socket = TcpSocket::new(tcp_rx_buffer, tcp_tx_buffer);

  // Our interface's IP address and the default gateway for outbound traffic.
  let ip_addrs = [IpCidr::new(IpAddress::v4(192, 168, 42, 1), 24)];

  let fd = tap.as_raw_fd();
  let mut routes = Routes::new(BTreeMap::new());
  let default_gateway = Ipv4Address::new(192, 168, 42, 100);
  routes.add_default_ipv4_route(default_gateway).unwrap();
  let mut iface = EthernetInterfaceBuilder::new(tap)
    .ethernet_addr(mac)
    .neighbor_cache(neighbor_cache)
    .ip_addrs(ip_addrs)
    .routes(routes)
    .finalize();

  let mut sockets = SocketSet::new(vec![]);
  let tcp_handle = sockets.add(tcp_socket);

  let http_header = format!(
    "GET {} HTTP/1.0\r\nHost: {}\r\nConnection: close\r\n\r\n",
    url.path(),
    domain_name,
  );

  let mut state = HttpState::Connect;
  'http: loop {
    // Let smoltcp process incoming and outgoing packets.
    let timestamp = Instant::now();
    match iface.poll(&mut sockets, timestamp) {
      Ok(_) => {}
      Err(smoltcp::Error::Unrecognized) => {}
      Err(e) => {
        eprintln!("error: {:?}", e);
      }
    }

    {
      let mut socket = sockets.get::<TcpSocket>(tcp_handle);

      state = match state {
        HttpState::Connect if !socket.is_active() => {
          eprintln!("connecting");
          socket.connect((addr, 80), random_port())?;
          HttpState::Request
        }

        HttpState::Request if socket.may_send() => {
          eprintln!("sending request");
          socket.send_slice(http_header.as_ref())?;
          HttpState::Response
        }

        HttpState::Response if socket.can_recv() => {
          socket.recv(|raw_data| {
            let output = String::from_utf8_lossy(raw_data);
            println!("{}", output);
            (raw_data.len(), ())
          })?;
          HttpState::Response
        }

        HttpState::Response if !socket.may_recv() => {
          eprintln!("received complete response");
          break 'http;
        }
        // No transition is possible yet; stay in the current state.
        _ => state,
      }
    }

    // Block until the device is readable or the next poll is due.
    phy_wait(fd, iface.poll_delay(&sockets, timestamp))
      .expect("wait error");
  }

  Ok(())
}
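
The loop in listing 8.22 is a small finite state machine: each iteration matches on the current state, checks a guard, and returns the next state, falling through to the current state when no guard passes. The following sketch shows that shape without any networking. It makes some assumptions for illustration: the ready flag is a hypothetical stand-in for the socket checks (is_active, may_send, can_recv, may_recv), and the Done variant replaces the break used in the listing.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum HttpState {
    Connect,
    Request,
    Response,
    Done, // added for this sketch; listing 8.22 breaks out of the loop instead
}

/// Performs one transition. `ready` is a hypothetical stand-in for the
/// socket readiness checks that guard each arm in listing 8.22.
fn step(state: HttpState, ready: bool) -> HttpState {
    match state {
        HttpState::Connect if ready => HttpState::Request,
        HttpState::Request if ready => HttpState::Response,
        HttpState::Response if ready => HttpState::Done,
        _ => state, // guard failed: stay put and try again next iteration
    }
}

/// Drives the machine with a scripted sequence of readiness values.
fn run(readiness: &[bool]) -> HttpState {
    let mut state = HttpState::Connect;
    for &ready in readiness {
        state = step(state, ready);
        if state == HttpState::Done {
            break;
        }
    }
    state
}

fn main() {
    // Two iterations are "not ready", so the machine waits in place.
    let trace = [false, true, true, false, true];
    println!("{:?}", run(&trace));
}
```

Because the match is an expression, the compiler checks that every arm produces an HttpState. Forgetting to say what happens next in some arm becomes a compile-time error rather than a stalled connection.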

And finally, the following listing performs the DNS resolution. The source for this listing is in ch8/ch8-mget/src/dns.rs.

Listing 8.23 Creating DNS queries to translate domain names to IP addresses

use std::error::Error;
use std::net::{SocketAddr, UdpSocket};
use std::time::Duration;

use trust_dns::op::{Message, MessageType, OpCode, Query};
use trust_dns::proto::error::ProtoError;
use trust_dns::rr::domain::Name;
use trust_dns::rr::record_type::RecordType;
use trust_dns::serialize::binary::*;

fn message_id() -> u16 {
  let candidate = rand::random();
  if candidate == 0 {
    return message_id();
  }
  candidate
}

#[derive(Debug)]
pub enum DnsError {
  ParseDomainName(ProtoError),
  ParseDnsServerAddress(std::net::AddrParseError),
  Encoding(ProtoError),
  Decoding(ProtoError),
  Network(std::io::Error),
  Sending(std::io::Error),
  Receiving(std::io::Error),
}

impl std::fmt::Display for DnsError {
  fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
    write!(f, "{:#?}", self)
  }
}

impl std::error::Error for DnsError {}                 ①

pub fn resolve(
  dns_server_address: &str,
  domain_name: &str,
) -> Result<Option<std::net::IpAddr>, Box<dyn Error>> {
  let domain_name =
    Name::from_ascii(domain_name)
      .map_err(DnsError::ParseDomainName)?;

  let dns_server_address =
    format!("{}:53", dns_server_address);              ②
  let dns_server: SocketAddr = dns_server_address
    .parse()
    .map_err(DnsError::ParseDnsServerAddress)?;

  let mut request_buffer: Vec<u8> =                    ③
    Vec::with_capacity(64);                            ③
  let mut response_buffer: Vec<u8> =                   ④
    vec![0; 512];                                      ④

  let mut request = Message::new();
  request.add_query(                                   ⑤
    Query::query(domain_name, RecordType::A)           ⑤
  );                                                   ⑤

  request
    .set_id(message_id())
    .set_message_type(MessageType::Query)
    .set_op_code(OpCode::Query)
    .set_recursion_desired(true);                      ⑥

  let localhost =
    UdpSocket::bind("0.0.0.0:0").map_err(DnsError::Network)?;  ⑦

  let timeout = Duration::from_secs(5);
  localhost
    .set_read_timeout(Some(timeout))
    .map_err(DnsError::Network)?;

  localhost
    .set_nonblocking(false)
    .map_err(DnsError::Network)?;

  let mut encoder = BinEncoder::new(&mut request_buffer);
  request.emit(&mut encoder).map_err(DnsError::Encoding)?;

  let _n_bytes_sent = localhost
    .send_to(&request_buffer, dns_server)
    .map_err(DnsError::Sending)?;

  loop {                                               ⑧
    let (_n_bytes_recv, remote_addr) = localhost
      .recv_from(&mut response_buffer)
      .map_err(DnsError::Receiving)?;

    if remote_addr == dns_server {
      break;
    }
  }

  let response =
    Message::from_vec(&response_buffer)
      .map_err(DnsError::Decoding)?;

  for answer in response.answers() {
    if answer.record_type() == RecordType::A {
      let resource = answer.rdata();
      let server_ip =
        resource.to_ip_addr().expect("invalid IP address received");
      return Ok(Some(server_ip));
    }
  }

  Ok(None)
}

① Falls back to default methods

② Attempts to build the internal data structures using the raw text input

③ Because our DNS request will be small, we only need a little bit of space to hold it.

④ DNS over UDP uses a maximum packet size of 512 bytes.

⑤ DNS messages can hold multiple queries, but here we only use a single one.

⑥ Asks the DNS server to make requests on our behalf if it doesn’t know the answer

⑦ Binding to port 0 asks the OS to allocate a port on our behalf.

⑧ There is a small chance another UDP message will be received on our port from some unknown sender. To avoid that, we ignore packets from IP addresses that we don’t expect.
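
The error handling in listing 8.23 combines two techniques: map_err wraps each upstream error in a variant of our own enum, and the Box<dyn Error> return type lets the ? operator box that enum automatically. The following sketch isolates that pattern using only the standard library. The names AddressError and server_socket_addr are hypothetical and exist only for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::net::{AddrParseError, SocketAddr};

// A one-variant wrapper, in the style of DnsError from listing 8.23.
#[derive(Debug)]
enum AddressError {
    Parse(AddrParseError),
}

impl fmt::Display for AddressError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}

// The default methods of std::error::Error are enough here.
impl Error for AddressError {}

/// Appends the DNS port to `host` and parses the result.
/// `server_socket_addr` is a hypothetical helper, not part of mget.
fn server_socket_addr(host: &str) -> Result<SocketAddr, Box<dyn Error>> {
    let text = format!("{}:53", host);
    // map_err wraps the upstream error; `?` then boxes AddressError
    // into Box<dyn Error> because AddressError implements Error.
    let addr: SocketAddr = text.parse().map_err(AddressError::Parse)?;
    Ok(addr)
}

fn main() {
    match server_socket_addr("1.1.1.1") {
        Ok(addr) => println!("DNS server: {}", addr),
        Err(e) => eprintln!("error: {}", e),
    }
    if let Err(e) = server_socket_addr("not-an-ip") {
        eprintln!("error: {}", e);
    }
}
```

The caller only sees Box<dyn Error>, so new variants can be added to the wrapper enum without changing the function's signature. The price is that callers who want to react to a specific variant must downcast.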

mget is an ambitious project. It brings together all the threads from the chapter, is dozens of lines long, and yet is less capable than the reqwest::get(url) call we made in listing 8.2. Hopefully it’s revealed several interesting avenues for you to explore. Perhaps surprisingly, there are several more networking layers to unwrap. Well done for making your way through a lengthy and challenging chapter.

Summary

Networking is complicated. Standard models such as the OSI model are only partially accurate.
Trait objects allow for runtime polymorphism. Typically, programmers prefer generics because trait objects incur a small runtime cost. However, this situation is not always clear-cut. Using trait objects can reduce binary size because only a single version of each function needs to be compiled. Less machine code also makes better use of the CPU’s instruction cache.
Networking protocols are particular about which bytes are used. In general, you should prefer using &[u8] literals (b"...") over &str literals ("...") to ensure that you retain full control.
There are three main strategies for handling multiple upstream error types within a single scope:
- Create an internal wrapper type and implement From for each of the upstream types
- Change the return type to make use of a trait object that implements std::error::Error
- Use .unwrap() and its cousin .expect()
Finite state machines can be elegantly modeled in Rust with an enum and a loop. At each iteration, indicate the next state by returning the appropriate enum variant.
To enable two-way communications in UDP, each side of the conversation must be able to act as a client and a server.
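
The point about byte literals can be illustrated with a short sketch. It assumes a hypothetical helper, build_request, and a placeholder host name (example.com); neither is part of the mget project.

```rust
/// Builds a minimal HTTP/1.0 request as raw bytes.
/// `build_request` is a hypothetical helper, not part of mget.
fn build_request(path: &str, host: &str) -> Vec<u8> {
    let mut request = Vec::new();
    request.extend_from_slice(b"GET ");
    request.extend_from_slice(path.as_bytes());
    request.extend_from_slice(b" HTTP/1.0\r\nHost: ");
    request.extend_from_slice(host.as_bytes());
    request.extend_from_slice(b"\r\nConnection: close\r\n\r\n");
    request
}

fn main() {
    let request = build_request("/", "example.com");
    assert!(request.starts_with(b"GET / HTTP/1.0\r\n"));
    assert!(request.ends_with(b"\r\n\r\n"));

    // A byte literal can hold bytes that are not valid UTF-8,
    // which a &str literal cannot represent.
    let raw: &[u8] = b"\xff\xfe";
    assert!(std::str::from_utf8(raw).is_err());

    println!("{} bytes", request.len());
}
```

Working with Vec<u8> and b"..." keeps every byte that goes on the wire explicit, including the \r\n line endings that HTTP requires.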

^1. In old Rust code, you may see &Trait and Box<Trait>. This syntax is deprecated in the 2015 and 2018 editions and is an error from the 2021 edition onward, so always add the dyn keyword.

^2. Naming is hard.