There’s only one everything
They Might Be Giants, “One Everything” (2008)
In this chapter, you will write a Rust version of the uniq program (pronounced unique), which will find the distinct lines of text from either a file or STDIN.
Among its many uses, it is often employed to count how many times each unique string is found.
Along the way, you will learn how to do the following:
Write to a file or STDOUT
Use a closure to capture a variable
Apply the don’t repeat yourself (DRY) concept
Use the Write trait and the write! and writeln! macros
Use temporary files
Indicate the lifetime of a variable
As usual, I’ll start by explaining how uniq works so that you understand what is expected of your program.
Following is part of the manual page for the BSD version of uniq.
The challenge program in this chapter will only implement the reading of a file or STDIN, writing to a file or STDOUT, and counting the lines for the -c flag, but I include more of the documentation so that you can see the full scope of the program:
UNIQ(1)                   BSD General Commands Manual                  UNIQ(1)

NAME
     uniq -- report or filter out repeated lines in a file

SYNOPSIS
     uniq [-c | -d | -u] [-i] [-f num] [-s chars] [input_file [output_file]]

DESCRIPTION
     The uniq utility reads the specified input_file comparing adjacent lines,
     and writes a copy of each unique input line to the output_file. If
     input_file is a single dash ('-') or absent, the standard input is read.
     If output_file is absent, standard output is used for output. The second
     and succeeding copies of identical adjacent input lines are not written.
     Repeated lines in the input will not be detected if they are not adja-
     cent, so it may be necessary to sort the files first.

     The following options are available:

     -c      Precede each output line with the count of the number of times
             the line occurred in the input, followed by a single space.

     -d      Only output lines that are repeated in the input.

     -f num  Ignore the first num fields in each input line when doing compar-
             isons. A field is a string of non-blank characters separated
             from adjacent fields by blanks. Field numbers are one based,
             i.e., the first field is field one.

     -s chars
             Ignore the first chars characters in each input line when doing
             comparisons. If specified in conjunction with the -f option, the
             first chars characters after the first num fields will be
             ignored. Character numbers are one based, i.e., the first char-
             acter is character one.

     -u      Only output lines that are not repeated in the input.

     -i      Case insensitive comparison of lines.
In the 06_uniqr/tests/inputs directory of the book’s Git repository, you will find the following input files I’ll use for testing:
empty.txt: an empty file
one.txt: a file with one line of text
two.txt: a file with two lines of the same text
three.txt: a file with 13 lines of 4 unique values
skip.txt: a file with four lines of two unique values plus an empty line
The other files t[1–6].txt are examples from a Perl program used to test the GNU version. These are generated by the mk-outs.sh file:
$ cat mk-outs.sh
#!/usr/bin/env bash

ROOT="tests/inputs"
OUT_DIR="tests/expected"

[[ ! -d "$OUT_DIR" ]] && mkdir -p "$OUT_DIR"

# Cf https://github.com/coreutils/coreutils/blob/master/tests/misc/uniq.pl
echo -ne "a\na\n" > $ROOT/t1.txt
echo -ne "a\na" > $ROOT/t2.txt
echo -ne "a\nb" > $ROOT/t3.txt
echo -ne "a\na\nb" > $ROOT/t4.txt
echo -ne "b\na\na\n" > $ROOT/t5.txt
echo -ne "a\nb\nc\n" > $ROOT/t6.txt

for FILE in $ROOT/*.txt; do
    BASENAME=$(basename "$FILE")
    uniq $FILE > ${OUT_DIR}/${BASENAME}.out
    uniq -c $FILE > ${OUT_DIR}/${BASENAME}.c.out
    uniq < $FILE > ${OUT_DIR}/${BASENAME}.stdin.out
    uniq -c < $FILE > ${OUT_DIR}/${BASENAME}.stdin.c.out
done

Two lines each ending with a newline

No trailing newline on last line

Two different lines, no trailing newline

Two lines the same; last is different with no trailing newline

Two different values with newlines on each

Three different values with newlines on each
To demonstrate uniq, note that it will print nothing when given an empty file:
$ uniq tests/inputs/empty.txt
Given a file with just one line, the one line will be printed:
$ uniq tests/inputs/one.txt
a
It will also print the number of times a line occurs before the line when run with the -c option.
The count is right-justified in a field four characters wide and is followed by a single space and then the line of text:
$ uniq -c tests/inputs/one.txt
   1 a
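The layout described here maps onto Rust's format width syntax. The following is a minimal sketch, not part of the challenge code; the helper name format_count is hypothetical and exists only to show what `{:>4}` does:

```rust
/// Hypothetical helper: format a count and a line the way `uniq -c` lays them out.
fn format_count(count: u64, line: &str) -> String {
    // `{:>4}` right-justifies the number in a field at least four characters wide.
    format!("{:>4} {}", count, line)
}

fn main() {
    println!("{}", format_count(1, "a"));
    // A count wider than four digits simply pushes the field wider.
    println!("{}", format_count(12345, "b"));
}
```

The width is a minimum, so large counts are never truncated.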
The file tests/inputs/two.txt contains two duplicate lines:
$ cat tests/inputs/two.txt
a
a
Given this input, uniq will emit one line:
$ uniq tests/inputs/two.txt
a
With the -c option, uniq will also include the count of unique lines:
$ uniq -c tests/inputs/two.txt
   2 a
A longer input file shows that uniq only considers the lines in order and not globally.
For example, the value a appears four times in this input file:
$ cat tests/inputs/three.txt
a
a
b
b
a
c
c
c
a
d
d
d
d
When counting, uniq starts over at 1 each time it sees a new string.
Since a occurs in three different places in the input file, it will also appear three times in the output:
$ uniq -c tests/inputs/three.txt
   2 a
   2 b
   1 a
   3 c
   1 a
   4 d
If you want the actual unique values, you must first sort the input, which can be done with the aptly named sort command.
In the following output, you’ll finally see that a occurs a total of four times in the input file:
$ sort tests/inputs/three.txt | uniq -c
   4 a
   2 b
   3 c
   4 d
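To make the difference between adjacent and global counting concrete, here is a small in-memory sketch. The function count_runs is a hypothetical name, and it works on a slice of strings rather than a file, so it only illustrates the counting rule and is not the streaming solution developed later in the chapter:

```rust
/// Hypothetical helper: collapse adjacent equal lines into (count, line) pairs.
fn count_runs(lines: &[&str]) -> Vec<(usize, String)> {
    let mut runs: Vec<(usize, String)> = Vec::new();
    for line in lines {
        match runs.last_mut() {
            // Same text as the previous line: extend the current run.
            Some(last) if last.1 == *line => last.0 += 1,
            // A different line (or the very first one): start a new run at 1.
            _ => runs.push((1, line.to_string())),
        }
    }
    runs
}

fn main() {
    // The contents of tests/inputs/three.txt as shown above.
    let input = [
        "a", "a", "b", "b", "a", "c", "c", "c", "a", "d", "d", "d", "d",
    ];

    // Unsorted: "a" shows up in three separate runs.
    for (count, line) in count_runs(&input) {
        println!("{:>4} {}", count, line);
    }

    // Sorted first: every value forms a single run.
    let mut sorted = input.to_vec();
    sorted.sort();
    for (count, line) in count_runs(&sorted) {
        println!("{:>4} {}", count, line);
    }
}
```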
The file tests/inputs/skip.txt contains a blank line:
$ cat tests/inputs/skip.txt
a

a
b
The blank line acts just like any other value, and so it will reset the counter:
$ uniq -c tests/inputs/skip.txt
   1 a
   1 
   1 a
   1 b
If you study the Synopsis of the usage closely, you’ll see a very subtle indication of how to write the output to a file.
Notice how input_file and output_file in the following are grouped inside square brackets to indicate that they are optional as a pair.
That is, if you provide input_file, you may also optionally provide output_file:
uniq [-c | -d | -u] [-i] [-f num] [-s chars] [input_file [output_file]]
For example, I can count tests/inputs/two.txt and place the output into out:
$ uniq -c tests/inputs/two.txt out
$ cat out
   2 a
With no positional arguments, uniq will read from STDIN by default:
$ cat tests/inputs/two.txt | uniq -c
   2 a
If you want to read from STDIN and indicate the output filename, you must use a dash (-) for the input filename:
$ cat tests/inputs/two.txt | uniq -c - out
$ cat out
   2 a
The GNU version works basically the same while also providing many more options:
$ uniq --help
Usage: uniq [OPTION]... [INPUT [OUTPUT]]
Filter adjacent matching lines from INPUT (or standard input),
writing to OUTPUT (or standard output).

With no options, matching lines are merged to the first occurrence.

Mandatory arguments to long options are mandatory for short options too.
  -c, --count           prefix lines by the number of occurrences
  -d, --repeated        only print duplicate lines, one for each group
  -D, --all-repeated[=METHOD]  print all duplicate lines
                          groups can be delimited with an empty line
                          METHOD={none(default),prepend,separate}
  -f, --skip-fields=N   avoid comparing the first N fields
      --group[=METHOD]  show all items, separating groups with an empty line
                          METHOD={separate(default),prepend,append,both}
  -i, --ignore-case     ignore differences in case when comparing
  -s, --skip-chars=N    avoid comparing the first N characters
  -u, --unique          only print unique lines
  -z, --zero-terminated     end lines with 0 byte, not newline
  -w, --check-chars=N   compare no more than N characters in lines
      --help     display this help and exit
      --version  output version information and exit

A field is a run of blanks (usually spaces and/or TABs), then nonblank
characters. Fields are skipped before chars.

Note: 'uniq' does not detect repeated lines unless they are adjacent.
You may want to sort the input first, or use 'sort -u' without 'uniq'.
Also, comparisons honor the rules specified by 'LC_COLLATE'.
As you can see, both the BSD and GNU versions have many more options, but the challenge program is only expected to read a file or STDIN, write to a file or STDOUT, and count lines for the -c flag.
This chapter’s challenge program should be called uniqr (pronounced you-neek-er) for a Rust version of uniq.
Start by running cargo new uniqr, then modify your Cargo.toml to add the following dependencies:
[dependencies]
clap = "2.33"

[dev-dependencies]
assert_cmd = "2"
predicates = "2"
tempfile = "3"
rand = "0.8"

The tests will create temporary files using the tempfile crate.
Copy the book’s 06_uniqr/tests directory into your project, and then run cargo test to ensure that the program compiles and the tests run and fail.
Update your src/main.rs to the following:
fn main() {
    if let Err(e) = uniqr::get_args().and_then(uniqr::run) {
        eprintln!("{}", e);
        std::process::exit(1);
    }
}
I suggest you start src/lib.rs with the following:
use clap::{App, Arg};
use std::error::Error;

type MyResult<T> = Result<T, Box<dyn Error>>;

#[derive(Debug)]
pub struct Config {
    in_file: String,
    out_file: Option<String>,
    count: bool,
}

This is the input filename to read, which may be STDIN if the filename is a dash.

The output will be written either to an optional output filename or STDOUT.

count is a Boolean for whether or not to print the counts of each line.
Here is an outline for get_args:
pub fn get_args() -> MyResult<Config> {
    let matches = App::new("uniqr")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust uniq")
        // What goes here?
        .get_matches();

    Ok(Config {
        in_file: ...
        out_file: ...
        count: ...
    })
}
I suggest you start your run by printing the config:
pub fn run(config: Config) -> MyResult<()> {
    println!("{:?}", config);
    Ok(())
}
Your program should be able to produce the following usage:
$ cargo run -- -h
uniqr 0.1.0
Ken Youens-Clark <kyclark@gmail.com>
Rust uniq

USAGE:
    uniqr [FLAGS] [ARGS]

FLAGS:
    -c, --count      Show counts
    -h, --help       Prints help information
    -V, --version    Prints version information

ARGS:
    <IN_FILE>     Input file [default: -]
    <OUT_FILE>    Output file

The -c|--count flag is optional.

The input file is the first positional argument and defaults to a dash (-).

The output file is the second positional argument and is optional.
By default the program will read from STDIN, which can be represented using a dash:
$ cargo run
Config { in_file: "-", out_file: None, count: false }
The first positional argument should be interpreted as the input file and the second positional argument as the output file.1
Note that clap can handle options either before or after positional arguments:
$ cargo run -- tests/inputs/one.txt out --count
Config { in_file: "tests/inputs/one.txt", out_file: Some("out"), count: true }
Take a moment to finish get_args before reading further.
I assume you are an upright and moral person who figured out the preceding function on your own, so I will now share my solution:
pub fn get_args() -> MyResult<Config> {
    let matches = App::new("uniqr")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust uniq")
        .arg(
            Arg::with_name("in_file")
                .value_name("IN_FILE")
                .help("Input file")
                .default_value("-"),
        )
        .arg(
            Arg::with_name("out_file")
                .value_name("OUT_FILE")
                .help("Output file"),
        )
        .arg(
            Arg::with_name("count")
                .short("c")
                .help("Show counts")
                .long("count")
                .takes_value(false),
        )
        .get_matches();

    Ok(Config {
        in_file: matches.value_of_lossy("in_file").unwrap().to_string(),
        out_file: matches.value_of("out_file").map(String::from),
        count: matches.is_present("count"),
    })
}

Convert the in_file argument to a String.

Convert the out_file argument to an Option<String>.

The count is either present or not, so convert this to a bool.
Because the in_file argument has a default value, it is safe to call Option::unwrap and convert the value to a String.
There are several other ways to get the same result, none of which is necessarily superior.
You could use Option::map to feed the value to String::from and then unwrap it:
in_file: matches.value_of_lossy("in_file").map(String::from).unwrap(),
You could also use a closure that calls Into::into to convert the value into a String because Rust can infer the type:
in_file: matches.value_of_lossy("in_file").map(|v| v.into()).unwrap(),
The preceding can also be expressed using the Into::into function directly because functions are first-class values that can be passed as arguments:
in_file: matches.value_of_lossy("in_file").map(Into::into).unwrap(),
The out_file is optional, but if there is an option, you can use Option::map to convert a Some value to a String:
out_file: matches.value_of("out_file").map(|v| v.to_string()),
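If you would like to convince yourself that these spellings are interchangeable, the following standalone sketch compares them without clap. The function names are hypothetical, and a plain Option<&str> stands in for the value clap would return for an optional argument:

```rust
// Three hypothetical helpers, one per spelling. Each turns Option<&str>
// into Option<String>; a `None` input stays `None`.
fn via_from(raw: Option<&str>) -> Option<String> {
    raw.map(String::from)
}

fn via_closure(raw: Option<&str>) -> Option<String> {
    raw.map(|v| v.to_string())
}

fn via_into(raw: Option<&str>) -> Option<String> {
    // The return type tells Rust which `Into` implementation to pick.
    raw.map(Into::into)
}

fn main() {
    let raw = Some("out.txt");
    println!("{:?}", via_from(raw));
    println!("{:?}", via_closure(raw));
    println!("{:?}", via_into(raw));
    println!("{:?}", via_from(None));
}
```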
The test suite in tests/cli.rs is fairly large, containing 78 tests that check the program under the following conditions:
Input file as the only positional argument, check STDOUT
Input file as a positional argument with --count option, check STDOUT
Input from STDIN with no positional arguments, check STDOUT
Input from STDIN with --count and no positional arguments, check STDOUT
Input and output files as positional arguments, check output file
Input and output files as positional arguments with --count, check output file
Input from STDIN and output files as positional arguments with --count, check output file
Given how large and complicated the tests became, you may be interested to see how I structured tests/cli.rs, which starts with the following:
use assert_cmd::Command;
use predicates::prelude::*;
use rand::{distributions::Alphanumeric, Rng};
use std::fs;
use tempfile::NamedTempFile;

type TestResult = Result<(), Box<dyn std::error::Error>>;

struct Test {
    input: &'static str,
    out: &'static str,
    out_count: &'static str,
}

This is used to create temporary output files.

A struct to define the input files and expected output values with and without the counts.
Note the use of 'static to denote the lifetime of the values.
I want to define structs with &str values, and the Rust compiler would like to know exactly how long the values are expected to stick around relative to one another.
The 'static annotation shows that this data will live for the entire lifetime of the program.
If you remove it and run the tests, you’ll see similar errors from the compiler, as shown in the previous section, along with a suggestion of how to fix it:
error[E0106]: missing lifetime specifier
 --> tests/cli.rs:8:12
  |
8 |     input: &str,
  |            ^ expected named lifetime parameter
  |
help: consider introducing a named lifetime parameter
  |
7 | struct Test<'a> {
8 |     input: &'a str,
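To see the two annotations side by side, here is a small sketch outside the test suite. The struct names StaticTest and BorrowedTest and the describe helper are hypothetical; the point is only that 'static fits string literals, while a named lifetime lets a struct borrow strings that are built at runtime:

```rust
// Every field points at a string literal, which lives for the whole program,
// so `'static` is an accurate annotation here.
struct StaticTest {
    input: &'static str,
    out: &'static str,
}

// With a named lifetime `'a`, the struct can borrow shorter-lived strings,
// but it cannot outlive the data it points to.
struct BorrowedTest<'a> {
    input: &'a str,
    out: &'a str,
}

const EMPTY: StaticTest = StaticTest {
    input: "tests/inputs/empty.txt",
    out: "tests/inputs/empty.txt.out",
};

/// Hypothetical helper: describe a test built from borrowed strings.
fn describe(test: &BorrowedTest<'_>) -> String {
    format!("{} -> {}", test.input, test.out)
}

fn main() {
    // These Strings are created at runtime, so they are not `'static`.
    let input = String::from("tests/inputs/one.txt");
    let out = format!("{}.out", input);
    let test = BorrowedTest { input: &input, out: &out };
    println!("{}", describe(&test));
    println!("{} -> {}", EMPTY.input, EMPTY.out);
}
```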
Next, I define some constant values I need for testing:
const PRG: &str = "uniqr";

const EMPTY: Test = Test {
    input: "tests/inputs/empty.txt",
    out: "tests/inputs/empty.txt.out",
    out_count: "tests/inputs/empty.txt.c.out",
};

The name of the program being tested

The location of the input file for this test

The location of the output file without the counts

The location of the output file with the counts
After the declaration of EMPTY, there are many more Test structures followed by several helper functions.
The run function will use Test.input as an input file and will compare STDOUT to the contents of the Test.out file:
fn run(test: &Test) -> TestResult {
    let expected = fs::read_to_string(test.out)?;
    Command::cargo_bin(PRG)?
        .arg(test.input)
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}

The function accepts a Test and returns a TestResult.

Try to read the expected output file.

Try to run the program with the input file as an argument, verify it ran successfully, and compare STDOUT to the expected value.
The run_count helper function works very similarly, but this time it tests for the counting:
fn run_count(test: &Test) -> TestResult {
    let expected = fs::read_to_string(test.out_count)?;
    Command::cargo_bin(PRG)?
        .args(&[test.input, "-c"])
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}

Read the Test.out_count file for the expected output.

Pass both the Test.input value and the flag -c to count the lines.
The run_stdin function will supply the input to the program through STDIN:
fn run_stdin(test: &Test) -> TestResult {
    let input = fs::read_to_string(test.input)?;
    let expected = fs::read_to_string(test.out)?;
    Command::cargo_bin(PRG)?
        .write_stdin(input)
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}

Try to read the Test.input file.

Try to read the Test.out file.

Pass the input through STDIN and verify that STDOUT is the expected value.
The run_stdin_count function tests both reading from STDIN and counting the lines:
fn run_stdin_count(test: &Test) -> TestResult {
    let input = fs::read_to_string(test.input)?;
    let expected = fs::read_to_string(test.out_count)?;
    Command::cargo_bin(PRG)?
        .arg("--count")
        .write_stdin(input)
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}

Run the program with the long --count flag, feed the input to STDIN, and verify that STDOUT is correct.
The run_outfile function checks that the program accepts both the input and output files as positional arguments.
This is somewhat more interesting as I needed to use temporary files in the testing because, as you have seen repeatedly, Rust will run the tests in parallel.
If I were to use the same dummy filename like blargh to write all the output files, the tests would overwrite one another’s output.
To get around this, I use the tempfile::NamedTempFile to get a dynamically generated temporary filename that will automatically be removed when I finish:
fn run_outfile(test: &Test) -> TestResult {
    let expected = fs::read_to_string(test.out)?;
    let outfile = NamedTempFile::new()?;
    let outpath = &outfile.path().to_str().unwrap();

    Command::cargo_bin(PRG)?
        .args(&[test.input, outpath])
        .assert()
        .success()
        .stdout("");
    let contents = fs::read_to_string(&outpath)?;
    assert_eq!(&expected, &contents);

    Ok(())
}

Try to get a named temporary file.

Get the path to the file.

Run the program with the input and output filenames as arguments, then verify there is nothing in STDOUT.

Try to read the output file.

Check that the contents of the output file match the expected value.
The next two functions are variations on what I’ve already shown, adding in the --count flag and finally asking the program to read from STDIN when the input filename is a dash.
The rest of the module calls these helpers using the various structs to run all the tests.
I would suggest you start in src/lib.rs by reading the input file, so it makes sense to use the open function from previous chapters:
fn open(filename: &str) -> MyResult<Box<dyn BufRead>> {
    match filename {
        "-" => Ok(Box::new(BufReader::new(io::stdin()))),
        _ => Ok(Box::new(BufReader::new(File::open(filename)?))),
    }
}
Be sure you expand your imports to include the following:
use clap::{App, Arg};
use std::{
    error::Error,
    fs::File,
    io::{self, BufRead, BufReader},
};
You can borrow quite a bit of code from Chapter 3 that reads lines of text from an input file or STDIN while preserving the line endings:
pub fn run(config: Config) -> MyResult<()> {
    let mut file = open(&config.in_file)
        .map_err(|e| format!("{}: {}", config.in_file, e))?;
    let mut line = String::new();
    loop {
        let bytes = file.read_line(&mut line)?;
        if bytes == 0 {
            break;
        }
        print!("{}", line);
        line.clear();
    }
    Ok(())
}

Either read STDIN if the input file is a dash or open the given filename. Create an informative error message when this fails.

Create a new, empty mutable String buffer to hold each line.

Create an infinite loop.

Read a line of text while preserving the line endings.

If no bytes were read, break out of the loop.

Print the line buffer.

Clear the line buffer.
Run your program with an input file to ensure it works:
$ cargo run -- tests/inputs/one.txt
a
It should also work for reading STDIN:
$ cargo run -- - < tests/inputs/one.txt
a
Next, make your program iterate the lines of input and count each unique run of lines, then print the lines with and without the counts.
Once you are able to create the correct output, you will need to handle printing it either to STDOUT or a given filename.
I suggest that you copy ideas from the open function and use File::create.
I’ll step you through how I arrived at a solution. Your version may be different, but it’s fine as long as it passes the test suite. I decided to create two additional mutable variables to hold the previous line of text and the running count. For now, I will always print the count to make sure it’s working correctly:
pub fn run(config: Config) -> MyResult<()> {
    let mut file = open(&config.in_file)
        .map_err(|e| format!("{}: {}", config.in_file, e))?;
    let mut line = String::new();
    let mut previous = String::new();
    let mut count: u64 = 0;

    loop {
        let bytes = file.read_line(&mut line)?;
        if bytes == 0 {
            break;
        }

        if line.trim_end() != previous.trim_end() {
            if count > 0 {
                print!("{:>4} {}", count, previous);
            }
            previous = line.clone();
            count = 0;
        }

        count += 1;
        line.clear();
    }

    if count > 0 {
        print!("{:>4} {}", count, previous);
    }

    Ok(())
}

Create a mutable variable to hold the previous line of text.

Create a mutable variable to hold the count.

Compare the current line to the previous line, both trimmed of any possible trailing whitespace.

Print the output only when count is greater than 0.

Print the count right-justified in a column four characters wide followed by a space and the previous value.

Set the previous variable to a copy of the current line.

Reset the counter to 0.

Increment the counter by 1.

Handle the last line of the file.
I didn’t have to indicate the type u64 for the count variable. Rust will happily infer a type. When nothing else constrains an integer, Rust defaults to i32 regardless of the platform, which would limit the maximum number of duplicates to i32::MAX, or 2,147,483,647. That’s a big number that’s likely to be adequate, but I think it’s better to state the intent explicitly by specifying u64.
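As a quick check of what the compiler picks for an integer when nothing else constrains it, the following sketch measures the size of an unannotated value. The function names are hypothetical and are not part of the challenge program:

```rust
use std::mem::size_of_val;

/// Hypothetical helper: size in bytes of an integer with no type annotation.
fn size_of_default_int() -> usize {
    // No annotation and no other constraint, so Rust falls back to i32.
    let n = 0;
    size_of_val(&n)
}

/// Hypothetical helper: size in bytes of an explicitly annotated u64.
fn size_of_explicit_u64() -> usize {
    let n: u64 = 0;
    size_of_val(&n)
}

fn main() {
    println!("default integer: {} bytes", size_of_default_int());
    println!("explicit u64:    {} bytes", size_of_explicit_u64());
    println!("i32::MAX = {}", i32::MAX);
    println!("u64::MAX = {}", u64::MAX);
}
```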
If I run cargo test, this will pass a fair number of tests.
This code is clunky, though.
I don’t like having to check if count > 0 twice, as it violates the don’t repeat yourself (DRY) principle, where you isolate a common idea into a single abstraction like a function rather than copying and pasting the same lines of code throughout a program.
Also, my code always prints the count, but it should print the count only when config.count is true.
I can put all of this logic into a function, and I will specifically use a closure to close around the config.count value:
let print = |count: u64, text: &str| {
    if count > 0 {
        if config.count {
            print!("{:>4} {}", count, text);
        } else {
            print!("{}", text);
        }
    };
};

The print closure will accept count and text values.

Print only if count is greater than 0.

Check if the config.count value is true.

Use the print! macro to print the count and text to STDOUT.

Otherwise, print the text to STDOUT.
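If closures that capture variables are new to you, the following standalone sketch isolates the idea. The function render_all and its show_count parameter are hypothetical stand-ins for run and config.count, and the closure returns a String instead of printing so that the result is easy to inspect:

```rust
/// Hypothetical helper: format (count, text) pairs, with or without counts.
fn render_all(show_count: bool, runs: &[(u64, &str)]) -> String {
    // The closure takes `count` and `text` as arguments, but `show_count`
    // is captured from the enclosing scope rather than passed in.
    let render = |count: u64, text: &str| -> String {
        if count == 0 {
            String::new()
        } else if show_count {
            format!("{:>4} {}", count, text)
        } else {
            text.to_string()
        }
    };

    let mut output = String::new();
    for &(count, text) in runs {
        output.push_str(&render(count, text));
    }
    output
}

fn main() {
    let runs = [(2, "a\n"), (1, "b\n")];
    print!("{}", render_all(true, &runs));
    print!("{}", render_all(false, &runs));
}
```

A plain function could not see show_count without receiving it as a parameter, which is the reason a closure is the convenient choice here.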
I can update the rest of the function to use this closure:
loop {
    let bytes = file.read_line(&mut line)?;
    if bytes == 0 {
        break;
    }

    if line.trim_end() != previous.trim_end() {
        print(count, &previous);
        previous = line.clone();
        count = 0;
    }

    count += 1;
    line.clear();
}

print(count, &previous);
At this point, the program will pass several more tests.
All the failed test names have the string outfile because the program fails to write a named output file.
To add this last feature, you can open the output file in the same way as the input file, either by creating a named output file using File::create or by using std::io::stdout.
Be sure to add use std::io::Write for the following code, which you can place just after the file variable:
let mut out_file: Box<dyn Write> = match &config.out_file {
    Some(out_name) => Box::new(File::create(out_name)?),
    _ => Box::new(io::stdout()),
};

The mutable out_file will be a boxed value that implements the std::io::Write trait.

When config.out_file is Some filename, use File::create to try to create the file.

Otherwise, use std::io::stdout.
If you look at the documentation for File::create and io::stdout, you’ll see both have a “Traits” section showing the various traits they implement.
Both show that they implement Write, so they satisfy the type requirement Box<dyn Write>, which says that the value inside the Box must implement this trait.
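Because the only requirement is the Write trait, anything that implements it can stand in for the output. The sketch below uses only the standard library; the helper names open_out and emit are hypothetical, and the in-memory Vec<u8> is just a convenient third writer for experimenting:

```rust
use std::fs::File;
use std::io::{self, Write};

/// Hypothetical helper: choose a writer from an optional filename.
fn open_out(path: Option<&str>) -> io::Result<Box<dyn Write>> {
    match path {
        // A named file, created or truncated by File::create.
        Some(name) => Ok(Box::new(File::create(name)?)),
        // No name given: fall back to STDOUT.
        None => Ok(Box::new(io::stdout())),
    }
}

/// Hypothetical helper: write one counted line to any writer.
fn emit(out: &mut dyn Write, count: u64, text: &str) -> io::Result<()> {
    write!(out, "{:>4} {}", count, text)
}

fn main() -> io::Result<()> {
    // Write to STDOUT through the boxed trait object.
    let mut out = open_out(None)?;
    emit(&mut *out, 2, "a\n")?;

    // A Vec<u8> also implements Write, which is handy for inspection.
    let mut buffer: Vec<u8> = Vec::new();
    emit(&mut buffer, 3, "c\n")?;
    print!("{}", String::from_utf8_lossy(&buffer));
    Ok(())
}
```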
The second change I need to make is to use out_file for the output.
I will replace the print! macro with write! to write the output to a stream like a filehandle or STDOUT.
The first argument to write! must be a mutable value that implements the Write trait.
The documentation shows that write! will return a std::io::Result because it might fail.
As such, I changed my print closure to return MyResult.
Here is the final version of my run function that passes all the tests:
pub fn run(config: Config) -> MyResult<()> {
    let mut file = open(&config.in_file)
        .map_err(|e| format!("{}: {}", config.in_file, e))?;

    let mut out_file: Box<dyn Write> = match &config.out_file {
        Some(out_name) => Box::new(File::create(out_name)?),
        _ => Box::new(io::stdout()),
    };

    let mut print = |count: u64, text: &str| -> MyResult<()> {
        if count > 0 {
            if config.count {
                write!(out_file, "{:>4} {}", count, text)?;
            } else {
                write!(out_file, "{}", text)?;
            }
        };
        Ok(())
    };

    let mut line = String::new();
    let mut previous = String::new();
    let mut count: u64 = 0;
    loop {
        let bytes = file.read_line(&mut line)?;
        if bytes == 0 {
            break;
        }

        if line.trim_end() != previous.trim_end() {
            print(count, &previous)?;
            previous = line.clone();
            count = 0;
        }

        count += 1;
        line.clear();
    }
    print(count, &previous)?;

    Ok(())
}

Open either STDIN or the given input filename.

Open either STDOUT or the given output filename.

Create a mutable print closure to format the output.

Use the print closure to possibly print output. Use ? to propagate potential errors.

Handle the last line of the file.
Note that the print closure must be declared with the mut keyword because it mutably borrows the out_file filehandle, and calling a closure that mutates captured state requires a mutable binding.
Without this, the compiler will show the following error:
error[E0596]: cannot borrow `print` as mutable, as it is not declared as mutable
  --> src/lib.rs:84:13
   |
63 |     let print = |count: u64, text: &str| -> MyResult<()> {
   |         ----- help: consider changing this to be mutable: `mut print`
...
66 |                 write!(out_file, "{:>4} {}", count, text)?;
   |                        -------- calling `print` requires mutable binding
   |                                 due to mutable borrow of `out_file`
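The same rule can be reproduced in a few lines without any file handling. In this sketch, the function collect_nonzero and the closure record are hypothetical; the closure pushes onto a captured vector, which is a mutable borrow just like the write to out_file:

```rust
/// Hypothetical helper: keep a formatted entry for every nonzero count.
fn collect_nonzero(pairs: &[(u64, &str)]) -> Vec<String> {
    let mut seen: Vec<String> = Vec::new();

    // The closure mutates the captured `seen`, so the binding must be
    // declared `mut`. Removing `mut` produces error E0596.
    let mut record = |count: u64, text: &str| {
        if count > 0 {
            seen.push(format!("{:>4} {}", count, text));
        }
    };

    for &(count, text) in pairs {
        record(count, text);
    }

    // The closure is no longer used, so `seen` can be returned.
    seen
}

fn main() {
    for entry in collect_nonzero(&[(0, "skipped"), (2, "a"), (1, "b")]) {
        println!("{}", entry);
    }
}
```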
Again, it’s okay if your solution is different from mine, as long as it passes the tests. Part of what I like about writing with tests is that there is an objective determination of when a program meets some level of specifications. As Louis Srygley once said, “Without requirements or design, programming is the art of adding bugs to an empty text file.”2 I would say that tests are the requirements made incarnate. Without tests, you simply have no way to know when a change to your program strays from the requirements or breaks the design.
Can you find other ways to write this algorithm?
For instance, I tried another method that read all the lines of the input file into a vector and used Vec::windows to look at pairs of lines.
This was interesting but could fail if the size of the input file exceeded the available memory on my machine.
The solution presented here will only ever allocate memory for the current and previous lines and so should scale to any size file.
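The windows-based attempt is not shown in this chapter, so the following is only one way it might look, under the assumption that all lines are already in memory. The function name uniq_windows is hypothetical:

```rust
/// Hypothetical helper: count adjacent runs using overlapping pairs of lines.
fn uniq_windows(lines: &[&str]) -> Vec<(usize, String)> {
    let mut runs = Vec::new();
    if lines.is_empty() {
        return runs;
    }

    let mut count: usize = 1;
    // windows(2) yields every pair of neighbors: [0, 1], [1, 2], and so on.
    for pair in lines.windows(2) {
        if pair[0] == pair[1] {
            count += 1;
        } else {
            // The run ends at pair[0]; a new run starts at pair[1].
            runs.push((count, pair[0].to_string()));
            count = 1;
        }
    }

    // The final run is never closed by a differing neighbor.
    runs.push((count, lines[lines.len() - 1].to_string()));
    runs
}

fn main() {
    for (count, line) in uniq_windows(&["a", "a", "b", "a"]) {
        println!("{:>4} {}", count, line);
    }
}
```

The whole input has to be held in the slice, which is exactly the memory cost described above.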
As usual, the BSD and GNU versions of uniq both have many more features than I chose to include in the challenge.
I would encourage you to add all the features you would like to have in your version.
Be sure to add tests for each feature, and always run the entire test suite to verify that all previous features still work.
In my mind, uniq is closely tied with sort, as I often use them together.
Consider implementing your own version of sort, at least to the point of sorting values lexicographically (in dictionary order) or numerically.
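As a possible starting point for that exercise, the two orderings differ only in the key used for comparison. This is a rough sketch with hypothetical function names; treating lines that are not integers as the smallest possible value is an assumption of mine, not the behavior of the real sort:

```rust
/// Hypothetical helper: sort lines in dictionary (byte-wise) order.
fn sort_lexical(lines: &mut [&str]) {
    lines.sort();
}

/// Hypothetical helper: sort lines by their integer value.
/// Assumption: lines that do not parse as integers sort first.
fn sort_numeric(lines: &mut [&str]) {
    lines.sort_by_key(|line| line.trim().parse::<i64>().unwrap_or(i64::MIN));
}

fn main() {
    let mut lines = ["10", "9", "2"];

    // Dictionary order compares character by character, so "10" comes first.
    sort_lexical(&mut lines);
    println!("{:?}", lines);

    // Numeric order compares the parsed values instead.
    sort_numeric(&mut lines);
    println!("{:?}", lines);
}
```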
In about 100 lines of Rust, the uniqr program manages to replicate a reasonable subset of features from the original uniq program.
Compare this to the GNU C source code, which has more than 600 lines of code.
I would feel much more confident extending uniqr than I would using C due to the Rust compiler’s use of types and useful error messages.
Let’s review some of the things you learned in this chapter:
You can now open a new file for writing or print to STDOUT.
DRY says that any duplicated code should be moved into a single abstraction like a function or a closure.
A closure must be used to capture values from the enclosing scope.
When a value implements the Write trait, it can be used with the write! and writeln! macros.
The tempfile crate helps you create and remove temporary files.
The Rust compiler may sometimes require you to indicate the lifetime of a variable, which is how long it lives in relation to other variables.
In the next chapter, I’ll introduce Rust’s enumerated enum type and how to use regular expressions.
1 While the goal is to mimic the original versions as much as possible, I would note that I do not like optional positional parameters. In my opinion, it would be better to have an -o|--output option that defaults to STDOUT and have only one optional positional argument for the input file that defaults to STDIN.
2 Programming Wisdom (@CodeWisdom), “‘Without requirements or design, programming is the art of adding bugs to an empty text file.’ - Louis Srygley,” Twitter, January 24, 2018, 1:00 p.m., https://oreil.ly/FC6aS.