Chapter 5. Controlling OTP Behaviors

We have in the previous chapters covered the highlights of the gen_server behavior. You should by now have implemented your first client-server application and started to build an idea of how OTP behaviors help you to reduce your code base by allowing you to focus on the specifics of what your system has to do. This chapter digs deeper into behaviors, exploring some of the advanced topics intermixed with built-in functionality. While we are focusing on generic servers, most of what we write will apply to many of the other behaviors, including those you could implement yourself. Read with care, as we reference this chapter often in the remainder of this book.

The sys Module

We’ve mentioned many times the built-in functionality you get as a result of using OTP behaviors and the ease with which you can add your own features. Most of what we cover is accessed through the sys module, allowing you to generate trace events, inspect and manipulate behavior state, as well as send and receive system messages. All of this functionality works on the standard OTP behaviors, but also, as we show in Chapter 10, you can reuse it when defining your own behaviors.

Tracing and Logging

Let’s find out how built-in tracing works by running a little example. Start your frequency server in the shell and, using the sys module, try the following:

1> frequency:start().
{ok,<0.35.0>}
2> sys:trace(frequency, true).
ok
3> frequency:allocate().
*DBG* frequency got call {allocate,<0.33.0>} from <0.33.0>
*DBG* frequency sent {ok,10} to <0.33.0>,
      new state {[11,12,13,14,15],[{10,<0.33.0>}]}
{ok,10}
4> frequency:deallocate(10).
*DBG* frequency got cast {deallocate,10}
ok
*DBG* frequency new state {[10,11,12,13,14,15],[]}
5> sys:trace(frequency, false).
ok

By turning on the trace flags for our frequency allocator, we are able to generate printouts of system events, including messages and state changes. Our example pipes the messages out to the shell. If we instead use the sys:log/2 call, we store them in the server loop. They can be displayed using the print flag or can be retrieved as an Erlang data structure through the get flag:

6> sys:log(frequency, true).
ok
7> {ok, Freq} = frequency:allocate(), frequency:deallocate(Freq).
ok
8> sys:log(frequency, print).
*DBG* frequency got call {allocate,<0.33.0>} from <0.33.0>
*DBG* frequency sent {ok,10} to <0.33.0>, 
      new state {[11,12,13,14,15],[{10,<0.33.0>}]}
*DBG* frequency got cast {deallocate,10}
*DBG* frequency new state {[10,11,12,13,14,15],[]}
ok
9> sys:log(frequency, get).
{ok,[{{in,{'$gen_call',{<0.33.0>,#Ref<0.0.4.59>},
                       {allocate,<0.33.0>}}},
      frequency,#Fun<gen_server.0.40920150>},
     {{out,{ok,10},<0.33.0>,{[11,12,13,14,15],[{10,<0.33.0>}]}},
      frequency,#Fun<gen_server.6.40920150>},
     {{in,{'$gen_cast',{deallocate,10}}},
      frequency,#Fun<gen_server.0.40920150>},
     {{noreply,{[10,11,12,13,14,15],[]}},
      frequency,#Fun<gen_server.4.40920150>}]}
10> sys:log(frequency, false).
ok

When you use the sys:log/2 call to store trace events in the server loop, the default number of events stored is 10. You can override this number by passing the {true, Int} flag when enabling logging. Int is an integer denoting the new default number of events you want to store. When you plan to deal with large volumes of debug messages, or leave debugging turned on for a long time, use sys:log_to_file/2 to pipe the messages to a text file.

System Messages

Have a look at the return value of shell command 9 in the previous example. If we pass the get flag to sys:log/2, we get back a list of system events. The forms of the events in the log depend on the processes producing them, but generally each event contains a system message with one the following forms:

{in, Msg}

This system message is triggered when a message (including a timeout) is sent to the gen_server. Msg includes any construct that is part of the OTP message protocol, e.g., {'$gen_cast', Msg} for casts and {'$gen_call',{Pid, Ref}, Msg} for calls. For any regular Erlang term sent as a message to a gen_server process, Msg will simply be that term.

{out, Msg, To, State}

This system message is generated when replying to the client using the {reply, Reply, NewState} control tuple, but is not generated for replies sent via gen_server:reply/2. Msg is the reply sent to the client, and To is the pid of the client. State is the same as NewState specified in the reply tuple.

term()

System messages of any format are allowed. For example, the return value of shell command 9 includes the message {noreply,{[10,11,12,13,14,15],[]}}, which is the result of handle_cast/2 after handling the deallocate cast. The second element of the noreply tuple is the new state of the gen_server.

Note that the documentation at the time of writing (up to and including Erlang 18) for the sys module also specifies {in, Msg, From} and {out, Msg, To} as valid system messages, but these are not used by any standard behaviors.

Your Own Trace Functions

You can implement your own trace functions by implementing your own fun that gets triggered in conjunction with a system event. You can pattern match on the events, taking any course of action you like. Trace functions can be used to generate your own debug printouts, turn on low-level traces using dbg or the trace BIFs, enable logging of particular information, run diagnostic functions, or execute any other code you might need (or none at all).

The following example keeps a counter for every time a client is refused a frequency and prints a warning message.1 Note how we achieve this without touching the original frequency code:

11> F = fun(Count,{out, {error, no_frequency}, Pid, _LoopData}, ProcData) ->
              io:format("*DBG* Warning, Client ~p refused frequency! Count:~w~n",
              [Pid, Count]), Count + 1;
              (Count,  _, _) ->
                Count
         end.
#Fun<erl_eval.18.54118792>
12> sys:install(frequency, {F, 1}).
ok
13> frequency:allocate(), frequency:allocate(), frequency:allocate(),
    frequency:allocate(), frequency:allocate(), frequency:allocate().
{ok,15}
14> frequency:allocate().
*DBG* Warning, Client <0.33.0> refused frequency! Count:1
{error,no_frequency}
15> frequency:allocate().
*DBG* Warning, Client <0.33.0> refused frequency! Count:2
{error,no_frequency}
16> sys:remove(frequency, F).
false
17> frequency:allocate().
{error,no_frequency}

Let’s look at this example in more detail. We create a fun F that takes three arguments. The first, Count, is the state of the debug function, passed between calls. Count, in this example, acts like a state variable, as we’ve chosen to count the number of times the first function clause is matched. Other trace functions might use more complicated states. The second argument is the system message, in which we pattern match on outbound messages of the format {error, no_frequency}. The third argument, ProcData, is specific to the behavior being traced; for example, for a gen_server it’s either the registered name of the process or its pid, whereas for a gen_fsm it is a tuple of the process name or pid and the current state name of the FSM (we cover the gen_fsm behavior in Chapter 6). All other system messages are ignored due to the second clause of the F function. We set the state of the debug function Count to the integer 1 in the second element of the tuple of the sys:install/2 call in shell command 12. In this command, we also pass the fun F to the frequency server, enabling the debug printout. We continue by calling frequency:allocate/0 enough times to run out of frequencies, triggering the debug printout twice and increasing the counter. Every time it is executed, F returns the Count state variable, incremented by 1 if the first clause pattern matches or unchanged if the second clause matches. Returning the atom done in the debug function is equivalent to disabling the function by calling sys:remove/2, as shown in command line 16.

Statistics, Status, and State

The sys module also lets you collect general statistics on behaviors as well as retrieve information about their internal state, including loop data, without having to reinvent the wheel or implement anything new:

18> sys:statistics(frequency, true).
ok
19> frequency:allocate().
{error,no_frequency}
20> sys:statistics(frequency,get).
{ok,[{start_time,{{2015,11,29},{20,10,54}}},
     {current_time,{{2015,11,29},{20,12,9}}},
     {reductions,33},
     {messages_in,1},
     {messages_out,0}]}
21> sys:statistics(frequency, false).
ok
22> sys:get_status(frequency).
{status,<0.35.0>,
        {module,gen_server},
        [[{'$ancestors',[<0.33.0>]},
          {'$initial_call',{frequency,init,1}}],
         running,<0.33.0>,[],
         [{header,"Status for generic server frequency"},
          {data,[{"Status",running},
                 {"Parent",<0.33.0>},
                 {"Logged events",[]}]},
          {data,[{"State",
                  {{available,[]},
                   {allocated,[{15,<0.33.0>},
                               {14,<0.33.0>},
                               {13,<0.33.0>},
                               {12,<0.33.0>},
                               {11,<0.33.0>},
                               {10,<0.33.0>}]}}}]}]]}

While sys:statistics/2 returns a list of self-explanatory tagged values, the tuple returned by sys:get_status/1 is not as obvious. It returns a tuple of the format:

{status, Pid, {module,Mod}, [ProcessDictionary, SysState, Parent, Dbg, Misc]}

where Pid and Mod are the behavior’s process identifier and callback module, respectively. The ProcessDictionary is a list of key-value tuples. Note that while we do not use the process dictionary in our frequency server example, the gen_server library module and other behaviors we have yet to cover all do.

SysState tells us whether the behavior’s state is running or suspended. By calling sys:suspend/1 and sys:resume/1, we can stop the behavior from handling normal messages, in which case only system messages are handled. Usually you suspend a process when upgrading software using the OTP-specified upgrade capabilities or when testing edge conditions. You might also suspend a process when defining your own behaviors, but most probably not when using standard behaviors. The only way you should suspend Erlang processes in the business logic of your programs is by using receive clauses when none of the messages in the mailbox match. Using the sys:suspend/1 call in your code is a no-no!

Parent is the parent pid, needed by behavior processes that trap exits. If the parent terminates, the behavior processes have to terminate as well. In this example, Parent is the shell process ID. DbgFlag holds the trace and statistics flags, which at the time we retrieved the status had all been turned off (hence the empty list).

Finally, Misc is a list of tagged tuples that contain behavior-specific information. The contained items vary among behaviors, and you are able to override them yourself by providing an optional callback function in your behavior callback module. When working with generic servers, the most important information in Misc is the loop data. You can influence the contents of the Misc value yourself by providing an optional callback function in your behavior callback module, using the function to format the {data, [{"State", ...}]} field to a value the end user might find simpler, more meaningful, or more helpful:

...
-export([format_status/2]).
...

format_status(Opt, [ProcDict, {Available, Allocated}]) ->
    {data, [{"State", {{available, Available}, {allocated, Allocated}}}]}.

If Opt is the atom normal, it tells us the status is being retrieved as a result of the sys:get_status/1 call. If the behavior is terminating abnormally and the status is being retrieved to incorporate it in an error report, Opt is set to terminate.

ProcDict is a list of key-value tuples containing the process dictionary. In the earlier example, the new state would be:

{data,[{"State", {{available, []},{allocated, [{15,<0.33.0>}, {14,<0.33.0>}, 
                                               {13,<0.33.0>}, {12,<0.33.0>},
                                               {11,<0.33.0>}, {10,<0.33.0>}]}}}]}

While it is not mandatory to return a tuple of the format {data, [{"State", State}]}, it is recommended in order to stay consistent with what is currently in use.

To examine just the loop data stored in the behavior process by the callback module, use sys:get_state/1:

23> {Free, Alloc} = sys:get_state(frequency).
{[],
 [{15,<0.33.0>}, {14,<0.33.0>}, {13,<0.33.0>}, {12,<0.33.0>},
  {11,<0.33.0>}, {10,<0.33.0>}]}

This handy method allows you to avoid having to extract the loop data from the results of sys:get_status/1, something that’s often difficult to do while debugging interactively in the shell. The sys:get_state/1 call is intended only for debugging, in fact, as is the corresponding function sys:replace_state/2, which allows you to replace the loop state of a running behavior process. For example, imagine you are debugging in the shell and you want to quickly add a few frequencies. You could do it by recompiling the code and restarting the server—something that’s simple to do when there are only a few frequencies available, as in our example, but much more difficult if you are in the middle of a test with thousands of allocated frequencies and need to retain the state:

24> sys:replace_state(frequency, fun(_) -> {[16,17], Alloc} end).
{[16,17],
 [{15,<0.33.0>}, {14,<0.33.0>}, {13,<0.33.0>}, {12,<0.33.0>},
  {11,<0.33.0>}, {10,<0.33.0>}]}
25> frequency:allocate().
{ok,16}

Replacing the loop data requires passing a function that receives the current value of the loop data and returns a new value. This allows you to easily modify only the necessary portions of a complex loop data value. In this example, we replace the empty list of available frequencies with a list of two new frequencies while keeping the list of allocated frequencies. The sys:replace_state/2 function returns the new loop data. Since the new value in our example adds available frequencies, the next call to frequency:allocate/0, which previously was returning {error, no_frequency}, now returns {ok,16}.

The sys Module Recap

To sum up, let’s take another look at the functions in the sys module we have seen. Note the notation we are using for [,Timeout] in our function descriptions. It means an optional argument to the call, defining functions of arity 2 and 3. Because these functions are nothing other than synchronous calls to our behavior, using Timeout allows us to override the 5-second default timeout time with a value more suited for our application. The functions we’ve covered are:

sys:trace(Name,TraceFlag [,Timeout]) -> ok

sys:log(Name,LogFlag [,Timeout]) -> ok | {ok, EventList}
sys:log_to_file(Name,FileFlag [,Timeout]) -> ok | {error, open_file}

sys:install(Name,{Func,FuncState} [,Timeout]) -> ok
sys:remove(Name,Func [,Timeout])

sys:statistics(Name,Flag [,Timeout]) -> ok | {ok, Statistics}.

sys:get_status(Name [,Timeout]) -> {status, Pid, {module, Mod}, Status}

sys:get_state(Name [,Timeout]) -> State

sys:replace_state(Name,ReplaceFun [,Timeout]) -> State

sys:suspend(Name [,Timeout]) -> ok
sys:resume(Name [,Timeout]) -> ok

To print trace events to the shell, use trace/2. When logging the events for later retrieval, use log/2. Turn logging on and off by setting LogFlag to true or false. By default, the last 10 events are stored; you can override this value by turning on logging using {true, Int}, where Int is a non-negative integer.

Events can be retrieved using the print and get flags. When using log_to_file/2, events are stored in textual format. The FileFlag is a string denoting the absolute or relative filename, or the atom false to turn it off. Use sys:install/2 to write your own triggers and trace functions in conjunction with system events and sys:remove/2 to recall them.

When using statistics/2, turn the gathering of statistics on and off by setting Flag to true or false, respectively. Use get_state/1 to examine loop data and replace_state/2 to replace it. And finally, get_status/1 returns all the available data relative to the internal behavior state. The get_state/1, replace_state/2, and get_status/1 functions are incredibly helpful when debugging and troubleshooting live systems.

Remember the Opts parameter passed as the last argument to the gen_server start functions? We used the empty list as a placeholder. You can enable tracing, logging, and statistics when starting your behavior by using the Opts field. If you pass [{debug, DbgList}], where DbgList contains one or more of the entries trace, log, statistics, and {log_to_file, FileName}, these flags are enabled as soon as the behavior process is started.

Spawn Options

When starting a behavior, you can change the default memory and garbage collector settings to address performance and memory utilization. The settings you pass are the same ones taken by the spawn_opt/4 BIF, but passed as an argument of the format [{spawn_opts, OptsList}] along with the debug options in the behavior Opts field.

Use your spawn options with care! The only way to be sure you have performance issues and bottlenecks related to memory management is by profiling and benchmarking your systems. In doing so, you need to understand how the underlying heaps, memory allocation, and garbage collection mechanisms work. Premature optimization is the root of all evil (after shared memory and mutable state). If you do not believe us, you will soon learn that attempts to optimize memory management often have the opposite effect and make your programs slower. The vast majority of cases do not call for performance tuning, but those that do will greatly benefit being spawned with a larger heap or more (or less) frequent garbage collection cycles.

Memory Management and Garbage Collection

If you suspect that your performance issues can be addressed through memory management, benchmark your system while manipulating the heap and the garbage collector settings. Memory-related options that can be changed include:

min_heap_size

Sets the size the process heap will grow to before the garbage collector (gc) is triggered. This name is misleading, though, as it is in fact the maximum size the heap is allowed to grow to before triggering the gc.

min_bin_vheap_size

Sets the initial and minimal value of the space this process is allowed to use in the shared binary heap before triggering a garbage collection on the binaries.

fullsweep_after

Determines the number of generational garbage collections that have to be executed before a complete garbage collection pass.

In the following example, we start the frequency server and trace events related to the garbage collector. We use the dbg tracer to measure how many microseconds the process spends garbage collecting. When allocating five frequencies, the total was 9 microseconds (911,345–911,336):

1> dbg:tracer().
{ok,<0.35.0>}
2> {ok, Pid} = frequency:start().
{ok,<0.38.0>}
3> dbg:p(Pid, [garbage_collection, timestamp]).
{ok,[{matched,nonode@nohost,1}]}
4> frequency:allocate(), frequency:allocate(), frequency:allocate(),
   frequency:allocate(), frequency:allocate().
{ok,14}
(<0.38.0>) gc_start [{old_heap_block_size,0},
 {heap_block_size,233},
 {mbuf_size,0},
 {recent_size,0},
 {stack_size,12},
 {old_heap_size,0},
 {heap_size,213},
 {bin_vheap_size,0},
 {bin_vheap_block_size,46422},
 {bin_old_vheap_size,0},
 {bin_old_vheap_block_size,46422}] (Timestamp: {1448,829619,911336})
(<0.38.0>) gc_end [{old_heap_block_size,0},
 {heap_block_size,233},
 {mbuf_size,0},
 {recent_size,44},
 {stack_size,12},
 {old_heap_size,0},
 {heap_size,44},
 {bin_vheap_size,0},
 {bin_vheap_block_size,46422},
 {bin_old_vheap_size,0},
 {bin_old_vheap_block_size,46422}] (Timestamp: {1448,829619,911345})

If we now spawn the frequency server, setting the minimum heap size to 1,024 words (a smaller size would have been enough), we have enough memory to allocate the frequencies without triggering the garbage collector:

1> dbg:tracer().
{ok,<0.35.0>}
2> {ok, Pid} = gen_server:start_link({local, frequency}, frequency, [],
                                      [{spawn_opt, [{min_heap_size, 1024}]}]).
{ok,<0.38.0>}
3> dbg:p(Pid, [garbage_collection, timestamp]).
{ok,[{matched,nonode@nohost,1}]}
4> frequency:allocate(), frequency:allocate(), frequency:allocate(),
   frequency:allocate(), frequency:allocate().
{ok,14}

Full sweep of the heap

By setting the {fullsweep_after, Number} spawn option, you can specify the number of generational garbage collections that take place before executing a full sweep. Setting Number to 0 disables the generational garbage collection mechanism, freeing all unused data in both the young heap and the old heap every time it is triggered. This will help in environments with little RAM where memory has to be strictly managed. The zero setting may also be useful when a lot of large binaries that are no longer referenced collect in the old heap and you want to remove them frequently. Setting a small value will be suitable if your data is short-lived and benchmarks demonstrate that it is cluttering up your heap. The Erlang documentation suggests a value of 10 or 20, but you should pick your own based on the properties displayed by your system. The default value is much larger!

A full-sweep garbage collection is also triggered every time you hibernate your process. This might help reduce the memory footprint when working with processes that have memory-intensive computations but little overall activity. You can set the full-sweep value globally for all processes using the erlang:system_flag/2 call, but we recommend you don’t. You can use the process_info/2 BIF to get information on the settings you change:

5> process_info(Pid, garbage_collection).
{garbage_collection,[{min_bin_vheap_size,46422},
                     {min_heap_size,1598},
                     {fullsweep_after,65535},
                     {minor_gcs,0}]}

Note the default setting of fullsweep_after, a value much higher than you might expect. We had set the min_heap_size to 1024, but in shell prompt 5, it appears to be 1598. We requested 1,024 words, but 1,598 is the first value greater than 1,024 in the Fibonacci recurrence sequence of heap sizes the VM uses, so that value is selected instead of 1,024.

Warning

If you start playing with the heap size and garbage collection settings, keep in mind that memory is freed only when the garbage collector is triggered. There might be cases where the process heap contains binary references to potentially large binaries in the shared heap. Each reference to a binary is relatively small, so even if the process does not refer to these binaries anymore, potentially huge amounts of memory can be consumed without the garbage collector being triggered, because there is still plenty of space on the process heap. That is why the per-process virtual binary heap is there, calculating the total amount of memory used up by the binaries in the shared heap and helping ensure they get garbage collected more promptly. Under these circumstances, hibernating the process or triggering garbage collection using the erlang:garbage_collect() BIF might prove more useful.

Another potential risk is running out of memory. As an example, having a large min_heap_size and using the dangerously high default fullsweep_after value of 65535 might result in the old heap growing because garbage collections are far apart, resulting in your system running out of memory before the first full sweeps are triggered. Always stress test your systems, and let soak test runs span days, if not weeks.

What’s Next?

We park online tracing for now, until we implement our own behaviors (learning how it all works behind the scenes) in Chapter 10, and ignore performance tuning until we reach Chapter 13. In the next chapters, we focus on the remaining behaviors, starting with FSMs, followed by event managers, supervisors, and applications. Remember that they are all built on the same foundations, so the sys module and all of the spawn and debug options we have discussed in this chapter will be valid.

1 The io:format/2 executed in the fun attaches itself to the group leader of the traced behavior, causing warnings to be printed in the local shell. If you connect from a remote shell, you will not be able to see them.