Chapter 5. Controlling OTP Behaviors

We have in the previous chapters covered the highlights of the gen_server behavior. You should by now have implemented your first client-server application and started to build an idea of how OTP behaviors help you to reduce your code base by allowing you to focus on the specifics of what your system has to do. This chapter digs deeper into behaviors, exploring some of the advanced topics intermixed with built-in functionality. While we are focusing on generic servers, most of what we write will apply to many of the other behaviors, including those you could implement yourself. Read with care, as we reference this chapter often in the remainder of this book.

The sys Module

We’ve mentioned many times the built-in functionality you get as a result of using OTP behaviors and the ease with which you can add your own features. Most of what we cover is accessed through the sys module, allowing you to generate trace events, inspect and manipulate behavior state, as well as send and receive system messages. All of this functionality works on the standard OTP behaviors, but also, as we show in Chapter 10, you can reuse it when defining your own behaviors.

Tracing and Logging

Let’s find out how built-in tracing works by running a little example. Start your frequency server in the shell and, using the sys module, try the following:

1> frequency:start().
{ok,<0.35.0>}
2> sys:trace(frequency, true).
ok
3> frequency:allocate().
*DBG* frequency got call {allocate,<0.33.0>} from <0.33.0>
*DBG* frequency sent {ok,10} to <0.33.0>,
      new state {[11,12,13,14,15],[{10,<0.33.0>}]}
{ok,10}
4> frequency:deallocate(10).
*DBG* frequency got cast {deallocate,10}
ok
*DBG* frequency new state {[10,11,12,13,14,15],[]}
5> sys:trace(frequency, false).
ok

By turning on the trace flags for our frequency allocator, we are able to generate printouts of system events, including messages and state changes. Our example pipes the messages out to the shell. If we instead use the sys:log/2 call, we store them in the server loop. They can be displayed using the print flag or can be retrieved as an Erlang data structure through the get flag:

6> sys:log(frequency, true).
ok
7> {ok, Freq} = frequency:allocate(), frequency:deallocate(Freq).
ok
8> sys:log(frequency, print).
*DBG* frequency got call {allocate,<0.33.0>} from <0.33.0>
*DBG* frequency sent {ok,10} to <0.33.0>, 
      new state {[11,12,13,14,15],[{10,<0.33.0>}]}
*DBG* frequency got cast {deallocate,10}
*DBG* frequency new state {[10,11,12,13,14,15],[]}
ok
9> sys:log(frequency, get).
{ok,[{{in,{'$gen_call',{<0.33.0>,#Ref<0.0.4.59>},
                       {allocate,<0.33.0>}}},
      frequency,#Fun<gen_server.0.40920150>},
     {{out,{ok,10},<0.33.0>,{[11,12,13,14,15],[{10,<0.33.0>}]}},
      frequency,#Fun<gen_server.6.40920150>},
     {{in,{'$gen_cast',{deallocate,10}}},
      frequency,#Fun<gen_server.0.40920150>},
     {{noreply,{[10,11,12,13,14,15],[]}},
      frequency,#Fun<gen_server.4.40920150>}]}
10> sys:log(frequency, false).
ok

When you use the sys:log/2 call to store trace events in the server loop, the default number of events stored is 10. You can override this number by passing the {true, Int} flag when enabling logging. Int is an integer denoting the new default number of events you want to store. When you plan to deal with large volumes of debug messages, or leave debugging turned on for a long time, use sys:log_to_file/2 to pipe the messages to a text file.

System Messages

Have a look at the return value of shell command 9 in the previous example. If we pass the get flag to sys:log/2, we get back a list of system events. The forms of the events in the log depend on the processes producing them, but generally each event contains a system message with one the following forms:

{in, Msg}: This system message is triggered when a message (including a timeout) is sent to the gen_server. Msg includes any construct that is part of the OTP message protocol, e.g., {'$gen_cast', Msg} for casts and {'$gen_call',{Pid, Ref}, Msg} for calls. For any regular Erlang term sent as a message to a gen_server process, Msg will simply be that term.
{out, Msg, To, State}: This system message is generated when replying to the client using the {reply, Reply, NewState} control tuple, but is not generated for replies sent via gen_server:reply/2. Msg is the reply sent to the client, and To is the pid of the client. State is the same as NewState specified in the reply tuple.
term(): System messages of any format are allowed. For example, the return value of shell command 9 includes the message {noreply,{[10,11,12,13,14,15],[]}}, which is the result of handle_cast/2 after handling the deallocate cast. The second element of the noreply tuple is the new state of the gen_server.

Note that the documentation at the time of writing (up to and including Erlang 18) for the sys module also specifies {in, Msg, From} and {out, Msg, To} as valid system messages, but these are not used by any standard behaviors.

Your Own Trace Functions

You can implement your own trace functions by implementing your own fun that gets triggered in conjunction with a system event. You can pattern match on the events, taking any course of action you like. Trace functions can be used to generate your own debug printouts, turn on low-level traces using dbg or the trace BIFs, enable logging of particular information, run diagnostic functions, or execute any other code you might need (or none at all).

The following example keeps a counter for every time a client is refused a frequency and prints a warning message.1 Note how we achieve this without touching the original frequency code:

11> F = fun(Count,{out, {error, no_frequency}, Pid, _LoopData}, ProcData) ->
              io:format("*DBG* Warning, Client ~p refused frequency! Count:~w~n",
              [Pid, Count]), Count + 1;
              (Count,  _, _) ->
                Count
         end.
#Fun<erl_eval.18.54118792>
12> sys:install(frequency, {F, 1}).
ok
13> frequency:allocate(), frequency:allocate(), frequency:allocate(),
    frequency:allocate(), frequency:allocate(), frequency:allocate().
{ok,15}
14> frequency:allocate().
*DBG* Warning, Client <0.33.0> refused frequency! Count:1
{error,no_frequency}
15> frequency:allocate().
*DBG* Warning, Client <0.33.0> refused frequency! Count:2
{error,no_frequency}
16> sys:remove(frequency, F).
false
17> frequency:allocate().
{error,no_frequency}

Let’s look at this example in more detail. We create a fun F that takes three arguments. The first, Count, is the state of the debug function, passed between calls. Count, in this example, acts like a state variable, as we’ve chosen to count the number of times the first function clause is matched. Other trace functions might use more complicated states. The second argument is the system message, in which we pattern match on outbound messages of the format {error, no_frequency}. The third argument, ProcData, is specific to the behavior being traced; for example, for a gen_server it’s either the registered name of the process or its pid, whereas for a gen_fsm it is a tuple of the process name or pid and the current state name of the FSM (we cover the gen_fsm behavior in Chapter 6). All other system messages are ignored due to the second clause of the F function. We set the state of the debug function Count to the integer 1 in the second element of the tuple of the sys:install/2 call in shell command 12. In this command, we also pass the fun F to the frequency server, enabling the debug printout. We continue by calling frequency:allocate/0 enough times to run out of frequencies, triggering the debug printout twice and increasing the counter. Every time it is executed, F returns the Count state variable, incremented by 1 if the first clause pattern matches or unchanged if the second clause matches. Returning the atom done in the debug function is equivalent to disabling the function by calling sys:remove/2, as shown in command line 16.

Statistics, Status, and State

The sys module also lets you collect general statistics on behaviors as well as retrieve information about their internal state, including loop data, without having to reinvent the wheel or implement anything new:

18> sys:statistics(frequency, true).
ok
19> frequency:allocate().
{error,no_frequency}
20> sys:statistics(frequency,get).
{ok,[{start_time,{{2015,11,29},{20,10,54}}},
     {current_time,{{2015,11,29},{20,12,9}}},
     {reductions,33},
     {messages_in,1},
     {messages_out,0}]}
21> sys:statistics(frequency, false).
ok
22> sys:get_status(frequency).
{status,<0.35.0>,
        {module,gen_server},
        [[{'$ancestors',[<0.33.0>]},
          {'$initial_call',{frequency,init,1}}],
         running,<0.33.0>,[],
         [{header,"Status for generic server frequency"},
          {data,[{"Status",running},
                 {"Parent",<0.33.0>},
                 {"Logged events",[]}]},
          {data,[{"State",
                  {{available,[]},
                   {allocated,[{15,<0.33.0>},
                               {14,<0.33.0>},
                               {13,<0.33.0>},
                               {12,<0.33.0>},
                               {11,<0.33.0>},
                               {10,<0.33.0>}]}}}]}]]}

While sys:statistics/2 returns a list of self-explanatory tagged values, the tuple returned by sys:get_status/1 is not as obvious. It returns a tuple of the format:

{status, Pid, {module,Mod}, [ProcessDictionary, SysState, Parent, Dbg, Misc]}

where Pid and Mod are the behavior’s process identifier and callback module, respectively. The ProcessDictionary is a list of key-value tuples. Note that while we do not use the process dictionary in our frequency server example, the gen_server library module and other behaviors we have yet to cover all do.

SysState tells us whether the behavior’s state is running or suspended. By calling sys:suspend/1 and sys:resume/1, we can stop the behavior from handling normal messages, in which case only system messages are handled. Usually you suspend a process when upgrading software using the OTP-specified upgrade capabilities or when testing edge conditions. You might also suspend a process when defining your own behaviors, but most probably not when using standard behaviors. The only way you should suspend Erlang processes in the business logic of your programs is by using receive clauses when none of the messages in the mailbox match. Using the sys:suspend/1 call in your code is a no-no!

Parent is the parent pid, needed by behavior processes that trap exits. If the parent terminates, the behavior processes have to terminate as well. In this example, Parent is the shell process ID. DbgFlag holds the trace and statistics flags, which at the time we retrieved the status had all been turned off (hence the empty list).

Finally, Misc is a list of tagged tuples that contain behavior-specific information. The contained items vary among behaviors, and you are able to override them yourself by providing an optional callback function in your behavior callback module. When working with generic servers, the most important information in Misc is the loop data. You can influence the contents of the Misc value yourself by providing an optional callback function in your behavior callback module, using the function to format the {data, [{"State", ...}]} field to a value the end user might find simpler, more meaningful, or more helpful:

...
-export([format_status/2]).
...

format_status(Opt, [ProcDict, {Available, Allocated}]) ->
    {data, [{"State", {{available, Available}, {allocated, Allocated}}}]}.

If Opt is the atom normal, it tells us the status is being retrieved as a result of the sys:get_status/1 call. If the behavior is terminating abnormally and the status is being retrieved to incorporate it in an error report, Opt is set to terminate.

ProcDict is a list of key-value tuples containing the process dictionary. In the earlier example, the new state would be:

{data,[{"State", {{available, []},{allocated, [{15,<0.33.0>}, {14,<0.33.0>}, 
                                               {13,<0.33.0>}, {12,<0.33.0>},
                                               {11,<0.33.0>}, {10,<0.33.0>}]}}}]}

While it is not mandatory to return a tuple of the format {data, [{"State", State}]}, it is recommended in order to stay consistent with what is currently in use.

To examine just the loop data stored in the behavior process by the callback module, use sys:get_state/1:

23> {Free, Alloc} = sys:get_state(frequency).
{[],
 [{15,<0.33.0>}, {14,<0.33.0>}, {13,<0.33.0>}, {12,<0.33.0>},
  {11,<0.33.0>}, {10,<0.33.0>}]}

This handy method allows you to avoid having to extract the loop data from the results of sys:get_status/1, something that’s often difficult to do while debugging interactively in the shell. The sys:get_state/1 call is intended only for debugging, in fact, as is the corresponding function sys:replace_state/2, which allows you to replace the loop state of a running behavior process. For example, imagine you are debugging in the shell and you want to quickly add a few frequencies. You could do it by recompiling the code and restarting the server—something that’s simple to do when there are only a few frequencies available, as in our example, but much more difficult if you are in the middle of a test with thousands of allocated frequencies and need to retain the state:

24> sys:replace_state(frequency, fun(_) -> {[16,17], Alloc} end).
{[16,17],
 [{15,<0.33.0>}, {14,<0.33.0>}, {13,<0.33.0>}, {12,<0.33.0>},
  {11,<0.33.0>}, {10,<0.33.0>}]}
25> frequency:allocate().
{ok,16}

Replacing the loop data requires passing a function that receives the current value of the loop data and returns a new value. This allows you to easily modify only the necessary portions of a complex loop data value. In this example, we replace the empty list of available frequencies with a list of two new frequencies while keeping the list of allocated frequencies. The sys:replace_state/2 function returns the new loop data. Since the new value in our example adds available frequencies, the next call to frequency:allocate/0, which previously was returning {error, no_frequency}, now returns {ok,16}.

The sys Module Recap

To sum up, let’s take another look at the functions in the sys module we have seen. Note the notation we are using for [,Timeout] in our function descriptions. It means an optional argument to the call, defining functions of arity 2 and 3. Because these functions are nothing other than synchronous calls to our behavior, using Timeout allows us to override the 5-second default timeout time with a value more suited for our application. The functions we’ve covered are:

sys:trace(Name,TraceFlag [,Timeout]) -> ok

sys:log(Name,LogFlag [,Timeout]) -> ok | {ok, EventList}
sys:log_to_file(Name,FileFlag [,Timeout]) -> ok | {error, open_file}

sys:install(Name,{Func,FuncState} [,Timeout]) -> ok
sys:remove(Name,Func [,Timeout])

sys:statistics(Name,Flag [,Timeout]) -> ok | {ok, Statistics}.

sys:get_status(Name [,Timeout]) -> {status, Pid, {module, Mod}, Status}

sys:get_state(Name [,Timeout]) -> State

sys:replace_state(Name,ReplaceFun [,Timeout]) -> State

sys:suspend(Name [,Timeout]) -> ok
sys:resume(Name [,Timeout]) -> ok

To print trace events to the shell, use trace/2. When logging the events for later retrieval, use log/2. Turn logging on and off by setting LogFlag to true or false. By default, the last 10 events are stored; you can override this value by turning on logging using {true, Int}, where Int is a non-negative integer.

Events can be retrieved using the print and get flags. When using log_to_file/2, events are stored in textual format. The FileFlag is a string denoting the absolute or relative filename, or the atom false to turn it off. Use sys:install/2 to write your own triggers and trace functions in conjunction with system events and sys:remove/2 to recall them.

When using statistics/2, turn the gathering of statistics on and off by setting Flag to true or false, respectively. Use get_state/1 to examine loop data and replace_state/2 to replace it. And finally, get_status/1 returns all the available data relative to the internal behavior state. The get_state/1, replace_state/2, and get_status/1 functions are incredibly helpful when debugging and troubleshooting live systems.

Remember the Opts parameter passed as the last argument to the gen_server start functions? We used the empty list as a placeholder. You can enable tracing, logging, and statistics when starting your behavior by using the Opts field. If you pass [{debug, DbgList}], where DbgList contains one or more of the entries trace, log, statistics, and {log_to_file, FileName}, these flags are enabled as soon as the behavior process is started.

Spawn Options

When starting a behavior, you can change the default memory and garbage collector settings to address performance and memory utilization. The settings you pass are the same ones taken by the spawn_opt/4 BIF, but passed as an argument of the format [{spawn_opts, OptsList}] along with the debug options in the behavior Opts field.

Use your spawn options with care! The only way to be sure you have performance issues and bottlenecks related to memory management is by profiling and benchmarking your systems. In doing so, you need to understand how the underlying heaps, memory allocation, and garbage collection mechanisms work. Premature optimization is the root of all evil (after shared memory and mutable state). If you do not believe us, you will soon learn that attempts to optimize memory management often have the opposite effect and make your programs slower. The vast majority of cases do not call for performance tuning, but those that do will greatly benefit being spawned with a larger heap or more (or less) frequent garbage collection cycles.

Memory Management and Garbage Collection

If you suspect that your performance issues can be addressed through memory management, benchmark your system while manipulating the heap and the garbage collector settings. Memory-related options that can be changed include:

min_heap_size: Sets the size the process heap will grow to before the garbage collector (gc) is triggered. This name is misleading, though, as it is in fact the maximum size the heap is allowed to grow to before triggering the gc.
min_bin_vheap_size: Sets the initial and minimal value of the space this process is allowed to use in the shared binary heap before triggering a garbage collection on the binaries.
fullsweep_after: Determines the number of generational garbage collections that have to be executed before a complete garbage collection pass.

In the following example, we start the frequency server and trace events related to the garbage collector. We use the dbg tracer to measure how many microseconds the process spends garbage collecting. When allocating five frequencies, the total was 9 microseconds (911,345–911,336):

1> dbg:tracer().
{ok,<0.35.0>}
2> {ok, Pid} = frequency:start().
{ok,<0.38.0>}
3> dbg:p(Pid, [garbage_collection, timestamp]).
{ok,[{matched,nonode@nohost,1}]}
4> frequency:allocate(), frequency:allocate(), frequency:allocate(),
   frequency:allocate(), frequency:allocate().
{ok,14}
(<0.38.0>) gc_start [{old_heap_block_size,0},
 {heap_block_size,233},
 {mbuf_size,0},
 {recent_size,0},
 {stack_size,12},
 {old_heap_size,0},
 {heap_size,213},
 {bin_vheap_size,0},
 {bin_vheap_block_size,46422},
 {bin_old_vheap_size,0},
 {bin_old_vheap_block_size,46422}] (Timestamp: {1448,829619,911336})
(<0.38.0>) gc_end [{old_heap_block_size,0},
 {heap_block_size,233},
 {mbuf_size,0},
 {recent_size,44},
 {stack_size,12},
 {old_heap_size,0},
 {heap_size,44},
 {bin_vheap_size,0},
 {bin_vheap_block_size,46422},
 {bin_old_vheap_size,0},
 {bin_old_vheap_block_size,46422}] (Timestamp: {1448,829619,911345})

If we now spawn the frequency server, setting the minimum heap size to 1,024 words (a smaller size would have been enough), we have enough memory to allocate the frequencies without triggering the garbage collector:

1> dbg:tracer().
{ok,<0.35.0>}
2> {ok, Pid} = gen_server:start_link({local, frequency}, frequency, [],
                                      [{spawn_opt, [{min_heap_size, 1024}]}]).
{ok,<0.38.0>}
3> dbg:p(Pid, [garbage_collection, timestamp]).
{ok,[{matched,nonode@nohost,1}]}
4> frequency:allocate(), frequency:allocate(), frequency:allocate(),
   frequency:allocate(), frequency:allocate().
{ok,14}

5> process_info(Pid, garbage_collection).
{garbage_collection,[{min_bin_vheap_size,46422},
                     {min_heap_size,1598},
                     {fullsweep_after,65535},
                     {minor_gcs,0}]}

Note the default setting of fullsweep_after, a value much higher than you might expect. We had set the min_heap_size to 1024, but in shell prompt 5, it appears to be 1598. We requested 1,024 words, but 1,598 is the first value greater than 1,024 in the Fibonacci recurrence sequence of heap sizes the VM uses, so that value is selected instead of 1,024.

Warning

If you start playing with the heap size and garbage collection settings, keep in mind that memory is freed only when the garbage collector is triggered. There might be cases where the process heap contains binary references to potentially large binaries in the shared heap. Each reference to a binary is relatively small, so even if the process does not refer to these binaries anymore, potentially huge amounts of memory can be consumed without the garbage collector being triggered, because there is still plenty of space on the process heap. That is why the per-process virtual binary heap is there, calculating the total amount of memory used up by the binaries in the shared heap and helping ensure they get garbage collected more promptly. Under these circumstances, hibernating the process or triggering garbage collection using the erlang:garbage_collect() BIF might prove more useful.

Another potential risk is running out of memory. As an example, having a large min_heap_size and using the dangerously high default fullsweep_after value of 65535 might result in the old heap growing because garbage collections are far apart, resulting in your system running out of memory before the first full sweeps are triggered. Always stress test your systems, and let soak test runs span days, if not weeks.

Spawn Options to Avoid

The following options should be avoided because they either do not work with behaviors or are considered to be bad programming practice. Although monitor can be passed as an option when using the spawn_opt/3 BIF, it is disallowed in generic servers and will result in the process terminating with a badarg. While you are allowed to use link as an option, starting the behaviors with start_link is preferred.

Process priorities should never be set using the {priority, Level} option, where Level is the atom low, normal, or high. Changing process priorities is even more dangerous than meddling with memory and garbage collection, as it can upset the VM’s balance and have serious repercussions on the soft real-time properties of your system. Changing priorities can cause the VM’s schedulers to behave strangely and unfairly; processes with a higher priority have been known to starve when the ratio between them and those with a lower priority reached certain limits. Furthermore, processes with a lower priority have caused the runtime system to run out of memory when, under heavy load, messages were not consumed as fast as they were produced. You obviously never notice these issues when testing your system; rather, they tend to come back and bite you when the live system comes under heavy load. Let the runtime system decide on your behalf, especially when dealing with hundreds of thousands of processes. You have been warned!

Timeouts

If you want to limit the time a behavior spends in its init function, include the option {timeout,Timeout}. If after Timeout milliseconds the init callback function is still executing, the process is terminated and the start function returns {error,timeout}. This option is useful in very specific circumstances, often in a running system with dynamic children responsible for a particular transient resource. We don’t recommend using it when starting your system, though; we instead suggest that you try to minimize the amount of work executed in the init function so as to not slow down the startup procedure.

Summing Up

There are many options to control and monitor your behaviors. Start with built-in tracing and logging functionality. You can then dynamically add generic trace and debug triggers or change your process state using funs and the sys module, during runtime and without the need to recompile your code. This is a priceless feature, as you can use it on systems you have never seen that have been running for years on end without the need to restart them. You can read more about this in the sys module’s reference manual page.

Optimizing processes through the use of the memory flags in their options is trickier, as it requires you to benchmark your system and base your optimizations on the information you extract as a result of your tests. It is rare that you will have to manipulate the default garbage collector settings or play with your heap sizes. But if and when you are having performance problems, you will be grateful you have read this far in this chapter. If you need more information, look at the documentation of the spawn_opt BIF in the erlang module’s manual page.

What’s Next?

We park online tracing for now, until we implement our own behaviors (learning how it all works behind the scenes) in Chapter 10, and ignore performance tuning until we reach Chapter 13. In the next chapters, we focus on the remaining behaviors, starting with FSMs, followed by event managers, supervisors, and applications. Remember that they are all built on the same foundations, so the sys module and all of the spawn and debug options we have discussed in this chapter will be valid.

¹ The io:format/2 executed in the fun attaches itself to the group leader of the traced behavior, causing warnings to be printed in the local shell. If you connect from a remote shell, you will not be able to see them.