Sum types and state machines in C++17

By Colin

Jul 22, 2017

Designing a library for Finite State Machines (FSM) is an interesting idea, because it's such a broad concept. State machines represent a complete model of computation, so it seems that either (a) supporting arbitrary state machines requires a library with similar complexity to a complete programming language or (b) no library is necessary, because a state machine encompasses just a subset of everything the language is already capable of.

It turns out that this can be the case in many functional languages that support such features as algebraic datatypes and pattern matching natively, but certainly a small library can make implementing state machines easier and enforce some level of consistency in imperative languages like C++.

What does a state machine library provide?

In practice, most explicit state machines are event-driven, in that they only change state or produce effects when some input changes (a timer expires, a button is pressed, etc). Such FSMs usually consist of three things:

A finite list of possible states which the machine may occupy.
Variables, some of which are shared across states and some of which might be used by only one state.
Actions to perform upon specific events, and criteria upon which to transition to a new state.

Some people break the third thing into two, but I tend to consider the transitions themselves as actions as well because doing so usually gives a little more versatility (which will be shown later).

Which libraries are out there?

Within the realm of C++, most state machine libraries occupy one of two categories.

In the first of these categories, the user creates an enum with as many values as there are states. The user then creates a class that holds an instance of this enum, plus whichever variables are needed by any of the possible states. The user also creates an enum that enumerates all the possible events the state machine can handle. Finally, they create a 2-dimensional "transition table", where upon receiving an event, the state machine indexes the table based on the value of its state enum and the event type to obtain a pointer to a function that performs some action or alters some variables on the state machine and may yield a new state.

With code instead of words, that's something like this:

struct StateMachine {
    State activeState;
    // declare variables that any states need below
    // [...]
    void onEvent(Event event) {
        // Look up the handler for this state/event combo
        auto handler = transitionTable[(int)activeState][(int)event];
        // Let the handler perform actions & generate the new state.
        activeState = handler(this);
    }
};

Among the downsides to this approach is the limitation that all states share all data. This is a bit unclean (it's like declaring all your variables public, everywhere), wastes memory and limits the extent to which the programmer can take advantage of paradigms like RAII for management of state-specific resources. Additionally, we might like for our events to have some associated data, but enums don't allow for that.

In the second category, each possible state is its own class, which derives from a common base class. The base class provides a virtual method for handling events, which returns a new state. There is no need for a transition table.

class State {
    virtual unique_ptr<State> onEvent(Event event) = 0;
};

struct StateMachine {
    unique_ptr<State> activeState;
    void onEvent(Event event) {
	activeState = activeState->onEvent(event);
    }
};

In this method, each state can implement its own local variables and RAII is possible. The downside is that it requires heap allocation and vtables + virtual dispatch, including the extra layer of indirection that comes from pointers. For many state machines, this leads to suboptimal performance and the heap requirement does limit things for some embedded applications (software that drives hardware peripherals tends to use a lot of state machines and perform operations from within interrupts, where memory allocations are risky or impossible).

The most promising library I could find is the Boost MSM. This works without a heap and does support state-specific variables, which aren't visible to other states, but it looks like they're still persisted across all states. So efficient usage of memory and RAII still aren't possible.

Sum types

C++17 offers variant types (also known as tagged unions or sum types). A variant is much like the C union construct, but type safe. Like a union, it allows a variable to be assigned different types throughout its lifetime, but it only ever has one value at a time.

For those with experience in dynamic programming languages, it's not too different from what languages like Python allow:

def halve(x):
    if isinstance(x, int):
        return x/2
    elif isinstance(x, str):
        return x[0:len(x)/2]
    else:
        raise TypeError('Expected an int or str')

>>> halve(24)
12
>>> halve('hello, world!')
'hello,'

In this case, x is similar to a sum type: it can be either an int or a string and we only know at runtime. Today, this can be implemented in C++ as follows:

typedef union {
    int,
    std::string
} IntStrUnion;

struct IntStrSumType {
    /// Holds '0' to indicate 'data' is an int.
    /// Holds '1' to indicate 'data' is a string.
    int tag;
    IntStrUnion data;
};

IntStrSumType halve(const IntStrSumType& x) {
    if (x.tag == 0) {
        return IntStrSumType(0, *((int*)&x) / 2);
    } else if (x.tag == 1) {
        return IntStrSumType(1,
            ((std::string*)&x)->substr(0, ((std::string*)&x)->length()/2));
    } else {
        DIE("Illegal type tag: expected 0 or 1");
    }
}

Note though that this method is incredibly error prone. Beyond the obvious, memory leaks are difficult to avoid with this approach. Because the compiler doesn't know the type of the value contained in a union, it can't call any constructors when the union is destroyed. Thus, the caller will have to explicitly call the destructor of the appropriate type when done with the returned value.

We could overcome some of these errors by wrapping this in a safer interface - make a class whose destructor checks the tag of the union and calls the destructor of the active type, then overload the assignment operator so that it sets the appropriate tag based on if you assign it an int or a string, and finally make the tag private and force all casts to be checked against the tag. This is essentially what std::variant does, but it also introduces the concept of visitors.

Below is how the halve function could be implemented in C++17. Note the existence of the overloaded keyword: this lets us overload lambda functions in the same way that C++ lets one overload named functions. The visit function does some magic behind the scenes so that the tag will be checked at runtime and then the appropriate overload of our lambda will be called. We'll make the syntax a bit friendlier later on, too.

#include <variant>

using IntStrSumType = std::variant<int, std::string>;

IntStrSumType halve(IntStrSumType x) {
    std::visit(overloaded {
            [](int&& x_int) {
                return x_int/2;
            },
            [](string&& x_str) {
                return x_str.substr(x_str.length()/2);
            }
        },
        x
    );
}

State machines from sum types

A finite state machine (FSM) can be viewed as a sum type: there is a fixed (at compile time) set of states that the machine may occupy, and at any given time it is in exactly one state.

Logically, event handling is identical to the previous examples: when an event occurs, we dispatch it to the appropriate handler based on the active state. This handler can perform some actions and return a new state for the machine. Here's how we might perform event dispatch for a vending machine.

/// Data that a caller will supply us when the user presses a button on the vending machine.
struct ButtonPressed {
    int index;
};

/// List of errors we might encounter.
enum class ErrorType {
    OutOfStock,
    InsufficientFunds
};

// Three different states our vending machine might be in.
struct Idle {
    /// Amount of cash (in pennies) deposited by the user so far.
    int funds;
};
struct Errored {
    ErrorType error;
};
struct FetchingItem {
    int itemRow, itemCol;
};

using State = std::variant<Idle, Error, FetchingItem>;
State state = Idle;
void dispatchEvent(ButtonPressed event) {
    state = std::visit(overloaded {
            [&](Idle&& state) {
                int cost = getPrice(event.index);
                if (!hasStock(event.index)) {
                    return Error{ErrorType::OutOfStock};
                } else if (cost > state.funds) {
                    return Error{ErrorType::ErrorInsufficientFunds};
                } else {
                    giveChange(state.funds - cost);
                    // assume each row has 8 different items.
                    return FetchingItem{event.index/8, event.index%8};
                }
            },
            [&](FetchingItem&& state) {
                return std::move(state);
            },
            [&](Errored&& state) {
                return std::move(state);
            }
        },
        std::move(state)
    );
}

Note the use of std::move when passing the state into the visitor. This will crop up again, and its purpose is to make sure that we avoid ever copying the state as we process it. That's not hugely relevant in the vending machine example, but other state machines may allocate data on the heap (if using a heap) or have destructors that cause side effects, and we would only want to trigger those when actually leaving the state.

The image below shows how the State type will be arranged in memory. The first word will be the tag - an integer that indicates which variant is active: Idle, Error, or FetchingItem. The second word is occupied by funds if the state is Idle, otherwise it is error or itemRow, depending on if the state machine is in the Errored or FetchingItem state respectively. The third word will hold itemCol when in the FetchingItem state, and is unused in all other states.

The state machine as a whole always occupies 3 words of memory, it's just that the meaning of these words varies based on which state is active. When compared to an approach in which all state is always persisted and shared, we saved 2 words by folding funds, error and itemRow together. This is possible because they're state-local variables that never need to exist simultaneously. The memory savings can be impressive when it comes to large state machines with many variables. When compared to a virtual dispatch approach, we still saved one word by avoiding any vtable pointers.

Possible memory layout of the State type.

Finally, we might add more events, like a CoinInserted event. At this point, we could make the dispatchEvent function take a variant over the possible events, and then the visitor would visit some function that matches both the state and the event type. While we're at it, we can remove the ugly overloaded lambda and name our event handlers by elevating the dispatchEvent function into a Fsm base class. Here's what the user code might look like:

struct CoinInserted {
    /// Value of the coin, in pennies.
    int value;
};

using Event = std::variant<ButtonPressed, CoinInserted>;
using State = std::variant<Idle, Error, FetchingItem>;
// Note that the Fsm takes the derived class as a template parameter. This is
// because it needs to know where to find our `onEvent` handlers.
struct VendingMachine : public Fsm<VendingMachine, Event, State> {
    // Variables that should be shared across all states can be placed below.
    int stock[NUM_ITEMS];
    int prices[NUM_ITEMS];
    // What to do when in the Idle state and a ButtonPressed event occurs.
    State onEvent(Idle&& state, ButtonPressed event) {
        int cost = prices[event.index];
        if (stock[event.index] <= 0) {
            return Error{ErrorType::OutOfStock};
        } else if (cost > state.funds) {
            return Error{ErrorType::InsufficientFunds};
        } else {
            giveChange(state.funds - cost);
            --stock[event.index];
            return FetchingItem{event.index/8, event.index%8};
        }
    }
    // Ignore button presses in all other states.
    State onEvent(FetchingItem&& state, ButtonPressed) {
        return std::move(state);
    }
    State onEvent(Errored&& state, ButtonPressed) {
        return std::move(state);
    }
    State onEvent(Idle&& state, CoinInserted event) {
        state.funds += event.value;
        return std::move(state);
    }
    // Spit the coins back out in non-idle states.
    // Note: we can use a template here to avoid repeatitition.
    // C++ function overloads work such that events associated with the Idle
    // state will be processed by the explicitly defined function above instead
    // of this template.
    template <typename S> State onEvent(S&& state, CoinInserted event) {
        giveChange(event.value);
        return std::move(state);
    }
private:
    void giveChange(int value);
};

And here's the library code that handles all the ugly bits.

/// Calls the appropriate `onEvent(State, Event)` function of the derived class.
template <typename Implementor, typename State> struct EventDispatcher {
    EventDispatcher(Implementor& self_) : self(self_) {}
    template <typename ActiveState, typename Event>
    State operator()(ActiveState&& state, const Event& event) {
        return std::move(self.onEvent(std::move(state), event));
    }
    Implementor& self;
};

/// Base type for all Finite State Machines.
template <typename Implementor, typename Event, typename State> struct Fsm {
    /// dispatch an event variant to the appropriate handler, based on the current state.
    void dispatch(const Event& event) {
        // dispatch the event and obtain the new state.
        EventDispatcher<Implementor, State> dispatcher(*self);
        this->fsmState = std::move(std::visit(
            dispatcher, std::move(fsmState), event));
    }
private:
    State fsmState;
};

Surprisingly simple, though a bit dense.

RAII

One of the nice things about a variant-based approach to state machines is that we can use the Resource Acquisition Is Initialization (RAII) design pattern because the state's destructor is always called when we transition into a new state. For example, if the FetchingItem state requires various hardware power supplies to be active but other states don't, we can bring up the power supplies in the constructor of FetchingItem, and turn them off in its destructor. This could be more reliable than manually doing that for each possible transition to/from the FetchingItem state (especially within the context of exceptions, if applicable).

In closing

This loose framework should work for many scenarios. The library code, along with examples, is available here, and has also been extended to support callbacks whenever a state is entered or exited (for when RAII isn't applicable). If you want to make use of variants, but are stuck on a non-C++17 codebase, take a look at the fantastic mpark library: a drop-in replacement for std::variant for C++11 and up.

I wouldn't have considered using variants in C++ for this sort of purpose due to perceived ergonomic issues. Kalle Huttunen showed that it doesn't have to look that terrible in his post, Implementing State Machines with std::variant. This work merely improves upon that by providing further ergonomic improvements and by avoiding copies when not transitioning states, thereby allowing RAII.