Stateful functions as a service

One of the trends in cloud computing that I’m most excited about is stateful functions as a service (stateful FaaS). The technology is still in its very early stages, but I think it’s the next leap for serverless.

Serverless functions today

The serverless functions we know today are services like AWS Lambda and Google Cloud Functions. They essentially consist of one massive load-balancing layer that receives all requests (for HTTP functions) and events (for event-processing functions) and routes them to a machine that is running your function, dynamically spinning machines up and down depending on load.

The result is a lovely developer experience where you just bundle up your code and hand it over to the cloud provider. You don’t have to worry about where your code runs or how it scales. And when there aren’t any requests or events to handle, it doesn’t cost anything!

These functions are stateless, meaning they have no persistent disk and no memory shared between invocations. The only way they can keep state is by connecting to an external data system, such as Postgres, Redis or FaunaDB. That’s not a problem for many ETL and CRUD applications – most services don’t rely on local disk or memory between requests anyway.
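
To make the stateless pattern concrete, here is a minimal Python sketch. The `ExternalStore` class and the `handle_request` function are hypothetical names invented for illustration; the store is an in-memory stand-in for a real system such as Redis or Postgres, not any provider's actual API.

```python
from typing import Dict


class ExternalStore:
    """Hypothetical stand-in for an external data system (e.g. Redis or Postgres)."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> int:
        # Missing keys read as zero so counters can start from nothing.
        return self._data.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._data[key] = value


def handle_request(store: ExternalStore, page: str) -> int:
    """A stateless handler: everything it remembers lives in the external store."""
    count = store.get(page) + 1
    store.put(page, count)
    return count
```

The handler itself holds nothing between calls, so any machine can serve any request, at the cost of a round trip to the store each time. Note that this read-then-write is not atomic; a real store would need an atomic increment or a transaction to be safe under concurrent requests.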

Use cases for stateful serverless functions

Nonetheless, there’s a long tail of interesting applications that rely on local state, and I think they’ll become more popular in the future. They’re things like:

  • Streaming aggregations that roll up many events over a period of time. For example, counting unique visitors per page in real-time.
  • Streaming joins that pair events from different sources. For example, joining observations with matching timestamps from two different IoT sensors.
  • Real-time collaboration, where many users edit the same object and receive updates. For example, multi-person document editing or multi-player games.
  • Low-latency metadata lookups at scale. For example, routing tables or looking up user permissions.
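
As a rough illustration of the first use case, the sketch below counts unique visitors per page using state held in local memory. The class and method names are hypothetical, and it assumes that all events for a given page reach the same instance and that the instance's memory survives between events, which is exactly what today's stateless functions do not guarantee.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class UniqueVisitorCounter:
    """Keeps per-page visitor sets in local memory (assumed to persist between events)."""

    def __init__(self) -> None:
        self._seen: DefaultDict[str, Set[str]] = defaultdict(set)

    def on_event(self, page: str, visitor_id: str) -> int:
        """Record one page view and return the page's current unique-visitor count."""
        self._seen[page].add(visitor_id)
        return len(self._seen[page])
```

Storing full sets of visitor IDs is exact but grows with traffic; at scale you would likely swap in an approximate structure such as HyperLogLog.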

Today, you need machines with dedicated memory and disk space to handle these use cases. And you might also need a distributed configuration store like Consul to manage sharding, or a stream processing system like Flink to coordinate and shuffle data between machines.

How will they work?

Which use cases stateful functions can serve depends on the guarantees the platform is able to provide. The dream is for a function invocation to treat global state as an in-memory object that it manipulates without interference from concurrent functions. In practice, that’s probably not feasible and you might need some map/reduce-style logic for merging distributed state changes.
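
One way to picture such merge logic is a grow-only counter, a well-known conflict-free replicated data type (CRDT). This is a technique chosen here for illustration, not something any particular platform is known to use. Each replica only increments its own slot, and merging takes the per-replica maximum, so merges can happen in any order and be repeated safely. The function names are hypothetical.

```python
from typing import Dict

# State maps a replica ID to the number of increments that replica has applied.
CounterState = Dict[str, int]


def increment(state: CounterState, replica_id: str, amount: int = 1) -> CounterState:
    """Return a new state in which `replica_id` has counted `amount` more events."""
    updated = dict(state)
    updated[replica_id] = updated.get(replica_id, 0) + amount
    return updated


def merge(a: CounterState, b: CounterState) -> CounterState:
    """Combine two replica states by taking the per-replica maximum."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def value(state: CounterState) -> int:
    """The counter's total is the sum over all replicas."""
    return sum(state.values())
```

Because `merge` is commutative, associative and idempotent, replicas converge to the same total regardless of how their states are exchanged. The trade-off is that only some kinds of state (counters, sets, registers with a tie-break rule) fit this mould.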

There are some systems today that can claim they’re a “stateful FaaS platform” (see the resources below), but it’s far from a solved problem. There are many things to consider, such as consistency, isolation, exactly-once processing, latency, hot keys, etc. Under the hood, the platform will have to coordinate where to run functions depending on which physical machines hold what state, and also how to merge changes consistently. Data will have to be shuffled and replicated between machines depending on load. These problems become even harder if you want to run stateful functions in different geographic regions to get lower end-user latency.
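
The placement problem can be sketched in its simplest form as routing each state key to the machine that owns it. The example below is a deliberately naive assumption, not a description of how any existing platform works: it hashes the key and takes the result modulo the number of machines. The function name and machine names are hypothetical.

```python
import hashlib
from typing import List


def owner_for_key(key: str, machines: List[str]) -> str:
    """Pick the machine that holds the state for `key` using a stable hash."""
    if not machines:
        raise ValueError("at least one machine is required")
    # Python's built-in hash() is randomized per process for strings,
    # so use SHA-256 to get the same answer on every machine.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(machines)
    return machines[index]
```

Modulo hashing reassigns most keys whenever the machine list changes, which would force a large reshuffle of state on every scale-up or scale-down. A real platform would more likely use consistent hashing or an explicit placement table, and would also need to handle hot keys and replication.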

A world computer

If your serverless functions can manipulate global state without worrying about scalability, consistency or latency, you can effectively treat the stateful FaaS platform as one giant machine that never goes down. You could get rid of your external database altogether – just write data to the global state, and read or update it in later function invocations. I like the idea of a “world computer”, a term the Ethereum project uses (Ethereum’s smart contracts are essentially stateful functions, although not scalable).

In theory, it’s the end state for the cloud, where you write your code as if it runs on one giant server that scales infinitely and is responsive across the globe. In practice, there are probably going to be caveats, but it’s going to be exciting to see where it leads.

Resources

Here are some useful resources for learning and thinking about stateful serverless functions:

© Benjamin Egelund-Müller 2021