Free Monad: Discussion on when to use it, maybe you are using it already but don't know it!

Free Monads are popular these days but they are still not in widespread use primarily because of their syntatic complexity and high abstraction level. Free monads allow you to operate in monadic structures with "commands" or "instructions" that mimic imperative programs. Imperative programming is fairly easy as each instruction is sequenced and tracing the flow of logic is mostly easy. However, in reactive programming for example web programming that involves "callbacks", it is much harder to trace the time-based sequencing because callbacks can be called at any time. This is especially true in high-latency enviroments such as the internet.
If we skip the concept of free monads for a moment, we see that no matter where we look, when we program in a complex enviroment like the internet, we often create Domain Specific Languages to deal with the complexity. In both c# and scala, as well as javascript to some degree, the concepts of async/await have been introduced to make callbacky programs look imperative:
def getLhs(): Future[Int] = ...
def getRhs(): Future[Int] = ...

val myresult = async {
  val lhs = await(getLhs())
  val rhs = await(getRhs())
  lhs + rhs
myResult.flatMap{ ...use lhs+rhs... }
The reality is though that once in Future (or its equivalent) you remain in it and its difficult to escape. This is true for javascript Promises or any other construct that "contains" a value. The async/await Future-based concept has been generalized to any monadic structure using monadless/monadless:
def getLhs(): Task[Int] = ...
def getRhs(): Task[Int] = ...

val myresult = lift {
  val lhs = unlift(getLhs())
  val rhs = unlift(getRhs())
  lhs + rhs
myResult.flatMap{ ...use lhs+rhs... }
A Future, javascript Promise or other construct that captures asynchronous behavior, are examples of Monads. Monads can be considered a "container" for a value. A List is monadic as it holds a set of ordered values. A Future is a monad because it holds the future result of a computation. A Task is a monad because it holds a result of computation but allows you to choose when you wish to start processing to obtain the value. A Monad also helps you sequence operations. In the case of a Future, you can map/flatMap into the Future and your "program" runs on the Future's result once the Future has produced the value. In this sense you have sequenced your computation.
But we see that sequencing a computation on a Future, for example, is equivalent to having a set of "instructions" that are run by a smart interpreter:
val instructions = Seq(makeWebCall, useResult, moreComputations1, ...)
We need an interpreter that knows when you encounter the instruction makeWebCall, that this must be performed asynchronously and that useResult should not be processed until the result of makeWebCall has completed and that when useResult is called, it should have access to the result of makeWebCall.
Instead of a sequence of instructions like above, we could also use the more common idiom of:
def makeWebCallFut(): Future[Int] = ...

makeWebCallFut().map(result => result + 2).map(result => moreComputations1(result))
In this case, the monadic machinery that implements map or flatMap takes care of sequencing the operation. The mapapproach is much shorter and clear while the approach using instructions is more abstract.
The approach using instructions is a common "interpreter" design approach and many programs have the concept of an interpreter built into them just by the very fact that if you use map or flatMap you are essentially building an interpreter directly in code. The CQRS design pattern, popular today in large systems and "state" machinery such as Redux for Reactjs on the web, are examples of instructions. Each "command" is a mutation instruction of some sort. In fact, the idea of sending instructions is explicit in the Actor model where each "message" can be considered a "command." It's obvious that once you abstract your program to a sequence of "instructions" its harder to understand but its better than the alternatives sometimes, e.g. a contorted mess of map/flatMaps or other programming pradigm constructs. In the area of asynchronous processing, streams and go-routines have become popular ways to help manage the abstraction complexity--all with good results for certain kinds of problems. In fact, different languages provide different ways to handle the abstractions. You could use, for example, lisp, and have the interpreter and associated data structures be explicit and built into the language. Or, you could use haskell or scala, and leverage the free monad with type safety but with increased boilerplate.
The Free Monad is one approach to managing the complexity and can be implemented with or without type safety in different languages. Let's say you choose scala for your program. The real question of when to use the free monad is "when is the free monad's complexity buy you more than it's abstraction/boilerplate cost?" For simple to moderate problems, you can probably use standard functional composition to manage complexity.
For example, when you need to access web resources, say through HTTP, you may develop your own "fetch" client. Fetch is a common verb used to describing obtaining asynchronous results from a web service. Fetch is a function in javascript, a library in scala from 47degrees (on github) and common names of methods in web service libraries. But let's consider that we need to use a fetch concept in our program an we need some special processing, such as that for OData, a restful, data access standard that is much like SQL for web services.
Fairly quickly, when writing your own fetch, you realize that you have a request-response paradigm. Its asynchronous, so you typically represent the result as a Future or wrap in a Task and you have several "handling" issues. Handling refers to both header processing, since the result can be returned as a header value, body processing, which may be asynchronous for large responses, or error handling, in case the request fails in some way or header/body processing fails. Because you have alot of variety, you fairly quickly write a higher order function with different methods to handle these areas:
def fetch[R](requestParams, handleError, handler): Future[R] = { ... }
You fairly quickly realize that you could "bury" some aspects of handling into the function e.g.:
def fetchReturnError[R](requestParams, handler): Future[Either[Throwable, R]] = { ... }
// or
def fetchLogErrorsReturnJson(reuestParams, handle): Future[JsValue] = { ... }
// or
// ...
In other words, you start providing different entry points depending on the error handling and body handling logic. Or you could allow your fetch parameters carry that burden. For example, you could start building up handlers using functional composition to handle many different scenarios.
For small surface areas, the approaches listed above work fine. But you do realize there is a pattern and trade-offs. In the case above where you provide multiple functions or bury more advanced handling in the handler, you are actually embedding "instructions" directly in the function (and using a special name to indicate what "instruction" it perfoms) or you are adding "instructions" through function composition when you build up your handler e.g. val handler = checkStatus andThen checkForValidHeaders andThen convertBodyToJson. You are really just moving complexity around and in some cases handling it better and in some cases, not handling it better. Notice also that the handler functional composition model looks like a series of instructions sequenced using andThen.
The challenge we face is that in addition to some of these behaviors we know need to be "parameterizid" in some way, we need to adopt to change. For example, if we want to add automatic "retry" behavior, we would need to change the API above. Some libraries allow you to specify the retry behavior through "strings" and "service providers" such that mentioned "retry" in a string parameter instructs fetch to load a service provider that implements that "retry" behavior. There is nothing wrong with this approach as long as you cover all the types of retry behaviors you want to implement. The cost of this approach is uing a service loader model and making sure the strings match up to the right service provider. You may choose to use dependency injection frameworks to ensure the right "strategy" implementation is injected into the "fetch" object. In a type safe world, we like to be more specific and have the compiler check these things for us. The idea is that there are different ways to do the same job.
Another type of change could be that we want use a different "monad" to contain the asynchronous computation. For example, you may wish to use a Future, or a Task or just Id and have things run synchronously. Newer libraries give you choice, but many you do not. In javascript, many APIs are converting on the Promise interface, which looks like a scala Future. But what if you need to use a different Promise? Some polyfill API's allow you switch out the Promise interface for another, essentially changing the "effects." But the standard javascript APIs do not allow this. As javascript becomes popular on the server, you may need more choice. Some libraries like scala's fs2 library, make it easy to switch out your "effect", others do not.
It is these types of complexities that may motivate finding a better abstraction for your specific program. A free monad may be that abstraction. The key elements of a free monad are:
  • You need an instruction set. The instruction set is specific to your application.
  • You use monadic data structures to "write down" your instructions in the sequence you want to execute them. This is in contrast to a simple "list" of instructions. The instructions can be super fat, and do alot of work, your instructions can be low-level, equivalent to "getting" a resoure off an object.
  • You need an interpreter to run over the instructions. The interpreter may interpret the instructions in a specific way or alter the instruction flow. The interpreter may also contain state to capture intermediate results. An interpreter can also be pure.
In lisp, with or with static typechecking (yes you can do that in lisp), you can use the basic "list" structure to capture your instructions an an interpreter is easy to write using pattern matching. In other languages with static type checking, such is scala or haskell, you have some more boilerplate to contend with, but the ideas are pretty much the same. Instead of a list of instructions, you capture your instructions in the "flatMap" part of the free mondad.
The free monad allows you to compose parts of your program at two different levels. First, by specifying the instructions in the sequence you wish, you obtain the ability to compose "functions" in the order you want to execute just like the example with andThen above. Second, you can add different instruction sets together. Adding different instructions together takes more boilerplate with your instruction set and intpreters but it is straight forward boilerplate.
So is a free monad worth it? In addition to the above considerations, you may also want to look at the type of application you are building. Fetching web resouces is a low level task. Many programmers just want a simple API and use a giant try-catch block because their application requirements do not call for more than that. Perhaps they can rely the external enviroment, such as an enterprise scheduler, to retry operations by rerunning the entire application. There is nothing wrong with this approach as long as it fits. This is an example of pushing complexity around the eco-system to achieve your objectives. The world of virtual machines, "let it fail" for actors, cacheing systems and other ecosystem level enviroments are designed to offload complexity from your application and shift it into other parts of the architecture where they can be handled better, cheaper or easier.
Just like all of these other approaches, the fee monad is not a free lunch. You may wish to hide the complexity of the free monad in a library that others will use, such 47degrees' fetch. Or, you may wish to expose the free monad in your API such as in doobie, a free monad layer on top of java's SQL API.
I think it is certainly a true statement that once you want to go beyond the imperative or the simple asynchronous program, complexity increases very, very fast. Hence, looking at toy examples is not helpful in understanding whether the trade-off is worth it since toy examples do not "model" or "show" the complexity of a larger program. If you have to have high performance and asynchronous and mix in different remote resources using different APIs, programs are complex almost no matter what abstraction you use. If you are gluing together these abstractions yourself, that's alot of work no matter what. If you are trying to use a framework that hides the complexity from you, sometimes you hit a roadblock with that framework quickly if your problem does not match the framework's design intent. Ouch!
In another post, I'll provide more examples.


Popular posts from this blog

zio layers and framework integration

typescript and react types

dotty+scala.js+async: interesting options