Scala and the State monad: no free lunch

I was working on using the State monad in my cake example, but needed to fiddle with some simple examples first. The code below can run in the Scala REPL. The main thought is that if your system is already set up to handle the state type, say by sharing a cache, then you modify your code to return states, and the State monad can help you sequence the calls and ensure that the state information is available for each function call.

However, if you are accessing resources that do not return state information structurally aligned with your state value, you still have to use some ugly code to convert a function's return value to the structure needed by your state. When using the State monad, you can pull that code out of the functions you are calling, which is good, but you still have that code and it's still messy. No surprise here. It's clear, though, that the components you are using must be designed to use the same state structure.

This last point effectively makes this design pattern most useful when you are designing a set of related components. Of course, if it's a set of related components, you could do this in other ways, for example by making a class that holds the state, which the functions access directly. You just need to initialize the state once. This class could also sequence the computation, and you would not need to have the functions return a state object.
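To make the class-based alternative concrete, here is a minimal sketch. The names CounterPipeline, addOne and double are hypothetical and the state is just an Int; the point is only that the class owns the state, initializes it once, and sequences the steps itself.

```scala
object ClassSketch {
  // Hypothetical example: the class holds the state in a private mutable field.
  final class CounterPipeline(initial: Int) {
    private var state: Int = initial

    // The steps read and update the shared field directly;
    // none of them returns a state object.
    private def addOne(): Unit = state += 1
    private def double(): Unit = state *= 2

    // The class also sequences the computation.
    def run(): String = {
      val start = state
      addOne()
      double()
      s"$start -> $state"
    }

    def current: Int = state
  }
}
```

This reads easily, but the state is mutable and the steps are tied to the class, which is the trade-off against the State monad version.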

Hence, the State monad reduces complexity when all your functions carry state or values aligned with the other functions of interest. If they do not, you still have messy code. The State monad helps here, but less dramatically, since you now need wrappers to adapt the information. In essence, you are taking a function parameter and turning the original function into a function that returns a function that takes that parameter. I am not sure that approach is automatically the easier-to-understand choice.

In other words, there is no free lunch.

Imagine you wrote a large number of functions to process data from a database ORM. The ORM uses a session, which is a form of state. Often, that state is kept in a thread-local global variable. If you did not want to do that and used the State monad instead, you would have to modify your functions to return a new state given the current state, as well as the "value" that the function previously produced. In this case, you still have to change your code to employ the pattern: either the functions must now return a state, or you have to wrap them inside your for-comprehension with some code that calls them correctly.
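Here is a rough sketch of that ORM scenario. Everything in it is an assumption made for illustration: State is a small hand-rolled class (not the scalaz one), Session is a stand-in that only counts queries, and findName, countOrders and withSession are hypothetical names.

```scala
object SessionSketch {
  // Hand-rolled stand-in for a State monad: wraps S => (new state, value).
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State { (s: S) => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State { (s: S) => val (s1, a) = run(s); f(a).run(s1) }
  }

  // Hypothetical ORM session; here it only counts how many queries ran.
  final case class Session(queries: Int)

  // Existing functions: they take the session and return a plain value.
  def findName(session: Session, id: Int): String = s"user-$id"
  def countOrders(session: Session, name: String): Int = name.length

  // The adapter code that "has to live somewhere": it turns a
  // session-consuming function into a State and updates the session.
  def withSession[A](f: Session => A): State[Session, A] =
    State((s: Session) => (s.copy(queries = s.queries + 1), f(s)))

  // The original functions are wrapped inside the for-comprehension.
  val report: State[Session, String] = for {
    name   <- withSession(findName(_, 42))
    orders <- withSession(countOrders(_, name))
  } yield s"$name has $orders orders"

  val (finalSession, text) = report.run(Session(0))
  // finalSession == Session(2), text == "user-42 has 7 orders"
}
```

The original functions are untouched, but the adapter is still extra code, which is the point of the paragraph above.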

The real impact on your code from using the State monad is that your functions need to return a function, or you have to write some functions to adapt your functions. The State object wraps a function. Essentially, your function must now return a function instead of its original return value.
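The shape change can be shown without any State class at all. In this sketch the Cache type and the lookupDirect and lookup functions are hypothetical; the "before" version takes the state as a parameter, the "after" version returns a function from the state to a tuple of the new state and the value.

```scala
object CurrySketch {
  // Hypothetical state: a simple cache shared between calls.
  type Cache = Map[String, Int]

  // Before: the state is an ordinary parameter and the result is a plain value.
  def lookupDirect(cache: Cache, key: String): Int =
    cache.getOrElse(key, key.length)

  // After: the function returns a function from the state to
  // (new state, value). This is the shape a State object wraps.
  def lookup(key: String): Cache => (Cache, Int) = cache => {
    val value = cache.getOrElse(key, key.length)
    (cache.updated(key, value), value)
  }

  // Threading the state by hand; a State monad automates this step.
  val (cache1, first)  = lookup("scala")(Map.empty)
  val (cache2, second) = lookup("scala")(cache1)
  // first == 5, second == 5, cache2 == Map("scala" -> 5)
}
```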

Inside the returned function, you can use the state, or any other information that is accessible, to create the new state and a value. In other words, the complexity you may have in the original function that accesses and manipulates state can be moved out of that function into another function. In the case where the state manipulation is the same, for example if all the clever functions you are writing return the same kind of value, then the State monad can pull code out of your original functions nicely. But there are other OO ways to handle this as well, and they are probably more readable if the functions you are sequencing do not have uniform interfaces. The State pattern is probably best applied when you are writing several functions in your library/application and you control the API. It appears to be less a pattern for integrating disparate functions with different APIs than one for sequencing/composing a set of functions with aligned APIs.

But there is value in pulling out and structuring your code to adapt different functions to use State and a for-comprehension, because it forces the programmer to structure the code in a specific way. This is valuable on its own: it helps decrease coupling and increase cohesion. The code for adapting functions has to live somewhere.

For those of you who wish to think through how the State monad works, think of it like this:
  • A State monad is a class that wraps a function. The wrapped function must take a value, called the state, and produce a tuple of the new state and a value. To use the State monad, you have to alter what your original functions return, adapt them with other functions, or adapt them inside the for-comprehension. You cannot just use the same functions directly.
  • You can define your state function using several different methods in the API. These methods let you place a value into a state (e.g. put), which defines an initial state, create a state-to-state function, create a state-to-tuple function, or run your functions via the "run" function. You must provide an initial state to run the state-enabled function you have created because, like all functions, it has to be called with an argument ("run" is equivalent to calling the function).
    • As a small digression, syntax like put[Int], where Int is your state type, looks strange until you remember that put[Int] returns a State whose wrapped function is designed to put a value into the state object. When you eventually run the function you are composing, put[Int] as the first function essentially says: take the initial state value you provide (e.g. the integer 10) and run put[Int] on it. put[Int] then places the value 10 into an instance of the State class, and you now have a state to start your processing sequence. Think of put[Int] as a more expressive way of constructing a State object once it has a value to initialize the state with. There are other ways to create the initial State object from a programmer-supplied value.
  • Because functional languages allow you to compose functions, when it comes time to run a sequence of functions together, you can compose your functions using the State monad. After all, they are themselves functions that return functions wrapped in a State class. Composing a series of functions together creates the "sequence" or "composition." You can use a for-comprehension to do this easily.
  • If you compose multiple functions together, you now have a new function. That function must be "run" in order to return a result. Here "run" just means you need to call the composed function. You can call it using Scala's function application syntax (myFunc(arg)) or the methods "run", "eval" or "exec." Each of these returns a different combination of the final state, the final value produced by the last function in the composition, or both pieces of information in the tuple mentioned above.
  • Since State is a monad, you can use a for-comprehension to make the syntax look cleaner.
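
The points above can be sketched with a small hand-rolled State class. This is an illustration under stated assumptions, not the scalaz implementation: the method names (run, eval, exec, get, put, modify) mirror the ones discussed in this post, but the definitions here are my own minimal versions, and the counter program is a made-up example.

```scala
object StateSketch {
  // A class that wraps a function from a state to (new state, value).
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State { (s: S) => val (s1, a) = run(s); (s1, f(a)) }
    // flatMap threads the new state into the next step; this is the sequencing.
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State { (s: S) => val (s1, a) = run(s); f(a).run(s1) }
    def eval(s: S): A = run(s)._2 // final value only
    def exec(s: S): S = run(s)._1 // final state only
  }

  // Small helpers for building State values (minimal versions, see lead-in).
  def get[S]: State[S, S] = State((s: S) => (s, s))
  def put[S](s: S): State[S, Unit] = State((_: S) => (s, ()))
  def modify[S](f: S => S): State[S, Unit] = State((s: S) => (f(s), ()))

  // Composing steps with a for-comprehension builds one new function.
  // Nothing executes until an initial state is supplied.
  val program: State[Int, String] = for {
    start <- get[Int]
    _     <- modify[Int](_ + 1)
    _     <- modify[Int](_ * 2)
    end   <- get[Int]
  } yield s"$start -> $end"

  val both  = program.run(10)  // (22, "10 -> 22"): final state and value
  val value = program.eval(10) // "10 -> 22"
  val state = program.exec(10) // 22

  // put replaces the state, so a program can also set its own starting point.
  val reset = put(10).flatMap(_ => program).run(0) // (22, "10 -> 22")
}
```

Note that run, eval and exec all need the initial state as an argument, because the composed program is still just a function waiting to be called.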
Benefits (all benefits are relative to a baseline, which we consider to be OO Java-style coding):
  • It's easier to make your state object immutable.
  • You can pull the code that manipulates the state object out of your original function and not have to create another "wrapper" class around all your functions to do it.
  • You can simplify the final coding of the sequencing so it's easier to read the intended high-level logic of your algorithm.

The code can be found in a gist on GitHub.

There are also some other follow-on posts on this topic that provide extensive examples and usage of the different State methods and for-comprehension idioms.
