null was not the billion dollar mistake


There is always much buzz about programming languages and their features. One claim, famously made by Tony Hoare about his own invention of the null reference, is that “null” was a billion dollar mistake.

I think that’s not actually true. I think the billion dollar mistake was using null in a programming language that has too few tools to manage null as a value. The issue is actually not about “null” but about “worlds.”

Let me explain.

My perspective is informed by polyglot programming. For example, you can do polyglot programming with the same “value” on Oracle’s GraalVM platform. You can use both dynamically typed and statically typed languages on the same value, since the same values can be accessed from different languages. Once you start switching between languages operating on the same value, you realize that the issues in managing “values” are usually shaped by programming language limitations. Some languages are better than others at handling particular concerns around values.

Take a value “a”. There are concerns about “a” that may need to be represented in your program to solve a problem.

  • The absence of “a”.
  • A list of “a”.
  • “a” being available for use at some point in the future, but not now.
  • An error in producing “a”.
  • Documentation about “a”.
  • The history of “a” values.
  • Moving “a” from one processing environment to another.
  • Documentation on the history of processing “a” in a program.
  • Persisting “a” for later use.

Programming languages impose constraints on how these concerns are represented. Some concerns must be addressed completely outside the programming language. For example, in JavaScript, a list of “a” is captured in an array, which is really just an object with special handling for integer keys. “Absence” is handled by “null” or “undefined.” In Java, a list of “a” can be represented with several different classes. Subtyping in Java lets most, but not all, of those representations look the same (java.util.List), so it’s less work to use a list even if it’s represented differently underneath. You can think of these concerns and how they are represented as “metadata” about “a”.
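To make the list of concerns concrete, here is a minimal sketch of how a few of them map onto Scala standard-library types. The case class A and the object name Concerns are made up for illustration. Concerns such as documentation, history, and persistence have no direct counterpart here and are typically handled outside the language.

```scala
import scala.concurrent.Future
import scala.util.{Failure, Try}

object Concerns {
  // Hypothetical value type standing in for "a".
  final case class A(id: Int)

  // The absence of "a": Option instead of null.
  val absent: Option[A] = None

  // A list of "a": Seq hides whether the data lives in a List or a Vector.
  val asList: Seq[A] = List(A(1), A(2))
  val asVector: Seq[A] = Vector(A(1), A(2))

  // "a" being available at some point in the future (already completed here).
  val later: Future[A] = Future.successful(A(3))

  // An error in producing "a".
  val failed: Try[A] = Failure(new IllegalStateException("could not produce a"))
}
```

Each concern gets its own wrapper type, and each wrapper comes with its own rules for getting at the underlying “a”.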

In other words, programming languages impose constraints on how to address some of these concerns while leaving others to user-specific methods (think frameworks and tools). A programming language may have features and syntax to help address some of them.

In a language designed to handle the absence of “a”, say using special syntax, null is fine and value management in the presence of nulls is generally easy to do. JavaScript has “?.” and “??” syntax support. Some languages, like Java today (Optional<a>) and Scala (Option[a]), use an option type to represent missing values even though null exists in those languages. Having both Option and null can be confusing, as some libraries will use null and some will use Option. In Scala and Java there is no special null syntax, but it is relatively easy to wrap and unwrap values in Option to express “absence of a”.
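As a small sketch of that wrapping and unwrapping in Scala: Option(...) converts a possibly-null reference into an Option, and orNull converts back. The lookupLegacy function is a hypothetical stand-in for a library that signals absence with null.

```scala
object NullInterop {
  // Hypothetical legacy API that signals absence with null.
  def lookupLegacy(key: String): String =
    if (key == "known") "value" else null

  // Option(...) turns null into None and any other reference into Some.
  def lookup(key: String): Option[String] = Option(lookupLegacy(key))

  // Work with the value without null checks, then pick a default at the edge.
  def describe(key: String): String =
    lookup(key).map(_.toUpperCase).getOrElse("<missing>")

  // orNull converts back when a null-based library expects it.
  def toLegacy(value: Option[String]): String = value.orNull
}
```

The conversion is cheap to write, but it has to be written at every boundary between a null-based and an Option-based library.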

As another example using Scala, there is debate around the tagless-final style. The core idea behind tagless-final is to delay the decision around a “container” for as long as necessary:

// M is the "container" chosen later by the implementation
trait Repo[M[_]] {
   def getValue(arg: Int): M[Value]
   def maybeValue(arg: Int): M[Option[Value]]
   def getList(): M[Seq[Value]]
}

Then there is great gnashing of teeth around things like monads. M can represent Try, a List, an effect type such as cats-effect IO, or any object that can carry an error state (sometimes untyped).
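A minimal sketch of what choosing M looks like, using only the standard library. The Value case class, the Id alias, and the two repository objects are assumptions made for illustration; the post itself leaves Value undefined.

```scala
import scala.util.Try

object TaglessRepo {
  // Hypothetical value type.
  final case class Value(n: Int)

  trait Repo[M[_]] {
    def getValue(arg: Int): M[Value]
    def maybeValue(arg: Int): M[Option[Value]]
    def getList(): M[Seq[Value]]
  }

  // M = Try: every call can fail, with the exception captured as a value.
  object TryRepo extends Repo[Try] {
    def getValue(arg: Int): Try[Value] =
      Try(if (arg >= 0) Value(arg) else throw new IllegalArgumentException("negative"))
    def maybeValue(arg: Int): Try[Option[Value]] =
      Try(if (arg == 0) None else Some(Value(arg)))
    def getList(): Try[Seq[Value]] = Try(Seq(Value(1), Value(2)))
  }

  // M = Id: no container at all, which can be handy in tests.
  type Id[A] = A
  object IdRepo extends Repo[Id] {
    def getValue(arg: Int): Value = Value(arg)
    def maybeValue(arg: Int): Option[Value] = Some(Value(arg))
    def getList(): Seq[Value] = Seq(Value(1))
  }
}
```

Note that Option and Seq stay fixed in both implementations; only the outer container varies.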

I think a lot of this thinking misses the mark.

M can represent some things, but it completely misses other problems that cause programming friction. For example, to be truly generic we need to abstract it out a bit more and allow different implementations based on what’s needed:

// M => asynchronous/error, L => lists, O => absence
trait Repo[M[_], L[_], O[_]] {
   def getValue(arg: Int): M[Value]
   def maybeValue(arg: Int): M[O[Value]]
   def getList(): M[L[Value]]
}

In other words, tagless-final is not tagless enough.

From my polyglot perspective there are still too many assumptions being made when using just M. M addresses some concerns but not others. For example, how a list of “Value”s is represented, or how a missing value is represented, is not necessarily addressed by a single M. These concerns create friction in programs.
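As a rough sketch of the three-parameter version, here are two implementations that pick different representations for each concern. The names Value, Id, StdRepo, and DirectRepo are hypothetical, and Array stands in for a platform-specific list type.

```scala
import scala.util.Try

object ThreeConcerns {
  // Hypothetical value type.
  final case class Value(n: Int)

  // M => asynchronous/error, L => lists, O => absence
  trait Repo[M[_], L[_], O[_]] {
    def getValue(arg: Int): M[Value]
    def maybeValue(arg: Int): M[O[Value]]
    def getList(): M[L[Value]]
  }

  // Standard-library choices: Try for errors, List for lists, Option for absence.
  object StdRepo extends Repo[Try, List, Option] {
    def getValue(arg: Int): Try[Value] = Try(Value(arg))
    def maybeValue(arg: Int): Try[Option[Value]] =
      Try(if (arg == 0) None else Some(Value(arg)))
    def getList(): Try[List[Value]] = Try(List(Value(1), Value(2)))
  }

  // Same interface, different representations: no effect wrapper, Array for lists.
  type Id[A] = A
  object DirectRepo extends Repo[Id, Array, Option] {
    def getValue(arg: Int): Value = Value(arg)
    def maybeValue(arg: Int): Option[Value] =
      if (arg == 0) None else Some(Value(arg))
    def getList(): Array[Value] = Array(Value(1), Value(2))
  }
}
```

Code written against Repo[M, L, O] would still need some way to operate on L and O generically, which is where the cost of the abstraction starts to show.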

In the Scala.js world, I’d rather use js.UndefOr and js.Array to represent “unknown” and “list” concepts since they are more efficient. Converting between values, something all languages support, may be impractical for some values. A programming language and its target deployment model (e.g. the JVM) force too many assumptions on the abstraction.

Perhaps there is a W = World such that:

// Pseudocode: W stands for a bundle of type constructors, one per concern
trait Repo[W] {
   def getValue(arg: Int): W[Value]
   def maybeValue(arg: Int): W.OptionalConcept[Value]
   def getList(): W.ListConcept[Value]
}
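One way to approximate this in Scala that actually compiles is to model the world as a trait with abstract type members and use path-dependent types. This is only a sketch under that assumption: the names World, Effect, JvmWorld, and JvmRepo are invented here, and Effect takes the place of the bare W[Value] above.

```scala
import scala.util.Try

object Worlds {
  // Hypothetical value type.
  final case class Value(n: Int)

  // A "world" bundles the representation chosen for each concern.
  trait World {
    type Effect[_]
    type OptionalConcept[_]
    type ListConcept[_]
  }

  // The repository refers to whatever its world provides.
  trait Repo {
    val W: World
    def getValue(arg: Int): W.Effect[Value]
    def maybeValue(arg: Int): W.OptionalConcept[Value]
    def getList(): W.ListConcept[Value]
  }

  // One possible world built from the standard library.
  object JvmWorld extends World {
    type Effect[A] = Try[A]
    type OptionalConcept[A] = Option[A]
    type ListConcept[A] = List[A]
  }

  object JvmRepo extends Repo {
    val W: JvmWorld.type = JvmWorld
    def getValue(arg: Int): Try[Value] = Try(Value(arg))
    def maybeValue(arg: Int): Option[Value] =
      if (arg == 0) None else Some(Value(arg))
    def getList(): List[Value] = List(Value(1), Value(2))
  }
}
```

A Scala.js world could plug in js.UndefOr and js.Array in the same slots without changing the Repo interface.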

ZIO, a Scala effects library, is interesting. ZIO cannot solve all of the “world” problems, but it explicitly targets a few troublesome areas around concurrent programming. “Concurrency” is represented by a type ZIO, while still letting you be explicit about representing “errors” for values. It’s highly targeted at these troublesome areas, so it can be as helpful as possible when processing values, and it provides explicit support for manipulating them. But it does not address all concerns.

A polyglot platform like GraalVM cannot address all of these concerns directly, other than to provide an API that allows you to probe values and understand what concept they represent. Depending on the language that you are using, there may or may not be special syntax to help use those values easily. That’s a good compromise.

So in the end, null was not the billion dollar mistake. Null is perfectly fine. The mistake was the absence of capabilities to manage null values easily in data processing.

You can certainly do a W = World in Scala and some other languages, but at some point the cost of abstracting is too high unless it solves real pain points.

