Hackle's blog
between the abstractions we want and the abstractions we get.
So you know what null
is, but have you heard about explicit null
and implicit null
?
We are told null
represents the "lack of value"; but it's clearly not always true: in many languages, null
(or nil
, Nothing
) is a value that can be assigned to variables, and passed around, and is definitely first-class.
Right! So it's both a value and "lack of value", isn't that the most confusing thing ever?
Consider the below method in C#,
public class Person
{
public string Nickname;
}
// in some other class
string Update(Person person)
{
if (person.Nickname == null)
{
// what to do?
}
}
And let me ask you this: when person.Nickname
is null
, how can we know if it's,
null
that is never given a value, as with new Person()
, ornull
as with new Person(Nickname = null)
?The answer is, by looking at the value alone, there is no way to tell!
What's the fuss, you say. If this never bothers you, great, count your blessings; but if it does, then you know it can actually be a pretty big deal.
One famous use case for differentiating explicit and implicit null
, is JSON merge PATCH.
Let's say this Person
record is currently saved in the data store.
{ "id": 1, "name": "Hackle", "nickname": "Hacks" }
And a client wants to reset the nickname
, so it sends a PATCH
request.
PATCH /person/1
{ "nickname": null }
Notice the explicit null
, it clearly indicates nickname
should be set to null
. A good server-side implementation should update the record so it looks like,
{ "id": 1, "name": "Hackle", "nickname": null } // or remove the nickname field completely
This is a non-issue for JavaScript. You would guess, is it because it's untyped or dynamically typed? Not just! In this case, it's actually because JavaScript is very well-typed, and almost statically.
JavaScript differentiates null
and no value is given
, a.k.a. undefined
. So imagine the server-side code written in Node,
if (person.nickname === undefined) {
// do nothing!
}
A stroke of genius indeed! Now we know undefined
is nothing to sneeze at.
Other languages may not be so lucky. Without undefined
or its equivalent, we are left with the annoying confusion between implicit and explicit null
.
Suffice to say, JSON merge PATCH is a pain to implement in a old-school statically typed language like C#. Why? Let's look at the example below,
using System;
using System.Text.Json;
record Person
{
public string Name;
public string Nickname;
}
class Program {
public static void Main (string[] args) {
var person1 = JsonSerializer.Deserialize<Person>("{\"name\":\"Hacks\"}");
var person2 = JsonSerializer.Deserialize<Person>("{\"name\":\"Hacks\",\"nickname\":null}");
Console.WriteLine (person1 == person2);
// outputs: True
}
}
For all its static typing, C# cannot tell implicit null from explicit null. Is nickname
set to null
in the JSON request body, or not set at all? Don't know!
Some very useful information is lost. As far as JSON deserialization is concerned, the type system is inadequate here for lack of a undefined
equivalent.
The serializer should not take the blame here. In fact, most serializers support deserializing into a dynamic hashmap. After all, a JSON object is nothing more than a map itself.
See how this is done pretty easily in C# with the mysterious dynamic
keyword!
var person1 = JsonSerializer.Deserialize<dynamic>("{\"name\":\"Hacks\"}");
var person2 = JsonSerializer.Deserialize<dynamic>("{\"name\":\"Hacks\",\"nickname\":null}");
// person1 is a System.Text.Json.JsonElement
Console.WriteLine(person1);
{"name":"Hacks"}
Console.WriteLine(person2);
{"name":"Hacks","nickname":null}
You see, the serializer is totally capable of differentiating between undefined
and null
. Why don't we use dynamic
instead? What's the problem?!
The problem is we are addicted to convenience! Deserialising a JSON object to a "strongly-typed" model is golden standard, even at the cost of information loss. Nobody wants to break the standards, and downgrade to the level of manipulating JSON objects directly, what are we, savages?
So this is the core of the problem: for the sake of convenience in the disguise of "correctness" (or "strong" typing), we accept information loss, which actually undermines correctness. This is the real dead end!
Boy did the JSON merge Patch problem make engineers scramble. But if there is one thing engineers do well, is to work around problems by PATCHing (pun intended) over them.
It's too late to introduce undefined
to existing type systems, so the Java peeps are quick to copy undefined
from JavaScript, with JsonNullable. ASP.NET users love their model binding so a library must be made to suit.
Wait, Haskell has undefined
, although, it's not supposed to be touched.
ghci> undefined
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:74:14 in base:GHC.Err
undefined, called at <interactive>:2:1 in interactive:Ghci1
ghci> undefined == undefined
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:74:14 in base:GHC.Err
undefined, called at <interactive>:3:1 in interactive:Ghci1
Give its prestige as a language and community, surely Haskell peeps have handled this with flying colours? Not necessarily. Take this example with Aeson
,
import Data.Aeson
import Data.Text
import GHC.Generics
data Person = Person {
name :: Text,
nickname :: Maybe Text
} deriving (Generic, Show)
instance FromJSON Person
person = decode "{\"name\":\"Hacks\"}" :: Maybe Person
main :: IO ()
main = do
putStrLn $ show person
-- prints
Just (Person {name = "Hacks", nickname = Nothing})
Instead of implicit null
we have implicit Nothing
, not much different than C#. It looks like the core of the problem is not with languages but with conventions, and conventions go deep.
So we really love "strong-typing" so much, and never want to regress into using JSON objects directly, are we stuck with JavaScript for JSON merge PATCH? What are the alternatives?
Only if we could change the conventions just a little, there may be a way out. For example, utilising Haskell's tagged unions, we could introduce a new type such as data MaybeUndefined a
as below. Missing fields deserialise into Undefined
, otherwise their intended type a
, which can be nullable itself. For example,
data MaybeUndefined a = Undefined | Defined a
data Person = Person {
name :: Text,
nickname :: MaybeUndefined (Maybe Text)
}
Now we have deterministic interpretation,
Undefined
: the nickname
field is missing, i.z. no value is givenDefined Nothing
: a null
value is set explicitlyDefined (Just "Hacks")
: a value other than null
is given(NOTE: I have not managed to implement this with Aeson)
This technique would slot in pretty naturally for tagged unions, but not so well for untagged unions which collapse - for example, in TypeScript
, Optional<T>
is the same as OptionalOptional<T>
, proof below,
type Optional<T> = T | undefined;
type OptionalOptional<T> = Optional<T> | undefined;
type StrictEq<T, U> =
[T] extends [U]
? [U] extends [T]
? true : false
: false;
// Type 'false' is not assignable to type 'true'.ts(2322)
const areEqual: StrictEq<Optional<string>, OptionalOptional<string>> = false;
This is hardly the end of the world but it does mean more heavy-handed wiring is needed to introduce the type level equivalent of undefined
.
Worth noting this issue does not stop at PATCH
.
Let's say for our wildly popular endpoint, nickname
is a new field added to the existing Person
contract. And let's assume people are wary of versioning hell so that's not an option.
For the various client sides, this should be completely backward compatible, right? Not really.
When a PATCH
request is received, we are faced with even more interpretations when nickname
is null
after deserialisation.
nickname
as is, ornickname
to null
, ornickname
, so nickname
will be missing, ornickname
, and is setting null
values consciouslyToo much guessing required!
Who would have thought JSON merge PATCH would bring so much envy for undefined
and JavaScript?
Isn't interesting, and maybe a bit embarrassing to admit how much trouble null
is still causing us? Even the more modern, powerful languages are no exception, because the conventions are pervasive and run deep.
But these languages do have the edge of extra expressive power, which helps to make the solution less wacky and more elegant. All we need to fight are the conventions - but would we be able to?