A billion dollar mistake
Can you believe that something that feels as natural as a null reference has a named and recognized creator?
And not only that, but that this creator deeply regrets his invention?
In 1965, Tony Hoare (the guy who invented Quicksort, btw) was designing ALGOL W when he created the null reference:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965 […] This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Tony Hoare in 2009
Nullability
Having everything be possibly null is, to me, the perfect ingredient for paranoid coding: if any object, anywhere in the code, can be null, it’s only normal to wonder what you can trust. This leads to checks everywhere, and that noise makes the checks that matter harder to spot. When everything is nullable, you might not pay attention when null really is a possibility.
In Java, objects are never held “by value”. A variable never contains the object's data itself; it contains a reference (an address, if you will) to memory where an object may or may not have been allocated.
public class T
{
    int x = 5;
    static T yetAnotherOne; // null is the default value for fields (a local must be assigned before use)

    public static void main(String[] args)
    {
        T obj = new T(); // obj is a reference to the memory
                         // where T (an int containing 5) was allocated
        T anotherOne = null; // Thus, in Java you express the intent of pointing to "nothing"
    }
}
All Java object references are nullable by default. This is slightly different in languages such as C++:
struct T { int x{}; };
int main()
{
    T obj; // obj contains the data, an instance of T was default constructed with an x default initialized (= 0)
    T* addr = &obj; // If you want a "reference" (Java sense of the word) you can use a pointer.
    T* another = nullptr; // Pointers can be null.
    T& ref = obj; // Or you can use a reference (C++ sense of the word) which is non-nullable.
}
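To see what this difference means for the code you write, here is a minimal sketch. The `Counter` type and both function names are made up for illustration. The reference version has nothing to check, while the pointer version has to decide what to do when it receives `nullptr`.

```cpp
// Hypothetical type, same shape as T above.
struct Counter { int x{}; };

// A reference parameter always refers to an object: no null check needed.
void increment(Counter& c)
{
    ++c.x;
}

// A pointer parameter may be null: returns false when there is nothing to increment.
bool increment_if_present(Counter* c)
{
    if (c == nullptr)
        return false;
    ++c->x;
    return true;
}
```

Calling `increment_if_present(nullptr)` is legal and returns `false`; there is no way to write the equivalent call for `increment` without already having an object.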
Chosen nullability
While I think it’s a rather cold take to say “having nullability everywhere is not good”, nullability is not bad per se.
I can think of many situations in which you need to communicate that a value can be absent, and you need to communicate it with something that is outside of the possible values of your underlying type.
static void foo(double value)
{
    // Is value there?
    if (value != 0.)
    {
        // ...
    }
}
I’m not a fan of these checks, but they often do the trick. However, this comes at the cost of excluding 0 from your significant values. Here, I reserved the number 0 for saying that there is no value.
What if I need 0? What if any double value is a valid value?
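Here is a small sketch of how the sentinel bites. Both function names are hypothetical, and the mean is just an example I picked because 0 is a perfectly legitimate result for it.

```cpp
#include <numeric>
#include <vector>

// Hypothetical function: the mean of the samples, or the sentinel 0 when there are none.
double mean_or_zero(const std::vector<double>& samples)
{
    if (samples.empty())
        return 0.; // sentinel: "no value"
    return std::accumulate(samples.begin(), samples.end(), 0.) / samples.size();
}

// The caller-side check: 0 is read as "absent".
bool looks_present(double value)
{
    return value != 0.;
}
```

With the samples `{-1., 1.}` the mean really is 0, yet `looks_present` reports it as absent: the caller cannot tell "no data" from "data that averages to zero".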
C++, Rust, and others offer what is commonly known as the Option monad.
template <typename T>
class optional
{
public:
    bool has_value() const;
    T value();
private:
    T m_data;
    bool m_validity;
};
Simplified declaration of std::optional<T>
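As a quick usage sketch of the real `std::optional` (C++17), here is a function that returns "nothing" without reserving any `double` for it. The function `mean` is a hypothetical example, not something from a library.

```cpp
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical function: the mean of the samples, or std::nullopt when there are none.
std::optional<double> mean(const std::vector<double>& samples)
{
    if (samples.empty())
        return std::nullopt; // absence lives outside the range of double
    return std::accumulate(samples.begin(), samples.end(), 0.) / samples.size();
}
```

Now `mean({-1., 1.})` holds a value that happens to be 0, and `mean({})` holds no value at all. Every `double`, 0 included, stays available as a real result.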
When reading or debugging, optional does a great job at communicating the intent that data could be absent and does so in a way that I find more explicit than a pointer.
static void foo(std::optional<double> opt)
{
    // Is value there?
    if (opt.has_value()) // can also use if (opt)
    {
        // doing stuff with opt.value()
    }
}
Or the pointer way:
static void foo(double* value)
{
    // Is value there?
    if (value)
    {
        // doing stuff with *value
    }
}
I see two main differences. The first is that double* points to memory somewhere else, whereas with std::optional<double> you hold the double data by value! (You also have the bool flag wrapped in the type, and maybe some padding.)
This makes it clearer who owns the data, and it presents some performance advantages (you don’t pay the cost of dereferencing the pointer). However, performance is outside the scope of my skills, so I’ll refrain from speculating beyond what’s already common knowledge.
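A minimal sketch of the ownership point, with two hypothetical helpers: one takes the optional by value and can only touch its own copy, the other writes through the pointer into memory the caller owns. The `static_assert` illustrates the extra flag; it holds on the mainstream standard libraries I know of, but treat it as an assumption rather than a guarantee.

```cpp
#include <optional>

// The optional stores the double itself plus a "has value" flag,
// so it is bigger than a bare double (assumption: mainstream implementations).
static_assert(sizeof(std::optional<double>) > sizeof(double));

// Hypothetical helper: takes the optional by value, so it only empties its own copy.
void clear_copy(std::optional<double> opt)
{
    opt.reset();
}

// Hypothetical helper: writes through the pointer into the caller's double.
void zero_through(double* value)
{
    if (value)
        *value = 0.;
}
```

After `clear_copy(o)` the caller's `o` is untouched, while after `zero_through(&d)` the caller's `d` is 0.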
The second is what happens if you make a mistake and forget the check. With std::optional<double>, calling value() on an empty optional throws a nice std::bad_optional_access exception (dereferencing it with * is still undefined behaviour), while with double* you just have undefined behaviour. Don’t rely on UB.
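To illustrate the forgotten check, here is a small sketch. The helper name is made up; it reads the optional through `value()` without testing it first and reports whether the mistake was caught.

```cpp
#include <optional>

// Hypothetical helper: accesses the optional without checking it first.
// Returns true if value() reported the mistake by throwing.
bool unchecked_access_throws(const std::optional<double>& opt)
{
    try
    {
        (void)opt.value(); // throws std::bad_optional_access when opt is empty
        return false;
    }
    catch (const std::bad_optional_access&)
    {
        return true;
    }
}
```

If a fallback makes sense, `opt.value_or(0.)` avoids the exception altogether. Note that only `value()` is checked: `*opt` on an empty optional is undefined behaviour, just like dereferencing a null pointer.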
When receiving an optional and needing to access the underlying data, you’re also far less likely to forget the std::nullopt case than you are with a pointer you got too comfortable with: you are literally using an Option, so the possibility of absence is hard to overlook. Nullability is part of the package, it is actually expected, you* chose it.
* Or the person who wrote the code you’re consuming did, but whatever: the contract cannot be clearer, expect null values.