I really hate the “stop doing [whatever]” or “you’ve been doing X or Y wrong all these years” type of articles that inundate Medium’s tech space these days. Hate them. Yet, I find myself wanting to write one after just dealing with an issue in a database I inherited at my current day job, and wasting an hour of my “first thing in the morning” routine.
Let’s just call it therapy.
While I think it’s a good piece of tech knowledge based on a lot of years of practice, don’t read too much into my insistence that if you do this thing, you’re pretty much destroying society and good data-integrity design. ;)
Too much use of NULL is lazy and crappy database design. There. I said it.
Ted Codd put forth as one of his fundamental rules that NULL is meant to express an unknown or currently-unknowable value. I think some people took that way, way too much to heart.
Bobby is a user in the database. The database allows for Bobby to have some apples. Is the number of Bobby’s apples NULL or is it 0? An argument could be made for either one being technically correct and accurate. But I call bullshit on using NULL in this case.
“Bobby, how many apples do you have?” Bobby is never going to answer “that information is not knowable” or “my answer is null”. If there’s a quantity of something, zero is a perfectly acceptable placeholder because because numbers are elegant, actual things, and you can always amend the answer without having to do ANY special handling for NULL when the value changes from 0 to something else.
That’s the issue, right there. NULL never works as a calculable value, so when you’re trying to find some value on a fixed domain, and there’s a NULL, you need special code or rules. I put forth that most of the time, using NULL, especially with numbers, you’ve done some bad design and your penance should be harsh. Like, a fish to the face kind of harsh.
Forgive the hyperbole.
You might not know how many apples Bobby has, but Bobby knows, and since you just asked him (the database query), he shouldn’t return NULL, he should tell you and he doesn’t have any apples, return zero. The cases where Bobby doesn’t know how many apples he has, or if he has any at all, or ever did, or just has no way of knowing… exceedingly rare.
“Reasonable defaults based on the design of the data and its intended use.”
That’s a phrase I learned very, very early in my career and it’s led to some interesting discussions over the years.
I won’t go as far as to say that nulls are evil, but I will say in a huge and vast majority of the times I’ve had to include them in a database design and in a domain of data, most of the time, they’ve been applied to dates. Dates are a pain in the ass and there simply IS no reasonable default in so many cases.
What’s Bobby’s birthday? I have zero earthly idea and it would be stupid to apply some default value. *BAM* That’s where NULLs work best. And very, very limited uses in other places. Some designs allow for and require them. I get it. I’m not debating that. They’re real and useful.
All I’m saying is, when you find a NULL coming up in your database design, repeat after me and look for: “Reasonable defaults based on the design of the data and its intended use.”
Bobby will never have NULL apples.