The Story of None: Part 6 - Avoiding It
part 1 part 2 part 3 part 4 part 5 part 6
Last time...
Last
time
we've discussed guard clauses and when not to use them. We've discussed
the paranoia developers sometimes feel that causes them to write useless
or even harmful guard clauses. The best way to reduce paranoia about
None
is to make sure it can't be there in the first place.
So let's talk about ways to accomplish this.
Date Validator Redux
The date validator in its last incarnation looked like this:
def validate_end_date_later_than_start(start_date, end_date):
if start_date is None or end_date is None:
return
if end_date <= start_date:
raise ValidationError(
"The end date should be later than the start date.")
Here we want to validate two date values which may be missing, in which
case we treat start_date
as "indefinite past" and end_date
as
"indefinite future".
We could create special sentinel objects for "indefinite future" and "indefinite past":
INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)
where MINYEAR
is 1
and MAXYEAR
is 9999
.
(Too bad datetime doesn't allow negative dates, or we could've used
Bishop Ussher's date
for the creation of the universe, date(-4004, 10, 23)
. Though that's
in the proleptic Julian calendar, and I don't care to know what that is
right now. Plus it's bogus. But it'd be amusing.)
If we can be sure that those are used instead of None
before the
validate_end_date_later_than_start
function is called, we can simplify
it to this:
def validate_end_date_later_than_start(start_date, end_date):
if end_date <= start_date:
raise ValidationError(
"The end date should be later than the start date.")
which is what we started out with in the first place in Part 1 long ago, without any guards! Awesome!
Edge case
This handwaves the edge case where start_date
and end_date
are both
equal to INDEFINITE_PAST
or INDEFINITE_FUTURE
, which can be argued
should not raise ValidationError
. In software for a time machine it
might be important to get this right, but in many applications not
handling this edge case is fine.
Really avoiding the edge cases
If we insist on making the edge case go away, we could deal with it by
subclassing the date
class to construct these sentinels instead:
class IndefinitePast(date):
def __lt__(self, other):
return True
def __le__(self, other):
return True
def __gt__(self, other):
return False
def __ge__(self, other):
return False
class IndefiniteFuture(date):
def __lt__(self, other):
return False
def __le__(self, other):
return False
def __gt__(self, other):
return True
def __ge__(self, other):
return True
INDEFINITE_PAST = IndefinitePast(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = IndefiniteFuture(datetime.MAXYEAR, 12, 31)
This is a lot more code though, and therefore in many situations this would be overkill.
(As a puzzle for the reader in this case one could safely skip
implementing __le__
and __ge__
for these classes and still have it
all work for any possible date. I kept them in for clarity.)
Normalization
So what have we done here? We've made sure that our input was
normalized to a date
before it even reached the validation function. This way we don't have
to worry about our friend None
when we deal with dates.
The idea is to normalize the input a soon as possible before it reaches
the rest of our codebase, so we can stop worrying about non-normalized
cases (such as None
) everywhere else. In effect you put the guard
clauses as far on the outside of the calling chain as possible.
In the case of our date input, somewhere in the input processing we'd call these functions:
def process_start_date(d):
if d is None:
return INDEFINITE_PAST
return d
def process_end_date(d):
if d is None:
return INDEFINITE_FUTURE
return d
None of those None
's to worry about anymore after that!
Drawbacks
Normalization also has some potential drawbacks. Here are some that may apply to this case:
- to understand how empty date fields are treated in the validation function, we need to read normalization code that may be somewhere else. Our validation function that worried about None was all in one place.
- it's more code to understand and maintain, especially with the custom date subclasses.
- normalization of
None
to a date may be nice during validation, but it may not be what we want to store in a database; we might want to store None there. If we have this requirement we'd need two code paths: one for storage and one for validation.
It all depends on the exact details of your project. If the project is
going to compare a lot of dates in many places, it makes sense to
normalize missing values to proper dates as soon as possible, and it's a
much better approach than having to worry about None
everywhere. But
if the project only needs a single validation rule that can handle
missing dates, then it makes more sense to write one that deals with
None
directly.
Conclusion
This concludes the Story of None! I hope you've enjoyed it! Perhaps you've learned something.
Let me know if you would like to see more stuff like this -discussions of fairly low-level patterns that happen during development.