The Story of None: Part 6 - Avoiding It
Last time...
Last time we discussed guard clauses and when not to use them. We
also discussed the paranoia developers sometimes feel, which causes
them to write useless or even harmful guard clauses. The best way to
reduce paranoia about None is to make sure it can't be there in the
first place.
So let's talk about ways to accomplish this.
Date Validator Redux
The date validator in its last incarnation looked like this:
    def validate_end_date_later_than_start(start_date, end_date):
        if start_date is None or end_date is None:
            return
        if end_date <= start_date:
            raise ValidationError(
                "The end date should be later than the start date.")
Here we want to validate two date values which may be missing, in
which case we treat start_date as "indefinite past" and end_date as
"indefinite future".
We could create special sentinel objects for "indefinite future" and "indefinite past":
    import datetime
    from datetime import date

    INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
    INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)
where MINYEAR is 1 and MAXYEAR is 9999.
(Too bad datetime doesn't allow negative years, or we could've used
Bishop Ussher's date for the creation of the universe,
date(-4004, 10, 23). Though that's in the proleptic Julian calendar,
and I don't care to know what that is right now. Plus it's bogus. But
it'd be amusing.)
If we can be sure that those are used instead of None before the
validate_end_date_later_than_start function is called, we can
simplify it to this:
    def validate_end_date_later_than_start(start_date, end_date):
        if end_date <= start_date:
            raise ValidationError(
                "The end date should be later than the start date.")
which is what we started out with in the first place in Part 1 long ago, without any guards! Awesome!
Edge case
This handwaves the edge case where start_date and end_date are both
equal to INDEFINITE_PAST or both equal to INDEFINITE_FUTURE, which
arguably should not raise ValidationError. In software for a time
machine it might be important to get this right, but in many
applications not handling this edge case is fine.
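To make the edge case concrete, here is a minimal sketch using the plain date sentinels from above. The ValidationError class is a stand-in defined only for this example; in a real project it would come from whatever validation framework is in use.

```python
import datetime
from datetime import date


class ValidationError(Exception):
    """Stand-in for the framework's validation error (an assumption)."""


INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)


def validate_end_date_later_than_start(start_date, end_date):
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")


# An open-ended range with an ordinary end date validates fine.
validate_end_date_later_than_start(INDEFINITE_PAST, date(2024, 1, 1))

# The edge case: the missing start date was normalized to INDEFINITE_PAST,
# and the end date happens to be the earliest representable date.
try:
    validate_end_date_later_than_start(INDEFINITE_PAST, date(1, 1, 1))
    edge_case_raised = False
except ValidationError:
    edge_case_raised = True
```

With plain date sentinels the second call raises, even though "indefinite past" is conceptually earlier than any real date.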
Really avoiding the edge cases
If we insist on making the edge case go away, we could deal with it by
subclassing the date class to construct these sentinels instead:
    class IndefinitePast(date):
        def __lt__(self, other):
            return True

        def __le__(self, other):
            return True

        def __gt__(self, other):
            return False

        def __ge__(self, other):
            return False


    class IndefiniteFuture(date):
        def __lt__(self, other):
            return False

        def __le__(self, other):
            return False

        def __gt__(self, other):
            return True

        def __ge__(self, other):
            return True


    INDEFINITE_PAST = IndefinitePast(datetime.MINYEAR, 1, 1)
    INDEFINITE_FUTURE = IndefiniteFuture(datetime.MAXYEAR, 12, 31)
This is a lot more code though, and therefore in many situations this would be overkill.
(A puzzle for the reader: in this case one could safely skip
implementing __le__ and __ge__ for these classes and still have it
all work for any possible date. I kept them in for clarity.)
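A quick check of how the subclass sentinels behave might look like the sketch below. ValidationError is again a stand-in defined only for this example. The comparisons with a plain date on the left work because Python gives priority to the right operand's reflected comparison method when the right operand is an instance of a subclass of the left operand's class.

```python
import datetime
from datetime import date


class ValidationError(Exception):
    """Stand-in for the framework's validation error (an assumption)."""


class IndefinitePast(date):
    def __lt__(self, other):
        return True

    def __le__(self, other):
        return True

    def __gt__(self, other):
        return False

    def __ge__(self, other):
        return False


class IndefiniteFuture(date):
    def __lt__(self, other):
        return False

    def __le__(self, other):
        return False

    def __gt__(self, other):
        return True

    def __ge__(self, other):
        return True


INDEFINITE_PAST = IndefinitePast(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = IndefiniteFuture(datetime.MAXYEAR, 12, 31)


def validate_end_date_later_than_start(start_date, end_date):
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")


earliest = date(datetime.MINYEAR, 1, 1)
latest = date(datetime.MAXYEAR, 12, 31)

# None of these raise: the sentinels sort outside every plain date,
# including the earliest and latest representable ones.
validate_end_date_later_than_start(INDEFINITE_PAST, earliest)
validate_end_date_later_than_start(latest, INDEFINITE_FUTURE)
validate_end_date_later_than_start(INDEFINITE_PAST, INDEFINITE_FUTURE)
```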
Normalization
So what have we done here? We've made sure that our input was
normalized to a date before it even reached the validation
function. This way we don't have to worry about our friend None when
we deal with dates.
The idea is to normalize the input as soon as possible before it
reaches the rest of our codebase, so we can stop worrying about
non-normalized cases (such as None) everywhere else. In effect you
put the guard clauses as far on the outside of the calling chain as
possible.
In the case of our date input, somewhere in the input processing we'd call these functions:
    def process_start_date(d):
        if d is None:
            return INDEFINITE_PAST
        return d


    def process_end_date(d):
        if d is None:
            return INDEFINITE_FUTURE
        return d
None of those Nones to worry about anymore after that!
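Putting the pieces together, the input boundary might look something like the sketch below. The handle_date_range function and the ValidationError class are hypothetical names introduced for illustration; where exactly this lives depends on your framework.

```python
import datetime
from datetime import date


class ValidationError(Exception):
    """Stand-in for the framework's validation error (an assumption)."""


INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)


def process_start_date(d):
    if d is None:
        return INDEFINITE_PAST
    return d


def process_end_date(d):
    if d is None:
        return INDEFINITE_FUTURE
    return d


def validate_end_date_later_than_start(start_date, end_date):
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")


def handle_date_range(raw_start, raw_end):
    """Hypothetical entry point: normalize first, then validate."""
    start_date = process_start_date(raw_start)
    end_date = process_end_date(raw_end)
    # From here on, neither value can be None.
    validate_end_date_later_than_start(start_date, end_date)
    return start_date, end_date


# A missing start date becomes "indefinite past" at the boundary.
start, end = handle_date_range(None, date(2024, 6, 1))
```

Everything downstream of handle_date_range can compare dates without a single None check.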
Drawbacks
Normalization also has some potential drawbacks. Here are some that may apply to this case:
To understand how empty date fields are treated in the validation function, we need to read normalization code that may be somewhere else. Our validation function that worried about None was all in one place.
It's more code to understand and maintain, especially with the custom date subclasses.
Normalization of None to a date may be nice during validation, but it may not be what we want to store in a database; we might want to store None there. If we have this requirement we'd need two code paths: one for storage and one for validation.
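The storage drawback could be handled with a small conversion at the database boundary, roughly as sketched here. The to_storage helper is a hypothetical name, not something from the earlier parts of this series, and it assumes the module-level sentinel objects are the only ones passed around.

```python
import datetime
from datetime import date

INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)


def to_storage(d):
    """Hypothetical helper: turn a sentinel back into None before storing.

    Uses an identity check (is) rather than equality, so a genuinely
    entered date that happens to equal a sentinel is stored unchanged.
    """
    if d is INDEFINITE_PAST or d is INDEFINITE_FUTURE:
        return None
    return d


stored_start = to_storage(INDEFINITE_PAST)
stored_end = to_storage(date(2024, 6, 1))
```

This is exactly the second code path the list above warns about: one more place that has to know about the sentinels.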
It all depends on the exact details of your project. If the project is
going to compare a lot of dates in many places, it makes sense to
normalize missing values to proper dates as soon as possible, and it's
a much better approach than having to worry about None everywhere. But
if the project only needs a single validation rule that can handle
missing dates, then it makes more sense to write one that deals with
None directly.