The Story of None: Part 5 -- More on Guarding
Last time in the Story of None we discussed the concept of a guard
clause. This is simply an if statement at the beginning of a function
that returns early if a certain condition is true.
The guard clause pattern is applicable to more than just the None
scenario. We could be writing a function where we need a specific
treatment if an integer value is 0, for instance. A guard clause
often does more than just a bare-bones return. Instead, it could
return a value or raise an exception.
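As a small sketch of a guard clause that returns a value for the 0 case, consider the following. The function name describe_count and its wording are made up for illustration only:

```python
def describe_count(count):
    # Guard clause: 0 needs its own wording, so deal with it first and return.
    if count == 0:
        return "no results"
    # Main path: any other count.
    return f"{count} result(s)"


print(describe_count(0))  # no results
print(describe_count(3))  # 3 result(s)
```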
So let's discuss some other brief examples of guard clauses.
Raising an exception is good when the input really should be considered an error and the developer should know about it:
    def divide(a, b):
        if b == 0:
            raise ZeroDivisionError("Really I don't know, man!")
        return a / b
In this case you know that the way to handle 0 is to not handle it
and instead to loudly complain that there is some error, by raising
an exception.
Complaining loudly is important: it is tempting to make up some return value and let the code proceed. We'll go into this in more detail later on.
.get guard clause
Let's say we want to implement dictionary .get as a function
ourselves, with a guard against a missing dictionary key:
    def get(d, key, default=None):
        if key not in d:
            return default
        return d[key]
This guard clause returns a default value if the guard clause
condition is true. As you can see here the guard clause can be
dependent on multiple arguments, in this case d and key. If
the function in question is a method, it can be dependent on object
state as well.
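To illustrate the remark about methods, here is a minimal sketch of a guard clause that depends on object state as well as on an argument. The Registry class and its method names are hypothetical, chosen only to mirror the get function above:

```python
class Registry:
    """Hypothetical class, only here to show a guard on object state."""

    def __init__(self):
        self._items = {}

    def register(self, name, item):
        self._items[name] = item

    def lookup(self, name, default=None):
        # Guard clause: depends on the argument `name` and on the
        # object state stored in `self._items`.
        if name not in self._items:
            return default
        return self._items[name]


registry = Registry()
registry.register("a", 1)
print(registry.lookup("a"))     # 1
print(registry.lookup("b", 0))  # 0
```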
Complain loudly for possible input that you cannot handle; it makes debugging easier.
Guard clauses in recursion
Guard clauses often occur in recursive functions, where they guard against the stopping criterion.
Let's consider this rather contrived (as there are much better implementations without recursion in Python) example where we recursively add all numbers in a list:
    def add(numbers):
        if len(numbers) == 0:
            return 0
        return numbers[0] + add(numbers[1:])
The main part of the function says: the sum of all the numbers in a
list numbers is the first entry in that list added to the sum of
the rest of the entries in the list.
But what if the list of numbers is empty? We cannot obtain the first
entry in that case, so we need some kind of guard clause to handle
this. We know that the resulting sum of an empty list should be
0, so we can simply return this. This is also the stopping
criterion for the recursion, because we cannot recurse further into an
empty list.
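To see the guard clause act as the stopping criterion, it may help to trace a small call. This sketch repeats the recursive function so that it runs on its own, and compares it with the built-in sum as the non-recursive alternative:

```python
def add(numbers):
    # Guard clause and stopping criterion: the sum of an empty list is 0.
    if len(numbers) == 0:
        return 0
    # First entry plus the sum of the rest of the entries.
    return numbers[0] + add(numbers[1:])


# add([1, 2, 3]) unfolds as 1 + (2 + (3 + add([]))) = 1 + (2 + (3 + 0))
print(add([1, 2, 3]))  # 6
print(add([]))         # 0
print(sum([1, 2, 3]))  # 6, using the built-in instead of recursion
```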
Don't be paranoid
Don't overdo it by putting in guard clauses that guard against cases you don't actually know can happen, or that you don't know how to handle. Guard against exceptional forms of expected input, not against input that is unexpected altogether. Guarding against the expected is sensible, but guarding against the unexpected is paranoia.
So, in the case of None, we don't want to clutter our code with
lots of guard clauses just to make sure that the input wasn't None
if we don't even know that the input can be None in the first
place.
Python tends to do the right thing in the face of the unexpected: its core operations tend to fail early with an exception if asked to do something they cannot handle: dividing by zero, comparing a date with None, getting a non-existent key from a dictionary. Rely on this behavior, be happy Python is eager to tell you something is wrong, and avoid clutter in your code.
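The three failures mentioned above can be checked directly. This is a small sketch; the helper exception_name is made up for the demonstration, and the date comparison uses an ordering comparison (an equality test against None does not raise):

```python
from datetime import date


def exception_name(operation):
    """Return the name of the exception raised by calling operation, or None."""
    try:
        operation()
    except Exception as error:
        return type(error).__name__
    return None


print(exception_name(lambda: 1 / 0))                    # ZeroDivisionError
print(exception_name(lambda: date(2024, 1, 1) < None))  # TypeError
print(exception_name(lambda: {"a": 1}["b"]))            # KeyError
```

None of these needed a guard clause: the operation itself reports the problem at the spot where it occurs.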
Some other languages are more forgiving even in the face of the
unexpected: they let you add a string to a number, and if a property
is missing you don't get an exception but a placeholder value instead.
Even in such languages I don't put in all kinds of paranoid guard
clauses, because I still don't expect these cases, and I don't want to
clutter up my code.
If you feel paranoid
We all feel paranoid sometimes. Sometimes we think we need to handle some type of input. If you feel inclined to handle something but aren't sure what to do, here is a list of things to consider:
Don't return from the function early. This is the worst thing you can do.
If you handle an unexpected value by returning something you make up, you really are creating a bug. Made-up data is now propagating further through the codebase. You either end up with an exception deeper down in the call chain, where it becomes harder to debug, or you end up with something worse: a seemingly functional program which delivers bogus data.
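A minimal sketch of how a made-up value turns into bogus output. All function names here are hypothetical, and the None input stands for any value the author did not actually expect:

```python
def average_bad(values):
    # Paranoid guard: makes up a result for input it cannot really handle.
    if values is None:
        return 0
    return sum(values) / len(values)


def average(values):
    # No guard: unexpected input fails right here with an exception.
    return sum(values) / len(values)


def report(average_func, measurements):
    return f"Average temperature: {average_func(measurements)}"


# The made-up 0 ends up in a perfectly plausible-looking report.
print(report(average_bad, None))  # Average temperature: 0

try:
    report(average, None)
except TypeError as error:
    print("Failed early:", error)
```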
You may think you can avoid returning something made-up by using a plain
return statement. But if you do that in the case of a function that really needs to return a result, you are implicitly returning
None (in Python) or its equivalent in another language.
It might seem like we did such a plain return in the case of the date validation function; we got a None and we handled it by returning early.
But in the case of the date validation function, None was expected input according to our requirements, just an exceptional case where the normal case was date input. And a validation function like this has no return value at all, and can stop validating right away as soon as the input is judged valid, so returning early is fine.
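The implicit None from a plain return is easy to demonstrate. In this sketch the function parse_port is hypothetical; it is supposed to return a number, so the plain return quietly hands back None instead:

```python
def parse_port(text):
    # Paranoid guard with a plain return: it looks harmless...
    if not text.isdigit():
        return
    return int(text)


result = parse_port("eighty")
print(result)  # None

# ...but the caller expects a number, and fails somewhere else.
try:
    print(result + 1)
except TypeError as error:
    print("Failure away from the cause:", error)
```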
You could use a print statement to print out arguments, so you can see whether they are unexpected by reading the output.
You don't want to clutter up your code with a lot of print statements. You'll get a lot of output that will be hard to understand.
While print is quick and appropriate sometimes, do consider learning a debugger tool for your language as it can help a lot for more complex cases. For Python the built-in debugger is pdb.
Don't log everything.
Logging for debugging purposes is the advanced form of using print.
Logging is very useful, but I prefer logging to be application-appropriate, not to help debug the program flow. If I want to debug program flow I use a debugger or a bunch of throw-away print statements.
Of course there are exceptions to this rule; you might for instance want to log debugging information if a bug is hard to find and turns up in production. But use it in moderation, only when necessary.
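As a rough sketch of that balance, using Python's standard logging module: one application-level message that is always worth keeping, and one debug message that only appears when the log level is lowered while hunting a hard-to-find bug. The logger name "orders" and the function place_order are made up for the example:

```python
import logging

logger = logging.getLogger("orders")  # hypothetical application logger


def place_order(order_id, amount):
    # Debug detail: only emitted when the level is lowered to DEBUG.
    logger.debug("place_order called with %r, %r", order_id, amount)
    # Application-appropriate event: worth keeping in the log.
    logger.info("Order %s placed for %.2f", order_id, amount)
    return {"id": order_id, "amount": amount}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)  # switch to DEBUG only when needed
    place_order(42, 19.99)
```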
Don't print or log and then return early with a made-up value.
You can print or log some diagnostic information (the value of arguments, say) and then return early with a made-up value.
The impulse to want to get diagnostics is correct. The impulse to stop going further is also correct. But returning with a made-up value is still wrong -- and you aren't really stopping your program anyway.
If you return with a made-up value, your program will continue and will likely log some more information. But since you've returned a made-up value, everything logged after this case is nonsense; there's no point in continuing.
You could instead print diagnostics and do an early exit (sys.exit() in Python). Better still, if your language lets you do it, just throw an exception.
Throw an exception.
If you feel you need to handle an unexpected case, throw an exception. In Python you can throw a basic exception like this:
    if value == 0:
        raise Exception("Something went wrong")
But of course it makes sense to use more specific exceptions when you can:
    if not isinstance(value, str):
        raise TypeError("Expected string, not %r" % (value,))
Sometimes you need to make up new exceptions specific to your library or application as none of the built-in ones is appropriate:
    class WorkflowError(Exception):
        pass

    ...

    if invalid(workflow_state):
        raise WorkflowError("Invalid workflow state: %s" % workflow_state)
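A runnable version of this idea could look as follows. The set of valid states, the invalid helper and the publish function are assumptions made up for the sketch; only WorkflowError comes from the example above:

```python
class WorkflowError(Exception):
    """Raised when a workflow is in a state the application cannot handle."""


# Hypothetical set of states, only for this sketch.
VALID_STATES = {"draft", "review", "published"}


def invalid(workflow_state):
    return workflow_state not in VALID_STATES


def publish(workflow_state):
    if invalid(workflow_state):
        raise WorkflowError(f"Invalid workflow state: {workflow_state}")
    return "published"


print(publish("review"))  # published
try:
    publish("bogus")
except WorkflowError as error:
    print(error)  # Invalid workflow state: bogus
```

A dedicated exception class lets callers catch workflow problems specifically without also swallowing unrelated errors.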
Exceptions do the right stuff automatically:
bail out early when you can't handle something.
give diagnostic information in the form of a message and a traceback of the function call chain.
allow you to handle them after all if you want.
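These three properties can be seen together in a short sketch. The show_ratio function is a hypothetical top-level caller that decides to handle the error after all and report it, rather than return a made-up number:

```python
import traceback


def divide(a, b):
    if b == 0:
        # Bail out early when the input cannot be handled.
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b


def show_ratio(a, b):
    """Top-level caller that chooses to handle the error and report it."""
    try:
        return f"Ratio: {divide(a, b)}"
    except ZeroDivisionError as error:
        # Both the message and the traceback of the call chain are available.
        details = traceback.format_exc()
        return f"Could not compute ratio: {error}\n{details}"


print(show_ratio(6, 3))  # Ratio: 2.0
print(show_ratio(1, 0))  # message followed by the traceback
```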
Doing nothing is often the right impulse.
In the case of unexpected input, I can often rely on the language to fail with an exception anyway in the appropriate spot.
Next time we'll consider a way to avoid having to scatter guard clauses throughout our codebase: normalization.