A nice summary of the debate pro/anti-early returns can be found on the Wikipedia page for 'structured programming'.
I find I tend to use early returns when I'm writing hurriedly. When I do, I transfer my 'short-term context' into an 'imperative' style of code. This can be more succinct, but it is ultimately less clear to re-read later on, whether by me or by another person: the reader no longer has the original author's short-term context and must build it back up.
I repeatedly find that Python code which uses early returns can end up duplicating code at the return sites, and even include unnecessary re-computation (presumably as it was unclear exactly what state was available).
Another way of describing this problem of duplicated code at the return sites is to say that single-exit functions are “easier to instrument”.
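To make the instrumentation point concrete, here is a minimal sketch. The function names and the whitespace-parsing task are hypothetical, chosen only for illustration: the early-return version needs its logging call copied to each exit, while the single-exit version writes it once.

```python
from __future__ import annotations

import logging

logger = logging.getLogger(__name__)


def early_return_parse(raw: str) -> int | None:
    # Two exits, so the instrumentation is duplicated at each return site.
    if not raw.strip():
        logger.debug("parse(%r) -> %r", raw, None)
        return None
    value = int(raw)
    logger.debug("parse(%r) -> %r", raw, value)
    return value


def single_exit_parse(raw: str) -> int | None:
    # One exit, so the instrumentation is written once, just before the return.
    value = int(raw) if raw.strip() else None
    logger.debug("parse(%r) -> %r", raw, value)
    return value
```

Adding a third exit to the first function would mean remembering a third copy of the log call; the second function would not need touching.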
Guard clauses
A common event handling pattern is to run some basic checks on the event you've received before processing. These are known as guard clauses (or pre-conditions). The main idea of the structured programming paradigm is that it's best to make the pre- and post-conditions of a program explicit.
Here's an example:
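Note: put `from __future__ import annotations` at the top of your imports to run these examples on Python versions before 3.10 (the `int | None` union syntax is only available natively from 3.10 onwards).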
def guarded_double(a: int | None) -> int | None:
    if a is None:
        return None
    return a * 2
- While `return` on its own may seem equivalent to `return None`, both PEP 8 and the type checker mypy disagree, hence I use `return None` in place of `return`.
Here's the same logic, refactored to use a single return statement:
def double(a: int | None) -> int | None:
    return None if a is None else (a * 2)
I can verify this refactor was correct by writing test cases that cover every branch: there is 1 binary condition and 2 functions, so 2 arguments for each function gives 2 pairs of outputs that should be equivalent.
>>> guarded_double(None) is double(None) is None
True
>>> guarded_double(1) == double(1) == 2
True
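The same comparison can be run over a slightly larger sample. This is only a sketch: the sample values are an arbitrary choice of mine, with `None` covering the guard branch and a handful of integers covering the other.

```python
from __future__ import annotations


def guarded_double(a: int | None) -> int | None:
    if a is None:
        return None
    return a * 2


def double(a: int | None) -> int | None:
    return None if a is None else (a * 2)


# None exercises the guard branch; the integers are an arbitrary sample.
samples = [None, -3, 0, 1, 2, 10**6]

# Collect every input on which the two versions disagree.
mismatches = [a for a in samples if guarded_double(a) != double(a)]
assert mismatches == []
```

A spot check like this cannot cover every integer, so it raises confidence in the refactor rather than proving it.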
Another way to verify the correctness of this refactor is with an overloaded signature that can be type checked.
`@typing.overload` was explained nicely by Adam Johnson, who jokes "May type hints never overload you," which is suggestive of the cognitive load of refactoring. You must essentially hold these overloaded function signatures in your head at once, until finished.
from typing import overload

@overload
def overloaded_double(a: int) -> int:
    ...

@overload
def overloaded_double(a: None) -> None:
    ...

def overloaded_double(a: int | None) -> int | None:
    if a is None:
        return None  # explicit None, since mypy rejects a bare return here
    return a * 2
What we've done here isn't quite the same as writing out test cases (as test cases would need to be checked at runtime), but we have made the implicit function overloading of the guard clause explicit in the separated (overloaded) function signatures.
`mypy` can check the correctness from the type signature (which I've put on GitHub [here][example1]: compare before and after).
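As a rough illustration of what the overloads buy you at call sites, the sketch below uses annotated assignments. The variable names are hypothetical, and the implementation body is shortened to a ternary for brevity. With the overloads in place mypy should accept both assignments, whereas with only the plain `int | None -> int | None` signature the first would be reported as an incompatible assignment.

```python
from __future__ import annotations

from typing import overload


@overload
def overloaded_double(a: int) -> int: ...
@overload
def overloaded_double(a: None) -> None: ...
def overloaded_double(a: int | None) -> int | None:
    return None if a is None else a * 2


# mypy resolves each call to the matching overload, so the narrower
# annotations on the left-hand side are expected to type check.
doubled: int = overloaded_double(2)
nothing: None = overloaded_double(None)
```

At runtime the decorated stubs are simply replaced by the final definition, so the annotations only matter to the type checker.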
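If our function accidentally changed behaviour during the refactor (let's say `int` input can now give `None` output) then neither the overloaded signatures nor the simple one will allow mypy to detect this, since the function body is only checked against the implementation's own `int | None` return annotation.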
The tool is simply not able to do this at present.
For brevity I'm not showing both versions for this one, but you can find them on GitHub:
def double(a: int | None) -> int | None:
    if a is None:
        output = None
    elif a > 2:
        output = (a * 2)
    else:
        output = None
    return output
$ mypy mutants/
Success: no issues found in 2 source files
This means that if you're not careful doing your refactor, there's no way to check that you did it correctly, besides writing exhaustive unit tests.
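For contrast, a runtime comparison does catch the mutated function. The sketch below restates the original guard-clause version and the mutant under hypothetical names (`original_double`, `mutant_double`) and compares them on a small sample of inputs I picked to straddle the `a > 2` threshold.

```python
from __future__ import annotations


def original_double(a: int | None) -> int | None:
    if a is None:
        return None
    return a * 2


def mutant_double(a: int | None) -> int | None:
    # Same body as the mutated version above: ints of 2 or less now give None.
    if a is None:
        output = None
    elif a > 2:
        output = a * 2
    else:
        output = None
    return output


samples = [None, 0, 1, 2, 3, 10]

# Inputs on which the mutant no longer matches the original behaviour.
disagreements = [a for a in samples if original_double(a) != mutant_double(a)]
# disagreements == [0, 1, 2]
```

Both functions satisfy the same type signature, which is why mypy reports success while this check does not.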
In a real world scenario, you'd probably not encounter the assumptions of the problem already laid out nicely in overloaded function signatures.
Even if you had done, the above example is trivial for another reason.
All of the necessary information is present in the argument types declared in the function signature.
However it's easy to consider examples that would escape this sort of checking, and render our `mypy` overloaded signature verification method useless to verify refactor correctness.
To illustrate, consider a function which measures the length of the "payload" entry in a `dict`.
def guarded_handler(event: dict[str, str]) -> int | None:
    if (payload := event.get("payload")) is None:
        return None
    return len(payload)
We can refactor it in the same way, but it quickly becomes a bit unsightly:
def ternary_handler(event: dict) -> int | None:
    return None if (payload := event.get("payload")) is None else len(payload)
In real world examples you'd use `if`/`else` blocks for greater readability:
def handler(event: dict) -> int | None:
    payload = event.get("payload")
    if payload is None:
        output = None
    else:
        output = len(payload)
    return output
These are 3 versions of the 'same' function: they all follow identical logic with the input:
- The `guarded_handler` function uses a guard condition, as previously covered
- The `ternary_handler` function uses a ternary condition when assigning to the return value
- The `handler` function uses structured programming style (nested if/else block with explicit assignments to a named object which becomes the return value)
How can we verify the equivalence of 1, 2, and 3?