Give Doctest a chance

Doctest is one of my favorite Python modules. With doctest, it is possible to execute code snippets from documentation. You could, for example, write something like this in your  turorial.md

>>> f()
1

…and then execute the command python -mdoctest tutorial.md. If f() returns 1, nothing will happen. If it returns something else, though, an error message will appear, similar to this one:

**********************************************************************
File "f.txt", line 2, in f.txt
Failed example:
    f()
Expected:
    1
Got:
    2
**********************************************************************
1 items had failures:
   1 of   2 in f.txt
***Test Failed*** 1 failures.

It is an impressive tool, but also an unpopular one.  The problem is, Doctest is often improperly used. For example, it is common to try to write unit tests with doctests. Great mistake.

Nonetheless, I believe it is unfair to disregard the module due to these misunderstandings. Doctest can and should be used for what it does best: to keep your documentation alive, and even to guide your development!

Let me show an example.

When you don’t know what to do

Some days ago, I was writing a class to modify an HTML document using xml.dom.minidom. At one point, I needed a function to map CSS classes to nodes from the document. That alone would be a complicated function! I had no idea of where to start.

In theory, unit tests could be useful here. They just would not be very practical: this was an internal, private function, an implementation detail. To test it, I’d have to expose it. We would also need a new file, for the tests. And test cases are not that legible anyway.

Reading the documentation from the future

Instead, I documented the function first. I wrote a little paragraph describing what it would do. It alone was enough to clarify my ideas a bit:

Given an xml.dom.minidom.Node, returns a map
from every “class” attribute to a list of nodes
with this class.

Then, I though about how to write the same thing, but with a code example. In my head, this function (which I called get_css_class_dict()) would receive xml.dom.minidom document. So, I wrote an example:

    >>> doc = xml.dom.minidom.parseString(
    ...     '''
    ...     
    ...         
    ...


... ... ... ''')

Given this snippet, I would expect the function to return a dict. My document has two CSS classes, “a” and “b,” and then my dict would have two keys. Each key would have a list of the nodes with the CSS class. Something like this:

    >>> d = get_css_class_dict(doc)
    >>> d['a']  # doctest: +ELLIPSIS
    [, ]
    >>> d['b']  # doctest: +ELLIPSIS
    []

I put these sketches in the docstring  of get_css_class_dict(). So far, we have this function:

def get_css_class_dict(node):
    """
    Given an xml.dom.minidom.Node, returns a map from every "class" attribute
    from it to a list of nodes with this class.

    For example, for the document below:

    >>> doc = xml.dom.minidom.parseString(
    ...     '''
    ...     
    ...         
    ...


... ... ... ''') ...we will get this: >>> d = get_css_class_dict(doc) >>> d['a'] # doctest: +ELLIPSIS [, ] >>> d['b'] # doctest: +ELLIPSIS [] """ pass

I could do something similar with unit tests but there would be much more code around, polluting the documentation. Besides that, the prose graciously complements the code, giving rhythm to the reading.

I execute the doctests and this is the result:

**********************************************************************
File "vtodo/listing/filler.py", line 75, in filler.get_css_class_dict
Failed example:
    d['a']
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib/python3.6/doctest.py", line 1330, in __run
        compileflags, 1), test.globs)
      File "", line 1, in 
        d['a']
    TypeError: 'NoneType' object is not subscriptable
**********************************************************************
File "vtodo/listing/filler.py", line 77, in filler.get_css_class_dict
Failed example:
    d['b']
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib/python3.6/doctest.py", line 1330, in __run
        compileflags, 1), test.globs)
      File "<https://suspensao.blog.br/disbelief/wp-admin/edit-tags.php?taxonomy=category;doctest filler.get_css_class_dict[3]>", line 1, in 
        d['b']
    TypeError: 'NoneType' object is not subscriptable
**********************************************************************
1 items had failures:
   2 of   4 in filler.get_css_class_dict
***Test Failed*** 2 failures.

I’m following test-driven development, basically, but with executable documentation. At once, I got a readable example and a basic test.

Now, we just need to implement the function! I used some recursion and, if the code is not the most succinct ever at first…

def get_css_class_dict(node):
    """
    Given an xml.dom.minidom.Node, returns a map from every "class" attribute
    from it to a list of nodes with this class.

    For example, for the document below:

    >>> doc = xml.dom.minidom.parseString(
    ...     '''
    ...     
    ...         
    ...


... ... ... ''') ...we will get this: >>> d = get_css_class_dict(doc) >>> d['a'] # doctest: +ELLIPSIS [, ] >>> d['b'] # doctest: +ELLIPSIS [] """ css_class_dict = {} if node.attributes is not None and 'class' in node.attributes: css_classes = node.attributes['class'].value for css_class in css_classes.split(): css_class_list = css_class_dict.get(css_class, []) css_class_list.append(node) css_class_dict[css_class] = css_class_list childNodes = getattr(node, 'childNodes', []) for cn in childNodes: ccd = get_css_class_dict(cn) for css_class, nodes_list in ccd.items(): css_class_list = css_class_dict.get(css_class, []) css_class_list.extend(nodes_list) css_class_dict[css_class] = css_class_list return css_class_dict

…at least it works as expected:

$ python -mdoctest vtodo/listing/filler.py 
**********************************************************************
File "vtodo/listing/filler.py", line 77, in filler.get_css_class_dict
Failed example:
    d['b']  # doctest: +ELLIPSIS
Expected:
    []
Got:
    []
**********************************************************************
1 items had failures:
   1 of   4 in filler.get_css_class_dict
***Test Failed*** 1 failures.

Wait a minute. What was that?!

When the documentation is wrong

Well, there is a mistake in my doctest! The span element does not have the “b” class—the div element does. So, I just need to change the line

[<DOM Element: span at ...>]

to

[<DOM Element: div at ...>]

and the Doctest will pass.

Isn’t it wonderful? I found a slip in my documentation almost immediately. More than that: if my function’s behavior changes someday, the example from my docstring will fail. I’ll know exactly where the documentation will need updates.

Making doctests worth it

That is the rationale behind Doctest. Our documentation had a subtle mistake and we found it by executing it. Doctests do not guarantee the correctness of the code; they reinforces the correctness of documentation. It is a well-known aspect of the package but few people seem to believe it is worth it.

I think it is! Documentation is often deemed an unpleasant work but it does not have to be so.  Just as TDD make tests exciting, it is possible to make documentation fun with doctests.

Besides that, in the same way TDD can point to design limitations, a hard time writing doctests can point out to API problems. If it was hard to write a clear and concise example of use for your API, surrounded by explaining text, it is likely too complicated, right?

Give Doctest a chance

In the end, I see doctests limitations. They are surely inadequate for unit tests, for example.  And yet, doctest makes documenting so easy and fun! I don’t see why it is so unpopular.

Nonetheless, its greatest advantage is how doctest makes the development process easier. Some time ago, I joked that we need to create DocDD:

With Doctest, it is not just a joke anymore.

This post is a translation of Dê uma chance a Doctest.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.