Implementing malloc() e free() — reducing the heap even more

In our journey implementing malloc() and free(), we learned to reuse memory blocks. Today, we will make a very simple optimization: reduce the heap size as much as possible.

This post is part of a series on implementing the malloc() and free() functions. In the previous article, we learned how to reuse memory blocks. It was a significant advancement, but there’s much more room for improvement.

One example is reducing the size of the heap, as explained in the first post. When we free the last memory block, we move the top of the heap to the end of the previous block. However, this previous block might also be free, as well as others. Consider the scenario below:

void *ptr1 = abmalloc(8);
void *ptr2 = abmalloc(8);
abfree(ptr1);
abfree(ptr2);

In this case, when we free the block pointed to by ptr2, we make ptr1 the last block. However, ptr1 is also free, so we could further reduce the heap size.

To achieve this, we’ll iterate over the pointers from the end of the list until there are no more free blocks. If the header of the received pointer points to the last block and the previous block is free, we move the header pointer to it. We repeat this process until we reach an available block whose previous block is in use (or NULL if it’s the first block). Then, we execute the heap reduction procedure:

if (header == last) {
  while ((header->previous != NULL) && header->previous->available) {
    header = header->previous;
  }
  last = header->previous;
  brk(header);
} else {

Now, though, we need to fix a bug in abfree(). According to the specification, the free() function should accept a null pointer and do nothing. However, if abfree() receives NULL, we will have a segmentation fault! Fortunately, it is easy to fix by adding a check at the beginning of the function:

void abfree(void *ptr) {
   if (ptr == NULL) {
     return;
   }
   Header *header = (Header*) ptr - 1;

So, here’s our abfree() function at the moment:

void abfree(void *ptr) {
   if (ptr == NULL) {
     return;
   }
   Header *header = (Header*) ptr - 1;
   if (header == last) {
     while ((header->previous != NULL) && header->previous->available) {
       header = header->previous;
     }
     last = header->previous;
     brk(header);
   } else {
     header->available = true;
   }
 }

Reducing the size of the heap is a simple optimization, but there are still challenges ahead. In the next post, we’ll discuss how to avoid reusing very large memory blocks for small requests.

(This post is a translation of Implementando malloc() e free() — reduzindo ainda mais o heap, first published in Suspensão de Descrença.)

Implementing malloc() and free() — reusing memory blocks

Dynamic memory allocation is of no use if we cannot reuse freed memory, right? Proceeding with our implementation, we will make our malloc() function use freed blocks of memory when possible!

  1. This post is part of a series on how to implement the malloc() and free() functions. In a previous article, we changed our functions to free up some memory blocks. However, this only occurred if the freed blocks were deallocated from newest to oldest.

This wouldn’t make much difference. Dynamically allocated memory rarely behaves like a stack, where the newest block is always deallocated first. The big advantage of dynamic memory allocation, after all, is that it doesn’t work like a stack.

To understand the limitations of our implementation, consider the code below:

void *ptr1 = abmalloc(8);
void *ptr2 = abmalloc(8);
abfree(ptr1);
void *ptr3 = abmalloc(8);

In the first line, we allocate eight bytes, and free them in the third line. In the last line, we allocate eight bytes again. However, we cannot reuse the freed memory. To truly save memory, we need a more sophisticated solution.

One option is to reuse free blocks. To do this, we add a Boolean field to the block header, called available, which will indicate whether the block is free. As a block can only be reused if the memory requested by abmalloc() is less than or equal to that available in the block, we also need a field in the header indicating the size of the block, which we will call size.

typedef struct Header {
  struct Header *previous;
  size_t size;
  bool available;
} Header;

When the block is allocated, the value of the available field must be false (since the block is not available). We also record the block size in the size field:

void *abmalloc(size_t size) {
  Header *header = sbrk(sizeof(Header) + size);
  header->previous = last;
  header->size = size;
  header->available = false;
  last = header;
  return last + 1;
}

We have the information in the header but we are not yet reusing deallocated memory. To reuse the available blocks, we need to find them! The algorithm for this is very simple: abmalloc() will start iterating over the blocks, starting from the last until reaching the first. Since the previous pointer of the first block is always NULL, we stop when we find such value:

void *abmalloc(size_t size) {
   Header *header = last;
   while (header != NULL) {
     header = header->previous;
   }

In each iteration, we check whether the block is available and has an acceptable size. If in the middle of this process we find an available block greater than or equal to what we need, we got lucky! Just mark the block as unavailable, and return it.

void *abmalloc(size_t size) {
Header *header = last;
while (header != NULL) {
if (header->available && (header->size >= size)) {
header->available = false;
return header + 1;
}
header = header->previous;
}

What if we don’t find a block that satisfies these conditions? In this case, the abmalloc() function increases the heap, as it used to do:

void *abmalloc(size_t size) {
  Header *header = last;
  while (header != NULL) {
    if (header->available && (header->size >= size)) {
      header->available = false;
      return header + 1;
    }
    header = header->previous;
  }
  header = sbrk(sizeof(Header) + size);
  header->previous = last;
  header->size = size;
  header->available = false;
  last = header;
  return last + 1;
}

When it comes to deallocating, we have two possible situations. If the block deallocated by abfree() is the last one, nothing changes: we move the top of the heap to the beginning of the block, we change the last pointer. But what if the block is not on top of the heap? We simply mark it as available, as can be seen in the else clause of the function below:

void abfree(void *ptr) {
   Header *header = (Header*) ptr - 1;
   if (header == last) {
     last = header->previous;
     brk(header);
   } else {
     header->available = true;
   }
 }

Reusing blocks of memory is a huge advance. However, we can be even more efficient in memory usage. For example, we only reduce the heap size if we deallocate the last block. If there are more unused blocks right before it, we could free them too. We will see how to do this in the next post.

(This post is a translation of Implementando malloc() and free() — reutilizando blocos de memória, originally published in Suspensão de Descrença.);

Implementing malloc() and free() — adding metadata to the memory blocks

When malloc() reserves blocks of memory, it needs to somehow make it able to unreserve them later, when free() is called. We fall short of any real solution for this in our last post. In this post, though, we take the first, most fundamental steps to bring real memory efficient to our implementations of malloc() and free()!

This post is part of a series on implementing the malloc() and free() functions. Previously, we implemented a rather simplistic approach that almost doesn’t free any memory: a pointer points to the last allocated block, enabling free() to deallocate it, but only it.

A better option is to make the last block point to the second-to-last, the second-to-last block to the third-to-last, and so on, forming a linked list. To achieve this, we create a struct that will serve as the header of the blocks, containing a pointer to the previous block:

typedef struct Header {
  struct Header *previous;
} Header;

Additionally, the pointer to the last block, which used to be void*, is now of type Header*:

Header *last = NULL;

To use these headers, abmalloc() reserves enough memory to store both the header and the requested size:

void *abmalloc(size_t size) {
  Header *header = sbrk(sizeof(Header) + size);

In this way, we use the beginning of the block to store necessary information, such as a pointer to the last allocated block before the new one:

  header->previous = last;

Then, we update last to point to the new block:

  last = header;

Finally, we return a pointer to the memory that the user can use. Since header points to the metadata, we cannot simply return it. Otherwise, all header information would be overwritten when the user used the pointer! Instead, we return a pointer to just after the header. This pointer is easy to calculate: it is the memory address of the header plus the size of the header:

  return header + 1;
}

Note how we increment the header pointer by 1. Since the pointer type is Header*, the increment is actually the number of bytes of the Header struct, not just one byte. The type of the pointer is very relevant in pointer arithmetic.

Now that our memory blocks have metadata at the beginning, we need to take this into account when deallocating. free() receives a pointer not to the start of the block but to the memory made available to the user. Therefore, we need to find the start of the block from the pointer the user passed. Nothing that a little pointer arithmetic can’t solve:

void abfree(void *ptr) {
  Header *header = (Header*) ptr - 1;

If header points to the last allocated block, the previous block will become the last. In this case, we can return memory from the heap to the operating system through brk():

  if (header == last) {
    last = header->previous;
    brk(header);
  }
}

Here are our new malloc() and free() functions:

typedef struct Header {
   struct Header *previous;
 } Header;

 Header *last = NULL;

 void *abmalloc(size_t size) {
   Header *header = sbrk(sizeof(Header) + size);
   header->previous = last;
   last = header;
   return header + 1;
 }

 void abfree(void *ptr) {
   Header *header = (Header*) ptr - 1;
   if (header == last) {
     last = header->previous;
     brk(header);
   }
 }

abmalloc() and abfree() may be slightly more memory-efficient now, but not by much. Dynamically allocated memory rarely behaves like a stack, where the oldest block is always deallocated first. In the next post, we will see how to use the memory of older blocks that are no longer in use.

(This post is a translation of Implementando malloc() e free() — adicionando metadados aos blocos de memória, from Suspensão de Descrença.)

Implementing malloc() and free() — first steps

Following the wonderful journey that is reading Crafting Interpreters, I reached the point where we implemented an interpreter in C! As always, Bob Nystrom mercilessly proposes very interesting challenges that keep us busy for long periods. For instance, in this chapter, he suggests implementing our own memory allocator, without any real need! Inevitably, I was nerdsniped.

The challenge allows us to allocate a large memory region with an existing malloc() function and manage it, but I decided to implement the malloc() from scratch. Since I use Ubuntu, it was necessary to first understand the memory layout of a process on Linux better.

Consider the diagram below, which represents the memory layout of a process.

In the memory allocated for the process, there are various sections. When the program starts its execution, the shaded part is not yet in use. Throughout its execution, the program declares local variables, causing the stack to grow backward.

On the other hand, dynamically allocated memory is obtained from the heap, which grows in the opposite direction. The popular way to expand the heap is by increasing the size of the data segment (i.e., the section that contains global and static variables) with the sbrk() system call.

Diagram representing how srbk() works, by increasing the data segment pointer but returning the old value.

The above diagram illustrates how this functional system call works. sbrk() takes an integer parameter that will be added to the pointer indicating the end of the data segment. After that, sbrk() returns the value of the pointer before the increment.

In a way, the behavior of sbrk() is already sufficient for memory allocation. Our malloc() function can simply invoke sbrk() and return to the user the pointer to the beginning of the allocated memory block:

void *abmalloc(size_t size) {
   return sbrk(size);
}

In principle, free() doesn’t need to do anything: since in this implementation, we always use memory from the top of the heap, there is nothing we can do to reuse older memory blocks. In that sense, free() can perfectly be a no-op:

void abfree(void *ptr) {
}

A useful operation can be done, however, if the block to be freed is the last one allocated. This means it is at the top of the stack, so we just need to move the stack pointer back with the brk() system call. This syscall takes a pointer as a parameter and, if this pointer is a “reasonable” value (not null, does not point into the stack, does not point before the heap), it uses the pointer’s value as the new top of the heap. The result would be something like this:

void abfree(void *ptr) {
  if (ptr == last_block) {
      brk(last_block);
  }
}

This deallocation, however, is practically useless. Consider the example below:

void *ptr1 = abmalloc(8);
void *ptr2 = abmalloc(8);
abfree(ptr2);
abfree(ptr1);

With the current version of abfree(), we can free the memory pointed to by ptr1, but not the one pointed to by ptr2. To be able to free ptr2, it would be necessary to know that, once ptr1 has been deallocated, the next last block is ptr2. Could we create a second_last_block variable? It wouldn’t help: we would have the same problem with the penultimate block, and so on.

We need a more powerful data structure here, and that’s what we’ll see in our next post.

(This post is a translation of Implementando malloc() e free() — primeiros passos, originally published in Suspensão de Descrença.)

Test utilities, or set-up methods considered harmful

One of the most interesting learnings I had in the old SEA Tecnologia is the creation of test utilities .

Test utilities are a way to reuse code in unit tests. Usually, this is done using setUpor @Before methods, but this has some disadvantages. For example, in a test case, we can have the following initialization:

private Address address;
private AddressDAO addressDAO;

@Before
public void setUp() {
    address = new Address();
    address.setStreet("Rua fulano");
    address.setNumber("123/A");
    addressDAO = new AddressDAO();
}

This initialization works well in the test below…

@Test
public void testGetAllAddresses(){
    addressDAO.addAddress(address);

    List<Address> addresses = addressDAO.getAllAddresses();

    assertEquals(1, addresses.size());
    assertEquals("Rua fulano", addresses.get(0).getStreet());
    assertEquals("123/A", addresses.get(0).getNumber());
}

However, in the following test, the created object is a waste, is not used at all:

@Test
public void testGetNoAddress() {
    List<Address> addresses = addressDAO.getAllAddresses();

    assertEquals(0, addresses.size());
}

In the next following test, we have code redundancy. We also have to decide whether the other object should be created in the @Before method or in the test method:

@Test
public void testGetAllAddressesMoreThanOne() {
    addressDAO.addAddress(address);
    Address address2 = new Address();
    address2.setStreet("Outra rua");
    address2.setNumber("111");
    addressDAO.addAddress(address2);
    List<Address> addresses = addressDAO.getAllAddresses(); 
    assertEquals(1, addresses.size());
    assertEquals("Rua fulano", addresses.get(0).getStreet());
    assertEquals("123/A", addresses.get(0).getNumber()); 
}

These inconveniences are minor when compared to the task of creating a network of dependencies. For example, to test a class Person that adds a test case Address in another test case, we will have to have one @Before similar to this:

private Person person;
private Address address;
private PersonDAO personDAO;

@Before     
public void setUp() {
    address = new Address();
    address.setStreet("Rua fulano");
    address.setNumber("123/A");
    person = new Person();
    person.setName("João");
    person.setAddress(address);
    personDAO = new PersonDAO();
}

The code for creating addresses was duplicated, and it is difficult to create the dependencies. In these examples, we see simple cases, but it is easy to see how the situation would get complicated.

We solve this problem by creating a class to create these objects. This class would be something like this:

public class TestUtil {
    public static Address utilCreateAddress(String street, String number) {
        Address address = new Address();
        address.setStreet("Rua fulano");
        address.setNumber("123/A");
        return address;     
    }

    public static Person utilCreatePerson(String name, Address address) {
        Person person = new Person();
        person.setName(name);
        person.setAddress(address);
        return person;
    }
}

Our test cases extended TestUtil, making object creation easier:

public class TestAddress2 extends TestUtil {
    private AddressDAO addressDAO = new AddressDAO();

    @Test
    public void testGetAllAddresses() {
        Address address = utilCreateAddress("Rua fulano", "123/A");
        addressDAO.addAddress(address);

        List<Address> addresses = addressDAO.getAllAddresses();

        assertEquals(1, addresses.size());
        assertEquals("Rua fulano", addresses.get(0).getStreet());
        assertEquals("123/A", addresses.get(0).getNumber());
    }

    @Test
    public void testGetNoAddress() {
        List<Address> addresses = addressDAO.getAllAddresses();

        assertEquals(0, addresses.size());
    }

    @Test
    public void testGetAllAddressesMoreThanOne() {
        Address address = utilCreateAddress("Rua fulano", "123/A");
        Address address2 = utilCreateAddress("Outra rua", "111");
        addressDAO.addAddress(address);
        addressDAO.addAddress(address2);

        List<Address> addresses = addressDAO.getAllAddresses();

        assertEquals(2, addresses.size());
        assertEquals("Rua fulano", addresses.get(0).getStreet());
        assertEquals("123/A", addresses.get(0).getNumber());
    } 
}

As we also frequently needed some specific object with just one or two parameters to be defined, we created methods variants:

public static Address utilCreateAddress() {
    return utilCreateAddress("Qualquer", "Qualquer");
}

public static Person utilCreatePerson() {
    return utilCreatePerson("José", utilCreateAddress());
}

We learned that in a somewhat complex project, with large networks of object dependencies. The use of these test utilities made it possible to practice TDD on the system. It was exciting to discover that, to create that document that depended on seven other documents and five or six users, all you had to do was call a method.

Of course, there is more to our testing utilities than has been written here, and there may be even more that we haven’t even done. (For example, it may be interesting to write test utilities for specific classes, instead of one gigantic utility.) However, as the idea is very simple, we hope this first step motivates you to think about the topic. Until later!

(This is a translation of the post “Utilitários de Teste” from Suspensão de Descrença. It was originally posted in the old SEA Tecnologia blog. As the original post went offline but the topic remains relevant, I decided to republish it.)

No comments. Now what?

Traditionally, it is considered good practice to comment code. However, this wisdom has been revisited in recent times. At Liferay, for example, we follow a policy of not commenting code. Personally, I am an enthusiast of this philosophy. But I don’t want to present or defend this strategy here, there is a lot of good material on this subject. I want to discuss an open question.

Whoever comments wants to convey some important information. What information is this? And, most importantly, where can we register it? Let’s look at some alternatives.

What do these lines do?

Function names are excellent for explaining what the code does. If a block of code requires a comment, consider extracting it into a function or class. The name of the entity will already clarify its purpose.

Observe, for example, the lines below, taken from this test class:

Assert.assertNotNull(recurrence);
Assert.assertNull(recurrence.getUntilJCalendar());
Assert.assertEquals(0, recurrence.getCount());

These lines check if an event’s RRule has certain properties: it must exist, have a null “untilCalendar, and a count of zero.

The concepts are complex; even I would be confused rereading these asserts. A comment could explain them. But this commit already clarified everything by moving these lines to a method and invoking it:

assertRepeatsForever(recurrence);

Those assertions were checking if the event repeats indefinitely! No comment was needed—fortunately, as these asserts were in various tests.

What is happening here?

If a comment would explain something relevant at runtime, consider turning it into a log message! Check the example below:

if (Validator.isBlank(serviceAccountKey)) {
    // If no credentials are set for GCS Store, the library will
    // use Application Default Credentials.
    _googleCredentials =
        ServiceAccountCredentials.getApplicationDefault();
}
else {
    _googleCredentials = ServiceAccountCredentials.fromStream(
        new ByteArrayInputStream(serviceAccountKey.getBytes()));
}

This comment may be relevant to someone reading the code. However, it would be crucial for someone investigating an authentication issue. Therefore, in practice, I chose to log a message:

if (Validator.isBlank(serviceAccountKey)) {
    if (_log.isInfoEnabled()) {
        _log.info(
            "No credentials set for GCS Store. Library will use " +
                "Application Default Credentials.");
    }

    _googleCredentials =
        ServiceAccountCredentials.getApplicationDefault();
}
else {
    _googleCredentials = ServiceAccountCredentials.fromStream(
        new ByteArrayInputStream(serviceAccountKey.getBytes()));
}

Why are they here?

Comments to explain why certain lines are there are also common. A better place to share this information is in commit messages.

These days, for example, I was asked to help with some code I worked on years ago. Reading a JSP—remember, years ago—I found these lines:

<liferay-portlet:renderURL portletName="<%= KaleoDesignerPortletKeys.KALEO_DESIGNER %>" var="viewURL">
    <portlet:param name="mvcPath" value="/designer/view_kaleo_definition_version.jsp" />
    <portlet:param name="redirect" value="<%= currentURL %>" />
    <portlet:param name="name" value="<%= kaleoDefinitionVersion.getName() %>" />
    <portlet:param name="draftVersion" value="<%= kaleoDefinitionVersion.getVersion() %>" />
</liferay-portlet:renderURL>

This tag is generating a URL to be used elsewhere. But my trained eyes found the portletName parameter strange. This value is usually set automatically.

A git blame clarified everything when I found this commit. The message is clear:

LPS-74977 / LPS-73731 By making the render URL explicitly use the Kaleo Designer name, it will be valid when rendered in another portlet.

I get it! This code will probably be invoked by some other portlet. In this case, the value would be automatically set by the other application, and for some reason, we want to avoid that.

(By the way, that’s why I prefer small commits: they make it easier to discover the reason for very specific code snippets. It’s like every line of code has a comment! It’s not a unanimous position, though: some prefer larger commits.)

The purpose of the line was clarified. But why can it be invoked by another application? This is not usual…

Why was this change made?

Well-written code explains how something was implemented. The commit message explains the why, but in a local context. How do you explain the broader motivation behind code without resorting to comments?

Issue tracker tickets are excellent for this. Typically written to guide development, these documents are very helpful in interpreting the code. If we add the ticket key to the commit message, we can track the reasons.

Returning to the example above. We found that a line allows using the same code in multiple portlets. But this is rarely necessary. Why do we need to reuse the code in this case? Fortunately, the message mentions two tickets. I checked the older one; I arrived at LPSA-64324:

[Information Architecture] EE — As a portal admin, I would like to access all workflow portlets from the control panel section under the same tab.

The title already helps, and the text clarifies it even more. For usability reasons, three different applications started appearing in tabs of the same portlet. It makes complete sense!

The comments we like

It’s important to highlight that we are trying to avoid disorganized comments that intertwine with the code and attempt to explain difficult-to-understand sections. There are various comments, often with standardized formats, that do not hinder readability. An obvious example is the copyright header.

Another effective way to use comments is through literate programming. In this programming style, comments take the spotlight: the source code contains more prose than executable code. This is useful when explaining the algorithm is more important than reading it, as in academic research and data analysis. Not surprisingly, it is the paradigm of popular tools like Jupyter Notebook and Quarto.

Even more relevant, tools like Javadoc, JSDoc, Doxygen, etc. read comments in a specific format to generate documentation. These comments do not affect readability. On the contrary, Javadocs are great for explaining how to use these entities. Combined with tools like my dear Doctest, we even get guarantees of accuracy and correctness!

A World of Possibilities

These are just a few examples of alternatives to comments. There are many other options, such as wikis and blogs. I’ve even found explanations for code I wrote myself on Stack Overflow! We can think of even more solutions to meet different needs. The key point is that with these tools at our disposal, adding comments directly to the code becomes unnecessary.

Naturally, not commenting is just one way to write readable code. Comments are not forbidden; in fact, there are strategies that can make them effective. However, in my experience, avoiding comments often leads to better results, and these techniques help document important information that doesn’t fit directly into the code.

Are you a follower of the “no comments” strategy? If so, where else do you convey information? If not, how do you ensure effective comments? What type of comment do you not see being replaced by these approaches? I’d love to hear your opinions.

(This post is a translation of “Sem Comentários. E Agora?from my blog Suspensão de Descrença.)

Importing ES 6 Modules from CommonJS

Here at Liferay, a few days ago, we needed to use the p-map package. There was only one problem: our application still uses the CommonJS format, and p-map releases ES6 modules only. Even some of the best references I found (e.g. this post) made it clear that it would not be possible to import ES6 modules from CommonJS.

The good news is that this is no longer true! Using dynamic import, we can load ES6 modules from CommonJS. Let’s look at an example.

In this project, the importer.js file tries to use require() to import an ES6 module:

const pmap = require('p-map');

exports.importer = () => {
  console.log('Yes, I could import p-map:', pmap);
}

Of course, it doesn’t work:

$ node index.js 
internal/modules/cjs/loader.js:1102
      throw new ERR_REQUIRE_ESM(filename, parentPath, packageJsonPath);
      ^

Error [ERR_REQUIRE_ESM]: Must use import to load ES Module: /home/adam/software/es6commonjs/node_modules/p-map/index.js
require() of ES modules is not supported.
require() of /home/adam/software/es6commonjs/node_modules/p-map/index.js from /home/adam/software/es6commonjs/importer.js is an ES module file as it is a .js file whose nearest parent package.json contains "type": "module" which defines all .js files in that package scope as ES modules.
Instead rename index.js to end in .cjs, change the requiring code to use import(), or remove "type": "module" from /home/adam/software/es6commonjs/node_modules/p-map/package.json.

    at new NodeError (internal/errors.js:322:7)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1102:13)
    at Module.load (internal/modules/cjs/loader.js:950:32)
    at Function.Module._load (internal/modules/cjs/loader.js:790:12)
    at Module.require (internal/modules/cjs/loader.js:974:19)
    at require (internal/modules/cjs/helpers.js:101:18)
    at Object.<anonymous> (/home/adam/software/es6commonjs/importer.js:1:14)
    at Module._compile (internal/modules/cjs/loader.js:1085:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
    at Module.load (internal/modules/cjs/loader.js:950:32) {
  code: 'ERR_REQUIRE_ESM'
}

The solution is to convert require() into a dynamic import. But there is one detail: import imports return Promises. There are many ways to deal with this; the simplest one is probably to make our function asynchronous, like in this version:

exports.importer = async () => {
  const pmap = await import('p-map');
  console.log('Yes, I could import p-map:', pmap);
}

Now our little app works!

$ node index.js 
ok
Yes, I could import p-map: [Module: null prototype] {
  AbortError: [class AbortError extends Error],
  default: [AsyncFunction: pMap],
  pMapSkip: Symbol(skip)
}

Some other adjustments may be necessary. (I had to adjust the eslint settings, for example.) The important thing is that this is possible. And it’s not a kludge: Node’s own documentation recommends this approach.

So, don’t be scared by outdated information: you won’t need to rewrite your entire application as ES 6 modules, at least for now. For us, this was quite a relief!

(This post is a translation of Importando Módulos ES6 em CommonJS, first published in Suspensão da Descrença.)

Don’t Interpret Me Wrong: Improvising Tests for an Interpreter

I’m in love with the Crafting Interpreters book. In it, Bob Nystrom teach us how to writer an interpreter by implementing a little programming language called Lox. It was a long time since I had so much fun programming! Besides being well-written, the book is funny and teach way more than I would expect. But I have a problem.

The snippets in the bug are written in a way we can copy and paste them. However, the book has challenges at the end of each chapter, these challenges have no source code and sometime they force us to change the interpreter a lot. I do every one of these exercises and as a result my interpreter diverges too much from the source in the book. Consequently, I often break some part of my interpreter.

How to solve that?

Unity tests would be brittle since the code structure changes frequently. End-to-end tests seem more practical in this case. So, for each new feature of the language, I wrote a little program. For example, my interpreter should create closures, and to ensure that I copied the Lox program below to the file counter.lox:

return count;
}

var counter = makeCounter();
counter(); // “1”.
counter(); // “2”.</code></pre>
<p>

This program result should be the numbers 1 and 2 printed in different lines. So I put these values in a file called counter.lox.out. The program cannot fail either, so I created an empty file called counter.lox.err. (In some cases, it is necessary to ensure the Lox program will fail. In these cases, the file .lox.err should have content.)

Well, I wrote programs and output files for various examples; now I need to compare the programs’ results to the expected outputs. I decided to use the tool that helps me the most in urgent times: shell script. I did a Bash script with a for iterating over all examples:

done</code></pre>
<p>

For each example, I executed the Lox program, redirecting the outputs to temporary files:

Now, we compare the real output with the expected output through diff. When it compares two files, diff returns 0 if there is no difference, 1 if there exists a difference or 2 in case of error. Since in Bash the conditional if considers 0 as true, we just check the negation of diff‘s exit code.

If the program prints something in standard output that is different from what is in its .lox.out file, we have a failure:

if ! diff $l.out $out
then
FAIL=1
fi
done</code></pre>
<p>

We also check the standard error and the .lox.err file:

if ! diff $l.out $out
then
FAIL=1
fi

if ! diff $l.err $err
then
FAIL=1
fi
done</code></pre>
<p>

Finally, I check if there was some failure and report the result:

if ! diff $l.out $out
then
FAIL=1
fi

if ! diff $l.err $err
then
FAIL=1
fi

if [ &quot;$FAIL&quot; = &quot;1&quot; ]
then
echo &quot;FAIL&quot; $l
else
echo &quot;PASS&quot; $l
fi
done</code></pre>
<p>

Not all of my Lox programs can be checked, though. For example, there is a program which times loop executions, it is impossible to anticipate the value it will print. Because of that, I added the possibility to jump some programs: we need just to create a file with the .lox.skip extension:

out=$(mktemp)
err=$(mktemp)
java -classpath target/classes/ br.com.brandizzi.adam.myjlox.Lox $l &gt; $out 2&gt; $err

if ! diff $l.out $out
then
FAIL=1
fi

if ! diff $l.err $err
then
FAIL=1
fi

if [ &quot;$FAIL&quot; = &quot;1&quot; ]
then
echo &quot;FAIL&quot; $l
else
echo &quot;PASS&quot; $l
fi
done</code></pre>
<p>

If, however, I have a Lox example and it does not have expected output files (nor the .lox.skip file) then I have a problem and the entire script fails:

out=$(mktemp)
err=$(mktemp)
java -classpath target/classes/ br.com.brandizzi.adam.myjlox.Lox $l &gt; $out 2&gt; $err

if ! diff $l.out $out
then
FAIL=1
fi

if ! diff $l.err $err
then
FAIL=1
fi

if [ &quot;$FAIL&quot; = &quot;1&quot; ]
then
echo &quot;FAIL&quot; $l
else
echo &quot;PASS&quot; $l
fi
done</code></pre>
<p>

With that, my test script is done. Let us see how it behaves:

$ ./lcheck.sh
PASS examples/attr.lox
PASS examples/bacon.lox
PASS examples/badfun.lox
PASS examples/badret.lox
PASS examples/bagel.lox
PASS examples/bostoncream.lox
PASS examples/cake.lox
PASS examples/checkuse.lox
PASS examples/circle2.lox
PASS examples/circle.lox
1d0
< 3
1c1
<
---
> [line 1] Error at ',': Expect ')' after expression.
FAIL examples/comma.lox
PASS examples/counter.lox
PASS examples/devonshinecream.lox
PASS examples/eclair.lox
PASS examples/fibonacci2.lox
PASS examples/fibonacci.lox
PASS examples/func.lox
PASS examples/funexprstmt.lox
PASS examples/hello2.lox
PASS examples/hello3.lox
PASS examples/hello.lox
PASS examples/math.lox
PASS examples/notaclass.lox
PASS examples/noteveninaclass.lox
PASS examples/point.lox
PASS examples/retthis.lox
PASS examples/scope1.lox
PASS examples/scope.lox
PASS examples/supersuper.lox
PASS examples/thisout.lox
PASS examples/thrice.lox
SKIP examples/timeit.lox
PASS examples/twovars.lox
PASS examples/usethis.lox
PASS examples/varparam.lox

Oops, apparently I removed the support for the comma operator by accident. Good thing I wrote this script, right?

I hope this post was minimally interesting! Now, I am going to repair my comma operator and keep reading this wonderful book.

(This post is a translation of Não me Interprete Mal: Improvisando Testes para um Interpretador.)

Give Doctest a chance

Doctest is one of my favorite Python modules. With doctest, it is possible to execute code snippets from documentation. You could, for example, write something like this in your  turorial.md

>>> f()
1

…and then execute the command python -mdoctest tutorial.md. If f() returns 1, nothing will happen. If it returns something else, though, an error message will appear, similar to this one:

**********************************************************************
File "f.txt", line 2, in f.txt
Failed example:
    f()
Expected:
    1
Got:
    2
**********************************************************************
1 items had failures:
   1 of   2 in f.txt
***Test Failed*** 1 failures.

It is an impressive tool, but also an unpopular one.  The problem is, Doctest is often improperly used. For example, it is common to try to write unit tests with doctests. Great mistake.

Nonetheless, I believe it is unfair to disregard the module due to these misunderstandings. Doctest can and should be used for what it does best: to keep your documentation alive, and even to guide your development!

Let me show an example.

When you don’t know what to do

Some days ago, I was writing a class to modify an HTML document using xml.dom.minidom. At one point, I needed a function to map CSS classes to nodes from the document. That alone would be a complicated function! I had no idea of where to start.

In theory, unit tests could be useful here. They just would not be very practical: this was an internal, private function, an implementation detail. To test it, I’d have to expose it. We would also need a new file, for the tests. And test cases are not that legible anyway.

Reading the documentation from the future

Instead, I documented the function first. I wrote a little paragraph describing what it would do. It alone was enough to clarify my ideas a bit:

Given an xml.dom.minidom.Node, returns a map
from every “class” attribute to a list of nodes
with this class.

Then, I though about how to write the same thing, but with a code example. In my head, this function (which I called get_css_class_dict()) would receive xml.dom.minidom document. So, I wrote an example:

    >>> doc = xml.dom.minidom.parseString(
    ...     '''
    ...     
    ...         
    ...


... ... ... ''')

Given this snippet, I would expect the function to return a dict. My document has two CSS classes, “a” and “b,” and then my dict would have two keys. Each key would have a list of the nodes with the CSS class. Something like this:

    >>> d = get_css_class_dict(doc)
    >>> d['a']  # doctest: +ELLIPSIS
    [, ]
    >>> d['b']  # doctest: +ELLIPSIS
    []

I put these sketches in the docstring  of get_css_class_dict(). So far, we have this function:

def get_css_class_dict(node):
    """
    Given an xml.dom.minidom.Node, returns a map from every "class" attribute
    from it to a list of nodes with this class.

    For example, for the document below:

    >>> doc = xml.dom.minidom.parseString(
    ...     '''
    ...     
    ...         
    ...


... ... ... ''') ...we will get this: >>> d = get_css_class_dict(doc) >>> d['a'] # doctest: +ELLIPSIS [, ] >>> d['b'] # doctest: +ELLIPSIS [] """ pass

I could do something similar with unit tests but there would be much more code around, polluting the documentation. Besides that, the prose graciously complements the code, giving rhythm to the reading.

I execute the doctests and this is the result:

**********************************************************************
File "vtodo/listing/filler.py", line 75, in filler.get_css_class_dict
Failed example:
    d['a']
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib/python3.6/doctest.py", line 1330, in __run
        compileflags, 1), test.globs)
      File "", line 1, in 
        d['a']
    TypeError: 'NoneType' object is not subscriptable
**********************************************************************
File "vtodo/listing/filler.py", line 77, in filler.get_css_class_dict
Failed example:
    d['b']
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib/python3.6/doctest.py", line 1330, in __run
        compileflags, 1), test.globs)
      File "<https://suspensao.blog.br/disbelief/wp-admin/edit-tags.php?taxonomy=category;doctest filler.get_css_class_dict[3]>", line 1, in 
        d['b']
    TypeError: 'NoneType' object is not subscriptable
**********************************************************************
1 items had failures:
   2 of   4 in filler.get_css_class_dict
***Test Failed*** 2 failures.

I’m following test-driven development, basically, but with executable documentation. At once, I got a readable example and a basic test.

Now, we just need to implement the function! I used some recursion and, if the code is not the most succinct ever at first…

def get_css_class_dict(node):
    """
    Given an xml.dom.minidom.Node, returns a map from every "class" attribute
    from it to a list of nodes with this class.

    For example, for the document below:

    >>> doc = xml.dom.minidom.parseString(
    ...     '''
    ...     
    ...         
    ...


... ... ... ''') ...we will get this: >>> d = get_css_class_dict(doc) >>> d['a'] # doctest: +ELLIPSIS [, ] >>> d['b'] # doctest: +ELLIPSIS [] """ css_class_dict = {} if node.attributes is not None and 'class' in node.attributes: css_classes = node.attributes['class'].value for css_class in css_classes.split(): css_class_list = css_class_dict.get(css_class, []) css_class_list.append(node) css_class_dict[css_class] = css_class_list childNodes = getattr(node, 'childNodes', []) for cn in childNodes: ccd = get_css_class_dict(cn) for css_class, nodes_list in ccd.items(): css_class_list = css_class_dict.get(css_class, []) css_class_list.extend(nodes_list) css_class_dict[css_class] = css_class_list return css_class_dict

…at least it works as expected:

$ python -mdoctest vtodo/listing/filler.py 
**********************************************************************
File "vtodo/listing/filler.py", line 77, in filler.get_css_class_dict
Failed example:
    d['b']  # doctest: +ELLIPSIS
Expected:
    []
Got:
    []
**********************************************************************
1 items had failures:
   1 of   4 in filler.get_css_class_dict
***Test Failed*** 1 failures.

Wait a minute. What was that?!

When the documentation is wrong

Well, there is a mistake in my doctest! The span element does not have the “b” class—the div element does. So, I just need to change the line

[<DOM Element: span at ...>]

to

[<DOM Element: div at ...>]

and the Doctest will pass.

Isn’t it wonderful? I found a slip in my documentation almost immediately. More than that: if my function’s behavior changes someday, the example from my docstring will fail. I’ll know exactly where the documentation will need updates.

Making doctests worth it

That is the rationale behind Doctest. Our documentation had a subtle mistake and we found it by executing it. Doctests do not guarantee the correctness of the code; they reinforces the correctness of documentation. It is a well-known aspect of the package but few people seem to believe it is worth it.

I think it is! Documentation is often deemed an unpleasant work but it does not have to be so.  Just as TDD make tests exciting, it is possible to make documentation fun with doctests.

Besides that, in the same way TDD can point to design limitations, a hard time writing doctests can point out to API problems. If it was hard to write a clear and concise example of use for your API, surrounded by explaining text, it is likely too complicated, right?

Give Doctest a chance

In the end, I see doctests limitations. They are surely inadequate for unit tests, for example.  And yet, doctest makes documenting so easy and fun! I don’t see why it is so unpopular.

Nonetheless, its greatest advantage is how doctest makes the development process easier. Some time ago, I joked that we need to create DocDD:

With Doctest, it is not just a joke anymore.

This post is a translation of Dê uma chance a Doctest.