January 2024 – Suspension of Disbelief

Implementing malloc() and free() — adding metadata to the memory blocks

When malloc() reserves blocks of memory, it needs to somehow make it able to unreserve them later, when free() is called. We fall short of any real solution for this in our last post. In this post, though, we take the first, most fundamental steps to bring real memory efficient to our implementations of malloc() and free()!

This post is part of a series on implementing the malloc() and free() functions. Previously, we implemented a rather simplistic approach that almost doesn’t free any memory: a pointer points to the last allocated block, enabling free() to deallocate it, but only it.

A better option is to make the last block point to the second-to-last, the second-to-last block to the third-to-last, and so on, forming a linked list. To achieve this, we create a struct that will serve as the header of the blocks, containing a pointer to the previous block:

typedef struct Header {
  struct Header *previous;
} Header;

Additionally, the pointer to the last block, which used to be void*, is now of type Header*:

Header *last = NULL;

To use these headers, abmalloc() reserves enough memory to store both the header and the requested size:

void *abmalloc(size_t size) {
  Header *header = sbrk(sizeof(Header) + size);

In this way, we use the beginning of the block to store necessary information, such as a pointer to the last allocated block before the new one:

  header->previous = last;

Then, we update last to point to the new block:

  last = header;

Finally, we return a pointer to the memory that the user can use. Since header points to the metadata, we cannot simply return it. Otherwise, all header information would be overwritten when the user used the pointer! Instead, we return a pointer to just after the header. This pointer is easy to calculate: it is the memory address of the header plus the size of the header:

  return header + 1;
}

Note how we increment the header pointer by 1. Since the pointer type is Header*, the increment is actually the number of bytes of the Header struct, not just one byte. The type of the pointer is very relevant in pointer arithmetic.

Now that our memory blocks have metadata at the beginning, we need to take this into account when deallocating. free() receives a pointer not to the start of the block but to the memory made available to the user. Therefore, we need to find the start of the block from the pointer the user passed. Nothing that a little pointer arithmetic can’t solve:

void abfree(void *ptr) {
  Header *header = (Header*) ptr - 1;

If header points to the last allocated block, the previous block will become the last. In this case, we can return memory from the heap to the operating system through brk():

  if (header == last) {
    last = header->previous;
    brk(header);
  }
}

Here are our new malloc() and free() functions:

typedef struct Header {
   struct Header *previous;
 } Header;

 Header *last = NULL;

 void *abmalloc(size_t size) {
   Header *header = sbrk(sizeof(Header) + size);
   header->previous = last;
   last = header;
   return header + 1;
 }

 void abfree(void *ptr) {
   Header *header = (Header*) ptr - 1;
   if (header == last) {
     last = header->previous;
     brk(header);
   }
 }

abmalloc() and abfree() may be slightly more memory-efficient now, but not by much. Dynamically allocated memory rarely behaves like a stack, where the oldest block is always deallocated first. In the next post, we will see how to use the memory of older blocks that are no longer in use.

(This post is a translation of Implementando malloc() e free() — adicionando metadados aos blocos de memória, from Suspensão de Descrença.)

Tiny Ticket Types

Tickets in Jira tend to accumulate redundant and optional fields, becoming complex and confusing. I like Jira, but I understand the frustration it causes.

A plausible solution could be inspired by software development. We programmers are used to finding massive source files, and we know that breaking them into smaller files drastically improves code comprehension. Therefore, inspired by coding best practices, I suggest creating smaller tickets.

Only three states

One way to limit the size of tickets is to simplify the workflow by restricting the number of states. For example, we can define that each type of ticket would have, at most, three states:

To do
In progress
Done

To represent other stages, we can create new types of tickets, such as sub-tasks.

A moderately complex ticket type

Let’s look at an example. Consider the ticket below:

Key: XYZ-1234. Status: Testing. Title: Nasal demons. Description: Calling free() on a previosly dealocaded pointer results in demons coming out of the nose. Technical analysis: The root cause is an undefined behavior. Test results: The patch does not work, now ghosts pop out of the user’s ears. Release date: 2023-12-22

It would follow this workflow:

Open ⇨ To do ⇨ In Analysis ⇨ Doing ⇨ Testing ⇨ Release ⇨ Done

How could we reduce the number of phases?

We can start by removing the “In Analysis” stage. In its place, we create a new type of ticket called “Technical Analysis.” This way, the original task remains in progress (“Doing”) while the technical analysis is underway.

Fewer fields in a ticket

An advantage of this would be transferring fields to sub-tasks. Fields that would clutter the original ticket can appear only in tasks where they are relevant.

Consider the “Release date” field, which only makes sense in the “Release” phase. If developers, testers, etc., are not responsible for the release, this field is confusing and pollutes the original task. With a new task type called “Release,” this field would be in the most appropriate place, keeping the original ticket concise.

Repeating stages without regressing

Another advantage is that the original ticket can go through the same stage multiple times. It’s common for a ticket to have a development phase followed by quality tests, for example. However, if a problem arises in the evaluation, it’s not advisable to revert to the development phase. How to deal with this?

By working with sub-tasks, we can mark validation as completed and create a new implementation ticket. In our ticket, for example, we can remove the “Testing” phase and create a sub-task of type “Test,” as well as another one called “Development.” Every time the test fails, we close testing and open a new development task.

Result

Following this strategy, our ticket would look like this:

And the workflow would be much simpler:

Naturally, this strategy is flexible. In our case, for example, we haven’t removed the “To do” phase yet. Restricting it to five (including backlog and validation) is another possibility. The core idea is to limit the number of stages to a small value for all tickets.

Conclusions

In programming, it’s common to encounter the so-called “God objects,” huge objects responsible for various different functions. Breaking them down is a surefire way to achieve code quality. Therefore, I suspect the same principle can apply to tickets in Jira.

I’m not a project manager, but as a programmer, I believe that limiting the size and steps of tickets can be an effective idea. I’m curious to know if anyone has tried this and how it went.

Implementing malloc() and free() — first steps

Following the wonderful journey that is reading Crafting Interpreters, I reached the point where we implemented an interpreter in C! As always, Bob Nystrom mercilessly proposes very interesting challenges that keep us busy for long periods. For instance, in this chapter, he suggests implementing our own memory allocator, without any real need! Inevitably, I was nerdsniped.

The challenge allows us to allocate a large memory region with an existing malloc() function and manage it, but I decided to implement the malloc() from scratch. Since I use Ubuntu, it was necessary to first understand the memory layout of a process on Linux better.

Consider the diagram below, which represents the memory layout of a process.

In the memory allocated for the process, there are various sections. When the program starts its execution, the shaded part is not yet in use. Throughout its execution, the program declares local variables, causing the stack to grow backward.

On the other hand, dynamically allocated memory is obtained from the heap, which grows in the opposite direction. The popular way to expand the heap is by increasing the size of the data segment (i.e., the section that contains global and static variables) with the sbrk() system call.

Diagram representing how srbk() works, by increasing the data segment pointer but returning the old value.

The above diagram illustrates how this functional system call works. sbrk() takes an integer parameter that will be added to the pointer indicating the end of the data segment. After that, sbrk() returns the value of the pointer before the increment.

In a way, the behavior of sbrk() is already sufficient for memory allocation. Our malloc() function can simply invoke sbrk() and return to the user the pointer to the beginning of the allocated memory block:

void *abmalloc(size_t size) {
   return sbrk(size);
}

In principle, free() doesn’t need to do anything: since in this implementation, we always use memory from the top of the heap, there is nothing we can do to reuse older memory blocks. In that sense, free() can perfectly be a no-op:

void abfree(void *ptr) {
}

A useful operation can be done, however, if the block to be freed is the last one allocated. This means it is at the top of the stack, so we just need to move the stack pointer back with the brk() system call. This syscall takes a pointer as a parameter and, if this pointer is a “reasonable” value (not null, does not point into the stack, does not point before the heap), it uses the pointer’s value as the new top of the heap. The result would be something like this:

void abfree(void *ptr) {
  if (ptr == last_block) {
      brk(last_block);
  }
}

This deallocation, however, is practically useless. Consider the example below:

void *ptr1 = abmalloc(8);
void *ptr2 = abmalloc(8);
abfree(ptr2);
abfree(ptr1);

With the current version of abfree(), we can free the memory pointed to by ptr1, but not the one pointed to by ptr2. To be able to free ptr2, it would be necessary to know that, once ptr1 has been deallocated, the next last block is ptr2. Could we create a second_last_block variable? It wouldn’t help: we would have the same problem with the penultimate block, and so on.

We need a more powerful data structure here, and that’s what we’ll see in our next post.

(This post is a translation of Implementando malloc() e free() — primeiros passos, originally published in Suspensão de Descrença.)