October 2024 – Suspension of Disbelief

In the previous post of this series, we saw how the order in which we choose memory blocks to reuse can lead to greater or lesser memory consumption, and we changed our functions to avoid this waste. But we need to solve another, even more serious, problem: sometimes, a very large memory block can occupy the space that several smaller blocks could use. Consider the case below, where we allocate a large chunk of memory, deallocate it, and then allocate two much smaller blocks:

void *ptr1 = abmalloc(128);
void *ptr2 = abmalloc(8);
abfree(ptr1);
void *ptr3 = abmalloc(8);
void *ptr4 = abmalloc(8);

Here, we have a free 128-byte memory block, and when we allocate a block of just 8 bytes, all 128 bytes become unavailable. When we allocate another 8-byte block, the heap needs to grow again. This is not an efficient use of memory.

There are at least two popular solutions for this case. One, more efficient, is to use bins: lists that group blocks by size. This is a more sophisticated and efficient approach, but more complex. Another option, simpler, is to find a large block and split it into smaller blocks. We’ll follow this approach.

But remember: simpler doesn’t exactly mean simple 😉

Initial Refactoring

Before we begin, let’s do a small refactoring. Currently, the header_new() function does two things: it allocates more memory for a new block and initializes its header, setting the metadata and pointers to the previous block. The part of initializing the header might be useful, so let’s extract it. We’ll create two new functions to improve readability:

The header_plug() function, which “plugs” the initialized block to the previous and next blocks.
The header_init() function, which sets the initial values of the block’s metadata (size and availability).

Here’s how they look:

void header_init(Header *header, size_t size, bool available) {
    header->size = size;
    header->available = available;
}

void header_plug(Header *header, Header *previous, Header *next) {
    header->previous = previous;
    if (previous != NULL) {
        previous->next = header;
    }
    header->next = next;
    if (next != NULL) {
        next->previous = header;
    }
}

Now, we just need to modify header_new() to use these new functions:

Header *header_new(Header *previous, size_t size, bool available) {
    Header *header = sbrk(sizeof(Header) + size);
    header_init(header, size, available);
    header_plug(header, previous, NULL);
    return header;
}

(Additionally, we can remove the line last->previous->next = last; from the abmalloc() function, since header_plug() now takes care of that.)

Splitting Blocks

With these tools in hand, let’s create the header_split() function. Given a header and a minimum required size, this function splits the memory block into two if the original block is large enough to contain

the required size,
a new header for the new block, and
a bit of extra memory.

First, we check if the block is large enough:

Header *header_split(Header *header, size_t size) {
    size_t original_size = header->size;
    if (original_size >= size + sizeof(Header)) {

If this condition is met, we split the block. First, we reduce the size of the current block by subtracting the size of a header and the space requested by abmalloc:

header->size = original_size - size - sizeof(Header);

This leaves a memory space after the current block, which we’ll use to create the new block. We calculate the pointer for this new block:

Header *new_header = header + sizeof(Header) + header->size;

Now that we have the pointer to the new block, we initialize its header with header_init():

header_init(new_header, size, true);

And we connect the new block to the previous and next blocks using header_plug():

header_plug(new_header, header, header->next);

If the original block was the last one, the new block will now be the last, so we update the last pointer:

if (header == last) {
    last = new_header;
}

Finally, we return the new block:

return new_header;

If the original block is not large enough, we simply return the original block:

} else {
    return header;
}
}

Updating `abmalloc()`

Now, we just need to go back to the abmalloc() function, and in the place where we find a usable block, we invoke header_split() to try to split it:

if (header->available && (header->size >= size)) {
    header = header_split(header, size);
    header->available = false;
    return header + 1;
}

If the block can be split, the new block will be returned. Otherwise, the original block will be kept and returned as before.

Note on Block Splitting

Notice that we created the new block at the end of the original block. We could have created it at the beginning, but by creating the new used block at the end, the new free block stays closer to older blocks. This way, it will be found first the next time abmalloc() is invoked.

Splitting large memory blocks is a step forward, but there’s an opposite problem: small memory blocks can cause fragmentation, making larger requests cause the heap to grow. We’ll see how to solve this in the next post.

Month: October 2024

Implementing malloc() and free() — splitting large blocks

Initial Refactoring

Splitting Blocks

Updating `abmalloc()`

Note on Block Splitting

Initial Refactoring

Splitting Blocks

Updating abmalloc()

Note on Block Splitting

Updating `abmalloc()`