malloc & free
2022 May 07
See all posts
malloc & free
malloc & free
The post does walkthrough of a basic malloc/free implementation. It won't contain all the optimizations and concerns which the excellent Douglas Lea's Malloc aka dlmalloc covers. But the aim is to highlight the underlying system calls and internal structures which drives these memory management APIs.
brk and sbrk system calls

Resizing the heap (for allocating or freeing memory) is all about getting the kernel to adjust its idea of the process's program break – which is essentially the current limit of the heap. brk()
and sbrk()
system calls allow to manipulate the program break, thereby creating or freeing space on the heap.
#include <unistd.h>
void *
brk(const void *addr);
void *
sbrk(int incr);
In short, brk()
sets the program break to the input addr. sbrk()
accepts offset and increments the current program break by incr
. It returns the location of the last program break i.e. start of the incremented heap segment.
You can get the current program break by using sbrk(0)
– something we'll be using in the discussed implementation below.
How malloc & free works
With the brk/sbrk introduced, we can now start thinking about what else is needed for a malloc/free implementation. For this, let's start with their APIs:
void *
malloc(size_t size);
void
free(void *ptr);
A naive implementation
Let's start with malloc – it returns a pointer to a block of memory of size
bytes. A naive implementation just increments the program break.
void *malloc(size_t size) {
return sbrk(size);
}
This as you can imagine is horribly inefficient, as the program break always goes up and is never reset to lower positions. As the process proceeds, there will be heap segments (between &end
and program break) that gets freed up (holes). But the naive implementation above makes no use of those holes. In fact, a free()
here doesn't make any sense, as the malloc never utilises the space freed up.
Free list
It makes sense for the free()
to capture the deallocated memory chunks for subsequent use by the malloc()
. The chunks can be of different sizes, and we can find a chunk which fits the requested size.
malloc'ing a free chunk means it's not available for further allocation requests, until it is freed again. We define the following doubly linked list struct which resolves these requirements (calling it free list):
typedef struct free_l {
size_t chunk_size;
struct free_l* previous;
struct free_l* next;
} free_l;
free_l* head = NULL;
A malloc'ed segment looks like:

note that,
(actual) memory size allocated =
memory size requested (argument of `malloc`) + sizeof(size_t)
When the allocated segment is freed, the "memory size requested" portion is used to store the previous
and next
free list entries.

Extending further on how this might work. Say we have a free list entry f_entry
which is suitable to serve the malloc request, we need to do the following: - remove the f_entry
from the free list - return &(f_entry->previous)
, which is the memory addr at which chunk_size
ends and can be used by user.
The Implementation
Let's look at the malloc implementation. This section breakdowns the code. If you prefer to just read the complete code, have a look at the gist
Starting with the case when free list is empty:
static const size_t CHUNK_SIZE_SIZE = sizeof(size_t);
// size needed for bookeeping requirement via free list entry
static const size_t FREE_L_SIZE = CHUNK_SIZE_SIZE + 2 * sizeof(void*);
void* malloc(size_t size) {
// blob = free list entry 'header' + requested memory
size_t blob_size = size + FREE_L_SIZE;
if (head == NULL) {
// free list is empty
free_l* entry = sbrk(blob_size);
if (entry == NULL) {
return NULL;
}
entry->chunk_size = blob_size - CHUNK_SIZE_SIZE;
entry->previous = entry->next = NULL;
return &entry->previous; // skip 'chunk_size'
}
// rest of the code is left for brevity
}
Notice the two static variables there. CHUNK_SIZE_SIZE
is what it takes to store the chunk_size
; While in a free list, the size taken for the record keeping is two pointers + 1 size_t
variable, which reflects in FREE_L_SIZE
.
The rest of the code is about adjusting the page break, setting the values in free list entry and returning the address for the user.
Next we look at the case when the free list has some entries, and can use be used to allocate the memory. This would involve traversing the free list and finding the chunk which best fits the request. This is called best-fit. Other algorithms like first-fit can also be used.
// best fit strategy
free_l* best_candidate = NULL;
long int best_gap = LONG_MAX;
for (free_l* curr = head; curr != NULL; curr = curr->next) {
int hole_sz = curr->chunk_size - size;
if (hole_sz < best_gap && hole_sz >= 0) {
best_gap = hole_sz;
best_candidate = curr;
}
}
If we found a best_candidate
here, we pop it off the free list, and return the &best_candidate->previous
as the address from which the requested size is allocated.
When there's no candidate found to serve the request (e.g. the requested size doesn't fit any free chunk), we need to increment the current heap limit (via brk/sbrk), and return the appropriate address.
if (best_candidate == NULL) {
// no slot available big enough to fit `size`. Adjust brk
free_l* new_slot = sbrk(blob_size);
if (new_slot == NULL) {
return NULL;
}
new_slot->chunk_size = blob_size - CHUNK_SIZE_SIZE;
return &new_slot->previous;
}
pop_from_free_list(best_candidate);
return &best_candidate->previous;
The implementation for pop_from_free_list
is given:
void pop_from_free_list(free_l* entry) {
if (entry->previous == NULL) {
// entry is first in the list, adjust the head to next entry
head = entry->next;
} else {
entry->previous->next = entry->next;
}
if (entry->next != NULL) {
entry->next->previous = entry->previous;
}
}
The following is the implementation for free()
. It does a couple of things: - return the chunk to the free list - if the freed memory corresponds to program break, it can be adjusted.
void free(void* ptr) {
if (ptr == NULL) {
return;
}
// return slot to the free list
ptr -= CHUNK_SIZE_SIZE;
free_l* entry_ptr = (free_l*)ptr;
printf("entry location to free: %10p\n", ptr);
entry_ptr->previous = NULL;
entry_ptr->next = head;
head = entry_ptr;
}
The full gist has an additional optimization in free()
to lower the program break if memory is freed from the top of heap.
Conclusion
Here are some notable optimizations/concerns dlmalloc has (over the naive implementation above): - it differentiates between large requests (>=512 bytes) and small requests (<=64 bytes), and has various optimizations around it. Example: The free list entry additionally maintains a pair of pointers to navigate to the previous/next "large size" chunk. Or having a caching allocator (which maintains pools of quickly recycled chunks) for serving small requests. - memory tagging to differentiate between allocated/de-allocated chunks, which is for debugging purposes. - free()
coalesces consecutive chunks into a single free list entry. - when utilizing an entry, malloc()
serves the request with the exact memory required and creates a new/modified entry for the leftover memory space.
Resources
malloc & free
2022 May 07 See all postsmalloc & free
The post does walkthrough of a basic malloc/free implementation. It won't contain all the optimizations and concerns which the excellent Douglas Lea's Malloc aka dlmalloc covers. But the aim is to highlight the underlying system calls and internal structures which drives these memory management APIs.
brk and sbrk system calls
Resizing the heap (for allocating or freeing memory) is all about getting the kernel to adjust its idea of the process's program break – which is essentially the current limit of the heap.
brk()
andsbrk()
system calls allow to manipulate the program break, thereby creating or freeing space on the heap.In short,
brk()
sets the program break to the input addr.sbrk()
accepts offset and increments the current program break byincr
. It returns the location of the last program break i.e. start of the incremented heap segment.You can get the current program break by using
sbrk(0)
– something we'll be using in the discussed implementation below.How malloc & free works
With the brk/sbrk introduced, we can now start thinking about what else is needed for a malloc/free implementation. For this, let's start with their APIs:
A naive implementation
Let's start with malloc – it returns a pointer to a block of memory of
size
bytes. A naive implementation just increments the program break.This as you can imagine is horribly inefficient, as the program break always goes up and is never reset to lower positions. As the process proceeds, there will be heap segments (between
&end
and program break) that gets freed up (holes). But the naive implementation above makes no use of those holes. In fact, afree()
here doesn't make any sense, as the malloc never utilises the space freed up.Free list
It makes sense for the
free()
to capture the deallocated memory chunks for subsequent use by themalloc()
. The chunks can be of different sizes, and we can find a chunk which fits the requested size.malloc'ing a free chunk means it's not available for further allocation requests, until it is freed again. We define the following doubly linked list struct which resolves these requirements (calling it free list):
A malloc'ed segment looks like:

note that,
When the allocated segment is freed, the "memory size requested" portion is used to store the

previous
andnext
free list entries.Extending further on how this might work. Say we have a free list entry
f_entry
which is suitable to serve the malloc request, we need to do the following: - remove thef_entry
from the free list - return&(f_entry->previous)
, which is the memory addr at whichchunk_size
ends and can be used by user.The Implementation
Let's look at the malloc implementation. This section breakdowns the code. If you prefer to just read the complete code, have a look at the gist
Starting with the case when free list is empty:
Notice the two static variables there.
CHUNK_SIZE_SIZE
is what it takes to store thechunk_size
; While in a free list, the size taken for the record keeping is two pointers + 1size_t
variable, which reflects inFREE_L_SIZE
.The rest of the code is about adjusting the page break, setting the values in free list entry and returning the address for the user.
Next we look at the case when the free list has some entries, and can use be used to allocate the memory. This would involve traversing the free list and finding the chunk which best fits the request. This is called best-fit. Other algorithms like first-fit can also be used.
If we found a
best_candidate
here, we pop it off the free list, and return the&best_candidate->previous
as the address from which the requested size is allocated.When there's no candidate found to serve the request (e.g. the requested size doesn't fit any free chunk), we need to increment the current heap limit (via brk/sbrk), and return the appropriate address.
The implementation for
pop_from_free_list
is given:The following is the implementation for
free()
. It does a couple of things: - return the chunk to the free list - if the freed memory corresponds to program break, it can be adjusted.The full gist has an additional optimization in
free()
to lower the program break if memory is freed from the top of heap.Conclusion
Here are some notable optimizations/concerns dlmalloc has (over the naive implementation above): - it differentiates between large requests (>=512 bytes) and small requests (<=64 bytes), and has various optimizations around it. Example: The free list entry additionally maintains a pair of pointers to navigate to the previous/next "large size" chunk. Or having a caching allocator (which maintains pools of quickly recycled chunks) for serving small requests. - memory tagging to differentiate between allocated/de-allocated chunks, which is for debugging purposes. -
free()
coalesces consecutive chunks into a single free list entry. - when utilizing an entry,malloc()
serves the request with the exact memory required and creates a new/modified entry for the leftover memory space.Resources