Welcome back to the series on Memory Management in C++!
In the previous post we looked at the big picture of what we’re building in this series and identified the pieces we’ll need to complete the system. In this post we’ll dive into the code and start creating the foundation of our Memory Manager.
Are you ready?
If you recall from the previous post, the basic concept for our manager was that we’d get a big chunk of memory and we’d deal with how it is sliced.
So, the idea is: during the program’s startup we build our
MemoryManager. It will, in turn, request enough memory from the OS via the
new operator and keep the pointer to that memory. From then on, we won’t perform any more memory requests: we do it only once, in one big chunk.
We could then create our
MemoryManager definition with this behavior:
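Something along these lines would do (a sketch; `m_totalBytes` is an assumed name for the size member):

```cpp
#include <cstddef>

// First version of the manager: one big allocation at startup.
class MemoryManager
{
public:
    explicit MemoryManager(std::size_t totalBytes)
        : m_totalBytes(totalBytes)
        , m_memory(::new unsigned char[totalBytes]) // the only OS request we'll make
    {
    }

private:
    std::size_t m_totalBytes;     // size of the managed memory
    unsigned char* m_memory;      // first byte of the managed array
};
```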
::new refers to the
operator new that’s defined in the global namespace: the standard one. It’s called explicitly this way so we don’t accidentally call another
operator new that could’ve been defined in the scope chain.
unsigned char can store values in the range
0..255 with a size of 1 byte and
m_memory is a pointer to the first byte of an array with as many bytes as requested in the constructor. If we were to use this class as it is now, we’d be leaking memory, so we need to release the array back to the OS. Let’s add that in the destructor:
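A condensed sketch of the class with the destructor in place:

```cpp
#include <cstddef>

// The destructor mirrors the constructor's ::new with a ::delete[].
class MemoryManager
{
public:
    explicit MemoryManager(std::size_t totalBytes)
        : m_totalBytes(totalBytes)
        , m_memory(::new unsigned char[totalBytes])
    {
    }

    ~MemoryManager()
    {
        ::delete[] m_memory; // give the big chunk back to the OS
        m_memory = nullptr;
    }

private:
    std::size_t m_totalBytes;
    unsigned char* m_memory;
};
```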
And with this, we’ve got the start and end of the life cycle for the memory we’ll be managing! But before we continue on with the
MemoryChunk, we need some way to know we’re doing it right.
Dumping the contents
One of the most useful tools we want to have is the ability to dump the contents of the memory we’re managing in a readable format, similar to those awesome hexadecimal editors. We’ll use the most common format: a left column with the memory address, a central column with the bytes at increasing offsets from that address, and a right column with an ASCII representation of each byte. Let’s define a
dump method like so:
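Here’s a sketch of it (the exact formatting flags are assumptions, and it returns the string as well so it’s easy to test):

```cpp
#include <cctype>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Condensed class with a dump() sketch.
class MemoryManager
{
public:
    explicit MemoryManager(std::size_t totalBytes)
        : m_totalBytes(totalBytes)
        , m_memory(::new unsigned char[totalBytes])
    {
    }
    ~MemoryManager() { ::delete[] m_memory; }

    std::string dump(std::size_t bytesPerRow = 16) const
    {
        std::stringstream stream;
        for (std::size_t row = 0; row < m_totalBytes; row += bytesPerRow)
        {
            // Left column: address of the first byte in the row.
            stream << static_cast<const void*>(m_memory + row) << "  ";

            // Central column: each byte in hex, separated by ':'.
            for (std::size_t offset = 0; offset < bytesPerRow && row + offset < m_totalBytes; ++offset)
            {
                // Promoted to int so operator<< prints a number, not a character.
                const int characterIteratorAsInt = m_memory[row + offset];
                stream << std::uppercase << std::hex << std::setw(2) << std::setfill('0')
                       << characterIteratorAsInt
                       << (offset + 1 < bytesPerRow ? ":" : "  ");
            }

            // Right column: printable characters, '.' otherwise.
            for (std::size_t offset = 0; offset < bytesPerRow && row + offset < m_totalBytes; ++offset)
            {
                const unsigned char byte = m_memory[row + offset];
                stream << (std::isprint(byte) ? static_cast<char>(byte) : '.');
            }
            stream << '\n';
        }
        std::cout << stream.str();
        return stream.str();
    }

private:
    std::size_t m_totalBytes;
    unsigned char* m_memory;
};
```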
Wow, long code this time! And a bit ugly due to the
operator<< to deal with the
stringstream! But don’t panic, let’s talk about how it works!
The idea is to iterate over the memory a few bytes at a time (as many as given in
bytesPerRow; let’s say 16). The start of each iteration marks the address of the first element in the row, so we can display that address in the first column.
Then, within the step of
bytesPerRow, we display one byte at a time separated by the
: character. We use
characterIteratorAsInt for display purposes; if we were to use the good old
printf family we’d use
%02X and pass the byte’s value instead of having an intermediate variable.
The last byte in the row also adds the string representation of the bytes in the row as the right column. It basically takes the address of the first element in the row, displays its string representation (or a
. if it can’t be printed) and then moves on to the next byte within the same row until it’s done.
Phew, not bad! Show me an example then! Assume we have this code:
If we execute it in Visual Studio with the Debug x86 configuration, we get something similar to:
Please note that the addresses in the left column will change on each execution. The first byte in the first row has the address
0x000B6058, and we’re displaying 16 bytes per row. Each byte in the row has an offset from
+00 to +0F relative to the pointer in the left column. Let’s illustrate this by manually adding a header row:
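Using the address from the text and the Debug fill value, it would look something like this (addresses are hypothetical and change on each run):

```
             +00+01+02+03+04+05+06+07+08+09+0A+0B+0C+0D+0E+0F
0x000B6058   CD:CD:CD:CD:CD:CD:CD:CD:CD:CD:CD:CD:CD:CD:CD:CD  ................
```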
As you can see, the memory is magically initialized to the
0xCD value. That’s MSVC being kind in a Debug build. As we mentioned before, the
. is used whenever the character doesn’t have a printable representation. MSVC uses the
0xCD value to indicate it’s clean memory, allocated and initialized. If you want to know what other common magic numbers mean, check this Wikipedia article (thanks for the link, @dcanadasm!).
What happens if we use Visual Studio with the Release x86 configuration?
Wow, what is THAT? That’s all trash values from the memory we’ve received from the OS! Where did all of the
0xCD values go? We’re not in a Debug build anymore, so we’re out in the wild! Which brings me to another piece of advice:
Always initialize your variables.
And regularly test Release builds!
Now that we’ve seen the difference between Debug and Release builds, another useful tool we’d like to have is a way of initializing memory for the different scenarios we’ll come up with (memory we’ve just initialized, memory that was allocated in a chunk, memory that was returned to the manager, …).
Let’s make use of the knowledge we gained with the X-Macros post (because, why not?) and define it as:
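A sketch of that X-Macro (the entry names and trashing byte values here are assumptions; only the overall shape matters):

```cpp
// X-Macro listing the trashing configuration: flag name, bitmask bit, and the
// byte written to memory in that scenario.
#define TRASHING_VALUES                                                   \
    TRASHING_ENTRY(OFF,               0,                            0x00) \
    TRASHING_ENTRY(ON_INITIALIZATION, 1 << 0,                       0xCD) \
    TRASHING_ENTRY(ON_ALLOCATION,     1 << 1,                       0xAA) \
    TRASHING_ENTRY(ON_DEALLOCATION,   1 << 2,                       0xDD) \
    TRASHING_ENTRY(ON_ALL,            (1 << 0) | (1 << 1) | (1 << 2), 0x00)
```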
What’s this sorcery?, you may ask. Let’s go through it.
We’re defining five values to model a bitmask so we can store several values in only one variable: the one that defines when to apply memory trashing. The
ON_ALL value represents the combination of the three values it references. If we were to use a binary representation for all of the values we’d get:
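With hypothetical flag names, the bit patterns look like:

```
OFF               = 000
ON_INITIALIZATION = 001
ON_ALLOCATION     = 010
ON_DEALLOCATION   = 100
ON_ALL            = 111
```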
We’re then creating two enumerations from this data: one to store the bitmask bits and one to store the trashing values. Both enumerations come from the same data shown in the macro. Let’s define them as:
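A sketch of both enumerations (eTrashingValue is an assumed name; the X-Macro is repeated so the snippet stands alone):

```cpp
// The X-Macro data: flag name, bitmask bit, byte written to memory.
#define TRASHING_VALUES                                                   \
    TRASHING_ENTRY(OFF,               0,                            0x00) \
    TRASHING_ENTRY(ON_INITIALIZATION, 1 << 0,                       0xCD) \
    TRASHING_ENTRY(ON_ALLOCATION,     1 << 1,                       0xAA) \
    TRASHING_ENTRY(ON_DEALLOCATION,   1 << 2,                       0xDD) \
    TRASHING_ENTRY(ON_ALL,            (1 << 0) | (1 << 1) | (1 << 2), 0x00)

// One enumeration for the bitmask bits...
enum class eTrashing
{
#define TRASHING_ENTRY(name, bit, value) name = (bit),
    TRASHING_VALUES
#undef TRASHING_ENTRY
};

// ...and one for the byte values written to memory.
enum class eTrashingValue
{
#define TRASHING_ENTRY(name, bit, value) name = (value),
    TRASHING_VALUES
#undef TRASHING_ENTRY
};
```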
Now we can modify our constructor to receive the trashing mask, so we can have different configurations:
And finally, we can apply it! Let’s modify our constructor to check for this mask and apply the correct value:
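Putting both changes together, a condensed sketch (the enum values are hardcoded here instead of generated from the X-Macro, and byteAt is an accessor added just for the example):

```cpp
#include <cstddef>
#include <cstring>

// Hardcoded for brevity; the post generates these from the X-Macro.
enum class eTrashing { OFF = 0, ON_INITIALIZATION = 1, ON_ALLOCATION = 2,
                       ON_DEALLOCATION = 4, ON_ALL = 7 };
enum class eTrashingValue { ON_INITIALIZATION = 0xCD, ON_ALLOCATION = 0xAA,
                            ON_DEALLOCATION = 0xDD };

class MemoryManager
{
public:
    MemoryManager(std::size_t totalBytes, eTrashing trashing = eTrashing::OFF)
        : m_totalBytes(totalBytes)
        , m_memory(::new unsigned char[totalBytes])
        , m_trashing(trashing)
    {
        // enum class values don't convert implicitly to int, so we cast
        // in order to use operator& as a bitmask test.
        if (static_cast<int>(m_trashing) & static_cast<int>(eTrashing::ON_INITIALIZATION))
        {
            std::memset(m_memory, static_cast<int>(eTrashingValue::ON_INITIALIZATION),
                        m_totalBytes);
        }
    }
    ~MemoryManager() { ::delete[] m_memory; }

    // Accessor added just for this example.
    unsigned char byteAt(std::size_t index) const { return m_memory[index]; }

private:
    std::size_t m_totalBytes;
    unsigned char* m_memory;
    eTrashing m_trashing; // when to trash (bitmask)
};
```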
Yeah, I know what you’re thinking right now. What on Earth are all those
static_cast<int>s for? What ugly code!
Well, we’ve used
enum class to define the enumerations to have type safe values, but with it we’ve lost the implicit conversion to
int (and that’s a good thing!). However, since we’re working with these typed values as a bitmask, we need to be able to use the
& operator to check whether a flag is set, and that operator isn’t defined for scoped enumerations.
Note that we’ve stored the mask for all of the trashing we’re performing in
m_trashing. Whenever we want to check whether we have to apply trashing, we test it against the
eTrashing enumeration and then use the corresponding value from the values enumeration to fill the memory.
Okay! Let’s give it some runs!
It seems we’ve done it right! Congratulations :)
Remember we mentioned we were slicing our internal memory into small chunks of arbitrary size? We’re getting ready to do that now! Let me recall the definition of our
MemoryChunk:
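A sketch of the chunk (member names besides m_next are assumptions):

```cpp
// One node of the chunk list; it will live inside the managed memory itself.
class MemoryChunk
{
public:
    bool m_free = false;            // is this chunk available?
    unsigned int m_bytes = 0;       // bytes tracked by this chunk (name assumed)
    MemoryChunk* m_prev = nullptr;  // previous node in the list
    MemoryChunk* m_next = nullptr;  // next node in the list
};
```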
We’ve added two fields to the chunk: a flag to know whether this chunk is free and the number of bytes in use by it. There’s no member storing the actual memory for this chunk; we’ll see where that lives in the next post.
If we were to use the
MemoryManager as a traditional doubly linked list of
MemoryChunk nodes, we’d keep a pointer to the head and tail of the list. However, because we’re using the memory we’re managing to store the chunks themselves, we’ll store all nodes in the managed memory (and just use the head for now)! And how will we do it?
Introducing placement new
I guess that, by now, you’re somewhat used to the
new operator: you use it to request memory and construct an object in the requested memory. However, there’s another form of
new that doesn’t request memory but constructs the object in memory you already have. That is
placement new. And what is the syntax for it?
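The general shape, plus a tiny runnable example with a hypothetical Point struct:

```cpp
#include <new> // placement new lives here

// Generic shape:
//     Class* object = new (pointer) Class(args);

struct Point
{
    int x, y;
    Point(int xValue, int yValue) : x(xValue), y(yValue) {}
};

alignas(Point) unsigned char buffer[sizeof(Point)]; // writable memory we already own
Point* point = new (buffer) Point(1, 2);            // constructs in buffer, allocates nothing
// Note: objects built this way need an explicit point->~Point() later.
```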
Class is the class you want to build,
pointer is a pointer to some writable memory and
args is a collection of arguments to pass to the constructor.
With this in mind, we just need to build our first chunk in our memory! Let’s add it to our constructor so it looks like:
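A condensed sketch (m_head, m_freeBytes and the freeBytes accessor are assumed names; trashing is omitted here for brevity):

```cpp
#include <cstddef>
#include <new>

class MemoryChunk
{
public:
    bool m_free = false;
    unsigned int m_bytes = 0;
    MemoryChunk* m_prev = nullptr;
    MemoryChunk* m_next = nullptr;
};

class MemoryManager
{
public:
    explicit MemoryManager(std::size_t totalBytes)
        : m_totalBytes(totalBytes)
        , m_memory(::new unsigned char[totalBytes])
    {
        // Build the head chunk directly inside the managed memory: no extra
        // allocation, the node lives in the first sizeof(MemoryChunk) bytes.
        m_head = new (m_memory) MemoryChunk();
        m_freeBytes = m_totalBytes - sizeof(MemoryChunk);
        m_head->m_bytes = static_cast<unsigned int>(m_freeBytes);
    }
    ~MemoryManager() { ::delete[] m_memory; }

    std::size_t freeBytes() const { return m_freeBytes; } // accessor for the example

private:
    std::size_t m_totalBytes;
    unsigned char* m_memory;
    MemoryChunk* m_head = nullptr; // first node of the chunk list
    std::size_t m_freeBytes = 0;   // memory still available
};
```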
As you can see, we’ve added a new variable to our
MemoryManager that holds the amount of memory that’s still free to be used. We’ll update it when we add allocations and deallocations.
Okay, let’s check what happened to our memory now that we’ve added the chunk! Let’s dump the contents of an x86 build:
Now the first row of the dump has data! But which data? Let’s study it!
It seems like we’ve used 16 bytes. Do you know what
sizeof(MemoryChunk) yields in this build? Yeah, you guessed right: 16 bytes. Do you remember which members we defined in the chunk?
Woah, wait a second! If we sum up the individual sizes of the members we don’t get the total of 16 bytes! That’s because of memory padding.
We start our structure with a
bool, which takes 1 byte. Then we have an
unsigned int, which takes 4 bytes. The compiler adds padding (extra bytes) before this member because their alignment requirements differ and the
unsigned int one is larger than the
bool one. That’s why, in our case, we’re spending 4 bytes to store a
bool (and wasting 3 of them!).
What else can we learn from the first row of the dumped memory? Let’s have a look at it again, but this time with the sizes we’ve seen separated:
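With the values from this dump and 4-byte x86 pointers, the first row splits like this (address hypothetical):

```
0x000B6058   00:CD:CD:CD | 30:00:00:00 | 00:00:00:00 | 00:00:00:00
             bool + pad  | unsigned int| MemoryChunk*| MemoryChunk*
```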
As we can see, there’s some memory left unchanged with the
0xCD value: the bytes used to pad members.
0x00 == 0, which is our
false value for the bool member.
0x30 == 48, which is the correct number of free bytes we have in our memory (remember we have a total of 64 bytes and the chunk takes 16 of them). Both chunk pointers,
m_next included, point to
nullptr, which shows up as all zeroes.
Finally, there’s something else worth mentioning about the dump: the machine it ran on uses little-endian byte ordering. If we were to represent
48 in hexadecimal with 4 bytes we’d write
0x00000030. However, we can see it’s stored as
30:00:00:00! In little endian architectures, the least significant byte is written first and the most significant one is written last. That’s why we see it flipped in memory.
I guess that’s all for this entry in the series! I hope you’re enjoying it! In the next post we’ll start manipulating the chunks in our memory.
Thanks for reading!