Booting A Cortex M with NO IDE : like a caveman : Part 2 Linker Script
In this post we will make our basic directory structure and write the linker script.
Directory structure
- Make a directory for the project (I've called mine G0_BOOT) and open VSCode in that folder
- Make two files
- myLinkerScript.ld
- myStartUp.c
- Make three directories inside G0_BOOT :
- src
- inc
- outputs
Go ahead and make an empty main.c file inside the src folder for now. Next, grab all the header files that we downloaded from ST's repo in the previous post and save them into the inc folder.
Your directory structure should look like this:
The Linker Script: Laying the Foundation
Before we dive into writing code, we need to understand how our application is mapped in the microcontroller’s memory. This is where the linker script comes into play. The linker script is a crucial piece of the puzzle, dictating where different sections of your program—like code, data, and stack—are placed in the MCU's memory. It essentially acts as the blueprint for your application's memory layout, ensuring everything is in its right place when the microcontroller boots up. Without a properly configured linker script, your code wouldn't know where to go, and nothing would work as intended.
The memory layout can be found in your MCU's reference manual. For the STM32G071 that I am using, the memory layout looks like this:
Everything in a microcontroller is memory. If you set a pin high or low, that just means a bit or two in the peripheral memory section was set or cleared.
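To make the "everything is memory" idea concrete, here is a minimal sketch of setting and clearing one bit through a pointer. The function names are my own, not from any ST or CMSIS header, and the code is written so it also runs on a desktop, where an ordinary variable stands in for a peripheral register.

```c
#include <stdint.h>

/* Set one bit in a 32-bit register. On real hardware `reg` would be the
   address of a peripheral register taken from the memory map. */
void reg_set_bit(volatile uint32_t *reg, unsigned bit)
{
    *reg |= (UINT32_C(1) << bit);
}

/* Clear one bit in a 32-bit register. */
void reg_clear_bit(volatile uint32_t *reg, unsigned bit)
{
    *reg &= ~(UINT32_C(1) << bit);
}
```

On the target you would pass the address of a real register (for example a GPIO output register defined in the device headers) instead of a plain variable. The volatile qualifier tells the compiler that every access must really happen.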
Per the diagram above these are the sections:
ARM Cortex M0+ Core Block (0xE0000000)
These are the core peripherals that are defined in the CMSIS headers we included.
- ARM Cortex-M0+ internal peripherals: Contains system control and core-related registers like NVIC, SysTick, etc.
- IOPORT: Memory-mapped I/O registers for the GPIO ports.
- AHB: Advanced High-performance Bus, used for high-speed peripherals.
- APB: Advanced Peripheral Bus, used for lower-speed peripherals.
- Option bytes: Configuration settings like write protection and readout protection.
- Engineering bytes: Specific bytes used for debugging and testing by engineers.
- OTP: One-Time Programmable memory, used for storing permanent data like calibration.
- System memory: Contains the ROM bootloader for system startup.
- Main Flash memory (0x0800 0000 - 0x0807 FFFF): Where our application code is stored.
Writing the Linker Script
Entry Point
We've already created the myLinkerScript.ld file in VSCode, so let's start by informing the linker about the entry point of our application. When the MCU is powered on or reset, the Program Counter (PC) needs to point to the address of the first instruction to execute. The ENTRY directive in the linker script names that entry point, which is typically the Reset_Handler. On a Cortex-M the hardware itself loads the PC from the reset vector in the vector table, so ENTRY mainly records the entry point in the output file for tools such as the debugger and marks the handler as in use when the linker garbage-collects unused sections. In simpler terms, on power-up the MCU will jump straight to the Reset_Handler. The Reset_Handler is simply a function, often written in assembly, though it can also be written in C. You can name it anything you want; it does not need to be called Reset_Handler.
Remember this top-secret image from the first post?
If you recall from the introduction post, at power-up, the processor loads the Stack Pointer (SP), typically the Main Stack Pointer (MSP), with the value stored at address 0x00000000. This value usually represents the top of your RAM region.
This means the MCU goes to address 0x00000000, and this is where everything starts. This is also where our vector table will live. However, if you look again at the memory map, our user flash space (Main Flash memory) does not start until 0x08000000.
The magic here is that the MCU aliases (remaps) the memory at 0x08000000 so that it also appears at 0x00000000. That way our MCU does not end up in the weeds. This remapping is pretty handy: it is how you are able to boot from Main Flash memory, System memory, or even RAM.
OK, so we know the MCU will end up at the right place. When it goes to address 0x00000000, it finds our vector table.
The vector table will be nothing more than a C array, and it will be located at address 0x00000000. All of the elements of that vector table are basically system exception and interrupt handlers, but the first two elements are special: the first holds the initial stack pointer value and the second holds the address of the Reset_Handler.
Remember that at address 0x00000000 the MCU wants to find the value that it will load into the stack pointer. From the diagram above we see that the Main Stack Pointer (MSP) is located at the top of RAM, in other words at the end of RAM. In code that will be RAM start + RAM length. So we need to put that value as the first element in our vector table located at 0x00000000.
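As a rough sketch of the arithmetic and of the first two vector table entries, the snippet below assumes a RAM origin of 0x20000000 and 36 KB of SRAM, which is what I expect for an STM32G071; check your own datasheet. The name vector_table_sketch and the empty Reset_Handler are placeholders for illustration only. The section attribute that pins the array into flash is left out so the code also compiles on a desktop.

```c
#include <stdint.h>

/* Assumed values for an STM32G071; verify against your datasheet. */
#define RAM_START   UINT32_C(0x20000000)
#define RAM_LENGTH  (UINT32_C(36) * 1024)       /* 36 KB of SRAM */
#define TOP_OF_RAM  (RAM_START + RAM_LENGTH)    /* RAM start + RAM length */

/* Placeholder reset handler; the real one belongs in the startup file. */
void Reset_Handler(void)
{
}

/* First two vector table entries: the initial stack pointer value,
   then the address of the reset handler. */
const uintptr_t vector_table_sketch[2] = {
    (uintptr_t)TOP_OF_RAM,
    (uintptr_t)&Reset_Handler,
};
```

With these assumed numbers the top of RAM works out to 0x20009000, which is the value the core would load into the MSP at reset.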
A few Symbols
Before we go further into the vector table and startup code, let's create a symbol in our linker script for the top-of-RAM location, which is also our initial stack pointer. I called the symbol _top_of_ram_stack_start; more commonly it is called _estack. In fact, anything in the linker script can be called whatever you want (RAM could be called CupCake instead). However, the embedded community generally sticks to common naming conventions for readability and portability.
After that, let's add some symbols to tell the linker the size of our heap and stack areas.
Memory layout
Next, we need to define the memory layout of our MCU by specifying the start and length of our RAM and FLASH spaces in the MEMORY section of the linker script. This allows the linker to know exactly where to place the code, data, and other sections in memory. The values are taken directly from the memory map diagram above. You should know the size of your RAM and Flash from your MCU specifications in the datasheet.
The RAM has been given the xrw attributes, which mean execute, read, and write, because we can do all of that with our RAM.
FLASH is given (rx) read and execute permissions because we don't want anything overwriting our code. While it is possible to write to FLASH, if you need to do so, you would typically designate a specific section of FLASH with read, write, and execute (xrw) attributes for that purpose.
ORIGIN is the start address of that memory region, and LENGTH is its size.
Sections
The next block of code is the SECTIONS area. This area is used to define how and where different parts of your program—like code, data, and variables—are placed in memory.
The commonly found sections are as follows:
- .isr_vector: This section holds the Interrupt Service Routine (ISR) vector table. The vector table contains the addresses of all exception and interrupt handlers, as well as the initial stack pointer value. This section is usually located at the beginning of the Flash memory.
- .text: The .text section contains the program code. All of your executable code (your functions) is stored here. This section is read-only and resides in Flash memory.
- .rodata: The .rodata section is for read-only data. This section stores constants, string literals, and other data that should not change during program execution. Like .text, it is placed in Flash memory.
- .data: The .data section is used for initialized global and static variables. Variables that have an initial value defined in your code are stored here. During startup, these values are copied from Flash to RAM so they can be modified during execution.
- .bss: The .bss section holds uninitialized global and static variables. This section is used for variables that are declared without an initial value. The startup code will zero-initialize this section before main() is called. It resides in RAM.
- .heap: The .heap section is reserved for dynamic memory allocation. Memory requested at runtime using functions like malloc() comes from the heap. The size of this section is defined in the linker script, and it resides in RAM.
- .stack: The .stack section is used for the program's stack. The stack is where local variables and function call information (like return addresses) are stored. The stack grows downward in memory and resides in RAM.
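Here's a little insight into linker scripts. When you see the .text syntax below, the dot (.) operator acts like a pointer, holding the current memory address. At the start of SECTIONS it is set to 0x00000000, and as you define sections like .text, the dot operator increments by the size of the data or code placed in that section. Note that once an output section is assigned to a memory region (for example with > FLASH), the linker places it at the next free address in that region, so the first section lands at the FLASH origin, which the MCU aliases to 0x00000000.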
When you see the syntax . = ALIGN(4), it ensures that the current address (held by the dot operator) is aligned to a 4-byte boundary. This alignment is important because many processors, including ARM Cortex cores, require code and data to be aligned to specific boundaries for efficient access and execution. By aligning to 4 bytes, we ensure that instructions and data are placed in memory in a way that the processor can handle optimally.
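The rounding that the alignment performs can be sketched in a few lines of C. This is only an illustration of the arithmetic, not how the linker is implemented, and align_up is a name I made up for it.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align.
   align must be a power of two (4, 8, 16, ...). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + (align - 1u)) & ~(align - 1u);
}
```

An address that is already aligned is returned unchanged, which is why sprinkling alignments through a linker script costs nothing when the location counter is already on a boundary.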
As the linker processes each section, it updates the dot operator to reflect the current position in memory, making sure that all subsequent sections are correctly placed and aligned according to your specifications.
Now that we understand the dot (.) operator's current location at the start of the SECTIONS is 0x00000000, and knowing that this is where the MCU will look for the vector table, it only makes sense to place our vector table in that section.
Vector Table
Below, we open the SECTIONS block and define a section called .isr_vector. Inside this block, we start by aligning the memory address to 4 bytes, ensuring proper alignment for the vector table. We then use the KEEP directive to ensure that all content placed in the .isr_vector section (such as interrupt vectors and the reset handler) is retained by the linker, even if it isn't explicitly referenced elsewhere in the code. Later, in our startup C file, we will make the vector table array and use an attribute to place it in this .isr_vector section. Then we make sure the end is also aligned to 4 bytes. It's a good habit to start and end your sections with alignments (we will ignore the alignments from now on so I don't have to keep explaining them). Finally, we tell the linker that this .isr_vector section needs to be placed in FLASH.
Next we will add our program code, which goes in the .text section. To be clear, the .isr_vector contents can also go inside the .text section as long as they come first, but I chose to give them their own section instead.
.text section
While still inside the SECTIONS curly brackets, we will place the .text section under the .isr_vector section like so:
As you can see, inside the .text section I have commented out the .isr_vector line. I just wanted to show you that you could also place the vector table there instead of giving it its own section.
- *(.text) and *(.text*): These lines instruct the linker to place all the sections in the object files named .text or matching .text* into this section. The * wildcard ensures that all related sections (like .text.foo) are included. The same applies to .rodata.
- *(.glue_7) and *(.glue_7t): These sections contain special glue code used to transition between the ARM and Thumb instruction sets. Thumb instructions are a more compact, 16-bit encoding of the ARM instructions, and sometimes code needs to switch between these modes. The .glue_7 and .glue_7t sections contain the necessary instructions to handle this transition. Cortex-M cores only execute Thumb code, so these sections are normally empty here, but they are conventionally kept.
.rodata section
.data section
- _data = .; Here we are saving the current position of the dot operator into a symbol called _data. This will be our destination address in RAM.
- *(.data) *(.data*): This is just like the .text section, where we include all sections named .data and the wildcard .data*.
- _edata = .; Here we are storing the end address of the .data section in RAM. The length is simply _edata minus _data.
- _sidata = LOADADDR(.data); This gets the address in FLASH where the initial values of .data are stored and saves it to _sidata. This is our source address in FLASH.
Now we have a RAM start and end (destination and length) and we have a FLASH start address (source). We have everything the C code would need to copy data from Flash to RAM. But let's finish our linker script first.
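To show how those three addresses are enough, here is a minimal sketch of the copy loop. The function name copy_data is my own, and it takes plain pointers so that it can be tried on a desktop with ordinary arrays; it is not the startup code from this series.

```c
#include <stdint.h>

/* Copy 32-bit words from src (the load address in FLASH) into the
   range [dst, dst_end) in RAM. */
void copy_data(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}
```

On the target, the three arguments would be the addresses of the linker symbols, for example copy_data(&_sidata, &_data, &_edata) after declaring each symbol as extern uint32_t. Copying word by word relies on the section start and end being 4-byte aligned.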
.bss section
Up next is the .bss section, which holds uninitialized variables. In our startup file we will need to zero out all this data, so just like the .data section we will need to make symbols to hold the start and end addresses. This section, however, does not need to be copied anywhere, just zeroed out.
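The zeroing step can be sketched the same way as a copy loop, only with a constant source. The function name zero_bss is hypothetical, and the start and end pointers stand in for whatever symbols you define around .bss in the linker script (names such as _sbss and _ebss are a common convention, but that is an assumption here).

```c
#include <stdint.h>

/* Write zero to every 32-bit word in the range [start, end). */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}
```

Because nothing is read from FLASH, only the two RAM addresses are needed.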
Heap and Stack section
Finally, we need to add space for our heap and stack using the symbols we created at the very top. You will notice that these are aligned to 8 bytes instead of 4. The short reason is that the ARM ABI requires the stack pointer to be aligned to an 8-byte boundary when calling functions. Keeping the heap and stack 8-byte aligned also ensures proper alignment for 8-byte data types and helps prevent issues related to misaligned memory access. Basically, we HAVE to.
- . = ALIGN(8); The dot (.) operator in a linker script represents the current memory location or address. This ensures that the current memory address is aligned to an 8-byte boundary. This is important for performance and compatibility reasons, as discussed previously.
- PROVIDE ( end = . ); and PROVIDE ( _end = . ); The PROVIDE keyword in a linker script creates a symbol only if it is not already defined. This allows you to define symbols like end and _end that point to the current memory address (.). These symbols are used in the C code to get the end of the data and to determine where the heap starts.
- end and _end: These symbols mark the end of the RAM that the linker has allocated for your program's data sections (.data and .bss). They are used as the starting point for the heap.
- end: Typically used as a reference for the start of the heap.
- _end: Another common name for the same reference point. Having both ensures compatibility with different libraries or codebases that may expect one or the other.
- . = . + _Min_Heap_Size; and . = . + _Min_Stack_Size; These lines increment the dot (.) operator by the sizes specified for the heap (_Min_Heap_Size) and stack (_Min_Stack_Size). This effectively reserves space in RAM for the heap and stack.
- . = . + _Min_Heap_Size;, reserves memory for the heap by incrementing the dot pointer.
- . = . + _Min_Stack_Size;, reserves memory for the stack. After this increment, the dot operator (.) will point to the end of the reserved stack area.
- . = ALIGN(8); This final alignment ensures that the memory address after reserving the heap and stack is aligned to an 8-byte boundary, maintaining proper alignment for any subsequent memory sections.
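The steps in the list above amount to simple address arithmetic, sketched below. The RAM origin and size are assumed values for an STM32G071, the function names are made up for illustration, and the heap and stack sizes passed in are whatever you chose for _Min_Heap_Size and _Min_Stack_Size.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed values for an STM32G071; verify against your datasheet. */
#define RAM_START   UINT32_C(0x20000000)
#define RAM_LENGTH  (UINT32_C(36) * 1024)

/* Round an address up to the next 8-byte boundary. */
uint32_t align_up8(uint32_t addr)
{
    return (addr + 7u) & ~UINT32_C(7);
}

/* Mimic the location counter: start after .bss, reserve heap and stack,
   and return the address just past the reserved area. */
uint32_t reserve_heap_and_stack(uint32_t bss_end,
                                uint32_t heap_size,
                                uint32_t stack_size)
{
    uint32_t dot = align_up8(bss_end);  /* . = ALIGN(8); end and _end are here */
    dot += heap_size;                   /* . = . + _Min_Heap_Size;  */
    dot += stack_size;                  /* . = . + _Min_Stack_Size; */
    return align_up8(dot);              /* . = ALIGN(8);            */
}

/* The reservation is only valid if it still fits inside RAM. */
bool fits_in_ram(uint32_t dot)
{
    return dot <= RAM_START + RAM_LENGTH;
}
```

The point of reserving the space in the linker script is exactly this last check: if the data sections plus the minimum heap and stack no longer fit in the RAM region, the linker reports an error at build time instead of the program failing mysteriously at run time.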
Behold the linker script
At this point, our linker script is complete. One important note: a linker symbol has an address but no storage of its own, and that address is its value. In our C code we therefore declare each symbol as an extern variable and take its address (for example &_edata) rather than reading its contents.
Below is the linker script in its entirety.
In the next post we will write our startup file and make use of the symbols from the linker script to zero out our .bss section as well as copy our .data section to RAM. And do not worry, we will use the ARM GNU tools to check that our linker script works.