######################################################
Code sharing between independently linked XIP binaries
######################################################

:Authors: Tamas Ban
:Organization: Arm Limited
:Contact: tamas.ban@arm.com
:Status: Draft

**********
Motivation
**********
Cortex-M devices are usually constrained in terms of flash and RAM. Therefore,
it is often challenging to fit bigger projects in the available memory. The PSA
specifications require a device to both have a secure boot process in place at
device boot-up time, and to have a partition in the SPE which provides
cryptographic services at runtime. These two entities have some overlapping
functionality. Some cryptographic primitives (e.g. hash calculation and digital
signature verification) are required both in the bootloader and the runtime
environment. In the current TF-M code base, both firmware components use the
mbed-crypto library to implement these requirements. During the build process,
the mbed-crypto library is built twice, with different configurations (the
bootloader requires less functionality) and then linked to the corresponding
firmware component. As a result of this workflow, the same code is placed in the
flash twice. For example, the code for the SHA-256 algorithm is included in
MCUboot, but the exact same code is duplicated in the SPE cryptography
partition. In most cases, there is no memory isolation between the bootloader
and the SPE, because both are part of the PRoT code and run in the secure
domain. So, in theory, the code of the common cryptographic algorithms could be
reused among these firmware components. This could result in a big reduction in
code footprint, because the cryptographic algorithms are usually flash hungry.
Code size reduction can be a good opportunity for very constrained devices,
which might need to use TF-M Profile Small anyway.

*******************
Technical challenge
*******************
Code sharing in a regular OS environment is easily achievable with dynamically
linked libraries. However, this is not the case in Cortex-M systems where
applications might run bare-metal, or on top of an RTOS, which usually lacks
dynamic loading functionality. One major challenge to be solved in the Cortex-M
space is how to share code between independently linked XIP applications that
are tied to a certain memory address range to be executable and have absolute
function and global data memory addresses. In this case, the code is not
relocatable, and in most cases, there is no loader functionality in the system
that can perform code relocation. Also, the lack of an MMU makes the address
space flat, constant and not reconfigurable at runtime by privileged code.

One other difficulty is that the bootloader and the runtime use the same RAM
area during execution. The runtime firmware is executed strictly after the
bootloader, so normally, it can reuse the whole secure RAM area, as it would be
the exclusive user. No attention needs to be paid as to where global data is
placed by the linker. The bootloader does not need to retain its state. The low
level startup of the runtime firmware can freely overwrite the RAM with its data
without corrupting bootloader functionality. However, with code sharing between
bootloader and runtime firmware, these statements are no longer true. Global
variables used by the shared code must either retain their value or must be
reinitialised during low level startup of the runtime firmware. The startup code
is not allowed to overwrite the shared global variables with arbitrary data. The
following design proposal provides a solution to these challenges.

**************
Design concept
**************
The bootloader is sometimes implemented as ROM code (BL1) or stored in a region
of the flash which is lockable, to prevent tampering. In a secure system, the
bootloader is immutable code and thus implements a part of the Root of Trust
anchor in the device, which is trusted implicitly. The shared code is primarily
part of the bootloader, and is reused by the runtime SPE firmware at a later
stage. Not all of the bootloader code is reused by the runtime SPE, only some
cryptographic functions.

Simplified steps of building with code sharing enabled:

    - Complete the bootloader build process to have a final image that contains
      the absolute addresses of the shared functions, and the global variables
      used by these functions.
    - Extract the addresses of the functions and related global variables that
      are intended to be shared from the bootloader executable.
    - When building runtime firmware, provide the absolute addresses of the
      shared symbols to the linker, so that it can pick them up, instead of
      instantiating them again.

The execution flow looks like this:

.. code-block:: bash

    SPE     MCUboot func1()    MCUboot func2()      MCUboot func3()
    |
    |   Hash()
    |------------->|
                   |----------------->|
                                      |
                            Return    |
          Return   |<-----------------|
    |<-------------|
    |
    |
    |----------------------------------------------------->|
                                                           |
            Function pointer in shared global data()       |
    |<-----------------------------------------------------|
    |
    |   Return
    |----------------------------------------------------->|
                                                           |
            Return                                         |
    |<-----------------------------------------------------|
    |
    |

The execution flow usually returns from a shared function back to the SPE with
an ordinary function return. So usually, once a shared function is called in the
call path, all further functions in the call chain will be shared as well.
However, this is not always the case, as it is possible for a shared function to
call a non-shared function in SPE code through a global function pointer.
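
The callback path described above can be illustrated with a small host-side
sketch. All names here (`shared_alloc_hook`, `spe_alloc`, `shared_checksum`)
are hypothetical and are not taken from TF-M or mbed-crypto. On a real device
the hook would be placed in the shared symbol section and the shared function
would be located in the bootloader image.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocation hook kept in shared global data. Shared code calls
 * through it, so the running image decides which allocator is used.
 * A matching free hook is omitted for brevity. */
void *(*shared_alloc_hook)(size_t size) = malloc;

/* Counts calls into the SPE allocator so the callback can be observed. */
static unsigned int spe_alloc_calls;

/* Non-shared function: in a real build it exists only in the SPE image. */
void *spe_alloc(size_t size)
{
    spe_alloc_calls++;
    return malloc(size);
}

unsigned int spe_alloc_call_count(void)
{
    return spe_alloc_calls;
}

/* Stand-in for a shared function that lives in the bootloader image.
 * It needs scratch memory, so it calls back through the hook.
 * The byte sum is a toy computation, not a real hash. */
int shared_checksum(const uint8_t *data, size_t len, uint32_t *result)
{
    uint8_t *scratch = shared_alloc_hook(len > 0 ? len : 1);
    uint32_t sum = 0;

    if (scratch == NULL) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        scratch[i] = data[i];
        sum += scratch[i];
    }
    free(scratch);
    *result = sum;
    return 0;
}
```

In this sketch the SPE would assign `shared_alloc_hook = spe_alloc;` before its
first call into shared code. From then on `shared_checksum()` reaches
`spe_alloc()`, even though the two functions belong to different images.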

For shared global variables, a dedicated data section must be allocated in the
linker configuration file. This area must have the same memory address in both
MCUboot's and the SPE's linker files, to ensure the integrity of the variables.
For simplicity's sake, this section is placed at the very beginning of the RAM
area. Also, the RAM wiping functionality at the end of the secure boot flow
(that is intended to remove any possible secrets from the RAM) must not clear
this area. Furthermore, it must be ensured that the linker places shared globals
into this data section. There are two ways to achieve this:

    - Put a filter pattern in the section body that matches the shared global
      variables.
    - Mark the global variables in the source code with the special attribute
      `__attribute__((section(<NAME_OF_SHARED_SYMBOL_SECTION>)))`
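
A minimal sketch of the second option is shown below, assuming a GCC or Clang
compatible compiler and an ELF target. The section name `tfm_shared_symbols`
and the macro and variable names are hypothetical. The real name must match
the output section that both linker files place at the same RAM address.

```c
#include <stdint.h>

/* Hypothetical section name, for illustration only. */
#define SHARED_SYMBOL_SECTION "tfm_shared_symbols"

/* Convenience macro wrapping the section attribute. */
#define SHARED_GLOBAL __attribute__((section(SHARED_SYMBOL_SECTION)))

/* Globals used by shared code. The attribute asks the compiler to emit them
 * into the named section instead of the default .data or .bss. */
SHARED_GLOBAL uint32_t shared_call_counter = 0;
SHARED_GLOBAL uint8_t shared_scratch[16] = {0};

/* Example of shared code that depends on a shared global. */
uint32_t shared_count_call(void)
{
    shared_scratch[shared_call_counter % sizeof(shared_scratch)] = 1;
    return ++shared_call_counter;
}
```

The attribute only selects the section. Placing that section at a fixed address
is still the job of the linker configuration files.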

RAM memory layout in MCUboot with code sharing enabled:

.. code-block:: bash

    +------------------+
    |  Shared symbols  |
    +------------------+
    | Shared boot data |
    +------------------+
    |       Data       |
    +------------------+
    |   Stack (MSP)    |
    +------------------+
    |       Heap       |
    +------------------+

RAM memory layout in SPE with code sharing enabled:

.. code-block:: bash

    +-------------------+
    |  Shared symbols   |
    +-------------------+
    | Shared boot data  |
    +-------------------+
    |    Stack (MSP)    |
    +-------------------+
    |    Stack (PSP)    |
    +-------------------+
    | Partition X Data  |
    +-------------------+
    | Partition X Stack |
    +-------------------+
              .
              .
              .
    +-------------------+
    | Partition Z Data  |
    +-------------------+
    | Partition Z Stack |
    +-------------------+
    |     PRoT Data     |
    +-------------------+
    |       Heap        |
    +-------------------+

Patching mbedTLS
================
In order to share some global function pointers from mbed-crypto that are
related to dynamic memory allocation, their scope must be extended from private
to global. This is needed because some compiler toolchains only extract the
addresses of public functions and global variables, and extraction of addresses
is a requirement to share them among binaries. Therefore, a short patch was
created for the mbed-crypto library, which "globalises" these function pointers:

`lib/ext/mbedcrypto/0005-Enable-crypto-code-sharing-between-independent-binar.patch`

The patch needs to be applied manually in the mbedtls repo if code sharing is
enabled. The patch has no effect on the functional behaviour of the
cryptographic library; it only extends the scope of some variables.

*************
Tools support
*************
All the currently supported compilers provide a way to achieve the above
objectives. However, there is no standard way, which means that the code sharing
functionality must be implemented on a per compiler basis. The following steps
are needed:

    - Extraction of the addresses of all global symbols.
    - The filtering out of the addresses of symbols that aren't shared. The goal
      is to not need to list all the shared symbols by name. Only a simple
      pattern has to be provided, which matches the beginning of the symbol's
      name. Matching symbols will be shared. Examples are in:
      `bl2/src/shared_symbol_template.txt`
    - Provision of the addresses of shared symbols to the linker during the SPE
      build process.
    - The resolution of symbol collisions during SPE linking. Because
      mbed-crypto is linked to both firmware components as a static library,
      the external shared symbols will conflict with the same symbols found
      within it. In order to prioritize the external symbol, the symbol with
      the same name in mbed-crypto must be marked as weak in the symbol table.
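
The prefix-based filtering step can be sketched in C as follows. This is only
an illustration of the matching rule. In TF-M the filtering is done by a CMake
script, and the function names below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when `symbol` starts with any of the `count` prefixes.
 * Empty prefixes are ignored so that they cannot match every symbol. */
bool symbol_is_shared(const char *symbol,
                      const char *const *prefixes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(prefixes[i]);
        if (len > 0 && strncmp(symbol, prefixes[i], len) == 0) {
            return true;
        }
    }
    return false;
}

/* Copies the pointers of all matching symbols from `all` into `out`, keeping
 * their order, and returns how many matched.
 * `out` must have room for `n_all` entries. */
size_t filter_shared_symbols(const char *const *all, size_t n_all,
                             const char *const *prefixes, size_t n_prefixes,
                             const char **out)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n_all; i++) {
        if (symbol_is_shared(all[i], prefixes, n_prefixes)) {
            out[n_out++] = all[i];
        }
    }
    return n_out;
}
```

With the illustrative prefixes `mbedtls_sha256` and `mbedtls_mpi`, a symbol
such as `mbedtls_sha256_init` would be kept, while `main` would be dropped.
These prefixes are examples only and are not the contents of the template file.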

The above functionality is implemented in the toolchain specific CMake files:

    - `toolchain_ARMCLANG.cmake`
    - `toolchain_GNUARM.cmake`

It is provided by the following two functions:

    - `compiler_create_shared_code()`: Extract and filter shared symbol
      addresses from MCUboot.
    - `compiler_link_shared_code()`: Link shared code to the SPE and resolve
      symbol conflict issues.

ARMCLANG
========
The toolchain specific steps are:

    - Extract all symbols from MCUboot: add `--symdefs` to the linker command
      line
    - Filter shared symbols: call CMake script `FilterSharedSymbols.cmake`
    - Weaken duplicated (shared) symbols in the mbed-crypto static library that
      is linked to the SPE: `arm-none-eabi-objcopy`
    - Link shared code to SPE: Add the filtered output of `--symdefs` to the SPE
      source file list.

GNUARM
======
The toolchain specific steps are:

    - Extract all symbols from MCUboot: `arm-none-eabi-nm`
    - Filter shared symbols: call CMake script `FilterSharedSymbols.cmake`
    - Strip unshared code from MCUboot: `arm-none-eabi-strip`
    - Weaken duplicated (shared) symbols in the mbed-crypto static library that
      is linked to the SPE: `arm-none-eabi-objcopy`
    - Link shared code to SPE: Add `-Wl,-R,<SHARED_STRIPPED_CODE.axf>` to the
      compiler command line
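
By default, `nm` prints one symbol per line as a hexadecimal value, a one-letter
type and the symbol name. The sketch below parses a line in that layout. The
structure and function names are hypothetical, and the exact layout of the
files generated by the TF-M build is not assumed here.

```c
#include <stdio.h>

/* One parsed entry: address, one-letter symbol type and name. */
struct symbol_entry {
    unsigned long address;
    char type;
    char name[64];
};

/* Parses a line in the layout "<hex address> <type> <name>".
 * Returns 0 on success and -1 if the line does not contain all three fields,
 * which is also the case for undefined symbols that have no address. */
int parse_symbol_line(const char *line, struct symbol_entry *entry)
{
    int fields = sscanf(line, "%lx %c %63s",
                        &entry->address, &entry->type, entry->name);

    return (fields == 3) ? 0 : -1;
}
```

The type letter makes it possible to tell functions in the text section (`T`)
from initialised (`D`) and zero-initialised (`B`) global data when deciding
what to hand over to the SPE linker.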

IAR
===
This functionality is currently not implemented for IAR, although the toolchain
is capable of supporting it.

**************************
Memory footprint reduction
**************************
The measurements below were taken with the following configuration:

    - Build type: MinSizeRel
    - Platform: mps2/an521
    - Version: TF-Mv1.2.0 + code sharing patches
    - MCUboot image encryption support is disabled.

+------------------+-------------------+-------------------+-------------------+
|                  |   ConfigDefault   |  ConfigProfile-M  |  ConfigProfile-S  |
+------------------+----------+--------+----------+--------+----------+--------+
|                  | ARMCLANG | GNUARM | ARMCLANG | GNUARM | ARMCLANG | GNUARM |
+------------------+----------+--------+----------+--------+----------+--------+
| CODE_SHARING=OFF |  122268  | 124572 |   75936  |  75996 |   50336  |  50224 |
+------------------+----------+--------+----------+--------+----------+--------+
| CODE_SHARING=ON  |  113264  | 115500 |   70400  |  70336 |   48840  |  48988 |
+------------------+----------+--------+----------+--------+----------+--------+
| Difference       |   9004   |  9072  |   5536   |  5660  |   1496   |  1236  |
+------------------+----------+--------+----------+--------+----------+--------+

If MCUboot image encryption support is enabled, the saving could be up to
~13-15KB.

.. Note::

    Code sharing on Musca-B1 was tested only with software-only crypto, so
    crypto hardware acceleration must be turned off:
    `-DCRYPTO_HW_ACCELERATOR=OFF`


************************
Usability considerations
************************
Functions that only use local variables can be shared easily. However, functions
that rely on global variables are a bit tricky. They can still be shared, but
all global variables they use must be placed in the shared symbol section, to
prevent overwriting and to enable the retention of their values.

Some global variables might need to be reinitialised to their original values by
runtime firmware, if they have been used by the bootloader, but need to have
their original value when runtime firmware starts to use them. If so, the
reinitialising functionality must be implemented explicitly, because the low
level startup code in the SPE does not initialise the shared variables, which
means they retain their value after MCUboot stops running.
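
An explicit reinitialisation step could look like the sketch below. The
structure, variable and function names are hypothetical. The sketch assumes
that the original value of the shared state is all zeroes and that the runtime
firmware calls the reinitialisation function before its first use of shared
code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical state used by shared code. On a device it would be placed in
 * the shared symbol section, which the SPE startup code leaves untouched. */
struct shared_state {
    uint32_t call_count;
    uint8_t scratch[16];
};

struct shared_state g_shared_state;

/* Shared function that modifies the state, standing in for bootloader use. */
void shared_do_work(uint8_t value)
{
    size_t slot = g_shared_state.call_count % sizeof(g_shared_state.scratch);

    g_shared_state.scratch[slot] = value;
    g_shared_state.call_count++;
}

/* Restores the original (all zero) values. The startup code does not do this
 * for the shared section, so the runtime firmware calls it explicitly. */
void spe_reinit_shared_state(void)
{
    memset(&g_shared_state, 0, sizeof(g_shared_state));
}
```

Variables that must keep the value written by the bootloader are simply left
out of such a function.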

If a bug is discovered in the shared code and the bootloader code is immutable,
it cannot be fixed with a firmware upgrade. In this case, disabling code
sharing might be a solution, as the new runtime firmware could contain the
fixed code instead of relying on the unfixed shared code. However, this would
increase code footprint.

API backward compatibility can also be an issue. If the API has changed in a
newer version of the shared code, then new code cannot rely on the shared
version. The changed code and all the other shared code where it is referenced
from must be ignored and the updated version of the functions must be compiled
in the SPE binary. The current version of the mbedTLS library (``v2.24.0``) has
been API compatible since the ``mbedtls-2.7.0`` release (2018-02-03).

To minimise the risk of incompatibility, use the same compiler flags to build
both firmware components.

The artifacts of the shared code extraction steps must be preserved so as to
remain available if new SPE firmware (that relies on shared code) is built and
released. Those files are necessary to know the address of shared symbols when
linking the SPE.

************************
How to use code sharing?
************************
Considering the above, code sharing is an optional feature, which is disabled
by default. It can be enabled from the command line with a compile time switch:

    - `TFM_CODE_SHARING`: Set to `ON` to enable code sharing.

With the default settings, only the common part of the mbed-crypto library is
shared between MCUboot and the SPE. However, there might be other device
specific code (e.g. device drivers) that could be shared. The shared
cryptography code consists mainly of the SHA-256 algorithm, the `bignum` library
and some RSA related functions. If image encryption support is enabled in
MCUboot, then AES algorithms can be shared as well.

Sharing code between the SPE and an external project is possible, even if
MCUboot isn't used as the bootloader. For example, a custom bootloader can also
be built in such a way as to create the necessary artifacts to share some of its
code with the SPE. The same artifacts must be created as in the case of MCUboot:

    - `shared_symbols_name.txt`: Contains the name of the shared symbols. Used
      by the script that prevents symbol collision.
    - `shared_symbols_address.txt`: Contains the type, address and name of
      shared symbols. Used by the linker when linking runtime SPE.
    - `shared_code.axf`: GNUARM specific. The stripped version of the firmware
      component, which only contains the shared code. It is used by the linker
      when linking the SPE.

.. Note::

    The artifacts of the shared code extraction steps must be preserved to be
    able to link them to any future SPE version.

When an external project is sharing code with the SPE, the `SHARED_CODE_PATH`
compile time switch must be set to the path of the artifacts mentioned above.

********************
Further improvements
********************
This design focuses only on sharing the cryptography code. However, other code
could be shared as well. Some possibilities:

- Flash driver
- Serial driver
- Image metadata parsing code
- etc.

--------------

*Copyright (c) 2020, Arm Limited. All rights reserved.*