######################################################
Code sharing between independently linked XIP binaries
######################################################

:Authors: Tamas Ban
:Organization: Arm Limited
:Contact: tamas.ban@arm.com
:Status: Draft

**********
Motivation
**********
Cortex-M devices are usually constrained in terms of flash and RAM. Therefore,
it is often challenging to fit bigger projects in the available memory. The PSA
specifications require a device to both have a secure boot process in place at
device boot-up time, and to have a partition in the SPE which provides
cryptographic services at runtime. These two entities have some overlapping
functionality. Some cryptographic primitives (e.g. hash calculation and digital
signature verification) are required both in the bootloader and the runtime
environment. In the current TF-M code base, both firmware components use the
mbed-crypto library to implement these requirements. During the build process,
the mbed-crypto library is built twice, with different configurations (the
bootloader requires less functionality) and then linked to the corresponding
firmware component. As a result of this workflow, the same code is placed in the
flash twice. For example, the code for the SHA-256 algorithm is included in
MCUboot, but the exact same code is duplicated in the SPE cryptography
partition. In most cases, there is no memory isolation between the bootloader
and the SPE, because both are part of the PRoT code and run in the secure
domain. So, in theory, the code of the common cryptographic algorithms could be
reused among these firmware components. This could result in a big reduction in
code footprint, because the cryptographic algorithms are usually flash hungry.
Code size reduction can be a good opportunity for very constrained devices,
which might need to use TF-M Profile Small anyway.

*******************
Technical challenge
*******************
Code sharing in a regular OS environment is easily achievable with dynamically
linked libraries. However, this is not the case in Cortex-M systems where
applications might run bare-metal, or on top of an RTOS, which usually lacks
dynamic loading functionality. One major challenge to be solved in the Cortex-M
space is how to share code between independently linked XIP applications that
are tied to a certain memory address range to be executable and have absolute
function and global data memory addresses. In this case, the code is not
relocatable, and in most cases, there is no loader functionality in the system
that can perform code relocation. Also, the lack of an MMU makes the address
space flat, constant and not reconfigurable at runtime by privileged code.

One other difficulty is that the bootloader and the runtime use the same RAM
area during execution. The runtime firmware is executed strictly after the
bootloader, so normally, it can reuse the whole secure RAM area, as it would be
the exclusive user. No attention needs to be paid as to where global data is
placed by the linker. The bootloader does not need to retain its state. The low
level startup of the runtime firmware can freely overwrite the RAM with its data
without corrupting bootloader functionality. However, with code sharing between
bootloader and runtime firmware, these statements are no longer true. Global
variables used by the shared code must either retain their value or must be
reinitialised during low level startup of the runtime firmware. The startup code
is not allowed to overwrite the shared global variables with arbitrary data. The
following design proposal provides a solution to these challenges.

**************
Design concept
**************
The bootloader is sometimes implemented as ROM code (BL1) or stored in a region
of the flash which is lockable, to prevent tampering. In a secure system, the
bootloader is immutable code and thus implements a part of the Root of Trust
anchor in the device, which is trusted implicitly. The shared code is primarily
part of the bootloader, and is reused by the runtime SPE firmware at a later
stage. Not all of the bootloader code is reused by the runtime SPE, only some
cryptographic functions.

Simplified steps of building with code sharing enabled:

- Complete the bootloader build process to have a final image that contains
  the absolute addresses of the shared functions, and the global variables
  used by these functions.
- Extract the addresses of the functions and related global variables that are
  intended to be shared from the bootloader executable.
- When building runtime firmware, provide the absolute addresses of the shared
  symbols to the linker, so that it can pick them up, instead of instantiating
  them again.

The execution flow looks like this:

.. code-block:: bash

    SPE      MCUboot func1()    MCUboot func2()      MCUboot func3()
     |
     |   Hash()
     |------------->|
                    |----------------->|
                                       |
                               Return  |
            Return  |<-----------------|
     |<-------------|
     |
     |
     |----------------------------------------------------->|
                                                            |
            Function pointer in shared global data()        |
     |<-----------------------------------------------------|
     |
     |   Return
     |----------------------------------------------------->|
                                                            |
                                                    Return  |
     |<-----------------------------------------------------|
     |
     |

The execution flow usually returns from a shared function back to the SPE with
an ordinary function return. So usually, once a shared function is called in the
call path, all further functions in the call chain will be shared as well.
However, this is not always the case, as it is possible for a shared function to
call a non-shared function in SPE code through a global function pointer.

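The call back into SPE code can be illustrated with a minimal sketch. All names
below (`shared_hook`, `shared_fill()`, `spe_fill_pattern()`,
`spe_register_hook()`) are hypothetical and do not come from TF-M or
mbed-crypto. The sketch only shows the mechanism: the shared function knows the
address of a pointer held in shared global data, and whichever image is running
decides where that pointer leads.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical hook stored in shared global data. The shared code only
     * knows the address of this pointer, not the function it points to. */
    typedef int (*fill_hook_t)(unsigned char *buf, size_t len);
    fill_hook_t shared_hook;

    /* Stand-in for a shared function that lives in the bootloader image. */
    int shared_fill(unsigned char *buf, size_t len)
    {
        if (shared_hook == NULL) {
            return -1;                  /* no callback registered */
        }
        return shared_hook(buf, len);   /* control leaves the shared code */
    }

    /* Stand-in for a non-shared function that lives in the SPE image. */
    int spe_fill_pattern(unsigned char *buf, size_t len)
    {
        for (size_t i = 0; i < len; i++) {
            buf[i] = 0xA5;
        }
        return 0;
    }

    /* The SPE would register its own implementation before using shared code. */
    void spe_register_hook(void)
    {
        shared_hook = spe_fill_pattern;
    }

In this sketch the pointer must be set by the SPE before the first call,
otherwise it would still hold whatever the bootloader left behind.
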
For shared global variables, a dedicated data section must be allocated in the
linker configuration file. This area must have the same memory address in both
MCUboot's and the SPE's linker files, to ensure the integrity of the variables.
For simplicity's sake, this section is placed at the very beginning of the RAM
area. Also, the RAM wiping functionality at the end of the secure boot flow
(that is intended to remove any possible secrets from the RAM) must not clear
this area. Furthermore, it must be ensured that the linker places shared globals
into this data section. There are two ways to achieve this:

- Put a filter pattern in the section body that matches the shared global
  variables.
- Mark the global variables in the source code with a special attribute:
  `__attribute__((section(<NAME_OF_SHARED_SYMBOL_SECTION>)))`

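The second option can be sketched as follows. The section name
`.tfm_shared_symbols_example` and the variable and function names are
assumptions made for illustration only; the real name must be the one used for
the shared symbol section in both linker files.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical section name. It must match the output section that both
     * linker files place at the same RAM address. */
    #define SHARED_SYMBOL_SECTION ".tfm_shared_symbols_example"

    /* Placed in the dedicated section instead of the default .data/.bss. */
    __attribute__((section(SHARED_SYMBOL_SECTION)))
    uint32_t shared_call_counter = 0;

    /* Stand-in for a shared function whose state lives in the shared area. */
    uint32_t shared_count_call(void)
    {
        return ++shared_call_counter;
    }

Marking variables in the source is explicit but touches library code, whereas a
filter pattern in the linker file leaves the sources unmodified.
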
RAM memory layout in MCUboot with code sharing enabled:

.. code-block:: bash

    +------------------+
    |  Shared symbols  |
    +------------------+
    | Shared boot data |
    +------------------+
    |      Data        |
    +------------------+
    |   Stack (MSP)    |
    +------------------+
    |      Heap        |
    +------------------+

RAM memory layout in SPE with code sharing enabled:

.. code-block:: bash

    +-------------------+
    |  Shared symbols   |
    +-------------------+
    | Shared boot data  |
    +-------------------+
    |    Stack (MSP)    |
    +-------------------+
    |    Stack (PSP)    |
    +-------------------+
    | Partition X Data  |
    +-------------------+
    | Partition X Stack |
    +-------------------+
              .
              .
              .
    +-------------------+
    | Partition Z Data  |
    +-------------------+
    | Partition Z Stack |
    +-------------------+
    |     PRoT Data     |
    +-------------------+
    |       Heap        |
    +-------------------+

Patching mbedTLS
================
In order to share some global function pointers from mbed-crypto that are
related to dynamic memory allocation, their scope must be extended from private
to global. This is needed because some compiler toolchains only extract the
addresses of public functions and global variables, and extraction of addresses
is a requirement to share them among binaries. Therefore, a short patch was
created for the mbed-crypto library, which "globalises" these function pointers:

`lib/ext/mbedcrypto/0005-Enable-crypto-code-sharing-between-independent-binar.patch`

The patch needs to be applied manually to the mbedtls repo if code sharing is
enabled. The patch has no effect on the functional behaviour of the
cryptographic library; it only extends the scope of some variables.

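The kind of change involved can be sketched as below. This is not the content
of the patch; the identifiers `example_calloc_func` and `example_calloc()` are
hypothetical and are used only to show what extending the scope of a function
pointer means in C.

.. code-block:: c

    #include <stddef.h>
    #include <stdlib.h>

    /* Before: internal linkage. The pointer is private to its source file, so
     * some toolchains do not report its address during symbol extraction:
     *
     *     static void *(*example_calloc_func)(size_t, size_t) = calloc;
     */

    /* After: external linkage. The behaviour is unchanged, but the symbol is
     * now visible to symbol extraction. (Hypothetical name.) */
    void *(*example_calloc_func)(size_t, size_t) = calloc;

    /* Callers keep going through the pointer exactly as before. */
    void *example_calloc(size_t nmemb, size_t size)
    {
        return example_calloc_func(nmemb, size);
    }
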
*************
Tools support
*************
All the currently supported compilers provide a way to achieve the above
objectives. However, there is no standard way, which means that the code sharing
functionality must be implemented on a per compiler basis. The following steps
are needed:

- Extraction of the addresses of all global symbols.
- The filtering out of the addresses of symbols that aren't shared. The goal is
  to not need to list all the shared symbols by name. Only a simple pattern
  has to be provided, which matches the beginning of the symbol's name.
  Matching symbols will be shared. Examples are in:
  `bl2/src/shared_symbol_template.txt`
- Provision of the addresses of shared symbols to the linker during the SPE
  build process.
- The resolution of symbol collisions during SPE linking. Because mbed-crypto
  is linked to both firmware components as a static library, the external
  shared symbols will conflict with the same symbols found within it. In order
  to prioritize the external symbol, the symbol with the same name in
  mbed-crypto must be marked as weak in the symbol table.

The above functionalities are implemented in the toolchain specific CMake files:

- `toolchain_ARMCLANG.cmake`
- `toolchain_GNUARM.cmake`

They are implemented by the following two functions:

- `compiler_create_shared_code()`: Extract and filter shared symbol addresses
  from MCUboot.
- `compiler_link_shared_code()`: Link shared code to the SPE and resolve symbol
  conflict issues.

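The matching rule used for filtering (a symbol is shared if its name begins
with one of the given patterns) can be sketched in C as below. In the build
itself the filtering is done by a CMake script, so this is only an illustration
of the rule. The two patterns and the function name `is_shared_symbol()` are
assumptions made for this example; the real patterns are the ones listed in
`bl2/src/shared_symbol_template.txt`.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical patterns, for illustration only. */
    static const char *const example_patterns[] = {
        "mbedtls_sha256",
        "mbedtls_mpi",
    };

    /* Returns true when the symbol name starts with one of the patterns. */
    bool is_shared_symbol(const char *name)
    {
        size_t count = sizeof(example_patterns) / sizeof(example_patterns[0]);

        for (size_t i = 0; i < count; i++) {
            size_t len = strlen(example_patterns[i]);

            if (strncmp(name, example_patterns[i], len) == 0) {
                return true;
            }
        }
        return false;
    }

Because only the beginning of the name is compared, one pattern covers a whole
family of functions, and a name that merely contains the pattern further along
is not matched.
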
ARMCLANG
========
The toolchain specific steps are:

- Extract all symbols from MCUboot: add `-symdefs` to the compiler command line
- Filter shared symbols: call CMake script `FilterSharedSymbols.cmake`
- Weaken duplicated (shared) symbols in the mbed-crypto static library that are
  linked to the SPE: `arm-none-eabi-objcopy`
- Link shared code to SPE: Add the filtered output of `-symdefs` to the SPE
  source file list.

GNUARM
======
The toolchain specific steps are:

- Extract all symbols from MCUboot: `arm-none-eabi-nm`
- Filter shared symbols: call CMake script: `FilterSharedSymbols.cmake`
- Strip unshared code from MCUboot: `arm-none-eabi-strip`
- Weaken duplicated (shared) symbols in the mbed-crypto static library that are
  linked to the SPE: `arm-none-eabi-objcopy`
- Link shared code to SPE: Add `-Wl -R <SHARED_STRIPPED_CODE.axf>` to the
  compiler command line

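Both toolchain flows above weaken the duplicated symbols in the static library
after it has been compiled. The effect of a weak symbol can be sketched at
source level with a compiler attribute, which is a different route to the same
linker behaviour. The function names below are hypothetical.

.. code-block:: c

    /* Stand-in for a library function whose symbol has been weakened. A weak
     * definition is used only when no strong definition of the same symbol is
     * available at link time. */
    __attribute__((weak)) int example_hash_block_size(void)
    {
        return 64;  /* library copy: used here because nothing overrides it */
    }

    int example_caller(void)
    {
        /* Resolved by the linker to a strong symbol if one exists, otherwise
         * to the weak definition above. */
        return example_hash_block_size();
    }

When the SPE is linked, the shared symbol provided from the bootloader plays the
role of the strong definition, so the copy in the static library is not used.
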
IAR
===
Code sharing is not currently implemented for IAR, although the toolchain is
capable of supporting it.

**************************
Memory footprint reduction
**************************
- Build type: MinSizeRel
- Platform: mps2/an521
- Version: TF-Mv1.2.0 + code sharing patches
- MCUboot image encryption support is disabled.

+------------------+-------------------+-------------------+-------------------+
|                  |   ConfigDefault   |  ConfigProfile-M  |  ConfigProfile-S  |
+------------------+----------+--------+----------+--------+----------+--------+
|                  | ARMCLANG | GNUARM | ARMCLANG | GNUARM | ARMCLANG | GNUARM |
+------------------+----------+--------+----------+--------+----------+--------+
| CODE_SHARING=OFF |  122268  | 124572 |  75936   | 75996  |  50336   | 50224  |
+------------------+----------+--------+----------+--------+----------+--------+
| CODE_SHARING=ON  |  113264  | 115500 |  70400   | 70336  |  48840   | 48988  |
+------------------+----------+--------+----------+--------+----------+--------+
| Difference       |   9004   |  9072  |   5536   |  5660  |   1496   |  1236  |
+------------------+----------+--------+----------+--------+----------+--------+

If MCUboot image encryption support is enabled, then the saving could be up to
~13-15KB.

.. Note::

   Code sharing on Musca-B1 was tested only with software-only crypto, so crypto
   hardware acceleration must be turned off: `-DCRYPTO_HW_ACCELERATOR=OFF`


************************
Usability considerations
************************
Functions that only use local variables can be shared easily. However, functions
that rely on global variables are a bit tricky. They can still be shared, but
all global variables must be placed in the shared symbol section, to prevent
overwriting and to enable the retention of their values.

Some global variables might need to be reinitialised to their original values by
runtime firmware, if they have been used by the bootloader, but need to have
their original value when runtime firmware starts to use them. If so, the
reinitialising functionality must be implemented explicitly, because the low
level startup code in the SPE does not initialise the shared variables, which
means they retain their value after MCUboot stops running.

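A minimal sketch of such explicit reinitialisation is shown below. The variable
`shared_op_count` and the functions `shared_do_operation()` and
`spe_reinit_shared_globals()` are hypothetical; in a real build the variable
would be placed in the shared symbol section.

.. code-block:: c

    #include <stdint.h>

    /* Value the shared code expects when a firmware component starts using it. */
    #define SHARED_OP_COUNT_INIT 0u

    /* Hypothetical shared global, left untouched by the SPE startup code. */
    uint32_t shared_op_count = SHARED_OP_COUNT_INIT;

    /* Stand-in for shared code that modifies the variable. */
    void shared_do_operation(void)
    {
        shared_op_count++;
    }

    /* Called explicitly by the runtime firmware before its first use of the
     * shared code, to discard the state left behind by the bootloader. */
    void spe_reinit_shared_globals(void)
    {
        shared_op_count = SHARED_OP_COUNT_INIT;
    }
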
If a bug is discovered in the shared code, it cannot be fixed with a firmware
upgrade, if the bootloader code is immutable. If this is the case, disabling
code sharing might be a solution, as the new runtime firmware could contain the
fixed code instead of relying on the unfixed shared code. However, this would
increase code footprint.

API backward compatibility can also be an issue. If the API has changed in a
newer version of the shared code, then new code cannot rely on the shared
version. The changed code, and all the other shared code that references it,
must be ignored, and the updated version of the functions must be compiled into
the SPE binary. The current version of the mbedTLS library (``v2.24.0``) has
been API compatible since the ``mbedtls-2.7.0`` release (2018-02-03).

To minimise the risk of incompatibility, use the same compiler flags to build
both firmware components.

The artifacts of the shared code extraction steps must be preserved so as to
remain available if new SPE firmware (that relies on shared code) is built and
released. Those files are necessary to know the address of shared symbols when
linking the SPE.

************************
How to use code sharing?
************************
Considering the above, code sharing is an optional feature, which is disabled
by default. It can be enabled from the command line with a compile time switch:

- `TFM_CODE_SHARING`: Set to `ON` to enable code sharing.

With the default settings, only the common part of the mbed-crypto library is
shared between MCUboot and the SPE. However, there might be other device
specific code (e.g. device drivers) that could be shared. The shared
cryptography code consists mainly of the SHA-256 algorithm, the `bignum` library
and some RSA related functions. If image encryption support is enabled in
MCUboot, then AES algorithms can be shared as well.

Sharing code between the SPE and an external project is possible, even if
MCUboot isn't used as the bootloader. For example, a custom bootloader can also
be built in such a way as to create the necessary artifacts to share some of its
code with the SPE. The same artifacts must be created as in the case of MCUboot:

- `shared_symbols_name.txt`: Contains the name of the shared symbols. Used by
  the script that prevents symbol collision.
- `shared_symbols_address.txt`: Contains the type, address and name of shared
  symbols. Used by the linker when linking runtime SPE.
- `shared_code.axf`: GNUARM specific. The stripped version of the firmware
  component, which only contains the shared code. It is used by the linker when
  linking the SPE.

.. Note::

   The artifacts of the shared code extraction steps must be preserved to be
   able to link them to any future SPE version.

When an external project is sharing code with the SPE, the `SHARED_CODE_PATH`
compile time switch must be set to the path of the artifacts mentioned above.

********************
Further improvements
********************
This design focuses only on sharing the cryptography code. However, other code
could be shared as well. Some possibilities:

- Flash driver
- Serial driver
- Image metadata parsing code
- etc.

--------------

Copyright (c) 2020, Arm Limited. All rights reserved.