native: documentation improvements (2eab0362) · Commits · Wenxi XU / zephyr

boards/posix/native_posix/doc/Port_vs_QEMU_vs.svg

0 → 100644

+0 −0

File added.

Preview suppressed by a .gitattributes entry or the file's encoding is unsupported.

boards/posix/native_posix/doc/board.rst

+271 −87

		@@ -4,6 +4,11 @@
		Native POSIX execution (native_posix)
		#######################################

		.. contents::
		:depth: 1
		:backlinks: entry
		:local:

		Overview
		********

		@@ -14,15 +19,6 @@ you use native host tools for compiling, debugging, and analyzing your
		Zephyr application, eliminating the need for architecture-specific
		target hardware in the early phases of development.

		.. figure:: native_layers.svg
		:align: center
		:alt: Zephyr layering in native build
		:figclass: align-center

		Zephyr layering when built against an embedded target (left), and
		targeting the native_posix board (right)


		Host system dependencies
		========================

		@@ -34,105 +30,101 @@ It has only been tested on Linux, but should also be compatible with macOS.
		You must have the 32-bit C library installed in your system
		(on Ubuntu 16.04, install the gcc-multilib package).

		.. note::

		Architecture
		************
		This port will not work in Windows Subsystem for Linux (WSL) because WSL
		does not support native 32-bit binaries.

		This board is based on the POSIX architecture port of Zephyr.
		In this architecture each Zephyr thread is mapped to one POSIX pthread,
		but only one of these pthreads executes at a time.
		This architecture provides the same interface to the Kernel as other
		architectures and is therefore transparent for the application.
		Important limitations
		*********************

		The underlying assumptions behind this port set some limitations on what
		can and cannot be done.
		These limitations are due to the code executing natively in
		the host CPU without any instrumentation or means to interrupt it unless the
		simulated CPU is sleeping.

		You can imagine the code executes in a simulated CPU
		which runs at an infinitely fast clock: No time passes while the CPU is
		running.
		Therefore interrupts, including timer interrupts, will not arrive
		while code executes, except immediately after the SW enables or unmasks
		them if they were pending.

		This behavior is intentional, as it provides a deterministic environment to
		develop and debug.
		For more information, please see the
		`Rationale for this port`_ and `Architecture`_ sections.

		Therefore these limitations apply:

		- There cannot be busy wait loops in the application code that wait for
		something to happen without letting the CPU sleep.
		If busy wait loops do exist, they will behave as infinite loops and
		will stall the execution. For example, the following busy wait loop code,
		which could be interrupted on actual hardware, will stall the execution of
		all threads, kernel, and HW models:

		This board does not try to emulate any particular embedded CPU or SOC.
		The code is compiled natively for the host x86 system, as a 32-bit
		binary assuming pointer and integer types are 32 bits wide.
		.. code-block:: c

		To ensure determinism when the Zephyr code is running,
		and to ease application debugging,
		the issue of code execution speed is ignored.
		The board uses a different time than real time: simulated time.
		This simulated time is, in principle, not linked to the host time.
		while (1){}

		The Zephyr application sees the code executing as if the CPU were running at
		an infinitely high clock, and fully decoupled from the underlying host CPU
		speed.
		No simulated time passes while the application or kernel code execute.
		Similarly, the following code, where we expect ``condition`` to be
		updated by an interrupt handler or another thread, will also stall
		the application when compiled for this port.

		The CPU boot is emulated by creating the Zephyr initialization thread and
		letting it run. This in turn may spawn more Zephyr threads.
		Eventually the SW will run to completion, that is, it will set the CPU
		back to sleep.
		.. code-block:: c

		At this point, control is transferred back to the HW models and the simulation
		time can be advanced.
		volatile bool condition = true;
		while (condition){}

		When the HW models raise an interrupt, the CPU wakes back up: the interrupt
		is handled, the SW runs until completion again, and control is
		transferred back to the HW models, all in zero simulated time.

		If the SW unmasks a pending interrupt while running, or triggers a SW
		interrupt, the interrupt controller may raise the interrupt immediately
		depending on interrupt priorities, masking, and locking state.
		- Code that depends on its own execution speed will normally not
		work as expected. For example, the following code assumes the peripheral
		finishes within a fixed number of CPU cycles:

		Normally the resulting executable runs fully decoupled from the real host time.
		That is, simulated time will advance as fast as it can. This is desirable when
		running in a debugger or testing in batch, but not if one wants to interact
		with external interfaces which are based on the real host time.
		.. code-block:: c

		Peripherals
		***********
		peripheral_x->run = true;

		The following peripherals are currently provided with this board:
		/* Wait for a number of CPU cycles */
		for (int i = 0; i < 100; i++) NOP;

		Console/printk driver:
		/* We expect the peripheral to be done and ready to do something else */

		A driver is provided that redirects any printk write to the native
		host application's stdout.

		Simple timer:

		A simple timer provides the kernel with a 10ms tick.
		This peripheral driver also provides the needed functionality for this
		architecture-specific k_busy_wait().
		- This port is not meant to, and could not possibly, help debug races between
		HW and SW, or similar timing-related issues.

		This timer may also be configured with NATIVE_POSIX_SLOWDOWN_TO_REAL_TIME
		to slow down the execution to real host time.
		This will provide the illusion that the simulated time is running at the same
		speed as the real host time.
		In reality, the timer will monitor how much real host time
		has actually passed since boot, and when needed, the timer will pause
		the execution before raising each timer interrupt.
		Normally the Zephyr application and HW models run in very little time
		on the host CPU, so this is a good enough approach.
		- You may not use hard-coded memory addresses because there is no I/O or
		MMU emulation.

		Working around these limitations
		==================================

		Interrupt controller
		If a busy wait loop exists, it will become evident as the application will be
		stalled in it. To find the loop, you can run the binary in a debugger and
		pause it after the execution is stuck; it will be paused in
		some part of that loop.

		A simple yet generic interrupt controller is provided. It can nest interrupts
		and provides interrupt priorities. Interrupts can be individually masked or
		unmasked. SW interrupts are also supported.
		The best solution is to remove that busy wait loop, and instead use
		an appropriate kernel primitive to synchronize your threads.
		Note that busy wait loops are in general a bad coding practice as they
		keep the CPU executing and consuming power.

		If removing the busy loop is really not an option, you may add a conditionally
		compiled call to :c:func:`k_cpu_idle` if you are waiting for an
		interrupt, or a call to :c:func:`k_busy_wait` with some small delay in
		microseconds.
		In the previous example, modifying the code as follows would work:

		Important limitations
		=====================
		.. code-block:: c

		The assumption that simulated time can only pass while the CPU is sleeping
		means that there cannot be busy wait loops in the application code that
		wait for something to happen without letting the CPU sleep.
		If busy wait loops do exist, they will behave as infinite loops and
		will stall the execution.

		As simulated time does not pass while the CPU is running, it also means no HW
		interrupts will interrupt the threads' execution unless the SW enables or
		unmasks them.
		This is intentional, as it provides a deterministic environment to develop and
		debug.
		But note that this may hide issues in the SW that may only be triggered in the
		real platform.

		This native port of Zephyr is not meant to, and could not possibly,
		help debug races between HW and SW, or similar timing-related issues.
		volatile bool condition = true;
		while (condition) {
		#if defined(CONFIG_ARCH_POSIX)
		k_cpu_idle();
		#endif
		}


		How to use it
		@@ -145,11 +137,11 @@ Specify the native_posix board target to build a native POSIX application:

		.. zephyr-app-commands::
		:zephyr-app: samples/hello_world
		:host-os: unix
		:board: native_posix
		:goals: build
		:compact:


		Running
		=======

		@@ -173,7 +165,6 @@ If you want your application to gracefully finish when it reaches some point,
		you may add a conditionally compiled (CONFIG_BOARD_NATIVE_POSIX) call to
		``main_clean_up(exit_code)`` at that point.


		Debugging
		=========

		@@ -186,3 +177,196 @@ Because the execution of your Zephyr application is fully deterministic
		(there are no asynchronous or random components), you can execute the
		code multiple times and get the exact same result. Instrumenting the
		code does not affect its execution.

		To ease debugging you may want to compile your code without optimizations
		(e.g., -O0).

		Rationale for this port
		***********************

		The main intents of this port are:

		- Allow functional debugging, instrumentation and analysis of the code with
		native tooling.
		- Allow functional regression testing, and simulations in which we have the
		full functionality of the code.
		- Run tests fast: several minutes of simulated time per second of wall-clock time.
		- Possibility to connect to external tools which may be able to run much
		faster or much slower than real time.
		- Fully deterministic, repeatable runs:
		There must not be any randomness or indeterminism.
		The result must not be affected by:

		- Debugging or instrumenting the code.
		- Pausing in a breakpoint and continuing later.
		- The host computer performance or its load.

		The aim of this port is not to debug HW/SW races, missed HW programming
		deadlines, or issues in which an interrupt comes when it was not expected.
		Normally those would be debugged with a cycle accurate Instruction Set Simulator
		(ISS) or with a development board.

		Comparison with other options
		*****************************

		This port does not try to replace cycle accurate instruction set simulators
		(ISS), development boards, or QEMU, but to complement them. This port's main aim
		is to meet the targets described in the previous `Rationale for this port`_
		section.

		.. figure:: Port_vs_QEMU_vs.svg
		:align: center
		:alt: Comparison of different debugging targets
		:figclass: align-center

		Comparison of different debugging options. Note that realism has many
		dimensions: having the real memory map or emulating the exact time an
		instruction takes to execute is only part of it; emulating peripherals
		accurately is another.

		This native port compiles your code directly to x86, with no instrumentation or
		monitoring code. Your code executes directly in the host CPU. That is, your code
		executes just as fast as it possibly can.

		Simulated time is decoupled from real host time.
		The problem of how to emulate the instruction execution speed is solved
		by assuming that code executes in zero simulated time.

		There is no I/O or MMU emulation. If you try to access memory through hardcoded
		addresses your binary will simply segfault.
		The drivers and HW models for this architecture will hide this from the
		application developers when it relates to those peripherals.
		In general, this port is not meant to help develop low-level drivers for
		target HW, but to help develop application code.

		Your code can be debugged, instrumented, or analyzed with all normal native
		development tools just like any other Linux application.

		Execution is fully reproducible, and you can pause it without side effects.

		How does this port compare to QEMU:
		===================================

		With QEMU you compile your image targeting the board which is closest to
		your desired board, for example an ARM-based one. QEMU emulates the real memory
		layout of the board, loads the compiled binary, and through instruction
		translation executes that ARM-targeted binary on the host CPU.
		Depending on configuration, QEMU also provides models of some peripherals
		and, in some cases, can expose host HW as emulated target peripherals.

		QEMU cannot provide any emulation of execution speed. It simply
		executes code as fast as it can, and lets the host CPU speed determine the
		emulated CPU speed. This produces highly indeterministic behavior,
		as the execution speed depends on the host system performance and its load.

		As instructions are translated to the host architecture, and the target CPU and
		MMU are emulated, there is a performance penalty.

		You can connect gdb to QEMU, but you have few other instrumentation options.

		Execution is not reproducible. Some bugs may be triggered only in some runs
		depending on the computer and its load.

		How does this port compare to an ISS:
		======================================

		With a cycle accurate instruction set simulator you compile targeting either
		your real CPU/platform or a close enough relative. The memory layout is modeled
		and some or all peripherals too.

		The simulator loads your binary, slowly interprets each instruction, and
		accounts for the time each instruction takes.
		Time is simulated and is fully decoupled from real time.
		Simulations are on the order of 10 to 100 times slower than real time.

		Some instruction set simulators work with gdb, and may
		provide some extra tools for analyzing your code.

		Execution is fully reproducible. You can normally pause your execution without
		side-effects.


		Architecture
		************

		.. figure:: native_layers.svg
		:align: center
		:alt: Zephyr layering in native build
		:figclass: align-center

		Zephyr layering when built against an embedded target (left), and
		targeting the native_posix board (right)

		This board is based on the POSIX architecture port of Zephyr.
		In this architecture each Zephyr thread is mapped to one POSIX pthread,
		but only one of these pthreads executes at a time.
		This architecture provides the same interface to the Kernel as other
		architectures and is therefore transparent for the application.
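
As a rough illustration of the thread mapping described above, the following standalone sketch starts several pthreads but lets only one of them execute at a time, by handing over a turn under a mutex and a condition variable. It is not the Zephyr POSIX architecture code: ``thread_entry``, ``run_threads``, ``current`` and ``trace`` are hypothetical names, and the fixed round-robin hand-over is an assumption made only to keep the example deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define N_THREADS 3
#define ROUNDS    2

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t turn_changed = PTHREAD_COND_INITIALIZER;
static int current;                    /* index of the only thread allowed to run */
static int trace[N_THREADS * ROUNDS];  /* order in which the threads actually ran */
static int trace_len;

/* Each pthread stands in for one "Zephyr thread" (hypothetical model). */
static void *thread_entry(void *arg)
{
    int me = (int)(intptr_t)arg;

    for (int r = 0; r < ROUNDS; r++) {
        pthread_mutex_lock(&lock);
        /* Block until this thread is the one selected to run. */
        while (current != me) {
            pthread_cond_wait(&turn_changed, &lock);
        }
        trace[trace_len++] = me;           /* the thread's "work" */
        current = (me + 1) % N_THREADS;    /* hand the CPU to the next thread */
        pthread_cond_broadcast(&turn_changed);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Create all threads, then wait for them to finish. */
static void run_threads(void)
{
    pthread_t threads[N_THREADS];

    for (int i = 0; i < N_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, thread_entry,
                                (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < N_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
}
```

Although the host is free to schedule the three pthreads in parallel, the recorded order is always 0, 1, 2, 0, 1, 2, because every thread except the selected one is blocked on the condition variable.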

		This board does not try to emulate any particular embedded CPU or SOC.
		The code is compiled natively for the host x86 system, as a 32-bit
		binary assuming pointer and integer types are 32 bits wide.

		To ensure determinism when the Zephyr code is running,
		and to ease application debugging,
		the board uses a different time than real time: simulated time.
		This simulated time is, in principle, not linked to the host time.

		The Zephyr application sees the code executing as if the CPU were running at
		an infinitely fast clock, and fully decoupled from the underlying host CPU
		speed.
		No simulated time passes while the application or kernel code execute.

		The CPU boot is emulated by creating the Zephyr initialization thread and
		letting it run. This in turn may spawn more Zephyr threads.
		Eventually the SW will run to completion, that is, it will set the CPU
		back to sleep.

		At this point, control is transferred back to the HW models and the simulation
		time can be advanced.

		When the HW models raise an interrupt, the CPU wakes back up, the interrupt
		is handled, the SW runs until completion again, and control is
		transferred back to the HW models, all in zero simulated time.
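
The hand-over between SW and HW models described above can be sketched as a small discrete-event loop. This is a simplified model under stated assumptions, not the board's implementation: ``hw_timer_model_t``, ``sim_t``, ``sw_run_until_idle`` and ``sim_run`` are hypothetical names, and a single periodic timer stands in for all HW models.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical HW model: a timer raising an interrupt every period_us. */
typedef struct {
    uint64_t next_event_us;   /* simulated time of the next interrupt */
    uint64_t period_us;
} hw_timer_model_t;

typedef struct {
    uint64_t now_us;          /* simulated time; only moves while the CPU sleeps */
    unsigned irq_count;       /* number of interrupts handled so far */
} sim_t;

/*
 * Stands in for the interrupt handler plus any threads it wakes.
 * It runs until the CPU goes back to sleep and takes zero simulated time,
 * so it never touches sim->now_us.
 */
static void sw_run_until_idle(sim_t *sim)
{
    sim->irq_count++;
}

/* Run the simulation until end_us of simulated time. */
static void sim_run(sim_t *sim, hw_timer_model_t *timer, uint64_t end_us)
{
    assert(timer->period_us > 0);   /* otherwise the loop would never end */

    while (timer->next_event_us <= end_us) {
        /* CPU is asleep: the HW models jump straight to the next event. */
        sim->now_us = timer->next_event_us;
        timer->next_event_us += timer->period_us;

        /* The interrupt wakes the CPU; SW runs to completion. */
        sw_run_until_idle(sim);
    }
    sim->now_us = end_us;
}
```

With a 10 ms period and a 50 ms end time, this loop handles five interrupts and finishes at 50000 µs of simulated time, however long the host takes to execute it.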

		If the SW unmasks a pending interrupt while running, or triggers a SW
		interrupt, the interrupt controller may raise the interrupt immediately
		depending on interrupt priorities, masking, and locking state.
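
The decision of whether a pending interrupt is taken immediately can be sketched as follows. This is a minimal model, not the board's interrupt controller: ``irq_ctrl_t``, ``irq_ctrl_next`` and ``irq_ctrl_unmask`` are hypothetical names, and the convention that a lower number means a higher priority is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_IRQS            32
#define THREAD_LEVEL_PRIO 256  /* "priority" of plain thread code (assumption) */

typedef struct {
    uint32_t pending;       /* bit n set: IRQ n was raised and not yet handled */
    uint32_t masked;        /* bit n set: IRQ n is masked */
    bool     locked;        /* global interrupt lock */
    int      running_prio;  /* priority of the code currently executing */
    uint8_t  prio[N_IRQS];  /* lower value = higher priority (assumption) */
} irq_ctrl_t;

/* Return the IRQ that should be taken right now, or -1 if none. */
static int irq_ctrl_next(const irq_ctrl_t *c)
{
    if (c->locked) {
        return -1;
    }

    uint32_t candidates = c->pending & ~c->masked;
    int best = -1;
    int best_prio = c->running_prio;

    for (int n = 0; n < N_IRQS; n++) {
        /* Only an interrupt of strictly higher priority may preempt (nest). */
        if ((candidates & (UINT32_C(1) << n)) && c->prio[n] < best_prio) {
            best = n;
            best_prio = c->prio[n];
        }
    }
    return best;
}

/* Unmask IRQ n and report which IRQ, if any, must now be handled. */
static int irq_ctrl_unmask(irq_ctrl_t *c, int n)
{
    assert(n >= 0 && n < N_IRQS);
    c->masked &= ~(UINT32_C(1) << n);
    return irq_ctrl_next(c);
}
```

In this model, unmasking an interrupt that is already pending makes it eligible at once, which matches the statement that interrupts arrive immediately after the SW enables or unmasks them.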

		Normally the resulting executable runs fully decoupled from the real host time.
		That is, simulated time will advance as fast as it can. This is desirable when
		running in a debugger or testing in batch, but not if one wants to interact
		with external interfaces which are based on the real host time.

		Peripherals
		***********

		The following peripherals are currently provided with this board:

		Console/printk driver:
		A driver is provided that redirects any printk write to the native
		host application's stdout.

		Simple timer:
		A simple timer provides the kernel with a 10ms tick.
		This peripheral driver also provides the needed functionality for this
		architecture-specific :c:func:`k_busy_wait`.

		This timer is configured by default with NATIVE_POSIX_SLOWDOWN_TO_REAL_TIME_
		to slow down the execution to real host time.
		This will provide the illusion that the simulated time is running at the same
		speed as the real host time.
		In reality, the timer will monitor how much real host time
		has actually passed since boot, and when needed, it will pause
		the execution before raising its next timer interrupt.
		Normally the Zephyr application and HW models run in very little time
		on the host CPU, so this is a good enough approach.

		.. _NATIVE_POSIX_SLOWDOWN_TO_REAL_TIME:
		../../../../reference/kconfig/CONFIG_NATIVE_POSIX_SLOWDOWN_TO_REAL_TIME.html

		Interrupt controller:
		A simple yet generic interrupt controller is provided. It can nest interrupts
		and provides interrupt priorities. Interrupts can be individually masked or
		unmasked. SW interrupts are also supported.
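
The slowdown behavior of the simple timer described above can be sketched with standard POSIX calls. This is only an illustration of the idea, not the driver's code: ``host_now_us``, ``slowdown_delay_us`` and ``wait_for_real_time`` are hypothetical names, and using ``CLOCK_MONOTONIC`` as the host time source is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current host time in microseconds, from a monotonic clock. */
static uint64_t host_now_us(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/*
 * How long the host must wait so that real time catches up with simulated
 * time. Returns 0 when the host is already behind the simulation.
 */
static uint64_t slowdown_delay_us(uint64_t sim_time_us, uint64_t real_elapsed_us)
{
    return (sim_time_us > real_elapsed_us) ? sim_time_us - real_elapsed_us : 0;
}

/*
 * Called just before raising the timer interrupt due at next_irq_sim_us:
 * pause the process until that much real time has passed since boot.
 */
static void wait_for_real_time(uint64_t boot_host_us, uint64_t next_irq_sim_us)
{
    uint64_t real_elapsed_us = host_now_us() - boot_host_us;
    uint64_t delay_us = slowdown_delay_us(next_irq_sim_us, real_elapsed_us);

    if (delay_us > 0) {
        struct timespec req = {
            .tv_sec  = (time_t)(delay_us / 1000000u),
            .tv_nsec = (long)(delay_us % 1000000u) * 1000L,
        };
        nanosleep(&req, NULL);
    }
}
```

Because the delay is always computed from the total time since boot rather than from the previous tick, the host time spent executing the application between ticks does not accumulate as drift in this model.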
