Evolution of Automotive Networks
So why is there such a strong movement toward using Ethernet for in-vehicle communication?
Today’s automotive networks already demand high bandwidth for transferring data between sensors and electronic control units (ECUs). For instance, camera data, LiDAR data, and audio/video streams for infotainment all require high throughput. Future cars will carry even more sensors and cameras. In addition, vehicle design is shifting toward a centralized, zone-based architecture, which reduces the number of physical ECUs and simplifies wiring, a great benefit especially where high-speed data transfer is involved. Ethernet is already everywhere. We have been using it for many years in home and office environments, and it is proven by over 40 years of practical use. Ethernet has a well-established ecosystem, and most operating systems offer network stacks that support it. Ethernet offers speed grades from 10 Mbit/s to 400 Gbit/s and is independent of the physical medium. This means that different cabling options are available for automotive use cases, such as fiber or a single twisted-pair copper line as used by 100BASE-T1 and 1000BASE-T1. So, ideal conditions to be the next-gen communication standard for automotive networks?
We need to be careful: automotive Ethernet is not the same as an Ethernet network used in IT or at home. For instance, data from a LiDAR sensor must be transferred within a short and deterministic amount of time; otherwise, safe operation cannot be guaranteed. See Table 1 for some timing requirements of automotive applications.
Table 1. Examples of timing constraints
TSN - Time-Sensitive Networking
Ethernet as originally defined in IEEE 802.3 does not provide any real-time capabilities. Time-Sensitive Networking (TSN), a set of standards maintained by the IEEE 802.1 working group, extends Ethernet with real-time communication capabilities. TSN has three major tasks to fulfill: time synchronization, traffic scheduling, and configuration, as illustrated in Figure 1. The most important step in a time-sensitive network is to synchronize the clock of each network participant to the most precise source available. TSN uses the generalized Precision Time Protocol (gPTP) for this task. To meet a certain latency or to guarantee a certain priority for a data stream, TSN includes several standards that define how different kinds of traffic scheduling shall be implemented. In addition, TSN supports the exchange of configuration properties via high-level protocols such as the Stream Reservation Protocol (SRP). Table 2 lists a selection of important standards associated with TSN. Taken together, the TSN standards ensure a defined latency for traffic with real-time demands. The section on TSN traffic scheduling below provides an itemized overview of some of them. In summary, calculable delays, bandwidth allocation, shaping and queuing algorithms, and frame preemption establish determinism within existing Ethernet networks for the services that rely on it. Because queuing behavior, buffer requirements, and worst-case delays are mathematically predictable, TSN is highly suitable for safety-related applications.
Table 2. Selection of important TSN standards
TSN Configuration
How TSN-capable hardware is configured is highly application specific. For instance, an XML-based approach has been chosen to configure the RSwitch2 on the Renesas Vehicle Computer 3 (VC3). Linux-based operating systems offer their own interfaces, such as qdiscs (queueing disciplines). Other approaches may be available, depending on the needs of the application. The configuration interface itself usually needs no special support from the underlying hardware (HW); in most cases, software translates the input into the data format accepted by the hardware. In the case of virtualization, a hypervisor needs to create a clear barrier between the functionality that can be configured by a guest OS and the functions that are reserved for the privileged system level. Recently, TSN added specifications for high-level configuration mechanisms such as 802.1Qdd (Resource Allocation Protocol) and 802.1Qcw (YANG Data Models for Scheduled Traffic, Frame Preemption, and Per-Stream Filtering and Policing). As mentioned above, an implementation of such standards should be a hardware-independent piece of software and is thus beyond the scope of this article.
TSN Traffic Scheduling
TSN traffic shaping algorithms enable real-time traffic to coexist with regular best-effort traffic within one Ethernet network, while guaranteeing low latency and determinism for time-critical communication. This means that safety-related components can be integrated into a TSN-enabled Ethernet network. TSN offers a variety of standards to ensure a defined latency for traffic with real-time demands.
The Credit-Based Shaper (CBS) mechanism, specified in IEEE 802.1Qav, handles the prioritization of multiple network queues. It assigns every queue a credit budget based on its priority. The credit of a queue increases while frames are waiting in the queue and decreases while a frame is being sent. If the credit of a queue is negative, no more frames are sent from it, and another queue is scheduled. Limits based on the maximum frame size and the port transmission rate are also taken into account.
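The credit mechanism can be illustrated with a minimal sketch. It assumes a single shaped queue with a permanent backlog and no interfering traffic, and it uses the idle slope and send slope terms from IEEE 802.1Qav. The function name and parameters are hypothetical and do not reflect the RSwitch2 implementation.

```python
def cbs_start_times(frame_bits, port_rate_bps, idle_slope_bps):
    """Return the transmission start time (in seconds) of each backlogged frame.

    Simplified model: one credit-based-shaper queue, always backlogged,
    no other traffic on the port.
    """
    # While a frame is sent, credit drains at the (negative) send slope.
    send_slope_bps = idle_slope_bps - port_rate_bps
    credit_bits = 0.0
    now_s = 0.0
    starts = []
    for bits in frame_bits:
        if credit_bits < 0:
            # The queue must wait until the credit has recovered to zero;
            # while waiting, credit grows at the idle slope.
            now_s += -credit_bits / idle_slope_bps
            credit_bits = 0.0
        starts.append(now_s)
        tx_time_s = bits / port_rate_bps
        credit_bits += send_slope_bps * tx_time_s
        now_s += tx_time_s
    return starts


if __name__ == "__main__":
    # Three 1500-byte frames on a 100 Mbit/s port with 25 Mbit/s reserved.
    for start in cbs_start_times([12_000] * 3, 100e6, 25e6):
        print(f"frame starts at {start * 1e6:.0f} us")
```

In this example the frames start 480 microseconds apart, which is the frame size divided by the idle slope. The shaped queue therefore never exceeds its reserved bandwidth, even though the port itself is four times faster.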
Another scheduling technique, the time-aware shaper (TAS), is specified in IEEE 802.1Qbv. It implements a guard band in front of scheduled time-critical traffic. This prevents low-priority traffic from being transmitted if that transmission cannot finish before its traffic window closes. In addition, TSN offers a preemption mechanism that can interrupt a frame transmission midway so that a frame with higher priority is sent first. Afterwards, the original transmission continues. This mechanism is described in IEEE 802.1Qbu and IEEE 802.3br. These are only some of the most important solutions that the TSN set of standards offers. Traffic scheduling is assisted by TSN-capable hardware such as the Renesas RSwitch2, which supports up to 8 different priorities and several arbitration algorithms, including those just outlined. The following list gives an overview of the most important traffic scheduling features of the RSwitch2 hardware:
- Forwarding and Queuing (IEEE 802.1Qav)
- Frame preemption (IEEE 802.1Qbu + IEEE 802.3br)
- Enhancements for scheduled traffic (IEEE 802.1Qbv)
- Per stream filtering, metering, and policing (IEEE 802.1Qci)
- Several QoS mechanisms (priority control, resource management, ...)
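The guard band behavior of the time-aware shaper described above can be sketched as follows. This is a simplified model under stated assumptions: the schedule is a cyclic list of windows, each window opens the gates of a set of queues, and a frame may only start if it fits completely into the current window. The names GateEntry and may_start are hypothetical, and the sketch ignores frame preemption and gates that stay open across consecutive windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateEntry:
    duration_ns: int         # length of this window
    open_queues: frozenset   # queues whose gate is open during the window


def may_start(schedule, queue, frame_bits, link_bps, t_ns):
    """Return True if `queue` may start sending a frame at time t_ns."""
    cycle_ns = sum(entry.duration_ns for entry in schedule)
    offset_ns = t_ns % cycle_ns
    # Transmission time in ns, rounded up (integer ceiling division).
    tx_ns = -(-frame_bits * 1_000_000_000 // link_bps)
    window_start = 0
    for entry in schedule:
        window_end = window_start + entry.duration_ns
        if window_start <= offset_ns < window_end:
            gate_open = queue in entry.open_queues
            # Guard band: the frame must finish before the window closes.
            fits = offset_ns + tx_ns <= window_end
            return gate_open and fits
        window_start = window_end
    return False


if __name__ == "__main__":
    # 1 ms cycle: 300 us for time-critical queue 7, 700 us for best-effort queue 0.
    schedule = [
        GateEntry(300_000, frozenset({7})),
        GateEntry(700_000, frozenset({0})),
    ]
    # A 1500-byte frame takes 12 us at 1 Gbit/s.
    print(may_start(schedule, 0, 12_000, 1_000_000_000, 400_000))
    print(may_start(schedule, 0, 12_000, 1_000_000_000, 995_000))
```

The second call is rejected although the gate of queue 0 is open: the frame would still be on the wire when the window of queue 7 begins, so it is held back.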
TSN Time Synchronization
The TSN standard IEEE 802.1AS, Timing and Synchronization for Time-Sensitive Applications, defines the generalized Precision Time Protocol (gPTP). This protocol synchronizes the time between all TSN participants and selects the best master clock, called the grandmaster. The synchronization mechanism calculates the transmission time between a master and a slave by exchanging timestamps within sync and delay request messages. Figure 2 shows the two possible approaches. The time difference between master and slave can be calculated by:
$$t_S - t_M = T_{S1} - T_{M1} - T_{Delay}$$
Figure 2. Time synchronization approaches
The delay can then be calculated from the round-trip time:
$$T_{Delay} = \frac{T_{MDelay} + T_{SDelay}}{2}$$
$$T_{Delay} = \frac{(T_{S1} - T_{M1}) + (T_{M2} - T_{S2})}{2}$$
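The two formulas can be checked with a short worked example. The sketch below assumes the usual reading of the symbols: the master sends the sync message at T_M1, the slave receives it at T_S1, the slave sends its delay request at T_S2, and the master receives it at T_M2. It also assumes that the path delay is the same in both directions. The function name is hypothetical.

```python
def ptp_offset_and_delay(t_m1, t_s1, t_s2, t_m2):
    """Return (offset, delay) of the slave clock relative to the master.

    All timestamps must use the same unit, for example nanoseconds.
    Assumes a symmetric path delay between master and slave.
    """
    delay = ((t_s1 - t_m1) + (t_m2 - t_s2)) / 2
    offset = t_s1 - t_m1 - delay
    return offset, delay


if __name__ == "__main__":
    # Constructed example: the slave clock is 500 ns ahead, the link delay is 200 ns.
    offset, delay = ptp_offset_and_delay(t_m1=1000, t_s1=1700, t_s2=2500, t_m2=2200)
    print(offset, delay)
```

The slave would then correct its clock by the computed offset. Any asymmetry between the two directions shows up directly as an offset error, which is one reason why precise timestamping matters.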
The generalized Precision Time Protocol needs precise transmission timestamps to work accurately. Timestamping can take place at any of the several layers involved in a transmission, but only a hardware-assisted solution achieves the desired accuracy. Figure 3 shows an approximation of the accuracy achieved by the different approaches.
The Renesas RSwitch2 comes with an associated PTP HW clock. This clock either acts as the grandmaster clock or is synchronized to an external grandmaster by the PTP software stack. On transmission, the MAC part of the RSwitch2 captures the time directly from this high-precision HW clock and adds the timestamp to the transmitted frame. In addition, software layers can access the same HW clock via a common register access method to provide this precise, synchronized clock to other components of the system. Typical applications are the synchronization of the system clock and the acquisition of the correct presentation time for synchronous audio/video streaming, as achieved by the Audio/Video Transport Protocol (AVTP). See Figure 4 for an overview of the RSwitch2 HW timestamping design.
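On a Linux system, a PTP hardware clock is typically exposed as a character device and can be read like a POSIX clock. The following sketch shows how an application might read such a clock. The device path /dev/ptp0 is an assumption and depends on the platform, and the helper names are hypothetical. The clock ID conversion mirrors the FD_TO_CLOCKID macro used with the Linux PTP interface.

```python
import os
import time

CLOCKFD = 3  # marks a file-descriptor-based (dynamic) POSIX clock on Linux


def fd_to_clockid(fd: int) -> int:
    """Convert an open PHC file descriptor into a dynamic POSIX clock ID."""
    return (~fd << 3) | CLOCKFD


def read_phc_ns(device: str = "/dev/ptp0") -> int:
    """Read the current PHC time in nanoseconds (Linux only).

    The device path is an assumption; read-only access is sufficient
    for reading the clock.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        return time.clock_gettime_ns(fd_to_clockid(fd))
    finally:
        os.close(fd)
```

Calling read_phc_ns() requires a PHC device and sufficient permissions, so it is not invoked here.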
Figure 4. RSwitch2 HW Timestamping
So far, we have seen that the TSN standards have indeed paved the way for bringing real-time-sensitive applications to the world of Ethernet. But what about the move toward a zone-based architecture, which will reduce the number of physical ECUs? In the following, we show one approach to using TSN in a virtualized multi-domain setup within one SoC.
The Demonstration Setup
In our demonstration setup, we want to show the aggregation of multiple ECUs on one System-on-Chip product. Despite this aggregation, each of the now integrated ECUs shall have the same TSN capabilities as a physical ECU with a TSN-capable network interface. Figure 5 illustrates such an integration. For simplicity, this demonstrator setup uses only two operating system domains. It is closely based on the setup shown in the blog post The Art of Networking (Series 3): The power of virtualization. For a detailed description of the Xen hypervisor setup and the sharing of queues between different guest operating systems, please refer to that article. Here, we show how to enable TSN capabilities within the guest OS.
Figure 5. ECU consolidation aims to move from a single-function ECU approach (left) to Multi-function ECUs (right)
Hardware Arrangement
The hardware used in this approach is the Vehicle Computer 3 (VC3) board, equipped with a Renesas R-Car H3 SoC and a TSN Ethernet switch (RSwitch2). The Ethernet switch is implemented on an FPGA connected to the R-Car through PCIe. The VC3 is a demonstration platform for leading-edge technology. Of course, any other product that integrates the RSwitch2, such as the R-Car S4, provides the same outstanding features. For clock synchronization, an external PC equipped with a TSN-capable network interface card is used. In the scenario shown, this PC is the PTP grandmaster, and the VC3 synchronizes its PTP hardware clock to it. PTP for Linux (PTP4L) is the software stack used to implement time synchronization via gPTP.
Each function that does not have direct hardware access to the RSwitch2 exchanges its network data via a virtualized network interface, which exposes dedicated hardware queues of the RSwitch2 IP to its domain. Because of this, Ethernet frames from any virtual machine can be timestamped by the RSwitch2 IP in the same way as frames from domains with physical access. As mentioned before, the sharing of network queues between multiple operating systems has been described in detail in previous blog posts.
Software Arrangement
For the software setup, Xen v4.14 was chosen as the hypervisor. Two guest operating systems (also called domains) are running on Xen:
- dom0: a privileged domain that has direct access to most of the R-Car peripherals, the RSwitch2 IP, and the PTP hardware clock (PHC). The PTP for Linux (PTP4L) software stack also runs in this domain. It synchronizes the PHC to an external grandmaster.
- domU: an unprivileged domain that does not have direct access to any specific hardware device. However, domU has access to two RSwitch2 queues (one RX and one TX), which of course support the HW timestamping capability. It can also access the PTP hardware clock (PHC) read-only via a PTP clock driver that uses Xen I/O rings to get the time from the PTP driver in dom0. The clock abstraction that this virtualized driver creates in domU is referred to below as a virtual PTP hardware clock (vPHC).
The whole setup is illustrated in Figure 6. The two domains serve as an example of the integrated functions of a multi-function ECU. In both domains, the sample application that makes use of the provided (v)PHC is chrony, a program that synchronizes the system clock to external time sources. In our case, chrony synchronizes the system clock of a domain to the (v)PHC. As both system timers are based on the same Arm® hardware system counter, the deviation of the PHC from the corresponding system timer gives an indication of the quality this arrangement can achieve. As domU is restricted to read-only access to the PHC, it cannot adjust this clock. This means that dom0 and domU share the same clock domain. Of course, the clock synchronization can be moved from dom0 to any other domain, provided that this domain is granted access to the PHC. In case more time domains are required, implementations such as the R-Car S4 offer two separate PHCs that can be assigned to different domains.
Figure 6. Setup with virtualized PTP hardware clock
Measurement of Clock Deviation
The chrony application can measure the deviation between the PHC and the system timer. As the system timers of both domains are derived from the same hardware system counter and both PHCs are derived from the same physical hardware clock, we can judge the quality of our implementation from the measured deviation values. Figure 7 shows the deviations over time, with one sample measured every second. The maximum observed deviation is much higher between the virtualized system clock and the vPHC in domU. This is plausible: more system services and hardware drivers run in dom0, which might delay the scheduling of domU for a small but measurable amount of time. Although the jitter in domU is generally higher, the drift of both clocks is similar, as would be expected from their dependency on the same hardware counters. Of course, a detailed analysis of the drift behavior would require a long-term evaluation with many more samples. Besides this, in a real-world scenario, the impact of the overall system load on the jitter should be monitored.
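For such an evaluation, the deviation samples can be reduced to a few figures. The sketch below is one possible way to do this and is not the method used for Figure 7: it reports the maximum absolute deviation, estimates the drift as the least-squares slope over time, and treats the spread of the remaining residuals as jitter. The function name is hypothetical, and the values in the example are synthetic, not measurements.

```python
from statistics import mean, pstdev


def summarize_deviation(samples_ns, period_s=1.0):
    """Summarize clock deviation samples taken every `period_s` seconds.

    Requires at least two samples.
    """
    times = [i * period_s for i in range(len(samples_ns))]
    t_mean = mean(times)
    d_mean = mean(samples_ns)
    # Least-squares slope of deviation over time, interpreted as drift.
    numerator = sum((t - t_mean) * (d - d_mean) for t, d in zip(times, samples_ns))
    denominator = sum((t - t_mean) ** 2 for t in times)
    drift = numerator / denominator
    # What remains after removing the linear trend is treated as jitter.
    residuals = [
        d - (d_mean + drift * (t - t_mean)) for t, d in zip(times, samples_ns)
    ]
    return {
        "max_abs_ns": max(abs(d) for d in samples_ns),
        "drift_ns_per_s": drift,
        "jitter_ns": pstdev(residuals),
    }


if __name__ == "__main__":
    # Synthetic values for illustration only.
    print(summarize_deviation([10, 12, 14, 16]))
```

Separating drift from jitter in this way makes it easier to compare dom0 and domU, since a shared drift would be expected from the common hardware counters while the jitter reflects scheduling effects.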
All in all, the total deviation inside the virtualized environment is well within the limits of a typical sensor application or of A/V streaming, as shown in Table 1 at the beginning.
Figure 7. Deviation of system clock to (v)PHC
Conclusion
We have shown how to conveniently provide TSN capabilities to virtualized domains on a Renesas platform that features the RSwitch2 switching engine. As the queues of the RSwitch2 can be handled directly by the virtualized domains through a hypervisor, hardware timestamping is instantly available to these domains. We laid out how a hypervisor abstraction provides read-only access to a PHC, with quite good quality, in case multiple domains share the same time base. In addition, Renesas SoCs like the R-Car S4 provide an additional PHC that can be used exclusively by one domain. With solutions like the Renesas R-Car S4 SoC, it is not only possible to connect external devices to the automotive Ethernet switch, but also feasible to virtualize functionalities (ECUs) that formerly needed a dedicated microprocessor. Thanks to the fully integrated and powerful switching technology provided by the RSwitch2, each virtualized function with real-time demands can communicate via TSN Ethernet. For further information on our automotive networking solutions, check the links below.
Links:
Learn more about Vehicle Computer 3
Learn more about R-Car H3
Learn more about Arm Generic Timer
Read other blog posts in this series: The Art of Networking Part 1, Part 2, Part 3, Part 4, Part 5, Part 6