Motivation
CAN bus insecurity is probably the most cited security problem in automotive networks today. Scores of papers have been published on why vehicles are vulnerable to attack, with the CAN bus as the centerpiece of those claims. At the root of this is the fact that the CAN bus, invented in the 1980s, was not designed with cyber threats in mind, perhaps understandably since vehicle connectivity was not yet in scope. As vehicle functions evolved and the need for security became apparent, many solutions were proposed to tackle the CAN insecurity problem. Perhaps the most convincing is the one proposed by AUTOSAR, Secure Onboard Communication (SecOC), which authenticates CAN frames using shared symmetric keys and adds freshness protection:
Figure 1. SecOC flow in AUTOSAR (Source: Autosar.org)
Due to the popularity of AUTOSAR, this solution is the most prevalent today, but it imposes a performance penalty on the host CPU because of the multiple software layers needed for freshness management and data authentication.
Figure 2. SecOC BSW in AUTOSAR Layered Architecture (Source: Autosar.org)
As shown above, when a secure PDU is received, it is routed to SecOC for MAC verification. The SecOC relies on the Freshness Value Manager (FVM) to determine if the received freshness value is within the acceptable window. Here various FVM strategies are possible based on the OEM preference. Due to the limited CAN payload length, many solutions require a truncated freshness value which results in the need for periodic synchronization of the full freshness value. Following the response from the FVM, the SecOC sends the request to the Crypto Service Manager (CSM) to perform the cryptographic function to verify the MAC. The CSM may rely on a software library or the Hardware Security Module (HSM) crypto driver to perform this job before returning the result to the SecOC. Finally, the SecOC either forwards the PDU to the PDU router if the verification is successful or raises an error flag and drops the frame. The CPU workload corresponding to these tasks is significant.
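The receive path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not AUTOSAR code: the function names, the 8-bit truncated freshness value, the 4-byte truncated MAC, and the acceptance window of 16 are all hypothetical, and HMAC-SHA-256 from the Python standard library stands in for the AES-based MAC so that the sketch runs without third-party packages.

```python
import hashlib
import hmac

TRUNC_FV_BITS = 8   # assumed number of freshness bits carried in the frame
MAC_LEN = 4         # assumed truncated MAC length in bytes
WINDOW = 16         # assumed acceptance window (hypothetical value)


def compute_mac(key: bytes, data_id: int, payload: bytes, full_fv: int) -> bytes:
    """Stand-in for the CSM job: MAC over data ID, payload and full freshness value."""
    msg = data_id.to_bytes(2, "big") + payload + full_fv.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:MAC_LEN]


def reconstruct_fv(last_fv: int, trunc_fv: int) -> int:
    """Stand-in for the FVM: rebuild the full counter from its low bits."""
    mask = (1 << TRUNC_FV_BITS) - 1
    candidate = (last_fv & ~mask) | trunc_fv
    if candidate <= last_fv:              # low bits wrapped, or the frame is stale
        candidate += 1 << TRUNC_FV_BITS
    return candidate


def secoc_receive(key, data_id, payload, trunc_fv, mac, last_fv):
    """Return (accepted, new_last_fv); on failure the stored value is unchanged."""
    full_fv = reconstruct_fv(last_fv, trunc_fv)
    if full_fv - last_fv > WINDOW:        # outside the acceptance window: drop
        return False, last_fv
    expected = compute_mac(key, data_id, payload, full_fv)
    if not hmac.compare_digest(expected, mac):
        return False, last_fv             # MAC mismatch: raise error, drop frame
    return True, full_fv                  # forward the PDU to the PDU router
```

Even in this reduced form, every received frame costs a freshness reconstruction, a MAC computation, and several hand-offs between modules, which is the per-frame software overhead the text refers to.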
Since the number of messages and CAN channels keeps increasing, the CPU overhead problem will only get worse. To meet the throughput demands, chip vendors offer AES accelerators integrated into HSMs. However, experience has shown that a central HSM tasked with handling the authentication requests of hundreds of messages becomes a bottleneck due to the overhead of copying data in and out of the HSM. Note that the AES engine latency makes up only a small fraction of the total time spent in the HSM to perform the authentication. The bulk of the latency is due to the software overhead of job setup, data transfer, job scheduling, key fetching, and responding to the host. With the introduction of CAN XL, which will support payloads of up to 2048 bytes and bit rates of up to 10 Mbit/s, the performance demands on the HSM are bound to grow. If encryption is needed in addition to authentication, the additional overhead of transferring data out of the HSM will further add to the overall latency for the host CPU to transmit or receive data. Clearly, today’s security hardware and software architectures are inadequate.
Threat Model
There are many threats against the CAN bus. The list here shows the most worrisome threats:
- Spoofing: due to the broadcast nature of the CAN bus, any CAN node can send any message by spoofing the CAN ID, DLC, and payload
- Sniffing and Replay: due to the open nature of the CAN data and the abundance of CAN analysis tools, CAN frames can be easily sniffed and replayed to cause ECUs to perform certain functions such as unlocking the door or applying the brakes
- Repudiation: due to the broadcast nature of the CAN bus, when a malicious CAN frame is transmitted, there is no way to prove which ECU was responsible for sending the fake message
- Resource Exhaustion: when message authentication is enabled using AUTOSAR SecOC, a malicious attacker can send carefully selected freshness values that keep the receiver busy verifying the authenticity of the same frame to exhaust CPU resources
- Denial of Service: a malicious ECU can continuously send messages with CAN ID 0, which always win arbitration and prevent other ECUs from transmitting successfully on the bus. Furthermore, non-conforming CAN hardware can kill specific CAN frames by corrupting certain fields, for example by inserting stuff bits or modifying the physical layer CRC, to drive the target ECU into the BusOff state.
CANsec can address all the above threats except Denial of Service, which would require additional mechanisms for detection and prevention.
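The Denial of Service case follows directly from how CAN arbitration works: a dominant bit (0) overrides a recessive bit (1), so the lowest identifier always wins. The sketch below is a simplified model of that rule for 11-bit identifiers. The function name is hypothetical, and the model ignores everything outside the identifier field.

```python
def arbitrate(ids, id_bits=11):
    """Return the CAN ID that wins bitwise arbitration among simultaneous senders."""
    contenders = set(ids)
    for bit in range(id_bits - 1, -1, -1):                    # MSB is sent first
        # The bus behaves like a wired-AND: any dominant 0 overrides recessive 1.
        bus_level = min((i >> bit) & 1 for i in contenders)
        # Nodes that sent recessive but read dominant back off.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus_level}
    (winner,) = contenders                                     # identical IDs collapse to one
    return winner


# A node flooding ID 0 beats every legitimate sender in each arbitration round.
assert arbitrate([0x123, 0x000, 0x7FF]) == 0x000
```

Because this behavior is part of the bus access mechanism itself, authenticating frames does not prevent it, which is why separate detection and prevention mechanisms are needed.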
Secure CAN Controller
To cope with the increasing demand for throughput and bandwidth of data authentication while lowering CPU overhead, it is recommended to integrate a CANsec layer into the CAN controller to support authentication and/or encryption at line speed.
Figure 3. CANsec Architecture
During ECU startup, in-vehicle communication keys are cached from the HSM secure memory into the CAN controller's dedicated KEY RAM. This RAM is only accessible by the HSM through a private bus to prevent malicious CPU access. It is also only directly accessible by the AES engine inside the CAN controller. Additional CAN registers are added to allow the user to specify a key index mapping for each Secure Channel Identifier (SCI). Similarly, a dedicated register is added per message to store the frame freshness value. The latter must be stored in secure memory prior to ECU shutdown to ensure synchronization on the next boot cycle. When a CAN transmit request is received, the CAN controller performs the following sequence:
- Fetch the key plaintext value based on the key index of the SCI and load it in the AES engine
- Fetch the freshness value and increment it by 1
- Feed the CAN ID, DLC, CAN payload, freshness value, and payload type into the AES engine to generate the Integrity Check Value (ICV)
- Insert the CANsec header (Freshness Value) and ICV into the payload while the frame is being transmitted at line speed
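The transmit sequence above can be modeled in software as follows. This is only a behavioral sketch under stated assumptions: the class and field names, the 8-byte freshness value, the 16-byte ICV, and the frame layout (header, payload, ICV) are hypothetical and are not taken from CiA 613-2. HMAC-SHA-256 from the Python standard library stands in for the AES engine so that the example runs without third-party packages.

```python
import hashlib
import hmac
from dataclasses import dataclass, field

FV_LEN = 8    # assumed freshness value length in bytes
ICV_LEN = 16  # assumed ICV length in bytes


@dataclass
class CansecController:
    key_ram: dict                                   # key index -> key (written by the HSM)
    sci_to_key_index: dict                          # SCI -> key index (register mapping)
    freshness: dict = field(default_factory=dict)   # SCI -> freshness register

    def transmit(self, sci: int, can_id: int, payload: bytes, payload_type: int = 0) -> bytes:
        # Step 1: fetch the key selected by the key index of the SCI.
        key = self.key_ram[self.sci_to_key_index[sci]]
        # Step 2: fetch the freshness value and increment it by 1.
        fv = self.freshness.get(sci, 0) + 1
        self.freshness[sci] = fv
        header = fv.to_bytes(FV_LEN, "big")
        # Step 3: compute the ICV over ID, DLC, payload, freshness and payload type.
        dlc = len(payload)                          # simplification: DLC taken as byte count
        msg = (can_id.to_bytes(4, "big") + dlc.to_bytes(2, "big")
               + payload + header + bytes([payload_type]))
        icv = hmac.new(key, msg, hashlib.sha256).digest()[:ICV_LEN]
        # Step 4: insert the CANsec header and ICV around the payload.
        return header + payload + icv
```

For example, `CansecController({3: b"\x11" * 16}, {0x10: 3}).transmit(0x10, 0x123, b"\xAA\xBB")` returns a 26-byte secured payload under these assumed lengths. In hardware, these steps run in the data path while the frame is being sent rather than as sequential software calls.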
During reception, a similar process is followed with the additional steps of comparing the received Freshness Value and ICV to the expected values. To check freshness, the received value is compared to the stored freshness value plus the preconfigured acceptance window. This is necessary to allow ECUs that may go out of synchronization to resynchronize to the received freshness value without complex freshness value management strategies. If both Freshness and ICV values are as expected, the CAN controller updates the Freshness Value register with the received value and sets the reception flag to let the application process the data. Otherwise, it sets the error flag to notify the host CPU that a frame was received but that the data was invalid.
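A minimal sketch of the reception check follows. It assumes that a frame is fresh when its freshness value is greater than the stored value and no more than the acceptance window ahead of it. That interpretation, the window size of 32, the field lengths, the frame layout, and all names are assumptions made for illustration. As before, HMAC-SHA-256 stands in for the AES engine.

```python
import hashlib
import hmac

FV_LEN = 8    # assumed freshness value length in bytes
ICV_LEN = 16  # assumed ICV length in bytes
WINDOW = 32   # assumed preconfigured acceptance window


def compute_icv(key, can_id, payload, fv, payload_type=0):
    """Stand-in for the AES engine: ICV over ID, DLC, payload, freshness, type."""
    msg = (can_id.to_bytes(4, "big") + len(payload).to_bytes(2, "big")
           + payload + fv.to_bytes(FV_LEN, "big") + bytes([payload_type]))
    return hmac.new(key, msg, hashlib.sha256).digest()[:ICV_LEN]


def receive(key, can_id, frame, stored_fv, payload_type=0):
    """Return (flag, payload, new_stored_fv), where flag is 'rx' or 'error'."""
    fv = int.from_bytes(frame[:FV_LEN], "big")
    payload = frame[FV_LEN:-ICV_LEN]
    icv = frame[-ICV_LEN:]
    # Freshness: newer than the stored value, but within the acceptance window.
    fresh = stored_fv < fv <= stored_fv + WINDOW
    authentic = hmac.compare_digest(
        icv, compute_icv(key, can_id, payload, fv, payload_type))
    if fresh and authentic:
        return "rx", payload, fv          # update register, set reception flag
    return "error", None, stored_fv       # set error flag, keep stored value
```

Because a valid frame moves the stored value forward to the received one, a receiver that has fallen behind by less than the window catches up on the next valid frame, while a replayed frame fails the freshness check.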
Proof of Concept
Renesas has carried out a feasibility study to show that implementing the CANsec concept is worthwhile. A prototype CANsec implementation was realized in an FPGA, based on an early version of the CiA 613-2 CANsec specification [1]. For a like-for-like comparison, we also implemented the CANsec protocol in software. SecOC is not a suitable baseline for the reasons stated above.
The graph shows the CPU time required by the software implementation. As expected, the processing time is proportional to the payload size, since more data takes longer to process. The RX and TX latencies of the CANsec software are not expected to differ in slope; both curves should have nearly the same slope. The TX curve, however, shows one larger step that does not appear on the RX curve, which may be an area for software optimization. The general trend is the same for reception and transmission.
Figure 4. CANsec CPU processing time
The software was running on a 1.2 GHz Arm core of the R-Car H3 SoC. Scaled down to a 400 MHz MCU, this corresponds to a CPU occupation of 25% at 100% bus load (with a 252-byte payload). This is a considerable load. Since such an MCU typically has several CAN XL channels, the CPU would quickly be overloaded. Therefore, it makes sense to implement CANsec as a hardware acceleration function in the CAN XL IP.
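The arithmetic behind this scaling can be written out as follows. The sketch assumes that processing time scales inversely with clock frequency and that the load of several channels simply adds up. Both are simplifications that ignore differences in core architecture, memory, and caches. The 25% figure is taken from the text above, and the function names are hypothetical.

```python
MCU_LOAD_PER_CHANNEL = 0.25   # from the text: 400 MHz MCU, 100% bus load, 252-byte payload


def scale_load(load, f_from_hz, f_to_hz):
    """CPU load after moving the same work to another clock, assuming equal cycle counts."""
    return load * f_from_hz / f_to_hz


def total_load(channels, per_channel=MCU_LOAD_PER_CHANNEL):
    """Combined CPU load if each channel adds the same load (assumption)."""
    return channels * per_channel


# Load implied on the 1.2 GHz core under the linear-scaling assumption: about 8.3%.
rcar_load = scale_load(MCU_LOAD_PER_CHANNEL, 400e6, 1.2e9)

for n in (1, 2, 4):
    print(f"{n} channel(s): {total_load(n):.0%} CPU load")
```

Under these assumptions, four fully loaded channels would already consume the entire 400 MHz CPU for CANsec processing alone.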
The next graph shows the processing latency of the hardware implementation. The CANsec module is implemented in the data path of the CAN controller. To ensure processing at line speed, its latency must be shorter than the time the CAN XL frame occupies the CAN bus. In our prototype implementation, the CANsec module is clocked at 80 MHz and uses 16 S-boxes for the AES algorithm. With this setup, the measured latencies are far below the time the message occupies the bus. Further optimization is possible, for instance by using fewer S-boxes to reduce chip area.
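The line-speed condition can be illustrated with a rough estimate. The 80 MHz clock and the 10 Mbit/s bit rate come from the text. Everything else is an assumption: 11 clock cycles per 16-byte AES block (roughly one AES-128 round per cycle with 16 parallel S-boxes), and a bus time that counts payload bits only, which underestimates the real frame time and therefore makes the comparison conservative. These are not measured values from the prototype.

```python
import math


def frame_time_s(payload_bytes, bit_rate_bps=10e6):
    """Lower bound on bus occupation: payload bits only, no header, CRC or stuffing."""
    return payload_bytes * 8 / bit_rate_bps


def aes_latency_s(payload_bytes, clock_hz=80e6, cycles_per_block=11):
    """Rough AES processing time; cycles_per_block is an assumed figure."""
    blocks = math.ceil(payload_bytes / 16)
    return blocks * cycles_per_block / clock_hz


def meets_line_speed(payload_bytes, bit_rate_bps=10e6):
    """True if the estimated crypto latency is shorter than the frame's time on the bus."""
    return aes_latency_s(payload_bytes) < frame_time_s(payload_bytes, bit_rate_bps)


for size in (64, 252, 2048):
    print(size, "bytes:",
          f"{aes_latency_s(size) * 1e6:.2f} us crypto vs",
          f"{frame_time_s(size) * 1e6:.1f} us on the bus")
```

With these assumed figures the crypto time stays well below the bus time across the payload range, which also suggests why trading S-boxes for area leaves headroom.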
Figure 5. CANsec Hardware Latency time
Conclusion
CANsec is a practical solution to securing the CAN bus against the most common threats facing CAN networks while reducing host CPU overhead and the demand for larger and more powerful HSMs. Offloading the high-speed authentication/encryption task from the HSM into crypto engines that are distributed into peripherals can alleviate the CPU burden. By performing authentication at line speed, it provides seamless security to the application with minimal signal latency. Having the HSM in control of the key caching enables the creation of security policies that restrict the ability of an ECU to send secure messages if the application is no longer deemed trusted.
Further, it was shown that CANsec can be implemented in hardware as well as in software. A software implementation will of course have the same drawbacks as today’s SecOC-based implementations. However, it would allow a smooth migration in the early phase, when not all controllers support CANsec in hardware yet.
References
[1] CiA 613-2 CAN XL add-on services – Part 2: Security