Daruma Design Pattern for Safe and Continuous Automated Driving

Andrei Terechko (NXP), Yuting Fu (NXP), Caspar Hanselaar (TU/e), Kees Schuerman (NXP)
March 2023

What's wrong with modern automated vehicles?

The commercial success of automated vehicles requires safe operation in many driving scenarios and environmental conditions. However, automated driving is often restricted to selected roads, daylight, and good weather conditions, while many hazardous situations can only be handled by a human, such as a safety driver or teleoperator. These restrictions define the Operational Design Domain (ODD), within which the vehicle was designed to operate safely. Unfortunately, modern automated vehicles have very limited ODDs.

Furthermore, today's automated vehicles drive very conservatively due to various system limitations. This behavior results in unnecessary and risky stops or disengagements, i.e., control transfers from the vehicle to the safety driver. The vehicle's ability to continue automated driving to its destination within the ODD without unnecessary stops and disengagements is called availability. Overall, modern automated vehicles lack availability.

The driving capabilities of automated vehicles are provided by an Automated Driving System (ADS). The ADS reads data from environment sensors and controls the vehicle using powertrain actuators, while making decisions in control software running on interconnected Systems-on-Chip. Most of the existing architectures for automated driving concentrate on handling system faults. Random hardware faults, such as a memory bit flip due to radiation, and systematic faults in hardware and software, such as an illegal memory address access, have been the primary focus of ISO 26262 [1], also known as the automotive functional safety standard. Today, fault monitors are well understood and included in every safety-critical design. Furthermore, an ADS often includes redundant and heterogeneous Automated Driving (AD) channels with a high-level arbiter. If a fault monitor detects a fault in the driving channel, the arbiter selects another channel to operate the vehicle [3]. A simplified architecture of an AD channel is illustrated in Figure 1 below.

Andrei_Terechko_0-1679658477785.png

Figure 1. Simplified architecture of an Automated Driving (AD) channel. The environment sensors feed the perception and localization modules, which together with the prediction module compute the world model. The world model consists of road objects with their anticipated motion plans. Based on the global route and the world model, the motion planner computes the ego vehicle trajectory within the traffic restrictions found in the map or the world model. Finally, the vehicle control outputs setpoints for the steering, braking and acceleration actuators to follow the planned trajectory.

According to our study [5], wide ADS adoption is mainly hampered by functional insufficiencies as defined in the newly released ISO 21448 standard “Safety Of The Intended Functionality” or SOTIF [2]. Examples of functional insufficiencies include an insufficiently trained neural network or an inaccurate sensor. Unfortunately, the traditional fault handling mechanisms cannot resolve functional insufficiencies. For example, if the primary driving channel's neural network misses an object in front of the vehicle, no fault is raised, so the system arbiter cannot detect the risk and consequently will not switch to another channel. In this example, the inaction leads to a collision. Furthermore, modern development processes fall short of identifying all relevant functional insufficiencies at design time due to the sheer variety of driving scenarios and environmental factors. Despite rigorous safety assurance recommendations from the SOTIF standard based on scenarios and ODDs, there will always be unknown functional insufficiencies present at design time that can trigger hazardous situations at runtime.
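To make the limitation concrete, here is a minimal Python sketch of a purely fault-driven arbiter. It is not taken from any of the referenced systems: the `ChannelStatus` type, its fields and the `fault_based_arbiter` function are hypothetical names chosen for illustration. The arbiter switches channels when a fault monitor fires, but stays on the main channel when that channel merely fails to perceive an obstacle.

```python
from dataclasses import dataclass


@dataclass
class ChannelStatus:
    name: str
    fault_detected: bool  # output of the classical fault monitors
    sees_obstacle: bool   # whether this channel's world model contains the obstacle


def fault_based_arbiter(channels, active):
    """Keep the active channel unless its fault monitor fired; otherwise
    fall back to the first fault-free channel in priority order."""
    by_name = {c.name: c for c in channels}
    if not by_name[active].fault_detected:
        return active
    for c in channels:
        if not c.fault_detected:
            return c.name
    return None  # no healthy channel left


# Case 1: a hardware or software fault in the main channel is detected.
faulty = [ChannelStatus("main", True, True), ChannelStatus("backup", False, True)]
choice_on_fault = fault_based_arbiter(faulty, "main")

# Case 2: the main channel is fault-free but its perception misses the obstacle.
insufficient = [ChannelStatus("main", False, False), ChannelStatus("backup", False, True)]
choice_on_miss = fault_based_arbiter(insufficient, "main")

print(choice_on_fault)  # backup
print(choice_on_miss)   # main: the arbiter never looks at sees_obstacle
```

In the second case the `sees_obstacle` field plays no role in the decision, which is exactly the gap that functional insufficiencies exploit.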

Daruma: a design pattern for cross-channel analysis to improve safety and availability

In this article we discuss a novel design pattern, Daruma, which improves both the safety and the availability of an ADS. Note that it is easy to build a vehicle that is safe but not available (it simply never drives), and equally easy to build a fast vehicle that is not safe. Daruma's improvements stem from dynamically balancing safety and availability by selecting an optimal channel. Figure 2 below, taken from [5], illustrates how the capabilities of Automated Driving (AD) channels can complement each other. For each situation, marked by a red pin, Daruma must select an AD channel that can handle the situation safely. The colored areas in the figure represent the situations that the channels can handle, while the white hole represents the insufficiencies that no channel can address. In general, Daruma addresses both known unsafe and unknown unsafe functional insufficiencies. Note that the arbiter switches between channels even if there are no faults in the system.

Andrei_Terechko_1-1679658477882.png

Figure 2. Conceptual illustration of overlapping functional insufficiencies and capabilities of multiple AD channels from [5]. Capabilities of the AD channels can be complementary. By selecting a proper AD channel at runtime, insufficiencies can be mitigated.

Figure 3 below presents the architecture of the Daruma design pattern. The AD channels provide high-level channel state information to the Daruma module, including the world model, motion predictions, ego trajectory and detected traffic rules. Daruma operates independently of the AD channels, staying compatible with innovations in sensor fusion, neural networks, and heuristic algorithms. The channel state information is analyzed using three major types of algorithms: similarity, risk analysis and preferences. The cross-channel analysis produces various safety and availability metrics that are fused with the output of classical fault monitors, ODD monitors, etc. Subsequently, the aggregated score per channel influences the high-level arbiter's decision on which channel to select for driving at runtime. In Figure 3 below, Daruma switches over from the risky channel 1 to the more reliable channel 2 even though the ADS is free from faults. Note that the Daruma world models (WMx) can include traffic rule restrictions identified by the channels. More information on the Daruma design pattern can be found in [5].

daruma_block_dia_animation.gif

Figure 3. High-level system architecture of an ADS with three heterogeneous AD channels and the Daruma module. The high-level information with the world model and ego vehicle trajectory from the channels is forwarded to the Daruma module along the thick arrows. Daruma feeds the arbiter with an aggregated safety and availability score per channel. If channel 1, for example, has a lower score than channel 2, the arbiter can select channel 2 for driving the vehicle.
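The fusion of cross-channel metrics with monitor outputs into one score per channel could look roughly like the Python sketch below. This is only an illustration under stated assumptions: the `ChannelMetrics` fields, the `WEIGHTS` values and the rule that a fault or ODD violation zeroes the score are all hypothetical and are not taken from [5].

```python
from dataclasses import dataclass


@dataclass
class ChannelMetrics:
    similarity: float  # agreement with the other channels, 0..1
    risk: float        # cross-channel risk of the motion plan, 0..1 (1 = most risky)
    preference: float  # design-time preference for this channel, 0..1
    fault_free: bool   # output of the classical fault monitors
    inside_odd: bool   # output of the ODD monitor


# Hypothetical weights; a real system would derive them from safety analysis.
WEIGHTS = {"similarity": 0.2, "risk": 0.6, "preference": 0.2}


def aggregated_score(m):
    """Fuse the three analysis types; monitors act as a hard gate."""
    if not (m.fault_free and m.inside_odd):
        return 0.0
    return (WEIGHTS["similarity"] * m.similarity
            + WEIGHTS["risk"] * (1.0 - m.risk)
            + WEIGHTS["preference"] * m.preference)


def arbitrate(metrics):
    """Return the name of the channel with the highest aggregated score."""
    return max(metrics, key=lambda name: aggregated_score(metrics[name]))


metrics = {
    "channel1": ChannelMetrics(0.5, 0.9, 1.0, True, True),   # preferred but risky
    "channel2": ChannelMetrics(0.8, 0.1, 0.6, True, True),   # low risk
    "channel3": ChannelMetrics(0.8, 0.1, 0.4, False, True),  # faulty
}
print(arbitrate(metrics))  # channel2
```

With these made-up numbers the arbiter moves from the preferred channel 1 to channel 2 although no fault is present in either of them, which mirrors the switch-over described for Figure 3.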

Notably, Daruma is uniquely positioned in the ADS to derive Safety Performance Indicators (SPI) [14], shown in the figure above. Indeed, Daruma can correlate the channels' world models and motion plans to identify low safety performance. For example, Daruma can spot repeated disagreements in object classifications between the main and backup channels, even though Daruma is not certain which channel is right. The SPIs and associated driving scenarios are uploaded to the cloud, where they are combined and analyzed across the whole vehicle fleet to create Over-The-Air updates of the firmware, software, and related data in the vehicle. So, if the object classification insufficiency spotted by Daruma in our example above is consistently registered in many vehicles, the ADS manufacturer can retrain the associated neural model and deploy it in the fleet. Ultimately, the SPI statistics can help design new ADS generations and improve AI-based components.
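As a hedged illustration of such an SPI, the Python sketch below counts classification disagreements between two channels and raises a flag when they persist over several frames. The function names, the threshold and the assumption that objects are already associated across channels by a shared ID are hypothetical simplifications; a real system would first have to match objects spatially.

```python
def classification_disagreement_rate(main_labels, backup_labels):
    """Fraction of objects seen by both channels whose class labels differ.

    Both arguments map an object ID to a class label. Shared IDs across
    channels are an assumption of this sketch.
    """
    shared = main_labels.keys() & backup_labels.keys()
    if not shared:
        return 0.0
    disagreements = sum(main_labels[k] != backup_labels[k] for k in shared)
    return disagreements / len(shared)


def spi_flag(frames, threshold=0.3, min_frames=3):
    """Flag low safety performance if the last `min_frames` frames all
    exceed the disagreement threshold (both values are illustrative)."""
    rates = [classification_disagreement_rate(m, b) for m, b in frames]
    recent = rates[-min_frames:]
    return len(recent) == min_frames and all(r > threshold for r in recent)


# Three consecutive frames in which the channels disagree on object 1.
frame = ({1: "pedestrian", 2: "car"}, {1: "cyclist", 2: "car"})
frames = [frame, frame, frame]

print(classification_disagreement_rate(*frame))  # 0.5
print(spi_flag(frames))                          # True
```

Note that the flag does not say which channel is wrong; it only records that the disagreement is persistent and worth uploading for fleet-level analysis.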

The Daruma design pattern is an ADS extension that uses cross-channel analysis to select a channel that continues safe automated driving towards the destination. Thanks to the channel-agnostic nature and Service-Oriented Architecture design of the Daruma module, it can be instantiated inside the high-performance computer or in the powertrain domain controller of the vehicle; see Figure 4 and Figure 5 below, where a red Daruma doll represents the Daruma module. One advantage of the powertrain domain controller instantiation is independence from faults occurring in the main computer. Note that Daruma should be close to the vehicle actuators for steering and acceleration, depicted with a steering wheel icon in the illustrations.

Andrei_Terechko_0-1680082198138.png

Figure 4. Daruma cross-channel safety analysis in the high-performance central computer (HPC). Multiple AD channels are implemented in the HPC. The red doll represents the Daruma safety analysis module. The Daruma arbiter selects the safe and available channel that will communicate vehicle control setpoints to the zonal controller. The zonal controller uses the setpoints to control steering, braking and acceleration actuators accordingly.

Andrei_Terechko_4-1679658477996.png

Figure 5. Daruma cross-channel safety analysis in the powertrain domain controller. The red doll represents the Daruma safety analysis module. In this architecture, the domain controller includes the AD channels that send the high-level information to Daruma for cross-channel analysis. The arbiter then selects a safe and available channel to let the powertrain domain controller convert steering, braking and acceleration setpoints to electrical control signals for the actuators.

The proof of concept using the CARLA simulator

As part of the NEON project [6], Eindhoven University of Technology and NXP developed the first implementation of the Daruma design pattern in MATLAB: the Safety Shell [4][15]. The Safety Shell is integrated with three heterogeneous redundant channels. In particular, the Safety Shell cross-checks each channel's ego vehicle motion plan against the world models from the other channels. This cross-channel analysis results in an aggregated safety and availability score for each AD channel, indicating the hazard risk and the ability to continue automated driving. Then, the arbiter of the ADS selects the driving channel based on the highest aggregated score and other weighting factors, such as the channel preferences defined at design time.
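The cross-check described above can be sketched in a few lines of Python. This is not the Safety Shell's MATLAB implementation: the scoring formula, the `safe_dist` and `horizon` parameters, and the treatment of objects as static 2D points checked only at plan waypoints are simplifying assumptions made for illustration.

```python
import math


def min_clearance(plan, objects):
    """Smallest distance (m) between any plan waypoint and any object."""
    if not objects:
        return math.inf
    return min(math.dist(p, o) for p in plan for o in objects)


def planned_progress(plan):
    """Path length (m) of the plan, used as a crude availability proxy."""
    return sum(math.dist(a, b) for a, b in zip(plan, plan[1:]))


def channel_scores(plans, world_models, preferences, safe_dist=2.0, horizon=20.0):
    """Score each channel's plan against the OTHER channels' world models.
    Requires at least two channels."""
    scores = {}
    for name, plan in plans.items():
        others = [wm for other, wm in world_models.items() if other != name]
        clearance = min(min_clearance(plan, wm) for wm in others)
        safety = min(clearance / safe_dist, 1.0)
        availability = min(planned_progress(plan) / horizon, 1.0)
        # An unsafe plan scores zero; a safe but stopped plan still scores 0.5.
        scores[name] = safety * (0.5 + 0.5 * availability) + preferences.get(name, 0.0)
    return scores


plans = {
    "A": [(0, 0), (5, 0), (10, 0), (15, 0), (20, 0)],  # drives straight
    "B": [(0, 0), (5, 0), (8, 0)],                     # brakes and stops
    "C": [(0, 0), (5, 1), (10, 3), (15, 3), (20, 3)],  # evades and continues
}
world_models = {
    "A": [],             # channel A misses the pedestrian
    "B": [(10.0, 0.0)],  # pedestrian as seen by B
    "C": [(10.0, 0.5)],  # pedestrian as seen by C
}
preferences = {"A": 0.05}  # A is the preferred channel at design time

scores = channel_scores(plans, world_models, preferences)
best = max(scores, key=scores.get)
print(best)  # C
```

In this made-up scenario the preferred channel A is rejected because its plan runs through the pedestrian present in the other channels' world models, and channel C wins over channel B because it stays clear of the pedestrian while making more progress.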

The Safety Shell algorithm implementation is validated on the NXP multi-channel AD testbed with the CARLA simulator [7]. The testbed architecture is illustrated in Figure 6 below. On this testbed, we integrated AD channels that can operate in scenarios of the CARLA Autonomous Driving Leaderboard [8]. The CARLA Leaderboard is an open platform to evaluate the performance of automated driving agents in realistic traffic scenarios provided by CARLA. The AD agent performance is evaluated against multiple CARLA Leaderboard metrics [9] that consider factors such as route completion, collisions with pedestrians or other vehicles, and running red lights or stop signs. The route completion factor directly reflects the AD agent’s availability. We integrated three open-source AD channels in our testbed: LAV [10] and Transfuser [11] from CARLA Leaderboard submissions [8] and Autoware.Universe, a leading open-source automated driving software stack [12].

Andrei_Terechko_5-1679658478018.png

Figure 6. Multi-channel AD testbed with the Daruma cross-channel safety analysis implementation (Safety Shell) and the CARLA simulator. The CARLA simulator models the ego vehicle, traffic, the road and the environment. The vehicle sensor data is fed into the AD channels that produce the high-level information, such as the world model and ego motion plan. Daruma analyzes the motion plans against the world models to compute the aggregated safety and availability score for the arbiter. Finally, the arbiter selects the channel that sends actuator setpoints to the modeled vehicle in CARLA.
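As a rough illustration of how Leaderboard-style metrics trade off route completion against infractions, the Python sketch below computes a driving score as the route completion multiplied by a product of per-infraction penalty coefficients. The function name and the coefficient values are assumptions made for this example; the authoritative metric definitions and values are documented in [9].

```python
# Illustrative penalty coefficients per infraction type (assumed values,
# check the official Leaderboard documentation [9] before relying on them).
PENALTY = {
    "collision_pedestrian": 0.50,
    "collision_vehicle": 0.60,
    "red_light": 0.70,
    "stop_sign": 0.80,
}


def driving_score(route_completion, infractions):
    """Route completion (in percent) scaled by the infraction penalty.

    `infractions` maps an infraction type to the number of occurrences.
    Each occurrence multiplies the penalty by its coefficient.
    """
    penalty = 1.0
    for kind, count in infractions.items():
        penalty *= PENALTY[kind] ** count
    return route_completion * penalty


# A fast channel that completes the route but runs one red light ...
fast_but_unsafe = driving_score(100.0, {"red_light": 1})
# ... versus a cautious channel that stops early without infractions.
safe_but_slow = driving_score(80.0, {})

print(round(fast_but_unsafe, 1), round(safe_but_slow, 1))
```

A score of this shape rewards neither a fast channel with infractions nor a safe channel that fails to complete the route, which is why both safety and availability matter in the evaluation.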

First, each AD channel was tested separately on a single test route without the Safety Shell, and the CARLA Leaderboard results were collected. Operating independently, two channels completed the 1 km urban test route quickly but caused either collisions or red-light violations. The third channel completed the route safely but slowly. So, no single channel provided high levels of both safety and availability.

Then, we integrated the Safety Shell module with a three-channel AD cluster and tested it on the same route and driving scenarios as the individual channels. The three-channel cluster enhanced with the Safety Shell completed the route without collisions or red-light violations and 15% faster than the safest individual channel. These preliminary results suggest the potential benefit of the Safety Shell algorithms and the Daruma concept as a whole.

Demo videos

The demo videos present the motivation behind the Daruma design pattern and include sped-up CARLA Leaderboard route recordings in various weather conditions. Figure 7, shown before the videos, explains the essential video elements. The videos clearly show how differently the AD channels perceive the same environment and plan the ego vehicle motion. Thanks to the heterogeneity of the redundant channels, the Daruma design pattern makes it possible to select the best-performing channel at every moment. Note that high performance here means a good balance between safety and availability. The first video focuses on safety and the second one on availability.

Andrei_Terechko_0-1679949232860.png

Figure 7. Illustration of the demo video elements, including the ego vehicle, road users and motion plans as seen by three AD channels: autoware (Autoware), lav (LAV), and tfuse (Transfuser). Furthermore, the demo video includes the CARLA simulator viewpoint and the Safety Shell MATLAB log window.

The first demo video below contains several noteworthy scenarios:

  1. at 00:30 the fail-operational redundant ADS, which lacks mechanisms to mitigate functional insufficiencies, collides with a pedestrian crossing the street;
  2. at 02:24 the pedestrian crossing scenario is shown, in which the Safety Shell selects a backup channel and stops the vehicle;
  3. at 03:05 the evasive maneuver scenario is shown, where the ego vehicle is steered away from collision by switching to a backup channel;
  4. at 04:30 the adverse weather condition is simulated, when the chances for a functional insufficiency increase and yet the Safety Shell manages to strike the right balance between safety and availability.

In a risky situation, it’s (relatively) easy to make a vehicle safe by halting it. However, such behavior defeats the transportation purpose of the vehicle and compromises availability of automated driving. The next video illustrates how the Safety Shell improves availability without sacrificing safety. First, the demo shows how an automated vehicle halts due to a functional insufficiency in the motion planner during sharp turns. Then, the Safety Shell unblocks the vehicle by switching to a backup channel to continue automated driving and mitigate the functional insufficiency in the main channel.

 

Frequently Asked Questions

 

1. How does Daruma deal with channel disagreements? Channels can disagree on perception or motion plans while still being safe and available. So, in contrast to other safety mechanisms that suffer from disagreements, Daruma can benefit from them, as the examples below show:

a) Motion plan disagreement 1. In the bird's-eye-view figure below, two channels, red and blue, control the ego vehicle e to avoid a collision with another vehicle v. The ground truth is shown in black. The arrows show the computed motion plans of the red and blue channels, which are substantially different, so majority voting mechanisms based on a margin or envelope will signal a disagreement and disengage automated driving or mark the channels as unsafe. In contrast, Daruma analyzes the red and blue motion plans in the other channel's world model and concludes that there are no safety conflicts. So, either motion plan is good to use.

Andrei_Terechko_0-1694605090041.png
Andrei_Terechko_1-1694605090042.png

b) Motion plan disagreement 2. As illustrated in the availability video above, we can select a safe trajectory to unblock the ego vehicle that suffers from a functional insufficiency in the red channel’s motion plan. Below is the bird’s-eye view of the scenario, in which we temporarily switch to the green channel to make a sharp turn and then switch back to the red channel, which is more advanced in the majority of scenarios.

Andrei_Terechko_2-1694605090042.png

c) Perception disagreement. If two channels see no object, while the third one does, then the classical majority voting mechanism will select the two channels’ motion plan. Instead, Daruma can evaluate cross-channel risks and notice the safety conflict of the agreeing channels with the third one’s world model. A bird’s-eye view illustration is shown below. Within a certain margin, the red and blue channels agree and do not see vehicle v in front. So, they propose to drive straight. Nevertheless, Daruma selects the green channel, because its motion plan is compatible from the safety perspective with all channels’ world models, and it continues driving towards the destination instead of halting. By the way, in our demo videos above you can literally see disagreements between the red, green and blue channels, which perceive the world differently and plan the motion differently.

Andrei_Terechko_3-1694605090043.png

d) In the end, Daruma doesn't know which channel is closer to the ground truth, but it can select a safer one through comparative cross-channel analysis of motion plans, as shown in Figure 11 of our ESV paper [5]. In general, we distinguish three types of algorithms that fit the Daruma cross-channel analysis: motion plan risk analysis, similarity/agreement, and preference. We cover these algorithms in more detail in the section "Daruma: architecture design pattern" of our ESV paper.
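The perception disagreement in example c) can be reproduced with a small Python sketch that contrasts classical majority voting with a cross-channel check. The lane-level world model (a single `vehicle_ahead` flag), the maneuver labels and the function names are hypothetical simplifications for illustration, not the algorithms from [5].

```python
from collections import Counter


def majority_vote(maneuvers):
    """Classical voting: pick the maneuver proposed by most channels."""
    return Counter(maneuvers.values()).most_common(1)[0][0]


def conflicts(maneuver, world_model):
    """A plan conflicts with a world model if it drives straight while
    that world model contains a vehicle ahead in the ego lane."""
    return maneuver == "straight" and world_model["vehicle_ahead"]


def cross_channel_select(maneuvers, world_models):
    """Channels whose plan is conflict-free in every channel's world model."""
    return [ch for ch, m in maneuvers.items()
            if not any(conflicts(m, wm) for wm in world_models.values())]


# Red and blue miss vehicle v and go straight; green sees it and evades.
maneuvers = {"red": "straight", "blue": "straight", "green": "change_lane"}
world_models = {
    "red": {"vehicle_ahead": False},
    "blue": {"vehicle_ahead": False},
    "green": {"vehicle_ahead": True},
}

print(majority_vote(maneuvers))                       # straight
print(cross_channel_select(maneuvers, world_models))  # ['green']
```

Majority voting follows the two agreeing channels into the conflict, whereas the cross-channel check keeps only the green plan, which is acceptable in all three world models regardless of which one matches the ground truth.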

2. Does Daruma rely on fail-silent behavior? Daruma is an extension to existing fault monitors and safety mechanisms, including those implementing a fail-silent behavior for a channel. We see this compatibility with existing and future AD systems as a useful feature, because we can't outperform decades of sensor fusion and functional safety development, let alone future innovations. Instead, we complement them.

3. How many AD channels does Daruma need? Daruma is a scalable symmetric design pattern for redundant multi-channel AD systems. So, two or more channels are seamlessly supported. In fact, this scalability seems like a useful advantage over the monitor/actuator setup, because we can boost safety in a modular fashion at the cost of extra channels to go beyond the demands of safety standards towards vision zero.

4. Will Daruma help if AD channels share sensors or algorithms? Yes. In the CARLA Leaderboard evaluation shown in the videos above, the sensors were shared while channel algorithms were different, which was sufficient for Daruma to make better choices than individual channels. Note, however, that Daruma's effectiveness decreases when channels share components, and, ultimately, if the channels are the same, Daruma becomes useless. Furthermore, shared elements must be carefully analyzed from the functional safety perspective (common cause failures, freedom from interference, cascading failures).

Conclusions

We presented the Daruma design pattern to improve the safety and availability of automated vehicles. The first implementation of the design pattern, the Safety Shell, calculates risk profiles based on cross-channel analysis to select the best-performing channel in each driving scenario. We validated our concept using a CARLA Leaderboard test route in an urban environment with three open-source AD channels. On this route, the AD channel cluster outperformed each individual channel in both safety and availability metrics, suggesting a high potential of our concept.

Acknowledgements

We thank our former colleagues Jochen Seemann and Tim Beurskens who substantially contributed to the presented results. Furthermore, our work relies on great open-source projects, and we sincerely thank the development teams of Autoware [12, 13], LAV [10], Transfuser [11], and CARLA [7].

References

  1. "ISO 26262:2018 Road vehicles – Functional safety," 2018.
  2. "ISO 21448:2022 Road vehicles — Safety of the intended functionality," June 2022.
  3. S. Fürst, "Scalable, Safe and Multi-OEM Capable Architecture for Autonomous Driving," 9th Vector Congress, Stuttgart, Germany, 2018.
  4. C. A. J. Hanselaar, E. Silvas, A. Terechko and W. P. M. H. Heemels, "Detection and Mitigation of Functional Insufficiencies in Autonomous Vehicles: The Safety Shell," in the 25th IEEE International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 2022.
  5. Y. Fu, J. Seemann, C. A. J. Hanselaar, T. Beurskens, A. Terechko, E. Silvas, W.P.M.H. Heemels, "Characterization And Mitigation Of Insufficiencies In Automated Driving Systems", in the 27th Enhanced Safety of Vehicles conference, Yokohama, Japan, April 2023. [Online]. Available: https://index.mirasmart.com/27esv/PDFfiles/27ESV-000110.pdf
  6. The NEON Project, funded by the Dutch Research Council (NWO), project number 17628. [Online]. Available: https://neonresearch.nl/
  7. CARLA open-source simulator for autonomous driving research. [Online]. Available: https://carla.org
  8. CARLA Autonomous Driving Leaderboard. [Online]. Available: https://leaderboard.carla.org/
  9. CARLA Leaderboard evaluation and metrics. [Online]. Available: https://leaderboard.carla.org/#evaluation-and-metrics
  10. D. Chen, P. Krähenbühl, "Learning from all vehicles". In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022. [Online]. Available: https://github.com/dotchen/LAV
  11. K. Chitta, A. Prakash, B. Jaeger, Z. Yu, K. Renz, and A. Geiger, "Transfuser: Imitation with transformer-based sensor fusion for autonomous driving," arXiv:2205.15997, 2022. [Online]. Available: https://github.com/autonomousvision/transfuser
  12. Autoware.Universe automated driving software stack. [Online]. Available: https://www.autoware.org/autoware
  13. CARLA OpenPlanner Bridge. [Online]. Available: https://github.com/hatem-darweesh/op_bridge
  14. P. Koopman, "Safety Performance Indicators (SPIs) for Autonomous Vehicles," 2020. [Online]. Available: https://users.ece.cmu.edu/~koopman/lectures/L124_SPI_vs_KPI.pdf [Accessed March 2022].
  15. C. A. J. Hanselaar, E. Silvas, A. Terechko, W. P. M. H. Heemels, "The Safety Shell: An Architecture to Handle Functional Insufficiencies in Automated Driving," November 2023. [Online]. Available: https://arxiv.org/abs/2311.08413