Objective: A number of different methods must be combined for the robust certification of highly automated vehicles (HAVs) for deployment in operational design domains (ODDs) encompassing public roads. This paper, authored by a team of leading academics in validation, verification and certification affiliated with Europe's largest autonomous vehicle developer, FiveAI, proposes a core set of processes.
Methods: The paper discusses in detail: (1) requirements discovery; (2) behaviour requirements; (3) simulation as a tool for verification; (4) useful tools and methods.
Results: We propose a process centred on hyper-scale fuzzed scenario-based testing, applying coverage-driven verification methods both in digital twins of the ODD and with generative models representative of each ODD. Testing must cover both full-stack testing, which will require photo-realistic and sensor-realistic rendering of scenarios and objects together with accurate sensor modelling, and motion planning stack testing, which will require robust beliefs over scenario actor behaviours to test prediction, planning and motion synthesis.
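To make the proposed process concrete, the following minimal sketch (in Python) illustrates the basic coverage-driven, fuzzed scenario-testing loop: scenario parameters are randomly perturbed, each concrete scenario is executed, and parameter-space coverage and failures are recorded. The parameter names, bin scheme and the stub standing in for the simulator and stack under test are hypothetical illustrations, not FiveAI's implementation.

```python
# Minimal sketch of coverage-driven, fuzzed scenario-based testing.
# All parameter names/ranges and the simulation stub are hypothetical.
import random
from collections import defaultdict

PARAM_RANGES = {
    "ego_speed_mps": (5.0, 30.0),   # ego vehicle speed
    "cut_in_gap_m": (2.0, 40.0),    # gap at which another actor cuts in
    "road_friction": (0.3, 1.0),    # surface friction coefficient
}

def fuzz_scenario(rng):
    """Draw one concrete scenario by fuzzing each parameter in its range."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()}

def coverage_bin(scenario, bins=10):
    """Map a scenario to a discrete coverage bucket per parameter."""
    key = []
    for k, (lo, hi) in PARAM_RANGES.items():
        frac = (scenario[k] - lo) / (hi - lo)
        key.append(min(int(frac * bins), bins - 1))
    return tuple(key)

def run_simulation(scenario):
    """Placeholder for the simulator plus stack under test; returns pass/fail."""
    return scenario["cut_in_gap_m"] > 0.5 * scenario["ego_speed_mps"]

def coverage_driven_test(n_tests=1000, seed=0):
    """Run fuzzed scenarios, tracking which coverage bins were exercised."""
    rng = random.Random(seed)
    coverage = defaultdict(int)
    failures = []
    for _ in range(n_tests):
        scenario = fuzz_scenario(rng)
        coverage[coverage_bin(scenario)] += 1
        if not run_simulation(scenario):
            failures.append(scenario)
    return coverage, failures

if __name__ == "__main__":
    cov, fails = coverage_driven_test()
    print(f"coverage bins hit: {len(cov)}, failing scenarios: {len(fails)}")
```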
Discussion and Conclusions: The paper poses several questions for policy makers: (1) Could a validation, verification and certification system that incentivises the sharing of scenarios while protecting the value intrinsic to their discovery improve safety across the industry? Could it be used by an approval body such as a national Certification Agency to establish a high standard for national certification? (2) Can the industry agree on a scenario description language that supports coverage-driven verification and is extensible? (3) What should the specification of an appropriate simulation environment be? (4) Could the specification for a test oracle be made available, and could this be based on a formal description of ‘good driving’? (5) Is auditable adherence to the IATF 16949:2016 quality assurance process sufficient to satisfy ‘Conformity of Production’?
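As an indication of what a test-oracle specification grounded in a formal description of ‘good driving’ (question 4 above) might look like, the sketch below checks a simulated ego trajectory against two simple codified rules and reports violations. The rules, thresholds and field names are hypothetical illustrations, not a proposed standard.

```python
# Minimal rule-based test oracle sketch; rules and thresholds are hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class State:
    t: float             # time (s)
    ego_speed: float     # ego speed (m/s)
    gap_to_lead: float   # distance to lead vehicle (m)
    speed_limit: float   # posted speed limit (m/s)

def rule_speed_limit(s: State) -> bool:
    """Ego must not exceed the posted speed limit."""
    return s.ego_speed <= s.speed_limit

def rule_time_headway(s: State, min_headway_s: float = 2.0) -> bool:
    """Ego must keep at least a minimum time headway to the lead vehicle."""
    return s.ego_speed == 0 or (s.gap_to_lead / s.ego_speed) >= min_headway_s

RULES = {"speed_limit": rule_speed_limit, "time_headway": rule_time_headway}

def oracle(trajectory: List[State]) -> List[str]:
    """Return a list of rule violations, each tagged with its timestamp."""
    violations = []
    for state in trajectory:
        for name, rule in RULES.items():
            if not rule(state):
                violations.append(f"{name} violated at t={state.t:.1f}s")
    return violations
```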
Key questions also remain, including: (a) What machine learning methods should be applied to directed random testing in coverage-driven verification? (b) Given the high dimensionality of the test space, what coverage measures are meaningful in generative and ODD digital twin verification? (c) Which computer vision methods can we apply to the 3D reconstruction of digital twin worlds from photogrammetry, LIDAR scans and other modalities, so that accurate, up-to-date digital twins are feasible? (d) What hardware acceleration beyond GPUs can we design and apply to enable faster-than-real-time full-stack verification of HAVs? (e) How can we apply formal software checking to the complex integrated systems required for autonomous driving to ensure that each build achieves its goals without bugs or gaps? (f) How do we apply formal mathematical methods to express the Digital Highway Code (DHC), vehicle dynamics and other road users’ expectations and behaviours, in order to verify the behavioural safety of HAVs? (g) How can we verify HAV systems comprising one or more end-to-end neural networks, given the requirements to explain failure modes and to take corrective action, which rely on human readability and the intermediate outputs of modular processes? (h) How might we extrapolate from randomised testing, including near collisions, to a measure of the probability of collision in general?
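On question (h), one illustrative and deliberately simplified approach is a peaks-over-threshold style extrapolation: treat runs whose minimum time-to-collision (TTC) falls below a near-miss threshold as tail samples, fit an exponential tail to the exceedances, and extrapolate to TTC = 0. The sketch below uses hypothetical thresholds and data and is not a validated estimator.

```python
# Sketch: extrapolating near-miss observations to a collision probability.
# Thresholds and the example data are hypothetical.
import math
from typing import List

def collision_probability(min_ttcs: List[float],
                          near_miss_ttc: float = 1.5) -> float:
    """Estimate P(collision) from minimum-TTC values of randomised test runs."""
    n = len(min_ttcs)
    # Exceedances: how far below the near-miss threshold each tail sample falls.
    exceedances = [near_miss_ttc - ttc for ttc in min_ttcs if ttc < near_miss_ttc]
    if not exceedances:
        return 0.0  # no near misses observed; no basis for extrapolation
    p_near_miss = len(exceedances) / n
    mean_excess = sum(exceedances) / len(exceedances)
    # Exponential tail model: P(TTC <= 0 | near miss) = exp(-near_miss_ttc / mean_excess).
    p_collision_given_near = math.exp(-near_miss_ttc / mean_excess)
    return p_near_miss * p_collision_given_near

if __name__ == "__main__":
    # Hypothetical minimum-TTC values (seconds) from ten fuzzed runs.
    sample = [4.2, 3.8, 1.2, 0.9, 5.1, 2.6, 1.4, 3.3, 0.7, 6.0]
    print(f"estimated collision probability: {collision_probability(sample):.2e}")
```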