Safe and Optimal Cruise Control

This example is dedicated to Ernst-Rüdiger Olderog [LMT15]. It demonstrates the usage of symbolic techniques in synthesizing a safe discrete controller for continuous dynamics and then optimizing the control for maximum progress by using statistical techniques — all within a single integrated tool Uppaal Stratego.

The goal of cruise control is to maintain steady speed of a vehicle. As roads are becoming more congested, an adaptive cruise control would also maintain a safe distance to vehicles ahead. Picture below captures a setup with our Ego car and a car in Front.

Cars following each other with cruise control

Ego is equiped with a distance sensor. The distance changes dynamically based on a relative speed between cars and each car controls its speed by acceleration or breaking. The goal is to synthesize a control strategy whether to accelerate, maintain speed, break or even move backwards based on measured relative distance.

Our proposed solution is to model the system as a Stochastic Priced Timed Game, abstract away the stochastic and continuous dynamics, solve the Timed Game with a safety goal (when Ego should accelerate or break but maintain minimal safe distance despite what Front may choose to do), then take the safe Ego strategy back to the original Stochastic Priced setup and optimize the safe controller strategy into fast one by using learning techniques.

Modeling

First we model the controllable options of such a system, namely the actions of car Ego of picking its acceleration values via solid edges:

Front car can also exercise the same actions, but for us its transitions are not controllable (dashed edges). In addition Front may go beyond the reach of Ego's distance sensor (Faraway location):

We introduce a scheduler to limit the decision taking for both cars to once per second. The scheduler dictates that Ego should make its choice first without the knowledge of what Front's choice, then Front may choose its acceleration (may turn against Ego's choice) and after one second the discrete approximation of velocities and distance is calculated:

The approximate integer distance is calculated based on averages using the function below:

int distance; // approximated distance between cars
int velocityEgo, velocityFront; // approximated velocities
int accelerationEgo, accelerationFront; // discrete acceleration

void updateDiscrete()
{
    int oldVel, newVel;
    oldVel = velocityFront - velocityEgo;

    velocityEgo = velocityEgo + accelerationEgo;
    velocityFront = velocityFront + accelerationFront;

    newVel = velocityFront - velocityEgo;
    
    if (distance > maxSensorDistance) {
        distance = maxSensorDistance + 1;
    } else {
        distance += (oldVel + newVel)/2;
    }
}

The actual dynamics of continuous distance and velocities is specified by ordinary differential equations over "hybrid clocks" as an invariant expression:

Dynamical monitor with reset for learning

hybrid clock rVelocityEgo, rVelocityFront; // continuous "real" velocities
hybrid clock rDistance; // continuous "real" distance between cars
hybrid clock D; // "cost" to be minimized: integral over distance

double distanceRate(double velFront, double velEgo, double dist)
{
    if (dist > maxSensorDistance)
        return 0.0;
    else
        return velFront - velEgo;
}

The hybrid clocks here are special: they do not influence the behavior of the model but rather monitor the "cost" of a simulation run, consequently they are abstracted away under symbolic model checking queries and enabled in statistical queries. Likewise, the stochastic features become plain non-deterministic under symbolic model checking queries. The __RESET__ location provides special hint for statistical learning (the syntax is going to change in the future): the looping edge resets the variables which should not influence the strategy. In this case all variables are ignored as noise except the process locations.

We validate the discrete approximation by inspecting both discrete and continuous trajectories. The plot shows that discrete variables have exactly the same values as continuous ones at integer time points:

simulate 1 [<=20] { rDistance, dist, rVelocityFront - rVelocityEgo, velocityFront - velocityEgo }

However the verification shows that the distance is not always safe:

A[] distance > 5             // FALSE - unsafe

Statistical model checking also shows that the probability of a crash is unacceptably high if Ego is driven randomly:

Pr[<=100] (<> distance <= 5) // [0.85, 0.86] (95% CI)

The resulting plot shows that the probability density is highest in the beginning and the cumulative probability is over 85% after 100 seconds:

Random drive crash probability density over time

Synthesis of Safe Strategy

Controller synthesis uses symbolic game-solving techniques where Front is an oponent/adversary, Ego is the controlable player and the goal is to maintain the safe distance. Like with other symbolic techniques the continuous dynamics is abstracted away with Uppaal Tiga query and the resulting strategy is stored under the name safe:

strategy safe = control: A[] dist > 5  // success

The synthesized strategy is validated by simulating model behavior with controller choices from safe strategy. The plot shows that the distance trajectories are above 5, i.e. model does not exhibit crashes anymore:

simulate 10 [<=100] rDistance under safe

Once we have the safe strategy we can also estimate the smallest safe distance for any relative velocity. The plot below is a result of many infimum queries (symbolic model checking) for each possible relative velocity value:

inf { velocityFront-velocityEgo == v } : dist under safe

The plot shows that the distance has a quadratic dependency on velocity when the velocity is negative (i.e. Front is approaching towards Ego) and it is a bit more complicated when Front is moving away.

Maximizing Progress by Minimizing Distance

Previous section showed some properties of the safe strategy. Apart from being safe, the strategy also exhibits rich (non-deterministic) behavior saying that there are many ways to stay safe. An emediate remark follows that even though the distance is safe it can still be arbitrary large, i.e. Ego may choose to stay home or even drive backwards without any progress. In order to guarantee maximal progress Ego should follow Front as close as possible, i.e. the progress is maximized when the distance between the cars is minimized. A statistical learning query is used to select a particular controllable transition from a non-deterministic set of safe transitions with an objective to minimized the expected cumulative distance D after 100 seconds under safe strategy drive:

strategy safeFast = minE (D) [<=100]: <> time >= 100 under safe

The learned result is a another strategy stored under the name safeFast which can also be checked using statistical queries. For example, here is a number of random runs using safeFast strategy:

simulate 10 [<=100] rDistance under safeFast

Notice that the distance trajectories are no longer diverging. The efficiency of the strategies is contrasted by comparing probability distributions of distances, which shows that under the safeFast strategy the distance is reduced from 138 to 24 on average:

Pr[rDistance <= maxSensorDistance + 1] <> time >= 100 under safe

Pr[rDistance <= maxSensorDistance + 1] <> time >= 100 under safeFast

In fact the new strategy is so effective at following the front car that the number of possible relative velocities is restricted to a narrow range of [-4,4] (as oposed to [-22,30] under safe) as shown in histogram of distances over velocities:

Box plot of distances over velocities under safe and fast strategy

The box plot also shows that the mean distances stay low and only rarely diverge beyond 70.

Uppaal Stratego model: cruise.xml

References

[LMT15] Kim Guldstrand Larsen, Marius Mikučionis and Jakob Haahr Taankvist. Safe and Optimal Adaptive Cruise Control. Correct System Design Symposium in Honor of Ernst-Rüdiger Olderog on the Occasion of His 60^th Birthday Oldenburg, 2015. {bib}