One-to-Many Sensor Trouble - Part I
In the world of automated vision, there's only so much one can do with a single sensor. Optimize for one thing, and lose the other; and even with one-size-fits-all attempts, it's hard to paint a full sensory picture of the world at 30 fps.
We use sensor combinations to overcome this restriction. This intuitively seems like the right play; more sensors mean more information. The jump from one camera to two, for instance, unlocks binocular vision, or the ability to see behind as well as in front. Better yet, use three cameras to do both at once. Add in a LiDAR unit, and see farther. Add in active depth, and see with more fidelity. Tying together multiple data streams is so valuable that this act of Sensor Fusion is a whole discipline in itself.
Yet this boon in information often makes vision-enabled systems harder to build, not easier. Binocular vision relies on stable intrinsic and extrinsic camera properties, which cameras don't have. Depth sensors lose accuracy with distance. A sensor can fail entirely, like LiDAR on a foggy day.
This means that effective sensor fusion involves constructing vision architecture in a way that minimizes uncertainty in uncertain conditions. Sensors aren't perfect, and data can be noisy. It's the job of the engineer to sort this out and derive assurances about what is actually true. This challenge is what makes sensor fusion so difficult: it takes competency in information theory, geometry, optimization, fault tolerance, and a whole mess of other things to get right.
So how do we start?
Guessing
…just kidding. Though you would be surprised how many times an educated guess gets thrown in! No, we're talking
Kalman filters
So let's review our predicament:
- We have a rough idea of our current state (e.g. the position of our robot), and we have a model of how that state changes through time.
- We have multiple sensor modalities, each with their own data streams.
- All of these sensors give noisy and uncertain data.
Nonetheless, this is all that we have to work with. This seems troubling; we can't be certain about anything!
Instead, what we can do is minimize our uncertainty. Through the beauty of mathematics, we can combine all of this knowledge and actually come out with a more certain idea of our state through time than if we used any one sensor or model.
This is the magic of Kalman filters.
Warning: Math.
Phase 1: Prediction
Model Behavior
Let's pretend that we're driving an RC car in a completely flat, very physics-friendly line.
There are two things that we can easily track about our car's state: its position pₜ and velocity vₜ.
We can speed up our robot by punching the throttle, something we do frequently. We do this by exerting a force f on the RC car's mass m, resulting in an acceleration a = f/m (see Newton's Second Law of Motion).
With just this information, we can derive a model for how our car will act over a time period Δt using some classical physics:
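As a sketch, assuming the force f (and so the acceleration a = f/m) stays constant over Δt, the standard constant-acceleration kinematics can be written as follows. The subscript t+1 for "the state after Δt" is a notational assumption here.

```latex
\begin{aligned}
p_{t+1} &= p_t + v_t\,\Delta t + \frac{f\,\Delta t^{2}}{2m} \\
v_{t+1} &= v_t + \frac{f\,\Delta t}{m}
\end{aligned}
```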
We can simplify this for ourselves using some convenient matrix notation. Let's put the values we can track, position pₜ and velocity vₜ, into a state vector:
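Stacking position on top of velocity (the ordering is an assumption, though it is the usual one), the state vector would look like:

```latex
x_t = \begin{bmatrix} p_t \\ v_t \end{bmatrix}
```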
…and let's put our applied forces into a control vector that represents all the outside influences affecting our state:
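One common choice, assumed here, is a single-element vector holding the acceleration a = f/m. Carrying f directly and folding 1/m into the control matrix would work equally well.

```latex
u_t = \begin{bmatrix} \dfrac{f}{m} \end{bmatrix}
```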
Now, with a little rearranging, we can organize our motion model for position and velocity into something a bit more compact:
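A sketch of that compact form, assuming a state vector x_t = [p_t, v_t]ᵀ and a control vector u_t = [f/m], with constant acceleration over Δt:

```latex
x_{t+1}
= \underbrace{\begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}}_{F_t} x_t
+ \underbrace{\begin{bmatrix} \tfrac{1}{2}\Delta t^{2} \\ \Delta t \end{bmatrix}}_{B_t} u_t
= F_t\,x_t + B_t\,u_t
```

Multiplying out the first row recovers the position update, and the second row recovers the velocity update.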
By rolling up these terms, we get some handy notation that we can use later:
- Fₜ is called our prediction matrix. It models what our system would do over Δt, given its current state.
- Bₜ is called our control matrix. This relates the forces in our control vector uₜ to the state prediction over Δt.
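To see the prediction and control matrices at work, here is a minimal sketch in plain Python. It assumes the constant-acceleration forms F = [[1, Δt], [0, 1]] and B = [Δt²/2, Δt]ᵀ with a control input of a = f/m; the function name `predict_state` and the numbers passed to it are made up for illustration.

```python
def predict_state(p, v, force, mass, dt):
    """Return (p_next, v_next) computed as x_next = F x + B u."""
    F = [[1.0, dt],
         [0.0, 1.0]]            # prediction matrix (assumed form)
    B = [0.5 * dt ** 2, dt]     # control matrix, a single column (assumed form)
    u = force / mass            # control input: acceleration a = f / m
    x = [p, v]                  # state vector: position, velocity

    p_next = F[0][0] * x[0] + F[0][1] * x[1] + B[0] * u
    v_next = F[1][0] * x[0] + F[1][1] * x[1] + B[1] * u
    return p_next, v_next


# Hypothetical values: car at p = 0 m moving at 2 m/s, 1 N on a 0.5 kg car.
p1, v1 = predict_state(p=0.0, v=2.0, force=1.0, mass=0.5, dt=0.1)
print(p1, v1)
```

Note that this only moves the state estimate forward; it says nothing yet about how confident we are in it.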
Uncertainty through PDFs
However, we're not exactly sure whether or not our state values are true to life; there's uncertainty! Let's make some assumptions about what this uncertainty might look like in our system:
- Any error we might get in an observation is inherently random; that is, there isn't a bias towards one result.
- Errors are independent of one another.
These two assumptions let us lean on the Central Limit Theorem: the sum of many small, independent random errors tends toward a bell-shaped distribution. We can therefore assume that our error follows a Normal Distribution, aka a Gaussian curve.
We will use our understanding of Gaussian curves later to great effect, so take note!
We're going to give this uncertainty model a special name: a probability density function (PDF). This represents how probable it is that certain states are the true state. Peaks in our function correspond to the states that have the highest probability of occurrence.
Fig. 1. Our first PDF
Our state vector xₜ represents the mean μ of this PDF. To derive the rest of the function, we can model our state uncertainty using a covariance matrix Pₜ:
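For a two-element state of position and velocity, such a matrix would take the following form. The element names follow the labels used in the text, and Σ_pv = Σ_vp because covariance matrices are symmetric.

```latex
P_t = \begin{bmatrix}
\Sigma_{pp} & \Sigma_{pv} \\
\Sigma_{vp} & \Sigma_{vv}
\end{bmatrix}
```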
There are some interesting properties here in Pₜ. The diagonal elements (Σₚₚ, Σᵥᵥ) represent how much these variables deviate from their own mean. We call this variance.
The off-diagonal elements of Pₜ express covariance between state elements. If Σₚᵥ is zero, for instance, then we know that an error in velocity won't influence an error in position. If it's any other value, we can safely say that one affects the other in some way. PDFs without covariance terms look like Figure 1 above, with major and minor axes aligned with our world axes. PDFs with covariance are skewed off-axis depending on how extreme the covariance is:
Fig. 2. A PDF with non-zero covariance. Notice the "tilt" in the major and minor axes of the ellipse.
Variance, covariance, and the related correlation of variables are valuable, as they make our PDF more information-dense.
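To make variance, covariance, and correlation concrete, here is a minimal sketch that estimates all three from a handful of position and velocity samples. The sample values and function names are illustrative assumptions, not measurements from the RC car.

```python
def covariance_matrix(ps, vs):
    """Estimate the 2x2 sample covariance matrix of position and velocity."""
    n = len(ps)
    mean_p = sum(ps) / n
    mean_v = sum(vs) / n
    # Variances: spread of each variable around its own mean.
    s_pp = sum((p - mean_p) ** 2 for p in ps) / (n - 1)
    s_vv = sum((v - mean_v) ** 2 for v in vs) / (n - 1)
    # Covariance: how position and velocity errors move together.
    s_pv = sum((p - mean_p) * (v - mean_v) for p, v in zip(ps, vs)) / (n - 1)
    return [[s_pp, s_pv],
            [s_pv, s_vv]]


def correlation(P):
    """Normalize the covariance term to the range [-1, 1]."""
    return P[0][1] / (P[0][0] * P[1][1]) ** 0.5


# Made-up samples in which position and velocity errors rise together.
positions = [1.0, 2.0, 3.0, 4.0]
velocities = [0.9, 2.1, 2.9, 4.1]

P = covariance_matrix(positions, velocities)
rho = correlation(P)
print(P, rho)
```

A correlation near +1 or -1 corresponds to a strongly tilted, narrow ellipse; a correlation near 0 corresponds to an axis-aligned one.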
We know how to predict xₜ₊₁, but we also need the predicted covariance Pₜ₊₁ if we're going to describe our state fully. We can derive it from xₜ₊₁ using some (drastically simplified) linear algebra:
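The identity doing the work is a standard one: if a random vector x has covariance Σ, then Ax has covariance AΣAᵀ. Applied to the F_t x_t term, and writing the predicted covariance as P_{t+1} (notation assumed to match the state), this gives:

```latex
\operatorname{Cov}(A x) = A\,\Sigma\,A^{T}
\quad\Longrightarrow\quad
P_{t+1} = F_t\,P_t\,F_t^{T}
```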
Notice that Bₜuₜ got tossed out! Control has no uncertainty that we can directly observe, so we can't use the same math that we did on Fₜxₜ.
However, we can factor in the effects of noisy control inputs another way: by adding a process noise covariance matrix Qₜ:
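In the standard form, sketched here with the same assumed notation, the process noise is simply added to the propagated covariance:

```latex
P_{t+1} = F_t\,P_t\,F_t^{T} + Q_t
```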
Yes, we are literally adding noise.
Prediction step, solved?
We have now derived the full prediction step:
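Collected in one place, and using the t+1 subscript as an assumed convention for "after Δt", the standard Kalman prediction equations are:

```latex
\begin{aligned}
x_{t+1} &= F_t\,x_t + B_t\,u_t \\
P_{t+1} &= F_t\,P_t\,F_t^{T} + Q_t
\end{aligned}
```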
Our results are… ok. We got a good guess at our new state out of this process, sure, but we're a lot more uncertain than we used to be!
Fig. 3. Starting state PDF in green, predicted state PDF in red. Notice how the distribution is more spread out. Our state is less certain than before.
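The growth in uncertainty is easy to reproduce numerically. Below is a minimal sketch of one prediction step in plain Python, assuming the constant-acceleration forms F = [[1, Δt], [0, 1]] and B = [Δt²/2, Δt]ᵀ. The helper names and every numeric value (starting covariance, process noise, time step) are made up for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def add(A, B):
    """Element-wise sum of two same-sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def predict(x, P, u, dt, Q):
    """One Kalman prediction step: returns (x_next, P_next)."""
    F = [[1.0, dt], [0.0, 1.0]]       # prediction matrix (assumed form)
    B = [[0.5 * dt ** 2], [dt]]       # control matrix (assumed form)
    x_next = add(matmul(F, x), matmul(B, u))              # F x + B u
    P_next = add(matmul(matmul(F, P), transpose(F)), Q)   # F P F^T + Q
    return x_next, P_next


x = [[0.0], [2.0]]                  # position 0 m, velocity 2 m/s
P = [[0.1, 0.0], [0.0, 0.1]]        # hypothetical starting uncertainty
u = [[2.0]]                         # acceleration a = f / m
Q = [[0.01, 0.0], [0.0, 0.01]]      # hypothetical process noise

x_next, P_next = predict(x, P, u, dt=0.1, Q=Q)
print(x_next)
print(P_next)
```

With these made-up numbers, both variances on the diagonal of `P_next` come out larger than they started, and a non-zero position-velocity covariance appears even though the starting matrix had none. That is the wider, tilted ellipse in the figure.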
There's a good reason for that: everything up to this point has been a sort of "best guess". We have our state, and we have a model of how the world works; all that we're doing is using both to predict what might happen over time. We still need something to support these predictions outside of our model.
Something like sensor measurements, for instance.
We're getting there! So far, this post has covered:
- The motivation for using a Kalman filter in the first place
- A toy use case, in this case our physics-friendly RC car
- The Kalman filter prediction step: our best guess at where we'll be, given our state
We'll keep it going in Part II by bringing in our sensor measurements (finally). We will use these measurements, along with our PDFs, to uncover the true magic of Kalman filters!
Spoiler: it's not magic. It's just more math. Happy new year from the Tangram Vision team!