Introduction to polynomial chaos
Welcome back to the uncertainty quantification (UQ) series. In the opening post, we learned what UQ was in the first place, and why you should know it. We also went through the typical workflow involved in the process.
The second post was then about finding and quantifying the significant sources of uncertainty. For example, uncertain material parameters. We saw that ideally, we would know the joint probability distribution of the uncertain variables under consideration. We also saw this is often impossible. In such a case, estimated parameters (mean, variance) can help, as can estimated distributions like histograms. For several random variables and random fields, estimated marginal distributions and covariances can be a life-saver.
However, knowing the characteristics of the input data is not enough. Obviously, it’s the output that ultimately interests us. And to get that, we need a tool to model the propagation of uncertainty from the inputs to the output.
Enter polynomial chaos.
Polynomial WHAT??
Yeah, my thoughts exactly. The name is really confusing and stupid, and the Wikipedia entry is not exactly helpful either.
But, the idea is quite simple really.
Let’s say we have a random conductivity $\sigma$. The idea of polynomial chaos is to write $\sigma$ as a weighted sum of polynomials of a variable $\xi$:
$\sigma = \sum\limits_{k=0}^{\infty} \sigma_k \Psi_k(\xi)$.
Here’s the catch: the variable $\xi$ is a random variable with a nice distribution. It can be normal, it can be uniform, or it can be any other of the textbook distributions.
The polynomials $\Psi_k$ are then selected so that they are all uncorrelated. Or, in more mathematical terms, they are orthogonal with respect to $\xi$’s distribution. For instance, Hermite polynomials are used with normal distributions, and Legendre polynomials with uniform ones.
The polynomials have some nice properties. The first one is always a constant, $\Psi_0 = 1$. All the others have a zero expected value
$\mathrm{E}\left[ \Psi_k(\xi) \right] = 0, \quad k \geq 1$.
And like I already mentioned, they are all mutually uncorrelated, meaning that the covariance goes to zero
$\mathrm{E}\left[ \Psi_j(\xi) \Psi_k(\xi) \right] = 0$,
whenever $j \neq k$. Finally, all of them have a unit variance
$\mathrm{E}\left[ \Psi_k(\xi)^2 \right] = 1$,
except for the constant one of course.
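If you’d like to verify these properties numerically, here’s a minimal Python sketch (my addition, not from the original post). It uses the probabilists’ Hermite polynomials, normalised by $\sqrt{k!}$ so that each of them really does have a unit variance under a standard normal $\xi$:

```python
# Monte Carlo check of the claimed properties: zero mean, mutual
# uncorrelatedness, and unit variance (after normalising by sqrt(k!)).
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt

rng = np.random.default_rng(0)
xi = rng.standard_normal(1_000_000)      # xi ~ N(0, 1)

def psi(k, x):
    """Normalised probabilists' Hermite polynomial He_k(x) / sqrt(k!)."""
    coefs = np.zeros(k + 1)
    coefs[k] = 1.0
    return He.hermeval(x, coefs) / sqrt(factorial(k))

for j in range(4):
    for k in range(4):
        inner = np.mean(psi(j, xi) * psi(k, xi))     # approximates E[psi_j psi_k]
        print(f"E[psi_{j} psi_{k}] ~ {inner: .3f}")  # ~1 when j == k, ~0 otherwise
```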
Simple example
An example might clarify things. Let’s say our conductivity $\sigma$ is normally distributed with a mean value of 5 and a variance of 2. (A very realistic example, I know.)
Then, it can be expressed simply as
$\sigma = 5 H_0(\xi) + \sqrt{2}\, H_1(\xi)$,
where $H_k$ denote the aforementioned Hermite polynomials.
The first one takes care of the mean value of $\sigma$. Makes sense, right, $H_0$ being the only polynomial with a non-zero mean. Then, the variance is determined purely by the second term.
Still don’t believe me? Checking the Wikipedia article (probabilists’ convention) reveals that $H_0(\xi) = 1$ and $H_1(\xi) = \xi$, so the above expression is simplified into
$\sigma = 5 + \sqrt{2}\, \xi$.
This indeed has the correct distribution.
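If you’d rather trust a computer than me, here’s a tiny sanity check (my own sketch, not part of the original post): sample $\xi$ from a standard normal and confirm that $5 + \sqrt{2}\,\xi$ has the right mean and variance.

```python
# Sample xi ~ N(0, 1) and check that sigma = 5 + sqrt(2) * xi has
# mean 5 and variance 2, as claimed above.
import numpy as np

rng = np.random.default_rng(1)
xi = rng.standard_normal(1_000_000)
sigma = 5.0 + np.sqrt(2.0) * xi

print(sigma.mean())   # ~5.0
print(sigma.var())    # ~2.0
```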
Interpretation
Now, note one important thing. There’s a direct correspondence between the conductivity $\sigma$ and the normally distributed random variable $\xi$. Fix $\xi$, and the value of $\sigma$ is fixed too.
Indeed, we could just as well write that $\sigma$ is a function of $\xi$:
$\sigma = \sigma(\xi)$.
This is hugely beneficial, as we shall soon see. But before that, let’s deal with one issue.
Non-standard distributions
Yeah. What if $\sigma$ does not follow any standard distribution?
For example, it might look kinda normal, but with the tails cut off. I mean, you simply can’t have negative conductivities in reality, but a normal distribution would allow this.
In this case, it might be a good idea to use a $\xi$ that follows the uniform distribution on $[-1, 1]$. Correspondingly, the polynomials would be of the Legendre type. But finally, each value of $\xi$ would have to be related to a particular value of $\sigma$.
We can do this with basic high-school statistics principles, by transforming distributions. For example, the lower limit for $\sigma$ of course has to correspond to $\xi = -1$. This is fairly straightforward – neither can get any lower. Likewise, $\xi = 1$ would be mapped to the highest possible $\sigma$.
This approach can be followed further. Medians are mapped to medians, and quartiles to quartiles. Just like in the magnificent piece of art below.
You probably see now where this is going. We get an exact one-to-one correspondence by using the cumulative distribution functions. Simply put, for any possible $\sigma$, we get the corresponding $\xi$ by requiring that
$F_\sigma(\sigma) = F_\xi(\xi)$.
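Here’s a small Python sketch of that CDF matching. To have something concrete, I’m assuming – purely for illustration – that $\sigma$ follows a normal distribution with mean 5 and variance 2, truncated to $[1, 9]$, and that $\xi$ is uniform on $[-1, 1]$:

```python
# CDF matching: for any xi, the corresponding sigma is F_sigma^-1(F_xi(xi)).
# The truncated-normal parameters below are assumptions made for this example.
import numpy as np
from scipy import stats

loc, scale = 5.0, np.sqrt(2.0)                    # assumed mean and std
a, b = (1.0 - loc) / scale, (9.0 - loc) / scale   # truncation bounds in std units
sigma_dist = stats.truncnorm(a, b, loc=loc, scale=scale)
xi_dist = stats.uniform(loc=-1.0, scale=2.0)      # uniform on [-1, 1]

def sigma_of_xi(xi):
    """Map xi to sigma by matching the cumulative distribution functions."""
    return sigma_dist.ppf(xi_dist.cdf(xi))

print(sigma_of_xi(-1.0))   # lowest possible sigma (here 1)
print(sigma_of_xi(0.0))    # median maps to median (here 5)
print(sigma_of_xi(1.0))    # highest possible sigma (here 9)
```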
Once we have this relationship, we can again write the series approximation
$\sigma(\xi) \approx \sum\limits_{k=0}^{K} \sigma_k P_k(\xi)$
with the Legendre polynomials $P_k$. The coefficients $\sigma_k$ can then be determined with linear regression, or with the Galerkin method. But more on that later.
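Continuing the sketch above – same assumed truncated-normal $\sigma$ – here’s one simple way to get the coefficients $\sigma_k$ by least-squares regression. The degree-4 cut-off is just my arbitrary choice:

```python
# Least-squares fit of sigma(xi) onto Legendre polynomials; the truncated
# normal for sigma and the degree 4 cut-off are assumptions for illustration.
import numpy as np
from numpy.polynomial import legendre
from scipy import stats

loc, scale = 5.0, np.sqrt(2.0)
a, b = (1.0 - loc) / scale, (9.0 - loc) / scale
sigma_dist = stats.truncnorm(a, b, loc=loc, scale=scale)
xi_dist = stats.uniform(loc=-1.0, scale=2.0)

xi = np.linspace(-1.0, 1.0, 201)             # regression points in xi-space
sigma = sigma_dist.ppf(xi_dist.cdf(xi))      # corresponding conductivities
coeffs = legendre.legfit(xi, sigma, deg=4)   # sigma_0, ..., sigma_4

print(coeffs)
print(np.max(np.abs(legendre.legval(xi, coeffs) - sigma)))   # worst-case fit error
```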
Why polynomial chaos?
There’s one really nice benefit to the polynomial chaos approach. Representing the input-side uncertainty alone is kinda meh. We could do that in a million other ways, too.
The real benefit comes from analysing the output. The torque of an electrical machine, for instance.
Once we have the input – the conductivity – represented as a series, we do the exact same thing for the output too. We write the torque as
$T(\xi) \approx \sum\limits_{k=0}^{K} T_k P_k(\xi)$,
and determine the coefficients $T_k$ with one of the ways described later in this series.
And that’s where the real benefit appears. First of all, the above expression is really easy to analyse. We can determine mean values, variances, distributions, and you name it very easily indeed. No need for time-consuming computer simulations any more – just simple-ish arithmetic.
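To make that concrete, here’s a hedged sketch with completely made-up torque coefficients $T_k$. The mean and variance drop straight out of the expansion (note that numpy’s Legendre polynomials are not normalised, hence the $1/(2k+1)$ factors), and a Monte Carlo check agrees:

```python
# Post-processing a Legendre chaos expansion of the torque. The coefficient
# values T_k are hypothetical, chosen only to demonstrate the bookkeeping.
import numpy as np
from numpy.polynomial import legendre

T = np.array([10.0, 1.5, -0.4, 0.1])    # hypothetical coefficients T_0..T_3
k = np.arange(len(T))

mean_pce = T[0]                                       # only P_0 has a non-zero mean
var_pce = np.sum(T[1:] ** 2 / (2.0 * k[1:] + 1.0))    # E[P_k^2] = 1/(2k+1) on U(-1, 1)

rng = np.random.default_rng(2)
xi = rng.uniform(-1.0, 1.0, 1_000_000)
torque = legendre.legval(xi, T)          # evaluate the expansion sample-wise

print(mean_pce, torque.mean())           # should agree
print(var_pce, torque.var())             # should agree
```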
There’s also another benefit, maybe even more important than the first one. Realize that we’re writing both the input and the output as function(al)s of $\xi$,
$\sigma = \sigma(\xi) \quad \text{and} \quad T = T(\xi)$.
This means we have an almost direct expression from the input $\sigma$ to the output $T$. This, in turn, can offer some very useful insights into the effect $\sigma$ has on $T$. Like how any kind of variation in the former influences the latter.
Why the same polynomials?
Aaaand the mathematical onslaught continues a little bit more. Did you notice how we’re using the same Legendre polynomials to represent both the input $\sigma$ and the output $T$?
That’s partly because of maths – the orthogonality of the polynomials is nice.
But, there’s also a more intuitive reason for this. For the input side, we chose to use Legendre polynomials because the conductivity $\sigma$ was bounded and roughly uniformly distributed. Hence, choosing $\xi$ to be uniformly distributed as well allowed us to easily get a good polynomial approximation for $\sigma$.
This is exactly why we’re using the same polynomials for representing the output.
Think about it.
If the input is strictly limited to a certain interval, it is almost certain that the output is as well. Hence, a uniformly distributed $\xi$ makes sense for it too.
Likewise, if the input could take any possible value, but were still centered near its mean, the output would probably exhibit the same behaviour. Most of the time, the results would be quite the same, but occasionally you’d get an outlier. In this case, using a normally distributed $\xi$ would make sense.
Conclusion
Polynomial chaos enables us to express random variables in a nice concise form. That alone may not be very important in itself. Its real power comes from establishing a relationship between the random input, and the output. That way, the quantity of interest can be analysed very easily, without time-consuming simulations.
So far, we’ve only considered a single random variable. Next time, we shall see how several random variables or random functions can also be handled with the polynomial chaos approach.
Until then!
-Antti
Check out EMDtool - Electric Motor Design toolbox for Matlab.
Need help with electric motor design or design software? Let's get in touch - satisfaction guaranteed!