German power prices can be explained by supply and demand, but also by their causal correlation with the prices of the underlying energy futures. A properly weighted basket of gas, coal and emissions should therefore be able to track the moves of the power price. This article introduces multivariate regression analysis as a way to calculate the influence of the underlying markets on a given benchmark. It is an example of a machine learning algorithm used in analysis and trading.

## Multivariate regression analysis

Regression analysis describes a relationship, usually a linear one, between two assets. It is expressed in a simple equation: *Asset2 = Cash + Factor × Asset1*. In reality, though, you will hardly ever come across an asset that can be priced with such a simple relationship; a market usually has several dependencies. This is where multivariate regression analysis comes into play. It outputs an equation of the form *Benchmark = intercept + f1 × asset1 + f2 × asset2 + f3 × asset3 + …* The numbers intercept, f1, f2, f3, … are called the coefficients of this equation. Translated into the trader’s language, the output of a multivariate regression analysis is a *weighted basket of different assets plus a little cash*. The algorithm uses some training data to calculate the individual weightings.
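
To make the equation concrete, here is a minimal sketch of such a fit. It uses NumPy's least-squares solver rather than sklearn, and the prices, the intercept and the factors are all synthetic values chosen for illustration. The benchmark is built from known weights, so the fit can be checked against them.

```python
import numpy as np

# Synthetic closing prices for three hypothetical assets (30 days each).
rng = np.random.default_rng(0)
assets = rng.uniform(10.0, 50.0, size=(30, 3))

# Build a benchmark from known weights so the fit can be verified.
true_intercept = 5.0
true_factors = np.array([0.6, 0.3, 1.2])
benchmark = true_intercept + assets @ true_factors

# Prepend a column of ones so the intercept ("cash") is fitted as well.
design = np.column_stack([np.ones(len(assets)), assets])
coefficients, *_ = np.linalg.lstsq(design, benchmark, rcond=None)

intercept, factors = coefficients[0], coefficients[1:]
print("intercept:", round(float(intercept), 4))
print("factors:  ", np.round(factors, 4))
```

Because the synthetic benchmark contains no noise, the fitted intercept and factors match the weights used to build it. With real market data the fit is only approximate.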

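Multivariate regression analysis calculates the individual coefficients / weightings by minimizing the squared error between the basket and the benchmark over the training data. This can be done iteratively with gradient descent or directly with a least-squares solver. It is an algorithm of the machine learning class. I used the sklearn Python module to do all the calculations.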
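
The article does not show the fitting code, so the following is only a sketch of how gradient descent arrives at the coefficients. It assumes the drivers have already been standardized (otherwise a fixed learning rate converges poorly), and the data, the learning rate and the step count are made up. For comparison it also computes the direct least-squares solution, which minimizes the same squared error.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))  # hypothetical, already standardized drivers
y = 2.0 + X @ np.array([0.5, -1.0, 1.5])


def fit_gradient_descent(X, y, learning_rate=0.1, n_steps=2000):
    """Minimize the mean squared error by plain batch gradient descent."""
    design = np.column_stack([np.ones(len(X)), X])
    weights = np.zeros(design.shape[1])
    for _ in range(n_steps):
        residual = design @ weights - y
        gradient = 2.0 * design.T @ residual / len(y)
        weights -= learning_rate * gradient
    return weights  # [intercept, f1, f2, f3]


gd_weights = fit_gradient_descent(X, y)

# Direct least-squares solution of the same problem, for comparison.
design = np.column_stack([np.ones(len(X)), X])
ls_weights, *_ = np.linalg.lstsq(design, y, rcond=None)

print("gradient descent:", np.round(gd_weights, 4))
print("least squares:   ", np.round(ls_weights, 4))
```

Both routes end at the same coefficients here. In sklearn, `LinearRegression` solves the least-squares problem directly, while `SGDRegressor` takes the iterative route.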

## Drivers of German Power Prices

To give an example of how multivariate regression analysis can be used in trading and analysis, I will analyze German power prices. But feel free to use this kind of analysis with any market and its causally correlated drivers.

German power, for which I will use the yearly base price, can be explained by its drivers gas (TTF month/season), coal (API2) and the price of emission certificates (CFI2). Depending on the volatility and direction of these legs, power prices will change. (There are surely more drivers, but let’s keep it simple…)

On the chart above, a multivariate regression analysis has been used to calculate the influence of TTF month, API2 and CFI2 on the yearly power price. As you can see, coal has the highest influence on power prices, while gas and emission prices explain about 30% of the power price move. *This is not about the physical energy mix needed to produce power, but the fitting of a weighted basket of energy futures to replicate the index’s movements.*
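
The article does not say how the percentage influence is derived from the coefficients. One plausible definition, used here purely as an assumption, weights each factor by the volatility of its leg: a leg's share is |factor| × standard deviation of its daily price changes, normalized so that all shares sum to 100%. The function name `influence_shares`, the factors and the prices below are hypothetical.

```python
import numpy as np


def influence_shares(factors, driver_prices):
    """Share of benchmark moves attributed to each leg (assumed definition)."""
    daily_changes = np.diff(driver_prices, axis=0)
    # Typical daily contribution of a leg: |factor| times the leg's volatility.
    contribution = np.abs(factors) * daily_changes.std(axis=0)
    return contribution / contribution.sum()


# Made-up example: 30 synthetic closes for three legs and made-up factors.
rng = np.random.default_rng(2)
start = np.array([20.0, 60.0, 8.0])
steps = rng.normal(0.0, [0.4, 0.6, 0.2], size=(30, 3))
driver_prices = start + np.cumsum(steps, axis=0)
factors = np.array([0.4, 0.5, 0.3])

shares = influence_shares(factors, driver_prices)
for name, share in zip(["gas", "coal", "emissions"], shares):
    print(f"{name}: {share:.1%}")
```

Other definitions are possible, for example weighting each factor by the current price level of its leg, and they will give different percentages.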

To calculate the numbers shown above, the regression model was trained with the last 30 closing prices of gas, coal and emissions. With every day of new data, the model is automatically retrained and calculates a new set of coefficients.
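
The daily retraining on a 30-day window could look roughly like the sketch below. The function name `rolling_coefficients` and the price series are hypothetical, and the fit uses NumPy's least-squares solver as a stand-in for whatever sklearn estimator was actually used.

```python
import numpy as np


def rolling_coefficients(drivers, benchmark, window=30):
    """Refit the regression on each trailing window of closing prices.

    Row t of the result holds [intercept, f1, f2, ...] fitted on the
    `window` closes ending at day t; earlier rows stay NaN.
    """
    n_days, n_legs = drivers.shape
    result = np.full((n_days, n_legs + 1), np.nan)
    for t in range(window - 1, n_days):
        rows = slice(t - window + 1, t + 1)
        design = np.column_stack([np.ones(window), drivers[rows]])
        coefficients, *_ = np.linalg.lstsq(design, benchmark[rows], rcond=None)
        result[t] = coefficients
    return result


# Synthetic random-walk closes for gas, coal and emissions (made-up numbers).
rng = np.random.default_rng(3)
start = np.array([20.0, 60.0, 8.0])
drivers = start + np.cumsum(rng.normal(0.0, [0.4, 0.6, 0.2], size=(60, 3)), axis=0)
# Noise-free synthetic power price, so every window recovers the same weights.
power = 3.0 + drivers @ np.array([0.4, 0.5, 0.3])

coeffs = rolling_coefficients(drivers, power, window=30)
print(np.round(coeffs[-1], 4))  # latest set of coefficients
```

With real prices the coefficients drift from day to day, and plotting the columns of `coeffs` over time shows how the weightings of the basket change.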

If you know the influence of each market on the benchmark, you can also turn it around and calculate the “fair” value of the benchmark, according to the underlying futures. On the chart above, today’s gas/coal/emission prices were used to estimate tomorrow’s power price. This is the blue line on top of the power chart.
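
One possible reading of this one-day prognosis, sketched under assumptions: within the last 30 closes, each day's driver prices are paired with the benchmark close one day later, the regression is fitted on those pairs, and the fitted basket is then applied to today's driver prices. The function name `one_day_prognosis` and the data are hypothetical. For a same-day fair value you would instead pair drivers and benchmark from the same day.

```python
import numpy as np


def one_day_prognosis(drivers, benchmark, window=30):
    """Estimate tomorrow's benchmark close from today's driver closes."""
    # Pair each day's drivers with the benchmark close one day later.
    x_train = drivers[-window - 1:-1]
    y_train = benchmark[-window:]
    design = np.column_stack([np.ones(window), x_train])
    coefficients, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    # Apply the fitted basket to today's driver prices.
    return coefficients[0] + drivers[-1] @ coefficients[1:]


# Synthetic data in which the benchmark follows the drivers with a one-day lag.
rng = np.random.default_rng(4)
weights = np.array([0.4, 0.5, 0.3])
drivers = rng.uniform(10.0, 50.0, size=(40, 3))
benchmark = np.empty(40)
benchmark[0] = 30.0
benchmark[1:] = 1.0 + drivers[:-1] @ weights

prognosis = one_day_prognosis(drivers, benchmark)
print(round(float(prognosis), 4))
```

The function needs at least `window + 1` rows of history, because the most recent driver row has no next-day benchmark to train on.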

## German Power: TTF month and season

The more legs or driving factors you add, the better your prognosis should be. At least, multivariate regression analysis is capable of working with as many legs as you want; whether more really means better is something the results will have to show.

For the chart below I added the TTF season gas contract in addition to the TTF front month. Together with emissions and coal, the regression algorithm is now based on 4 correlated legs. This leads to a slightly more precise prognosis of the power price: compare the red prognosis (4 legs) to the blue prognosis (3 legs). The downside of adding more legs is usually that the individual factors get more volatile (as the system has more degrees of freedom).
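
One likely contributor to the more volatile factors is that the month and season contracts move closely together. This is an assumption about the data, not something the article measures. When two legs are nearly collinear, the regression can shift weight between them at almost no cost in fit, so small changes in the data move the coefficients a lot. A rough way to see this is the condition number of the design matrix, sketched here on made-up prices.

```python
import numpy as np

rng = np.random.default_rng(5)
n_days = 30

# Hypothetical closes: the season contract tracks the front month closely.
ttf_month = 20.0 + np.cumsum(rng.normal(0.0, 0.4, n_days))
ttf_season = ttf_month + rng.normal(0.0, 0.05, n_days)
coal = 60.0 + np.cumsum(rng.normal(0.0, 0.6, n_days))
emissions = 8.0 + np.cumsum(rng.normal(0.0, 0.2, n_days))


def standardized_design(*legs):
    """Stack the legs as z-scored columns next to an intercept column."""
    columns = [(leg - leg.mean()) / leg.std() for leg in legs]
    return np.column_stack([np.ones(n_days), *columns])


three_legs = standardized_design(ttf_month, coal, emissions)
four_legs = standardized_design(ttf_month, coal, emissions, ttf_season)

# A larger condition number means the coefficients react more strongly
# to small changes in the training data.
cond_three = np.linalg.cond(three_legs)
cond_four = np.linalg.cond(four_legs)
print(f"3 legs: {cond_three:.1f}   4 legs: {cond_four:.1f}")
```

Appending a column can never lower the condition number, and a nearly collinear column typically raises it sharply. If the unstable weightings become a problem, a regularized fit such as ridge regression is a common remedy.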

Besides making a prediction or calculating a fair value for the benchmark, this kind of analysis is also important in risk control. Knowing the driving factors of your portfolio surely helps in designing the right hedging strategy.

## Regression analysis: other markets

Regression analysis is not limited to German power markets, and it is not limited to a specific number of legs. The example below shows a regression analysis of the influence of the sector ETFs for financials, tech and T-bonds on the SPY. The chart shows the % weightings of a tracking basket and a one-day prognosis of SPY using the 3 mentioned ETFs. The model uses the previous 30 closing prices and is retrained every day.