Power transform for solar generation data

In California, solar generation has increased substantially in recent years as more solar panels have been installed. As the top panel of the first figure shows, solar generation was low in 2011 and began to take off in 2013. In general, its time series is characterized by a positive trend and a seasonal cycle. Furthermore, its mean and variance increase over time, leading to an asymmetric distribution with a peak at low values and a long tail toward high values. Because the mean and variance are not constant, a power transform is needed before modeling the data.

Top: Monthly mean solar generation (photovoltaic + thermal) in California, from CAISO, January 2011 to December 2018. Bottom: Distribution of the monthly mean solar generation.

I applied the Box-Cox transform, which selects its power parameter lambda from the data, to the time series and obtained lambda of about 0.41, which is close to the square root transform (lambda = 0.5). After the transform, the mean and variance stay relatively constant over time and the distribution is more symmetric.

Same as the figure above, but for the monthly mean solar generation after the Box-Cox transform.
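
As a rough illustration of the Box-Cox step, here is a minimal sketch using scipy.stats.boxcox. The series is synthetic (a made-up growing, seasonal, strictly positive signal), not the CAISO data, and the variable names are my own choices, so the fitted lambda will not match the 0.41 reported above. The choice of SciPy is also an assumption; any Box-Cox implementation should behave similarly.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

# Synthetic stand-in for 96 monthly means (2011-01 .. 2018-12); NOT the CAISO data.
rng = np.random.default_rng(0)
months = np.arange(96)
level = 20.0 + 0.5 * months**2                          # accelerating growth
seasonal = 1.0 + 0.3 * np.sin(2 * np.pi * months / 12)  # annual cycle
noise = rng.lognormal(mean=0.0, sigma=0.1, size=months.size)
generation = level * seasonal * noise                   # strictly positive, as Box-Cox requires

# With no lambda supplied, boxcox estimates it from the data
# and returns the transformed series together with the fitted lambda.
transformed, lam = stats.boxcox(generation)

# For lambda != 0 the transform is (x**lambda - 1) / lambda.
manual = (generation**lam - 1.0) / lam

# The transform is invertible, so results can be mapped back to original units.
recovered = inv_boxcox(transformed, lam)

print(f"fitted lambda: {lam:.2f}")
print(f"skewness before: {stats.skew(generation):.2f}, after: {stats.skew(transformed):.2f}")
```

Keeping the fitted lambda around matters for later steps: anything modeled on the transformed scale can be converted back with inv_boxcox using the same value.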

Taking a further step and removing the linear trend from the transformed time series leaves residuals that scatter around zero, although they tend to be more negative in the last two years (2017 and 2018). The transformed and detrended time series is better suited to downstream analysis.

Top: Time series of monthly mean solar generation after the Box-Cox transform, with the linear fit. Middle: Monthly mean solar generation detrended using the linear fit. Bottom: Distribution of the detrended solar generation data.
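
The detrending step can be sketched in the same spirit. The example below fits a straight line with numpy.polyfit and subtracts it. The input is again a synthetic placeholder for the Box-Cox-transformed series (the slope, seasonal amplitude, and noise level are invented for illustration), and a least-squares line is only one reasonable way to obtain the linear fit.

```python
import numpy as np

# Synthetic stand-in for the Box-Cox-transformed monthly series; NOT the real data.
rng = np.random.default_rng(1)
months = np.arange(96)
transformed = (
    5.0
    + 0.4 * months                            # linear trend to be removed
    + 3.0 * np.sin(2 * np.pi * months / 12)   # seasonal cycle, left in the residuals
    + rng.normal(0.0, 1.0, size=months.size)
)

# Least-squares straight line; polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(months, transformed, deg=1)
linear_fit = slope * months + intercept

# Residuals are the detrended series. By construction they average to zero,
# but they can still drift from zero over sub-periods if the trend is not exactly linear.
residuals = transformed - linear_fit

print(f"slope per month: {slope:.3f}")
print(f"mean residual, last 24 months: {residuals[-24:].mean():.3f}")
```

Checking the mean residual over a sub-period, as in the last line, is a simple way to spot the kind of late-period departure from the linear fit noted above for 2017 and 2018.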