GoldSim’s probabilistic simulation capabilities make it straightforward to perform advanced statistical analysis of time series data. While many models focus on a single simulation run, this model uses multiple SubModels and Monte Carlo sampling to turn a historical record into monthly, annual, and daily flow statistics.
This approach allows you to compare historical observations (empirical data) with standard analytical distributions (Normal, Gamma, Log-Normal, and Log-Pearson Type III) to find the best fit for your specific watershed.
How the Model Works
The model is structured to automate the heavy lifting of statistical comparison. It iterates over the historical record, calculating monthly and annual flow statistics, and then uses a correlation-based "best-fit" logic to select the most appropriate analytical distribution for each month.
The model logic is divided into distinct functional areas:
Run_Duration: Processes the historical period at a 1-day time step to establish the baseline time history.
Empirical: Iterates over the data to calculate monthly and annual flow statistics based on the number of years on record.
Daily_Percentiles: Uses Monte Carlo realizations to generate probability bands (e.g., 5th, 50th, 95th percentiles) for daily flow rates.
Analytical: A static SubModel that samples from theoretical distributions to allow for a direct comparison with the historical data.
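Outside of GoldSim, the Empirical and Daily_Percentiles steps can be approximated in a few lines of Python. This is only a minimal sketch, not the model's actual implementation: it assumes the record is a list of (date, flow) pairs, treats each year on record as one realization, and uses hypothetical function names and a synthetic record invented for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, quantiles


def synthetic_record(start_year=2001, years=3):
    """Hypothetical daily record: flow depends only on the month and year index."""
    record = []
    day = date(start_year, 1, 1)
    end = date(start_year + years, 1, 1)
    while day < end:
        record.append((day, 10.0 * day.month + (day.year - start_year)))
        day += timedelta(days=1)
    return record


def monthly_mean_flows(daily):
    """Average the daily flows into one mean per (year, month)."""
    buckets = defaultdict(list)
    for day, flow in daily:
        buckets[(day.year, day.month)].append(flow)
    return {key: mean(values) for key, values in buckets.items()}


def flows_by_calendar_month(monthly_means):
    """Collect the monthly means by calendar month: one value per year on record."""
    by_month = defaultdict(list)
    for (_year, month), value in sorted(monthly_means.items()):
        by_month[month].append(value)
    return dict(by_month)


def day_of_year_percentiles(daily, probs=(5, 50, 95)):
    """Percentile bands for each day of the year, taken across all years."""
    by_day = defaultdict(list)
    for day, flow in daily:
        by_day[day.timetuple().tm_yday].append(flow)
    bands = {}
    for doy, values in by_day.items():
        if len(values) < 2:  # quantiles() needs at least two data points
            continue
        cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
        bands[doy] = {p: cuts[p - 1] for p in probs}
    return bands
```

Calling `day_of_year_percentiles(synthetic_record())` returns one dictionary of 5th, 50th, and 95th percentile flows per day of the year. With a real record, the width of the band on a given day reflects how much that day's flow has varied from year to year.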
Navigating the Analysis
The model is driven by a central dashboard that guides you through the analysis workflow:
Data Input: Paste your daily time series data into the Time Series element.
Simulation Setup: Define the start and end dates and run the model.
Visualization: View the daily flow rate percentiles to understand seasonal variability and uncertainty.
Distribution Fitting and Comparison
A unique feature of this model is its ability to "brute force" a comparison between empirical data and multiple analytical distribution types. For each month, the model samples from:
Normal
Gamma
Log-Normal
Log-Pearson Type III (standard for flood frequency analysis)
The model calculates a correlation coefficient between the empirical quantiles and the analytical quantiles. Using a Script element, it determines which distribution type most closely matches the historical record.
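The same brute-force idea can be sketched in Python for a single month's values. This is an illustration under stated assumptions rather than a copy of the model's Script element: the distributions are fitted by the method of moments, the analytical quantiles are read from a sorted Monte Carlo sample, and all function names are hypothetical. Every flow must be positive, and the list needs at least three distinct values.

```python
import math
import random
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_skew(values):
    """Bias-corrected sample skewness coefficient."""
    n, m, s = len(values), mean(values), stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s ** 3)


def pearson3(rng, m, s, g):
    """One Pearson Type III draw with mean m, std dev s, and skew g."""
    if abs(g) < 1e-6:  # zero skew reduces to the Normal distribution
        return rng.gauss(m, s)
    shape = 4.0 / g ** 2
    scale = s * abs(g) / 2.0
    return m - 2.0 * s / g + math.copysign(1.0, g) * rng.gammavariate(shape, scale)


def build_samplers(flows, rng):
    """Method-of-moments fits (an assumed fitting choice) for the four candidates."""
    m, s = mean(flows), stdev(flows)
    logs = [math.log(v) for v in flows]
    lm, ls = mean(logs), stdev(logs)
    log10s = [math.log10(v) for v in flows]
    tm, ts, tg = mean(log10s), stdev(log10s), sample_skew(log10s)
    return {
        "Normal": lambda: rng.gauss(m, s),
        "Gamma": lambda: rng.gammavariate((m / s) ** 2, s ** 2 / m),
        "Log-Normal": lambda: rng.lognormvariate(lm, ls),
        "Log-Pearson Type III": lambda: 10.0 ** pearson3(rng, tm, ts, tg),
    }


def quantile_correlations(flows, n_samples=5000, seed=1):
    """Correlate sorted empirical values with matching analytical quantiles."""
    rng = random.Random(seed)
    empirical = sorted(flows)
    n = len(empirical)
    scores = {}
    for name, draw in build_samplers(flows, rng).items():
        sample = sorted(draw() for _ in range(n_samples))
        # Read the sample at plotting positions (i + 0.5) / n
        analytical = [sample[int((i + 0.5) / n * n_samples)] for i in range(n)]
        scores[name] = pearson_r(empirical, analytical)
    return scores


def best_fit(flows, **kwargs):
    """Return the name of the highest-scoring distribution and all scores."""
    scores = quantile_correlations(flows, **kwargs)
    return max(scores, key=scores.get), scores
```

Calling `best_fit` once per calendar month mirrors the month-by-month selection described above. Because the scores come from random samples, two candidates with nearly equal correlations can swap places when the seed or sample size changes.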
Validating the Fit
To ensure the analytical distributions are representative, the model includes a dashboard dedicated to correlations. Using GoldSim’s Multi-Variate Result elements, the model plots the sampled analytical values against the empirical observations. A tight linear grouping indicates a high-quality fit, giving you confidence in the parameters used for your planning studies.
Statistical Summaries
The final stage of the analysis is the generation of a comprehensive statistics summary. This includes:
Monthly Averages: A table displaying the 10th percentile, Mean, 90th percentile, and Maximum values.
Annual Peak Flow: A distribution of the highest daily flow rate recorded within each year (Peak Annual).
CCDF Plots: Complementary Cumulative Distribution Function plots for both monthly averages and annual peak flow.
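For readers who want to reproduce these summaries outside of GoldSim, the following Python sketch shows one way to compute them. It is written under assumptions of its own: the function names are hypothetical, the percentiles use linear interpolation, and the exceedance probabilities use the Weibull plotting position rank / (n + 1), which may differ from the convention GoldSim uses in its result displays.

```python
from datetime import date
from statistics import mean, quantiles


def annual_peaks(daily):
    """Highest daily flow within each calendar year, from (date, flow) pairs."""
    peaks = {}
    for day, flow in daily:
        peaks[day.year] = max(flow, peaks.get(day.year, flow))
    return peaks


def summary_row(values):
    """10th percentile, mean, 90th percentile, and maximum of a set of values."""
    cuts = quantiles(values, n=10, method="inclusive")  # 9 decile cut points
    return {"p10": cuts[0], "mean": mean(values), "p90": cuts[8], "max": max(values)}


def empirical_ccdf(values):
    """(value, exceedance probability) pairs, largest value first.

    Assumes the Weibull plotting position rank / (n + 1).
    """
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    return [(v, rank / (n + 1)) for rank, v in enumerate(ordered, start=1)]
```

Passing the values of `annual_peaks(...)` to `empirical_ccdf` gives the points of an annual peak flow exceedance curve, and calling `summary_row` on each calendar month's averages builds one row of the monthly table.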
By using this model, water managers can move from simply looking at a hydrograph to understanding the underlying probability of drought and flood events, leading to more risk-informed decisions. All of this requires only a time series of measured daily flow rates.
Contact:
Jason Lillywhite (Lillywhite Water Solutions)
Comments
I tried to use this model with my own data. For some reason, some of the months have some kind of horizontal shift between empirical and analytical curves. What could be causing this?
Atlin,
Thank you for bringing this to my attention. I found a bug in the way we gather the monthly analytical and empirical values within the Analytical SubModel. Please download the updated model, which I've uploaded to this article, and try running it again with your data. I tried this with your time series and got the following result:
In case you are curious, this change was made to the two Script elements inside the "Analytical" SubModel. I now reference the month counter ~M so that the samples are assigned to the correct month instead of referencing the entire array:
If you still have the model that you ran with my data, could you send it to me please? I tried to input my time series into the updated model on this page, but I may have made an error trying to adapt the model from daily to monthly data, as I'm still getting an error during February.
Would you mind sending the input time series to me? That way, I can verify correct functionality before sending you the model. I might find another issue that is causing your February anomalies. Please start a new support ticket and attach a zip file. Thank you so much! Below is the link: https://support.goldsim.com/hc/en-us/requests/new