I am trying to use Bayesian updating to calibrate a hydrologic model probabilistically. To do so, I followed one of the general examples (./General Examples/Stochastic/BayesianUpdating.gsm).
I appreciate you helping me out with it.
Thanks!
Either change it to another distribution or change the statistics (e.g., mean, standard deviation, etc.) of that distribution. More precisely, the distribution of an input (or a few inputs) is adjusted so that the output distribution better matches some observation. For instance, in a hydrologic model, the distribution of Curve Number is adjusted so that the streamflow distribution better matches the observed record (e.g., the median of the predicted streamflow ensemble matches the observed streamflow).
Is it doable with Bayesian updating?
You can always change the distribution parameters programmatically (i.e., mean, standard deviation, etc.), and this can be done using optimization. That part is fairly straightforward.
Selecting a distribution that provides a better fit based on information learned during the simulation is a bit trickier, but I think it can still be done. You could set up your model to have GoldSim select a distribution from a list of possibilities that you've pre-defined, using the built-in optimization function. This would require that you pre-define a few distributions and then select from the list using logic. Another option might be to use the "Externally Defined" type of distribution, which is generated in a submodel.
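To make that "select from a list" idea concrete, here is a minimal Python sketch (outside GoldSim, with made-up numbers) of an optimizer choosing among pre-defined candidate distributions by index, much as a Selector with If-Then logic would inside the model:

```python
# Hypothetical sketch of pre-defining candidate distributions and letting an
# optimizer pick one by index. All distributions and the "observed" statistic
# below are illustrative assumptions, not values from any real model.
import numpy as np

rng = np.random.default_rng(0)

# Pre-defined candidate distributions for the uncertain input (e.g., Curve Number).
candidates = [
    lambda n: rng.normal(75, 5, n),         # Normal(mean=75, sd=5)
    lambda n: rng.uniform(60, 90, n),       # Uniform(60, 90)
    lambda n: rng.lognormal(4.3, 0.07, n),  # Lognormal
]

observed_median = 72.0  # placeholder observed statistic to match

def objective(index, n=10_000):
    """Misfit between the simulated ensemble median and the observation."""
    sample = candidates[index](n)
    return abs(np.median(sample) - observed_median)

# Brute-force "optimization" over the discrete choice, standing in for
# GoldSim's optimizer selecting which Stochastic output to use.
best = min(range(len(candidates)), key=objective)
print(f"Selected distribution #{best}")
```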
Thanks Jason.
Yes, I meant changing the PDF type too, to truly find the best distribution. Could that Bayesian updating capability be used for this purpose? I tried to follow the general example (./General Examples/Stochastic/BayesianUpdating.gsm) but couldn't understand how it works.
Also, if one just wants to update the PDF statistics, can the objective function in the optimization toolbox be based on some distribution percentiles (e.g., the difference between the 5th and 95th percentiles) of an output? To my understanding, the optimization is done deterministically. Or am I missing something?
The example you are referring to updates the distribution parameters only. It does not change the distribution type. If you want to change the distribution type in GoldSim, you will need a way to switch between Stochastic elements that you refer to. There is no way to switch the distribution type within a single Stochastic element during a simulation unless you use the "Externally Defined" type and link it to the output of a submodel that might have a changing distribution inside of it. The simplest approach might be to create 3 or 4 Stochastic elements, each with its own unique distribution defined. Then, during an optimization run, you can select which stochastic output to use with If-Then statements.
If you want a statistic to be your objective function, then you must put the functionality inside a submodel that runs Monte Carlo. Then expose the output on the interface of the submodel, and you can refer to the 5th and 95th percentiles or any other statistic.
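As an illustration of that nested structure, here is a small Python sketch: the inner Monte Carlo function plays the role of the submodel exposing percentile statistics, and an outer optimizer adjusts the distribution parameters. The flow relation and target spread are toy assumptions, not a real hydrologic model:

```python
# Hypothetical sketch of "statistic as objective": an inner Monte Carlo run
# exposes percentiles, an outer optimizer tunes the input distribution.
import numpy as np
from scipy.optimize import minimize

def submodel(mean, std, n=5_000):
    """Inner Monte Carlo 'submodel': returns 5th and 95th percentiles of the output."""
    rng = np.random.default_rng(1)       # fixed seed -> common random numbers per call
    cn = rng.normal(mean, std, n)        # sampled input (e.g., Curve Number)
    q_sim = 0.1 * np.abs(cn) ** 1.5      # toy stand-in for the hydrologic model
    return np.percentile(q_sim, [5, 95])

target_spread = 40.0  # hypothetical observed 5th-to-95th percentile spread

def objective(params):
    mean, std = params
    p5, p95 = submodel(mean, abs(std))   # abs() keeps the std dev positive
    return ((p95 - p5) - target_spread) ** 2

result = minimize(objective, x0=[75.0, 5.0], method="Nelder-Mead")
print("Calibrated mean and std dev:", result.x)
```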
Does that make sense?
-Jason
For the simple case (no change on the PDF type):
Your solution: "You can always change the distribution parameters programatically (i.e. mean, std dev., etc.) and this can be done using optimization. That part is fairly straightforward."
The issue that I have is with defining the objective function. I generate a simulated flow time series (QSim) as an expression element with multiple Monte Carlo simulations (probabilistic simulations). Since the element's not stochastic, I get the following error when using the expression 'PDF_CumProb(QSim.distribution,95%) - PDF_CumProb(QSim.distribution,5%)' in the objective function:
"Could not find the distribution of the element QSim".
Any thoughts on how to get rid of this?
Ebrahim,
If "QSim" is truly a time series element, then it will not have an output called "distribution". Instead, QSim must be a probabilistic output either as a stochastic element in your main model or a probabilistic result from a submodel. Can you please clarify what "QSim" is? Please have a look at the output interface of a probabilistic submodel as I think this is what you need to be using here. Run optimization on a model that contains a submodel. The submodel runs Monte Carlo and the output of the submodel will give you the distribution output.
Thanks,
Jason
If it is a probabilistic time history output from a submodel and you want to obtain the 5th and 95th percentile for each time step, then you should link those statistical time histories in the output interface of the submodel as "Statistical Time Histories". Please refer to this model: https://support.goldsim.com/hc/en-us/articles/115015957608
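For what those statistical time histories look like numerically, here is a small sketch (entirely synthetic data) computing the 5th and 95th percentile per time step across Monte Carlo realizations:

```python
# Sketch of a "Statistical Time History": percentiles of a probabilistic time
# history, taken per time step across realizations. The flows are synthetic.
import numpy as np

rng = np.random.default_rng(2)
n_realizations, n_steps = 200, 365

# Synthetic ensemble of daily flows, shape (realizations, time steps).
q = rng.lognormal(mean=2.0, sigma=0.5, size=(n_realizations, n_steps))

# Percentiles across realizations (axis 0) give one value per time step,
# i.e., a statistical time history like the one linked on the submodel interface.
q5 = np.percentile(q, 5, axis=0)
q95 = np.percentile(q, 95, axis=0)
print(q5.shape, q95.shape)  # (365,) (365,)
```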
I hope this helps.
Jason
By the way, you could just add several Stochastic elements to your model, each with a different distribution assigned, and then use a Selector element to have GoldSim choose from the list of possible distributions as part of your optimization process, thereby having your model pick the probability distribution for you.
Would it not be more accurate to use Bayesian updating (an Externally Defined PDF)? I don't know how to do this in my hydrologic model, even after looking into the example model. How can the uncertainty reduction factor (URF) be determined in this problem (or in any problem with some observed data)?
Ebrahim,
Did you already walk through this Help page: https://www.goldsim.com/Help/index.html#!Modules/5/dynamicallyrevisingdistributionsusingsimulatedbayesianupdating1.htm ?
Was this helpful?
-Jason
Yes, the example was helpful in giving me a general description of this feature. I think the external definition in this PDF is the initial PDF that we assume (the prior distribution), but I still don't understand how to define the URF in a hydrologic model. The example assumes that the URF is simply reduced by a factor of 0.2 every 10 days over a period of 50 days. In my case, we run a continuous hydrologic simulation and have a stream gauge with an observed discharge time series. How do we define the URF in this problem?
In the example we provide, the level of uncertainty is decreasing over time until the true answer is converged upon. This is done by having a true value that it must get closer to over time. My question about your case is whether you have this true value(s) that the model needs to converge toward. Do you have that?
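For reference, here is one plausible reading of the example's mechanic as a short Python sketch: the distribution's mean drifts toward a known true value while the spread shrinks on a fixed schedule (every 10 days over 50 days, as you describe). All numbers are illustrative, not the example's actual values:

```python
# Hypothetical sketch of simulated Bayesian updating with an uncertainty
# reduction factor (URF): the mean converges toward a known true value and the
# standard deviation shrinks on a schedule. Values below are made up.
true_value = 80.0        # the value the distribution converges toward
mean, std = 70.0, 10.0   # assumed prior mean and standard deviation
urf = 0.2                # reduction applied every 10 days (per the example)

for day in range(10, 51, 10):
    # Shift the mean toward the true value and shrink the spread by the URF.
    mean += urf * (true_value - mean)
    std *= (1.0 - urf)
    print(f"day {day:2d}: mean={mean:.2f}, std={std:.2f}")
```

This only works because a true value exists for the mean to converge toward, which is exactly the question above.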
Can we back up a bit here? Why are you trying to solve the problem this way? If you have observed streamflow and you want to calibrate the model to that, why not just calibrate the model by adjusting input parameters? My understanding of Bayesian updating is that you are learning more about the project over time and as your information gets better over time, you are able to adjust your uncertainty. This seems like a difficult way to just calibrate a hydrologic model. Why not just run a simulation for the period of record and then use optimization to allow the input parameters to change until you get a model output that approaches the observed output? You could use a method discussed in the webinar last month.
My goal is to develop a probabilistic hydrologic model (not a framework). With a deterministic model, many plausible modeling scenarios arising from different parameter combinations are missed (i.e., equifinality), whereas a probabilistic model accounts for all of these, plus the many other benefits of probabilistic modeling that you know. Does this sound reasonable?
But in your case, uncertainty is not decreasing as the simulation progresses, and isn't this the requirement for Bayesian updating? Isn't the whole purpose to adjust your distributions based on learning better information as time progresses? In your case, the acquired information is not getting any less uncertain at all; there is always uncertain, random behavior as you progress through time. It seems to me that Bayesian updating is not the solution you need.
Rather, it seems to me you should be focusing on a way to develop a reasonable set of stochastic and uncertain inputs to drive your hydrologic model. These will be developed with some calculations and also with some judgment calls because we just don't have enough information to develop an exact range of inputs. I would approach your situation by isolating the observed record into individual, isolated storm events. With that done, I would look at the inputs that drive certain behaviors like peak flow, duration of flood flow, and total volume of runoff during the event. Then you could look at the ranges of inputs (within reason, based on your engineering judgment) that lead to these single events.
I did have a quick look for literature that uses what sounds like your approach and found this: https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1039&context=cengin_fac and this: http://onlinelibrary.wiley.com/doi/10.1029/2000WR900405/pdf
But these approaches are being updated as new information is coming in. As new data comes in from the field, you can re-run the model and fine tune the inputs. You repeat this over the duration of the period in which new data is coming in. These are sort of like real-time approaches that I think vary from what you are suggesting.
Do you agree? Perhaps I am missing something fundamental about your approach here. Thanks for your patience!
Jason
Thanks Jason.
Well, in a sense, the uncertainty of the model parameters is reduced when we compare the ensemble of simulated streamflow with the observed discharge at the gauge, but we don't know by how much the uncertainty is reduced. At first we have no information on these parameters, so we assign some PDF (e.g., uniform) arbitrarily. The comparison with observed streamflow then provides additional information, so we can adjust the prior PDFs and derive posterior distributions. See Fig. 5 of this paper (not sure if you have access to it): https://www.sciencedirect.com/science/article/pii/S0022169409008221
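The idea in that figure reads like GLUE-style likelihood weighting. Here is a rough Python sketch of it under a toy model and an assumed Gaussian error model: sample parameters from the arbitrary uniform prior, weight each sample by its fit to the observed flow, and resample to get an empirical posterior. Everything here is a placeholder assumption:

```python
# Hypothetical GLUE-style update: uniform prior -> likelihood weights -> posterior.
import numpy as np

rng = np.random.default_rng(3)
q_obs = 12.0  # placeholder observed flow statistic (e.g., mean gauged discharge)

# Uniform prior on the parameter (e.g., Curve Number), reflecting no information.
prior_samples = rng.uniform(60, 90, 5_000)

def model(cn):
    return 0.16 * cn  # toy stand-in for the hydrologic model

# Likelihood of each sample given the observation (Gaussian error model assumed).
sigma_err = 1.0
weights = np.exp(-0.5 * ((model(prior_samples) - q_obs) / sigma_err) ** 2)
weights /= weights.sum()

# Resample by weight to get an (approximate) posterior sample of the parameter.
posterior = rng.choice(prior_samples, size=5_000, p=weights)
print(f"prior mean {prior_samples.mean():.1f} -> posterior mean {posterior.mean():.1f}")
```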