
faq 29720596

Billy Charlton edited this page Sep 5, 2018 · 2 revisions

To get statistically valid results from my simulations, should I compute a sample of runs? Or should I create a sample of synthetic populations and run each of them several times? How should I design the experiment?

by Kai Nagel on 2015-06-16 13:53:25


(proxying the question for Davi Bicudo)


Comments: 1


Re: To get statistically valid results from my simulations, should I compute a sample of runs? Or should I create a sample of synthetic populations and run each of them several times? How should I design the experiment?

by Kai Nagel on 2015-06-16 14:06:04

I would argue that this is not well researched. Our forthcoming book will have one or two chapters on interpreting the simulation as a Monte Carlo engine. In the meantime, a couple of thoughts:

  • We recommend eventually switching off innovation and, at the same time or somewhat later, switching on MSA score averaging. Neither is currently the default, since changing the default would make regression tests fail, so I need some time to look at things first.
  • With innovation switched off and score averaging switched on, the scores should eventually converge, and the simulation then (just) does repeated draws from the underlying models. Which distribution it draws from depends on the (non-innovative) selectors you are using; with the (recommended) ExpBetaPlanChanger it will draw from a logit distribution. In consequence, this should be self-averaging: averaging over an ensemble of runs (e.g. 10 runs with different random seeds) should yield the same result as averaging over the iterations of a single run (e.g. running for another 1000 iterations after convergence and taking results every 100 iterations). I don't think that anybody has investigated this systematically.
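The self-averaging argument in the second bullet can be illustrated with a toy sketch. This is not MATSim code: it assumes the scores have already converged, that every agent draws its plan independently from a logit distribution in each iteration, and that there is no feedback through the network. All names and values (`BETA`, `SCORES`, `N_AGENTS`, `one_iteration`) are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics

BETA = 1.0                  # hypothetical logit scale parameter
SCORES = [1.0, 0.5, 0.0]    # hypothetical converged plan scores, identical for all agents
N_AGENTS = 2000             # hypothetical population size


def logit_probs(scores, beta):
    """Logit choice probabilities for a list of plan scores."""
    best = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (s - best)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def one_iteration(rng, probs, n_agents):
    """Share of agents selecting plan 0 in one iteration (independent logit draws)."""
    picks = rng.choices(range(len(probs)), weights=probs, k=n_agents)
    return picks.count(0) / n_agents


probs = logit_probs(SCORES, BETA)

# Ensemble: 10 runs with different random seeds, one observation per run.
ensemble = [one_iteration(random.Random(seed), probs, N_AGENTS) for seed in range(10)]

# Iterations: a single run continued for 1000 iterations, observed every 100th.
rng = random.Random(12345)
series = [one_iteration(rng, probs, N_AGENTS) for _ in range(1000)]
over_iterations = series[99::100]

ensemble_mean = statistics.mean(ensemble)
iteration_mean = statistics.mean(over_iterations)
print(f"logit share of plan 0:   {probs[0]:.4f}")
print(f"mean over 10 seeds:      {ensemble_mean:.4f}")
print(f"mean over 10 iterations: {iteration_mean:.4f}")
```

In this idealized setting the two averages agree up to sampling noise by construction. In a real run, successive iterations are correlated (agents tend to keep their plans) and agents interact through congestion, which is why observations are spaced out and why the equivalence would still need to be checked empirically.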

A sample of synthetic populations would be a different experiment. It will certainly yield wider variability than just looking at the iterations; how much wider depends on the variance embedded in your synthetic population generation compared to the variance embedded in MATSim. We did this many years back in the TRANSIMS context, but unfortunately I have no idea where that report went (presumably Beckman, and probably Barrett, are among the authors; I am not sure if I am on it; if someone finds it, please let us know).
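One possible way to compare the two variance sources is to run several synthetic populations with several seeds each and split the variability into a within-population and a between-population part. The sketch below is only an illustration under strong assumptions: `run_simulation` is a hypothetical stand-in for a full simulation run that returns one aggregate indicator, and both noise sources are assumed to be additive and Gaussian with made-up standard deviations (`POP_SD`, `SIM_SD`).

```python
import random
import statistics

POP_SD = 2.0          # assumed spread introduced by population synthesis
SIM_SD = 1.0          # assumed spread between runs on the same population
N_POPULATIONS = 50    # number of synthetic populations
N_SEEDS = 20          # runs (random seeds) per population


def run_simulation(population_effect, rng):
    """Hypothetical stand-in for one run: returns an aggregate indicator."""
    return 100.0 + population_effect + rng.gauss(0.0, SIM_SD)


rng = random.Random(42)
results = []
for _ in range(N_POPULATIONS):
    effect = rng.gauss(0.0, POP_SD)  # one draw per synthetic population
    results.append([run_simulation(effect, rng) for _ in range(N_SEEDS)])

# Variability between runs on the same population (simulation noise).
within_var = statistics.mean(statistics.variance(runs) for runs in results)

# Variability between population means, corrected for the seed noise left in each mean.
pop_means = [statistics.mean(runs) for runs in results]
between_var = statistics.variance(pop_means) - within_var / N_SEEDS

# Variability seen when populations and seeds are both varied.
total_var = statistics.variance([v for runs in results for v in runs])

print(f"within-population variance:  {within_var:.2f}")
print(f"between-population variance: {between_var:.2f}")
print(f"total variance:              {total_var:.2f}")
```

The ratio of the between-population to the within-population variance indicates how much wider the spread becomes once the population is resampled as well. With real runs, `run_simulation` would be replaced by whatever indicator is extracted from each completed simulation.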
