Stratified sampling based on properties of random number streams

This thread discusses stratified sampling of simulation outputs, with strata defined by properties of the random number streams that drive the simulation. The technique partitions the possible streams into mutually exclusive categories and runs replications of the simulation within each category, so that the range of conditions is generated consistently and the state space is covered. One suggested way to stratify a stream is to partition the probability distribution that generates it, draw a proportionate batch of numbers from each part, and shuffle the batches together to form the final stream. The reply expects the research literature to cover the topic but cites no specific papers.
  • #1
Stephen Tashi
TL;DR Summary
Are there techniques to do stratified sampling of simulation outputs by basing the strata on properties of the random number streams involved?
I recall seeing briefing notes dating from about the 1970s that advocated doing stratified sampling of the outputs of simulations by using strata based on properties of the random number streams. However, I don't recall how the strata were to be defined. Is this type of stratified sampling a well-known technique?

Further elaboration: A complicated simulation, such as a combat simulation, usually involves several "random number streams" of pseudorandom numbers. Each stream has a designated application. For example, one stream might be for the accuracy of "Blue" artillery against "Red" targets. The most natural way to run multiple replications of a simulation is to vary the seeds of the random number generators for the streams in such a way that the replications can be considered independent. I remember seeing briefing notes that advocated the different technique of doing stratified sampling based on the properties of the random number streams. However, I don't recall what properties of the streams were used to define the strata.

According to the standard theory of stratified sampling, one would have to partition the set of random number streams into mutually exclusive categories and find the probability that an independently generated set of streams would fall into a particular category. Then one would have to run replications of the simulation using each category of streams. How this could be accomplished in practice is an interesting question.
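To make the recipe above concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration, not the method from the briefing notes: the stream property used to define the strata is simply which of K equal-width intervals the first number of the stream falls into (so each stratum has probability 1/K), and `toy_simulation` is a hypothetical stand-in for a real simulation.

```python
import random
import statistics

def toy_simulation(stream):
    """Hypothetical stand-in for a real simulation: consumes a list of uniforms."""
    return sum(stream) ** 2

def draw_stream_in_stratum(rng, k, n_strata, length):
    """Draw a stream conditioned on its first number lying in stratum k.

    Assumed stratum definition: the first number falls in [k/K, (k+1)/K).
    The remaining numbers are ordinary independent uniforms.
    """
    first = (k + rng.random()) / n_strata
    rest = [rng.random() for _ in range(length - 1)]
    return [first] + rest

def stratified_simulation_estimate(n_strata, reps_per_stratum, length, seed=0):
    """Estimate the mean output as the sum over strata of p_k * (stratum mean)."""
    rng = random.Random(seed)
    p_k = 1.0 / n_strata  # probability that an independent stream lands in a stratum
    estimate = 0.0
    for k in range(n_strata):
        outputs = [
            toy_simulation(draw_stream_in_stratum(rng, k, n_strata, length))
            for _ in range(reps_per_stratum)
        ]
        estimate += p_k * statistics.fmean(outputs)
    return estimate

if __name__ == "__main__":
    print(stratified_simulation_estimate(n_strata=10, reps_per_stratum=200, length=5))
```

The hard part in practice is the one the post points out: for a more interesting stream property, both the stratum probabilities and a way to generate streams conditioned on a stratum have to be worked out. This sketch avoids the difficulty by picking a property for which both are trivial.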
 
  • #2
The only property I can think of for a stream of random numbers is the stochastic process that generates it or, in the simplest case, the probability distribution function used as the generator.

I can imagine why you might want to stratify such a stream for a simulation like this, because you might want to ensure that the range of possible conditions is generated consistently. If you didn't stratify, then there would be some chance that your stream is an unlikely one, and if there are multiple streams that go into the same simulation state, then your state space might not be covered very well over a given fixed interval of time.

To stratify streams of random numbers, my guess is you could partition the probability distribution function generating them, draw from each part a batch of numbers whose size is proportional to that part's probability, and then shuffle the batches together before throwing them into the final stream.
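A rough sketch of that guess in Python, under stated assumptions: the partition consists of equal-probability intervals of [0, 1), each part contributes the same number of draws, and an optional `inverse_cdf` argument (a hypothetical parameter) maps the uniforms to another distribution. The function name is made up for illustration.

```python
import random

def stratified_shuffled_stream(n_strata, per_stratum, seed=0, inverse_cdf=None):
    """Build a stream with exactly per_stratum numbers in each stratum.

    Strata are the equal-probability intervals [k/K, (k+1)/K) of [0, 1).
    """
    rng = random.Random(seed)
    values = []
    for k in range(n_strata):
        for _ in range(per_stratum):
            # Uniform draw restricted to stratum k.
            values.append((k + rng.random()) / n_strata)
    # Shuffle so the simulation does not consume the strata in order.
    rng.shuffle(values)
    if inverse_cdf is not None:
        # Map uniforms through the inverse CDF of the target distribution.
        values = [inverse_cdf(u) for u in values]
    return values

if __name__ == "__main__":
    stream = stratified_shuffled_stream(n_strata=4, per_stratum=3)
    print(stream)
```

One caveat: the numbers in such a stream are no longer independent of one another, because the count in each stratum is fixed in advance. Whether that matters depends on how the simulation consumes the stream.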

If you wanted to stratify the joint streams, then maybe do something similar with a joint distribution.

I think you would find lots of research papers about the topic.
 

FAQ: Stratified sampling based on properties of random number streams

What is stratified sampling based on properties of random number streams?

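Stratified sampling is a statistical sampling method that divides a population into smaller subgroups, or strata, based on specific characteristics, such as age or income. A random number stream is a sequence of random numbers generated by a computer or other device. In the setting discussed above, the strata would be defined by properties of the random number streams that drive a simulation rather than by properties of a surveyed population.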

How does stratified sampling based on properties of random number streams work?

In this method, the population is first divided into strata based on the chosen characteristics. Then, random numbers are generated for each stratum, and individuals are selected for the sample based on these numbers. This guarantees that every stratum appears in the sample and reduces the chance of drawing an unrepresentative sample.

What are the advantages of using stratified sampling based on properties of random number streams?

One of the main advantages is that it allows for a more accurate representation of the population, as each stratum is represented in the sample. It also ensures that the sample is not biased towards a particular subgroup of the population. Additionally, it can help in reducing the sample size needed for a study, making it more cost-effective.
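The sample-size claim can be illustrated with a small experiment. The sketch below is a toy comparison under assumed settings (the integrand `f`, 10 equal-width strata on [0, 1), and 10 draws per stratum are all arbitrary choices): it repeats a plain Monte Carlo estimate and a stratified estimate many times with the same total number of draws and compares the variances of the two estimators.

```python
import random
import statistics

def f(u):
    """Toy quantity of interest; its mean over uniform u on [0, 1) is 1/3."""
    return u * u

def plain_estimate(rng, n):
    """Average f over n independent uniforms."""
    return statistics.fmean(f(rng.random()) for _ in range(n))

def stratified_estimate(rng, n_strata, per_stratum):
    """Average f with per_stratum draws in each equal-width stratum."""
    total = 0.0
    for k in range(n_strata):
        draws = [f((k + rng.random()) / n_strata) for _ in range(per_stratum)]
        total += statistics.fmean(draws) / n_strata  # weight p_k = 1/K
    return total

def compare_variances(n_strata=10, per_stratum=10, repeats=200, seed=0):
    """Return (variance of plain estimator, variance of stratified estimator)."""
    rng = random.Random(seed)
    n = n_strata * per_stratum
    plain = [plain_estimate(rng, n) for _ in range(repeats)]
    strat = [stratified_estimate(rng, n_strata, per_stratum) for _ in range(repeats)]
    return statistics.variance(plain), statistics.variance(strat)

if __name__ == "__main__":
    v_plain, v_strat = compare_variances()
    print("plain:", v_plain, "stratified:", v_strat)
```

A smaller estimator variance at the same number of draws is what lets a stratified design reach a given precision with fewer samples. How large the gain is depends on how strongly the output varies between strata compared with within them.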

What are some potential limitations of stratified sampling based on properties of random number streams?

This method is harder to apply when the population cannot be cleanly divided into mutually exclusive strata. It also requires prior knowledge of the population, in particular the proportion that falls into each stratum, which may not always be available. Additionally, it can be time-consuming and complex to implement.

How can one ensure the effectiveness of stratified sampling based on properties of random number streams?

To ensure the effectiveness of this sampling method, it is important to carefully select the characteristics used to divide the population into strata. These characteristics should be relevant to the research question and should accurately represent the population. Additionally, it is important to use a large enough sample size to accurately capture the characteristics of each stratum.
