Randomness
For many people, "What is random?" is a difficult question to answer. The term random indicates the lack of a repeatable, consistent, and testable pattern. A random selection requires that there is no bias and that every possible location has a known probability of being selected.
- This means that throwing a stick in the woods is not random.
- Coin flips are considered random, yet they yield only a binary outcome (0 or 1).
- Selecting paper from a bowl or hat is marginally random, but bias can exist from poor mixing.
Why is this important in a class on sampling? The type of sampling most recommended by statisticians is random sampling. In a statistics class, we assume we have a list of all individuals in a population. A random sample selects individuals from the population without pattern or bias, and every individual has a known probability of being sampled. In this case the probability of being sampled is equal for all individuals and depends on the population size and the sample size. We will discuss this further in the topic on sampling probabilities.
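As a minimal sketch of this kind of sampling in Python, assume a hypothetical population of 20 numbered individuals and a sample size of 5 (both values are illustrative and not taken from the course). The standard library's `random.sample` draws without replacement, so each individual has the same chance of being included, namely the sample size divided by the population size:

```python
import random

# Hypothetical population: individuals labeled 1..20 (illustrative only).
population = list(range(1, 21))
sample_size = 5

# An unseeded generator gives a different sample on each run.
rng = random.Random()

# random.sample draws without replacement, with no pattern or preference.
sample = rng.sample(population, sample_size)

# Probability that any one individual ends up in the sample: n / N.
inclusion_probability = sample_size / len(population)

print("sample:", sorted(sample))
print("inclusion probability:", inclusion_probability)
```

With these assumed numbers the inclusion probability is 5 / 20 = 0.25 for every individual.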
So where do we get random numbers for our use? This question has several answers, each with its own issues. Let's look at some possibilities:
- Natural Sources
  - There are many natural sources of random sequences. Here are a few:
    - Radioactivity - If you measure the time between clicks on a Geiger counter, the spacing is random.
    - Radio background noise - If you measure the intensity of the background noise on a radio tuned to a frequency with no signal, the variation in that noise is random. This is used by the SETI (Search for Extraterrestrial Intelligence, www.seti.org) project to look for life elsewhere in the Universe. They search radio bands for non-random patterns and then attempt to identify the source location. Not all non-random signals come from intelligent life; a pulsar, for example, produces one. But most interstellar radio bands carry simply random noise.
- Published sources
  - In the past, people printed random numbers in books. These were quite useful when you needed only a few numbers, and before the advent of small computers. Remember that portable computers have only been widely available since the 1970s. The first electronic computers were invented in the 1940s.
- Computer Programs
  - Computer-generated random numbers are convenient and suitable for most uses, but they have a problem: they are not completely random. Random number programs generally work from a series of formulas that operate on the fractional remainders of a calculation. After several steps of this procedure the numbers appear random; however, since computers work with finite mathematics, at some interval the sequence will repeat.
For most current generators the repeat interval is at least several million numbers, and often far longer. For our purposes, this is acceptable. For simulation models, however, we require much better random number generators.
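The repetition described above can be seen with a toy generator. The sketch below uses a linear congruential generator, one classic family of remainder-based formulas; it is not claimed to be the formula used by any particular program. The function names and the constants `a=5`, `c=3`, `m=16` are assumptions chosen only so that the cycle is short enough to print.

```python
from itertools import islice

def lcg(seed, a=5, c=3, m=16):
    """Yield an endless sequence using x -> (a*x + c) mod m (toy constants)."""
    x = seed
    while True:
        x = (a * x + c) % m   # keep only the remainder after dividing by m
        yield x

def period(seed, a=5, c=3, m=16):
    """Count how many steps pass before the generator revisits a value."""
    seen = {}
    for step, value in enumerate(lcg(seed, a, c, m)):
        if value in seen:
            return step - seen[value]
        seen[value] = step

# Print 20 values; with m=16 there are only 16 possible values,
# so the sequence must start repeating within the first 17 outputs.
first_values = list(islice(lcg(1), 20))
print(first_values)
print("repeat interval:", period(1))
```

A real generator uses the same idea with a far larger modulus, which is why its repeat interval is so much longer.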
Generators that work this way are called pseudo-random number generators. Another aspect of computer pseudo-random number generators is that they generally require a seed. For any specific seed you will get exactly the same sequence of random numbers. Generally the seed is set from a time stamp or some type of entropy collector.
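A short sketch of the seed behavior, using Python's standard `random` module. The seed value 12345 and the variable names are arbitrary choices for illustration:

```python
import random
import time

# Two generators given the same seed produce exactly the same sequence.
gen_a = random.Random(12345)   # 12345 is an arbitrary illustrative seed
gen_b = random.Random(12345)
seq_a = [gen_a.randint(1, 100) for _ in range(5)]
seq_b = [gen_b.randint(1, 100) for _ in range(5)]
print(seq_a == seq_b)          # True

# Seeding from a time stamp: the seed, and so the sequence,
# differs from one run of the program to the next.
time_seed = time.time_ns()
gen_c = random.Random(time_seed)
seq_c = [gen_c.randint(1, 100) for _ in range(5)]
print(time_seed, seq_c)
```

Recording the seed is what makes a random draw reproducible later, for example when someone needs to check how a sample was selected.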
A common visual way of testing a pseudo-random number generator is to create two random number sequences and plot one on the x-axis and the other on the y-axis. Any diagonal alignment of the points indicates a problem with the number generator.
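As a rough numeric stand-in for that plot, the sketch below computes the correlation between two sequences instead of drawing them. This is an assumption-laden simplification: correlation only detects straight-line (diagonal) alignment, not every kind of structure a plot could reveal. The helper name `pearson_r`, the seeds 1 and 2, and the 10,000 points are all illustrative choices.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation: near 0 for unrelated data, near +/-1 for a diagonal."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Two sequences from separately seeded generators (seeds are arbitrary).
gen_x = random.Random(1)
gen_y = random.Random(2)
xs = [gen_x.random() for _ in range(10_000)]
ys = [gen_y.random() for _ in range(10_000)]

# Unrelated sequences should show a correlation close to zero.
r_independent = pearson_r(xs, ys)

# Worst case for comparison: a sequence plotted against itself lies
# exactly on the diagonal, so the correlation is essentially 1.
r_identical = pearson_r(xs, xs)

print("independent sequences:", round(r_independent, 3))
print("identical sequences:  ", round(r_identical, 3))
```

A value far from zero for the two independent sequences would correspond to the diagonal alignment that signals a faulty generator.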
Because humans are very good at pattern recognition, we tend to see patterns where none exist. There is a logical reason for this: if you are a hunter or are being hunted, there is a greater penalty for missing a real pattern than for seeing a pattern that is not there.
The outcome of this is that the patterns we see in random numbers look like patterns but are not repeatable or consistently detectable. The exercise in the video will demonstrate this idea.