### Population, Sample Size, and Margin of Error

Statistics is about precision, not accuracy. That's what I learned from the books. I'm neither a statistician nor a mathematician. The methods I use are simple and elementary, and I try my best to understand what I'm doing. :) If you're a mathematician or a statistician, by all means, please share your thoughts if I'm doing something statistically wrong.

If statistics is about precision, I think that's why statistics uses a margin of error. (I recently learned that "margin of error" is not the same thing as "margin for error". The latter is a play, and later a film.) I wondered how precise my probability data here are; thus, I needed to know the margin of error.

#### Population and precision

If I could gather all the lottery results in the world, I would be able to present more interesting information and, perhaps, data that come close to accuracy. But I could not. So are my sample data of 2000 records good enough to produce reliable results? Will they also be useful to lotto players in other countries?

In lotto 6/42, there are more than 5 million possible combinations. If those 5 million are what statistics calls the population, how good is 2000 as a sample size? The table below shows the population (N), sample size (n), and margin of error (e) of each game's probability data.

| Lotto Game | Population (N) | Sample Size (n) | Margin of Error (e) |
|---|---|---|---|
| 6/42 | 5,245,786 | 1885 | 2.30% |
| 6/45 | 8,145,060 | 2096 | 2.18% |
| 6/49 | 13,983,816 | 1149 | 2.95% |
| 6/55 | 28,989,675 | 180 | 7.45% |

I treated the total number of possible combinations as the population and the actual lottery results as the sample. From another point of view, the actual lottery results can be treated as the population, and a smaller fraction of them as the sample. In that case, because I have included all the actual lottery results in my analysis, my sample size equals the population, and the margin of error is zero.

#### Significance

Because my data cover all the lottery results specific to my local area, my probability findings are more significant to local players than to those outside the Philippines. On the other hand, statistically, the data here can also be useful to lotto players in other countries, assuming similar factors apply, such as the way the balls are drawn. The Philippine lottery system is based on the Canadian lottery system.

If you have downloaded my LGPlus, you can build your own sample database to fit your local lotto system. To achieve a lower margin of error, however, you can keep the sample database given there. Once you have reached 2000 records, you may start overwriting the earliest lotto results.
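The idea of capping the database at 2000 records and overwriting the oldest results can be sketched in Python as a fixed-size queue. This only illustrates the idea and is not how LGPlus works internally; the names `MAX_RECORDS`, `results`, and `add_draw` are hypothetical, and the draws are placeholders rather than real results.

```python
from collections import deque

MAX_RECORDS = 2000  # the cap mentioned in the text

# A deque with maxlen drops its oldest entry automatically once it is full.
results = deque(maxlen=MAX_RECORDS)

def add_draw(draw):
    """Store one draw as a sorted tuple of its numbers."""
    results.append(tuple(sorted(draw)))

# Placeholder draws, made up purely to exercise the code (not real results).
for i in range(2005):
    start = 1 + i % 37           # keeps every number within 1..42
    add_draw(range(start, start + 6))

print(len(results))  # stays at 2000 even after 2005 insertions
```

After the 2001st draw is added, the very first draw is gone, so the database always holds the most recent 2000 results.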

#### Formula to calculate population

If you want to find out how many possible combinations can be formed in a particular lotto game where repeating numbers are not allowed and the numbers are in no particular order, use this formula:

c = n! ÷ (n-r)! ÷ r!

that is, the number of combinations equals the factorial of n divided by the factorial of n-r, divided by the factorial of r,

where:

c = the number of combinations;
n = the highest lotto number;
r = the number of objects (how many numbers are in a combination);
! = the factorial of a number.

For example, in lotto 6/45, it is calculated as c = 45! ÷ (45-6)! ÷ 6! = 8,145,060. The factorial of a number is arrived at by multiplying together all the whole numbers from that number down to 1. For example, the factorial of 6, or 6!, equals 6 x 5 x 4 x 3 x 2 x 1 = 720.
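The calculation above can be checked with a few lines of Python. This is a minimal sketch: the function name `count_combinations` is made up for illustration, and it relies only on `math.factorial` and `math.comb` from Python's standard library (`math.comb` needs Python 3.8 or later).

```python
import math

def count_combinations(n: int, r: int) -> int:
    """Number of ways to pick r distinct numbers out of n, ignoring order."""
    # c = n! ÷ (n-r)! ÷ r!  (both divisions are exact, so // is safe)
    return math.factorial(n) // math.factorial(n - r) // math.factorial(r)

# The four games discussed in this section, each drawing 6 numbers.
for highest in (42, 45, 49, 55):
    c = count_combinations(highest, 6)
    # math.comb computes the same quantity directly.
    assert c == math.comb(highest, 6)
    print(f"6/{highest}: {c:,} combinations")
```

The printed counts should match the population figures given for each game, for example 8,145,060 for lotto 6/45.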

#### Formula to calculate margin of error

The margin of error should be 10% or less. The lower the margin, the more precise the result. Based on the margins of error in the table above, my data here are close to precise.

I calculated the margin of error from this formula, which is normally used to arrive at the sample size:

n = N ÷ (1 + Ne²)

where:

N = population
n = sample size
e = margin of error
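To see how the sample size formula behaves, here is a minimal Python sketch. The function name `sample_size` and the 5% target margin are assumptions chosen only for illustration; rounding up to a whole record is also my choice, since a fraction of a lottery result makes no sense.

```python
import math

def sample_size(population: int, margin: float) -> int:
    """Sample size n = N ÷ (1 + N·e²), rounded up to a whole record."""
    n = population / (1 + population * margin ** 2)
    return math.ceil(n)

# Lotto 6/42 has 5,245,786 possible combinations.
# The 5% margin of error is an assumed target, not a figure from this article.
print(sample_size(5_245_786, 0.05))  # 400
```

Under this formula, about 400 records would be enough for a 5% margin in lotto 6/42, and a smaller target margin calls for more records.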

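By rearranging the sample size formula to solve for e (multiply both sides by the denominator, divide by n, subtract 1, divide by N, then take the square root), we arrive at the margin of error formula.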

e = √( [(N ÷ n) - 1] ÷ N )
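The margin of error formula can be tried out in Python. This is a sketch under the definitions given here: the function name `margin_of_error` and the `games` list are made up for illustration, and the N and n values are the ones from the table in this section.

```python
import math

def margin_of_error(population: int, sample: int) -> float:
    """e = sqrt(((N ÷ n) - 1) ÷ N), the sample size formula solved for e."""
    return math.sqrt((population / sample - 1) / population)

# (game, population N, sample size n) as listed in the table
games = [
    ("6/42", 5_245_786, 1885),
    ("6/45", 8_145_060, 2096),
    ("6/49", 13_983_816, 1149),
    ("6/55", 28_989_675, 180),
]

for game, N, n in games:
    print(f"{game}: {margin_of_error(N, n):.2%}")

# When the sample is the whole population (n = N), the margin is zero.
print(margin_of_error(1885, 1885))  # 0.0
```

The printed percentages should agree with the margin of error column of the table (2.30%, 2.18%, 2.95%, and 7.45%), and the last line shows why treating all the actual results as the population gives a zero margin.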