
Consider an arbitrary discrete probability distribution with sample space $\Omega$ and let $\omega\subset\Omega$. Let $n$ denote the number of independent trials of an experiment that are performed, and let $\operatorname{f}(n)$ be the number of times $\omega$ occurs during those $n$ trials.

It is my understanding that $\operatorname{P}(\omega)=\lim_{n\to\infty}\operatorname{f}(n)/n$. Is $\operatorname{f}$ essentially "pure randomness"? I mean, we can't necessarily be certain what value $\operatorname{f}$ will take when evaluated at a given $n$. I'm used to a function returning the same number every time I evaluate it at the same input, but that isn't the case here, is it? Does it even make sense, philosophically, for $\operatorname{f}$ to exist?

If "pure randomness" determines the value of $\operatorname{f}(n)$, in the sense that we can never be $100\%$ certain what value it will yield, how do we define "pure randomness"?

Since $\operatorname{f}$ is not an ordinary function like those in calculus, how do we define the convergence in $\operatorname{P}(\omega)=\lim_{n\to\infty}\operatorname{f}(n)/n$? Does an $\epsilon$-$\delta$ style of definition apply here as well? How rigorous is this definition, generally speaking?

In addition, how do we define probability for continuous probability distributions in a more rigorous way?

  • In probability theory we usually start out with a probability space $(\Omega, \mathcal{F}, P)$, with the understanding that we are not defining the randomness. Instead, we take the randomness for granted. Some outcome in $\Omega$ will occur, and we have no say/choice over which outcome it will be. But just because we can't say with certainty which outcome will occur doesn't mean we can't analyze what happens under each of the possible outcomes. We can also compute the expectation, variance, etc., of a random variable, which gives a lot of information about the likelihood of certain outcomes. (2017-01-06)
  • There would have to be an axiom to allow us to take it for granted, no? What axiom is that? Also, I don't see the aforementioned limit definition being applicable to continuous distributions, so how do we define those? In addition, who is to say that $\operatorname{f}$ will not equal $0$ forever if its values are in some sense based on "pure randomness"? We can't appeal to probability for that, because if the limit is the definition of probability, then the definition cannot rely on itself, can it? For applications this definition seems OK, but beyond that I'm not convinced. (2017-01-06)
  • I hope someone is able to answer your questions. I'm looking forward to reading the answers. (2017-01-06)
  • There is a substantial misunderstanding here: in the result that says $P(A)=\lim f(n)/n$, $f(n)$ does not refer to the number of times during $n$ trials that $A$ occurred, because $A$ does not occur or fail to occur "during $n$ trials". Instead, $A\subset\Omega$ is given once and for all, and the $n$ trials refer to something completely different. Namely, one considers some $B$ in $\mathcal E$ which is a subset of the *image space* $(E,\mathcal E)$ and i.i.d. random variables $X_k:\Omega\to E$. Then the result is that $\nu(B)=\lim f(n)/n$ ($P$-almost surely) where $f(n)$ is the size of the ... (2017-01-06)
  • ... set $\{k\leqslant n\mid X_k\in B\}$ and $\nu$ is the so-called distribution of every $X_k$, that is, a probability measure on $E$, not on $\Omega$. (2017-01-06)
  • @Did: It is possible to talk about the result of $n$ trials in the sample space by considering the product space $\Omega^\mathbb{N}$ and constructing i.i.d. random variables - see my answer. (2017-01-07)
  • @JeffreyDawson I know; why are you telling me this? (2017-01-07)
  • @Did Because your comment made it seem like you thought it only made sense to talk about $n$ trials in the image space - "$f(n)$ does not refer to the number of times during $n$ trials that $A$ occurred" - but if $f(n)$ is a random variable on $\Omega^\mathbb{N}$, then it does. (2017-01-07)

2 Answers


To be specific, toss a coin repeatedly, and let $X_n$ be the number of Heads in $n$ tosses. According to the frequentist 'definition' of probability, we say that $P(\text{Heads}) = 1/2$ (coin is 'fair') if $R_n = X_n/n$ "approaches" 1/2 with increasing $n$.

But you're right: this use of "approaches" cannot refer to a limit in the traditional mathematical sense in which the deterministic sequence $A_n = (1 + \frac{1}{n})^n$ approaches $e$. We know in advance precisely the value of the deterministic sequence $A_n$ for each $n,$ whereas we cannot know the value of $R_n$ for any $n$ without actually performing the "random" coin tosses.
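To make that contrast concrete, here is a short Python sketch (my own illustration, not part of the original answer, which uses R): $A_n$ prints the same value on every run, while $R_n$ depends on whichever tosses happen to come up.

```python
import random

rng = random.Random(42)  # fixed seed only so this particular run is reproducible

for n in (10, 1000, 100_000):
    A_n = (1 + 1 / n) ** n  # deterministic: same value on every evaluation
    heads = sum(rng.random() < 0.5 for _ in range(n))
    R_n = heads / n         # random: a fresh batch of tosses gives a fresh value
    print(n, A_n, R_n)
```

Re-running with a different seed changes every $R_n$ but no $A_n$, which is exactly the distinction being drawn above.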

One way to make the random situation rigorous is to say that $R_n$ converges 'in probability' to 1/2 with increasing $n.$ This is defined as saying that $Q_{n,\epsilon}$ converges to $1$ in the traditional mathematical sense, for any $\epsilon > 0$, where $Q_{n,\epsilon}$ is defined by $$Q_{n,\epsilon} = P(|R_n - 1/2| < \epsilon).$$
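One quick way to get a feel for this definition is to estimate $Q_{n,\epsilon}$ by simulation (a Python sketch of my own; the function name `estimate_Q` is mine, and the exact values are computed with the binomial CDF in the R code further down):

```python
import random

def estimate_Q(n, eps=0.05, reps=2000, seed=1):
    """Monte Carlo estimate of Q_{n,eps} = P(|R_n - 1/2| < eps) for a fair coin."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        if abs(heads / n - 0.5) < eps:
            hits += 1
    return hits / reps

# The estimates creep toward 1 as n grows, which is what convergence
# in probability requires for this fixed eps.
for n in (10, 100, 1000):
    print(n, estimate_Q(n))
```

Note that each printed value is itself only an estimate of a deterministic quantity: $Q_{n,\epsilon}$ has an exact value for every $n$.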

You might see such notations as "$\text{plim}\, R_n = 1/2$" or "$R_n \stackrel{prob}{\rightarrow} 1/2.$"

Once you have learned a bit more probability, you will recognize that $X_n$ is a 'binomial random variable' with $n$ independent trials and 'Success probability' 1/2 at each trial. Then for any given positive $\epsilon,$ the quantity $Q_{n,\epsilon}$ has a known value in advance. This limiting relationship is a special case of the '(Weak) Law of Large Numbers', which I suppose you will study in due course.

To illustrate, below is a graph of the actual values of $Q_{n,\epsilon},$ for $n = 1, 2, \dots, 800$ and $\epsilon = 0.05,$ made using R statistical software.

# Exact Q_{n,eps} = P(|X_n/n - 1/2| < eps) from the binomial CDF
n = 1:800;  eps = .05
# the small offset makes the lower endpoint fall inside the subtracted term,
# so X = n*(.5-eps) is excluded, matching the strict inequality
Q = pbinom(n*(.5+eps), n, .5) - pbinom(n*(.5-eps)+.00001, n, .5)
plot(n, Q, type="l");  abline(h=1, col="darkgreen")

[Figure: plot of $Q_{n,\epsilon}$ against $n$ for $\epsilon = 0.05$, approaching the reference line at $1$]

Note: You do not mention the mathematical level of your course. My explanation is intuitive and I hope appropriate for the beginning of an undergraduate post-calculus course in probability. If you are studying measure theoretic probability, then please refer to @Did's more rigorous and elegant approach.


Here is a more rigorous measure-theoretic treatment of the problem.

Consider the space $\Omega^\mathbb{N}$ equipped with the product measure. You can think of this as the space of infinite sequences of outcomes in $\Omega$, i.e. the result of performing a random experiment infinitely many times. Now define $X_n$ to be the indicator function of $\pi_n^{-1}(A)$, where $\pi_n$ is the function mapping a point to its $n$th coordinate; thus $X_n$ is $1$ whenever the $n$th coordinate lies in $A$, and $0$ otherwise. Then the random variable $Y_n = \sum^n_{i=1} X_i$ represents the number of times $A$ occurs in $n$ trials.

Note that $Y_n$ is not a number, but rather a function from $\Omega^\mathbb{N}$ to $\mathbb{R}$. (This is why your notation $f(n)$ is misleading: it suggests that $f$ depends only on $n$, when in fact it is a function of the sample space as well.) Thus, for each point of $\Omega^\mathbb{N}$, we can regard $\frac{Y_n}{n}$ as a sequence of real numbers and ask for its limit, if it converges at all.

A special case of the Strong Law of Large Numbers then states that $\frac{Y_n}{n} \rightarrow P(A)$ almost surely (the set of points where the sequence does not converge to $P(A)$ has probability $0$). The law applies because the $X_n$ are independent and identically distributed, which should be intuitively clear since each $X_n$ depends only on the $n$th coordinate, and the coordinates are independent of one another.
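To see this almost-sure convergence numerically, here is a Python sketch (my own illustration, not from the original answer) that simulates a finite prefix of one sample point of $\Omega^\mathbb{N}$ and tracks the running frequency $Y_n/n$ for a hypothetical event $A$ with $P(A) = 1/6$:

```python
import random

rng = random.Random(0)   # one simulated point of Omega^N (a finite prefix of it)
p_A = 1 / 6              # hypothetical event A, e.g. "a fair die shows 6"
Y = 0
for n in range(1, 100_001):
    X_n = 1 if rng.random() < p_A else 0  # indicator that trial n lands in A
    Y += X_n                              # Y_n counts occurrences of A so far
    if n in (100, 10_000, 100_000):
        print(n, Y / n)                   # running frequency, drifting toward 1/6
```

Each run of this program evaluates $Y_n/n$ along a single sequence of trials; the SLLN says that the set of sequences along which it fails to converge to $1/6$ has probability zero.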

Sidenote: you'll notice I used $A$ instead of $\omega$ to denote the set. This is because in probability theory, capital letters like $A$, $B$, etc. are typically used to denote subsets of the sample space, whereas $\omega$ denotes an element of the sample space.