I am a non-math person and have a question about randomness: when generating elements from a finite set with some algorithm, it is clear what it means to say the elements are randomly generated. What about infinite sets? For example, if I want to generate real numbers randomly, what does it mean for them to be random? In general, how is randomness defined in Euclidean n-space? And for subsets of n-space, e.g. when generating random points on the (n-1) unit sphere? Thanks.
Meaning of randomness in space
-
To answer your question, "how is random defined in Euclidean n-space?", you might want to check out algorithmic randomness. – 2011-06-19
3 Answers
There are two aspects: First, you need to specify a probability distribution. This is a function that says, for each subset (ignoring questions of measurability for the moment), how probable it is that a random element lies in this set.
Example: You have a finite set $ A = \{1, 2, 3, 4, 5, 6\} $. A probability distribution $p$ would be $ p(x) = \frac{1}{6} $ for $x = 1, \ldots, 6$. Note that the sum over all probabilities needs to be $1$.
For an infinite set like the interval $[0, 1]$, a probability distribution would be a function that has an integral of $1$: $ \int_{0}^1 p(x) \, d x = 1 $. The simplest possible example is the uniform distribution, the constant function $ p(x) = 1 $. (This time I implicitly assumed that we are talking about probability distributions that have a Radon-Nikodym density with respect to the Lebesgue measure.)
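To make this concrete, here is a minimal Python sketch (my own addition, not part of the answer): for the uniform density $p(x) = 1$ on $[0, 1]$, the probability of a random element landing in a sub-interval $[a, b]$ is simply its length $b - a$, and we can check this empirically.

```python
import random

random.seed(0)  # fixed seed just to make the example reproducible

# Draw many samples from the uniform distribution on [0, 1].
samples = [random.random() for _ in range(100_000)]

# For the uniform density p(x) = 1, the probability of a sub-interval
# [a, b] is its integral over that interval, i.e. its length b - a.
a, b = 0.2, 0.5
empirical = sum(a <= x < b for x in samples) / len(samples)
print(empirical)  # close to b - a = 0.3
```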
On "infinite" sets like the whole real line (or $\mathbb{R}^n$), there is no uniform distribution, because its integral could not be $1$ anymore, but there are a lot of other interesting distributions.
The second important aspect is independence: You need to specify whether you know anything about the next random element if I tell you what happened before. If what happens next, i.e. the next random element, does not depend on what happened before, one says that the random elements are independent. But there are a lot of interesting situations where this is not so. When you play in a casino with a fixed budget, for example, you cannot play anymore if you go bankrupt. In this case "what happens next" does depend on the events that happened before.
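The casino example can be sketched in a few lines of Python (a hypothetical simulation of my own, assuming fair 1-unit bets): whether you can play the next round depends on all the previous outcomes, so the sequence of plays is not independent of the history.

```python
import random

random.seed(1)  # fixed seed just for reproducibility

def play_until_broke(budget, rounds):
    """Fair coin bets of 1 unit each. Once the budget hits 0 we must stop:
    whether the next round happens at all depends on the past outcomes,
    so the plays are NOT independent of the history."""
    history = []
    for _ in range(rounds):
        if budget == 0:  # bankrupt: "what happens next" depends on the past
            break
        win = random.random() < 0.5
        budget += 1 if win else -1
        history.append(win)
    return budget, history

final_budget, history = play_until_broke(budget=5, rounds=1000)
print(final_budget, len(history))
```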
To pick up one of your examples: The unit sphere in $\mathbb{R}^n$ has finite surface (Hausdorff) measure, so it is possible to define the uniform probability distribution on it. And let's also say that we would like to generate elements that are independent. In the one-dimensional case, i.e. the unit circle, we could generate independent uniformly distributed elements of the unit interval $ x \in [0, 1]$ and calculate $ e^{2 \pi i x} = \cos(2 \pi x) + i \sin(2 \pi x) $, which will result in numbers that are uniformly distributed on the circle. (In case you don't know about complex numbers, you can write the latter as $(\cos(2 \pi x), \sin(2 \pi x))$, in cartesian coordinates in $\mathbb{R}^2$.)
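A short Python sketch of both constructions (my addition; the circle uses the uniform-angle map from the answer, and for the general $(n-1)$-sphere I use the standard trick of normalizing a vector of independent Gaussians, which works because the joint Gaussian density is rotation-invariant):

```python
import math
import random

random.seed(2)  # fixed seed just for reproducibility

def random_on_circle():
    # A uniform angle in [0, 2*pi) maps to a uniform point on the unit circle.
    theta = 2.0 * math.pi * random.random()
    return (math.cos(theta), math.sin(theta))

def random_on_sphere(n):
    """Uniform point on the unit (n-1)-sphere in R^n: draw n independent
    standard Gaussians and normalize. This is uniform because the joint
    Gaussian density depends only on the distance from the origin."""
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

p = random_on_circle()
q = random_on_sphere(5)
print(p)  # a point with p[0]**2 + p[1]**2 == 1 (up to rounding)
print(q)  # a unit vector in R^5
```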
-
Please use $\backslash\!\cos$ and $\backslash\!\sin$. – 2011-06-19
There are quite a few different conceptions of randomness, but generally the term alludes to the samples being independent, free from after-effect, and uniformly distributed, with the data or relative frequencies being homogeneous across each sub-collective/sub-interval/sub-sequence of the collective or sequence in question.
In the infinite case (particularly the uncountably infinite case), one often speaks about intervals, and says that the probability density is uniform; that is, the probability of selecting an 'arbitrary' outcome within any sub-interval is proportional to that sub-interval's length, as given by the Lebesgue measure. One generally does not speak of the individual discrete outcomes/real numbers being equi-likely (one cannot exactly add them up), but that is often how it is interpreted. For countably infinite sets, one generally speaks about sequences and relative frequencies: asymptotic limits of frequencies, in sub-sequences already produced, being homogeneous with regard to some finite attribute, as opposed to a uniform probability distribution/density (which is generally not possible).
Often one uses certain low-discrepancy, well-distributed (equidistributed) sequences which approximate what we naturally conceive of as the natural counting or natural density measure of the reals in that interval, or what many would consider their intuitive (albeit not quite correct) understanding of the Lebesgue measure. https://en.wikipedia.org/wiki/Natural_density
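A classic example of such a low-discrepancy sequence is the van der Corput sequence (my own illustration; the answer does not name a specific one): it fills $[0, 1)$ very evenly by reflecting the base-$b$ digits of the index about the radix point.

```python
def van_der_corput(i, base=2):
    """i-th element of the base-b van der Corput sequence: reverse the
    base-b digits of i and place them after the radix point. A standard
    low-discrepancy sequence that fills [0, 1) very evenly."""
    x, denom = 0.0, 1.0
    while i > 0:
        denom *= base
        i, digit = divmod(i, base)
        x += digit / denom
    return x

seq = [van_der_corput(i) for i in range(1, 17)]
print(sorted(seq))

# The points spread evenly: each half of [0, 1) receives half the points.
left = sum(x < 0.5 for x in seq)
print(left, len(seq) - left)  # 8 8
```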
I am not sure what one does in the case of the entire real line, except through finer and finer, longer and longer intervals; and it is not as if one could test it via a countably long sequence of trials.
Unless one is speaking about a particular finite attribute, such as the numbers that have a 1 somewhere in their decimal expansion, or some such; such a set is co-finite with respect to the rationals, but each sub-interval could not be equally likely even when there are only countably many. There are different non-standard approaches, such as the Haar measure and non-standard/p-adic analysis, though some scholars argue the latter still has issues with infinitesimals on uncountable sample spaces, as opposed to countably infinite sets of events.
One often has to distinguish between the types of randomness as well:
- process randomness and sample data on the one hand,
- and statistical randomness on the other,
and one can have either one of the two without the other.
The first (1) speaks roughly about the homogeneity/uniformity or independence of the chances/probabilities that produce the data, in an indeterministic or chaotic system,
and the other (2) about the data or relative frequencies themselves: that is, the degree to which the relative frequency of the attribute in question is equal among all sub-sequences, and equal to the relative frequency in the entire collective.
When using deterministic algorithms, the two can often merge into one and the same thing (technically the process is not process-random at all), but there is presumably some way (presumably not universally agreed upon, nonetheless) of quantifying how 'indeterministically IID-like' the mechanism is.
Perhaps through an algorithmic-complexity approach, without having to look at the statistical data it would produce (not directly, at least).
Otherwise, a lot of the simpler tests make use of the statistical data the algorithm produces, and test it via the second form of randomness (2).
There are many distinct tests of randomness, developed by Kolmogorov, Chaitin, Martin-Löf and others, and there are also at least two types of randomness, as noted above. Often a rough test of statistical randomness is that a great many of the sub-sequences have the same relative frequency of the attribute in question; that is, on first appearance they are free from after-effect, so that a smart gambler could not devise an algorithm to exploit discrepancies in the sequence of outcomes (in a casino, for example). That is, pattern-less.
HTHTHTHT, for example, would not qualify as a very random sequence in the statistical sense (or in the process sense, if the algorithm were deterministically designed to produce that sequence). The relative frequency of 'heads' among the even-indexed positions is 1, and 0 among the odd-indexed ones, and both are distinct from the overall relative frequency of 'heads' in the entire collective, 0.5. So someone could bet on heads at just the even-indexed tosses and win every time.
A random sequence should be such that the sub-sequences mirror each other, and the entire collective as a whole, as much as possible. So in this case they should have relative frequency 1/2 for heads, to avoid being exploitable by simple place-selection rules.
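The HTHTHTHT example above can be checked mechanically with a toy place-selection test (my own sketch, with 0-based indexing so the even positions are the H's): compare the relative frequency of heads in a selected subsequence against the overall frequency.

```python
def freq(seq, positions=None):
    """Relative frequency of 'H' in seq, optionally restricted to the
    subsequence picked out by a place-selection rule (a set of indices)."""
    picked = list(seq) if positions is None else [seq[i] for i in positions]
    return picked.count('H') / len(picked)

seq = "HTHTHTHT"

overall = freq(seq)                          # 0.5 overall
evens = freq(seq, range(0, len(seq), 2))     # positions 0,2,4,6: all H -> 1.0
odds = freq(seq, range(1, len(seq), 2))      # positions 1,3,5,7: all T -> 0.0

print(overall, evens, odds)
# The even-indexed subsequence has frequency 1.0, far from the overall 0.5:
# a gambler betting heads at those positions wins every time, so the
# sequence fails this place-selection test of statistical randomness.
```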
Generally this is important in cryptography, and not just in gambling situations.