First, let me describe in more detail the approach to "reordering" the digits of Pi that is used in OEIS A096566,
and what I have done so far in analyzing it.
I am looking at the first 620 "reordered" digits of Pi in decimal representation (more than are currently listed in A096566). The reordering is done as follows. While reading the consecutive stream of digits, the first occurrence of each of the 10 different decimal digits
1,2,...,7,8,9,0
is "collected", while every "repeating" digit of each kind (1,2,...,7,8,9,0) is "pushed back" to be written later.
Then the second set of ten unique digits is collected (that is, written): each digit is looked for first in the already "pushed back" group and then among the incoming digits (again with all "repeating" digits being "pushed back"), until the entire second set of all unique digits (1,2,...,7,8,9,0) is completely collected. Later sets are collected in the same way.
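As a rough illustration (not the program actually used to produce the sets in this post), the collection procedure can be sketched in Python. The function name `reorder_into_sets` and the hard-coded string holding the first 51 digits of Pi are assumptions made only for this example. The sketch uses the fact that the k-th set is simply the digits written in the order of their k-th occurrence, which is equivalent to the "push back" description.

```python
from collections import defaultdict

# First 51 decimal digits of Pi (the leading 3 plus 50 decimals); illustrative input only.
PI_DIGITS = "314159265358979323846264338327950288419716939937510"

def reorder_into_sets(digits):
    """Return the complete 10-digit sets obtainable from a string of digits.

    The k-th occurrence of each digit goes into the k-th set, in reading order.
    Sets that are still incomplete when the input ends are dropped.
    """
    seen = defaultdict(int)    # how many times each digit has occurred so far
    sets = defaultdict(list)   # k -> digits in order of their k-th occurrence
    for ch in digits:
        d = int(ch)
        seen[d] += 1
        sets[seen[d]].append(d)
    # A set is complete only once every digit 0..9 has occurred that many times.
    complete = min(seen[d] for d in range(10))
    return [sets[k] for k in range(1, complete + 1)]

result = reorder_into_sets(PI_DIGITS)
for s in result:
    print("{" + ",".join(str(d) for d in s) + "}")
```

With these 51 digits only two sets can be completed, because the digit 0 occurs just twice; a longer digit string is needed to obtain more sets.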
In total I got 62 such sets (covering 62*10 decimal digits). All 62 sets are listed below, where each {...} group represents one such collection of 10 digits.
{3,1,4,5,9,2,6,8,7,0} {1,5,3,9,2,8,4,6,7,0} {5,9,3,2,6,4,8,1,7,0} {3,2,9,5,8,4,1,6,7,0} {3,2,8,9,5,1,7,4,6,0} {3,9,5,8,2,4,7,1,6,0} {3,9,4,5,2,8,6,0,1,7} {3,9,4,2,8,6,5,1,0,7} {3,9,2,8,4,6,1,0,5,7} {9,3,8,2,4,6,1,0,5,7} {9,8,3,2,4,6,0,5,1,7} {9,8,3,2,6,4,0,5,1,7} {9,8,2,3,4,6,0,5,1,7} {9,8,2,3,4,5,0,1,6,7} {8,2,9,3,4,1,5,0,6,7} {8,9,2,3,4,1,0,5,6,7} {8,2,3,9,4,0,1,5,6,7} {8,2,4,9,3,1,0,5,6,7} {8,2,1,5,9,4,3,0,6,7} {8,2,4,9,5,3,1,0,6,7} {8,2,4,9,1,5,3,6,0,7} {8,2,4,9,5,3,1,6,0,7} {8,2,9,4,5,3,1,6,0,7} {2,8,4,9,3,5,1,6,0,7} {8,2,9,4,3,1,5,6,0,7} {8,4,2,9,1,5,3,6,0,7} {8,4,2,9,3,6,1,5,0,7} {8,4,2,9,3,6,1,5,0,7} {8,2,4,6,3,9,1,0,5,7} {8,4,2,3,6,9,1,5,0,7} {8,4,2,3,6,1,5,9,0,7} {8,4,2,3,6,1,9,5,0,7} {8,4,2,6,3,1,9,5,0,7} {4,8,2,6,9,1,3,5,0,7} {4,2,8,6,1,3,9,0,5,7} {4,2,8,6,9,3,1,0,5,7} {4,8,2,6,1,3,0,5,9,7} {4,8,2,6,3,0,1,5,9,7} {4,8,2,3,6,1,5,0,9,7} {8,2,4,6,3,1,0,5,9,7} {2,4,8,6,1,3,0,5,9,7} {2,4,8,1,6,3,5,9,0,7} {8,2,4,1,3,6,5,9,0,7} {2,8,4,1,5,3,6,9,0,7} {4,2,1,8,3,6,9,5,0,7} {4,2,1,8,3,5,6,9,0,7} {4,1,2,3,8,9,6,5,0,7} {1,4,8,2,3,9,6,5,0,7} {1,4,2,3,8,5,9,6,0,7} {1,4,8,5,2,9,3,6,0,7} {1,4,2,8,3,9,6,5,0,7} {1,4,2,8,9,3,6,5,0,7} {1,2,8,4,9,3,6,0,5,7} {1,2,8,3,4,6,9,5,0,7} {1,3,2,4,8,9,6,5,0,7} {1,4,3,2,9,8,6,0,5,7} {1,3,4,2,9,6,8,0,5,7} {1,4,3,2,9,6,8,5,0,7} {1,4,3,2,9,6,8,5,0,7} {1,4,3,2,9,6,8,5,7,0} {1,3,2,9,4,6,8,7,0,5} {1,2,3,4,6,9,8,7,0,5}
Here are the arithmetic averages I obtained for each of the ten positions across those 62 sets:
first position digits average 4.693548
second position digits average 4.306452
third position digits average 4
fourth position digits average 5.048387097
fifth position digits average 4.951612903
sixth position digits average 4.548387097
seventh position digits average 4.161290323
eighth position digits average 4.274193548
ninth position digits average 2.870967742
tenth position digits average 6.14516129
These results show that, for the first eight positions, the digit in each position varies from set to set more and more randomly, so the averages for those positions are approaching 4.5.
For positions 9 and 10, however, such randomization has not yet been achieved within the first 62 sets. Relying on the known observation that the average digit value in the decimal expansion of Pi eventually comes to practically 4.5, I could "speculate" that the averages for positions 9 and 10 will eventually come to 4.5 too. But it looks as if positions 9 and 10 are "randomizing" at a much slower rate than the other 8 positions, and that might (or might not) be interesting.
I am not sure how many more sets (beyond the 62 presented here) are needed for the averages of the ninth and tenth positions to come as close to 4.5 as the first 8 positions already do within those 62 sets.
It is also notable that if positions 9 and 10 are averaged together, the combined average over the 62 sets available so far is close to 4.5 (about 4.51).
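For anyone who wants to recompute the per-position averages, here is a minimal Python sketch. The helper names `parse_sets` and `position_averages` are hypothetical, and the sample string contains only the first three sets from the list; the full list of {...} groups can be pasted in its place. Since every set is a permutation of the digits 0 to 9 (digit sum 45), the ten averages must always add up to exactly 45, which gives a quick sanity check on any table of averages.

```python
import re

def parse_sets(text):
    """Parse groups written as {3,1,4,...} into lists of ints."""
    return [[int(x) for x in body.split(",")]
            for body in re.findall(r"\{([^}]*)\}", text)]

def position_averages(sets):
    """Average digit value at each of the ten positions across all sets."""
    n = len(sets)
    return [sum(s[i] for s in sets) / n for i in range(10)]

# Only the first three sets, for illustration; paste the full list here instead.
sample = "{3,1,4,5,9,2,6,8,7,0} {1,5,3,9,2,8,4,6,7,0} {5,9,3,2,6,4,8,1,7,0}"

sets = parse_sets(sample)
averages = position_averages(sets)
for pos, avg in enumerate(averages, start=1):
    print(f"position {pos:2d}: {avg:.6f}")

# Sanity check: the ten averages must sum to 45.
print("sum of averages:", round(sum(averages), 6))
```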
Conclusion and questions
These first 62 sets appear to retain a slight hint of loose organizational order between predecessor sets and successor sets.
But I presume that further "down the road", beyond the first 62 sets, the level of randomness in the order of digits within the sets gradually increases, and adjacent sets become more and more disconnected from each other.
What I am trying to say is that, for the digits of Pi after the reordering discussed above, there appears to be some sort of transition from initial order (within the first 620 digits) to total randomness ...
I used the Maximal Information-based Nonparametric Exploration (MINE) statistical analysis program by David Reshef et al.
Applied (by me) pairwise to the first 62 sets, MINE shows high values (up to 1) of the maximal information coefficient (MIC), a measure of two-variable dependence designed specifically for rapid exploration of many-dimensional data sets.
Below are the links (thanks to LVK for the upload) to the data (the 62 sets of reordered Pi digits, from an Excel spreadsheet converted to a comma-separated .csv file) and to the .csv output file generated by MINE. The output was produced on my home Windows PC by executing
java -jar MINE.jar PiReordered.csv -allPairs cv=1.0
The two files are, respectively,
https://dl.dropbox.com/u/29863189/PiReordered.csv
and
https://dl.dropbox.com/u/29863189/PiReordered.csv%2Callpairs%2Ccv%3D1.0%2CB%3Dn%5E0.6%2CResults.csv
Does such a concept of transition from order to randomness exist?
Could the above observation be statistically confirmed or disproved?
If "yes", what specific tools / methods of statistical analysis could be applied?
I also received a suggestion that, in order to test whether the feature discussed above is characteristic only of (some initial digits of) Pi, the same reordering should be applied to a significant number of randomly generated, very long strings of decimal digits, to see whether the same pattern appears there. Is that useful?
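The suggested control experiment could be sketched roughly as follows. This is only an assumption about how such a test might be set up: the number of trials (200), the seed and all function names are arbitrary choices for illustration, and Python's `random` module stands in for "randomly generated" digits. The sketch reorders each random digit stream into 62 sets and averages the per-position averages over all trials, so that they can be compared with the figures obtained for Pi.

```python
import random

def reorder_into_sets(digits, n_sets):
    """Collect the first n_sets complete sets from an iterable of digits (ints)."""
    counts = [0] * 10                      # occurrences of each digit so far
    sets = [[] for _ in range(n_sets)]
    for d in digits:
        if counts[d] < n_sets:
            sets[counts[d]].append(d)      # k-th occurrence goes into the k-th set
        counts[d] += 1
        if all(c >= n_sets for c in counts):
            return sets
    raise ValueError("not enough digits to complete the requested sets")

def random_digits(rng):
    """Endless stream of uniformly random decimal digits."""
    while True:
        yield rng.randrange(10)

def position_averages(sets):
    n = len(sets)
    return [sum(s[i] for s in sets) / n for i in range(10)]

rng = random.Random(12345)   # fixed seed so the run is repeatable
trials, n_sets = 200, 62     # arbitrary illustrative values
totals = [0.0] * 10
for _ in range(trials):
    sets = reorder_into_sets(random_digits(rng), n_sets)
    for i, avg in enumerate(position_averages(sets)):
        totals[i] += avg

control = [t / trials for t in totals]
for pos, avg in enumerate(control, start=1):
    print(f"position {pos:2d}: mean of averages over {trials} random streams = {avg:.4f}")
```

If the random streams show the same behaviour in positions 9 and 10 as Pi does, the effect would be a property of the reordering rather than of Pi.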
Thanks,
Best Regards,
Alexander R. Povolotsky
PS - in response to LVK's answer and his comment, which I am quoting here: "The nth line of your table consists of the digits written in the order of their nth appearance in π. This could be in principle read off the graph by crossing it with the horizontal line y=n and reading off the intersection points from left to right. (In practice this is not convenient due to the low resolution and overlap between the curves.) ..... I don't think there is any statistical method for analysis of the data organized in this way. You'll probably need to devise one yourself. ..... – LVK Sep 10 at 16:02"
LVK - thanks for your thoughts and valuable contribution! I think, though, that the frequency chart somewhat hides the positional dependency between the unique digits of the {1,2,...,9,0} set. The table presentation, with the columns representing the particular combination of (all) digits (from 1 to 0 in the above-mentioned set) for each consecutive ten-digit collection, is, in my opinion, more revealing in that regard.
My questions still remain:
1) Is there some "other" (I would call it "transitional") non-randomness in the first few hundred digits of Pi (beyond the order imposed by the re-arrangement itself) that is revealed by this re-arrangement?
2) What quantitative statistical methods/tools (other than MINE) could be used to analyze this situation?
PPS - I am trying to rework the first 3 columns of the already posted MINE results csv file (where the first two columns are "textually enumerated" names of the 10-digit sets, for example "18thSet", and the 3rd column is the MIC value for the two sets identified in the first two columns of the same row) into a three-dimensional "surface" chart, with those three columns as the x, y and z values respectively ...
Doing it manually by converting to a table (keeping the first column intact, transposing the second column into the topmost row and filling the table's body with the MIC values from the 3rd column) is very laborious.
I found a discussion at
https://stackoverflow.com/questions/7083044/mathematica-csv-to-multidimensional-charts
of how to do it with Mathematica, but I don't have it ...
Could someone (who has Mathematica) be kind enough to do it (and post the result)?
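Short of a full chart, the pivot step itself (first column as row labels, second column as column labels, MIC values in the body) could be sketched in Python with the standard `csv` module. This is only a sketch under assumptions: that the results file has one header row and that the first three columns are as described above. The function name `pivot_mic`, the header text and the MIC values in the sample are made up for illustration and are not taken from the actual MINE output.

```python
import csv
import io

def pivot_mic(csv_text):
    """Turn (set name, set name, MIC) rows into a symmetric table of MIC values."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)                      # skip the header row (assumed to be present)
    names, pairs = [], {}
    for row in reader:
        x, y, mic = row[0], row[1], float(row[2])
        for name in (x, y):
            if name not in names:
                names.append(name)    # keep names in order of first appearance
        pairs[(x, y)] = mic
        pairs[(y, x)] = mic           # MIC of a pair does not depend on its order
    # Cells with no value in the file (e.g. the diagonal) are left as None.
    table = [[pairs.get((r, c)) for c in names] for r in names]
    return names, table

# Made-up sample rows in the assumed layout; NOT real MINE results.
sample = """X var,Y var,MIC
1stSet,2ndSet,0.5
1stSet,3rdSet,0.25
2ndSet,3rdSet,0.75
"""

names, table = pivot_mic(sample)
print(",".join([""] + names))
for name, row in zip(names, table):
    print(",".join([name] + ["" if v is None else str(v) for v in row]))
```

The printed table is itself comma-separated, so it could be saved as a .csv file and opened in a spreadsheet program for charting.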