
I have two arrays (data sets) of 100 elements each, and every value is either 0 or 1. What test or procedure would measure the correlation, or test the independence, of the two sets?

As an example of the data, suppose one recorded whether or not it rained in two different cities over a period of 100 days: 0 for no rain that day, 1 for any rain that day. After 100 days, CityA would have data of {0,1,0,0,1, ...} with 100 data points, and CityB would have data of {0,1,1,0,0, ...}, also with 100 data points. The question is whether the rain in the two cities is positively correlated, negatively correlated, or independent.

If it matters, both data sets are time series, or at least their order is fixed, and the events are spread more or less randomly over the 100 days.

One measure I found is the phi coefficient. However, the example in my textbook says it is for two dichotomous values that a single subject may have, e.g. smoking and cancer, male/female and pets, married and employed. I am not sure whether it applies to my case, where the data are two time series.
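
For concreteness, here is a minimal sketch of how the phi coefficient could be computed for this kind of data, by treating each day as the "subject" and the two cities as its two dichotomous values. The function name `phi_coefficient` and the 10-day sample are hypothetical, made up for illustration only. The last line uses the identity that, for a 2x2 table, the chi-square statistic equals n times phi squared; comparing it with the usual chi-square distribution (1 degree of freedom) assumes the days are independent of each other, which may not hold for a time series.

```python
from math import sqrt

def phi_coefficient(a, b):
    """Phi coefficient of two equal-length 0/1 sequences (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # Cells of the 2x2 contingency table, pairing the two cities day by day.
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    # Product of the four marginal totals (rows for a, columns for b).
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when either sequence is constant")
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical 10-day sample, not real data.
city_a = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
city_b = [0, 1, 1, 0, 0, 1, 0, 1, 0, 0]

phi = phi_coefficient(city_a, city_b)   # 14/24, about 0.583
chi2 = len(city_a) * phi ** 2           # chi-square statistic of the 2x2 table
print(phi, chi2)
```

A value of phi near +1 or -1 indicates strong positive or negative association, and a value near 0 is what independent sequences would tend to produce.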

Thanks in advance. I hope I have phrased my question clearly; if not, feel free to ask for more details.

  • 0
    I take it back: after a few trials with numbers, I think the Pearson correlation will give me a useful number. 2011-10-27
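
As a rough illustration of the comment above, the Pearson correlation can be computed directly on the 0/1 values. The helper `pearson` and the 10-day sample are hypothetical, written only for this sketch. For two 0/1 sequences, the Pearson correlation coincides with the phi coefficient, so the two measures give the same number here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and centered squares.
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical 10-day sample, not real data.
city_a = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
city_b = [0, 1, 1, 0, 0, 1, 0, 1, 0, 0]

r = pearson(city_a, city_b)   # 1.4 / 2.4, about 0.583
print(r)
```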
