
I have a set of data (metallicities of globular clusters) and wish to identify two sub-populations in it. I know there are two populations, but I am unsure how to split them.

I have plotted a histogram of the data (metallicity vs. number). Is there any method, other than judging by eye, to determine the two populations?

FeH     Frequency
 0       4
-0.2     6
-0.4    14
-0.6    15
-0.8     6
-1      15
-1.2    22
-1.4    23
-1.6    19
-1.8    15
-2       8
-2.2     5

Histogram of data

  • 0
    If there are two modes, it is common to infer that there are two populations, and to hope that at least one will be named after you. I do not see two modes here. 2012-04-21
  • 0
    Thanks André. I have updated my question slightly and redone the histogram to show the total number against the value. Perhaps this is a better place to start from? 2012-04-21
  • 1
    Congratulations! Which subpopulation should be named Carl? The population is clearly bimodal. 2012-04-21
  • 0
    Thanks again André. I can see visually that the two populations are those greater than and less than -1. Is that all there is to it, or is there a more mathematical way to determine the boundaries? 2012-04-21
  • 1
    There are, but I have forgotten them; it has been many years since I did statistical consulting. That's why I included the term bimodal, for searching. Post a similar question on the stats version (bottom of page), and you will probably get a few references. 2012-04-21
  • 1
    Some years ago I came across a concept and a software package called "emmix" (try Google), though I do not remember the details. It should be able to separate mixtures of normally distributed data into the most likely subgroups. It seems to me that this could be the systematic approach your question requires, though it might need a high-level understanding of the underlying math; I don't know. 2012-04-21
  • 0
    Thanks Gottfried. I did a search for that and found the software, but it's only available on Linux, and the Windows version only allows you to load their demo files. 2012-04-21
  • 0
    Thank you everyone for the info. Gottfried, I ended up finding a similar piece of software specifically suited to astronomy. Would it be worth posting the details here as an answer? What's the etiquette? 2012-04-26

1 Answer

For anyone else interested, I used the Gaussian Mixture Modeling (GMM) algorithm to determine the means of the two populations and separate them.

Details of the techniques used are explained in the paper linked on this page: http://www.astro.lsa.umich.edu/~ognedin/gmm/
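
To give a feel for what a two-component Gaussian mixture fit does, here is a minimal sketch in plain Python. It is not the code from the page linked above; it is a basic expectation-maximization (EM) loop written for illustration. The names `fit_two_gaussians` and `boundary`, the starting guesses, and the choice to treat each tabulated FeH value as a representative point weighted by its frequency are all my own assumptions. With unbinned metallicities you would feed in the individual values instead.

```python
import math

# Binned data from the question: (FeH value, frequency).
# Assumption: each tabulated value stands in for every cluster in its bin.
BINS = [(0.0, 4), (-0.2, 6), (-0.4, 14), (-0.6, 15), (-0.8, 6), (-1.0, 15),
        (-1.2, 22), (-1.4, 23), (-1.6, 19), (-1.8, 15), (-2.0, 8), (-2.2, 5)]


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(bins, init_means=(-1.5, -0.5), init_sd=0.3, n_iter=500):
    """EM for a two-component 1D Gaussian mixture on count-weighted points.

    The starting means and spread are illustrative guesses read off the histogram.
    """
    total = sum(n for _, n in bins)
    weights = [0.5, 0.5]
    means = list(init_means)
    sds = [init_sd, init_sd]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to each component.
        resp = []
        for x, _ in bins:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate weight, mean and spread, using counts as weights.
        for k in range(2):
            nk = sum(r[k] * n for r, (_, n) in zip(resp, bins))
            means[k] = sum(r[k] * n * x for r, (x, n) in zip(resp, bins)) / nk
            var = sum(r[k] * n * (x - means[k]) ** 2
                      for r, (x, n) in zip(resp, bins)) / nk
            sds[k] = max(math.sqrt(var), 0.05)  # floor keeps a component from collapsing
            weights[k] = nk / total
    return weights, means, sds


def boundary(weights, means, sds, tol=1e-9):
    """Bisect between the two means for the value where both components are equally likely.

    Assumes means[0] < means[1] and that each component dominates at its own mean.
    """
    def diff(x):
        return (weights[0] * normal_pdf(x, means[0], sds[0])
                - weights[1] * normal_pdf(x, means[1], sds[1]))

    lo, hi = means[0], means[1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            lo = mid  # component 0 still dominates, so move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


weights, means, sds = fit_two_gaussians(BINS)
print("weights:", weights)
print("means:", means)
print("standard deviations:", sds)
print("split point:", boundary(weights, means, sds))
```

The split point is where a cluster is equally likely to belong to either population, which gives a more principled cut than reading a value off the histogram. Note that EM can settle on different solutions from different starting guesses, so it is worth trying a few.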