
Our industrial process spits out items of normal and small size. At most $x\%$ of the items are allowed to be small.

I would like to validate the output of the process by taking a sample of the output and running some kind of test.

I guess I have two bins (normal and small), but I have no clue how to proceed. I don't think I can do a chi-square test, since the cumulative distribution is not known (due to the 'at most $x\%$' part). Some hints/pointers to get me started would be appreciated.

Note: the problem can be simplified (if needed/useful/easier to get me started/...) by assuming that the small bin is exactly $x\%$

Thanks in advance!

EDIT

I assume the problem can be reformulated as

  • suppose we have a population with red (too large items), green (normal size) and blue (too small size) balls
  • we take a sample ($N$ balls) and compute the sample percentage $P$ of blue (or red) balls
  • I assume we can say something like 'the population percentage of blue balls lies in the interval [$P-\delta$, $P+\delta$] with an accuracy of $95\%$', but what is the relation between $\delta$, the sample size $N$ and the $95\%$?
  • can we also say something like 'the population percentage of blue balls is lower than $X$ with an accuracy of $95\%$'?

Some pointers/formulas to get me started would be appreciated.

1 Answer


There are different ways to think about your problem. Here is one.

Suppose your process produces correctly sized items with probability $p$ and incorrectly sized items with probability $1-p$. You want a confidence interval at level $w$ for $p$, given the sample fraction $\hat{p}$ of correctly sized items in a sample of size $n$. The interval is $\hat{p}\pm z\sqrt{\frac{1}{n}\hat{p}(1-\hat{p})}$, where $z$ is the $1-\frac{1}{2}(1-w)$ quantile of the standard normal distribution (for $w=0.95$, $z\approx 1.96$).
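As a rough illustration of the formula above, here is a minimal Python sketch using only the standard library. The function name `wald_interval` and the sample counts (470 correctly sized items out of 500) are made up for the example; the half-width `delta` is the $\delta$ asked about in the question, and it shrinks like $1/\sqrt{n}$.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, w=0.95):
    """Normal-approximation confidence interval for a binomial proportion.

    successes: number of correctly sized items in the sample (hypothetical name)
    n:         sample size
    w:         confidence level, e.g. 0.95
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # z is the 1 - (1 - w)/2 quantile of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - w) / 2)
    # half-width of the interval
    delta = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - delta, p_hat + delta


# Made-up sample: 470 correctly sized items out of 500
low, high = wald_interval(470, 500)
print(f"p_hat = {470 / 500:.3f}, 95% interval = [{low:.4f}, {high:.4f}]")
```

Quadrupling the sample size halves `delta` for the same $\hat{p}$, which is one way to pick $n$ for a target precision.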

This confidence interval is based on approximating the binomial distribution by the normal distribution. As you might expect, the approximation works well if $n$ is large and $p$ is not close to zero or one. There is an entire Wikipedia page (on binomial proportion confidence intervals) describing better alternatives for when these conditions fail.
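One commonly cited alternative for small $n$ or $p$ near zero or one is the Wilson score interval. It is not named in this answer, so take the following as an illustrative sketch of one possible improvement rather than the recommended one; the function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, w=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - w) / 2)
    denom = 1 + z * z / n
    # The centre is pulled from p_hat towards 1/2
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return center - half, center + half


# Made-up edge case: no small items seen in a sample of 10.
# The plain normal approximation would give the degenerate interval [0, 0] here.
low, high = wilson_interval(0, 10)
print(f"95% Wilson interval = [{low:.4f}, {high:.4f}]")
```

Unlike the plain normal approximation, this interval always stays inside $[0, 1]$ and does not collapse to a single point when the sample contains no items of one kind.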

  • This indeed gets me started. If you can point me towards other ways to think about the problem, that would be appreciated! (2017-02-11)
  • One way forward is the wiki page on improvements to the normal approximation of the binomial. An alternative is to think about three sizes with three probabilities and search for a corresponding test for the multinomial distribution. Yet another way is to think about the distribution of the sizes, possibly discrete, and see whether the Kolmogorov-Smirnov test can be used on the size distribution. (2017-02-11)
  • Sorry the previous comment was so brief (writing from phone). Googling 'multinomial proportion confidence interval stack exchange' gives http://stats.stackexchange.com/questions/111355/confidence-interval-and-sample-size-multinomial-probabilities, http://stats.stackexchange.com/questions/175756/multinomial-proportion-confidence-interval and http://stats.stackexchange.com/questions/20555/simultaneous-confidence-intervals-for-multinomial-parameters-for-small-samples. All are relevant; the last one includes a link to a paper (implemented in R). (2017-02-14)
  • Thanks for updating! I even wrote a thesis (30 years ago) on "Multivariate probit analysis: computation of the multivariate normal integral and applications in biostatistics" but never used this afterwards. Maybe I should re-read this also ;-) (2017-02-14)