I have a test which benchmarks the performance of my 3D rendering software, in frames per second. I run the test for each revision of my rendering software to check that I'm improving performance. For any given revision, each run gives a slightly different result, so I run the test a number of times and track the mean and variance for each revision.
Each test run takes a non-trivial amount of time, so I would like to find the minimum number of runs that gives me a specified level of confidence in the measured mean. Any suggestions on how to frame this problem statistically would be much appreciated.
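To make the question concrete, here is a sketch of one framing I had in mind: keep running the benchmark until the confidence interval for the mean is narrow enough, i.e. until its half-width falls below some fraction of the sample mean. The `run_benchmark` function is a hypothetical stand-in for one real FPS measurement, and the tolerance and confidence values are just placeholders.

```python
import math
import random
import statistics

def run_benchmark():
    # Hypothetical stand-in for one real test run; here a noisy FPS reading.
    return random.gauss(60.0, 2.0)

def measure_until_confident(rel_tol=0.01, confidence=0.95,
                            min_runs=5, max_runs=1000):
    """Run the benchmark until the confidence interval's half-width is
    within rel_tol of the sample mean, or until max_runs is reached."""
    samples = []
    while len(samples) < max_runs:
        samples.append(run_benchmark())
        n = len(samples)
        if n < min_runs:
            continue  # too few samples for a meaningful variance estimate
        mean = statistics.fmean(samples)
        sem = statistics.stdev(samples) / math.sqrt(n)  # std. error of mean
        # Normal critical value; a t critical value would be more
        # accurate for small n, but the difference shrinks quickly.
        z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
        if z * sem <= rel_tol * mean:
            break
    return statistics.fmean(samples), len(samples)

mean_fps, runs = measure_until_confident()
print(f"{mean_fps:.1f} FPS after {runs} runs")
```

Is something like this sequential stopping rule a sound way to pick the number of runs, or does peeking at the interval after every run bias the result?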