I am trying to come up with a cost function that I can minimize (or maximize). It would be a function of two variables: the number of requests per second and the latency in ms. We have observed that the more requests we serve, the higher the latency, and we want to find the right balance between the two.
What would the formula be? Is it simply a linear function of the two variables?
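
To make the question concrete, here is a minimal sketch of the linear form I have in mind. Everything in it is an assumption for illustration: the names `linear_cost` and `best_operating_point`, the weights `w_latency` and `w_throughput`, and the sample measurements are all made up, not real data from our system.

```python
def linear_cost(rps, latency_ms, w_latency=1.0, w_throughput=0.1):
    """Hypothetical linear cost: penalize latency, reward throughput.

    The weights are arbitrary placeholders; they encode how many
    requests per second one millisecond of latency is "worth".
    """
    return w_latency * latency_ms - w_throughput * rps


def best_operating_point(measurements, cost=linear_cost):
    """Return the (rps, latency_ms) pair with the lowest cost."""
    return min(measurements, key=lambda m: cost(m[0], m[1]))


# Made-up illustrative measurements: (requests per second, latency in ms).
samples = [(100, 20.0), (200, 25.0), (400, 40.0), (800, 120.0)]

best = best_operating_point(samples)
print(best)
```

With these placeholder weights the minimum lands on the point where latency starts to climb faster than throughput, but the result depends entirely on how the two weights are chosen, which is part of what I am unsure about.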