
I am working my way through a homework assignment that involves implementing a few heuristics for a game called Lines of Action. The teacher has given us some structural code that we can use to test our heuristic and search implementations.

An interface is provided for heuristics, and it notes that the return value should be between $-1.0$ and $1.0$. There is then a note that "an easy way to scale is to perform a $\tanh$". From what I can tell, $\tanh$ approaches $1.0$ as its argument goes to $\infty$ and $-1.0$ as its argument goes to $-\infty$.

But my question is... I have multiple features that make up my heuristic value, and they all have different scales. Say one feature might output a value from $0$ to $10$ and another from $0$ to $100$. Initially I was adding them and then returning the overall heuristic value as the $\tanh$ of the sum. But now I have realized that this obviously gives some features more weight than others. So I thought that if I apply $\tanh$ to each feature's value before the addition, and again after the addition, I can keep the features at the same weight and still output a value from $-1.0$ to $1.0$.
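To make the two combinations concrete, here is a minimal sketch in Python. The function names and the sample feature values are made up for illustration; they are not part of the provided structural code.

```python
import math

def sum_then_tanh(features):
    """First attempt: add the raw feature values, then squash the sum."""
    return math.tanh(sum(features))

def tanh_each_then_tanh(features):
    """Second attempt: squash each feature, add, then squash the sum again."""
    return math.tanh(sum(math.tanh(f) for f in features))

# Hypothetical feature values: one on a 0-10 scale, one on a 0-100 scale.
features = [3.0, 40.0]
print(sum_then_tanh(features))
print(tanh_each_then_tanh(features))
```

Note that for two non-negative features the second version can never exceed $\tanh(2) \approx 0.964$, because each inner $\tanh$ is below $1$.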

Is there a considerable weighting difference between features with different scales if I combine them in the manner above?

I know this depends on the actual values of $\tanh$ and the scales of my features. Are there scale values that will keep the weighting more even?

1 Answer


It is true that $\tanh$ takes $(-\infty, \infty)$ to $(-1,1)$, but that may not be the most important point for the question you are asking. If you have one feature that ranges from $0$ to $10$ and another that ranges from $0$ to $100$, the second will control the sum unless its common values are tightly clustered. If both features cover their whole range, you could just divide the second by $10$ before adding. You might find this reasonable. Or maybe one feature should be more important than the other, so its range should be wider. Maybe the one that ranges from $0$ to $100$ is almost always $55\pm 1$, with values outside that range very rare. Then the variation in the sum will be dominated by the $0$ to $10$ feature, except for the cases where the $0$ to $100$ feature is an outlier. The sum is mostly controlled by the variable with the highest variance. All this is a way to show that math can help with figuring out what your rule set will deliver, but cannot do as much for figuring out what you want the rule set to deliver.
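As a rough illustration of both points, here is a minimal Python sketch. The helper names (`combine`, `pvariance`, `pearson`), the weights, and the simulated distributions are assumptions chosen for the example, not anything taken from the assignment. It rescales each feature by its range before the final $\tanh$, and it simulates the $55\pm 1$ case to compare how strongly the sum tracks each feature.

```python
import math
import random

def combine(features, ranges, weights=None):
    """Divide each feature by its range, apply optional weights, then tanh the sum."""
    if weights is None:
        weights = [1.0] * len(features)
    scaled = [w * f / r for f, r, w in zip(features, ranges, weights)]
    return math.tanh(sum(scaled))

def pvariance(xs):
    """Population variance."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(pvariance(xs) * pvariance(ys))

# With range scaling, each feature at its maximum contributes the same amount.
print(combine([10, 0], [10, 100]), combine([0, 100], [10, 100]))

# Simulated data (assumed distributions): a spreads over 0-10,
# b is nominally 0-100 but clusters tightly around 55.
rng = random.Random(0)
a = [rng.uniform(0, 10) for _ in range(10_000)]
b = [rng.gauss(55, 1) for _ in range(10_000)]
total = [x + y for x, y in zip(a, b)]

print("variance of a:", pvariance(a), "variance of b:", pvariance(b))
print("corr(sum, a):", pearson(total, a), "corr(sum, b):", pearson(total, b))
```

In this simulation the raw sum follows `a` much more closely than `b`, even though `b` has the larger nominal range. Whether that is what you want is a design decision, which the `weights` argument lets you state explicitly.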