I am attempting to write the multinomial Naive Bayes classifier decision rule as a linear rule. A document is viewed as a sequence $d = (w_1, w_2,\ldots,w_l)$ of $l$ words and is classified as
$$h(d) = \operatorname{argmax}_{y\in\{+1,-1\}}P(Y=y)\prod_{i=1}^l P(W=w_i\mid Y=y).$$
Assuming we have estimates for all the appropriate probabilities and none of them are zero, how can we rewrite this as $h(d) = \operatorname{sign}(\vec{v}\cdot\vec{x} + b)$? What are $\vec{v}$, $\vec{x}$, and $b$?
How do I write the multinomial Naive Bayes classifier decision rule as a linear rule?
probability
machine-learning
1 Answer
In this generality, it could be almost anything. One possibility that comes to mind is
$$\operatorname{sign}\left(\sum_{y\in\{+1,-1\}}y\,P(Y=y)\prod_{i=1}^{l}P(W=w_i\mid Y=y)\right).$$
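A sketch of the standard log-space rewriting, added here for concreteness (the vocabulary size $V$ and the count notation $x_w$ are my own assumptions, not stated in the question): since every probability is strictly positive by hypothesis, the classifier prefers $+1$ exactly when the ratio of the two class scores exceeds $1$, i.e. when its logarithm is positive. Hence
$$h(d) = \operatorname{sign}\!\left(\log\frac{P(Y=+1)}{P(Y=-1)} + \sum_{i=1}^{l}\log\frac{P(W=w_i\mid Y=+1)}{P(W=w_i\mid Y=-1)}\right).$$
Grouping repeated words over a vocabulary $\{1,\ldots,V\}$, let $\vec{x}\in\mathbb{R}^V$ hold the counts $x_w$ of each word $w$ in $d$, and take
$$v_w = \log\frac{P(W=w\mid Y=+1)}{P(W=w\mid Y=-1)},\qquad b = \log\frac{P(Y=+1)}{P(Y=-1)}.$$
Then the sum above equals $\vec{v}\cdot\vec{x}+b$, so $h(d) = \operatorname{sign}(\vec{v}\cdot\vec{x}+b)$.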