
I am attempting to write the multinomial Naive Bayes classifier decision rule as a linear rule. A document is viewed as a sequence $d = (w_1, w_2, \ldots, w_l)$ of $l$ words and is classified as

$h(d) = \operatorname{argmax}_{y\in\{+1,-1\}} P(Y=y)\prod_{i=1}^{l} P(W=w_i\mid Y=y)\;.$

Assuming we have estimates for all the appropriate probabilities and none of them are zero, how can we rewrite this as $h(d) = \operatorname{sign}(\vec{v}\cdot\vec{x} + b)$? What are $\vec{v}$, $\vec{x}$, and $b$?
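For concreteness, here is a minimal Python sketch of the rule as stated; `prior` and `likelihood` are hypothetical lookup tables holding the estimated probabilities (they are not part of the rule itself):

```python
def h(doc, prior, likelihood):
    """Multinomial NB decision rule: argmax_y P(Y=y) * prod_i P(W=w_i | Y=y).

    doc:        list of words (w_1, ..., w_l)
    prior:      dict mapping class y in {+1, -1} to the estimate of P(Y=y)
    likelihood: dict mapping (word, y) to the estimate of P(W=word | Y=y)
    """
    def score(y):
        p = prior[y]
        for w in doc:
            p *= likelihood[(w, y)]
        return p
    return max((+1, -1), key=score)
```

In practice the products would be computed in log space to avoid underflow on long documents, which is also what suggests looking for a linear (dot-product) form.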

1 Answer


In this generality, it could be almost anything. One possibility that comes to mind is

$\operatorname{sign}\left(\sum_{y\in\{+1,-1\}} y\,P(Y=y)\prod_{i=1}^{l} P(W=w_i\mid Y=y)\right)\;,$

which agrees with $h(d)$: with only two classes, the sum inside the sign is the difference $P(Y=+1)\prod_{i} P(w_i\mid Y=+1) - P(Y=-1)\prod_{i} P(w_i\mid Y=-1)$, and its sign picks out whichever class has the larger score.
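As a sanity check, here is a small sketch (the estimates `prior` and `lik` below are made up for illustration; none of these numbers come from the question) verifying that this sign expression agrees with the argmax rule on random documents:

```python
import random

def argmax_rule(doc, prior, lik):
    """h(d) = argmax_y P(Y=y) * prod_i P(W=w_i | Y=y)."""
    def score(y):
        p = prior[y]
        for w in doc:
            p *= lik[(w, y)]
        return p
    return max((+1, -1), key=score)

def sign_rule(doc, prior, lik):
    """sign( sum_y y * P(Y=y) * prod_i P(W=w_i | Y=y) )."""
    total = 0.0
    for y in (+1, -1):
        p = prior[y]
        for w in doc:
            p *= lik[(w, y)]
        total += y * p
    return +1 if total > 0 else -1

# Made-up estimates over a three-word vocabulary (illustrative only).
vocab = ["a", "b", "c"]
prior = {+1: 0.6, -1: 0.4}
lik = {("a", +1): 0.5, ("b", +1): 0.3, ("c", +1): 0.2,
       ("a", -1): 0.2, ("b", -1): 0.3, ("c", -1): 0.5}

random.seed(0)
for _ in range(1000):
    doc = random.choices(vocab, k=random.randint(1, 10))
    assert argmax_rule(doc, prior, lik) == sign_rule(doc, prior, lik)
```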