Let's say I want to determine whether a person is bald using two features, age and hair color. I will also assume that age and hair color are independent given the class; in other words, I will use a Naive Bayes classifier.
Transforming my data into a probability table:
If I wanted to calculate whether a 20-year-old person with brown hair is bald, it would be easy:
p(bald=yes|20,brown) ∝ 1/4*1/4*4/9 ≈ 0.03
p(bald=no|20,brown) ∝ 2/5*4/5*5/9 ≈ 0.18
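The two products above are unnormalized scores, not probabilities that sum to 1. A minimal Python sketch of the same arithmetic, using exact fractions and then normalizing, might look like this. The names `likelihoods`, `naive_bayes_scores`, and `normalize` are hypothetical and chosen only for illustration; the numbers are copied from the calculation above.

```python
from fractions import Fraction as F

# For each class: P(age=20 | class), P(hair=brown | class), P(class),
# copied from the calculation above.
likelihoods = {
    "yes": (F(1, 4), F(1, 4), F(4, 9)),
    "no": (F(2, 5), F(4, 5), F(5, 9)),
}

def naive_bayes_scores(table):
    """Multiply the per-feature likelihoods and the prior for each class."""
    scores = {}
    for label, factors in table.items():
        score = F(1)
        for factor in factors:
            score *= factor
        scores[label] = score
    return scores

def normalize(scores):
    """Divide each score by the total so the results sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

scores = naive_bayes_scores(likelihoods)
posterior = normalize(scores)

for label in scores:
    print(label, float(scores[label]), float(posterior[label]))
```

Normalizing does not change which class wins; it only rescales the two scores so they can be read as probabilities.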
Since the second value is higher, the person is more likely not bald. But what should I do if I want to calculate the probability of being bald at age 20 with black hair?
p(bald=yes|20,black) ∝ 1/4*2/4*4/9 ≈ 0.06
p(bald=no|20,black) ∝ 2/5*0/5*5/9 = 0
I don't have any examples of a non-bald person with black hair, and I don't think it would be right to ignore all the other evidence just because of this one zero count. So how should I deal with this situation in general, where we would have many more features and much more data?
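For reference, one common remedy for this zero-count problem is additive (Laplace) smoothing, which adds a small pseudo-count to every category so that no likelihood is exactly zero. The sketch below is only an illustration under stated assumptions: the helper `smoothed` is hypothetical, the number of distinct hair colors (`N_HAIR_COLORS = 3`) is an assumption because it is not stated above, and only the hair-color likelihood is smoothed while the age likelihoods and priors are reused unchanged.

```python
from fractions import Fraction as F

def smoothed(count, class_total, n_values, alpha=1):
    """Additive smoothing: (count + alpha) / (class_total + alpha * n_values)."""
    return F(count + alpha, class_total + alpha * n_values)

# Assumption: the hair-color feature takes 3 distinct values.
N_HAIR_COLORS = 3

# Counts read off the fractions above: black hair occurs in
# 2 of 4 bald people and 0 of 5 non-bald people.
p_black_given_yes = smoothed(2, 4, N_HAIR_COLORS)
p_black_given_no = smoothed(0, 5, N_HAIR_COLORS)

# Age likelihoods and priors are kept as in the calculation above.
score_yes = F(1, 4) * p_black_given_yes * F(4, 9)
score_no = F(2, 5) * p_black_given_no * F(5, 9)

print(float(score_yes), float(score_no))
```

With these assumptions the non-bald score is small but no longer zero, so the age evidence still contributes to the comparison. In practice every categorical feature would be smoothed the same way, and `alpha` is a tunable parameter.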