
I have about 50 training examples, each consisting of about 200 real-valued inputs (x1,x2,...,x200), normalized to mean 0 and standard deviation 1, and a single real output (y) in the range 0.0..1.0. I want to fit a linear model as follows:

y = w0 + w1 * x1 + w2 * x2 + ... + w200 * x200 

So I need to calculate (w0,w1,w2,...,w200) based on the training examples. By what formula or algorithm should I calculate these weights?

  • 0
    @Tpofofn The post is tagged `(machine-learning)` and `(regression)`; in such settings it is common for the number of training samples to be smaller than the number of variables (aka features). Methods such as least squares can still find *a* solution that minimizes, say, the sum of squared errors. 2012-03-15

1 Answer


There is a whole area of machine learning devoted to algorithms for this kind of problem. You probably can't visualize your data because it lives in $\mathbb{R}^{200}.$

I guess you should try least squares. Let $X \in \mathbb{R}^{50\times 201}$ be your feature matrix augmented with a column of $1$'s (to account for $w_0$). $X$ has $50$ rows, one for each training example. Each row looks like $(1, x_1, x_2, \ldots, x_{200}).$ Let $y \in \mathbb{R}^{50\times 1}$ be your output vector.
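As a rough illustration of the setup above, here is a minimal NumPy sketch of building the augmented matrix. The helper name `add_intercept_column` and the random stand-in data are assumptions made for the example, not part of the question.

```python
import numpy as np

def add_intercept_column(features):
    """Prepend a column of ones so the first weight acts as w0."""
    n_examples = features.shape[0]
    return np.hstack([np.ones((n_examples, 1)), features])

# Hypothetical stand-in data: 50 examples, 200 standardized features.
rng = np.random.default_rng(0)
features = rng.standard_normal((50, 200))

X = add_intercept_column(features)
print(X.shape)  # (50, 201)
```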

Linear least squares finds the $w$ that minimizes $\|Xw - y\|_2^2,$ i.e. it solves $Xw = y$ as closely as possible, where $w \in \mathbb{R}^{201\times 1}$ are the weights you're looking for.

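The naive solution is solving the following linear system (the normal equations) for $w$: $ X^{T}Xw = X^{T}y, $ but better numerical methods exist. Note that with $50$ examples and $201$ unknowns, $X^{T}X$ is a $201\times 201$ matrix of rank at most $50,$ so it is singular and the system has infinitely many solutions. Methods based on the QR decomposition or the SVD are more stable, and the SVD (pseudoinverse) approach picks the solution with the smallest norm.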
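One possible way to do this in practice, sketched with NumPy: `numpy.linalg.lstsq` uses an SVD-based solver and, when there are fewer examples than unknowns, returns the minimum-norm least-squares solution. The function name `fit_least_squares` and the random data are placeholders for illustration only.

```python
import numpy as np

def fit_least_squares(X, y):
    """Return the minimum-norm least-squares solution of X w = y."""
    w, _residuals, _rank, _singular_values = np.linalg.lstsq(X, y, rcond=None)
    return w

# Hypothetical stand-in data with the shapes from the question.
rng = np.random.default_rng(0)
features = rng.standard_normal((50, 200))
X = np.hstack([np.ones((50, 1)), features])  # column of ones for w0
y = rng.uniform(0.0, 1.0, size=50)

w = fit_least_squares(X, y)
w0, w_rest = w[0], w[1:]  # intercept and the 200 feature weights

print(w.shape)                 # (201,)
print(np.allclose(X @ w, y))   # does the fit reproduce the training outputs?
```

With more unknowns than examples, a fit like this will typically reproduce the training outputs exactly, which by itself says nothing about how well the weights work on new data.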