2
$\begingroup$

I would like to define a function $f$ whose range is $[0,1]$ such that it takes a matrix $C \in R_+$ of dimension $m \times n$. The entries in the matrices are also in the range $[0,1]$. In addition, each row of $C$ sums to $1$.

The function $f$ should produce $0$ when all the entries in the matrix $C$ are same and produce $1$ when there is only one $1$ entry in each row.

For example,when

C = [1 0 0 0      0 0 1 0      0 0 0 1      0 1 0 0] 

$f(C) = 1$

In other case, when

C = [0.25 0.25 0.25 0.25      0.25 0.25 0.25 0.25      0.25 0.25 0.25 0.25      0.25 0.25 0.25 0.25] 

$f(C) = 0$

Can anyone help me out in contructing such a function?

  • 3
    On each row the maximum of sum of the squares of the entries is equal to $1$, and achieved only, when the row has a single $1$ and the rest are zeros. Similarly the sum of the squares of entries on a row attains its minimum value $1/n$, when all the entries are equal. Therefore all you need is to calculate the sum of the squares. Subtract a constant to make the minimum equal to zero. Then scale by a constant factor to make the maximum equal to one.2011-07-04

2 Answers 2

2

Let the matrix have the form $ \begin{pmatrix} p_{1,1} & \ldots & p_{1,n} \\ \vdots & \ddots & \vdots \\ p_{n,1} & \ldots & p_{n,n} \end{pmatrix} . $ I would approach this problem in the following way:

Since $\sum_{i=1}^np_{k,i}=1$ for all $k$, each of the matrix rows can be thought of as a probability distribution. The almost canonical characterisation of a probability distribution is its Shannon entropy $ H_k = -\sum_ip_{k,i}\cdot\ln(p_{k,i}) $ where summands with $p_{k,i}=0$ are left out. This is known to be highest when $p_{k,i}=p_{k,j}\equiv p_k=\tfrac1n$, and lowest when $p_{k,i}=0$ save for one $i$. Namely, $ \bigl.H_k\bigr|_{p_{k,i}=p_{k,j}} = -\sum_ip_{k}\cdot\ln(p_{k}) = -n\cdot p_{k}\cdot\ln(p_{k}) = -n\cdot \tfrac1n\cdot\ln(\tfrac1n) = -\ln(\tfrac1n) = \ln(n) $ and $ \bigl.H_k\bigr|_{p_{k,i}=0\ \forall i\neq j} = -0 - \ldots - p_{k,j}\ln (p_{k,j}) - \ldots -0 = -1\cdot \ln(1) = 0 $ We can now define a "total entropy" of the matrix, $ H := -\sum_{k}\sum_{i}p_{k,i}\cdot\ln(p_{k,i}) $ This will now be $n\ln(n)$ for the matrix with all-equal entries, and $0$ for a matrix with only-$0$-and-$1$-entries. To get a value in your desired range, just do an affine transformation $ f:= 1-\frac{H}{n\ln(n)} = 1 + \sum_{k,i}\frac{p_{k,i}\ln(p_{k,i})}{n\cdot\ln(n)} $

2

Let $R_i$ be the maximum value in row $i$ less the minimum value in that row, and put $f = (R_1 + ..R_n)/n$.

  • 1
    @Learner, the max entry in each row is at most 1, the min, at least 0, so each $R_i$ is between 0 and 1, so $f$, their average, is also between 0 and 1.2011-07-04