1
$\begingroup$

Is there any example of such a function $f$, preferably one defined on all $V^n$ and all positive integers $n$ where $V$ is some vector space? It must satisfy the following:

  1. $f(T) = f(\sigma(T))$ for any $T \in V^n$ and permutation $\sigma$
  2. $f((X,t,t)) = f((X,t))$ for any (possibly empty) sequence $X$ and $t \in V$
  3. $f(a+T) = a+f(T)$ for any $T \in V^n$ and $a \in V$ where $a+(t,...,u) = (a+t,...,a+u)$
  4. $f(B*T) = B*f(T)$ for any $T \in V^n$ and diagonal matrix $B$ where $B*(t,...,u) = (B*t,...,B*u)$
  5. $f$ is continuous on $V^n$ for each positive integer $n$

If there is no closed form, an algorithm to approximate it will be good enough.

The reason I am looking for such a function is that I want to find a way to determine the most probable original vector given a set of samples where some samples could be dependent on others. In another field it is called textual criticism but the criteria used there are extremely subjective, whereas I believe this model uses objective criteria and the results are reproducible, if this function exists in the first place. Permutation invariance and duplicate invariance are necessary to exclude identical copies (like those of internet-myths). Translation and stretch invariance seem to be logical requirements as well. If $f$ does not exist, can a proof be shown? Thanks a lot!

  • 0
    I don't understand condition 2, as I don't see what elements of $V^n$ are $(X,t,t)$ and $(x,t)$. Can you elaborate on this please?2011-12-27
  • 0
    @Alex: He probably means a function on the set of _all_ finite sequences of vectors, rather than a function on $V^n$.2011-12-27
  • 0
    @Zhen: Ah, that makes sense. User213820: If this is the case, you should edit your question to replace $V^n$ with $\bigoplus\limits_{n=1}^\infty V$.2011-12-27
  • 0
    I still think condition 2 needs clarification. If OP explains what he meant by this notation it would be best for all of us.2011-12-27
  • 2
    Discard duplicates and take the mean? That satisfies all your criteria.2011-12-27
  • 0
    @Alex: Yes I meant what Zhen said. Sorry I did not know there was such a standard notation so I had put "for any n".2011-12-27
  • 0
    For condition 2, an example would be $f((x,y,t,t))=f((x,y,t))$2011-12-27
  • 0
    @Rahul: At first I wanted continuity but removed it.. Sorry I should have left it in.. I will edit my question to reflect that as well as Zhen's clarification.2011-12-27
  • 0
    @Patrick, I think user21820 is using the expression $f((X,t,t))$ to denote the function $f$ applied to the concatenation of two sequences $X$ and $(t,t)$.2011-12-27
  • 0
    @Rahul : OP made it clear in his comments, but thanks. =)2011-12-27
  • 0
    The part with the matrix is weird though : You are only saying that $V$ is a vector space, are you also assuming it to be $K^n$ for some field $K$? Because you're writing in condition 4 that there's a matrix $B$ that's diagonal over some basis of $V$.2011-12-27
  • 0
    @Patrick: Not quite; I simply wanted stretch-invariance but do not know how best to state it. It is of course preferable if $V$ is $F^r$ for some field $F$, but if matrix multiplication is defined in terms of an action of a field on the elements of V, it would suffice for me.2011-12-28

2 Answers 2

3

Let us first consider the one-dimensional case, $V = \mathbb R$. Let $T = (x_1, x_2, \ldots, x_n)$ be the points sorted in ascending order with duplicates discarded, and $I = [x_1, x_n]$ be their convex hull. Associate each point $x_i$ with its Voronoi cell, that is, the interval of points in $I$ closer to it that to any other $x_j$. Define $f$ to be a weighted average of the $x_i$, with the weights being the length $\ell_i$ of the corresponding cell, $$f((x_1,x_2,\ldots,x_n)) = \frac{\sum_{i=1}^n\ell_ix_i}{\sum_{i=1}^n\ell_i}.$$

To show that this construction satisfies (2) and (5) even though we discarded exact duplicates, we only need to show that as two or more points simultaneously approach the same value, say $x^*$, the value of $f$ approaches its value with the points replaced with a single point at $x^*$. This follows from the fact that the total size of the corresponding Voronoi cells approaches the size of the cell that a single point at $x^*$ would have.

Since you only care about diagonal matrices $B$, you can apply this procedure independently to each coordinate axis of $V$.

  • 0
    P.S. The idea of weighting by Voronoi cells came to me by analogy with [natural neighbor interpolation](http://dilbert.engr.ucdavis.edu/~suku/nem/nem_intro/node3.html), which is a very interesting interpolation scheme for scattered data and has many nice properties.2011-12-27
  • 0
    Very interesting idea. Applied independently to each coordinate axis, it does not solve the problem I originally started with, as I would expect two points with one identical coordinate but distinct other coordinates to influence the function towards them in that coordinate as compared to if their other coordinates were identical. I am not sure what is the best way to state such a behaviour. Do you think Voronoi cells within a bounding box in the whole space $V^n$ will work? It seems to intuitively.2011-12-28
  • 0
    @user21820, I see what you mean. Unfortunately, Voronoi cells in $\mathbb R^n$ are not invariant to non-uniform scaling, so it would not satisfy condition (4). Going by your latest comment, if you want the solution to be invariant to arbitrary non-uniform stretching (e.g. stretching the 2D plane along the (1,1) direction), and not just axis-aligned stretching, then $B$ should be a symmetric positive definite matrix, not necessarily a diagonal one. In that case, my answer does not work either.2011-12-28
  • 0
    Oh. Do you have any other geometry-based ideas? At first I tried to find some kind of geometric centre but the internet is not a very easy place to search..2011-12-28
  • 0
    Wait, I do not think I need non-axis-aligned stretching. It is just that a coordinate of the output should not be independent of the other coordinates of the inputs, because clearly the closer two samples are (for some to-be-specified norm), the more likely they are dependent, either one on the other or both copied from the same source, thus they should have a lower total weight than if they were very different samples. In this case, do you think weighting each point by the $k$th-root of the volume of its Voronoi cell will work?2011-12-28
  • 0
    @myself: No it won't work at all... haha..2011-12-28
  • 0
    @user21820, your condition (4) forces each coordinate to be treated independently. Suppose $B$ is zero everywhere except in the $(k,k)$th entry. Then you require the $k$th coordinate of $f(T)$ to equal the result of $f$ applied to the $k$th coordinate of $T$... (Forbidding $B$ from being singular doesn't help, because one can take a sequence of nonsingular $B$'s converging to the above and appeal to continuity.)2011-12-28
  • 0
    The only way I can think of to specify my intention is that there must be another continuous function that returns the weights for each point, and then the final function is the weighted average of the points. This is rather restrictive, thus I did not want to pose the question that way, in case there is a suitable function that cannot be expressed as a linear combination of the points. But perhaps this gives a better idea of what I am looking for. [Edit: I did not see Rahul's comment, so I will have to think about it again..]2011-12-28
  • 0
    I see what you mean, oh no I seem to have to keep changing what I am looking for... What about simple scaling instead of arbitrary stretch? Can I edit my question again, replacing "diagonal matrix B" with "scalar b"?2011-12-28
  • 0
    @user21820: Given that you have revised the question a few times already (and that Patrick has posted some other functions that satisfy your criteria but don't look like they're what you really want), I suggest we leave aside the abstract formulation of the problem for the moment, and you explain more concretely what the actual data you have is and what you want to achieve with the algorithm you're looking for. If that would be a big change to the current question, you could consider posting it as a new one instead.2011-12-29
  • 0
    Unfortunately I do not have a very good idea of what is the best way to analyse difference sources to obtain the most probable facts, given that the author of each source may not report what he knows but may simply state the same things as other sources, and some may even purposely state falsehoods, which may in turn be referred to, and so on.. For example, given 3 samples (0,0), (0,1), (1,1), the most likely original is (0,1), if the original is known to have elements from {0,1}. However if the original had elements from the real interval [0,1], what is the most probable original?2011-12-29
0

Your reason to want to find such a function, topped with Rahul's comment that says "Discard duplicates and take the mean" made me think of other such functions with a statistical background. Obviously the "Discard duplicates and take the mean" function does the job. But are there others? I thought about "Discarding duplicates and taking the variance of the vector $(v_1, \dots, v_n) \in V^n$ seen as a sample". That works too.

Note also that the set of all functions satisfying your properties is a convex space, because we've just shown it is non-empty, and given two functions $f$, $g$ that satisfy it, you can see that $\alpha f + (1-\alpha) g$ is in your function space also, for every $\alpha \in [0,1]$ (here I am assuming that $\alpha \times f(T)$ makes sense, thus the image of this function would need to be some subset of the complex numbers) or even better, for every $\alpha \in \mathbb R$ (i.e. it would be an affine space). I just thought I'd notice that.

Hope that helps,

  • 0
    Discarding duplicates and taking the mean does not satisfy condition 5, which I added after Rahul suggested it. If I understood your "taking the variance" correctly, it does not satisfy condition 4, right?2011-12-28
  • 0
    Aw. I said variance but I meant in the first place to say standard deviation. But again that would not work since $\sqrt{ B^2}$ doesn't make sense for every matrix. Sorry if my answer was a little lame. I'll let my answer there though, seeing mistakes can sometimes be fruitful.2011-12-28
  • 0
    Doesn't matter, but do you have any other statistics-based ideas? I searched for both geometric and statistical methods but failed to find any.. All those that I found were not duplicate-invariant and most assume that there is no dependency between samples.2011-12-28
  • 0
    It's the matrix condition that screws up the statistical point of view of those properties... I've thought of taking the maximum / minimum of the components too but got nothing. Why do you need this criterion?2011-12-28
  • 0
    For example, assume that I can assign a real number to whether a text implies X, with 1 for an unambiguous direct affirmation of X and with -1 for an unambiguous direct refutation of X and values in between for statements that merely suggest the truth or falsity of X to some degree. Then we would not want results dependent on whether I used the interval [-1,1] or [0,1] or .. Thus I was hoping there was a symmetric homogenous expression.2011-12-28
  • 0
    You don't need criterion 4 for that... you just need to "trust your results" (with theorems, that is!) The reason why I am saying this is because probabilities are based entirely on the interval $[0,1]$, and they are still very, very important, though they literally always not satisfy condition 4. In other words, condition 4 seems irrelevant to me from a statistical point of view.2011-12-28
  • 0
    Do you have any functions that satisfy all the conditions except condition 4 and is also a linear combination of the input points where the weights themselves are continuous functions? That will be great, and perhaps closer to what I want than my above formulation..2011-12-28
  • 1
    I can only guess one such function if the vector space is $V = \mathbb R$ : the maximum function and the minimum function obviously work, since those functions are invariant under permutations, do not care about duplicates, and adding the vector $(a,\dots,a)$ just translates the minimum/maximum. Continuity is trivial. =)2011-12-28
  • 0
    Neither the maximum function nor minimum function can be expressed as a weighted sum of the input points where each weight function is continuous.2011-12-30
  • 0
    ...so? You didn't ask for that, did you? What's the problem?2011-12-30
  • 0
    I thought you were responding to my question "do you have any functions ... except condition 4 ...". I originally included condition 4 in an attempt to relax the constraint a little. Nevermind anyway, and thanks for your suggestions! =)2011-12-30
  • 0
    I am confused here. I was answering the question "Do have any functions that satisfy 1-2-3-5?" and I answer with convex combinations of the maximum and the minimum when $V = \mathbb R$. What did I not get?2011-12-30
  • 1
    I meant that I originally wanted $f$ to be a weighted sum of the input points where the weights are continuous functions, but thought it might be preferable to have only conditions on $f$ alone, so I included condition 4. If I remove condition 4, I would want the original condition to be satisfied. If we take the maximum in each coordinate, the resulting $f$ unfortunately cannot be expressed as a weighted sum of the input points where the weights are continuous functions, because they must each be 0 or 1 even in the 1-dimensional case. If $f$ is such a sum, I only need conditions 1,2,5.2011-12-31