3

What is the word for the values derived from an ordered set such that the values divide (by virtue of their positions, not their values) the set into 3 subsets that have an equal or nearly equal number of members? What is the word when the values are drawn from the set itself?

What is the generic word that applies regardless of the number of subsets?

EDIT: The parenthetical "by virtue of their positions" is probably unnecessary because, as I said, the set is ordered and the subsets have a (nearly) equal number of members.

EDIT: Four answers are sought: a word for a value in the case of 3 subsets, a word for a datapoint in the case of 3 subsets, a word for a value in the general case, a word for a datapoint in the general case.

3 Answers

5

I think you are looking for the word tercile. The general concept is called quantile.

  • 1
    Good, I didn't like that name anyway. Since medoid ends in -oid, I will start calling it the quantoid. (2011-09-25)
2

I hadn't really encountered the term "medoid" before, but, having looked at the Wikipedia entry, I wonder if it even really generalizes the way you want it to.

Wikipedia defines the medoid of a set of points as the point in the set which minimizes the average (equivalently, the total) distance, for some given distance function $\delta$, to all the other points, i.e.

$\operatorname{medoid}_\delta(S) = \underset{x \in S}{\operatorname{arg\,min}}\ \sum_{y \in S}\ \delta(x,y).$

I don't see how to generalize that to dividing the data set into $n$ subsets. Of course, when $S \subset \mathbb R$ and $\delta(x,y) = |x-y|$, then the medoid is the median (or to be exact, since neither is necessarily unique, each medoid is a median), and then there is a natural generalization to arbitrary quantiles: if the quantile is not uniquely defined, just pick one of the endpoints of the range. But I'm not sure if that is really different enough from the basic concept of quantiles to need a separate name.
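As a quick sanity check of the one-dimensional claim, here is a minimal Python sketch that brute-forces the definition above and compares the result with the lower and upper medians. The function name `medoid` and the sample data are my own, purely for illustration.

```python
from statistics import median_low, median_high


def medoid(points, dist):
    """Return a point of `points` minimizing the summed distance to all points.

    Brute force, O(n^2) distance evaluations. Ties are broken by taking the
    first minimizer in iteration order, so the result need not be unique.
    """
    return min(points, key=lambda x: sum(dist(x, y) for y in points))


def abs_dist(x, y):
    # The distance delta(x, y) = |x - y| on the real line.
    return abs(x - y)


# Illustrative data with an even number of points, so the median is not unique.
data = [7, 1, 4, 10, 3, 8]
m = medoid(data, abs_dist)

# With delta(x, y) = |x - y|, the medoid is one of the two middle data points.
print(m, (median_low(data), median_high(data)))
```

With an even number of points, both middle values give the same total distance, so which one comes back depends only on the tie-breaking rule; that is the "pick one of the endpoints" choice mentioned above.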


Digression: If I'm reading the definitions correctly, the definition of the medoid seems closely related to the Fréchet mean, which is given by

$m_d(S) = \underset{x \in M}{\operatorname{arg\,min}}\ \sum_{y \in S}\ d^2(x,y),$

where $M$ is a metric space equipped with the metric $d$ and containing the set of data points $S \subset M$. (Wikipedia actually gives a weighted version of the definition, but I've left the weights out for simplicity.) Just set $M = S$ and $d(x,y) = \sqrt{\delta(x,y)}$, and you recover the definition of the medoid above.
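Spelled out, the substitution is a one-line check: squaring the metric undoes the square root, so the objective becomes the plain sum of $\delta$ over the data set:

$\underset{x \in S}{\operatorname{arg\,min}}\ \sum_{y \in S}\ \left(\sqrt{\delta(x,y)}\right)^2 = \underset{x \in S}{\operatorname{arg\,min}}\ \sum_{y \in S}\ \delta(x,y) = \operatorname{medoid}_\delta(S).$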

The median can also be represented as a Fréchet mean, with $M = \mathbb R$ and $d(x,y) = \sqrt{|x-y|}$. However, there is also another, equivalent (ignoring issues with uniqueness, anyway) definition of the median in terms of order statistics, and it is this definition which extends naturally to other quantiles. Thus, the issue with extending the general definition of the medoid in the same manner is essentially that, except for particular choices of distance function and underlying space, the medoid need not be an order statistic. In that sense, the similarity between the names "median" and "medoid" is merely an unfortunate coincidence.


Anyway, getting back to the original terminology question, if I wanted to be explicit about my quantiles being actual values from the data set, I might call (or define) the $k$-th $q$-quantile as the $\lceil \frac {nk} q \rceil$-th order statistic, where $n$ is the number of data points.
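To make that convention concrete, here is a small Python sketch of it. The name `data_quantile` and the sample data are hypothetical, and this is just one of several quantile conventions in use, not a standard library routine.

```python
def data_quantile(values, k, q):
    """Return the k-th q-quantile as the ceil(n*k/q)-th order statistic.

    The result is always an actual member of `values`. Ranks are 1-indexed,
    and 0 < k < q is required so that the rank lies between 1 and n.
    """
    if not 0 < k < q:
        raise ValueError("need 0 < k < q")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one data point")
    rank = (n * k + q - 1) // q  # integer form of ceil(n*k/q), avoids float rounding
    return ordered[rank - 1]


# Illustrative data: 9 points, so the terciles split it into 3 groups of 3.
data = [12, 3, 7, 20, 15, 9, 1, 18, 5]
lower_tercile = data_quantile(data, 1, 3)  # 3rd smallest value
upper_tercile = data_quantile(data, 2, 3)  # 6th smallest value
median = data_quantile(data, 1, 2)         # 5th smallest value
print(lower_tercile, upper_tercile, median)
```

For the nine sample points, the sorted data are 1, 3, 5, 7, 9, 12, 15, 18, 20, so the two terciles are 5 and 12 and the three groups are {1, 3, 5}, {7, 9, 12}, and {15, 18, 20}.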

  • 0
    Medians and medoids need separate names. (2011-07-22)
1

According to Wikipedia, it is "tertile" or "tercile". The generic word is "quantile".