2

A set is a collection of distinct objects.

Suppose I sample some parameter of some population and the results are $\{35,35,36,36,37 \}$.

Do I need to say (i) the multiset of the samples is $\{35,35, \dots \}$ instead of (ii) the set of samples is $\{35,35, \dots \}$?

  • 0
    It seems worth mentioning that informal representations of sets such as {a,b,c}, where a, b, and c are letters of the alphabet, are often used. I say informal because a, b, and c, interpreted as letters, are not themselves sets, whereas in formal set theory every element of a set is itself a set. However, we can replace something such as {a, b, c} with the more formal $\{0, 1, 2\}$, where $0 = \emptyset$, $1 = \{0\}$, and $2 = \{0, 1\}$, so that each member is indeed a set. The point is that how formal one should be really depends on the context, as the answers suggest. 2011-01-18

3 Answers

3

Yes, I think it would be better to use the word "multi-set" here. Formally, say, $\{35,35,7\}$ and $\{35,7\}$ are the same set, but in your case the number of times each element is mentioned actually matters.

One caveat: not everybody is careful, and some see this distinction as pedantic, so you may find books or websites that mention the set $\{35,35,7\}$ when really it is the multi-set that is meant.

Also, sometimes people use ordered tuples $(35,35,7)$ to avoid this issue.

3

Two sets $A$ and $B$ are the same if and only if they have the same elements; that is, if and only if for every $x$, $x\in A$ is true if and only if $x\in B$ is true.

That means that the set $A=\{1,1\}$ and the set $B=\{1\}$ are really the same set, because they have exactly the same elements. A set cannot "tell" how many times an element is listed in it.

Your set, $A=\{35, 35, 36, 36, 37\}$, and the set $B=\{35, 36, 37\}$ are the same set: for every $x$, $x$ is in $A$ if and only if $x$ is in $B$. They are both also the same as the set $C=\{37, 37, 37, 37, 35, 36\}$, and the set $D=\{37,36,35,35,35,35,35,35,35,35,35,35\}$.
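If it helps to see this concretely, here is a minimal sketch that uses Python's built-in `set` as a stand-in for a mathematical set (that choice, and the variable names `same_set` and `size_of_A`, are only for illustration). Repeated listings collapse, so the three literals below compare equal.

```python
# Python's built-in set is used here as a stand-in for a mathematical set.
A = {35, 35, 36, 36, 37}
B = {35, 36, 37}
C = {37, 37, 37, 37, 35, 36}

# Listing an element more than once, or in a different order, changes nothing.
same_set = (A == B == C)

# The set only "sees" the distinct elements 35, 36 and 37.
size_of_A = len(A)

print(same_set, size_of_A)
```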

Multisets are slightly different; in a multiset, the number of times that an element appears does matter. Two multisets $A$ and $B$ are equal if and only if for every $x$, the number of times that $x$ occurs in $A$ is the same as the number of times that $x$ occurs in $B$. However, multisets don't care about order, so the multiset $A=\{35, 35, 36, 36, 37\}$ and the multiset $B=\{35, 36, 37, 35, 36\}$ are equal.
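The same comparison for multisets can be sketched with `collections.Counter` from the Python standard library, which maps each element to the number of times it occurs. Treating a `Counter` as a multiset is an assumption made for this illustration, and the names `order_ignored` and `counts_matter` are hypothetical.

```python
from collections import Counter

# A Counter records how many times each element occurs, like a multiset.
A = Counter([35, 35, 36, 36, 37])
B = Counter([35, 36, 37, 35, 36])
C = Counter([35, 36, 37])

# Same multiplicities, different listing order: equal as multisets.
order_ignored = (A == B)

# 35 and 36 occur twice in A but only once in C: different multisets.
counts_matter = (A != C)

print(order_ignored, counts_matter, A[35])
```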

If you want to keep track of both the number of times and the order, the simplest thing is to consider ordered pairs; for example, instead of the set you attempted to give, you could have a set in which the $n$th reading is represented by the ordered pair $(n,a)$, where $a$ is the reading itself. So you could take $A=\{(1,35), (2,35), (3,36), (4,36), (5,37)\}$; this would be different from the set in which the readings are, in order, $35$, $36$, $37$, $36$, and $35$. So you need to figure out exactly what matters and what does not.

If only what actually occurs matters, use sets.

If what matters is what occurs and the number of times it occurs, use multisets.

If what matters is what occurs, the order, and the number of times it occurs, use sets of ordered pairs or some other modification.
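The ordered-pair representation can be sketched in Python as follows. The helper `indexed` and the names `readings_1` and `readings_2` are hypothetical, introduced only for this illustration. The two sequences contain the same values the same number of times, so neither a set nor a multiset distinguishes them, but the sets of pairs $(n,a)$ do.

```python
from collections import Counter

readings_1 = [35, 35, 36, 36, 37]
readings_2 = [35, 36, 37, 36, 35]

def indexed(readings):
    """Return the set of pairs (n, a), where a is the n-th reading (1-based)."""
    return set(enumerate(readings, start=1))

A = indexed(readings_1)
B = indexed(readings_2)

# As sets and as multisets the two samples are indistinguishable...
same_as_sets = set(readings_1) == set(readings_2)
same_as_multisets = Counter(readings_1) == Counter(readings_2)

# ...but the sets of ordered pairs differ, because the order differs.
same_as_indexed = (A == B)

print(same_as_sets, same_as_multisets, same_as_indexed)
```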

Either way, to avoid confusion, use the corresponding nomenclature: if you want multisets, say "multiset"; if you want sets, say "set".

1

I think it would be proper to use "multiset". But most statisticians probably just say "set".