Random variables with almost countable images

$\newcommand{\F}{\mathscr F}$$\newcommand{\R}{\mathbb R}$$\newcommand{\A}{\mathscr A}$$\newcommand{\G}{\mathcal G}$$\newcommand{\E}{\operatorname E}$$\newcommand{\dp}{\,\text dP}$$\newcommand{\1}{\mathbf 1}$Let $(\Omega, \F, P)$ be a probability space and let $X : \Omega \to \R$ be a random variable. For this to be interesting I’m going to assume $\Omega$ is uncountable; otherwise the image of $X$ is automatically at most countable. Now suppose there is some $S\subset \R$ such that

1. $P(X \in S) = 1$

2. $|S| \leq\aleph_0$

3. $S \subsetneq X(\Omega)$.

In words, I’m assuming that there is an at most countable set $S$ such that $X$ is in $S$ almost surely, but $X$ does take values outside of $S$ too. Singletons are Borel and $S$ is a countable union of singletons so it is guaranteed to be Borel as well.

To warm up, I’ll give an example of such an RV. I’ll consider the measurable space $([0,1], \mathbb B_{[0,1]})$ and I’ll equip it with the probability measure $P := \frac 12 (\delta_0 + \delta_1)$, i.e. the average of point masses at $0$ and $1$. Let $X : [0,1] \to [0,1]$ be given by $X(\omega) = \omega$, i.e. $X = \text{id}$. $X$ is clearly $(\mathbb B_{[0,1]}, \mathbb B_{[0,1]})$-measurable so it is a valid RV. Furthermore, $P(X \in \{0,1\}) = P(\{0,1\}) = 1$ so $X \in \{0,1\}$ almost surely even though $X$ takes on every value in $[0,1]$.
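As a quick sanity check of this warm-up example, here is a small Python sketch (my own illustration, not part of the argument): sampling $\omega$ from $P = \frac 12(\delta_0 + \delta_1)$ only ever produces $0$ or $1$, so every draw of $X$ lands in $S = \{0,1\}$ even though $X$ maps onto all of $[0,1]$.

```python
import random

def sample_P():
    """Draw omega from P = (1/2)(delta_0 + delta_1): a fair coin flip
    between the point masses at 0 and 1."""
    return 0.0 if random.random() < 0.5 else 1.0

X = lambda omega: omega  # X = id on [0, 1]

random.seed(0)
draws = [X(sample_P()) for _ in range(10_000)]

# Every draw lies in S = {0, 1}, illustrating P(X in S) = 1, even though
# the function X itself takes every value in [0, 1].
assert set(draws) <= {0.0, 1.0}
```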

Result 1: the distribution of $X$, $P_X := P\circ X^{-1}$, only depends on the measures of the preimages of elements of $S$.

Pf: Consider some Borel set $B$. I can split $B$ over $S$ as
$$
B = B \cap \R = B\cap (S \cup S^c) \\
= (B \cap S) \cup (B \cap S^c)
$$
which is a union of two disjoint sets, the second of which has zero measure w.r.t. $P_X$. Next, since $S$ is at most countable,
$$
B \cap S = \bigcup_{s \in B\cap S} \{s\}
$$
which is a countable union of disjoint sets too. This means that
$$
\begin{aligned}P_X(B) &= P_X(B \cap S) + P_X(B \cap S^c) = P_X(B \cap S) \\&
= \sum_{s \in B\cap S} P_X(\{s\}) \\&
= \sum_{s \in B\cap S} P\left(X^{-1}(\{s\})\right)\end{aligned}
$$
so the measure of any Borel set $B$ is determined by the overlap it has with $S$. Even though $X$ takes values not in $S$, since $P(X\in S) = 1$ those values do not affect any probabilities.

$\square$
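To make Result 1 concrete, here is a small Python sketch (my own illustration) using the warm-up example: with $S = \{0,1\}$ and masses $P(X^{-1}(\{s\})) = \frac12$, the value of $P_X(B)$ for any set $B$ is read off entirely from $B \cap S$.

```python
# Point masses P(X^{-1}({s})) for s in S, taken from the warm-up example.
mass = {0.0: 0.5, 1.0: 0.5}

def P_X(B):
    """P_X(B) = sum over s in (B intersect S) of P(X^{-1}({s})).
    B is given as a membership test (a predicate on the reals)."""
    return sum(p for s, p in mass.items() if B(s))

# B = [0, 1/2] meets S only at 0, so its measure is the mass at 0.
assert P_X(lambda x: 0 <= x <= 0.5) == 0.5
# B = (1/4, 3/4) misses S entirely, so it gets measure zero.
assert P_X(lambda x: 0.25 < x < 0.75) == 0.0
# B = R picks up all of S.
assert P_X(lambda x: True) == 1.0
```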

This suggests a partition of $\Omega$. For $s \in S$ let $A_s = X^{-1}(\{s\})$ and let $A_\dagger = X^{-1}(S^c)$ which I can collect into $\A := \{A_s : s \in S\} \cup \{A_\dagger\}$. This is a disjoint collection of sets and $\bigcup_{A \in \A} A = \Omega$ so $\A$ gives a countable partition of $\Omega$. Let $\G = \sigma(\A)$, i.e. $\G$ is the $\sigma$-algebra generated by $\A$. Since the elements of $\A$ are disjoint this is just all possible unions of elements of $\A$.

Result 2: Let $\A$ be a partition of a sample space $\Omega$, $\G = \sigma(\A)$, and $X:\Omega\to\mathbb R$ be an RV. Then

1. $X$ being $(\G, \mathbb B)$-measurable implies $X$ is constant on each element of $\A$

2. $X$ being constant on each element of $\A$ implies $(\G, \mathbb B)$-measurability if additionally $\A$ is countable.

Pf: I’ll begin by showing that $X$ being $(\G, \mathbb B)$-measurable implies that it is constant on elements of $\A$ via the contrapositive.

So suppose $X$ is not constant on all elements of $\A$, i.e. there is some $A \in \A$ such that $X(A)$ contains more than one element. Let $r \in X(A)$ be one of these values and consider $X^{-1}(\{r\})$, whose intersection with $A$ is a nonempty proper subset of $A$. $\{r\}$ is Borel so if $X$ were $(\G, \mathbb B)$-measurable I’d have $X^{-1}(\{r\})\in \G$.

The elements of $\A$ are pairwise disjoint, and the collection of all (not necessarily countable) unions of elements of $\A$ is itself a $\sigma$-algebra containing $\A$, so every $G \in \G$ is a union of elements of $\A$. This means that for any $G \in \G$ I will have $G\cap A = A$ or $G \cap A = \emptyset$. Since $X^{-1}(\{r\}) \cap A$ is neither $A$ nor empty, $X^{-1}(\{r\}) \notin \G$ and so $X$ is not $(\G,\mathbb B)$-measurable.

For the other direction, suppose $X$ is constant on elements of $\A$ and that $|\A|\leq\aleph_0$. For $A \in \A$ I’ll use $x_A$ to denote the value that $X$ takes there. Let $B \in \mathbb B$ be an arbitrary Borel set. In a manner similar to the proof of Result 1, I can write $B$ as
$$
B = B \cap \mathbb R = B \cap \left(X(\Omega) \cup X(\Omega)^c\right) \\
= (B \cap X(\Omega)) \cup (B \cap X(\Omega)^c).
$$
I’ll now pull this back via $X^{-1}$. $X^{-1}(B \cap X(\Omega)^c) = \emptyset$ since no $\omega\in\Omega$ maps outside the image of $X$, therefore
$$
\begin{aligned}X^{-1}(B) &= X^{-1}(B \cap X(\Omega)) \\&
= X^{-1}\left(B \cap \bigcup_{A\in \A} \{x_A\}\right) \\&
= X^{-1}\left(\bigcup_{A\in \A} \{x_A\}\cap B\right) \\&
= \bigcup_{A \in \A \,:\, x_A \in B} A \in \G\end{aligned}
$$
since this is just a countable union of elements of $\A$, which is precisely what $\G$ contains.

As an example of why $\A$ being countable mattered for this direction, I can always partition $\Omega = \bigcup_{\omega \in \Omega} \{\omega\}$ and $X$, being a well-defined function, is always constant on each element of this partition. So if this was sufficient for proving that $X$ is measurable then every function would be measurable. Countability comes into play due to the fact that $\sigma$-algebras are only guaranteed closed under countable, not uncountable, unions.

$\square$
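Result 2 is easy to check computationally on a finite space. Below is a Python sketch (the toy partition and names are my own): $\sigma(\A)$ is generated as all unions of parts, and measurability is tested by checking that the preimage of each attained value lands in $\G$, which suffices on a finite space.

```python
from itertools import combinations

Omega = {1, 2, 3, 4}
A = [frozenset({1, 2}), frozenset({3}), frozenset({4})]  # a partition of Omega

def sigma(parts):
    """sigma(A) for a finite partition: all unions of parts (including
    the empty union, which gives the empty set)."""
    out = set()
    for r in range(len(parts) + 1):
        for combo in combinations(parts, r):
            out.add(frozenset().union(*combo))
    return out

G = sigma(A)

def measurable(X):
    """On a finite space it is enough to check that X^{-1}({r}) is in G
    for every value r that X attains."""
    return all(frozenset(w for w in Omega if X(w) == r) in G
               for r in {X(w) for w in Omega})

X_const = {1: 10, 2: 10, 3: 20, 4: 30}.get   # constant on each part
X_split = {1: 10, 2: 99, 3: 20, 4: 30}.get   # not constant on {1, 2}
assert measurable(X_const) and not measurable(X_split)
```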

I’ll now return to having $X$, $\A$, and $\G$ be my particular objects from before.

Corollary: $X$ is not necessarily $(\G, \mathbb B)$-measurable.

Pf: $\G = \sigma(\A)$ so $\A$ is my partition. $X$ is constant on each element of $\{A_s : s\in S\}$, but $X$ is not guaranteed to be constant on $A_\dagger$ and in general won’t be (unless $X$ takes exactly one value outside of $S$, i.e. $|X(\Omega)\setminus S| = 1$), so the result follows by Result 2.

$\square$

So $X$ is not necessarily measurable w.r.t. $\G$, the $\sigma$-algebra generated from the partition $\A$, but it “almost” is since $P(A_\dagger) = 0$. I can make this formal as follows:

Result 3: there is a $(\G, \mathbb B)$-measurable random variable $Y$ such that $P(X=Y) = 1$.

Pf: Define a random variable $Y : \Omega \to \R$ by
$$
Y(\omega) = \begin{cases} X(\omega) & \omega \in \bigcup_{s\in S} A_s \\ 0 & \omega \in A_\dagger \end{cases}.
$$
I can write $Y$ more succinctly as
$$
Y(\omega) = \sum_{s \in S} s \mathbf 1_{A_s}(\omega).
$$
$X$ was already constant on each $A_s$ for $s\in S$ so $Y$ is too, and by setting $Y$ to zero on $A_\dagger$ it is constant there too, therefore $Y$ is constant on each element of $\A$ thus by Result 2 $Y$ is $(\G, \mathbb B)$-measurable.

For almost sure equality,
$$
\{\omega \in \Omega : X(\omega)= Y(\omega)\} = (\cup_s A_s) \cup X^{-1}(\{0\}).
$$
I know that $X$ is $(\F, \mathbb B)$-measurable so, since $\{0\}$ is Borel, $X^{-1}(\{0\}) \in \F$ which makes this a countable union of sets in $\F$ and is therefore in $\F$ too. $P$ is monotonic so
$$
P\left((\cup_s A_s) \cup X^{-1}(\{0\})\right) \geq P(\cup_s A_s) = 1
$$
therefore $P(X=Y) = 1$.

$\square$

This shows that $X$ is almost everywhere equal to a $(\G, \mathbb B)$-measurable random variable that is constant on each part of the partition given by $\A$.
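A sketch of the construction in Result 3 on the warm-up example (my own illustration): $Y$ agrees with $X$ on $\bigcup_s A_s$ and is forced to $0$ on $A_\dagger$, so the two only disagree on a $P$-null set.

```python
S = {0.0, 1.0}  # the almost-sure countable set from the warm-up example

def X(omega):
    """X = id on [0, 1]."""
    return omega

def Y(omega):
    """Y = sum over s in S of s * 1_{A_s}: equals X on each A_s, 0 on A_dagger."""
    x = X(omega)
    return x if x in S else 0.0

# X and Y disagree on A_dagger (e.g. omega = 0.5) ...
assert X(0.5) == 0.5 and Y(0.5) == 0.0
# ... but agree on the only points P = (1/2)(delta_0 + delta_1) charges,
# so P(X = Y) = 1.
assert all(X(w) == Y(w) for w in (0.0, 1.0))
```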

Connection to conditional expectation

Before I prove the main result of this section, I’ll prove the following lemma.

Lemma 1: For a probability space $(\Omega, \F, P)$ let $\{C_i : i \in \mathbb N\}$ be a disjoint collection of sets in $\F$. Then for any integrable RV $X:\Omega\to\mathbb R$ I’ll have
$$
\int_{\bigcup_{i\in\mathbb N} C_i} X\,\text dP = \sum_{i\in\mathbb N} \int_{C_i} X\,\text dP.
$$
Pf: I can rewrite the left hand side integral using an indicator function as
$$
\int_{\bigcup_{i\in\mathbb N} C_i} X\,\text dP = \int_{\Omega} \mathbf 1_{\cup_i C_i} X\,\text dP \\
= \int_\Omega \sum_{i\in\mathbb N} \mathbf 1_{C_i} X\,\text dP
$$
so the problem reduces to being able to exchange an infinite sum with an integral.

By assumption $X$ is integrable, meaning $\int_\Omega |X|\,\text dP < \infty$, so
$$
\int_{\bigcup_{i\in\mathbb N} C_i} |X|\,\text dP = \int_\Omega \sum_{i\in\mathbb N} |\mathbf 1_{C_i} X|\,\text dP \leq \int_\Omega |X|\,\text dP < \infty
$$
therefore I can apply Fubini’s theorem to conclude that
$$
\int_{\bigcup_{i\in\mathbb N} C_i} X\,\text dP = \sum_{i\in\mathbb N}\int_{C_i}X\,\text dP.
$$

$\square$
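A numeric sketch of Lemma 1 (the setup here, a geometric measure on the nonnegative integers with pieces $C_i = \{2i, 2i+1\}$, is my own toy example, truncated because a computer can’t sum infinitely many terms): the integral over the union matches the sum of the per-piece integrals.

```python
# P({w}) = 2^{-(w+1)} defines a probability measure on {0, 1, 2, ...}.
P = lambda w: 0.5 ** (w + 1)
X = lambda w: float(w)            # integrable: E|X| = sum of w * 2^{-(w+1)} < inf
C = lambda i: (2 * i, 2 * i + 1)  # disjoint pieces C_i covering the integers

N = 30  # truncation level; the discarded tail is negligible in double precision
lhs = sum(X(w) * P(w) for i in range(N) for w in C(i))       # int over the union
rhs = sum(sum(X(w) * P(w) for w in C(i)) for i in range(N))  # sum of piece integrals
assert abs(lhs - rhs) < 1e-12
```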

Suppose now that $\Omega$ can be partitioned into a countable collection $\mathcal C := \{C_0, C_1, C_2, \dots \}$ where $C_i \in \F$ and $P(C_i) > 0$ for all $i \in \mathbb N$. Let $\mathscr C = \sigma(\mathcal C)$.

Result 4: For an integrable RV $X: \Omega\to\mathbb R$, I claim
$$
Y := \E(X\mid \mathscr C) = \sum_{i=0}^\infty \frac{\E(X\mathbf 1_{C_i})}{P(C_i)}\mathbf 1_{C_i}.
$$

Pf: All I have to do is verify the two requirements for the definition of conditional expectation and then because $\E(X\mid \mathscr C)$ is a.s. unique, I’ll be done.

$Y$ needs to be $(\mathscr C, \mathbb B)$-measurable, and this follows directly from $\mathcal C$ being a countable partition and Result 2. Therefore I just need to check that
$$
\int_{A}Y\,\text dP = \int_A X\,\text dP
$$
for $A \in \mathscr C$ and I’m done.

First, suppose $A = C$ for some $C \in \mathcal C$. $Y$ is constant on $C$ and
$$
\int_C Y \,\text dP = \frac{\E(X\mathbf 1_C)}{P(C)} \int_C \,\text dP \\
= \E(X \mathbf 1_C) = \int_C X \,\text dP
$$
so the result holds there. For a general $A \in \mathscr C$, the key is to note that $A$ is a countable union of elements of $\mathcal C$ so I can write
$$
A = \bigcup_{C \in \mathcal C : C \cap A \neq \emptyset} C
$$
therefore
$$
\int_A Y\,\text dP = \int_{\bigcup_{C \in \mathcal C : C \cap A \neq \emptyset} C} Y \,\text dP \\
= \sum_{C \in \mathcal C : C \cap A \neq \emptyset} \E(X\mathbf 1_{C})
$$
by Lemma 1. I now want to switch $\E$ with this countable sum but I’ll need to justify that.

Assuming I can make that exchange, I’ll have
$$
\int_A Y \,\text dP = \E\left(X \sum_{C \in \mathcal C : C \cap A \neq \emptyset} \mathbf 1_C\right) \\
= \E \left(X \mathbf 1_A\right)
$$
as desired.

I’ll now justify that exchange via a simple application of the Dominated Convergence Theorem (DCT). In the step in question I had a countable collection of sets $\{C \in \mathcal C : C \cap A \neq \emptyset\}$, which I’ll enumerate as $\{B_1, B_2, \dots\}$ for convenience.

I have
$$
\begin{aligned}\sum_{i=1}^\infty\E\left[X \mathbf 1_{B_i}\right] &= \lim_{n\to\infty}\sum_{i=1}^n \int X\1_{B_i}\dp \\&
= \lim_{n\to\infty}\int \sum_{i=1}^n X\1_{B_i}\dp \\&
= \lim_{n\to\infty} \int X \1_{\cup_{i \leq n} B_i} \dp \\&
\stackrel{?}= \int \lim_{n\to\infty} X\1_{\cup_{i \leq n} B_i} \dp \\&
= \int_{\cup_{i=1}^\infty B_i} X\dp\end{aligned}
$$
and then “$\stackrel ?=$” gives the key step that I need to prove.

Setting up the DCT, I’ll take $f_n = X \1_{\cup_{i \leq n} B_i}$; these converge pointwise to $f_\infty = X\1_{\cup_{i=1}^\infty B_i} = X\1_A$. Taking $g = |X|$ as my bounding function, I know $g$ is integrable by assumption and $|f_n| = |X|\1_{\cup_{i \leq n} B_i} \leq |X| = g$ for all $\omega\in\Omega$ and $n\in\mathbb N$, so those are all the pieces I needed and the result holds via the DCT.

$\square$
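Result 4 can be checked directly on a finite space. The following Python sketch (the uniform toy space and cell values are my own) builds $Y$ from the claimed formula and verifies the defining property $\int_A Y\,\text dP = \int_A X\,\text dP$ for every $A \in \sigma(\mathcal C)$.

```python
from itertools import combinations

# Toy space: Omega = {0,...,5} with the uniform measure, partitioned
# into three cells of positive probability.
Omega = list(range(6))
P = {w: 1 / 6 for w in Omega}
cells = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]
X = {0: 1.0, 1: 3.0, 2: 2.0, 3: 4.0, 4: 6.0, 5: 5.0}

def integral(f, A):
    """integral_A f dP on a finite space: a weighted sum."""
    return sum(f[w] * P[w] for w in A)

# Build Y = sum_i E(X 1_{C_i}) / P(C_i) * 1_{C_i}: constant on each cell.
Y = {}
for C in cells:
    avg = integral(X, C) / sum(P[w] for w in C)
    for w in C:
        Y[w] = avg

# The defining property: integral_A Y dP = integral_A X dP for every
# A in sigma(cells), i.e. for every union of cells.
for r in range(len(cells) + 1):
    for combo in combinations(cells, r):
        A = frozenset().union(*combo)
        assert abs(integral(Y, A) - integral(X, A)) < 1e-12
```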

Now I’ll modify this by changing one element of the partition to have measure zero so that this reflects my situation with $\A$ and $\G$.

Result 5: Suppose now for my partition that $P(C_0) = 0$ but the rest is as before. I claim
$$
Y := \E(X\mid \mathscr C) \stackrel{\text{a.s.}}= \sum_{i=1}^\infty \frac{\E(X\mathbf 1_{C_i})}{P(C_i)}\mathbf 1_{C_i}
$$
(note that $C_0$ does not appear; this reflects that $Y$ can be changed on $C_0$ without affecting anything since $C_0$ is a $P$-null set).

Pf: $(\mathscr C, \mathbb B)$-measurability still holds because $\mathcal C$ is still a countable partition and Result 2 didn’t require positive measure. For the other part of the definition, note that any $A \in \mathscr C$ is a union of cells, so either $C_0 \cap A = \emptyset$, in which case everything proceeds as in Result 4, or $C_0 \subseteq A$, in which case I can separate it out so
$$
\begin{aligned}\int_A Y \,\text dP &= \int_{C_0} Y\,\text dP + \int_{A\backslash C_0} Y \,\text dP \\&
= \int_{A\backslash C_0} Y \,\text dP = \int_{A\backslash C_0} X \,\text dP \\&
= \int_{C_0} X\,\text dP + \int_{A\backslash C_0} X \,\text dP = \int_A X \,\text dP\end{aligned}
$$
so my result still holds due to the fact that integrals over $C_0$ are zero.

$\square$
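Extending the previous finite sketch, here is Result 5 in miniature (my own toy numbers again, chosen dyadic so the float arithmetic is exact): one cell is $P$-null, the formula skips it, and $Y$ may be set arbitrarily there without breaking the defining property.

```python
# Toy space with a null cell: C_0 = {0} has probability zero; the other
# two cells have positive measure.
Omega = [0, 1, 2, 3]
P = {0: 0.0, 1: 0.5, 2: 0.25, 3: 0.25}
cells = [frozenset({0}), frozenset({1}), frozenset({2, 3})]
X = {0: 99.0, 1: 2.0, 2: 4.0, 3: 8.0}

# The formula skips the null cell; Y is set arbitrarily (here 0) on it.
Y = {0: 0.0}
for C in cells[1:]:
    avg = sum(X[w] * P[w] for w in C) / sum(P[w] for w in C)
    for w in C:
        Y[w] = avg

# integral_A Y dP = integral_A X dP still holds for unions of cells,
# including those containing the null cell C_0.
for A in [frozenset({0, 1}), frozenset({0, 2, 3}), frozenset(Omega)]:
    assert sum(Y[w] * P[w] for w in A) == sum(X[w] * P[w] for w in A)
```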

The upshot of this is that if $X$ is integrable and each $A_s$ has positive measure then by conditional expectation I have the existence of a $(\G, \mathbb B)$-measurable random variable $Y := \E(X \mid \G)$ and for $\omega \in A_s$ I’ll have
$$
Y(\omega) = \frac{\E(X \mathbf 1_{A_s})}{P(A_s)} = \frac{1}{P(A_s)}\int_{A_s} X\,\text dP \\
= \frac{s}{P(A_s)}\int_{A_s} \,\text dP = s.
$$
On $A_\dagger$ I can have $Y$ take an arbitrary value so I’ll set it to zero, which means I have
$$
\E(X\mid \G) \stackrel{\text{a.s.}}= \sum_{s\in S}s \mathbf 1_{A_s}
$$
which is exactly the RV from Result 3, except I didn’t need integrability or positive measure of the $A_s$ to get there.

Discretizing $X$

I’ll now get an actually discrete variable out of $X$. I have a partition of $\Omega$ given by $\A$, so I can let $\sim$ be the corresponding equivalence relation where for $a,b\in\Omega$ I’ll say $a \sim b \iff a,b$ are in the same piece of the partition. I can now consider $\Omega' := \Omega / {\sim}$, so I’m collapsing each $A \in \A$ to a single element, and now $|\Omega'|\leq\aleph_0$. The corresponding $\sigma$-algebra is $\F' = 2^{\Omega'}$ because all possible unions of elements of $\A$ now correspond to all subsets of the newly created atoms. Finally, for my measure I have $P'$, which is defined by its values on the atoms of $\Omega'$.

Let $Z : \Omega' \to \R$ be a random variable where for $\omega'\in\Omega'$, $Z$ takes the value that $X$ takes on the corresponding partition element; if $\omega'$ corresponds to $A_\dagger$, then $Z(\omega') := 0$. Thus $Z$ directly corresponds to the RV $Y = \sum_{s\in S} s \1_{A_s}$ from before, but collapsed down ($Z$ is guaranteed to be measurable since $\F'$ is the power set).

Let $P_Z$ be the distribution of $Z$. $Z$ only takes countably many values so the CDF $F_Z$ will be a step function which makes $Z$ a discrete random variable (this is the definition of a discrete RV as given by Jun Shao in his Mathematical Statistics).
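Finally, a sketch of $Z$’s distribution for the warm-up example (my own illustration): the pmf sits on the two atoms and the CDF $F_Z$ is a right-continuous step function, matching Shao’s definition of a discrete RV.

```python
# pmf of Z on the collapsed space: P'(atom for A_0) = P'(atom for A_1) = 1/2;
# the atom for A_dagger carries zero mass, so it contributes nothing.
pmf = {0.0: 0.5, 1.0: 0.5}

def F_Z(t):
    """CDF of Z: F_Z(t) = P(Z <= t), a step function with jumps on S."""
    return sum(p for z, p in pmf.items() if z <= t)

assert F_Z(-1.0) == 0.0   # below every atom
assert F_Z(0.5) == 0.5    # picks up the jump at 0
assert F_Z(2.0) == 1.0    # past every atom
```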
