2.3 Conditioning on \(\sigma\)-algebras and random variables

The next level of abstraction comes from a simple application: we want to condition on random variables (i.e., measurable maps) or, more generally, on \(\sigma\)-algebras. Intuitively, suppose \(X\) and \(Y\) are random variables on some probability space. What should conditioning \(X\) on \(Y\) mean? One way to think of it is this: if we're given enough information about \(Y\), what is the best guess we can make for \(X\)? This guess is captured by the conditional expectation \(\mathbb{E}[X | Y]\). Also, note that since we don't condition on a specific value of \(Y\), we expect this object to itself be random.

Let’s now see the more general conditional expectation, which conditions a random variable on a \(\sigma\)-algebra. Let \(X\) be a random variable on a probability space \((\Omega, \mathcal{A}, \mathbb{P})\), and assume that \(X\in\mathcal{L}^1(\Omega, \mathcal{A}, \mathbb{P})\) (i.e., \(X\) is Lebesgue integrable). Let \(\mathcal{F}\subset\mathcal{A}\) be a sub-\(\sigma\)-algebra. Again, the question we want to ask is: given \(\mathcal{F}\), what is the best guess we can make about \(X\)? We’ll denote this guess by \(\mathbb{E}[X | \mathcal{F}]\).

To that end, we posit that \(\mathbb{E}[X | \mathcal{F}]\) is itself a random variable. Moreover, it turns out that, to get the best guess of \(X\) given \(\mathcal{F}\), it is enough to be able to take averages of \(X\) over sets in \(\mathcal{F}\). This leads us to a definition.

Definition 2.1 (Conditional Expectation) A random variable \(Y\) is called a conditional expectation of \(X\) given \(\mathcal{F}\), denoted by \(\mathbb{E}[X | \mathcal{F}]\), if the following are true:

  1. \(Y\) is \(\mathcal{F}\)-measurable.
  2. For any \(A\in\mathcal{F}\), we have that \(\mathbb{E}[X1_A] = \mathbb{E}[Y1_A]\). Intuitively, this is just spelling out the fact that, restricted to sets in \(\mathcal{F}\), we can average out \(X\) via its \(\mathcal{F}\)-measurable proxy \(Y\).
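On a finite probability space this definition becomes concrete: if \(\mathcal{F}\) is generated by a partition, then \(\mathbb{E}[X | \mathcal{F}]\) is simply the block-wise average of \(X\). Here is a minimal sketch on a hypothetical twelve-point uniform space (the space, the partition, and the random variable are all made up for illustration), which also checks the defining averaging property:

```python
import numpy as np

# Hypothetical finite setting: Omega = {0,...,11} with uniform P, and F the
# sigma-algebra generated by the partition {0..3}, {4..7}, {8..11}.
rng = np.random.default_rng(0)
omega = np.arange(12)
X = rng.normal(size=12)            # an arbitrary random variable on Omega
blocks = [omega[:4], omega[4:8], omega[8:]]

# E[X | F] is constant on each block, equal to the average of X over it.
Y = np.empty_like(X)
for b in blocks:
    Y[b] = X[b].mean()

# Check the defining property E[X 1_A] = E[Y 1_A] for A a union of blocks.
A = np.zeros(12, dtype=bool)
A[blocks[0]] = True
A[blocks[2]] = True
lhs = (X * A).mean()   # E[X 1_A] under the uniform measure
rhs = (Y * A).mean()   # E[Y 1_A]
assert abs(lhs - rhs) < 1e-12
```

Note that \(Y\) is \(\mathcal{F}\)-measurable precisely because it is constant on each block of the partition, matching condition 1 of the definition.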

Taking this a step further, if \(X\) and \(Y\) are random variables, we define

\[ \begin{aligned} \mathbb{E}[X | Y] := \mathbb{E}[X | \sigma(Y)] \end{aligned} \]
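To see this definition in action, here is a sketch with a standard toy example (not from the text above): two fair dice, with \(X\) the sum and \(Y\) the first die. The atoms of \(\sigma(Y)\) are the events \(\{Y = k\}\), and averaging \(X\) over each atom gives \(\mathbb{E}[X | Y] = Y + 3.5\):

```python
import itertools
import numpy as np

# Two fair dice: Omega is the 36 ordered pairs, uniform P.
# X = sum of the dice, Y = value of the first die.
pairs = np.array(list(itertools.product(range(1, 7), repeat=2)))
X = pairs.sum(axis=1)       # X(omega) = d1 + d2
Y = pairs[:, 0]             # Y(omega) = d1

# sigma(Y) is generated by the six atoms {Y = k}; average X over each atom.
cond_exp = np.empty(36, dtype=float)
for k in range(1, 7):
    atom = (Y == k)
    cond_exp[atom] = X[atom].mean()

# The result is a function of Y, as sigma(Y)-measurability demands:
assert np.allclose(cond_exp, Y + 3.5)
```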

It turns out that, under our assumptions, conditional expectations exist and are unique up to almost-sure equality. Conditional expectations have many interesting properties (e.g., linearity, the tower property), which I will not cover here; readers can consult the references for more details.
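One of these properties, the tower property \(\mathbb{E}[\mathbb{E}[X | \mathcal{F}_2] | \mathcal{F}_1] = \mathbb{E}[X | \mathcal{F}_1]\) for \(\mathcal{F}_1 \subset \mathcal{F}_2\), can be checked numerically in the finite setting. The following sketch uses a made-up eight-point uniform space with two nested partitions (the coarse blocks are unions of fine blocks, so the coarse \(\sigma\)-algebra is contained in the fine one):

```python
import numpy as np

def cond_exp(X, labels):
    """E[X | sigma(partition)] on a finite uniform space: average X over
    each block, where labels[i] names the block containing point i."""
    Y = np.empty(len(X), dtype=float)
    for lab in np.unique(labels):
        Y[labels == lab] = X[labels == lab].mean()
    return Y

rng = np.random.default_rng(1)
X = rng.normal(size=8)
fine   = np.array([0, 0, 1, 1, 2, 2, 3, 3])   # finer partition -> larger sigma-algebra F2
coarse = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # coarser partition -> smaller F1, F1 subset of F2

# Tower property: conditioning in two stages agrees with conditioning once
# on the smaller sigma-algebra.
lhs = cond_exp(cond_exp(X, fine), coarse)
rhs = cond_exp(X, coarse)
assert np.allclose(lhs, rhs)
```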