EXAMPLE 1:
A 7-segment display shows any number from 0 to 9 at random (equal
probabilities).
Let $X$ be the indicator random variable of
whether the blue segment is on. Similarly, $Y$ is the
indicator for the red segment. Find the conditional distribution
of $Y$ given $X.$
SOLUTION:
Here $X,Y$ both take values in $\{0,1\}.$
We need to find $P(Y=y | X=x)$ for $x,y\in\{0,1\}.$
Now $P(Y=1|X=1) = P(X=1,Y=1)/P(X=1).$
Both the blue and the red segments are on in only the numbers
3,4,5,6,8,9. So $P(X=1,Y=1) = \frac{6}{10}.$
The blue segment is on in the numbers 2,3,4,5,6,8,9. So $P(X=1) =\frac{7}{10}.$
Hence $P(Y=1|X=1) = P(X=1,Y=1)/P(X=1) = \frac 67.$
You should now be able to work out the other three conditional
probabilities similarly.
■
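A quick sanity check in Python, enumerating the ten equally likely digits (the two digit sets below are read off from the solution; which segments are blue and red depends on the figure, so they are assumptions here):

```python
from fractions import Fraction

# Digits lighting each segment, read off from the solution above
# (which segments are blue/red depends on the figure, so this is assumed).
BLUE_ON = {2, 3, 4, 5, 6, 8, 9}   # digits with X = 1
RED_ON = {3, 4, 5, 6, 8, 9}       # digits with Y = 1

p_x1 = Fraction(len(BLUE_ON), 10)             # each digit has probability 1/10
p_x1_y1 = Fraction(len(BLUE_ON & RED_ON), 10)
p_y1_given_x1 = p_x1_y1 / p_x1
print(p_y1_given_x1)  # 6/7
```

The same three-line pattern gives the remaining conditional probabilities.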
We can define conditional CDF or conditional PMF in the obvious
way.
It is important to understand that the conditional
expectation/variance is a random variable, which is a function of
the conditioning random variable.
Remember the theorem of total probability:
$$
P(A) = P(B) P(A|B) + P(B^c)P(A|B^c),
$$
where we combined the two conditional probabilities of $A$ to
arrive at the (unconditional) probability of $A$?
Well, we can do similar things with conditional
expectation/variance also. In particular, the tower property for
expectations states that $E(E(Y|X)) = E(Y).$
Proof:
Let $X$ take values $x_1,x_2,...$ and $Y$ take
values $y_1,y_2,...$. Let the joint PMF of $(X,Y)$ be
$$
P(X=x_i~\&~Y=y_j) = p_{ij}.
$$
Then $P(Y=y_j | X=x_i) = \frac{p_{ij}}{p_{i\bullet}}.$
So $E(Y|X=x_i) = \sum_j y_j \frac{p_{ij}}{p_{i\bullet}}.$
Expectation of this is
$$
\sum_i E(Y|X=x_i) p_{i\bullet} = \sum_i \sum_j y_j
\frac{p_{ij}}{p_{i\bullet}}p_{i\bullet} = \sum_i \sum_j y_j p_{ij} =
\sum_j y_j \sum_i p_{ij} = \sum_j y_j p_{\bullet j} = E(Y),
$$
as required.
[QED]
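The identity $E(E(Y|X)) = E(Y)$ is easy to check numerically. Here is a sketch on a small joint PMF (the table below is made up purely for illustration):

```python
from fractions import Fraction

# An arbitrary joint PMF p[i][j] = P(X = x_i, Y = y_j); entries sum to 1.
y_vals = [1, 2, 3]
p = [[Fraction(1, 12), Fraction(2, 12), Fraction(3, 12)],
     [Fraction(2, 12), Fraction(1, 12), Fraction(3, 12)]]

# E(Y) directly from the marginal of Y.
e_y = sum(y_vals[j] * sum(p[i][j] for i in range(2)) for j in range(3))

# E(E(Y|X)): average the conditional means E(Y|X=x_i) over the marginal of X.
e_ey_given_x = Fraction(0)
for i in range(2):
    p_i = sum(p[i])                                        # p_{i.}
    cond_mean = sum(y_vals[j] * p[i][j] for j in range(3)) / p_i
    e_ey_given_x += cond_mean * p_i

print(e_y, e_ey_given_x)  # equal, as the tower property asserts
```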
Many expectation problems can be handled step by step using this
result. Here are some examples.
EXAMPLE 2:
A casino has two gambling games:
Game 1: Roll a fair die, and win Rs. $D$ if $D$ is the
outcome.
Game 2: Roll two fair dice, and win Rs. 5 if both show the same
number, but lose Rs. 5 otherwise.
You toss a coin with $P(Head)=\frac 13$ and decide to play game
1 if $Head,$ and game 2 if $Tail.$ What is your
expected gain?
SOLUTION:
Let $X$ be your gain (in Rs), and let $Y$ be the outcome of the
toss.
Then $E(X|Y=Head) = 3.5$ and $E(X|Y=Tail) = 5\times\frac{6-30}{36}=-\frac{10}{3}.$
So, by the tower property, $E(X) = E(X|Y=Head)\times P(Y=Head)+E(X|Y=Tail)\times P(Y=Tail) = \cdots.$
■
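Completing the arithmetic (a short script, with the two conditional expectations taken from the solution above):

```python
from fractions import Fraction

# E(X | Y = Head): fair die, win the face value.
e_game1 = Fraction(sum(range(1, 7)), 6)               # 7/2
# E(X | Y = Tail): win 5 on a double (prob 6/36), lose 5 otherwise.
e_game2 = 5 * Fraction(6, 36) - 5 * Fraction(30, 36)  # -10/3

p_head = Fraction(1, 3)
e_x = p_head * e_game1 + (1 - p_head) * e_game2
print(e_x)  # -19/18
```

So the expected gain is $-\frac{19}{18}$: on average you lose about Rs. 1.06 per play.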
The tower property is very useful for computing expectations
involving a random number of random variables. Here is an
example.
EXAMPLE 3:
A random number $N$ of customers enter a shop in a
day, where $N$ takes values in $\{1,...,100\}$ with
equal probabilities. The $i$-th customer pays a random amount $X_i$,
where $X_i$ takes values in $\{1,2,...,10+i\}$
with equal probabilities. Assuming that $N,X_1,...,X_N$ are
all independent, find the total expected payments by the
customers on that day.
SOLUTION:
We have $E(X_i) = \frac{11+i}{2}.$
So $E\left(\sum_1^N X_i|N\right) = \sum_1^N E(X_i|N) = \sum_1^N E(X_i) = \sum_1^N \frac{11+i}{2} = 5.5N+\frac{N(N+1)}{4}.$
By tower property, the required answer is $E\left(5.5N+\frac{N(N+1)}{4}\right)=\cdots.$
■
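The final expectation is a plain average, since $N$ is uniform on $\{1,...,100\}.$ A short script completing the computation:

```python
from fractions import Fraction

# N uniform on {1,...,100}: average 5.5N + N(N+1)/4 over the 100 values.
answer = sum(Fraction(11, 2) * n + Fraction(n * (n + 1), 4)
             for n in range(1, 101)) / 100
print(answer)  # 4545/4, i.e. Rs. 1136.25
```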
EXAMPLE 4:
There are 10 holes, numbered 1 to 10, in a row. 5 balls are dropped
into them at random (a hole may contain any number of balls). Call a
ball "lonely" if there is no other ball in its hole or the
adjacent holes. Find the expected number of lonely balls.
SOLUTION:
Define the indicators $I_1,...,I_5$ as
$$
I_i = \left\{\begin{array}{ll}1&\text{if }i\mbox{-th ball is lonely}\\0&\text{otherwise.}\end{array}\right.
$$
Then the total number of lonely balls is $X = \sum I_i.$
So we are to find $E(X) = \sum E(I_i).$
Let $Y_i = $ the hole where the $i$-th ball has fallen.
Then $E(I_i|Y_i=1)$ is the conditional probability that
all the balls except the $i$-th one have landed in
holes $2,...,10$ given that the $i$-th ball has landed
in hole 1.
You should be able to compute this easily. Similarly, you can
compute $E(I_i|Y_i=k)$ for $k=1,...,10.$
Notice that $Y_i$ can take values $1,...,10$ with equal probabilities.
So the tower property provides the answer as
$$
E(X) = \sum E(E(I_i|Y_i)) = \cdots.
$$
■
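The conditional probabilities are easy to write down: given $Y_i=k,$ the other 4 balls must each avoid holes $k-1,k,k+1$ (truncated at the ends), independently. A short script carrying out the tower-property computation:

```python
from fractions import Fraction

HOLES, BALLS = 10, 5

def p_lonely_given_hole(k):
    # E(I_i | Y_i = k): the other 4 balls must each avoid holes k-1, k, k+1.
    blocked = {h for h in (k - 1, k, k + 1) if 1 <= h <= HOLES}
    return Fraction(HOLES - len(blocked), HOLES) ** (BALLS - 1)

# Tower property: average over the uniform Y_i, then sum over the 5 balls.
e_lonely = BALLS * sum(p_lonely_given_hole(k)
                       for k in range(1, HOLES + 1)) / HOLES
print(e_lonely)  # 137/100
```

So the expected number of lonely balls is exactly $1.37.$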
Proof:
This follows directly from the tower property.
If $X,Y,Z$ are jointly distributed random variables, then we
can talk about conditional distribution of $Z$
given $(X,Y)$ or $X$ given $Z$ or $(X,Z)$
given $Y,$ etc. We can even condition step by step. For
example, we can talk about $E(E(Z|X,Y)|X).$ This is a
function of $X$ alone.
(a) Once you realise that $f_X(x) = P(X=x)$, $f_Y(y) = P(Y=y)$ and
$f_{Y|X}(y|x) = P(Y=y|X=x),$ the given equality is just theorem of total probability.
(b) The RHS is $E(E(Y|X))$ and so the equality is just the tower property.
(a) Let $U = \min(X,Y).$ Then $U$ can take values $0,...,N.$
$P(U=k) = P(U\geq k)-P(U\geq k+1).$
Now $P(U\geq k) = P(X,Y\geq k) = P(X\geq k)P(Y\geq k) = \left(\frac{N-k+1}{N+1}\right)^2.$
Similarly, $P(U\geq k+1) = \left(\frac{N-k}{N+1}\right)^2.$
So $P(U=k) = \frac{(N-k+1)^2-(N-k)^2}{(N+1)^2} = \frac{2(N-k)+1}{(N+1)^2}.$
(b) Let $T = \max(X,Y).$ Then $T$ can take values $0,...,N.$
$P(T=k) = P(T\leq k)-P(T\leq k-1).$
Now $P(T\leq k) = P(X,Y\leq k) = P(X\leq k)P(Y\leq k) = \left(\frac{k+1}{N+1}\right)^2.$
Similarly, $P(T\leq k-1) = \left(\frac{k}{N+1}\right)^2.$
So $P(T=k) = \frac{(k+1)^2-k^2}{(N+1)^2} = \frac{2k+1}{(N+1)^2}.$
(c) $R=|Y-X|$ can take values $0,1,\ldots,N.$
$P(R=0) = P(X=Y) = \frac{1}{N+1}.$
For $k=1,...,N,$ we have $P(R=k) = P(R=k ~\&~ X < Y) + P(R=k ~\&~ X=Y) + P(R=k ~\&~ X > Y).$
Now $P(R=k ~\&~ X=Y) =0.$
Also $P(R=k ~\&~ X < Y) =P(R=k ~\&~ X > Y).$
For $\{R=k ~\&~ X < Y\}$ to happen we must have $X = 0,...,N-k$ and correspondingly $Y = k,...,N.$
So $P(R=k ~\&~ X < Y) = \frac{N-k+1}{(N+1)^2}.$
Hence $P(R=k) = \frac{2(N-k+1)}{(N+1)^2}.$
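All three PMFs in (a)-(c) can be checked by brute-force enumeration over the $(N+1)^2$ equally likely pairs for a small $N$:

```python
from fractions import Fraction
from itertools import product

N = 5
pairs = list(product(range(N + 1), repeat=2))  # (X, Y) uniform on {0,...,N}^2

def pmf(stat):
    counts = {}
    for x, y in pairs:
        k = stat(x, y)
        counts[k] = counts.get(k, 0) + 1
    return {k: Fraction(c, (N + 1) ** 2) for k, c in counts.items()}

p_min = pmf(min)
p_max = pmf(max)
p_abs = pmf(lambda x, y: abs(x - y))

for k in range(N + 1):
    assert p_min[k] == Fraction(2 * (N - k) + 1, (N + 1) ** 2)   # (a)
    assert p_max[k] == Fraction(2 * k + 1, (N + 1) ** 2)         # (b)
assert p_abs[0] == Fraction(1, N + 1)                            # (c)
for k in range(1, N + 1):
    assert p_abs[k] == Fraction(2 * (N - k + 1), (N + 1) ** 2)
print("PMFs verified for N =", N)
```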
(a) $(X_1,...,X_r)$ can take values $(x_1,...,x_r)$ where each $x_i$ is a nonnegative integer and $\sum_1^r x_i = 2r.$
We consider the random experiment of dropping the balls one by one into the boxes. Each ball has $r$ possible destinations.
So $|\Omega| = r^{2r}.$
Now fix some $(x_1,...,x_r)$ as above. The event $A=\{(X_1,...,X_r) = (x_1,...,x_r)\}$ may be obtained as follows.
Distribute the $2r$ balls so that exactly $x_i$ of them go into box $i;$ the number of ways of choosing which balls go where is the multinomial coefficient.
So $|A| = \frac{(2r)!}{x_1!\times\cdots\times x_r!}.$
Hence
$P\{(X_1,...,X_r) = (x_1,...,x_r)\}= \frac{ |A| }{ |\Omega| }.$
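For a small $r$ the formula can be verified against a direct enumeration of all $r^{2r}$ equally likely drops (shown here for $r=2,$ tracking the count $X_1$ for box 1, which determines $X_2$):

```python
from fractions import Fraction
from itertools import product
from math import factorial

r = 2  # small case: 2r = 4 balls into r = 2 boxes
drops = list(product(range(r), repeat=2 * r))  # destination of each ball

def formula(xs):
    # P{(X_1,...,X_r) = xs} = (2r)! / (x_1! ... x_r! r^(2r))
    denom = r ** (2 * r)
    for x in xs:
        denom *= factorial(x)
    return Fraction(factorial(2 * r), denom)

# Check the formula against a direct count over the r^(2r) outcomes.
for x1 in range(2 * r + 1):
    count = sum(1 for d in drops if d.count(0) == x1)
    assert Fraction(count, r ** (2 * r)) == formula((x1, 2 * r - x1))
print("formula verified for r =", r)
```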
(b) $\frac{(2r)!}{(4r)^r}.$
(a) $1-\left(\frac{5}{6}\right)^6.$
(b) For $n$ rolls, $P(\text{at least one }6)=1-\left(\frac 56\right)^n.$
We need $n$ such that $1-\left(\frac 56\right)^n\geq \frac 12.$
Direct computation shows $n\geq 4.$
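The direct computation, as a short check:

```python
from fractions import Fraction

# Smallest n with P(at least one six in n rolls) >= 1/2.
n = 1
while 1 - Fraction(5, 6) ** n < Fraction(1, 2):
    n += 1
print(n)  # 4
```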
::
EXERCISE 10:
Imagine this set-up: a coin with $P(H)=p$ is repeatedly tossed. Success means $H.$
Let $I_j$ be the indicator variable for whether there is a
record at position $j.$ Then $P(I_j=1)$ may be computed
by total probability:
$$
P(I_j=1) = \sum_{k=j}^n P(X_j=k)P(I_j=1|X_j=k).
$$
Similarly for $P(I_jI_k=1).$
Let the black balls be labelled $b_1,...,b_m.$
Let $X_i=\left\{\begin{array}{ll}1&\text{if no white ball is drawn before }b_i\\
0&\text{otherwise.}\end{array}\right.$
Then $X= 1+\sum_1^m X_i.$
Also, $E(X_i) = \frac{1}{n+1}$. To see this consider the $n$ white balls plus $b_i.$ Out of these $n+1$
balls $b_i$ has the chance $\frac{1}{n+1}$ to come first.
(a) $V(X_i) = \frac{n}{(n+1)^2}.$
Also for $i\neq j$ we have $E(X_iX_j) = \frac{2}{(n+2)(n+1)}$ (because out of the $n$ white balls plus $b_i$
and $b_j$ any of the $\binom{n+2}{2}$ pairs can come first with equal probability).
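Both moments can be verified exactly by enumerating all orderings for small $m,n$ (distinct labels make every permutation equally likely):

```python
from fractions import Fraction
from itertools import permutations

# Small case: m = 2 black balls (b1, b2), n = 3 white balls (w1, w2, w3),
# drawn in a uniformly random order.
balls = ('b1', 'b2', 'w1', 'w2', 'w3')
perms = list(permutations(balls))

def black_first(perm, b):
    # X_i = 1: ball b appears before every white ball.
    return perm.index(b) < min(perm.index(w) for w in ('w1', 'w2', 'w3'))

e_x1 = Fraction(sum(black_first(p, 'b1') for p in perms), len(perms))
e_x1x2 = Fraction(sum(black_first(p, 'b1') and black_first(p, 'b2')
                      for p in perms), len(perms))
n = 3
print(e_x1, e_x1x2)   # 1/4 and 1/10
assert e_x1 == Fraction(1, n + 1)
assert e_x1x2 == Fraction(2, (n + 2) * (n + 1))
```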
(b) Let $Y_i$ be as given in the hint. Let's take an example to
understand how the $Y_i$'s are defined. Suppose that we have $m=7$ black and $n=3$ white balls.
Here is one way they may turn up:
B B W W B B B B B W
Then $Y_1 = 2$ (as there are two B's preceding the first W), $Y_2=0$ (since the second W comes immediately after
the first), and $Y_3 = 5$ (because there are five B's between the second and third W).
We shall argue using bijection that $Y_i$'s are all identically distributed. Let's
try to show that $P(Y_1=0) = P(Y_2=0).$ The outcome shown above is in the event $\{Y_2=0\}.$
Now swap the block of B's preceding the first W with the block of B's between the first and second W's to get:
W B B W B B B B B W
Clearly, this is another possible outcome which is inside $\{Y_1=0\}.$
It is not difficult to see (check!) that this swap is a bijection between the events $\{Y_1=0\}$ and $\{Y_2=0\}.$
If the bijection is denoted by $f,$ then $\forall \omega\in\Omega~~P(\omega) = P(f(\omega))$ (why?)
Hence $P\{Y_1=0\} = P\{Y_2=0\}.$
In general, we see that $Y_i$'s are all identically distributed. Now (b) follows immediately from (a) applied to each
$Y_i$ separately.
Let $T = \lambda X_1+ (1-\lambda) X_2.$
Then $V(T) = \lambda^2 V(X_1) + (1-\lambda)^2 V(X_2),$ since $X_1,X_2$ are independent.
Thus, $V(T) = \lambda^2 \sigma_1^2 + (1-\lambda)^2 \sigma_2^2 = f(\lambda),$ say.
Then $f'(\lambda) = 2 \sigma^2_1 \lambda - 2 \sigma^2_2(1-\lambda).$
Solving $f'(\lambda) = 0$ we get $\lambda = \frac{\sigma^2_2}{\sigma^2_1+\sigma^2_2}.$
This is desirable because we are giving more weight to the $X_i$ that has less variance (i.e., is more stable).
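A quick numeric check that this $\lambda$ really minimizes $f$ (the variances below are illustrative values, not taken from the problem):

```python
# Illustrative check (sigma_1^2 = 4, sigma_2^2 = 1 are made-up values)
# that lambda* = sigma_2^2 / (sigma_1^2 + sigma_2^2) minimizes f.
s1, s2 = 4.0, 1.0
f = lambda lam: lam ** 2 * s1 + (1 - lam) ** 2 * s2
lam_star = s2 / (s1 + s2)            # 0.2 for these values

grid = [k / 1000 for k in range(1001)]
best = min(grid, key=f)
print(lam_star, best)  # both 0.2
assert abs(best - lam_star) < 1e-3
```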
$E(X|Y=y) = \sum_x x P(X=x|Y=y) = \sum_x x P(X=x),$
since $X,Y$ independent.
Hence the result.
::
EXERCISE 20:
Do this for discrete $X, Y$ only. If $X$ can take values $x_1,x_2,x_3,...$ with
positive probabilities, then
you are to prove
$$\forall i~~E(g(X)Y|X=x_i) = g(x_i)E(Y|X=x_i).$$
No, the result may not hold if the $X_i$'s have a dependence structure that is
asymmetric. A counterexample is as follows.
$X_1 = $ outcome of a roll of a fair die. $X_2$ is obtained from $X_1$ by
swapping 1 and 2. $X_3$ is obtained from $X_1$ by swapping 1 and 3. Then $E(X_1|X_1+X_2+X_3=6)=1\neq \frac 63.$
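The counterexample is easy to verify by enumerating the six outcomes:

```python
from fractions import Fraction

def swap(x, a, b):
    # Relabel a die face: exchange the values a and b.
    return b if x == a else a if x == b else x

# X1 is a fair die; X2 swaps faces 1 and 2; X3 swaps faces 1 and 3.
outcomes = [(x1, swap(x1, 1, 2), swap(x1, 1, 3)) for x1 in range(1, 7)]
hits = [x1 for x1, x2, x3 in outcomes if x1 + x2 + x3 == 6]
e_cond = Fraction(sum(hits), len(hits))
print(e_cond)  # 1, not 6/3 = 2
```

Only the outcome $X_1=1$ gives the sum 6, which is why the conditional expectation collapses to 1.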
Let $X_i$ be the indicator for the $i$-th red ball being a win.
There are $\binom{2n}{n}$ sequences of $n$ R's and $n$ B's in all. Let us count
how many of these lead to $\{X_i=1\}.$
Split each such sequence into three pieces: the part before the $i$-th R (the red part), the $i$-th R itself, and the part after (the blue part).
For instance, for $n=4$ and $i=3$ the sequence RRBRBRBB is split as RRB | R | BRBB.
For general $n$ and $i,$ the red part must consist of exactly $i-1$ R's and at
most $i-1$ B's. The blue part consists of exactly $n-i$ R's and the remaining B's.
Let $N_{r,b} = $ number of sequences with exactly $r$ R's and $b$ B's. In other words, $N_{r,b} =\binom{r+b}{r} = \binom{r+b}{b}. $
Then, for any sequence in $\{X_i=1\}$ the red part may be selected in
$$\sum_{j=0}^{i-1} \binom{i+j-1}{j}$$
ways. Here $j$ denotes the number of B's in the red part. Once we also count the matching number
of blue parts for each value of $j$, we get the size of $\{X_i=1\}$ as
$$\sum_{j=0}^{i-1} \binom{i+j-1}{j}\binom{2n-i-j}{n-j}.$$
Now you should be able to complete the rest.
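Before completing it, the displayed count can be sanity-checked by brute force for small $n$ and $i$ (here $\{X_i=1\}$ is interpreted, per the split above, as: at most $i-1$ B's appear before the $i$-th R — an assumption read off from the hint):

```python
from itertools import combinations
from math import comb

n, i = 4, 3
# Brute force: choose the n positions of the R's among 2n slots; X_i = 1
# when at most i-1 B's precede the i-th R.
total = 0
for r_pos in combinations(range(2 * n), n):
    ith_r = r_pos[i - 1]            # position (0-based) of the i-th R
    b_before = ith_r - (i - 1)      # slots before it not holding an R
    if b_before <= i - 1:
        total += 1

# The sum displayed in the text.
formula = sum(comb(i + j - 1, j) * comb(2 * n - i - j, n - j)
              for j in range(i))
assert total == formula
print(total)  # 35 sequences out of C(8,4) = 70
```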
[Thanks to Arnab Nayak for correcting a typo.]