[Home]

Expectation


Expectation

Expectation of a random variable

For many random variables we see a striking example of statistical regularity. As an example, consider this gambling game:
A fair die is rolled. If it shows an odd number then I pay you Rs 20, else you pay me Rs 10.
A typical plot of my running average gain per game against number of games is as follows:
It is produced by the following code.
w = sample(6,1000,rep=T)
profit =c(-20,10,-20,10,-20,10)
X = profit[w]
avgX = cumsum(X)/(1:1000)
#png('image/explotnow.png')
plot(avgX,ty='l',xlab="#games played",ylab="My running avg gain")
#dev.off()
In fact, it is this phenomenon that first let man to study probability. If you run a gambling game a large number of time the running average profit per game becomes more and more stable. Gamblers wanted to guess this stable value beforehand. They argued as follows:
If I play this game a large number of times (say $n$ times), then approximately $\frac n2$ times I should get $10$ and the remaining $\frac n2$ times I should get $-20.$ So approximately my total gain would be $$ \frac n2\times 10 + \frac n2\times (-20). $$ So the average should be approximately this divided by $n,$ i.e., $$ \frac 12\times 10 + \frac 12\times (-20) = -5. $$
Indeed, this simple argument turns out to be remarkably accurate. Gamblers could not understand why it becomes so accurate as $n$ becomes large. But nevertheless they used this formula to find out what they could expect the random variable to do in the long run.

Simple random variables

A random variable is called simple if takes only finitely many values.

Definition: Expectation of simple random variables Let $X$ be a simpe random variable taking only the values values $x_1,x_2,...,x_k$ with probabilities $p_1,p_2,..., p_k$. Then we define the expectation of $X$ as $$ E(X) = \sum_1^k p_i x_i. $$

::

EXERCISE 1: (Easy)A random variable $X$ takes the values $-2, -1, 0, 1 $ and $2$ with probabilities $p,q,1-2p-2q, p$ and $q,$ respectively. Find $E(X).$

::

EXERCISE 2: (Easy)A random variable takes the values $1,2,...,10$ with probabilities $p_1,p_2,...,p_{10},$ respectively, where $\sum_i p_i = 1.$ Prove that $1\leq E(X)\leq 10.$ Also find $p_i$'s if $E(X) = 10.$

Random variables taking countably many values

Definition: If $X$ takes only countably many nonnegative values $x_1, x_2, ...$ with probabilities $p_1,p_2,...$ where $\sum p_i = 1,$ then $E(X)$ is defined as $$E(X) = \sum p_i x_i.$$

For any random variable $X$ we define $X_+ = \max\{X,0\}$ and $X_- = -\min\{X,0\}.$ Notice that both $X_+$ and $X_-$ are nonnegative.

EXERCISE 3: (Easy)Which of the following must be true?

  1. $X = X_++X_-.$
  2. $X = X_+-X_-.$
  3. $X = X_--X_+.$
  4. None of the above.

Definition: If $X$ takes only countably many values $x_1, x_2, ...$ with probabilities $p_1,p_2,...$ where $\sum p_i = 1,$ then $E(X)$ is defined as $$E(X) = \left\{\begin{array}{ll} E(X_+)-E(X_-)&\text{if }E(X_+), E(X_-)<\infty \\ \infty &\text{if }E(X_+)=\infty, E(X_-)<\infty \\ -\infty &\text{if }E(X_+)<\infty, E(X_-)=\infty \\ \end{array}\right..$$

Notice that we leave the case $E(X_+), E(X_-)=\infty $ unmentioned in the definition. This means $E(X)$ is undefined in this case.

EXAMPLE 1: A random variable takes the values $\pm 2^n$ for $n\in{\mathbb N}$ with probabilities $P(X=2^n) = P(X=-2^n) = 2^{-n-1}$ for $n\in{\mathbb N}$. What is $E(X)$?

SOLUTION: Here $X_+$ takes the values 0, 2, $2^2, 2^3,...$ with probabilities $2^{-1}, 2^{-2}, 2^{-3},...$.

So $E(X_+) = 0 + \frac 12+\frac 12+\frac 12+\cdots = \infty$.

Similarly, $E(X_-) = \infty$.

Hence $E(X)$ is undefined. ■

Theorem If $X$ takes only countably many values $x_1, x_2, ...$ with probabilities $p_1,p_2,...$ where $\sum p_i = 1,$ and $\sum |p_i x_i|< \infty,$ then $$E(X) = \sum p_i x_i.$$

It is possible to define expectation of random variables that are more general (taking uncountably many values). We shall give most general definition in the next semester. The following properties actually hold for the genral definition. Unless noted otherwise, we shall only prove them for the case of simple random variables. These proofs are actually the first steps in the general proofs that will come next semester.

A couple of warnings

Many students somehow get the idea that if a (discrete) random variable $X$ takes the values $x_1,x_2,..$. with probabilities $p_1,p_2,...$ where $\sum p_i = 1$, then $E(X) = \sum_i x_i p_i$. But this is wrong! It is wrong even if the sum converges! This formula is true in the following special cases:
  1. If $X$ is simple, i.e., takes only finitely many values.
  2. If $X$ takes only nonnegative (or only nonpositive) values (and here the formula holds even if the sum diverges).
  3. If $\sum_i |x_i| p_i < \infty$.

Here is another point that students sometimes get wrong. When we say $E(X)$ is undefined we mean $E(X)$ is meaningless in that context, and not that $E(X)$ can be anything there.

Properties of expectation

Relation of $E(X)$ with values of $X$

Theorem If $X$ is a degenerate rv (i.e., takes only one value with probability 1), then $E(X)$ equals that value.

Proof:Easy.[QED]

Theorem If $X$ always takes values in $[a,b],$ then $E(X)$ must exist finitely, and lie in $[a,b].$

Proof: Easy. [QED]

The condition "$X$ always lies in $[a,b]$" may be written as $P(X\in[a,b])=1.$

TheoremLet $X$ and $Y$ be random variables taking only finitely many values, and $X\leq Y.$ Then $E(X)\leq E(Y).$

Proof: Here $X\leq Y$ means $\forall \omega\in\Omega~~X(\omega)\leq Y(\omega).$

As already mentioned, we shall restrict the proof to only the case where $X$ and $Y$ are both simple.

Let $X$ take values $x_1,...,x_m,$ and $Y$ take values $y_1,...,y_n.$

Let $p_{ij} = P(X=x_i, Y=y_j).$

Clearly, if $p_{ij}>0,$ then we must have $x_i\leq y_j.$

Now $$\begin{eqnarray*}E(X) & = & \sum_i x_i P(X=x_i) = \sum_i (x_i \sum_j p_{ij}) =\sum_i\sum_j (x_i p_{ij})\\ & \leq & \sum_i\sum_j (y_j p_{ij}) ~~[\because p_{ij}>0\Rightarrow x_i\leq y_j]\\ & = & \sum_j\sum_i (y_j p_{ij})[\because \mbox{addition is associative and commutative}]\\ & = & \sum_j (y_j \sum_i p_{ij}) = \sum_j y_j P(Y=y_j) = E(Y). \end{eqnarray*}$$ [QED]

Theorem Let $a\in{\mathbb R}$ be any number. If $P(X\leq a)=1,$ then $E(X)=a$ if and only if $X$ is degenerate at $a.$

EXERCISE 4: (Easy)Prove this for simple $X$.

However, if $a\in{\mathbb R}$ is replaced by $\infty,$ then the result fails, i.e., It is possible to have a random variable $X$ that is always finite (any real-valued random variable will do, since $\infty\not\in{\mathbb R}$) such that $E(X)=\infty.$ Of course, we cannot get a counterexample using simple random variables. However, such counterexamples exist for random variables taking countably many values, as shown below.

EXAMPLE 2:  It is a standard fact that $\sum\frac{1}{n^2}<\infty.$ Let the sum be $c.$ (The exact value of $c$ which is $\frac{\pi^2}{6},$ is of no importance here).

Then consider a random variable $X$ that takes values in ${\mathbb N}$ and $P(X=n)=\frac{1}{cn^2}.$

Then $E(X) = \frac 1c\sum\frac 1n=\infty.$ ■

By the way, if $X$ can take values $x_1,x_2,...$, there is no guaranty that $E(X)$ will equal any of the $x_i$'s. For example, if $X$ is the outcome of a fair die, then $E(X)=3.5,$ which is not a possible outcome.

Transformation properties

Theorem Let $X$ be a random variable and let $a,b$ be constants. Then $E(a+bX) = a+bE(X).$

Proof: Prove it for simple $X$. [QED]

::

EXERCISE 5: (Easy) If $E(X) = \mu\in{\mathbb R},$ then what is $E(X-\mu)?$

Theorem Let $X,Y$ be two random variables both defined on the same probability space. We assume that both $E(X)$ and $E(Y)$ both exist finitely.

Then $E(X+Y)$ also exists finitely and we have $$ E(X+Y) = E(X)+ E(Y). $$

Next we shall need a new concept, that of a convex function. Graphically, $f(x)$ is a convex function if its graph is like a bowl opening upwards (possibly slanted). Some examples are shown below.
Mathematically we may define a convex function as follows.
While this definition is graphically quite intuitive, you may have seen other definitions of convexity elsewhere. Read here to learn more about equivalences between different definitions of convexity.
Definition: Convex function A function $f:{\mathbb R}\rightarrow{\mathbb R}$ is called convex if $\forall a\in{\mathbb R}$ there is a line $y = \ell_a(x)$ through $(a,f(a))$ that lies on or below the graph of $f(x),$ i.e., $$ \forall x\in{\mathbb R}~~ \ell_a(x) \leq f(x). $$
In the following diagram the blue line is $\ell_a.$ Both the red lines are candidates for $\ell_b.$
Jensen's inequality Let $X$ be a random variable and $f:{\mathbb R}\rightarrow{\mathbb R}$ be any convex function. We assume that $E(X)$ and $E(f(X))$ both exist finitely. Then $f(E(X))\leq E(f(X)).$

Proof: Let $\mu = E(X).$ Consider $\ell_\mu(x)$ as mentioned in the definition of convexity.

Since the graph of $\ell_\mu(x)$ is a straight line passing through $(\mu,f(\mu)),$ hence it must be of the form $$\ell_\mu(x) = f(\mu)+m(x-\mu),~~x\in{\mathbb R}.$$ So $$ E(f(X)) \geq E(\ell_\mu(X)) = E(f(\mu))+mE(X-\mu) = f(\mu)+0 = f(E(X)), $$ as required. [QED]

::

EXERCISE 6: (Medium)Which is larger $(E(X))^2$ or $E(X^2)?$ Assume that both exist finitely.

Expectation of a function

EXAMPLE 3:  Suppose I have a random variable that takes values $-1,0$ and $1$ with probabilities $0.1, 0.5$ and $0.4,$ respectively. What is $E(X^2)?$

SOLUTION: Here $X^2$ is a new random variable. Call it $Y,$ say. Then $Y$ takes values $0$ and $1$ with probabilities $0.5$ each.

So $E(Y) = \frac 12.$ ■

Here is another technique to arrive at the same result. $$ E(X^2) = 0.1\times (-1)^2 + 0.5\times 0^2 + 0.4\times 1^2 = 0.5. $$ This technique is often easier because here we do not need to find the distribution of $Y=X^2$ first. Both these techniques will always give the same answer.

Law of the lazy statistician Let a (discrete) random variable $X$ take values $x_1,x_2,...$ with probabilities $p_1,p_2,...$. Let $h(\cdot)$ be any function defined on the set $\{x_1,x_2,...\}.$ If $\sum |p_i h(x_i)| <\infty,$ then we must have $$ E(h(X)) = \sum p_i h(x_i). $$ Also, if $\sum|p_i h(x_i)|=\infty$ and all but finitely many $h(x_i)$'s are $>0$ (resp, $<0$), then $E(h(X))=\infty $(resp, $-\infty$).

Proof: If $X$ takes only finitely many values, then the result follows from distributivity of multiplication over addition.

If $X $ takes countably infinitely many values, and $h(X)$ is non-negative, then define $$ U_n =\left\{\begin{array}{ll}h(X)&\text{if }X=x_1,...,x_n\\ 0&\text{otherwise.}\end{array}\right. $$ and proceed as for the proof of $E(X)=\sum p_i x_i.$ [QED]

Indicator trick

Often we have to find $E(X)$ where $X$ is the count of something, e.g., number of heads in 100 tosses of coin, or number of times something interesting happens. If you want to find $E(X)$ directly from the definition, then you need to find the distribution of $X$ first, which is often difficult. In such situatons the indicator trick may provide a short cut.

EXAMPLE 4: We have a ring of 20 lamps. A wind blows and a random subset of lamps go out. Find the expected number of singleton lights (i.e., lighted lamps with both neighbours off).

The singletons are shown with arrowheads

SOLUTION: Let $X$ be the number of singletons. Finding the distribution of $X$ is not very difficult, but still we shall demonstrate the use of the indicator trick.

We shall use the arrowheads as our random variables. Let the lamps be numbered from 1 to 20. Define $L_i=1$ if $i$-th lamp is on and is a singleton, and $0$ else. In other words, $L_i=1$ means we have put an arrow head at position $i.$

Each $L_i$ is called an indicator variable.

Clearly $X = L_1+\cdots+L_{20}.$

So $E(X) = E(L_1)+\cdots+E(L_{20}) = 20 E(L_1),$ since by symmetry all the $L_i$'s have the same distribution.

To find $E(L_1)$ we need to find just $P(L_1=1)$, which involves only lamp 1 and its two neighbours. It should be clear that $P(L_1) = \frac{1}{8}.$

Hence $E(X) = \frac{20}{8} = \frac 52.$ ■

Finite existence of $E(X)$

We know from the definition of expectation that it may come in four varieties: it may be finite, or $\infty$ or $-\infty$ or undefined. The finite case is the most useful, and it sometimes helps to know some sufficient conditions for this.

Theorem If $X$ is simple, then $E(X)$ must be finite.

Proof: Goes without saying! [QED]

Non-negative random variables have the advantage that their expectation is always defined (though may be $\infty$). Now, from any random variable $X$ we can easily manufacture a non-negative random variable, viz, $|X|.$ It is good to be able to relate $E(X)$ with $E(|X|).$

Theorem $E(|X|)$ is finite if and only if $E(X)$ is finite.

Proof: We define $X_+, X_-$ as usual.

Then $X = X_+-X_-$ and $|X| = X_++X_-$.

Then finiteness of $E(|X|)$ is equivalent to finiteness of both $E(X_+), E(X_-).$

Again, finiteness of $E(X)$ is equivalent to finiteness of both $E(X_+), E(X_-).$

Hence the result. [QED]

::

EXERCISE 7: (Medium)If $E(|X|)=\infty,$ then what can you say about $E(X)?$

[Hint]

May be $\infty$ or $-\infty$ or undefined.

Theorem Let $X,Y$ be random variables defined on the same probability space. Let $|X|\leq|Y|.$ If $E(Y)$ is finite, then $E(X)$ must also be finite.

Proof: Since $E(Y)$ is finite, hence $E(|Y|)$ is finite. So $E(|X|)\leq E(|Y|)$ is also finite. Hence $E(X)$ is finite. [QED]

Theorem Let $X,Y$ be random variables defined on the same probability space. Let $E(X)$ and $E(Y)$ both be finite. Then $E(\max\{X,Y\})$ must also be finite.

Proof: Step 1: First we shall prove it for nonnegative $X,Y$. Since $E(X)$ and $E(Y)$ are both finite, so must be $E(X+Y)$.

Now, thanks to nonnegativity of $X,Y$ we must have $\max\{X,Y\}\leq X+Y$.

So, by the last theorem, $E(\max\{X,Y\})$ must be finite.

Step 2: Apply step 1 to $|X|$ and $|Y|$ and use the fact that for any $a,b\in{\mathbb R}$ we have $|\max\{a,b\}|\leq \max\{|a|,|b|\}$. [QED]

It is possible to combine the two steps into a single line proof. But stepwise thinking starting with the nonnegative case is a good habit.
Theorem Let $m<n$ be any two positive integers. If $E(X^n)$ exists finitely, then $E(X^m)$ must also exist finitely.

Proof: Use the fact that $|X^m| \leq \max\{1,|X^n|\}.$ Now use the last theorem. [QED]

The following theorem often proves very useful.

Markov inequality Let $X$ be any non-negative random variable. Let $\epsilon>0.$ Then $$E(X)\geq \epsilon P(X\geq\epsilon).$$

Proof: Let $Y =\left\{\begin{array}{ll}\epsilon&\text{if }X\geq \epsilon\\ 0&\text{otherwise.}\end{array}\right.. $ Then clearly, $X\geq Y$. So $E(X)\geq E(Y) = \epsilon P(X\geq \epsilon)$. [QED]

Problems for practice

::

EXERCISE 8: (Easy) What is $E(X)$ if $X$ takes the values $1,2,...,n$ with probability $\frac 1n$ each?

[Hint]

$\frac{n+1}{2}.$

::

EXERCISE 9: (Medium)Show that if $X$ is a random variable taking only non-negative integer values, then $$E(X) = \sum_{k=1}^\infty P(X\geq k).$$ This formula often proves useful for finding expected counts.

[Hint]

Let $p_n = P(X=n)$ for $n=0,1,2,3,...$ Then $$\begin{eqnarray*} P(X\geq 1) & = & p_1 + p_2 + p_3+\cdots\\ P(X\geq 2) & = & \phantom{p_1 +} p_2 + p_3+\cdots\\ P(X\geq 3) & = & \phantom{p_1 + p_2 +} p_3+\cdots\\ \cdots \end{eqnarray*}$$ Now add columnwise. Non-negative series do not change value when you rearrange the terms.

::

EXERCISE 10: (Medium)For a group of $n$ people find the expected number of days of the year which are birthdays of exactly $k$ people. (Assume 365 days and that all arrangements are equally probable.) Also find the expected number of multiple birthdays. How large should $n$ be to make this expectation exceed 1?

[Hint]

Let $X_i = \left\{\begin{array}{ll}1&\text{if }\mbox{exactly $k$ people have birthdays on day} i\\ 0&\text{otherwise.}\end{array}\right..$

Then $X = \sum_1^{365} X_i.$

So $E(X) = \sum_1^{365} E(X_i).$

Expected number of days of the year which are birthdays of exactly $k$ people is $\binom{n}{k}\frac{364^{n-k}}{365^{n-1}}.$

Expected number of multiple birthdays is $365\left\{1-\left(\frac{364}{365}\right)^n - \frac{n\times 364^{n-1}}{365^n}\right\}.$

::

EXERCISE 11: (Medium)A man with $n$ keys wants to open a door (where exactly one key works). He tries the keys independently at random. Find the expected number of trials needed to open the door if keys are tried (a) with replacement (b) without replacement.

[Hint]

(a) $\sum_1^ \infty k\cdot \left(1-\frac 1n\right)^{k-1}\cdot\frac 1n = \cdots = n.$

(b) $\sum_1^n \frac kn = \frac{n+1}{2}.$

::

EXERCISE 12: (Hard)

[Hint]

Here $p$ is what he says, and $p^*$ is what he believes. The meteorologist is not an honest one, and may say something different from what he believes. His only aim is to maximise the expected score.

The expected score is $$p^*(1-(1-p)^2) + (1-p^*)(1-p^2).$$ This is to be maximised wrt $p$ (with $p^*$ fixed).

Differentiate (or think of the graph) to see that the maximising value of $p$ is $p^*.$

::

EXERCISE 13: (Hard)

[Hint]

Company pays the amount $A$ with probability $p.$ It pays $0$ with probability $1-p.$

So its expected payoff is $pA.$

Suppose that it charges $B.$ Then expected profit is $B-pA.$ To keep it 10% of $A$ we need $B = (p+0.1)A.$

::

EXERCISE 14: (Medium)

[Hint]

(a) $E(X)$ would be larger, because when a student is selected at random, he is more likely to come from the larger buses. So $E(X)$ is a weighted average of the bus sizes where the larger buses get more weight.

But $E(Y)$ is the simple average of the bus sizes.

(b) $E(X) = \frac{40^2+33^2+25^2+50^2}{40+33+25+50}.$

$E(Y) = \frac{40+33+25+50}{4}.$

::

EXERCISE 15: (Hard)

We assume that no draw is possible.

[Hint]

By the pigeon hole principle, the winner will be decided by at least 2 and at most 3 games. So the sample space is $\{AA, BB, ABA, BAA, BAB, ABB\}.$ The probabilities are, respectively, $p^2, q^2, p^2q, p^2q, pq^2, pq^2,$ where $q=1-p.$

If $X$ is the random variable denoting the number of games played, then it takes the values, respectively, $2,2,3,3,3,3.$

So $E(X) = 2(p^2 + q^2) + 3(2p^2q + 2pq^2) = 2(1+pq).$

This is maximised when $pq = p(1-p)$ is maximised, which is when $p=\frac 12.$

::

EXERCISE 16: (Easy)

[Hint]

(a) $E\big( (2+4X)^2 \big) = 4E(1+4X+4X^2) = 4(1+4E(X)+4E(X^2)) = \cdots$.

(b)$ E(X^2+(X+1)^2) = E(2X^2+2X+1) = 2E(X^2)+2E(X)+1 = \cdots$.

::

EXERCISE 17: (Medium)

[Hint]

(a) Let $X_i = \left\{\begin{array}{ll}1 &\text{if }i\mbox{-th draw is white}\\ 0&\text{otherwise.}\end{array}\right.$ for $i=1,...,10.$

Then $E(X_i) = P(i$-th draw is white$)=\frac{17}{40}.$

So $E(X) = 10\times \frac{17}{40} = \frac{17}{4}.$

(b) Let $Y_i = \left\{\begin{array}{ll}1&\text{if }i\mbox{-th white ball is selected}\\ 0&\text{otherwise.}\end{array}\right.$ for $i=1,...,17.$

Then $E(Y_i) = P(i$-th white ball is delected$)=\frac 14.$

So $E(X) = 17\times \frac 14 = \frac{17}{4}.$

::

EXERCISE 18: (Medium)

[Hint]

Let $X_i$ be as given in the hint.

Let $X = $ number of intact marriages.

Then $X = \sum_1^{100} X_i$

Now $E(X_i) = \binom{198}{50}/\binom{200}{50}=\frac{150\times149}{200\times199}.$

So $E(X) = \frac{150\times149}{2\times199}.$

::

EXERCISE 19: (Easy)

[Hint]

$E(X) = \frac 52.$

::

EXERCISE 20: (Medium)

Here we assume that $E(X)$ exists finitely. The inequality holds even if $E(X^2)$ is not finite (with the interpretation that $\forall a\in{\mathbb R}~~\infty \geq a$.)

[Hint]

You may either use Jensen's inequality with the convex function $f(x)=x^2$ or the fact that $V(X)\geq 0.$

Equality if and only if $V(X)= 0$, i.e., if $X$ is a degenerate random variable.

::

EXERCISE 21: (Hard)

You may use some approximations in parts (c) and (d) of this problem. For instamce there are $\frac nk$ groups of $k$ patients each, even if $\frac nk$ is not an integer. You may also differentiate w.r.t. $k.$

[Hint]

(a) $1-(1-p)^k.$

(b) For a group of size $k$ the random variable $X$ takes the value $k+1$ with probability $1-(1-p)^k$ and the value $1$ with probability $(1-p)^k.$

So $E(X) = (k+1)(1-(1-p)^k)+(1-p)^k = k+1-k(1-p)^k.$

If there are $N$ people in all, where $N = qk+r$ with $r\in\{0,...,k-1\}$, then this applies to each of the $q$ groups and also the reaminder group of size $r.$

So the required expectation is $$q(k+1-k(1-p)^k)+r+1-r(1-p)^r.$$ If $k$ divides $N$, then this is $$\frac Nk(k+1-k(1-p)^k) = N+\frac Nk-N(1-p)^k = N\left(1+\frac 1k-(1-p)^k\right).$$

(c) Enough to minimise $1+\frac 1k-(1-p)^k$ wrt $k$ for given $p.$

Treating $k $ as a continuous variable, the derivative is $$-\frac{1}{k^2}-(1-p)^k\log(1-p).$$

::

EXERCISE 22: (Medium)

[Hint]

Here $P(X=k) = \binom{k-1}{r-1}p^rq^{k-r}$ for $k=r,r+1,...$ where $q = 1-p.$

So $$E\left(\frac rX\right) = \sum_{k=r}^\infty \binom{k-1}{r-1}p^rq^{k-r}\frac rk.$$ This may be handled by repeated term by term integration and differentiation of the power series $$1+q+q^2+\cdots = \frac{1}{1-q}$$ for $|q|<1.$

For example, if $r=1$, then the sum is $$p\sum_{k=1}^\infty \frac{q^{k-1}}{k} = \frac pq\sum_1^\infty \frac{q^k}{k} = \frac pq\sum_1^\infty \int_0^q t^{k-1}\, dt = \frac pq\int_0^q \sum_1^\infty t^{k-1}\, dt =\frac pq\int_0^q \frac{dt}{1-t} = \cdots. $$

::

EXERCISE 23: (Medium)

[Hint]

Let $X = $ the number of trials needed to get the first 6.

Then $P(X=k) = \left(\frac 56\right)^{k-1}\frac 16$ for $k=1,2,3,....$

So $E(X) = \sum_{k=1}^\infty k \left(\frac 56\right)^{k-1}\frac 16.$

Now, we know that $\frac{1}{1-x} = 1+x+x^2+x^3+\cdots$ if $|x|<1.$ This may be differentiated term by term (needs a justification that you should learn in your real analysis class) to give $$\frac{1}{(1-x)^2} = 1+2x + 3x^2+\cdots.$$ Put $x=\frac 56$ to find the required expectation.

::

EXERCISE 24: (Hard)

[Hint]

Assuming the dice to be fair, the answer does not depend on the number the gambler bets on.

Let $X$ be the loss for unit stake on 1. Then $$X = \left\{\begin{array}{ll} 1&\text{if }\mbox{no die shows 1}\\ -1&\text{if }\mbox{exactly 1 die shows 1}\\ -2&\text{if }\mbox{exactly 2 dice show 1}\\ -3&\text{if }\mbox{all 3 dice show 1}\\ \end{array}\right..$$ So $P(X=1) = \left(\frac 56\right)^3$, $P(X=-1) = 3\left(\frac 16\right)\left(\frac 56\right)^2$, $P(X=-2) = 3\left(\frac 16\right)^2\left(\frac 56\right)$ and $P(X=-3) = \left(\frac 16\right)^3.$

Hence $$E(X) = \left(\frac 16\right)^3(5^3-3\times 5^2-6\times5-3).$$

::

EXERCISE 25: (Hard)

[Hint]

Let $T_1 = 1.$

Let $T_i = $ waiting time for the $i$-th new coupon after the $(i-1)$-th coupon has been encountered, for $i=2,3,4,5.$

Consider the following example to understand the definition of $T_i$'s. Suppose that the coupons arive in the order:
3 3 4 3 5 4 3 4 2 3 4 5 1.
The first occurences of each type of coupon have been shown in red. They are at positions $$S_1 = 1, S_2 = 3, S_3 = 5, S_4=9\mbox{ and } S_5=13.$$ We are defining $T_i = S_i-S_{i-1}$ for $i=1,...,5$ where $S_0=0.$

Then the $T_i$'s are independent random variables.

$T_1$ is degenerate at 1, and for $i=2,...,5$ we have $$P(T_i = k) = q_i^{k-1}p_i$$ for $k\in{\mathbb N}$ where $q_i = \frac{i-1}{5}$ and $p_i = 1-q_i.$

We can easily find $E(T_i)$'s.

The answer to the problem is $E(T_1+\cdots+T_5) = 1+E(T_2)+\cdots+E(T_5).$

[Thanks to Nipam for correcting a mistake here.]

::

EXERCISE 26: (Hard)

[Hint]

The same person may be part of two marriageable couples.
The guys are all distinct, and so are the girls (though it is not clear from my wonderful artwork!).

The diagram shows 8 runs, i.e., stretches of same gender. A single girl or a single guy consitute the shortest possible run. Notice that the number of marriageable couples is one less than the number of runs.

Thus, the number of arrangements with $k$ marriageable couples is the same as the number of arrangements with $k+1$ runs. Here $k$ can take any value between $1$ and 14.

As an example let us find $P(k=3).$

The total number of arrangements is of course $15!.$

We need $3+1=4$ runs: either male-female-male-female or female-male-female-male.

So $$P(k=3) = \frac{8!\times7!\times7\times6\times2}{15!}.$$ Find these for all possible values $k$, and then compute expectation.

Or...use the indicator trick!!!

::

EXERCISE 27: (Hard)

[Hint]

Let there be $n$ workers. Let $X$ be the number of working days.

Let $X_i=\left\{\begin{array}{ll}1&\text{if }i\mbox{-th day is a working day}\\ 0&\text{otherwise.}\end{array}\right..$

Then $X = \sum_1^{365} X_i.$

Now $E(X_i) = P(i$-th day is a working day$)= \left(\frac{364}{365}\right)^n.$

So $E(X) = 365\times \left(\frac{364}{365}\right)^n.$

Hence expected number of man-days is $365n\times \left(\frac{364}{365}\right)^n=f(n)$, say.

We want to maximise this wrt $n.$

Now $$\frac{f(n+1)}{f(n)} = \frac{n+1}{n}\times \frac{364}{365}.$$ This is $>/=/< 1$ according as $364 >/+/< n.$

So the function is maximised at $n=364$ and $365.$

::

EXERCISE 28: (Hard)

[Hint]

Let $X=$ number of cards required to be turned.

Then $P(X=k)=\frac{4\times {}^{48}P_{k-1}(52-k)!}{52!}.$

So you can now find $E(X) = \sum_{k=1}^{49} k P(X=k)$.

But here is a smarter solution (taken from Mosteller's book and also given by a student via Whatsapp):