Joint distribution

Definition: Jointly distributed random variables When we say that some random variables are jointly distributed, we mean that they are all defined on the same probability space.
If we want to combine values of different random variables (e.g., by addition, subtraction, or comparison like $\leq$), then they must be jointly distributed. If we have $n$ jointly distributed real-valued random variables, then we may consider them as components of an ${\mathbb R}^n$-valued random variable. Sometimes we call such a random variable a multivariate random variable, as opposed to a univariate one.

We shall now extend the various familiar concepts about ${\mathbb R}$-valued random variables to ${\mathbb R}^n$-valued random variables.

Definition: Joint CDF Let $X = (X_1,...,X_n)$ be an ${\mathbb R}^n$-valued random variable. Its joint CDF is defined as $F:{\mathbb R}^n\rightarrow{\mathbb R}$ where for all $(x_1,...,x_n)\in{\mathbb R}^n$ $$ F(x_1,...,x_n) = P(X_1\leq x_1~\&~\cdots~\&~X_n\leq x_n). $$

The extension of the concept of discreteness is straightforward.

Definition: Discrete An ${\mathbb R}^n$-valued random variable $X$ is called discrete if there is a countable set $A\subseteq{\mathbb R}^n$ such that $P(X\in A)=1.$
The definition of a continuous random variable is slightly more confusing. For ${\mathbb R}$-valued random variables we had two equivalent conditions: $P(X=x)=0$ for every $x,$ and continuity of the CDF. For an ${\mathbb R}^n$-valued random variable, these two conditions are not equivalent (the latter is stronger). We use the stronger condition as the definition of continuity of an ${\mathbb R}^n$-valued random variable.

Caution: Most books take a much stronger definition of continuity for joint distribution. More precisely, that definition should be called absolute continuity, which we shall learn later.
Definition: Continuous An ${\mathbb R}^n$-valued random variable $X$ is called continuous if its joint CDF is continuous.

The following example shows that the first condition is indeed weaker than the second.

EXAMPLE 1:  Consider any continuous function $F$ satisfying the four conditions of being a CDF, with $F\left(\frac 12\right)=\frac 12$ (the Uniform$(0,1)$ CDF is one such). By the fundamental theorem there is a random variable $X$ with this CDF, and $X$ is a continuous random variable.

Define an ${\mathbb R}^2$-valued random variable $Y=(X,1).$ Show that for any $(a,b)\in{\mathbb R}^2$ we have $P(Y=(a,b))=0.$ Also show that the CDF of $Y$ is not continuous.

SOLUTION: $P(Y=(a,b))= P(X=a~\&~1=b)\leq P(X=a)=0,$ since $X$ is a continuous random variable.

Also, the joint CDF of $Y$ is $$ G(a,b) = P(X\leq a~\&~1\leq b) = \left\{\begin{array}{ll}0&\text{if }b < 1\\F(a)&\text{if }b\geq 1.\\\end{array}\right. $$ If we take $(a_n,b_n) =\left( \frac 12, 1-\frac 1n\right),$ then $(a_n,b_n)\rightarrow \left(\frac 12,1\right).$

Now $G(a_n,b_n)\equiv 0,$ and so $G(a_n,b_n)\rightarrow 0.$

But $G\left(\frac 12,1\right) = F\left(\frac 12\right) = \frac 12\neq 0.$ ■

Definition: Joint PMF Let $X$ be an ${\mathbb R}^n$-valued discrete random variable. Then its joint PMF is the function $p:{\mathbb R}^n\rightarrow{\mathbb R}$ defined as $$ p(x_1,...,x_n)= P(X_1=x_1~\&~\cdots~\&~X_n=x_n). $$

Marginal distributions

If you are given two jointly distributed random variables $X,Y$ and you know their joint distribution (i.e., for any $A\subseteq{\mathbb R}^2$ you know $P((X,Y)\in A)$), then you can work out the probability distributions of $X$ and $Y$ separately: for any given $B\subseteq{\mathbb R}$ you can find $P(X\in B)$ and $P(Y\in B)$ as follows:
$P(X\in B) = P(X\in B~\&~ Y\in{\mathbb R}) = P((X,Y)\in A),$ where $A = B\times{\mathbb R}.$ Similarly, for $Y.$
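The same marginalisation works for PMFs: sum the joint PMF over the coordinate you want to eliminate. Here is a minimal sketch with a hypothetical joint PMF (the numbers are made up for illustration), using exact fractions:

```python
from fractions import Fraction as F

# A hypothetical joint PMF of (X, Y), stored as a dict (x, y) -> P(X=x & Y=y).
joint = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 6), (1, 1): F(1, 3)}

def marginal(joint, axis):
    """Marginal PMF: sum the joint PMF over the other coordinate."""
    out = {}
    for xy, p in joint.items():
        out[xy[axis]] = out.get(xy[axis], 0) + p
    return out

p_X = marginal(joint, 0)  # P(X = x) = sum over y of P(X=x & Y=y)
p_Y = marginal(joint, 1)  # P(Y = y) = sum over x of P(X=x & Y=y)
```

Note that each marginal automatically sums to $1,$ since the joint PMF does.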

Definition: Marginal distribution Let $X=(X_1,...,X_n)$ be an ${\mathbb R}^n$-valued random variable. For any $\{i_1,...,i_k\}\subseteq\{1,2,...,n\}$ the joint distribution of $(X_{i_1},...,X_{i_k})$ is called a $k$-dimensional marginal for the joint distribution of $X.$

Theorem Let $(X,Y)$ be an ${\mathbb R}^2$-valued random variable with joint CDF $F(x,y).$ Then the marginal CDF of $X$ is $$ F_X(x) = P(X\leq x) = \lim_{y\rightarrow \infty} F(x,y) $$ and the marginal CDF of $Y$ is $$ F_Y(y) = P(Y\leq y) = \lim_{x\rightarrow \infty} F(x,y). $$

Expectation

The definition of expectation is a straightforward extension of the univariate case.

Definition: Expectation Let $X$ be an ${\mathbb R}^n$-valued discrete random variable with PMF $p(x).$ Let $h:{\mathbb R}^n\rightarrow {\mathbb R}$ be any function. Then $E(h(X))$ is defined as $$ E(h(X)) = \sum_x h(x)p(x), $$ provided the sum converges absolutely.

If $X$ is an ${\mathbb R}^n$-valued random variable, and $h:{\mathbb R}^n\rightarrow {\mathbb R}^m$ is any function, then $E(h(X))$ is defined component by component, and is said to exist finitely iff all the component expectations exist finitely.

Theorem If $X, Y$ are jointly distributed real-valued random variables, each with finite expectation, then $X+Y$ also has finite expectation, and $$ E(X+Y) = E(X)+E(Y). $$

Proof: In this course we shall prove this only when $X,Y$ are both discrete random variables.

First, notice that $X+Y$ is again discrete.
Because: If $X$ takes values in the countable set $\{x_1,x_2,...\}$ and $Y$ takes values in the countable set $\{y_1,y_2,...\},$ then each possible value of $X+Y$ must be of the form $x_i+y_j.$ There are only countably many such values.
Let $p_{ij} = P(X=x_i~\&~ Y=y_j).$

Then $P(X=x_i) = \sum_j p_{ij}$ and $P(Y=y_j) = \sum_i p_{ij}.$

So $E(X) = \sum_i x_i P(X=x_i) = \sum_i x_i \sum_j p_{ij},$ and $E(Y) = \sum_j y_j P(Y=y_j) = \sum_j y_j \sum_i p_{ij} .$

By the given condition both these series converge absolutely, and may be grouped and rearranged in any way without changing the sum.

So $\sum_i\sum_j |x_i p_{ij}|< \infty,$ and $\sum_j\sum_i |y_j p_{ij}|< \infty.$

Now $|x_i+y_j|\leq |x_i|+|y_j|$ by triangle inequality.

Hence $\sum_{i,j} |(x_i+y_j)p_{ij}| <\infty$ and so $E(X+Y)$ exists finitely. Also $$ E(X+Y) = \sum_{i,j} (x_i+y_j)p_{ij} = \sum_i\sum_j x_ip_{ij} + \sum_j\sum_i y_jp_{ij} = E(X)+E(Y), $$ as required. [QED]
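The point of the theorem is that no independence is needed. As a quick sanity check, here is a sketch with a hypothetical joint PMF in which $X$ and $Y$ are clearly dependent:

```python
from fractions import Fraction as F

# A hypothetical joint PMF with dependence between X and Y (made up numbers).
joint = {(1, 1): F(1, 2), (1, 3): F(1, 6), (2, 3): F(1, 3)}

E_X = sum(x * p for (x, y), p in joint.items())
E_Y = sum(y * p for (x, y), p in joint.items())
E_sum = sum((x + y) * p for (x, y), p in joint.items())

# Linearity of expectation holds with no independence assumption.
assert E_sum == E_X + E_Y
```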

This result leads to a simple trick that we discuss next.

Indicator trick

Suppose that you are to find the expected number of something. For example, $n$ letters are randomly put into $n$ addressed envelopes, and you are to find $E(X),$ where $X$ is the number of correctly placed letters. How would you count $X$ in any given situation? You can find $X$ by first putting a check mark for each correctly placed letter and then counting the total number of check marks.

Mathematically, each check mark is an indicator. For example, the indicator for the $i$-th letter is $$ I_i = \left\{\begin{array}{ll}1&\text{if }i\mbox{-th letter is placed correctly}\\0&\text{otherwise.}\end{array}\right. $$ Counting the number of check marks amounts to summing the $I_i$'s. Thus, $X = \sum I_i.$

Notice that each $I_i$ is a random variable, and $E(X) = \sum E(I_i).$

Since each $I_i$ takes only the values $1$ and $0,$ we have $E(I_i) = P(I_i=1).$

Now $I_i=1$ means the $i$-th letter has been placed correctly. This has probability $\frac{(n-1)!}{n!} = \frac 1n.$

So $E(X) = n\times \frac 1n = 1.$

It's a bit surprising that $E(X)$ does not depend on $n.$
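For a small $n$ this can be verified exhaustively: average the number of correctly placed letters over all $n!$ equally likely arrangements. A minimal sketch:

```python
from itertools import permutations
from math import factorial

# Exhaustive check of the letters-in-envelopes answer for a small n:
# count fixed points (correctly placed letters) in every arrangement.
n = 6
total_matches = sum(sum(perm[i] == i for i in range(n))
                    for perm in permutations(range(n)))

# E(X) = (total over all arrangements) / n! should be exactly 1, whatever n is.
E_X = total_matches / factorial(n)
```

Changing $n$ does not change `E_X`, in line with the computation above.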

Independent random variables

An important special case of jointly distributed random variables is that of independent random variables. To state the definition we shall introduce a new terminology: if $X:\Omega\rightarrow S$ is a random variable, then by "an event in terms of $X$" we shall mean $\{w\in\Omega~:~ X(w)\in A\}$ for some $A\subseteq S.$ Similarly, if $X:\Omega\rightarrow S$ and $Y:\Omega\rightarrow T$ are jointly distributed random variables, then "an event in terms of $X,Y$" means $\{w\in\Omega~:~ (X(w),Y(w))\in A\},$ where $A\subseteq S\times T.$

Definition: Independent random variables Let $X_1,...,X_n$ be jointly distributed random variables. We say that they are independent if for all disjoint subsets $A,B\subseteq\{1,...,n\}$ any event in terms of $\{X_i~:~i\in A\}$ is independent of any event in terms of $\{X_i~:~i\in B\}.$

EXAMPLE 2:  If $X,Y,Z$ are independent random variables, then $$ P(X^2+Y^2 \leq 4~\&~ Z\neq 5) = P(X^2+Y^2 \leq 4)P(Z\neq 5). $$ ■

Theorem If $X_1,...,X_n$ are independent random variables, then any function of some of the $X$'s is independent of any function of the remaining $X$'s.

Proof: Split $\{1,...,n\}$ into two disjoint subsets $\{i_1,...,i_k\}$ and $\{j_1,...,j_{n-k}\}.$

Let $Y = f(X_{i_1},...,X_{i_k})$ and $Z = g(X_{j_1},...,X_{j_{n-k}}),$ where $f,g$ are any two functions.

Take any two sets $A,B.$ Then $$P(Y\in A~\&~Z\in B) = P(f(X_{i_1},...,X_{i_k})\in A~\&~g(X_{j_1},...,X_{j_{n-k}})\in B) = P(f(X_{i_1},...,X_{i_k})\in A)\,P(g(X_{j_1},...,X_{j_{n-k}})\in B) = P(Y\in A)P(Z\in B). $$ [QED]

Theorem Let $X,Y$ be jointly distributed discrete random variables, with PMFs $p(x)$ and $q(y).$ If they are independent, then their joint PMF is $h(x,y) = p(x)q(y).$

Proof: Immediate from the definition of independence. [QED]

Theorem Let $X,Y$ be jointly distributed random variables, with CDFs $F(x)$ and $G(y).$ If they are independent, then their joint CDF is $H(x,y) = F(x)G(y).$

Proof: Immediate from the definition of independence. [QED]

Theorem If $X,Y$ are independent random variables with finite expectations, then $E(XY) = E(X)E(Y).$

Proof: We shall prove this for the case where $X,Y$ are both discrete (hence so is $(X,Y)$).

Let $p(x,y), p_X(x)$ and $p_Y(y)$ be the joint and marginal PMFs, respectively.

Then $$ E(XY) = \sum_{x,y} xy p(x,y) = \sum_{x,y} xy p_X(x)p_Y(y) = \sum_x x p_X(x)\times \sum _y yp_Y(y) = E(X)E(Y). $$ The grouping and rearranging were justified since the series were absolutely convergent. [QED]
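The factorisation step can be checked exactly on any pair of marginal PMFs. A minimal sketch with hypothetical marginals (the values are made up):

```python
from fractions import Fraction as F

# Marginal PMFs of two hypothetical independent discrete random variables.
p_X = {-1: F(1, 4), 0: F(1, 4), 2: F(1, 2)}
p_Y = {1: F(1, 3), 4: F(2, 3)}

# Under independence the joint PMF factorises: p(x, y) = p_X(x) p_Y(y).
E_XY = sum(x * y * px * py for x, px in p_X.items() for y, py in p_Y.items())
E_X = sum(x * px for x, px in p_X.items())
E_Y = sum(y * py for y, py in p_Y.items())

assert E_XY == E_X * E_Y
```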

Covariance

Definition: Covariance If $X,Y$ are jointly distributed random variables, then their covariance is defined as $$ cov(X,Y) = E[(X-E(X))(Y-E(Y))]. $$

Theorem $cov(X,Y) = E(XY)-E(X)E(Y).$

Proof: By direct algebraic expansion. [QED]

Theorem If $X,Y$ are independent and $E(X^2), E(Y^2) < \infty,$ then $cov(X,Y)=0.$ The converse is not true.

Proof: The first part follows immediately from the fact that $E(XY)=E(X)E(Y).$

A counter example for the second part is as follows.

$X$ takes values $-1,0,1$ with equal probabilities. $Y = |X|.$ Direct computation shows $E(X)=E(XY)=0$ and so $cov(X,Y)=0.$

But $P(X=0~\&~Y=1) = 0 \neq P(X=0)P(Y=1).$ [QED]
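The counterexample is small enough to verify by exact computation:

```python
from fractions import Fraction as F

# The counterexample: X takes -1, 0, 1 with probability 1/3 each, Y = |X|.
pmf = {(-1, 1): F(1, 3), (0, 0): F(1, 3), (1, 1): F(1, 3)}

E_X = sum(x * p for (x, y), p in pmf.items())
E_Y = sum(y * p for (x, y), p in pmf.items())
E_XY = sum(x * y * p for (x, y), p in pmf.items())
cov = E_XY - E_X * E_Y           # = 0: X and Y are uncorrelated ...

P_X0_Y1 = sum(p for (x, y), p in pmf.items() if x == 0 and y == 1)
P_X0 = sum(p for (x, y), p in pmf.items() if x == 0)
P_Y1 = sum(p for (x, y), p in pmf.items() if y == 1)
# ... but P(X=0 & Y=1) = 0 differs from P(X=0) P(Y=1), so not independent.
```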

The $cov(\cdot,\cdot)$ function behaves much like ordinary multiplication. The following theorems show this.

Theorem $cov(X,Y)=cov(Y,X).$

Theorem $cov(\sum a_i X_i, \sum b_j Y_j) = \sum_{i,j} a_ib_jcov(X_i,Y_j).$

Also we have
Theorem $cov(X,X) = V(X).$

Theorem $cov(aX+b,cY+d) = ac cov(X,Y).$

EXAMPLE 3:  The analog of $(a+b)^2 = a^2+2ab+b^2$ here is $V(X+Y) = V(X)+2 cov(X,Y) +V(Y).$ This also shows that if $X,Y$ are independent, then $V(X+Y) = V(X)+V(Y).$ ■

::

EXERCISE 1:  If $X$ and $Y$ have finite first moments, and at least one of them is a degenerate random variable, then show that $cov(X,Y)=0.$

Hint:

If $X$ is degenerate, say $P(X=c)=1,$ then $cov(X,Y) = E(XY)-E(X)E(Y) = E(cY)-cE(Y) = 0.$

::

EXERCISE 2: Let $X_1,X_2,...,X_n$ be identically distributed independent random variables with $V(X_1) = \sigma^2 < \infty.$ Then what is $V(\overline X_n)?$ Here $\overline X_n = \frac 1n\sum_1^n X_i.$

Hint:

$V(\overline X_n) = \frac{\sigma^2}{n}.$
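This follows from the bilinearity of covariance and independence: $$ V(\overline X_n) = \frac{1}{n^2}\,cov\Big(\sum_{i=1}^n X_i,\ \sum_{j=1}^n X_j\Big) = \frac{1}{n^2}\sum_{i,j} cov(X_i,X_j) = \frac{1}{n^2}\sum_{i=1}^n V(X_i) = \frac{\sigma^2}{n}, $$ since $cov(X_i,X_j)=0$ for $i\neq j$ by independence.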

At last we shall be able to prove our first theorem about statistical regularity. This is essentially what we had started our class with.

Weak Law of Large Numbers (WLLN) Let $X_1,X_2,...$ be a sequence of independent and identically distributed (IID) random variables (defined on the same probability space) with $E(X_1)=\mu$ and $V(X_1)=\sigma^2<\infty.$ Let, for $n\in{\mathbb N}$, $$ \overline X_n = \frac 1n\sum_{i=1}^n X_i. $$ Then $$ \forall \epsilon > 0~~ P(|\overline X_n-\mu|> \epsilon) \rightarrow 0\mbox{ as } n\rightarrow \infty. $$

Proof: Use the exercise above and Chebyshev inequality. [QED]
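A small Monte Carlo sketch of the statement, using fair coin flips ($\mu=0.5$); the sample sizes and tolerances here are arbitrary choices for illustration:

```python
import random

# WLLN illustration: fraction of runs with |Xbar_n - mu| > eps should be small
# for large n. Chebyshev bounds it by sigma^2/(n eps^2) = 0.25/(2000*0.05^2) = 0.05;
# the true probability is far smaller.
random.seed(1)
n, trials, eps = 2000, 400, 0.05

bad = 0
for _ in range(trials):
    xbar = sum(random.random() < 0.5 for _ in range(n)) / n
    if abs(xbar - 0.5) > eps:
        bad += 1

freq = bad / trials  # typically 0 at this n
```

Rerunning with larger `n` drives `freq` toward $0,$ which is exactly what the theorem asserts.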

Cauchy-Schwarz inequality $cov(X,Y)^2 \leq V(X)V(Y).$ Equality holds iff $\exists a,b,c\in{\mathbb R}$ with $(a,b)\neq(0,0)$ such that $P(aX+bY=c)=1.$

Proof: The result is obvious if $X$ is degenerate. So let's consider the case where $X$ is not degenerate. Then $V(X)>0.$

Define $Z = Y-\underbrace{\frac{cov(X,Y)}{V(X)}}_\beta X.$

We know that $V(Z)\geq 0.$

Now, $$ V(Z) = V(Y) + V(\beta X) - 2cov(Y,\beta X) = V(Y) + \beta^2 V(X) - 2 \beta cov(X,Y). $$ Since $\beta = \frac{cov(X,Y)}{V(X)},$ this reduces to $$ V(Y) - \frac{cov(X,Y)^2}{V(X)}. $$ Since this is $\geq0,$ the inequality follows immediately.

Also equality holds iff $V(Z)=0$, i.e., $Z$ is degenerate.

So we have $P(Y-\beta X=k)=1$ for some $k\in{\mathbb R},$ i.e., $V(X)\,Y - cov(X,Y)\,X = kV(X)$ with probability 1, a linear relation between $X$ and $Y.$

This completes the proof. [QED]

Definition: Correlation If $X,Y$ are jointly distributed random variables with $V(X), V(Y)>0,$ then their correlation is defined as $$ \rho(X,Y)= \frac{ cov(X,Y) }{ \sqrt{V(X)V(Y)} }. $$
By the Cauchy-Schwarz inequality, $\rho(X,Y) \in [-1,1].$ Also, $\rho(X,Y)=-1$ or $\rho(X,Y)=1$ if and only if $X,Y$ are linearly related with probability 1, i.e., $\exists a,b,c\in{\mathbb R}$ with $(a,b)\neq(0,0)$ such that $P(aX+bY=c)=1.$

Problems for practice

::

EXERCISE 3: 

Hint:

(a) Let $X$ be the number of matching pairs.

Let $X_i = \left\{\begin{array}{ll}1&\text{if }i\mbox{-th pair match}\\ 0&\text{otherwise.}\end{array}\right.$

Then $X = \sum_1^{52} X_i.$

Now, $E(X_i) = P(i$-th pair match$)=\frac{1}{52}.$

So $E(X) = 1.$

(b) 1.

::

EXERCISE 4: 

Here "discrete density" means "PMF".

Hint:

We need to check that $\forall x\in{\mathbb R}~~f(x)\geq 0$ and $\sum_{x=1}^N f(x) = 1.$

Both are immediate.

The mean is $E(X)$ where $X$ has this PMF.

$E(X) = \sum_{x=1}^N x f(x) = \frac{2}{N(N+1)}\sum_{x=1}^N x^2 = \frac{2}{N(N+1)}\times\frac{N(N+1)(2N+1)}{6} = \frac{2N+1}{3}.$
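The two checks and the mean can be confirmed exactly for several values of $N$:

```python
from fractions import Fraction as F

# Check the hint: f(x) = 2x/(N(N+1)) on {1,...,N} is a PMF
# with mean (2N+1)/3, using exact arithmetic.
for N in (1, 2, 5, 10, 50):
    f = {x: F(2 * x, N * (N + 1)) for x in range(1, N + 1)}
    assert sum(f.values()) == 1                                  # total mass
    assert sum(x * p for x, p in f.items()) == F(2 * N + 1, 3)   # mean
```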

::

EXERCISE 5: 

Hint:

Since $P(|X-Y|\leq M)=1$, hence $E|X-Y| \leq E(M) = M.$

Also we know that $X = Y + (X-Y)$ and so, by triangle inequality, $|X| \leq |Y| + |X-Y|.$

Now $E|X|$ always exists (may be $\infty$) and $E|X|\leq E|Y| + E|X-Y| <\infty,$ since $E|Y|<\infty$ and $E|X-Y|\leq M.$

Hence $E(X)$ exists finitely.

Also $|E(X)-E(Y)| = |E(X-Y)| \leq E|X-Y|$ by Jensen's inequality, since $|x|$ is a convex function.

Hence $|E(X)-E(Y)| \leq M,$ as required.

::

EXERCISE 6: 

Hint:

$E(X) = \frac N2$ and $V(X) = \frac{N(2N+1)}{6}-\frac{N^2}{4} = \frac{N^2+2N}{12}. $

::

EXERCISE 7: 

Here "density" means "PMF".

Hint:

We know from analysis that $\sum_1^\infty x^{-(r+2)} <\infty$ since $r>0.$

Let $c = \frac{1}{\sum_1^\infty x^{-(r+2)}}.$

Then $p(x) = \left\{\begin{array}{ll}c x^{-(r+2)}&\text{if }x\in{\mathbb N}\\ 0&\text{otherwise.}\end{array}\right. $ is a PMF with the required property.

::

EXERCISE 8: 

Hint:

$V(X^2Y) =E(X^4Y^2)-E^2(X^2Y) = E(X^4)E(Y^2)-E^2(X^2)E^2(Y),$ since $X,Y$ are independent (and so any function of $X$ is independent of any function of $Y$).

Now $E(X^4)E(Y^2)-E^2(X^2)E^2(Y) = E(X^4)E(Y^2) -0 = 2\times1 = 2.$

::

EXERCISE 9: 

Hint:

$E(2X+3Y) = 2E(X)+3E(Y)$ and $V(2X+3Y) = 4V(X)+9V(Y).$

::

EXERCISE 10: 

Hint:

(a) Use the fact that $\sum_{k=1}^n (X_k-\overline X) = 0.$

(b) $E\left(\sum (X_k-\overline X)^2\right) = E\left(\sum (X_k-\mu)^2\right) - n E(\overline X-\mu)^2. $

Now $E(X_k)=\mu$ and $E(\overline X)=\mu.$

So $E\left(\sum (X_k-\mu)^2\right) = \sum E(X_k-\mu)^2 = n\sigma^2$ and $E(\overline X-\mu)^2 = V(\overline X) = \frac{\sigma^2}{n}. $

Hence $E\left(\sum (X_k-\overline X)^2\right) = n\sigma^2- n\times \frac{\sigma^2}{n} = (n-1)\sigma^2,$ as required.

::

EXERCISE 11: 

Hint:

(a) $E(X_i) = P(i$-th box empty$) = \left(\frac{r-1}{r}\right)^n.$

(b) Let $i\neq j.$ Then $E(X_iX_j) = P(i$-th and $j$-th boxes empty$) = \left(\frac{r-2}{r}\right)^n.$

(c) $E(S) = \frac{(r-1)^n}{r^{n-1}}.$

(d) $V(S) = E(S^2)-E^2(S).$

Now $E(S^2) = E\left(\sum X_i\right)^2 = \sum E(X_i^2) + \sum_{i\neq j} E(X_iX_j) =\sum E(X_i) + \sum_{i\neq j} E(X_iX_j) = \frac{(r-1)^n}{r^{n-1}} + r(r-1)\times\frac{(r-2)^n}{r^n}. $

Now simplify.
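For small $r,n$ the answer to (c) can be verified by brute force over all $r^n$ equally likely placements (assuming, as the hint suggests, $n$ balls dropped independently and uniformly into $r$ boxes, with $S$ the number of empty boxes):

```python
from fractions import Fraction as F
from itertools import product

# Exhaustive check of E(S) = (r-1)^n / r^(n-1) for small r, n.
r, n = 4, 3
total = F(0)
for boxes in product(range(r), repeat=n):   # all r^n equally likely outcomes
    total += r - len(set(boxes))            # number of empty boxes here

E_S = total / r ** n
assert E_S == F((r - 1) ** n, r ** (n - 1))
```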

::

EXERCISE 12: 

Hint:

(a) 1.

(b) $E(X_i^2) = E(X_i) = \frac 1n.$ Also for $i\neq j$ we have $E(X_iX_j) = \frac{1}{n(n-1)}.$

So $$\begin{eqnarray*} V(S_n) & = & E(S_n^2)-E^2(S_n)\\ & = & E(S_n^2)-1\\ & = & \sum E(X_i^2) + \sum_{i\neq j} E(X_iX_j)-1\\ & = & n\times\frac 1n + n(n-1)\times \frac{1}{n(n-1)}-1 = 1. \end{eqnarray*}$$
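The moments in this hint ($E(X_i)=\frac 1n,$ $E(X_iX_j)=\frac{1}{n(n-1)}$) are those of the indicators of fixed points of a uniformly random permutation, so, assuming that is the setting, the result $E(S_n)=V(S_n)=1$ can be checked exhaustively for a small $n$:

```python
from fractions import Fraction as F
from itertools import permutations
from math import factorial

# S_n = number of fixed points of a uniformly random permutation of n symbols.
n = 5
counts = [sum(p[i] == i for i in range(n)) for p in permutations(range(n))]
E_S = F(sum(counts), factorial(n))
E_S2 = F(sum(c * c for c in counts), factorial(n))

assert E_S == 1
assert E_S2 - E_S ** 2 == 1   # V(S_n) = E(S_n^2) - E^2(S_n) = 1
```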

::

EXERCISE 13: 

Hint:

$cor(X_1-X_2,X_2+X_3) = \frac{cov(X_1-X_2,X_2+X_3)}{\sqrt{V(X_1-X_2)\cdot V(X_2+X_3)}}.$

Now $V(X_1-X_2) = V(X_1)+V(X_2) = \sigma_1^2 + \sigma_2^2,$ since $X_1,X_2$ independent.

Similarly, $V(X_2+X_3) = \sigma_2^2+\sigma_3^2.$

Also $cov(X_1-X_2,X_2+X_3) = cov(X_1,X_2)+cov(X_1,X_3)-cov(X_2,X_2)-cov(X_2,X_3) = -\sigma_2^2.$

::

EXERCISE 14: 

Hint:

$V(X-2Y) = V(X)+4V(Y)-4cov(X,Y) = 1+8-4\times \frac 12\times \sqrt{1\times 2} = 9-2\sqrt 2.$

::

EXERCISE 15: 

Hint:

Here $U+V=2.$ Since $V=2-U$ is a linear relation with negative slope, $cor(U,V)=-1.$

::

EXERCISE 16: 

Hint:

The joint PMF of $(X,Y)$ is

|         | $Y=1$      | $Y=2$      | $Y=3$      |
|---------|------------|------------|------------|
| $X=1$   | $0$        | $\frac 16$ | $\frac 16$ |
| $X=2$   | $\frac 16$ | $0$        | $\frac 16$ |
| $X=3$   | $\frac 16$ | $\frac 16$ | $0$        |

So $XY$ takes the values $2,3,6$ each with probability $\frac 13.$ Hence $E(XY) = \frac{11}{3}.$

Also $E(X) = E(Y) = 2$ and $E(X^2) = E(Y^2) = \frac{14}{3}.$

So $V(X)=V(Y) = \frac{14}{3}-4 = \frac 23.$ Also $cov(X,Y)=\frac{11}{3}-4=-\frac 13.$

Hence $cor(X,Y) =-\frac 12. $
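All the numbers in this hint follow from the table by exact computation:

```python
from fractions import Fraction as F

# The joint PMF from the table: zero on the diagonal, 1/6 off it.
joint = {(x, y): F(1, 6) for x in (1, 2, 3) for y in (1, 2, 3) if x != y}

E_X = sum(x * p for (x, y), p in joint.items())     # = E(Y) by symmetry
E_X2 = sum(x * x * p for (x, y), p in joint.items())
E_XY = sum(x * y * p for (x, y), p in joint.items())

V_X = E_X2 - E_X ** 2     # = V(Y) by symmetry
cov = E_XY - E_X * E_X    # E(Y) = E(X) here
cor = cov / V_X           # sqrt(V(X) V(Y)) = V(X) since V(X) = V(Y)
```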

::

EXERCISE 17: 

Hint:

(a) This is because the $i$-th trial cannot produce both 1 and 2 together!

(b) The trials are independent. So $E(I_iJ_j) = E(I_i)E(J_j) = p_ip_j.$

(c), (d), (e): Simple algebra.

::

EXERCISE 18: 

Hint:

(a) $E(I_i) = P(i$-th element in the sample is of type 1$)=\frac{r_1}{r}.$

Similarly, $E(J_i) = \frac{r_2}{r}.$

(b) $E(I_iJ_j) = $ probability that the $i$-th and $j$-th elements in the sample are, respectively, of types 1 and 2.

Now these two elements may be chosen in $r(r-1)$ ways in all. These are all equally likely. Total number of favourable cases is $r_1r_2.$ Hence the probability is $\frac{r_1r_2}{r(r-1)}.$

(c) $E(XY) = E\big[(\sum I_i)(\sum_j J_j)\big] = \sum_{i,j} E(I_iJ_j)= \sum_{i\neq j} E(I_iJ_j),$ since $E(I_iJ_i)=0.$

So $E(XY) = n(n-1)\times\frac{r_1r_2}{r(r-1)}.$

Also $E(X^2) = E\left(\sum I_i\right)^2 = \sum E(I_i^2) + \sum_{i\neq j} E(I_iI_j) =n\times\frac{r_1}{r} + n(n-1) \times\frac{r_1(r_1-1)}{r(r-1)}.$

The rest follows using simple algebra.

::

EXERCISE 19: 

Here "density" means "PMF". Also $\mu=E(X).$

Hint:

By symmetry around $2$ we see that $\mu = 2.$

Also $V(X) = E(X-\mu)^2 = 0^2\times \frac{16}{18}+1^2\times \frac{2}{18} = \frac 19.$

So we are looking for $\delta$ such that $P(|X-\mu|\geq\delta) = \frac{1}{9\delta^2}.$

Now, for $\delta>0,$ the LHS is either 0 or $\frac 19$ (according as $\delta$ is $> 1$ or not).

So $\delta=1$ makes both sides $\frac 19.$

::

EXERCISE 20: 

Hint:

Let $X=$ number of defective bolts in a random shipment.

We want to choose $a$ such that $P(X> a) \leq 0.01.$

Here $X$ can take values 0,1,2,...,10000 with the probabilities $$P(X=k) = \binom{10000}{k} 0.05^k 0.95^{10000-k}=p_k,\mbox{ say.}$$ The probability of refund is $\sum_{k>a} p_k.$

So $a$ needs to be chosen such that $$\sum_{k>a} p_k \leq 0.01 <\sum_{k\geq a} p_k.$$ Finding this $a$ is not easy by hand, though trivial using a computer.
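Here is one way a computer can find this $a,$ assuming (as in the hint) $X\sim$ Binomial$(10000, 0.05)$ and refund threshold $0.01$; the PMF is evaluated in log scale to avoid overflow and underflow:

```python
from math import lgamma, log, exp

# Binomial(n, p) log-PMF via log-gamma, safe for n = 10000.
def log_pmf(k, n=10000, p=0.05):
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def upper_tail(a, n=10000):
    # P(X > a); the mean is 500 and SD about 22, so terms beyond
    # roughly 12 SDs above the mean are negligible and can be dropped.
    hi = min(n, 500 + 12 * 22)
    return sum(exp(log_pmf(k)) for k in range(a + 1, hi + 1))

# Smallest a with P(X > a) <= 0.01.
a = next(a for a in range(400, 700) if upper_tail(a) <= 0.01)
```

Since `upper_tail` is decreasing in `a`, the first value found is exactly the $a$ bracketed by the displayed inequalities.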

There is a theorem called the Central Limit Theorem which allows a simple approximate way to find $a.$ We shall learn it in the next semester.

::

EXERCISE 21: 

A brief note about probability generating functions: If $X$ takes non-negative integer values with $p_i = P(X=i)$ for $i=0,1,2,...$ then its probability generating function is $$\Phi_X(t) = p_0 + p_1t + p_2 t^2 +\cdots.$$ Clearly this converges absolutely for $|t|\leq 1.$ In this problem we are assuming that it converges for all $t\in{\mathbb R}.$

Hint:

(a) Let $Y =\left\{\begin{array}{ll}t^{x_0}&\text{if }X\leq x_0\\ 0&\text{otherwise.}\end{array}\right.. $

Then, for $t\in[0,1],$ we have $Y\leq t^X.$ (Remember that $x\mapsto t^x$ is a non-increasing function for $t\in[0,1]$).

So $E(Y)\leq E(t^X).$ Now $E(Y) = t^{x_0}P(X\leq x_0).$

Hence the result.

(b) Let $Z =\left\{\begin{array}{ll}t^{x_0}&\text{if }X\geq x_0\\ 0&\text{otherwise.}\end{array}\right.. $

Then, for $t\geq 1,$ we have $Z \leq t^X.$

Hence the result follows as in (a).
