We know from the definition of expectation that it comes in one of four varieties: finite, $\infty,$ $-\infty,$
or undefined.
The finite case is the most useful, and it
sometimes helps to know some sufficient conditions for finiteness.
Proof:
Goes without saying!
[QED]
Non-negative random variables have the advantage that their expectation is always defined (though it may be $\infty$). Now,
from any random variable $X$ we can easily manufacture a non-negative random variable, viz., $|X|.$ It is
good to be able to relate $E(X)$ with $E(|X|).$
Proof:
We define $X_+, X_-$ as usual.
Then $X = X_+-X_-$ and $|X| = X_++X_-$.
Since $X_+$ and $X_-$ are non-negative, $E(|X|) = E(X_+)+E(X_-)$ is finite if and only if both $E(X_+)$ and $E(X_-)$ are finite.
Again, by the definition of expectation, $E(X)$ is finite if and only if both $E(X_+)$ and $E(X_-)$ are finite.
Hence the result.
[QED]
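The two identities used in this proof can be checked on a small example. The sketch below assumes a hypothetical three-point PMF (it is not taken from these notes) and uses exact fractions, so the comparisons are exact.

```python
from fractions import Fraction

# Hypothetical PMF of X, chosen only for illustration.
pmf = {-2: Fraction(1, 4), 0: Fraction(1, 4), 3: Fraction(1, 2)}

def expect(g):
    """Return E(g(X)) for the finite PMF above."""
    return sum(g(x) * p for x, p in pmf.items())

e_plus = expect(lambda x: max(x, 0))    # E(X_+), positive part
e_minus = expect(lambda x: max(-x, 0))  # E(X_-), negative part
e_x = expect(lambda x: x)               # E(X)
e_abs = expect(abs)                     # E(|X|)

print(e_x == e_plus - e_minus)    # X   = X_+ - X_-
print(e_abs == e_plus + e_minus)  # |X| = X_+ + X_-
```

For a finite PMF everything is of course finite; the point is only to see the positive and negative parts at work.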
::
EXERCISE 1: If $E(|X|)=\infty,$ then what can you say about $E(X)?$
A random variable is, well, random. So it may very well differ from
its expectation. By how much? A lot or a little? We can use
expectation to find that out.
Proof:
We shall not do the proof here. But here is the main idea:
$$
e^{tX} = 1 + \frac{tX}{1!} + \frac{t^2X^2}{2!} + \frac{t^3X^3}{3!} + \cdots.
$$
From this we want to write
$$
E(e^{tX}) = 1 + \frac{tE(X)}{1!} + \frac{t^2E(X^2)}{2!} + \frac{t^3E(X^3)}{3!} + \cdots.
$$
This is not a precise statement, because we do not know if all
raw moments of $X$ exist finitely. Also, even if they do, is
it valid to "distribute" expectation over an infinite sum?
Answers to these questions require deeper real analysis results
than we know at this point.
However, assuming that this is valid, we may try to differentiate
both sides to get
$$
\frac{d}{dt} E(e^{tX}) = E(X) + \frac{2tE(X^2)}{2!} + \frac{3t^2E(X^3)}{3!} + \cdots.
$$
Again this step needs justification. Can we "distribute"
differentiation over an infinite sum?
Assuming that we can, putting $t=0$ indeed gives us $E(X).$
Similarly, differentiating once again and putting $t=0$
gives us $E(X^2),$ and so on.
[QED]
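For a random variable taking finitely many values the sums above are finite, so none of the justification issues arise, and the idea can be tried numerically. The following is a minimal sketch, assuming a hypothetical PMF and approximating the derivatives at $t=0$ by central differences with a hand-picked step `h`.

```python
import math

# Hypothetical PMF of X, chosen only for illustration.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def mgf(t):
    """E(exp(tX)) for the finite PMF above."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

h = 1e-4  # step size for the finite differences (an arbitrary small value)

# Central-difference approximations to the first and second derivatives at t = 0.
first = (mgf(h) - mgf(-h)) / (2 * h)
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2

# Raw moments computed directly from the PMF.
mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x * x * p for x, p in pmf.items())

print(first, mean)            # these two should nearly agree
print(second, second_moment)  # and so should these
```

The agreement is only approximate, because a finite difference is not an exact derivative.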
We shall not spend much time with MGFs, because there is a better
alternative called the characteristic function (CF).
Don't be nervous about seeing the expectation of a complex random
variable. It is simply
$$
E(\cos tX) + i E(\sin tX).
$$
CFs are better than MGFs for two reasons, which we give as
theorems below.
Proof:
This is obvious, since $\sin tX$ and $\cos tX$ are both
bounded random variables, and hence have finite expectations.
[QED]
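The cosine and sine decomposition, and the boundedness used in the proof, can be illustrated as follows. This is a sketch under the assumption of a hypothetical finite PMF; `cf` computes $E(e^{itX})$ with complex arithmetic, while `cf_parts` computes $E(\cos tX) + iE(\sin tX)$ separately.

```python
import cmath
import math

# Hypothetical PMF of X, chosen only for illustration.
pmf = {-1: 0.3, 0: 0.2, 2: 0.5}

def cf(t):
    """E(exp(itX)) computed with complex exponentials."""
    return sum(cmath.exp(1j * t * x) * p for x, p in pmf.items())

def cf_parts(t):
    """E(cos tX) + i E(sin tX), computed from the two real expectations."""
    re = sum(math.cos(t * x) * p for x, p in pmf.items())
    im = sum(math.sin(t * x) * p for x, p in pmf.items())
    return complex(re, im)

for t in (-3.0, 0.0, 0.7, 10.0):
    # The two computations agree, and the modulus never exceeds 1.
    print(t, abs(cf(t) - cf_parts(t)) < 1e-12, abs(cf(t)) <= 1 + 1e-12)
```

Unlike the MGF, nothing here can blow up as $t$ grows, because $|\cos tX|$ and $|\sin tX|$ never exceed 1.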
Proof:
Not in this course.
[QED]
Indeed, this property has earned characteristic functions their name.
MGFs do not have this property. It is possible to get (rather
ugly) counter-examples of random variables $X$ and $Y$
that both have the same MGF (in particular both have the same
domain $D\subseteq{\mathbb R}$), but still $X$ and $Y$ have
different distributions. However, if the domain includes a
neighbourhood of $0,$ then $X,Y$ must have the same
distribution. This is stated in the following theorem.
Proof:
Too difficult for this course.
[QED]
We shall not spend time proving any results about MGFs here. We shall
learn the proofs for CFs in the next semester.
EXERCISE 2: A box has 6 red balls and 4 black balls. An SRSWR of
size $n$ is selected. If $X$ is the number of red
balls selected, then find PMF of $X$ and $E(X).$ Also solve the
problem in the case of SRSWOR.
For SRSWR: $P(X=x) = \binom{n}{x} \left(\frac{6}{10}\right)^x\left(\frac{4}{10}\right)^{n-x}$ for $x=0,1,...,n.$
For SRSWOR:
$P(X=x) = \frac{\binom{6}{x} \binom{4}{n-x}}{\binom{10}{n}}$
for $x=0,1,...,n.$
By the way, this does not mean that $X$ can indeed take all the values from 0 to $n.$ For some of these values
the probability is zero.
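Both PMFs can be tabulated exactly to check that they sum to 1 and to compute $E(X)$ numerically. The sketch below assumes $n\leq 10$ in the SRSWOR case (we cannot draw more than 10 balls without replacement); the function names are ours.

```python
from fractions import Fraction
from math import comb

def pmf_srswr(n):
    """PMF of the number of red balls in an SRSWR of size n."""
    p = Fraction(6, 10)
    return {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

def pmf_srswor(n):
    """PMF of the number of red balls in an SRSWOR of size n (assumes n <= 10)."""
    # math.comb returns 0 when the lower index exceeds the upper one,
    # which gives the zero probabilities mentioned above.
    return {x: Fraction(comb(6, x) * comb(4, n - x), comb(10, n))
            for x in range(n + 1)}

def mean(pmf):
    """Expectation of a finite PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

n = 7  # an arbitrary sample size for the check
print(sum(pmf_srswr(n).values()), sum(pmf_srswor(n).values()))
print(mean(pmf_srswr(n)), mean(pmf_srswor(n)))
print(pmf_srswor(n)[2])  # needs 5 black balls, but only 4 exist
```

With $n=7$ the SRSWOR probabilities of $x=0,1,2$ are all zero, since they would need more than 4 black balls.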
::
EXERCISE 3: Let $N$ be a positive integer. Let
$$
f(x) = \left\{\begin{array}{ll}c 2^x &\text{if }x=1,2,...,N\\0&\text{otherwise.}\end{array}\right.
$$
be a PMF. Find $c.$ Find $E(X)$ and $V(X)$ if $X$ has this PMF.
For $f(x)$ to be a PMF we need
$$f(1)+\cdots+f(N)=1.$$
Since $2+2^2+\cdots+2^N = 2^{N+1}-2,$ we get
$$c = \frac{1}{2^{N+1}-2}.$$
So
$$E(X) = \sum_{x=1}^N x f(x) = c\sum_{x=1}^N x 2^x = ...$$
Similarly, you can find $V(X).$
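Before attempting the closed forms, it may help to compute $c,$ $E(X)$ and $V(X)$ exactly for a few small values of $N.$ The following is a minimal sketch; the helper names `pmf` and `mean_var` are ours, and $V(X)$ is computed as $E(X^2)-E(X)^2.$

```python
from fractions import Fraction

def pmf(N):
    """The PMF f(x) = c * 2**x on x = 1, ..., N, with c = 1/(2**(N+1) - 2)."""
    c = Fraction(1, 2**(N + 1) - 2)
    return {x: c * 2**x for x in range(1, N + 1)}

def mean_var(N):
    """Return (E(X), V(X)) by direct summation over the PMF."""
    f = pmf(N)
    mean = sum(x * p for x, p in f.items())
    second = sum(x * x * p for x, p in f.items())
    return mean, second - mean**2

for N in (1, 2, 3, 10):
    print(N, sum(pmf(N).values()), mean_var(N))
```

These values can then be used to check whatever closed-form expressions you obtain.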
::
EXERCISE 4: An SRSWR of size 2 is drawn from $\{1,2,...,12\}.$
Let $X$ be the maximum of the two numbers
selected. Find $E(X).$
Here $X$ can take only the values $1,2,...,12.$
Let $X_1, X_2$ denote the two numbers selected. For $k\in\{1,2,...,12\}$ we have
$$P(X\leq k) = P(X_1\leq k,\ X_2 \leq k) = \left(\frac{k}{12}\right)^2.$$
So $P(X=k) = \frac{k^2-(k-1)^2}{144} = \frac{2k-1}{144}.$
Hence $E(X) = \sum_{k=1}^{12} \frac{2k^2-k}{144}=....$
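The formula for $E(X)$ can be cross-checked by brute force, since an SRSWR of size 2 from $\{1,...,12\}$ has only $144$ equally likely ordered outcomes. A minimal sketch:

```python
from fractions import Fraction
from itertools import product

values = range(1, 13)

# Brute force: average the maximum over all 144 equally likely ordered pairs.
brute = Fraction(sum(max(a, b) for a, b in product(values, repeat=2)), 144)

# The sum obtained above from the PMF P(X = k) = (2k - 1)/144.
formula = sum(Fraction(2 * k * k - k, 144) for k in values)

print(brute == formula)
print(formula)
```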
::
EXERCISE 5: An SRSWR of size $n$ is selected
from $\{1,2,...,12\}.$ Let $a_n $ be the expected
value of the maximum of the sample. Show that $a_n \leq
a_{n+1}$ without explicitly finding $a_n$ in terms of $n.$
Let $X_1,...,X_{n+1}$ be an SRSWR of size $n+1$ from $\{1,...,12\}.$
Then $X_1,...,X_n$ is an SRSWR of size $n$ from $\{1,...,12\}.$
Let $U = \max\{X_1,...,X_{n+1}\}$ and $V = \max\{X_1,...,X_n\}.$
Then $U = \max\{V,X_{n+1}\} \geq V.$
So $E(U)\geq E(V).$
Hence $a_{n+1}\geq a_n,$ as required.
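The argument does not need the values of $a_n,$ but the conclusion can still be checked numerically. The sketch below assumes the same reasoning as in Exercise 4 extended to $n$ draws, namely $P(\max\leq k) = (k/12)^n$; the function name `expected_max` is ours.

```python
from fractions import Fraction

def expected_max(n, m=12):
    """E(max) of an SRSWR of size n from {1, ..., m}.

    Uses P(max = k) = (k/m)**n - ((k-1)/m)**n.
    """
    return sum(k * (Fraction(k, m)**n - Fraction(k - 1, m)**n)
               for k in range(1, m + 1))

a = [expected_max(n) for n in range(1, 9)]  # a_1, ..., a_8
print([float(v) for v in a])
print(all(a[i] <= a[i + 1] for i in range(len(a) - 1)))
```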
(a) By the Markov inequality, $E(X)\geq 85 P(X> 85).$
So $P(X> 85) \leq \frac{75}{85}.$
(b) $P(65\leq X \leq 85) =
P(|X-75|\leq 10) = 1- P(|X-75|> 10)\geq 1-\frac{V(X)}{100} = \frac 34$ by Chebyshev.
(c) Let the answer be $n$, and class average be $\bar X.$
Then $E(\bar X) = 75$ and $V(\bar X) = \frac{25}{n}.$
So, by the Chebyshev inequality, $P(|\bar X-75|\geq 5) \leq \frac{25}{5^2n} = \frac 1n. $
So we need $1-\frac 1n \geq 0.9$ or $n\geq 10.$
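The three bounds are simple arithmetic, and can be reproduced exactly as follows. This sketch takes the mean $75$ and variance $25$ used in the solution above as given, and searches for the smallest $n$ for which the Chebyshev lower bound reaches $0.9.$

```python
from fractions import Fraction

mean, var = 75, 25  # the values used in the solution above

# (a) Markov: P(X > 85) <= E(X)/85.
markov_bound = Fraction(mean, 85)

# (b) Chebyshev: P(|X - 75| <= 10) >= 1 - V(X)/10**2.
cheb_bound = 1 - Fraction(var, 10**2)

# (c) Chebyshev for the average of n scores, whose variance is var/n.
def lower_bound(n):
    return 1 - Fraction(var, n) / 5**2

n = 1
while lower_bound(n) < Fraction(9, 10):
    n += 1

print(markov_bound, cheb_bound, n)
```

Note that these are only bounds: the true probabilities may be much better than the inequalities guarantee.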
Here $P(X\leq x) = F_X(x) = F_Y\left(\frac{x-a}{b}\right) = P\left(Y\leq \frac{x-a}{b}\right) = P(a+bY\leq x).$
Since this holds for all $x\in{\mathbb R},$ $X$ and $a+bY$ have the same CDF.
Since the CDF determines the distribution uniquely, $X$ and $a+bY$ have the same distribution.
(a) $E(X) = E(a+bY) = a+bE(Y).$
(b) $V(X) = V(a+bY) = b^2 V(Y).$
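Parts (a) and (b) can be verified exactly on a small example. The sketch below assumes a hypothetical PMF for $Y$ and hypothetical constants $a$ and $b>0$ (positivity of $b$ is what keeps the direction of the inequality in the CDF step above).

```python
from fractions import Fraction

# Hypothetical PMF of Y and constants a, b, chosen only for illustration.
pmf_y = {-1: Fraction(1, 2), 0: Fraction(1, 4), 4: Fraction(1, 4)}
a, b = Fraction(3), Fraction(2)  # b > 0 is assumed

def mean(pmf):
    """Expectation of a finite PMF given as {value: probability}."""
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    """Variance of a finite PMF, as E((X - E(X))**2)."""
    m = mean(pmf)
    return sum((v - m)**2 * p for v, p in pmf.items())

# PMF of X = a + bY: same probabilities, transformed values.
pmf_x = {a + b * y: p for y, p in pmf_y.items()}

print(mean(pmf_x) == a + b * mean(pmf_y))  # part (a)
print(var(pmf_x) == b**2 * var(pmf_y))     # part (b)
```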