Random variables

Random variables
What they are
New random variables from old ones
Addition, multiplication etc
Functions of a random variable
Distribution of a random variable
CDF
Different types of random variables
PMF
Problems for practice

Random variables $\newcommand{\pow}{{\mathcal P}}$

Random variables

What they are

Suppose that I toss a fair coin, and offer you Rs 10 for a head, and demand Rs 20 for a tail. In other words, your gain (in Rs) from this deal is $10$ for head and $-20$ for tail. Both $10$ and $-20$ are constants, but since you do not know which of these two constants you are going to get, your gain is a variable. Since it varies with chance, we call it a random variable.

Think of this as made of two stages. In the first stage we have a random experiment with $\Omega = \{head, tail\}.$ In the second stage we have a function $X:\Omega\rightarrow {\mathbb R}$ defined as $$\begin{eqnarray*} X(head) & = & 10,\\ X(tail) & = & -20. \end{eqnarray*}$$ There is nothing random about this function. The randomness comes from the mechanism that decides what goes into this function: head or tail?

We use this idea to define random variables mathematically. We start with a random experiment which is the provider of the randomness. Then any (real valued) function defined on its sample space is called a random variable. In probability theory, it is the function (which is not at all random) that is called the random variable. Thus, if in the above coin toss example, we replace the fair coin with a biased coin, but keep the payment rules the same, then we still have the same random variable.

Beginners often find it odd: a random variable is neither random nor a variable!

However, it is not as unnatural as it sounds. In calculus also we write $y = x^2$ and say $y$ is a variable as well as $y$ is a function of $x.$

EXAMPLE 1: In the coin tossing example with a fair coin, let your gain be denoted by $X$ (or sometimes $X(w),$ if you want to emphasize that it is a function). Find $P(X=10).$

SOLUTION: The immediate answer is $\frac 12.$ Let's see the steps that led to this answer. $P(X=10)$ is the probability that $X$ is $10,$ i.e., the probability that the coin toss has produced an outcome for which the function $X$ takes the value $10.$ Thus $$ P(X=10) = P\big\{w\in\{head,tail\}~:~X(w)=10\big\}. $$ Now $\big\{w\in\{head,tail\}~:~X(w)=10\big\} = \{head\},$ and so the problem now reduces to finding $P(\{head\}),$ which is $\frac 12.$ ■

The general case, then, looks like this: We have a random experiment with sample space $\Omega.$ A random variable $X$ is a function $X:\Omega\rightarrow {\mathbb R}.$ (More generally, the codomain could be any set of our choice, but here we take it to be ${\mathbb R}.$) If someone gives us some $A\subseteq {\mathbb R}$ and asks us to find $P(X\in A),$ we are to actually find $$ P\big(\big\{w\in\Omega~:~X(w)\in A\big\}\big). $$ Remember that this is the definition of $P(X\in A).$ The complicated looking set $\big\{w\in\Omega~:~X(w)\in A\big\}$ is often abbreviated to $\{X\in A\}$ or $X ^{-1} (A).$

Don't let the notation $X ^{-1}$ make you think that $X:\Omega\rightarrow{\mathbb R}$ has to be invertible. For any $f:A\rightarrow B$ we can define $f ^{-1}:\pow(B)\rightarrow\pow(A)$ as follows (here $\pow$ denotes power set, i.e., the set of all subsets): for $V\in\pow(B)$ we define $f ^{-1}(V)=\{a\in A~:~f(a)\in V\}$.
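The definition of $P(X\in A)$ through $X ^{-1}(A)$ can be tried out on the coin toss of Example 1. The following is only a minimal sketch: the names `prob`, `preimage` and `prob_in` are made up for this illustration, and the random variable is stored as a dictionary on the sample space.

```python
from fractions import Fraction

# Sample space of a fair coin toss with the probability of each outcome.
prob = {"head": Fraction(1, 2), "tail": Fraction(1, 2)}

# The random variable X is an ordinary (non-random) function on the sample space.
X = {"head": 10, "tail": -20}

def preimage(rv, A):
    """Return {w : rv(w) in A}. The function rv need not be invertible."""
    return {w for w, value in rv.items() if value in A}

def prob_in(rv, A, prob):
    """P(rv in A), computed as the probability of the preimage of A."""
    return sum((prob[w] for w in preimage(rv, A)), Fraction(0))

print(preimage(X, {10}))       # the event {X = 10}
print(prob_in(X, {10}, prob))  # P(X = 10)
```

Replacing `prob` by the probabilities of a biased coin changes the answer, but `X` itself stays the same function.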

New random variables from old ones

Addition, multiplication etc

Sometimes we need to combine the values of two or more random variables. Say $X,Y$ are both random variables and we want to compute $X+Y.$ Since random variables are actually functions, this sum can be formed only when $X$ and $Y$ have the same domain. This simple point sometimes needs careful handling, as the following example shows.

EXAMPLE 2: I am playing against two gamblers simultaneously. One gambler tosses a fair coin and pays Rs 10 for a head and takes Rs 20 for a tail. The other gambler takes Rs 3 from me, rolls a fair die and pays me as many rupees as the outcome. What is my total gain?

SOLUTION: If I call the gain from the first gambler $X,$ then $X$ is a function from $\{head,tail\}$ to ${\mathbb R},$ while the gain from the second gambler is a function $Y:\{1,2,3,4,5,6\}\rightarrow{\mathbb R}.$ Obviously, $X+Y$ does not make any sense here. We need to first combine the two random experiments to get the product sample space $\Omega = \{head,tail\}\times\{1,2,3,4,5,6\},$ and then consider $X,Y$ both as functions from $\Omega$ to ${\mathbb R}.$ For example, $X(head,4) = 10$ and $Y(head,4) = 4-3 = 1.$

Now it is meaningful to talk about $X+Y.$ ■
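The construction in Example 2 can be sketched in code as follows. The example itself does not say how probable the 12 combined outcomes are; the sketch assumes that they are all equally likely (i.e., that the coin and the die do not influence each other). The names `omega`, `total` and `dist` are ours.

```python
from fractions import Fraction
from itertools import product

coin = ["head", "tail"]
die = [1, 2, 3, 4, 5, 6]

# Product sample space. Assumption: all 12 outcomes are equally likely.
omega = list(product(coin, die))
p = Fraction(1, len(omega))

def X(w):
    """Gain from the first gambler; depends only on the coin."""
    return 10 if w[0] == "head" else -20

def Y(w):
    """Gain from the second gambler; depends only on the die."""
    return w[1] - 3

def total(w):
    """X + Y makes sense now, since X and Y share the domain omega."""
    return X(w) + Y(w)

# Probability of each possible value of the total gain.
dist = {}
for w in omega:
    dist[total(w)] = dist.get(total(w), Fraction(0)) + p

print(total(("head", 4)))
print(sorted(dist.items()))
```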

Functions of a random variable

Is any function of a random variable again a random variable? Well, if our $\Omega$ is a countable set (finite/infinite), then the answer is "yes". We shall not worry about the uncountable case here.

Distribution of a random variable

Consider another gambling game.

EXAMPLE 3: A fair die is rolled. I shall pay you Rs 10 if the die shows an even number, you'll pay me Rs 20 otherwise. Let's denote by $Y$ your gain (in Rs). Express $Y$ as a function from $\{1,2,3,4,5,6\}$ to ${\mathbb R}.$ Let $A = \{10\}.$ Find $Y ^{-1} (A)$ and using it find $P(Y\in A).$

SOLUTION: Here $Y(w) = 10$ for $w\in\{2,4,6\}$ and $Y(w) = -20$ for $w\in\{1,3,5\}.$ Hence $Y^{-1}(A) = \{2,4,6\}.$ So $P(Y=10) = P(\{2,4,6\}) = \frac 16+\frac 16+\frac 16 = \frac 12.$ ■

In both Example 1 and Example 3 we had a random variable that took only two values $10$ and $-20.$ Which random variable do you think is more profitable for you, $X$ or $Y$? Well, both are actually the same so far as profit goes. Understand this carefully: $X$ and $Y$ are completely different as functions (their domains are also different), but in terms of the "behaviour of the output" of the functions they are identical. This "behaviour of the output" is called the distribution of the random variable. It is the distribution which we care about mostly in real applications. So we often start a discussion as

Let $X$ be a random variable taking values $10$ and $-20$ each with probability $\frac 12.$

We understand implicitly that there is some random experiment (say the coin toss experiment or the die roll experiment or something similar) and some function from its sample space to ${\mathbb R}$ such that the distribution is as specified. In this course, we shall often omit the sample space or the function.
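To see concretely that the coin-based $X$ of Example 1 and the die-based $Y$ of Example 3 have the same distribution although they are different functions, we may tabulate the probability of each output value. This is a small sketch; the helper `distribution` is a name introduced here for illustration.

```python
from fractions import Fraction

def distribution(rv, prob):
    """Map each value taken by rv to the total probability of the outcomes giving it."""
    dist = {}
    for w, p in prob.items():
        dist[rv(w)] = dist.get(rv(w), Fraction(0)) + p
    return dist

coin_prob = {"head": Fraction(1, 2), "tail": Fraction(1, 2)}
die_prob = {k: Fraction(1, 6) for k in range(1, 7)}

def X(w):
    """Example 1: gain decided by a fair coin."""
    return 10 if w == "head" else -20

def Y(w):
    """Example 3: gain decided by a fair die."""
    return 10 if w % 2 == 0 else -20

dist_X = distribution(X, coin_prob)
dist_Y = distribution(Y, die_prob)
print(dist_X)
print(dist_Y)
print(dist_X == dist_Y)
```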

Definition: Distribution of a random variable By the distribution of a random variable $X$ we mean any statement that gives us $P(X\in B)$ for any set $B\subseteq{\mathbb R}.$

How do we specify the distribution of a random variable? Do we make a list of all the subsets of ${\mathbb R}$, and label them with their probabilities? That would be insane, because there are uncountably infinitely many such subsets. It turns out that specifying the probabilities of intervals like $(-\infty, a]$ is enough. This is what we discuss next.

CDF

Definition: Let $X$ be any real valued random variable. Then its (cumulative) distribution function (CDF) is defined as the function $F:{\mathbb R}\rightarrow[0,1]$ where $\forall x\in{\mathbb R}~~F(x) = P(X\leq x).$

EXAMPLE 4: Consider the gambling game that tosses a coin, and has payoffs $-10$ for head, and $20$ for tail. Let $X$ denote the payoff. What is its CDF?

SOLUTION: Here $X$ takes only two values $-10$ and 20, each with probability $\frac 12.$

So $F(a) = P(X\leq a) = 0$ whenever $a<-10.$

But $F(-10)=P(X\leq -10) = \frac 12.$ Indeed, as long as $a\in[-10,20)$ we have $F(a) = \frac 12.$

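At $a=20,$ we have $F(a) = 1.$ In fact, $\forall a\geq 20~~F(a) = 1.$ So the graph is a step function with jumps of size $\frac 12$ at $-10$ and at $20$: $$ F(a) = \left\{\begin{array}{ll}0&\text{if }a < -10\\\frac 12&\text{if }-10\leq a < 20\\1&\text{if }a\geq 20.\end{array}\right. $$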

■
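The CDF of Example 4 can also be evaluated mechanically from the two values and their probabilities. A minimal sketch (the dictionary `pmf` and the function name `cdf` are our own choices):

```python
from fractions import Fraction

# Values of X in Example 4 with their probabilities.
pmf = {-10: Fraction(1, 2), 20: Fraction(1, 2)}

def cdf(a):
    """F(a) = P(X <= a): add the probabilities of all values not exceeding a."""
    return sum((p for value, p in pmf.items() if value <= a), Fraction(0))

for a in [-11, -10, 0, 19.99, 20, 100]:
    print(a, cdf(a))
```

Evaluating just below and at each of $-10$ and $20$ shows the two jumps of the step function.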

The following properties of a CDF are more or less obvious.

Theorem Let $F(x)$ be the CDF of some rv $X.$ Then

$F(x)$ must be nondecreasing, i.e., $\forall x < y\in{\mathbb R}~F(x)\leq F(y).$
$\lim_{x\rightarrow-\infty} F(x) = 0.$
$\lim_{x\rightarrow\infty} F(x) = 1.$
$F(x)$ must be right continuous, i.e., $\forall a\in{\mathbb R}~~F(a+)=F(a).$

Proof:

Take any $x < y.$ Since $\{X\leq x\} \subseteq \{X\leq y\},$ we have $P(\{X\leq x\}) \leq P(\{X\leq y\}),$ i.e., $F(x)\leq F(y).$

Shall show $$ \forall \epsilon>0 ~~ \exists M \in{\mathbb R} ~~ \forall x < M~~ |F(x)-0| < \epsilon. $$ (Actually we may drop the absolute value sign around $F(x)$ since it is anyway $\geq 0$).

Take any $\epsilon>0.$

Let $A_n$ be the event that $\{X \leq -n\}$ for $n\in{\mathbb N}.$ Then $F(-n) = P(A_n).$ Clearly, $A_1\supseteq A_2\supseteq A_3\supseteq\cdots$ and $\cap A_n=\phi.$
So $P(A_n)\rightarrow 0,$ i.e., $F(-n)\rightarrow 0.$
So $\exists N\in{\mathbb N} ~~F(-N)<\epsilon.$
Choose $M = -N.$
Take any $x < M.$
Then $0\leq F(x) \leq F(M)<\epsilon,$ since $F(\cdot)$ is nondecreasing.
So $|F(x)-0| < \epsilon,$ as required.
The proof for the limit at $\infty$ is very similar. Try it yourself first.
Shall show $$ \forall \epsilon>0 ~~ \exists M \in{\mathbb R} ~~ \forall x > M~~ |F(x)-1| < \epsilon. $$ (Actually $|F(x)-1|$ is just $1-F(x),$ since $F(x)\leq 1$ anyway.)

Take any $\epsilon>0.$

Let $A_n$ be the event that $\{X \leq n\}$ for $n\in{\mathbb N}.$ Then $P(A_n)=F(n).$

Clearly, $A_1\subseteq A_2\subseteq A_3\subseteq\cdots$ and $\cup A_n=\Omega.$
So $P(A_n)\rightarrow 1,$ i.e., $F(n)\rightarrow1.$
So $\exists N\in{\mathbb N} ~~|F(N)-1|<\epsilon.$
Choose $M = N.$
Take any $x > M.$
Then $0\leq 1-F(x) \leq 1-F(M) <\epsilon,$ since $F(\cdot)$ is nondecreasing.
So $|F(x)-1| < \epsilon,$ as required.

Shall show: $$ \forall a\in{\mathbb R}~~\forall \epsilon>0~~\exists \delta>0~~ \forall x\in (a,a+\delta)~~|F(x)-F(a)| < \epsilon. $$ Take any $a\in{\mathbb R}$ and any $\epsilon>0.$
Let $A_n$ be the event that $\left\{X\leq a+\frac 1n\right\}$ for $n\in{\mathbb N}.$
Also let $A$ be the event that $\{X\leq a\}.$
Then $A_1\supseteq A_2\supseteq\cdots$ and $\cap A_n = A.$
So $P(A_n)\rightarrow P(A)$ and hence $F\left(a+\frac 1n\right)\rightarrow F(a).$
Hence $\exists N\in{\mathbb N} ~~ |F\left(a+\frac 1N\right)-F(a)|<\epsilon.$
Choose $\delta = \frac 1N>0.$
Take any $x\in (a,a+\delta).$
Since $F(\cdot)$ is nondecreasing, hence $F(a)\leq F(x) \leq F(a+\delta) < F(a)+ \epsilon.$
So $|F(x)-F(a)|<\epsilon,$ as required.

[QED]

A rather nontrivial theorem is that the converse is also true. This converse is called the fundamental theorem of probability.

Fundamental theorem of probability Let $F:{\mathbb R}\rightarrow[0,1]$ be any function satisfying the properties listed in the last theorem. Then there must exist a real-valued rv $X$ with this $F(x)$ as its CDF.

Proof: Too technical for this course. [QED]

Theorem Any nondecreasing function bounded from above (and hence all CDF's) must have finite left hand limit at each point.

Proof: Let $F:{\mathbb R}\rightarrow{\mathbb R}$ be nondecreasing and bounded from above.

Take any $a\in {\mathbb R}.$

We shall show that $\lim_{x\rightarrow a-} F(x)$ exists as a finite number, i.e., $$ \exists\ell\in{\mathbb R}~~\forall \epsilon>0~~\exists \delta>0~~\forall x\in(a-\delta,a)~~|F(x)-\ell|\leq\epsilon. $$ Consider the set $A=\{F(x)~:~x < a\}.$ Then $A\neq\phi$ and $A$ is bounded from above (by $F(a)$).

So $\sup(A)\in{\mathbb R}.$

Choose $\ell = \sup(A).$

Take any $\epsilon>0.$

Then $\exists y < a~~F(y) > \ell-\epsilon.$

Choose $\delta = a-y > 0.$

Take any $x\in(a-\delta,a) = (y,a).$

Then $F(y)\leq F(x) \leq \ell,$ or, in other words, $\ell-\epsilon\leq F(x)\leq \ell.$

So $|F(x)-\ell|\leq \epsilon,$ as required. [QED]

Theorem Let $X$ have CDF $F.$ Then $$ \forall a\in{\mathbb R}~~F(a-) = P(X < a). $$

Proof: Take any $a\in{\mathbb R}.$

Let $A = \{X < a\}$ and let $A_n = \left\{X \leq a-\frac 1n\right\}$ for $n\in{\mathbb N}.$

Then $A_n\nearrow A.$

Hence $P(A_n)\rightarrow P(A).$

So $F\left(a-\frac 1n\right)\rightarrow P(A).$

But $F\left(a-\frac 1n\right)\rightarrow F(a-),$ since $F(a-)$ exists.

Hence $P(X < a) = F(a-),$ as required. [QED]

Theorem Let $X$ have CDF $F.$ Then $$ \forall a\in{\mathbb R}~~P(X=a) = F(a)-F(a-). $$

Proof: $P(X=a) = P(X\leq a)-P(X < a) = F(a)-F(a-),$ by the last theorem. [QED]
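The formula $P(X=a) = F(a)-F(a-)$ can be tried on the step CDF of Example 4. In the sketch below the left hand limit $F(a-)$ is approximated by $F\left(a-\frac 1n\right)$ for a single large $n.$ This is a shortcut that is adequate for this particular step function, not a general way to compute limits; the names `left_limit` and `point_mass` are ours.

```python
from fractions import Fraction

def F(x):
    """CDF of Example 4: jumps of size 1/2 at -10 and at 20."""
    if x < -10:
        return Fraction(0)
    if x < 20:
        return Fraction(1, 2)
    return Fraction(1)

def left_limit(F, a, n=10**6):
    """Approximate F(a-) by F(a - 1/n) for one large n (an assumption, see above)."""
    return F(a - Fraction(1, n))

def point_mass(F, a):
    """P(X = a) = F(a) - F(a-)."""
    return F(a) - left_limit(F, a)

for a in [-10, 0, 20]:
    print(a, point_mass(F, a))
```

The jump is positive exactly at the two values that $X$ can take, and zero at a point like $0$ where $F$ is continuous.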

Different types of random variables

Depending on the distribution, a random variable may be of 3 types:

Discrete: These random variables take only countably many (finite/infinitely many) values.
Continuous: If a random variable takes values in some set $S$ such that $\forall a\in S~~P(X=a)=0,$ then we call it a continuous random variable. Notice that a continuous random variable is not defined as a random variable that takes a "continuous stretch of values". However, most continuous random variables in practice do indeed take all values in an interval, e.g., height of a randomly selected person.
Neither discrete nor continuous: These take uncountably many values and for at least one value $a$ we have $P(X=a)>0.$

The following theorem justifies the adjective "continuous" for a random variable.

Theorem A random variable is continuous if and only if its CDF is continuous everywhere.

Proof: Obvious from the last theorem. [QED]

In this course we shall focus on discrete random variables only.

The distribution of a discrete random variable is completely specified by the countable set of values it can take, and the probability with which it takes each of those values. These two specifications together are called the probability mass function (PMF) of the rv.

PMF

Definition: Probability Mass Function (PMF) Let $X$ be a discrete random variable taking values $x_1,x_2,...$ with probabilities $p_1,p_2,...$. Then the probability mass function (PMF) of $X$ is defined as $p:{\mathbb R}\rightarrow[0,1]$ where $$ p(x) = \left\{\begin{array}{ll}p_i&\text{if }x=x_i\\0&\text{otherwise.}\end{array}\right. $$

Clearly, $\sum p_i = 1$ and $\forall i~~p_i\geq 0.$ A consequence of the fundamental theorem of probability is that for any countable set $\{x_1,x_2,...\}$ and for any sequence $(p_i)_i,$ for which $\forall i~~p_i\geq 0$ and $\sum p_i=1,$ there is a (discrete) random variable of which the PMF is $p(x)$ given above.

The CDF of a discrete random variable is a step function like the one we saw in our example.
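For a random variable taking finitely many values, the step-function CDF can be built directly from the PMF by accumulating the probabilities in increasing order of the values. Here is a sketch for a fair die; the function name `make_cdf` is our own, and the PMF is assumed to be given as a dictionary from values to probabilities.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

def make_cdf(pmf):
    """Return the CDF x -> P(X <= x) of a PMF with finitely many values."""
    values = sorted(pmf)
    # cumulative[i] = p(values[0]) + ... + p(values[i])
    cumulative = list(accumulate(pmf[v] for v in values))

    def cdf(x):
        k = bisect_right(values, x)  # number of values that are <= x
        return cumulative[k - 1] if k > 0 else Fraction(0)

    return cdf

die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}
F = make_cdf(die_pmf)

for x in [0, 1, 3.5, 6, 7]:
    print(x, F(x))
```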

Problems for practice

EXERCISE 1: (Easy)

[Hint]

$P(X=i) = \frac{^5P_{i-1}\times 5\times (10-i)!}{10!}$ for $i=1,2,3,4,5,6.$ The probability is 0 for $i=7,8,9,10.$

EXERCISE 2: (Easy)

[Hint]

If $n$ is even, then all even values between $0$ and $n.$ If $n$ is odd, then all the odd values in the same range.

EXERCISE 3: (Easy)

[Hint]

(a) [Figure: plot of the CDF]

(b) $P\left(X > \frac 12 \right) = 1-F\left(\frac 12\right) = \frac 34.$

(d) $P(X < 3) = F(3-) = \frac{11}{12}.$

(e) $P(X=1) = F(1)-F(1-) = \frac 16.$

EXERCISE 4: (Easy)

[Hint]

$P(X=1) = P(X\leq 1)-P(X< 1).$

Now $\{X < 1\} = \lim_n \left\{X\leq 1-\frac 1n\right\}.$ Since this is an increasing limit, hence by continuity of probability, we have $P(X<1) = \lim_n P\left(X\leq 1-\frac 1n\right) = \lim_n F\left(1-\frac 1n\right) = F(1-).$

Hence $P(X=1) = F(1)-F(1-).$

EXERCISE 5: (Easy)

[Hint]

$P(|X-1|\geq 2) = P(X\leq -1\mbox{ or }X\geq 3) = P(X\leq -1) + P(X\geq 3) = P(X\leq -1) + P(X \gt 3)+P(X=3) = F_X(-1)+1-F_X(3)+P(X=3) =F_X(-1)+1-F_X(3),$ since $P(X=3)=0$.

[Thanks to ''Chaitu'' for correcting a typo here.]

In all the following problems the term "density" stands for "PMF".

EXERCISE 6: (Easy)

[Hint]

(b) $X$ can take values $0,1,...,n$. We have $$P(X=k) = \binom{n}{k} \left(\frac 35\right)^k \left(\frac 25\right)^{n-k}$$ for $k=0,1,...,n$. The PMF is zero otherwise.

(a) Here $n$ can be at most 10. For $k=0,...,6$ we have $$P(X=k) = \frac{\binom{6}{k}\binom{4}{n-k}}{\binom{10}{n}}. $$ Here we have the understanding that $\binom{n}{r} = 0$ if $r\not\in\{0,1,...,n\}$.

EXERCISE 7: (Easy)

[Hint]

For this to be a PMF we need $f(1)+\cdots+f(N)=1$.

So $c(2+2^2+\cdots+2^N) = 1$. Since $2+2^2+\cdots+2^N = 2^{N+1}-2,$ we get $c = \frac{1}{2^{N+1}-2}$.
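The value of $c$ can be checked for small $N.$ The sketch below assumes, reading off the hint above, that the PMF is $f(x) = c\,2^x$ for $x=1,...,N$; the function name `normalising_constant` is ours.

```python
from fractions import Fraction

def normalising_constant(N):
    """The c for which c * (2 + 2^2 + ... + 2^N) equals 1."""
    return Fraction(1, sum(2**x for x in range(1, N + 1)))

for N in [1, 2, 3, 10]:
    c = normalising_constant(N)
    total = sum(c * 2**x for x in range(1, N + 1))
    print(N, c, total)
```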

EXERCISE 8: (Easy)

[Hint]

(a) $P(X<0) = f(-3)+f(-1) = 0.1 + 0.2 = 0.3$.

(b) $f(0)+f(2) +f(8) = 0.15+0.1+0.05 = 0.3$.

(d) $P(X=-3|X\leq 0)= \frac{P(X=-3 \& X\leq 0)}{P(X\leq 0)} = \frac{P(X=-3)}{P(X\leq 0)} = \frac{f(-3)}{f(-3)+f(-1)+f(0)} = \frac{0.1}{0.1+0.2+0.15} = \frac 29.$

(e) $P(X\geq 3| X > 0) = \frac{P(X\geq 3 \& X > 0)}{P(X > 0)} = \frac{P(X\geq 3)}{P(X>0)}= \frac{f(3)+f(5)+f(8)}{f(1)+f(2)+f(3)+f(5)+f(8)}.$

EXERCISE 9: (Medium)

Here they mean an SRSWR of size 2.

[Hint]

Clearly $X$ can take values in $\{1,2,...,12\}$.

So if $k\not\in\{1,2,...,12\}$, then $P(X=k) = 0$.

Now let $k\in\{1,2,...,12\}$.

Let the first number be $X_1$ and the second be $X_2.$

Then $P(X\leq k) = P(X_1\leq k \& X_2\leq k) = P(X_1\leq k)P(X_2\leq k) = \left(\frac{k}{12}\right)^2$.

Actually this formula also holds for $k=0$.

So for $k\in\{1,2,...,12\}$ we have $$P(X=k)= P(X\leq k)-P(X\leq k-1) = \frac{k^2-(k-1)^2}{144},$$ which is the required PMF.
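This PMF can be verified by brute force. The sketch below assumes, as the hint suggests, that $X$ is the larger of two numbers drawn with replacement from $\{1,2,...,12\},$ all $144$ ordered pairs being equally likely. The names `pmf` and `formula` are ours.

```python
from fractions import Fraction
from itertools import product

N = 12

# Enumerate all ordered pairs (x1, x2) and record the larger number.
pmf = {k: Fraction(0) for k in range(1, N + 1)}
for x1, x2 in product(range(1, N + 1), repeat=2):
    pmf[max(x1, x2)] += Fraction(1, N * N)

# The closed form obtained in the hint.
formula = {k: Fraction(k * k - (k - 1) ** 2, N * N) for k in range(1, N + 1)}

print(pmf == formula)
print(sum(pmf.values()))
```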

EXERCISE 10: (Medium)

[Hint]

(a) Of course we assume that $n\leq r$.

Then $P(Y\leq y) = 0$ if $y < 1$ and also $P(Y\leq y) = 1$ if $y\geq r$.

Let $y\in [1,r)$. Then $\{Y\leq y\}$ means all the balls are from $\{1,...,[y]\}$.

This has probability $\frac{\binom{[y]}{n}}{\binom{r}{n}}$.
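The formula in part (a) may be compared with a direct count. The sketch assumes our reading of the hint: $Y$ is the largest number among $n$ balls drawn without replacement from balls numbered $1,...,r,$ all subsets of size $n$ being equally likely. The values $r=6$ and $n=3$ are chosen only for illustration, and the function names are ours.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, floor

def cdf_by_enumeration(r, n, y):
    """P(Y <= y), found by listing every subset of n balls out of r."""
    samples = list(combinations(range(1, r + 1), n))
    favourable = sum(1 for s in samples if max(s) <= y)
    return Fraction(favourable, len(samples))

def cdf_by_formula(r, n, y):
    """The formula from the hint, valid for 1 <= y < r; [y] is the integer part of y."""
    return Fraction(comb(floor(y), n), comb(r, n))

r, n = 6, 3  # illustrative values only
for y in [1, 2, 3, 3.5, 4, 5]:
    print(y, cdf_by_enumeration(r, n, y), cdf_by_formula(r, n, y))
```

Note that `math.comb(m, n)` returns $0$ when $n > m,$ which matches the convention that fewer than $n$ available balls give probability $0.$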

(b) Similar.

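The next two problems refer to the following CDF: $$ F(x) = \left\{\begin{array}{ll}1-e^{-\lambda x}&\text{if }x\geq 0\\0&\text{otherwise,}\end{array}\right. $$ where $\lambda>0.$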

EXERCISE 11: (Easy)

[Hint]

$F(m) = \frac 12$ means $1-e^{- \lambda m} = \frac 12$, or $m = \frac{\log 2}{\lambda}$.

EXERCISE 12: (Easy)

[Hint]

First notice that since $F(x)$ is a continuous function, hence for any given $a\in{\mathbb R}$ we must have $P(X=a) = 0$.

So $P(X\geq 0.01) = P(X > 0.01) = 1-F(0.01) = e^{-\lambda/100}$.

So we get $e^{-\lambda/100}=\frac 12$ or $\lambda=100\log 2$.

Again, $P(X\geq t) = e^{-\lambda t}$.

So we need $e^{-\lambda t} = 0.9$. Hence $t = -\frac{\log 0.9}{\lambda} = -\frac{\log 0.9}{100\log 2}$.

This is actually positive, since $\log 0.9 < 0$.
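The numbers in this hint can be checked in floating point. The sketch uses only the relation $P(X\geq t) = e^{-\lambda t}$ from the hint; the names `lam` and `survival` are ours.

```python
import math

# lambda is fixed by the condition P(X >= 0.01) = 1/2.
lam = 100 * math.log(2)

def survival(t):
    """P(X >= t) = exp(-lambda * t) for t >= 0."""
    return math.exp(-lam * t)

# The t for which P(X >= t) = 0.9.
t = -math.log(0.9) / lam

print(survival(0.01))
print(t, survival(t))
```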

EXERCISE 13: (Easy)

[Hint]

(a) $P\left(\frac 12\leq X \leq \frac 32\right) = P\left(X\leq \frac 32\right) - P\left(X < \frac 12\right) = P\left(X\leq \frac 32\right)-P\left(X\leq \frac 12\right),$ since $P\left(X=\frac 12\right) = 0$ as $F(x)$ is continuous at $x=\frac 12$.

So the answer is $F\left(\frac 32\right)-F\left(\frac 12\right) = \frac 34-\frac 16=\frac{7}{12}$.

(b) $P\left(\frac 12\leq X \leq 1\right) = P\left(X\leq 1\right) - P\left(X < \frac 12\right) = F(1)-P\left(X < \frac 12\right)$. Now proceed as before.

(c) $P\left(\frac 12\leq X < 1\right) = P\left(X< 1\right) - P\left(X < \frac 12\right) = \lim_{x\rightarrow 1-}F(x)-F\left(\frac 12\right) = \frac 13-\frac 16 = \frac 16.$

(d) $P\left(1\leq X\leq \frac 32\right) = P\left(X\leq\frac 32\right) - P(X < 1) = F\left(\frac 32\right)-\lim_{x\rightarrow1-} F(x) = \frac 34-\frac 13 = \frac{5}{12}$.

(e) $P(1 < X < 2) = P(X<2)-P(X\leq1) = \lim_{x\rightarrow2-}F(x)-F(1) = 1-\frac 12 = \frac 12$.

Actually here $F(x)$ is continuous at $x=2$.

[Thanks to Avigyan for correcting a silly mistake in the solution to part (e).]

EXERCISE 14: (Medium)

Here are the properties (i)-(iv) from section 5.1.1:

Here $F(\pm\infty) = \lim_{x\rightarrow\pm\infty} F(x)$ and $F(x+) = \lim_{t\rightarrow x+} F(t).$

[Hint]

(a) (i), (ii), (iii) No modification. (iv) $F(x-) = F(x)$ for all $x$.

(b) (i) No modification. (ii) nonincreasing. (iii) $F(-\infty) = 1$ and $F(\infty) = 0$. (iv) No modification.

(c) (i) No modification. (ii) nonincreasing. (iii) $F(-\infty) = 1$ and $F(\infty) = 0$. (iv) $F(x-) = F(x)$ for all $x$.