
Transformations


$\newcommand{\x}[1]{X_{(#1)}}$ $\newcommand{\v}[1]{{\mathbf #1}}$ We often work with functions of random variables: new random variables are created out of existing ones via functions. So a natural requirement is to be able to work out the distributions of the new random variables in terms of those of the existing ones. There are quite a few techniques for doing this.

Via CDF


If we are working with univariate distributions, then the most general (and often the simplest) technique is to use the CDF. This is particularly so if the transformation is a monotone one.

EXAMPLE 1:  If $X$ is uniformly distributed over $[0,2],$ then find a density for $X^2.$

SOLUTION: Let $Y = X^2.$ A density for $X$ is $f(x) = \frac 12$ if $0\leq x\leq 2$ (and 0 else). We shall pass to the CDF of $X:$ $$F(x) = \left\{\begin{array}{ll}0&\text{if }x < 0\\ \frac x2&\text{if }0\leq x < 2\\ 1&\text{otherwise.}\end{array}\right. $$ From this we shall compute the CDF of $Y.$ Clearly, $Y$ cannot take values outside $[0,4].$ So $G(y) = 0$ if $y<0$ and $G(y) = 1$ if $y\geq 4.$

Let $y\in[0,4).$

Then $$G(y) = P(Y\leq y) = P(X^2\leq y) = P(X\leq \sqrt y) = \frac 12\sqrt y.$$ Differentiating this, we arrive at the required density of $Y$ as $$g(y) = G'(y) = \left\{\begin{array}{ll}\frac{1}{4\sqrt y}&\text{if }0 < y\leq 4\\ 0&\text{otherwise.}\end{array}\right.$$ ■
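Such a derivation is easy to sanity-check by simulation. Here is a minimal sketch (the sample size and bin count are arbitrary choices of mine) comparing a histogram of simulated values of $Y=X^2$ with the density $g$ found above:

```python
# Monte Carlo check of Example 1: Y = X^2 with X ~ Uniform[0, 2].
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(0, 2, size=10**6) ** 2

# Compare an empirical histogram with g(y) = 1 / (4 sqrt(y)) on (0, 4].
hist, edges = np.histogram(y, bins=40, range=(0, 4), density=True)
mids = (edges[:-1] + edges[1:]) / 2
# Skip the first bin, where g blows up near 0 and bin-averaging is poor.
print(np.abs(hist[1:] - 1 / (4 * np.sqrt(mids[1:]))).max())  # small
```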

You see the advantage of monotonicity. Even though $x\mapsto x^2$ is not a monotone function over ${\mathbb R},$ it is so when restricted to $[0,2].$ The CDF technique can handle even some simple non-monotonic cases, as we show now.

EXAMPLE 2:  Let $X$ be uniform over $[-1,1].$ Find the density of $X^2.$

SOLUTION: Clearly, $Y=X^2$ cannot go outside $[0,1].$ So its CDF $G(y)$ must have $G(y)=0$ for $y<0$ and $G(y)=1$ for $y\geq 1.$

For $y\in[0,1)$ we have $$G(y) = P(X^2\leq y) = P(-\sqrt y \leq X \leq \sqrt y) = \sqrt y.$$ Differentiating we get the density $$g(y) = \left\{\begin{array}{ll}\frac{1}{2\sqrt y}&\text{if }0 < y\leq 1\\ 0&\text{otherwise.}\end{array}\right. $$ ■

Problem set 1

EXERCISE 1:  If $X$ has density $f(x)=\left\{\begin{array}{ll}2x&\text{if }x\in(0,1)\\ 0&\text{otherwise.}\end{array}\right.$, then find density of $X^2.$

EXERCISE 2: If $X$ has constant density over $(0,1)$ and zero outside it, then guess the density of $1-X$, and prove your guess.

EXERCISE 3: If $X$ has density $f(x)$, then the density of $-X$ is

  1. $f(x)$
  2. $-f(x)$
  3. $f(-x)$
  4. $-f(-x)$

EXERCISE 4: If $(X,Y)$ has joint density $f(x,y) = \frac{1}{2\pi} e^{-\frac 12(x^2+y^2)},$ then find the density of $R = \sqrt{X^2+Y^2}.$

EXERCISE 5: If $(X,Y)$ is uniformly distributed over the unit disc in ${\mathbb R}^2,$ and we write $(X,Y)$ as $(R,\Theta)$ in polar coordinates where $\Theta\in[0,2\pi),$ then find density of $R$ and also the density of $\Theta.$

Order statistics (part 1)


An interesting application of non-monotonic transformation that can be handled by CDF is about order statistics. If we have a random sample $X_1,...,X_n$, and sort them as $X_{(1)}\leq X_{(2)}\leq \cdots\leq X_{(n)},$ then $X_{(i)}$ is called the $i$-th order statistic.

We shall start with the simplest case $\x n$, the maximum. Let $X_i$'s be IID with common density $f(x)$ and CDF $F(x).$ Let us find the density of $\x n.$

We shall first compute the CDF $G(x)$ of $\x n.$ $$G(x) = P(\x n\leq x) = P(\forall i~~X_i\leq x) = P(X_1\leq x)\cdots P(X_n\leq x) =(F(x))^n.$$ Hence a density of $\x n$ is $g(x) = G'(x) = n(F(x))^{n-1}f(x).$
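A quick simulation makes the formula concrete. The sketch below uses Uniform(0,1) samples, so $F(x)=x$ and $f(x)=1,$ and the formula predicts the density $nx^{n-1}$ for the maximum:

```python
# Monte Carlo check of the density of the maximum of n IID Uniform(0,1).
import numpy as np

rng = np.random.default_rng(0)
n = 5
xmax = rng.uniform(0, 1, size=(10**6, n)).max(axis=1)

# Compare a histogram of the maxima with n * x**(n-1).
hist, edges = np.histogram(xmax, bins=40, range=(0, 1), density=True)
mids = (edges[:-1] + edges[1:]) / 2
print(np.abs(hist - n * mids**(n - 1)).max())  # near 0, up to MC noise
```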

Problem set 2

EXERCISE 6: Let $X_1,...,X_n$ be IID with density $f(x).$ Find a density for $\x 1$, the minimum.

EXERCISE 7: If $X_1,...,X_5$ are IID with density $f(x)=\left\{\begin{array}{ll}2 e^{-2x}&\text{if }x>0\\ 0&\text{otherwise.}\end{array}\right.$, find density of $\x 5.$

EXERCISE 8: If $X_1,...,X_n$ are IID with density $f(x)=\left\{\begin{array}{ll}\frac 1\theta&\text{if }x\in(0,\theta)\\ 0&\text{otherwise.}\end{array}\right.$, find a constant $c$ such that $E(c\x n)=\theta.$

Order statistics (part 2)


If $X_1,...,X_n$ are IID with density $f(x)$ then there is a particularly simple formula for the joint density of $(\x 1,...,\x n).$ Before giving the general form, let us warm up with a simple example.

EXAMPLE 3:  Let $X_1, X_2$ be IID with density $f(x)$ and CDF $F(x).$ Show that the joint CDF of $(\x 1, \x 2)$ (call it $G(x,y)$, say) is free of $x$ when $x>y.$ So what will be $\frac{\partial^2}{\partial x\partial y} G(x,y)$ in this case?

SOLUTION: To keep things concrete, let's first work with $x=3$ and $y=2.$ Then $G(3,2) = P(\x 1\leq 3,\x 2\leq 2) = P(\x 2\leq 2),$ since $\{\x 1\leq 3,\, \x 2\leq 2\} = \{\x 2\leq 2\}. $

More generally, if $x > y,$ then $G(x,y)$ is going to be free of $x.$

So we have $\frac{\partial^2}{\partial x\partial y} G(x,y) = 0$ if $x > y.$ ■

If we work with $X_1,...,X_n$ instead of just $X_1,X_2,$ then the same argument would show that $\frac{\partial^n}{\partial x_1\cdots\partial x_n} G(x_1,...,x_n) = 0$ unless $x_1\leq x_2\leq\cdots\leq x_n.$

EXAMPLE 4: Same set up as in the last example. Now find $G(x,y)$ for $x < y.$ Again find $\frac{\partial^2}{\partial x\partial y} G(x,y).$

SOLUTION: Let us start with $x=2$ and $y=3.$ Then $G(2,3) = P(\x 1\leq 2,\, \x 2\leq 3).$

Note that $\{\x 1\leq 2,\, \x 2\leq 3\} = \{X_1\leq 2,\, X_2\leq 3\}\cup\{X_1\leq 3,\, X_2\leq 2\}.$ So, by the inclusion-exclusion principle, $G(2,3) = P(X_1\leq 2,\, X_2\leq 3)+P(X_1\leq 3,\, X_2\leq 2)-P(X_1\leq 2,\, X_2\leq 2)=F(2)F(3)+F(3)F(2)-F(2)^2.$

In general, for $x < y$ we have $G(x,y) = 2F(x)F(y)-F(x)^2.$

The last term will be killed when we differentiate wrt $y.$ The first term will produce $2f(x)f(y).$ So $\frac{\partial^2}{\partial x\partial y} G(x,y)= 2f(x)f(y).$ ■

Again, if we work with $X_1,...,X_n$ instead of just $X_1,X_2,$ then the same argument would show that $\frac{\partial^n}{\partial x_1\cdots\partial x_n} G(x_1,...,x_n) = n!f(x_1)f(x_2)\cdots f(x_n)$ if $x_1\leq x_2\leq\cdots\leq x_n.$

Combining our findings from the two examples, we get the following theorem.

Theorem If $X_1,...,X_n$ are IID with density $f(x),$ then the joint density of the order statistics $(\x 1,...,\x n)$ is $$g(x_1,...,x_n)=\left\{\begin{array}{ll}n!f(x_1)\cdots f(x_n)&\text{if }x_1 < \cdots < x_n\\ 0&\text{otherwise.}\end{array}\right.$$
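For $n=2$ the theorem says that $(\x 1, \x 2)$ has density $2f(x_1)f(x_2)$ on $\{x_1 < x_2\}$ and 0 elsewhere. A minimal simulation sketch of this special case with Uniform(0,1) samples:

```python
# For n = 2 IID Uniform(0,1), the joint density of (X_(1), X_(2))
# should be 2 on {x < y} and 0 on {x > y}.
import numpy as np

rng = np.random.default_rng(0)
s = np.sort(rng.uniform(0, 1, size=(10**6, 2)), axis=1)

h, _, _ = np.histogram2d(s[:, 0], s[:, 1], bins=10,
                         range=[[0, 1], [0, 1]], density=True)
# h[i, j] is roughly 2 when bin i (for X_(1)) lies below bin j
# (for X_(2)), roughly 0 when above, and about 1 on the diagonal
# cells, which straddle the line {x = y}.
print(h.round(1))
```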

Problem set 3

EXERCISE 9: If $X_1,X_2,X_3$ are IID with density $f(x)=\left\{\begin{array}{ll}1&\text{if }x\in(0,1)\\ 0&\text{otherwise.}\end{array}\right.$, find density of $\x 2.$

EXERCISE 10: If $X_1,...,X_n$ are IID with common CDF $F(x),$ then show that the CDF of $\x k$ is $$P(\x k\leq x) = \sum_{j=k}^n \binom n j F(x)^j(1-F(x))^{n-j}.$$

EXERCISE 11: If $X_1,...,X_n$ are IID with common density $f(x),$ then find density of $\x k.$

Order statistics (part 3)


Here we shall discuss an interesting heuristic.

EXAMPLE 5: If $X_1,...,X_{20}$ are IID with density $f(x),$ then write down the joint density of $(\x 3, \x 4, \x 7, \x {15}).$

SOLUTION: We can of course derive the required joint density by starting with the joint density of $(\x 1,...,\x {20})$ and then integrating out all $\x i$ for $i\not\in\{3,4,7,15\}.$ But there is a simple heuristic alternative worth learning.

Let the required joint density be $g(a,b,c,d).$ Think of it like this: if, for some very small $\epsilon > 0$ we write $x\approx y$ to mean $x\in\left(y-\frac \epsilon2,y+\frac \epsilon2\right),$ then $$P(\x 3\approx a,\, \x 4\approx b,\, \x 7\approx c,\, \x {15}\approx d)\approx g(a,b,c,d) \epsilon^4.$$ The heuristic technique tries to find the probability directly using combinatorics (and a pinch of salt). Consider the number line below, and think of how the $\x i$'s are scattered along it.
From this we can see how many $X_i$'s need to be where:
Partition $\{1,2,...,20\}$ into blocks of sizes $2+1+1+2+1+7+1+5$ (the two $X_i$'s below $a,$ the one at $a,$ the one at $b,$ the two between $b$ and $c,$ the one at $c,$ the seven between $c$ and $d,$ the one at $d,$ and the five above $d$). This may be done in $\frac{20!}{2!\,2!\,7!\,5!}$ ways. Now we write down the probability for each block. Each singleton block has approximate probability "density $\times\,\epsilon$".
Multiplying everything we get the final answer $$g(a,b,c,d) = \left\{\begin{array}{ll}\frac{20!}{2!2!7!5!} f(a)f(b)f(c)f(d)(F(a))^2(F(c)-F(b))^2 (F(d)-F(c))^7 (1-F(d))^5&\text{if }a < b < c < d\\ 0&\text{otherwise.}\end{array}\right.$$ ■
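The heuristic is easy to test by simulation in a small case. For $n=5$ IID Uniform(0,1) variables it gives the joint density of $(\x 2, \x 4)$ as $\frac{5!}{1!1!1!1!1!}\,a(b-a)(1-b) = 120\,a(b-a)(1-b)$ for $0<a<b<1.$ A minimal Monte Carlo sketch (sample size and grid are arbitrary choices):

```python
# Monte Carlo check of the heuristic: n = 5 IID Uniform(0,1),
# joint density of (X_(2), X_(4)) should be 120 a (b - a) (1 - b).
import numpy as np

rng = np.random.default_rng(0)
s = np.sort(rng.uniform(0, 1, size=(10**6, 5)), axis=1)
a, b = s[:, 1], s[:, 3]            # X_(2) and X_(4)

h, ae, be = np.histogram2d(a, b, bins=10, range=[[0, 1], [0, 1]],
                           density=True)
# Compare one cell, e.g. the one centred near (a, b) = (0.25, 0.65):
print(h[2, 6], 120 * 0.25 * (0.65 - 0.25) * (1 - 0.65))  # both ≈ 4.2
```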

Problem set 4

EXERCISE 12: Check that this heuristic method gives the same density for $\x 1$ and $\x n$ that we obtained earlier.

EXERCISE 13: Write down the joint density of $(\x 1, \x n)$ using this heuristic method.

EXERCISE 14: Let $X_1,...,X_{15}$ be a random sample from a distribution with density $f(x).$ Write down a density for the sample median, i.e., the central value among the $X_i$'s, which is $\x 8$ in this case.

Jacobian formula (1 dim)


To understand the Jacobian method, it will help to look at the univariate CDF method. Let $f(x)$ be a density of $X$ and let $Y=h(X),$ where $h(\cdot)$ is an increasing bijection with differentiable $h ^{-1}(y).$

Then the CDF of $Y$ is $G(y) = P(Y\leq y) = P(h(X)\leq y) = P(X\leq h ^{-1}(y)) = F(h ^{-1}(y)),$ where $F(\cdot)$ is the CDF of $X.$

So $Y$ has density given by $$g(y) = G'(y) = \frac{d}{dy}F(h ^{-1}(y)) = f(h ^{-1}(y))\frac{d}{dy}h ^{-1}(y).$$ So far we are assuming that $h(\cdot)$ is an increasing function. A very similar argument works for a decreasing function as well. In general for any bijection $h(\cdot),$ we have $$g(y) = f(h ^{-1}(y)) \left| \frac{d}{dy}h ^{-1}(y) \right|.\hspace{1in} \mbox{(*)}$$
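Formula (*) translates directly into code. Below is a minimal sketch (the helper name `transformed_density` is mine, not from any library), applied to Example 1, where $h(x)=x^2$ on $[0,2]$ has inverse $h^{-1}(y)=\sqrt y$:

```python
# (*) as code: density of Y = h(X) from a density f of X, the inverse
# map hinv, and the derivative dhinv of the inverse.
import numpy as np

def transformed_density(f, hinv, dhinv):
    return lambda y: f(hinv(y)) * np.abs(dhinv(y))

# Example 1 again: X ~ Uniform[0, 2], Y = X^2.
f = lambda x: np.where((x >= 0) & (x <= 2), 0.5, 0.0)
g = transformed_density(f, np.sqrt, lambda y: 1 / (2 * np.sqrt(y)))
print(g(np.array([1.0, 2.0])))  # [0.25, 0.1767...] = 1/(4 sqrt(y))
```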

Problem set 5

EXERCISE 15: If $X$ has density $f(x)$, then find density of $aX+b$ for $a\neq 0$ and $b\in{\mathbb R}.$

EXERCISE 16: If $X$ has density $f(x) =\left\{\begin{array}{ll}c\, x e^{-x}&\text{if }x>0\\ 0&\text{otherwise.}\end{array}\right. $, then find density of $Y = \sqrt{X}.$

EXERCISE 17: Let $X$ have density $f(x) = \left\{\begin{array}{ll}2 e^{-2x}&\text{if }x>0\\ 0&\text{otherwise.}\end{array}\right.$ Find density of $Y=X^2$ using (*).

EXERCISE 18: Let $X$ have density $f(x).$ Find density of $Y=a X+b$ using (*) if $a\neq 0.$

EXERCISE 19: Let $X$ have uniform distribution over $(-1,1).$ Find density of $Y=\sin X$ using (*).

Jacobian formula (intuition)


Let's first massage (*) into a more elegant form. We know that $h(h ^{-1} (y))\equiv y.$

Differentiating this wrt $y$ we have $h'(h ^{-1} (y))\frac{d}{dy} h ^{-1}(y) \equiv 1,$ i.e., $$\frac{d}{dy} h ^{-1}(y) = \frac{1}{h'(h ^{-1}(y))}.$$ So we get $$g(y) = \frac{f(h ^{-1}(y))}{ |h'(h ^{-1}(y))| }.$$ If we write $x = h ^{-1}(y),$ this looks less complicated: $$g(y) = \frac{f(x)}{ |h'(x)| }.$$ So we may say that $g$ is just the same as $f,$ except that it is scaled by $h'.$

Suppose that $X$ has uniform distribution over $[0,1].$ Consider the density of $X.$ Imagine 10 equal length subintervals along $[0,1].$ Since the total area under the density is 1, the rectangle on each subinterval has area $\frac{1}{10}.$ You may say that each subinterval accounts for $\frac{1}{10}$ mass.
All the rectangles are identical
When you compute $Y=X^2,$ the intervals close to 0 get squeezed further down to 0, while those closer to 1 are stretched.
Rectangles are squeezed and stretched
But still each rectangle has to account for $\frac{1}{10}$ mass. So the squeezed rectangles have to compensate by growing taller, while the stretched ones compensate by getting shorter.
All rectangles now again have area $\frac{1}{10}.$
This leads to $Y$ having higher density near 0 than near 1. Thus, the non-uniformity of the density is controlled by how much the transforming function squeezes, i.e., by its derivative: the smaller the derivative, the higher the density.

Problem set 6

EXERCISE 20:  If $X$ has uniform distribution over $(2,4)$ then roughly sketch the density of $Y = \frac 1X.$ Don't apply the Jacobian formula algebraically. Think in terms of which part gets squeezed/expanded.

EXERCISE 21: Suppose that $X$ is uniform over $(-1,1)$ and $Y=X^2.$ (not a bijection!). Guess the form of the density of $Y.$ Do you see why we needed the transform to be bijective in our intuition?

[Hint]

We were assuming that density of $Y$ at any given point was controlled by the density of $X$ at only one point. But in this example, the density of $Y$ at, say, $y=\frac 14$ is governed by the density of $X$ at $x=\frac 12$ as well as $x=-\frac 12.$

Jacobian matrix


In (*) above we had $$g(y) = f(h ^{-1}(y)) \left| \frac{d}{dy}h ^{-1}(y) \right|,$$ where $h$ was assumed to be a bijection with differentiable $h ^{-1}.$

In order to generalise this to the multivariate set up, we need to work with a bijection $h:{\mathbb R}^n\rightarrow{\mathbb R}^n.$ We need two ingredients: a notion of differentiation for such functions, and the multivariate analogue of the formula itself.

Multivariate differentiation

$f:{\mathbb R}\rightarrow{\mathbb R}$ is called differentiable at some $a$ if $$\lim_{x\rightarrow a} \frac{f(x)-f(a)}{x-a}\mbox{ exists finitely.}$$ If this limit is called $m$, then this can be recast in the geometrically more appealing way as $$\exists m\in{\mathbb R}~~ \lim_{x\rightarrow a}\frac{f(x)-\{f(a)+m\cdot(x-a)\}}{x-a} = 0.$$ This is geometrically more appealing because you can think of this as $f(x)-f(a)\approx m\cdot(x-a),$ i.e., near $a$ the graph of $f$ looks like the line passing through $(a,f(a))$ with slope $m.$

This immediately generalises to $f:{\mathbb R}^n\rightarrow{\mathbb R}^m$ as follows.
Definition: Multivariate differentiation Call $f:{\mathbb R}^n\rightarrow{\mathbb R}^m$ differentiable at $\v a\in{\mathbb R}^n$ if $$\exists M_{m\times n} ~~ \lim_{\v x\rightarrow \v a}\frac{f(\v x)-\{f(\v a)+M\cdot(\v x-\v a)\}}{\|\v x-\v a\|} = \v 0.$$
Such a matrix $M$ may depend on $\v a,$ and will be unique, and its $(i,j)$-th entry will be given by $$m_{ij} = \frac{\partial f_i}{\partial x_j}.$$ Here $f_i$ is the $i$-th component of $f.$

Let us digest this using an example.

EXAMPLE 6:  Let $f:{\mathbb R}^2\rightarrow{\mathbb R}^2$ be $f(x_1,x_2) = (\sin (x_1x_2),\, x_1-x_2^2).$ Find its Jacobian. Also find the determinant of the Jacobian.

SOLUTION: Note that $f$ consists of two functions $f_1,f_2:{\mathbb R}^2\rightarrow{\mathbb R}.$ These are its component functions, $f_1(x_1,x_2) = \sin(x_1x_2)$ and $f_2(x_1,x_2) = x_1-x_2^2.$

The Jacobian is a $2\times 2$ matrix with $(i,j)$-th entry $\frac{\partial f_i}{\partial x_j}.$ Note that each row is devoted to a single $f_i$ and each column to a single $x_j.$ In general, if we had $f:{\mathbb R}^n\rightarrow{\mathbb R}^m$ the matrix would have been $m\times n.$

In our case $$\begin{eqnarray*} \frac{\partial f_1}{\partial x_1} & = & x_2\cos (x_1x_2),\\ \frac{\partial f_1}{\partial x_2} & = & x_1\cos (x_1x_2),\\ \frac{\partial f_2}{\partial x_1} & = & 1,\\ \frac{\partial f_2}{\partial x_2} & = & -2x_2. \end{eqnarray*}$$ So the Jacobian is $$\left[\begin{array}{cc}x_2\cos (x_1x_2) & x_1\cos (x_1x_2)\\ 1 & -2x_2 \end{array}\right].$$ Its determinant is $$x_2\cos (x_1x_2)\times(-2x_2)- x_1\cos (x_1x_2)\times 1 = -(2x_2^2+x_1)\cos (x_1x_2).$$ ■
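If you want to check such computations mechanically, sympy can do them. A sketch (assuming sympy is installed):

```python
# The Jacobian of Example 6, computed symbolically with sympy.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = sp.Matrix([sp.sin(x1 * x2), x1 - x2**2])
J = f.jacobian([x1, x2])
print(J)                   # [[x2*cos(x1*x2), x1*cos(x1*x2)], [1, -2*x2]]
print(sp.factor(J.det()))  # -(x1 + 2*x2**2)*cos(x1*x2)
```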

If all this looks like unmotivated magic, you might benefit from the introductory video that I have made for Jacobians.

Problem set 7

EXERCISE 22: Compute the Jacobian matrix for $h(x,y) = (x+y,x-y).$

EXERCISE 23: What is the Jacobian matrix for the transform $h:{\mathbb R}^n\rightarrow{\mathbb R}^n$ where $h(\v x) = A\v x+\v b$ for some matrix $A_{n\times n}$ and vector $\v b_{n\times 1}$?

Multivariate Jacobian formula


We shall imitate our familiar univariate Jacobian formula $$g(y) = f(h ^{-1}(y)) \left| \frac{d}{dy}h ^{-1}(y) \right|$$ to get the following theorem.
Theorem Let $\v X$ be an ${\mathbb R}^n$-valued random vector with density $f(\v x).$ Let $h:{\mathbb R}^n\rightarrow{\mathbb R}^n$ be a bijection with differentiable inverse. Let $\v Y = h(\v X).$ Then $\v Y$ has density $$g(\v y) = f(h ^{-1}(\v y))\, J,$$ where $J$ is the absolute value of the determinant of the Jacobian of $h ^{-1}$ at $\v y.$
We shall not prove this theorem here. But the intuitive argument is just as in the univariate case.

EXAMPLE 7:  Let $\v X = (X_1,X_2)$ be uniformly distributed over $[1,2]\times[3,4].$ Let $Y_1 = X_1X_2$ and $Y_2 = X_1.$ Find the joint density of $\v Y = (Y_1,Y_2).$

SOLUTION: Let $S = [1,2]\times[3,4].$

Here the transform is $h(x_1,x_2) = (x_1x_2,x_1).$

Clearly, $h:S\rightarrow h(S)$ is a bijection, because given $y_1=x_1x_2$ and $y_2=x_1$ you can recover $(x_1,x_2)\in[1,2]\times[3,4]$ uniquely.

The inverse transform is $h ^{-1}(y_1,y_2) = \left(y_2,\frac{y_1}{y_2}\right).$ The Jacobian of this is $$\left[\begin{array}{cc}0 & 1\\ \frac{1}{y_2} & -\frac{y_1}{y_2^2} \end{array}\right],$$ which has absolute determinant $\frac{1}{y_2},$ since $y_2 > 0.$

So the required density will be $$g(y_1,y_2) = \left\{\begin{array}{ll}\frac{1}{y_2}&\text{if }\left(y_2,\frac{y_1}{y_2}\right)\in S\\ 0&\text{otherwise.}\end{array}\right.$$ Often we want to write it as $$g(y_1,y_2) = \left\{\begin{array}{ll}\frac{1}{y_2}&\text{if }(y_1,y_2)\in T\\ 0&\text{otherwise.}\end{array}\right.$$ for some suitably defined $T.$ This may be done as follows.

$\left(y_2,\frac{y_1}{y_2}\right)\in S$ means $$1\leq y_2 \leq 2 \mbox{ and } 3\leq \frac{y_1}{y_2}\leq 4.$$ Sketching these restrictions we get this region:
$T$ shown in red
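A simulation sketch of this example: sample $(X_1,X_2)$ uniformly on $S,$ form $(Y_1,Y_2),$ and compare the empirical density at a point of $T$ with $\frac{1}{y_2}$ (the test point and box size below are arbitrary choices):

```python
# Monte Carlo check of Example 7: (Y1, Y2) = (X1*X2, X1).
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(1, 2, size=10**6)
x2 = rng.uniform(3, 4, size=10**6)
y1, y2 = x1 * x2, x1

# Estimate the density near (y1, y2) = (5.25, 1.5), a point of T,
# as the fraction of samples in a small box divided by its area.
eps = 0.05
box = (np.abs(y1 - 5.25) < eps / 2) & (np.abs(y2 - 1.5) < eps / 2)
print(box.mean() / eps**2, 1 / 1.5)  # both ≈ 0.67
```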

Problem set 8

EXERCISE 24: If $(X,Y)$ has joint density $f(x,y)=\left\{\begin{array}{ll}x+y&\text{if }x,y\in[0,1]\\ 0&\text{otherwise.}\end{array}\right.$, then find the joint density of $(X+Y, X-Y).$

EXERCISE 25: If $(X,Y)$ is uniformly distributed over $[0,1]\times[0,2]$, then find the joint density of $(X^2,X+Y).$

EXERCISE 26: If $(X,Y)$ is uniformly distributed over the red rectangle below, then find non-zero constants $a,b,c,d$ such that $U=aX+bY$ and $V=cX+dY$ are independent.

Sum


Suppose that we are given the joint density of $(X,Y)$ and we want to find the density of $X+Y.$ Can we use Jacobians for this? Yes, but not directly. The Jacobian technique works directly only for transformations from ${\mathbb R}^n$ to ${\mathbb R}^n,$ and the transformation must be bijective with nonsingular Jacobian. Unfortunately, $(X,Y)\mapsto X+Y$ satisfies neither condition: it maps ${\mathbb R}^2$ to ${\mathbb R},$ and it is not bijective. But we can remedy this by considering the transformation $h(X,Y) = (X,X+Y).$ This is a bijective nonsingular linear transformation, so the Jacobian technique applies.

EXAMPLE 8:  Let $(X_1,X_2)$ have joint density $f(x_1,x_2).$ Find density of $X_1+X_2.$

SOLUTION: Consider $(Y_1,Y_2) = (X_1,X_1+X_2).$

Here the transform is $h(x_1,x_2) = (x_1,x_1+x_2).$ This is a bijection from ${\mathbb R}^2$ to ${\mathbb R}^2$ with inverse $h ^{-1}(y_1,y_2) = (y_1,y_2-y_1).$

The Jacobian matrix is $\left[\begin{array}{cc}1 & 0\\-1 & 1 \end{array}\right]$, with absolute determinant 1. So the required density is $$g(y_1,y_2) = f(y_1,y_2-y_1). $$ Now we need to find the marginal density of $Y_2.$ This is $$g_2(y_2) = \int_{-\infty}^\infty g(y_1,y_2)\, dy_1 = \int_{-\infty}^\infty f(y_1,y_2-y_1)\, dy_1.$$ ■

The result is quite useful, and worth recording as a theorem:
Theorem If $(X,Y)$ has joint density $f(x,y)$ for $(x,y)\in{\mathbb R}^2$, then the density of $X+Y$ is $$f_{X+Y}(u) = \int_{-\infty}^\infty f(x,u-x)\, dx.$$
A special case is when the two random variables are independent:
Theorem If $X,Y$ are independent random variables with densities $f(x)$ and $g(y),$ respectively, then the density of $X+Y$ is $$f_{X+Y}(u) = \int_{-\infty}^\infty f(x)g(u-x)\, dx.$$
This gives us a way to manufacture a new density by combining two existing densities. This is called convolution.

Definition: Convolution If $f,g$ are two densities, then their convolution is the density $f*g$ given by $$(f*g)(u) = \int_{-\infty}^\infty f(x)g(u-x)\, dx.$$

EXAMPLE 9:  If $X,Y$ are independent uniform over $(0,1),$ then find density of $X+Y.$

SOLUTION: The answer is $f*f,$ where $f(x) =\left\{\begin{array}{ll}1&\text{if }0 < x < 1\\ 0&\text{otherwise.}\end{array}\right. $

So $$(f*f)(u) = \int_{-\infty}^\infty f(x)f(u-x)\, dx = \int_{\max\{0,u-1\}}^{\min\{1,u\}}dx=\left\{\begin{array}{ll}u&\text{if }0 <u < 1\\ 2-u&\text{if }1 <u < 2\\ 0&\text{otherwise.}\end{array}\right.$$ To see this, notice that for $f(x)$ to be nonzero we need $0 < x < 1,$ while for $f(u-x)$ to be nonzero we need $0 < u-x < 1,$ i.e., $u-1 < x < u.$ So for $f(x)f(u-x)$ to be nonzero we need $\max\{0,u-1\} < x < \min\{1,u\},$ and on that range $f(x)f(u-x)=1.$ ■
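A simulation sketch confirming the triangular shape of this convolution:

```python
# Monte Carlo check of Example 9: X + Y for X, Y IID Uniform(0, 1).
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(0, 1, size=(10**6, 2)).sum(axis=1)

hist, edges = np.histogram(u, bins=40, range=(0, 2), density=True)
mids = (edges[:-1] + edges[1:]) / 2
tri = np.where(mids < 1, mids, 2 - mids)   # the density found above
print(np.abs(hist - tri).max())            # near 0, up to MC noise
```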

Problem set 9

EXERCISE 27: Show that $f*g = g*f.$

EXERCISE 28: Does there exist a density $i(x)$ such that for all densities $f$ we have $i*f = f?$

[Hint]

Think in terms of random variables.

EXERCISE 29: If $X,Y$ are IID with common density $\lambda e^{-\lambda x}$ ($x>0$), then find the density of $X+Y.$

EXERCISE 30: If $X,Y$ are independently distributed uniformly over $(0,1),$ then sketch the density of $X+Y.$

EXERCISE 31: If $X,Y$ are independent with common density $f(x)$, what will density of $X-Y$ be?

Quotient


Sometimes we need to work with the quotient of two random variables. The following theorem helps when the variables are independent and the denominator is positive.

Theorem Let $X,Y$ be independent random variables with densities $f_X(x)$ and $f_Y(y),$ respectively. Let $Y$ be always positive. Then $Z=X/Y$ has density $$f_Z(z) = \int_0^\infty uf_X(zu)f_Y(u)\, du.$$

Proof: Use the Jacobian technique for the transform $(X,Y)\mapsto \left(\frac XY,Y\right)\equiv(Z,U).$ [QED]
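The formula is again easy to check numerically. The sketch below takes $X,Y$ IID standard exponential, evaluates the integral with a plain Riemann sum, and compares it with a Monte Carlo estimate of the density of $X/Y$ at one (arbitrarily chosen) point:

```python
# Numerical check of the quotient formula for X, Y IID exponential(1).
import numpy as np

z = 1.5
u = np.linspace(0, 50, 200001)
du = u[1] - u[0]
fX = lambda x: np.exp(-x) * (x > 0)
formula = np.sum(u * fX(z * u) * fX(u)) * du   # f_Z(z) from the theorem

rng = np.random.default_rng(0)
x, y = rng.exponential(1, 10**6), rng.exponential(1, 10**6)
eps = 0.02
mc = (np.abs(x / y - z) < eps / 2).mean() / eps
print(formula, mc)   # both ≈ 0.16
```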

Problem set 10

EXERCISE 32: Prove the above theorem using Jacobian.

EXERCISE 33: If $X,Y$ are independent and uniformly distributed over $[1,2],$ then find density of $X/Y.$

EXERCISE 34: If $X,Y$ are IID with common density $f(x)=\left\{\begin{array}{ll}e^{-x}&\text{if }x>0\\ 0&\text{otherwise.}\end{array}\right.$, then find density of $X/Y.$

EXERCISE 35: A point $Q$ is chosen at random from the unit square. Let $Q$ be $(R,\Theta)$ in polar coordinates. Find density of $\tan\Theta.$

Characteristic function (CF)


We have seen various functions connected with a distribution: PMF, PDF, CDF and MGF. In case you have forgotten the concept of a moment generating function (MGF) that we briefly touched upon last semester, here is the definition again:
Definition: Moment generating function (MGF) The MGF of a random variable $X$ is defined as the function $M_X(t) = E(e^{Xt})$ for whatever $t\in{\mathbb R}$ makes the expectation finite. (Since $e^{Xt}$ is a positive random variable, its expectation is always defined, though possibly infinite.)
Out of these, only the CDF is guaranteed to exist for every random variable, and it also uniquely determines the distribution (i.e., if the CDFs of two random variables match, then their distributions must also match). Unfortunately, the CDF does not "play well" with convolution: if $X,Y$ are independent then there is no nice formula expressing the CDF of $X+Y$ in terms of those of $X$ and $Y.$ There is, however, one function that combines all the good properties: it exists finitely for all random variables, it uniquely determines the distribution, and it "plays well" with convolution. Its definition is given below.
Definition: Characteristic function (CF) The characteristic function (CF) of a random variable $X$ is the function $\xi_X:{\mathbb R}\rightarrow{\mathbb C}$ given by $\xi_X(t) = E(e^{iXt})$ for $t\in{\mathbb R}.$
You may be scared by the unexpected appearance of complex numbers inside the expectation! Let's learn about complex random variables.

Complex random variables

Just remember that a complex random variable $Z$ means $Z = X+i Y,$ where $X,Y$ are (real) random variables. We define $E(Z)=E(X)+iE(Y)$ (and say $E(Z)$ does not exist if at least one of $E(X), E(Y)$ does not).

Since we have $e^{iXt} = \cos (Xt)+i\sin(Xt)$, the characteristic function is just $\xi_X(t) = E(\cos(Xt))+i E(\sin(Xt)).$ Since $\cos$ and $\sin$ are both bounded, finite existence of the expectation is not a problem.

Complex calculus

For $f:{\mathbb R}\rightarrow{\mathbb C}$ write $f(x) = g(x) + i h(x) $ for $g,h:{\mathbb R}\rightarrow{\mathbb R}.$ Then differentiation and integration are defined in the obvious way: $$\begin{eqnarray*} f'(x) & = & g'(x) + i h'(x),\\ \int f(x)\, dx & = & \int g(x)\, dx + i\int h(x)\, dx. \end{eqnarray*}$$ From this it immediately follows (check!) that $\frac{d}{dx}e^{ix} = i e^{ix}$ and $\int e^{ix}\, dx = \frac 1ie^{ix}+$ an arbitrary constant.

An example

EXAMPLE 10:  Find the CF of $X$ having density $f(x) = \left\{\begin{array}{ll} 3 e^{-3x}&\text{if }x>0\\ 0&\text{otherwise.}\end{array}\right. $

SOLUTION: $$E(e^{iXt}) = 3\int_0^ \infty e^{ixt}e^{-3x}\, dx = 3\int_0^\infty e^{(it-3)x}\, dx = \frac{3}{3-it}$$ for $t\in{\mathbb R}.$ ■

Clearly, for any random variable $X$ we have $\xi_X(0) = 1.$
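Since a CF is just an expectation, it can be estimated by averaging $e^{ix_jt}$ over simulated draws $x_j.$ A minimal sketch checking Example 10 (and the fact that $\xi_X(0)=1$) at a few values of $t$:

```python
# Monte Carlo check of the CF of the exponential density 3 e^{-3x}.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1/3, size=10**6)   # mean 1/3, i.e. rate 3

for t in [0.0, 1.0, 2.5]:
    print(np.exp(1j * x * t).mean(), 3 / (3 - 1j * t))  # ≈ equal
```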

Problem set 11

EXERCISE 36: Find CF for the degenerate distribution at $5.$

EXERCISE 37: Find CF for the uniform distribution over $(-1,1).$

Properties of CF


The following two theorems are what make CF useful.

Theorem If $X,Y$ are two random variables such that $\xi_X(t) \equiv \xi_Y(t)$, then $X$ and $Y$ must have the same distribution.

Proof: Will be done next semester. [QED]

Theorem If $X,Y$ are independent random variables, then $\xi_{X+Y}(t) = \xi_X(t)\xi_Y(t)$ for $t\in{\mathbb R}.$

Proof: Since $X,Y$ are independent, hence so are their functions $e^{iXt}$ and $e^{iYt}.$

Since their expectations are finite, so $E(e^{iXt}\times e^{iYt}) = E(e^{iXt})\times E(e^{iYt}).$ Hence the result.

[QED]

If we know the CFs of some standard distributions, then these two results often help us check whether the convolution of two distributions in our list again belongs to the list. Here is an example.

EXAMPLE 11:  Suppose that you are told that, for $a>0$, the distribution with density $f_a(x) = \left\{\begin{array}{ll}c x^{a-1}e^{-x}&\text{if }x>0\\ 0&\text{otherwise.}\end{array}\right.$ has CF $\xi_a(t) = (1-it)^{-a}$ for $t\in{\mathbb R}.$

Show that for $a,b>0$ we have $f_a* f_b = f_{a+b}.$

SOLUTION: You can of course show this directly using the definition of convolution, but that would require computing an integral. It is trivial using CFs: $\xi_a(t)\xi_b(t) = (1-it)^{-a} (1-it)^{-b} = (1-it)^{-(a+b)}$ for $t \in{\mathbb R}.$

Since CF uniquely determines the distribution, we get the result. ■
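The densities $f_a$ here are the Gamma densities (the normalising constant is $c = 1/\Gamma(a)$), so the claim can also be checked by simulation. A sketch using numpy's gamma sampler:

```python
# Check f_a * f_b = f_{a+b} by simulating Gamma(a,1) + Gamma(b,1).
import numpy as np
from math import gamma

rng = np.random.default_rng(0)
a, b = 1.5, 2.0
s = rng.gamma(a, 1, size=10**6) + rng.gamma(b, 1, size=10**6)

hist, edges = np.histogram(s, bins=60, range=(0, 12), density=True)
mids = (edges[:-1] + edges[1:]) / 2
fab = mids**(a + b - 1) * np.exp(-mids) / gamma(a + b)   # f_{a+b}
print(np.abs(hist - fab).max())   # near 0, up to Monte Carlo noise
```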

Problem set 12

EXERCISE 38: Let $X$ have CF $\xi_X(t).$ Let $Y = aX+b.$ Find $\xi_Y(t),$ the CF of $Y.$

Miscellaneous problems

EXERCISE 39: [hpstrans1.png]

EXERCISE 40: [hpstrans2.png]

EXERCISE 41: [hpstrans4.png]

EXERCISE 42: [hpstrans7.png]

Here is Exercise 8:

EXERCISE 43: [hpstrans8.png]

EXERCISE 44: [hpstrans9.png]

EXERCISE 45: [hpstrans12.png]

EXERCISE 46: [hpstrans14.png]

Here is Exercise 19:

EXERCISE 47: [hpstrans21.png]

EXERCISE 48: [hpstrans23.png]

Here is Exercise 38:

EXERCISE 49: [hpstrans24.png]

EXERCISE 50: [hpspdf7.png]

EXERCISE 51: [hpspdf11.png]

Here is Exercise 4:

EXERCISE 52: [hpspdf15.png]

EXERCISE 53: [hpspdf16.png]

EXERCISE 54: [hpspdf17.png]

EXERCISE 55: [hpspdf18.png]

EXERCISE 56: [hpspdf19.png]

EXERCISE 57: [hpspdf20.png]

EXERCISE 58: [hpspdf24.png]

EXERCISE 59: [hpspdf40.png]

EXERCISE 60: [hpspdf45.png]

EXERCISE 61: [rosspdf19.png]

EXERCISE 62: [rosspdf23.png]

Here is Theoretical Exercise 2:

EXERCISE 63: [rosspdf24.png]

EXERCISE 64: [rosspdf35.png]