Wang Haihua
🚅 🚋😜 🚑 🚔
Here we go through the methods for distributions with more than two variables. Most of the results are analogous to those we have already seen.
Let $X_1, X_2,\ldots,X_n$ be discrete random variables; the joint PMF is defined as
$$ P_{X_{1}, X_{2}, \ldots, X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right)=P\left(X_{1}=x_{1}, X_{2}=x_{2}, \ldots, X_{n}=x_{n}\right) $$The marginal PMF of $X_1$ is
$$ P_{X_1}(x_1)=P(X_1=x_1) = \sum_{x_2\in R}\sum_{x_3\in R}\cdots\sum_{x_n\in R}P\left(X_{1}=x_{1}, X_{2}=x_{2}, \ldots, X_{n}=x_{n}\right) $$
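For a concrete picture, here is a minimal NumPy sketch (the joint PMF values are made up purely for illustration) showing that a marginal PMF is obtained by summing the joint PMF over the other variables:

```python
import numpy as np

# Hypothetical joint PMF of three binary random variables X1, X2, X3
# (the probabilities are made up for illustration; they sum to 1).
p_joint = np.array([[[0.05, 0.10],
                     [0.10, 0.15]],
                    [[0.05, 0.20],
                     [0.15, 0.20]]])   # shape (2, 2, 2); axes correspond to (X1, X2, X3)

# Marginal PMF of X1: sum the joint PMF over the X2 and X3 axes.
p_x1 = p_joint.sum(axis=(1, 2))
print(p_x1)        # approximately [0.4 0.6]
print(p_x1.sum())  # 1.0
```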
If $X_1, X_2,\ldots,X_n$ are jointly continuous random variables, the joint PDF satisfies
$$ P\left(\left(X_{1}, X_{2}, \cdots, X_{n}\right) \in A\right)=\int \cdots \int_{A} f_{X_{1} X_{2} \cdots X_{n}}\left(x_{1}, x_{2}, \cdots, x_{n}\right) d x_{1} d x_{2} \cdots d x_{n} $$The marginal PDF of $X_1$ is
$$ f_{X_{1}}\left(x_{1}\right)=\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_{1} X_{2} \cdots X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right) d x_{2} \cdots d x_{n} $$The joint CDF is
$$ F_{X_{1}, X_{2}, \ldots, X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right)=P\left(X_{1} \leq x_{1}, X_{2} \leq x_{2}, \ldots, X_{n} \leq x_{n}\right) $$Let $X, Y$ and $Z$ be three jointly continuous random variables with joint PDF
$$ f_{X Y Z}(x, y, z)=\left\{\begin{array}{ll} cx+2 y+3 z & 0 \leq x, y, z \leq 1 \\ 0 & \text { otherwise } \end{array}\right. $$Using the property that a joint PDF must integrate to 1, we get
\begin{align} \int_0^1\int_0^1\int_0^1(cx+2y+3z)\,dxdydz &= 1\\ \int_0^1\int_0^1\left[\frac{1}{2}cx^2+2xy+3xz\right]_0^1dydz&=1\\ \int_0^1\int_0^1\left(\frac{c}{2}+2y+3z\right)dydz&=1\\ \int_0^1\left[\frac{cy}{2}+y^2+3zy\right]^1_0dz&=1\\ \int_0^1\left(\frac{c}{2}+1+3z\right)dz&=1\\ \left[\frac{cz}{2}+z+\frac{3z^2}{2}\right]^1_0&=1\\ \frac{c}{2}+1+\frac{3}{2}&=1\\ c&=-3 \end{align}
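We can verify the value of $c$ symbolically; the following is a minimal sketch, assuming SymPy is available:

```python
import sympy as sp

x, y, z, c = sp.symbols('x y z c')
f = c*x + 2*y + 3*z

# Integrate the candidate PDF over the unit cube and require the result to equal 1.
total = sp.integrate(f, (x, 0, 1), (y, 0, 1), (z, 0, 1))
print(sp.solve(sp.Eq(total, 1), c))   # [-3]
```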
If $X_1, X_2,\ldots,X_n$ are independent discrete random variables,
$$ P_{X_{1}, X_{2}, \ldots, X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right)=P_{X_{1}}\left(x_{1}\right) P_{X_{2}}\left(x_{2}\right) \cdots P_{X_{n}}\left(x_{n}\right) $$And if $X_1, X_2,\ldots,X_n$ are independent continuous random variables, $$ f_{X_{1}, X_{2}, \ldots, X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right)=f_{X_{1}}\left(x_{1}\right) f_{X_{2}}\left(x_{2}\right) \cdots f_{X_{n}}\left(x_{n}\right) $$
Their joint CDF and the expectation of their product are
$$ F_{X_{1}, X_{2}, \ldots, X_{n}}\left(x_{1}, x_{2}, \ldots, x_{n}\right)=F_{X_{1}}\left(x_{1}\right) F_{X_{2}}\left(x_{2}\right) \cdots F_{X_{n}}\left(x_{n}\right) $$$$ E\left[X_{1} X_{2} \cdots X_{n}\right]=E\left[X_{1}\right] E\left[X_{2}\right] \cdots E\left[X_{n}\right] $$A common setting for the error terms of a regression model is that they are independent and identically distributed (i.i.d.). In that case,
\begin{aligned} E\left[X_{1} X_{2} \cdots X_{n}\right] &=E\left[X_{1}\right] E\left[X_{2}\right] \cdots E\left[X_{n}\right] \\ &=E\left[X_{1}\right] E\left[X_{1}\right] \cdots E\left[X_{1}\right] \\ &=E\left[X_{1}\right]^{n}=E\left[X_{2}\right]^{n}=\cdots = E\left[X_{n}\right]^{n} \end{aligned}Without a formal definition, we call $E[X]$, $E[X^2]$ and $E[X^k]$ the first moment, the second moment and the $k$th moment of $X$.
Another important concept is the central moment; for instance, $E[(X - E[X])^k]$ is called the $k$th central moment.
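For intuition, here is a minimal NumPy sketch (the exponential sample is an arbitrary choice, used only for illustration) estimating raw and central moments from data by averaging powers of the observations:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)   # an arbitrary sample, just for illustration

def raw_moment(sample, k):
    """Estimate the k-th moment E[X^k] by the sample average of X^k."""
    return np.mean(sample**k)

def central_moment(sample, k):
    """Estimate the k-th central moment E[(X - E[X])^k]."""
    return np.mean((sample - sample.mean())**k)

print(raw_moment(x, 1))      # ~2  (mean of an exponential with scale 2)
print(raw_moment(x, 2))      # ~8  (E[X^2] = 2 * scale^2)
print(central_moment(x, 2))  # ~4  (variance = scale^2)
```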
The moment generating function (MGF) is a convenient way of generating all the moments that we need.
If $X$ is a discrete random variable, the MGF is
$$ M_X(t) = E[e^{tX}] = \sum_{k\in R_X} e^{tk}P_X(k) $$where $R_X$ is the range of $X$. For a continuous random variable $X$, the MGF is
$$ M_X(t) = E[e^{tX}] = \int_{-\infty}^\infty e^{tx} f_X(x) dx $$Performing a Taylor expansion of $e^x$ gives
$$ e^{x}=1+x+\frac{x^{2}}{2 !}+\frac{x^{3}}{3 !}+\ldots=\sum_{k=0}^{\infty} \frac{x^{k}}{k !} $$Substituting $tX$ for $x$, we see that $e^{tX}$ is
$$ e^{tX} = 1+tX+\frac{(tX)^{2}}{2 !}+\frac{(tX)^{3}}{3 !}+\ldots=\sum_{k=0}^{\infty} \frac{(tX)^{k}}{k !} $$Taking expectations, we obtain the MGF
$$ E[e^{tX}] =1+tE[X]+\frac{t^2E[X^{2}]}{2 !}+\frac{t^3E[X^{3}]}{3 !}+\ldots= \sum_{k=0}^{\infty} \frac{t^kE[X^{k}]}{k !} $$Take the first order derivative and evaluate at $t=0$
$$ \frac{d E[e^{tX}]}{dt}\bigg|_{t=0}= \left(0 + E[X] + tE[X^2] + \frac{t^2E[X^3]}{2!}+\cdots\right)\bigg|_{t=0} = E[X] $$Take the second order derivative and evaluate at $t=0$
$$ \frac{d^2 E[e^{tX}]}{dt^2}\bigg|_{t=0} = \left(E[X^2]+ tE[X^3]+ \frac{t^2E[X^4]}{2!}+\cdots\right)\bigg|_{t=0}=E[X^2] $$Take the third order derivative and evaluate at $t=0$
$$ \frac{d^3 E[e^{tX}]}{dt^3}\bigg|_{t=0} = \left(E[X^3]+tE[X^4]+\cdots\right)\bigg|_{t=0} = E[X^3] $$Let's calculate the MGF of $X\sim Ber(p)$; we know $P_X(1) = p$ and $P_X(0)= 1-p$.
$$ E[e^{tX}] = e^{0t}P_X(0)+e^{1t}P_X(1) = 1-p+pe^t $$If we want to calculate the variance with the formula $E[X^2]-(E[X])^2$, we need to generate the first and second moments.
$$ E[X]=\frac{d E[e^{tX}]}{dt} \bigg|_{t=0}= p\\ E[X^2]=\frac{d^2 E[e^{tX}]}{dt^2} \bigg|_{t=0}= p $$So the variance is $E[X^2]-(E[X])^2 = p - p^2$.
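The same two derivatives can be obtained symbolically; the following is a minimal sketch, assuming SymPy is available:

```python
import sympy as sp

t, p = sp.symbols('t p')
mgf = 1 - p + p*sp.exp(t)              # MGF of X ~ Ber(p)

EX  = sp.diff(mgf, t, 1).subs(t, 0)    # first moment  E[X]
EX2 = sp.diff(mgf, t, 2).subs(t, 0)    # second moment E[X^2]

print(EX, EX2)                          # p p
print(sp.expand(EX2 - EX**2))           # -p**2 + p, i.e. p(1 - p)
```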
If $X_1, X_2, \ldots ,X_n \stackrel{i.i.d.}{\sim} Ber(p)$,
$$ \sum^n_{i=1}X_i\ \sim\ B(n,p) $$where $B(n,p)$ denotes binomial distribution.
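As a quick sanity check, here is a simulation sketch (assuming NumPy and SciPy are available, with arbitrarily chosen $n=20$ and $p=0.3$): the empirical distribution of the row sums of i.i.d. Bernoulli draws should match the $B(n,p)$ PMF.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, trials = 20, 0.3, 100_000        # arbitrary illustrative values

# Each row holds n i.i.d. Ber(p) draws; the row sums are samples of sum_i X_i.
sums = (rng.random((trials, n)) < p).sum(axis=1)

# Compare the empirical frequencies of the sums with the Binomial(n, p) PMF.
ks = np.arange(n + 1)
empirical = np.bincount(sums, minlength=n + 1) / trials
print(np.round(empirical[:6], 4))
print(np.round(stats.binom.pmf(ks, n, p)[:6], 4))
```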
The goal is to find the first moment $E(\sum^n_{i=1}X_i)$ and the second central moment $\text{Var}(\sum^n_{i=1}X_i)$.
Using the definition of the MGF,
\begin{align} E[e^{t\sum^n_{i=1}X_i}] &= E[e^{tX_1}e^{tX_2}\ldots e^{tX_n}] = E[e^{tX_1}] E[e^{tX_2}]\ldots E[e^{tX_n}] \tag{$X_i$'s are i.i.d.}\\ & = (1-p+pe^t)(1-p+pe^t)\ldots (1-p+pe^t)\\ & = (1-p+pe^t)^n \end{align}The first moment is
$$ E\bigg(\sum^n_{i=1}X_i\bigg) = \frac{d}{dt}(1-p+pe^t)^n\bigg|_{t=0}=n(1-p+pe^t)^{n-1}pe^t\Big|_{t=0}=np $$The second moment requires product rule of differentiation
\begin{align} E\bigg[\bigg(\sum^n_{i=1}X_i\bigg)^2 \bigg]=\frac{d}{dt}\Big[n(1-p+pe^t)^{n-1}pe^t\Big]\bigg|_{t=0} &= \Big[n(n-1)(1-p+pe^t)^{n-2}(pe^t)^2+n(1-p+pe^t)^{n-1}pe^t\Big]\bigg|_{t=0}\\ &=n(n-1)p^2+np=n^2p^2-np^2+np \end{align}Therefore
$$ \text{Var}\left(\sum^n_{i=1}X_i\right)= n^2p^2-np^2+np - n^2p^2 = np - np^2 $$
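The whole calculation can be checked symbolically; the following is a minimal sketch, assuming SymPy is available:

```python
import sympy as sp

t, p, n = sp.symbols('t p n')
mgf = (1 - p + p*sp.exp(t))**n          # MGF of the sum of n i.i.d. Ber(p) variables

ES  = sp.diff(mgf, t, 1).subs(t, 0)     # E[sum X_i]
ES2 = sp.diff(mgf, t, 2).subs(t, 0)     # E[(sum X_i)^2]

print(sp.simplify(ES))                  # n*p
print(sp.expand(ES2 - ES**2))           # equals n*p - n*p**2, i.e. np(1 - p)
```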