Probability Theory Cheat Sheet
Topics: Scalar-valued Random Variables, Vector-valued Random Variables, Gaussian Random Variables
1 Scalar-valued Random Variables

Consider two real-valued random variables (RV) X and Y with the individual probability distributions pX(x) and pY(y), and the joint distribution pX,Y(x, y). The probability distributions are probability mass functions (pmf) if the random variables take discrete values, and they are probability density functions (pdf) if the random variables are continuous. Some authors use f(·) instead of p(·), especially for continuous RVs.
In the following, the RVs are assumed to be continuous. (For discrete RVs, the integrals simply have to be replaced by sums.)
Marginal and conditional distributions:
\[
p_X(x) = \int p_{X,Y}(x, y) \, dy
\qquad
p_Y(y) = \int p_{X,Y}(x, y) \, dx
\]
\[
p_{X|Y}(x|y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}
\qquad
p_{Y|X}(y|x) = \frac{p_{X,Y}(x, y)}{p_X(x)}
\qquad \text{for } p_X(x) \neq 0 \text{ and } p_Y(y) \neq 0
\]
Equivalently, with the marginals expressed through the joint distribution:
\[
p_{X|Y}(x|y) = \frac{p_{X,Y}(x, y)}{\int p_{X,Y}(x', y) \, dx'}
\qquad
p_{Y|X}(y|x) = \frac{p_{X,Y}(x, y)}{\int p_{X,Y}(x, y') \, dy'}
\]
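For discrete RVs the same relations hold with sums; a minimal numerical check in Python (NumPy assumed; the joint pmf values are made up for illustration):

    import numpy as np

    # Made-up joint pmf of two discrete RVs: rows index x, columns index y
    p_xy = np.array([[0.10, 0.20],
                     [0.30, 0.40]])

    p_x = p_xy.sum(axis=1)    # marginal p_X(x): sum over y
    p_y = p_xy.sum(axis=0)    # marginal p_Y(y): sum over x

    # Conditional pmf p_{X|Y}(x|y): divide each column by its marginal p_Y(y)
    p_x_given_y = p_xy / p_y

    print(p_x, p_y)                 # [0.3 0.7] [0.4 0.6]
    print(p_x_given_y.sum(axis=0))  # each column sums to 1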
Expectations of functions of RVs:
\[
E[g_1(X)] = \int g_1(x) \, p_X(x) \, dx
\qquad
E[g_2(Y)] = \int g_2(y) \, p_Y(y) \, dy
\]
\[
E[g_3(X, Y)] = \iint g_3(x, y) \, p_{X,Y}(x, y) \, dx \, dy
\]
for any functions $g_1(\cdot)$, $g_2(\cdot)$, $g_3(\cdot, \cdot)$.
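Such expectations can be approximated by Monte Carlo sampling; a sketch, assuming for illustration that X is uniform on [0, 1):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.random(1_000_000)   # samples of X ~ Uniform[0, 1)

    # Monte Carlo estimate of E[g_1(X)] for the example g_1(x) = x^2;
    # the exact value is \int_0^1 x^2 dx = 1/3.
    print((x**2).mean())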
Mean values:
\[
\mu_X := E[X] = \int x \, p_X(x) \, dx
\qquad
\mu_Y := E[Y] = \int y \, p_Y(y) \, dy
\]
Variances:
\[
\sigma_X^2 \equiv \Sigma_{XX} := E\big[(X - \mu_X)^2\big] = \int (x - \mu_X)^2 \, p_X(x) \, dx
\]
\[
\sigma_Y^2 \equiv \Sigma_{YY} := E\big[(Y - \mu_Y)^2\big] = \int (y - \mu_Y)^2 \, p_Y(y) \, dy
\]
Remark: The variance measures the “width” of a distribution. A small variance means that most of the probability mass is concentrated around the mean value.
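A sketch of estimating mean and variance from samples (the parameters below are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(loc=2.0, scale=3.0, size=1_000_000)  # mu_X = 2, sigma_X = 3

    mu_hat = x.mean()                    # estimate of mu_X
    var_hat = ((x - mu_hat)**2).mean()   # estimate of Sigma_XX
    print(mu_hat, var_hat)               # approx. 2 and 9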
Covariance:
\[
\sigma_{XY} \equiv \Sigma_{XY} := E\big[(X - \mu_X)(Y - \mu_Y)\big]
= \iint (x - \mu_X)(y - \mu_Y) \, p_{X,Y}(x, y) \, dx \, dy
\]
Remark: The covariance measures how “related” two RVs are. Two independent RVs have covariance zero.
Correlation coefficient:
\[
\rho_{XY} := \frac{\sigma_{XY}}{\sigma_X \sigma_Y}
\]
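A sketch of estimating covariance and correlation from samples (the RVs are made correlated by construction):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=1_000_000)
    y = 0.5 * x + rng.normal(size=1_000_000)   # Y depends on X, so sigma_XY != 0

    sigma_xy = ((x - x.mean()) * (y - y.mean())).mean()   # covariance
    rho_xy = sigma_xy / (x.std() * y.std())               # correlation coefficient
    print(sigma_xy, rho_xy)   # approx. 0.5 and 0.5/sqrt(1.25) = 0.447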
Second moments:
\[
E[X^2] = \Sigma_{XX} + \mu_X^2
\qquad
E[Y^2] = \Sigma_{YY} + \mu_Y^2
\qquad
E[XY] = \Sigma_{XY} + \mu_X \mu_Y
\]
Proof of the last identity:
\[
\begin{aligned}
E[XY] &= E\big[((X - \mu_X) + \mu_X)\,((Y - \mu_Y) + \mu_Y)\big] \\
&= E\big[(X - \mu_X)(Y - \mu_Y)\big] + E\big[(X - \mu_X)\big]\,\mu_Y
 + \mu_X\,E\big[(Y - \mu_Y)\big] + \mu_X \mu_Y \\
&= \Sigma_{XY} + (E[X] - \mu_X)\,\mu_Y + (E[Y] - \mu_Y)\,\mu_X + \mu_X \mu_Y \\
&= \Sigma_{XY} + \mu_X \mu_Y
\end{aligned}
\]
This method of proof is typical.
In particular, the covariance can be written as
\[
\sigma_{XY} \equiv \Sigma_{XY} = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - \mu_X \mu_Y
\]
Remark: If RVs are independent, they are also uncorrelated. The converse does not hold in general, but it does for jointly Gaussian RVs (see below).
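The identity E[XY] = Sigma_XY + mu_X mu_Y is easy to verify numerically; a sketch with made-up distributions:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(loc=1.0, size=1_000_000)       # mu_X = 1
    y = x + rng.normal(loc=2.0, size=1_000_000)   # mu_Y = 3, Sigma_XY = 1

    lhs = (x * y).mean()                          # E[XY]
    rhs = ((x - x.mean()) * (y - y.mean())).mean() + x.mean() * y.mean()
    print(lhs, rhs)   # both approx. Sigma_XY + mu_X * mu_Y = 1 + 1*3 = 4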
Remark: The RVs with finite energy, $E[X^2] < \infty$, form a vector space with scalar product $\langle X, Y \rangle = E[XY]$ and norm $\|X\| = \sqrt{E[X^2]}$. (This is used in MMSE estimation.)
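For instance, the scalar product interpretation gives the Cauchy-Schwarz inequality |E[XY]| <= ||X|| ||Y||, checked here numerically (a sketch):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=100_000)
    y = 0.3 * x + rng.normal(size=100_000)

    inner = (x * y).mean()                          # <X, Y> = E[XY]
    bound = np.sqrt((x**2).mean() * (y**2).mean())  # ||X|| * ||Y||
    print(abs(inner) <= bound)                      # True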
These relations for scalar-valued RVs are generalized to vector-valued RVs in the
following.
2 Vector-valued Random Variables
Consider two real-valued vector-valued random variables (RV)
\[
\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}
\qquad
\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
\]
with the individual probability distributions $p_X(\mathbf{x})$ and $p_Y(\mathbf{y})$, and the joint distribution $p_{X,Y}(\mathbf{x}, \mathbf{y})$. (The following considerations can be generalized to longer vectors, of course.)
The probability distributions are probability mass functions (pmf) if the random variables take discrete values, and they are probability density functions (pdf) if the random variables are continuous. Some authors use f(·) instead of p(·), especially for continuous RVs.
In the following, the RVs are assumed to be continuous. (For discrete RVs, the integrals simply have to be replaced by sums.)
Remark: The following matrix notation may seem cumbersome at first glance, but it turns out to be quite handy and convenient (once you get used to it).
Mean vector:
\[
\boldsymbol{\mu}_X := E[\mathbf{X}] = \begin{bmatrix} \mu_{X_1} \\ \mu_{X_2} \end{bmatrix}
\]
Covariance matrix:
\[
\Sigma_{XX} := E\big[(\mathbf{X} - \boldsymbol{\mu}_X)(\mathbf{X} - \boldsymbol{\mu}_X)^T\big]
= E\left[
\begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \end{bmatrix}
\begin{bmatrix} X_1 - \mu_1 & X_2 - \mu_2 \end{bmatrix}
\right]
= \begin{bmatrix}
E[(X_1 - \mu_1)(X_1 - \mu_1)] & E[(X_1 - \mu_1)(X_2 - \mu_2)] \\
E[(X_2 - \mu_2)(X_1 - \mu_1)] & E[(X_2 - \mu_2)(X_2 - \mu_2)]
\end{bmatrix}
\]
Cross-covariance matrix:
\[
\Sigma_{XY} := E\big[(\mathbf{X} - \boldsymbol{\mu}_X)(\mathbf{Y} - \boldsymbol{\mu}_Y)^T\big]
\]
Remark: This matrix contains the covariance of each element of the first vector with each element of the second vector.
Second moments:
\[
E[\mathbf{X}\mathbf{X}^T] = \Sigma_{XX} + \boldsymbol{\mu}_X \boldsymbol{\mu}_X^T
\qquad
E[\mathbf{X}\mathbf{Y}^T] = \Sigma_{XY} + \boldsymbol{\mu}_X \boldsymbol{\mu}_Y^T
\]
Remark: This result is not too surprising when you know the result for the scalar case.
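A sketch of the vector-valued moments with NumPy (mean and covariance below are made up; rows of the sample matrix are draws of the length-2 vector X):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    mu = np.array([1.0, -2.0])
    cov = np.array([[2.0, 0.5],
                    [0.5, 1.0]])
    X = rng.multivariate_normal(mu, cov, size=n)   # shape (n, 2)

    mu_hat = X.mean(axis=0)            # mean vector mu_X
    Xc = X - mu_hat
    Sigma_XX = Xc.T @ Xc / n           # covariance matrix
    EXXt = X.T @ X / n                 # second moment E[X X^T]

    # E[X X^T] = Sigma_XX + mu_X mu_X^T (exact for the sample versions)
    print(np.allclose(EXXt, Sigma_XX + np.outer(mu_hat, mu_hat)))   # True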
3 Gaussian Random Variables
A scalar Gaussian RV X with mean $\mu_X$ and variance $\sigma_X^2$ has the pdf
\[
p_X(x) = \frac{1}{\sqrt{2\pi\sigma_X^2}} \cdot \exp\left( -\frac{(x - \mu_X)^2}{2\sigma_X^2} \right)
\]
The often-used symbolic notation
\[
X \sim \mathcal{N}(\mu_X, \sigma_X^2)
\]
may be read as: X is (distributed) Gaussian with mean $\mu_X$ and variance $\sigma_X^2$.
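A sketch of evaluating the Gaussian pdf in plain NumPy (function name and parameters are chosen here for illustration):

    import numpy as np

    def gaussian_pdf(x, mu, sigma2):
        """pdf of N(mu, sigma2) evaluated at x."""
        return np.exp(-(x - mu)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

    print(gaussian_pdf(0.0, mu=0.0, sigma2=1.0))   # 1/sqrt(2*pi), approx. 0.3989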
The special case $\mu_X = 0$, $\sigma_X^2 = 1$ is the standard normal distribution:
\[
p(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}.
\]
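The general Gaussian pdf reduces to the standard one via the substitution z = (x - mu_X)/sigma_X, since p_X(x) = p(z)/sigma_X; a quick numerical check (reusing the gaussian_pdf sketch from above):

    import numpy as np

    def gaussian_pdf(x, mu, sigma2):
        return np.exp(-(x - mu)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

    x, mu, sigma = 1.7, 2.0, 3.0
    z = (x - mu) / sigma
    print(np.isclose(gaussian_pdf(x, mu, sigma**2),
                     gaussian_pdf(z, 0.0, 1.0) / sigma))   # True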