STAT2001 Mathematical Statistics

Posted 2023-03-27 ynxad62

STAT2001/STAT2013/STAT6013 - Introductory Mathematical Statistics
(for Actuarial Studies)/Principles of Mathematical Statistics
(for Actuarial Studies)
Lecturer: Associate Professor Janice Scealy, Research School of
Finance, Actuarial Studies and Statistics, ANU
- Read the class summary carefully.
- The course follows the textbook closely (see the text guide for
relevant sections).
What is statistics? (very brief overview)
Typically statistics involves:
(1) collecting data: x1, x2, . . . , xn (e.g. measuring the heights of
n people).
(2) presenting and describing data (e.g. boxplot, histogram,
numerical summary).
(3) proposing a probability model to summarise the data, e.g.
N(μ, σ²), a model with a small number of parameters (here 2).
(4) estimating the parameters (quantities) in the model.
(5) using sample statistics to do inference for the parameters in the
model
- e.g. confidence interval for the true μ
- e.g. hypothesis test of μ = μ0 vs μ ≠ μ0.
Research-led teaching (and a bit about me)
I am a statistical scientist.
My research focuses on developing new statistical methods for
analysing: compositional data (proportions which sum to one),
data on spheres or on more general curved surfaces (manifolds)
and data with very complex structures.
Applications include: Paleomagnetism (study of Earth’s ancient
magnetic field), household expenditure survey data,
geochemistry (composition of rocks), Microbiome count data
(stool samples).
See slides of my most recent work on analysing Microbiome
data and how I’ve used ideas from this course (not assessable).
OVERVIEW
Introduction (Chapter 1 in textbook)
- What is statistics? Basic concepts. Summarising (or
characterising) a set of numbers, graphically and numerically.
Probability (Ch 2)
1. A coin is tossed twice. What’s the probability (pr) that exactly
one head comes up? Answer: 1/2
(2 of the 4 equally likely outcomes HH, HT, TH, TT, so 2/4 = 1/2)
2. A committee of 5 is to be randomly selected from 12 people.
What’s the pr it will contain the 2 oldest people? Answer: 5/33
(C(10, 3)/C(12, 5) = 120/792 = 5/33)
3. John and Kate take turns tossing a coin, starting with John. The
first to get a head wins. What’s the pr that John wins? Answer: 2/3
(1/2 + 1/8 + 1/32 + ... = 2/3)
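The three answers above can be checked numerically. The following is a minimal Python sketch (not part of the course materials; all variable names are illustrative) that enumerates the coin outcomes, counts committees, and sums a truncated version of the series for John.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Example 1: enumerate the 4 equally likely outcomes of two coin tosses.
outcomes = list(product("HT", repeat=2))
exactly_one_head = [o for o in outcomes if o.count("H") == 1]
p_one_head = Fraction(len(exactly_one_head), len(outcomes))

# Example 2: a committee containing the 2 oldest people needs
# 3 further members chosen from the remaining 10.
p_two_oldest = Fraction(comb(10, 3), comb(12, 5))

# Example 3: John wins on toss 1, 3, 5, ...; the series is truncated
# after 30 terms, so this is a (very close) partial sum, not the limit.
p_john_partial = sum(Fraction(1, 2) ** (2 * k + 1) for k in range(30))

print(p_one_head, p_two_oldest, float(p_john_partial))
```

Exact fractions are used so that the first two answers are not affected by rounding.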
OVERVIEW (continued)
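John wins if the toss sequence is H, or TTH, or TTTTH, and so on.
These events are mutually exclusive, so we sum their probabilities:
\[
P(\text{John wins}) = \frac{1}{2} + \left(\frac{1}{2}\right)^3 + \left(\frac{1}{2}\right)^5 + \dots
= \frac{1}{2}\left(1 + \frac{1}{4} + \left(\frac{1}{4}\right)^2 + \dots\right).
\]
The bracket is the sum of an infinite geometric series, i.e.
\[
\sum_{k=0}^{\infty} r^k = \frac{1}{1-r}, \qquad |r| < 1.
\]
With r = 1/4 the sum is 4/3, so
\[
P(\text{John wins}) = \frac{1}{2} \times \frac{4}{3} = \frac{2}{3}.
\]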
OVERVIEW (continued)
Discrete random variables (Ch 3)
A formalisation of concepts in Ch 2.
A coin is tossed twice. Let Y be the number of heads that come
up. Then Y is a discrete random variable (rv), one with possible
values 0, 1, 2. Y has probability distribution (pr dsn) given by
P(Y = 0) = 1/4, P(Y = 1) = 1/2, P(Y = 2) = 1/4.
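As a rough illustration (a Python sketch, not from the course materials), the pr dsn of Y can be obtained by listing the four equally likely outcomes and counting heads in each. The name `pmf_y` is just a label chosen here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each value of Y = number of heads.
counts = Counter(o.count("H") for o in outcomes)

# Convert counts to probabilities.
pmf_y = {y: Fraction(c, len(outcomes)) for y, c in sorted(counts.items())}

for y, p in pmf_y.items():
    print(f"P(Y = {y}) = {p}")
```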
Continuous random variables (Ch 4)
A stick is measured as 1.2 m long. Let Y be the exact length of
the stick. Then Y is a continuous random variable (cts rv), one
with possible values between 1.15 and 1.25 (a continuum of
values). How can we describe Y’s pr dsn? See next slide.
Multivariate probability distributions (Ch 5)
A die is rolled twice. Let X be the no. of 5s that come up and Y
the no. of 6s. X and Y are 2 rv’s with a multivariate (or bivariate
or joint) pr dsn. Possible values of (X, Y) are:
(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)
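The joint pr dsn itself is not given on the slide. Assuming a fair die, a short Python sketch such as the following enumerates the 36 equally likely outcomes and tabulates (X, Y); the name `joint_pmf` is illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two rolls of a fair die (assumed fair).
outcomes = list(product(range(1, 7), repeat=2))

# X = number of 5s, Y = number of 6s in each outcome.
counts = Counter((rolls.count(5), rolls.count(6)) for rolls in outcomes)
joint_pmf = {xy: Fraction(c, len(outcomes)) for xy, c in sorted(counts.items())}

for (x, y), p in joint_pmf.items():
    print(f"P(X = {x}, Y = {y}) = {p}")
```

The six keys of `joint_pmf` are exactly the possible values listed above; pairs such as (1, 2) never occur because only two rolls are made.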
OVERVIEW (continued)
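What is c? Here f(y) is a measure of probability (a density), taken
to be constant over the possible values of Y:
\[
f(y) = \begin{cases} c, & 1.15 \le y \le 1.25 \\ 0, & \text{otherwise.} \end{cases}
\]
In the discrete case the sum of the probabilities is 1.
In the continuous case we need
\[
\int f(y)\,dy = 1, \quad \text{i.e.} \quad c \int_{1.15}^{1.25} dy = c\,(1.25 - 1.15) = 1,
\]
so c = 10.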
OVERVIEW (continued)
Functions of random variables (Ch 6)
- A coin is tossed twice. Let Y = no. of H’s. Then X = 3Y is a
function of Y. What’s the pr dsn of X?
(Possible values of X are 0, 3, 6.)
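One way to see the answer: each value of Y is mapped to 3Y and carries its probability with it. The Python sketch below assumes a fair coin, so that P(Y = 0) = 1/4, P(Y = 1) = 1/2, P(Y = 2) = 1/4; the helper `transform_pmf` is a hypothetical name introduced here.

```python
from collections import defaultdict
from fractions import Fraction

# pr dsn of Y = number of heads in two tosses of a fair coin.
pmf_y = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def transform_pmf(pmf, g):
    """Return the pr dsn of g(Y), adding up probabilities of values that collide."""
    out = defaultdict(Fraction)
    for y, p in pmf.items():
        out[g(y)] += p
    return dict(out)

pmf_x = transform_pmf(pmf_y, lambda y: 3 * y)
print(pmf_x)
```

Because x = 3y is one-to-one, no probabilities merge here; the summing step only matters for functions such as (Y - 1)², where several values of Y give the same value of X.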
Sampling distributions and Central Limit Theorem (CLT) (Ch 7)
- A coin is tossed 100 times. What’s the pr that at least 60 heads
come up? This pr is hard to work out exactly by hand, but can be well
approximated using the CLT and the normal distribution (Ch 4).
Y ~ Bin(100, 0.5) ≈ N(50, 25)
P(Y ≥ 60) ≈ P(Z > (60 − 0.5 − 50)/√25) = P(Z > 1.9) = 0.0287
(the 0.5 is a continuity correction).
The exact probability is 0.0284.
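Both numbers can be reproduced with the Python standard library. This is a sketch rather than the course's own code: the exact value sums binomial probabilities directly, and the approximation uses the standard normal tail written in terms of `math.erfc`.

```python
from math import comb, erfc, sqrt

n, p, k = 100, 0.5, 60

# Exact: P(Y >= 60) for Y ~ Bin(100, 0.5); every outcome has probability 1/2^100.
exact = sum(comb(n, y) for y in range(k, n + 1)) / 2**n

# Normal approximation with a continuity correction of 0.5.
mu = n * p                      # 50
sigma = sqrt(n * p * (1 - p))   # sqrt(25) = 5
z = (k - 0.5 - mu) / sigma      # 1.9
approx = 0.5 * erfc(z / sqrt(2))  # P(Z > z) for standard normal Z

print(f"exact = {exact:.4f}, approx = {approx:.4f}")
```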
OVERVIEW (continued)
Point estimation and confidence intervals (CI’s) (Ch 8)
- p = pr of H on a single toss of a bent coin. We toss the coin 100
times and get 60 heads. So we estimate p by 60/100 = 0.6 (a
point estimate). A 95% CI for p is (0.504, 0.696). So we are
95% ‘confident’ that p lies between 0.504 and 0.696. What
exactly this means will be made clear later.
0.6 ± 1.96 √(0.6 × 0.4/100) = 0.6 ± 0.096 = (0.504, 0.696).
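The interval arithmetic can be reproduced as follows. This Python sketch simply evaluates the formula on the slide (point estimate plus or minus 1.96 estimated standard errors); the variable names are illustrative.

```python
from math import sqrt

n, heads = 100, 60
p_hat = heads / n                     # point estimate, 0.6
z = 1.96                              # multiplier used on the slide for 95%
se = sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error of p_hat

lower, upper = p_hat - z * se, p_hat + z * se
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
```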
Methods of estimation (Ch 9)
It was reasonable to estimate p (above) by 0.6. In some situations it is
not so clear how to proceed. We’ll look at 2 general methods for
estimating a quantity:
1 the method of moments (MOM)
2 the method of maximum likelihood (ML).
In fact, 0.6 above is both the MOME and MLE (E = estimate) of p.
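As an informal check of this claim, assume the number of heads is Bin(100, p). The Python sketch below maximises the binomial log-likelihood over a grid of candidate values of p (a crude numerical stand-in for the calculus done later in the course) and compares the result with the method-of-moments estimate obtained by equating the sample count to its mean np.

```python
from math import log

n, heads = 100, 60

def log_likelihood(p):
    # Binomial log-likelihood up to an additive constant (log C(n, heads) omitted,
    # since it does not depend on p).
    return heads * log(p) + (n - heads) * log(1 - p)

# Crude grid search over p = 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=log_likelihood)

# Method of moments: set heads = n * p and solve for p.
p_mom = heads / n

print(p_mle, p_mom)
```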
OVERVIEW (continued)
Hypothesis testing (Ch 10)
We toss a coin 100 times and get 60 heads. Can we conclude that the
coin is unfair? That is, should we accept or reject the statement “p = 1/2”?
Bayesian methods (Ch 16)
Needed for Credibility Theory; it also provides a 3rd method of
estimation. We treat the model parameters as random variables
instead of unknown constants and assign a prior distribution to them.
This is useful when we have prior knowledge about the parameters.
e.g. choose a model for the data Y|p, assume p has some prior
distribution (follows a particular model). Then find the posterior
distribution of p|Y .
Extra topics (Ch 17)
The negative binomial distribution, sufficiency and....
INTRODUCTION (chapter 1)
Statistics = descriptive statistics + inferential statistics (or
“inference”).
Descriptive statistics involves tables, graphs, averages and other
simple summaries of data (not much maths is required). This is the
focus of Chapter 1, covered here.
Inferential statistics has to do with using information from a sample
to make statements about a population.
Sample 100 people from Canberra. On the basis of the sample,
we estimate the average height of all Canberrans as 1.74 m, with
95% CI (1.72, 1.76).
To make inferential statements like this we need a good understanding
of probability, which is a branch of mathematics. The term
mathematical statistics refers to the combined fields of probability and
inferential statistics (not descriptive statistics).
INTRODUCTION (continued)
Chapters 2 through 7 deal with probability. Chapters 8 to 10 and 16
deal with inferential statistics.
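[Diagram: field map of mathematical statistics, showing how things fit
together. Within mathematics, mathematical statistics comprises
probability (Chapters 2 to 7) and inferential statistics. Inferential
statistics branches into estimation (point estimation and interval
estimation), hypothesis testing and Bayesian methods (Chapter 16).]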
Descriptive statistics
Consider a small dataset of n = 10 values:
Table: Daily profit data
i    1    2    3    4    5    6    7    8    9    10
yi   2.1  2.4  2.2  2.3  2.7  2.5  2.4  2.6  2.6  2.9
Here, for example, yi represents the profit of a particular business on
Day i (in units of $1000). How can we graphically summarise these
data?
Frequency Histogram: Create a number of bins that span the range
of the data (2.1 to 2.9), and draw vertical bars which show the number
of observations (frequencies) in each bin.
See R code Ch1_example1.R
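The R script is not reproduced here. As a rough stand-in, the following Python sketch tabulates a frequency histogram for the profit data and prints it as text. The bin width of 0.2 and the bin edges 2.0, 2.2, ..., 3.0 are assumptions made for illustration; the course's script may use different bins.

```python
profits = [2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, 2.9]

# Work in integer tenths so that comparisons with bin edges are exact.
tenths = [round(10 * y) for y in profits]

# Assumed bins: [2.0, 2.2), [2.2, 2.4), ..., [2.8, 3.0).
edges = list(range(20, 31, 2))
bins = list(zip(edges, edges[1:]))

# Frequency = number of observations falling in each bin.
freq = [sum(lo <= t < hi for t in tenths) for lo, hi in bins]

for (lo, hi), f in zip(bins, freq):
    print(f"[{lo / 10:.1f}, {hi / 10:.1f})  {'#' * f}  {f}")
```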
Descriptive statistics (continued)
Numerical summaries of the data
Two basic types of descriptive measures are measures of central
tendency and measures of dispersion (or variation).
The most common measure of central tendency is the arithmetic mean
(or average), defined by
\[
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i,
\]
and in our example this works out as
\[
\bar{y} = \frac{1}{10}(2.1 + 2.4 + \dots + 2.9) = 2.47 \quad (\$2470).
\]
If yi represents the profit of a particular business on Day i (in units of
$1000), then ȳ is an estimate of the average profit of the business per
day over a long period of time, which may also be called the
population mean μ. (This is an example of statistical inference and
will be discussed in greater detail in later chapters.)
Descriptive statistics (continued)
A common measure of dispersion is the sample variance, defined by
\[
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2,
\]
which in our example works out as
\[
s^2 = \frac{1}{10-1} \left[ (2.1 - 2.47)^2 + \dots + (2.9 - 2.47)^2 \right] = 0.057889.
\]
A problem with s² is that it is in squared units. Here, one unit is 1000
dollars, and so a squared unit is 1,000,000 square dollars. That is,
s² = 57,889 $² (a bit awkward).
Another measure of dispersion is the sample standard deviation, s,
which in our example is
\[
s = \sqrt{0.057889} = 0.24060.
\]
This is in the same units as ȳ, and so s = $240.60.
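These summaries can be reproduced with Python's `statistics` module, whose `variance` and `stdev` functions use the n − 1 divisor. This is a sketch for checking the arithmetic, not the course's own code.

```python
from statistics import mean, stdev, variance

# Daily profits in units of $1000, from the table above.
profits = [2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, 2.9]

y_bar = mean(profits)     # sample mean
s2 = variance(profits)    # sample variance (divisor n - 1)
s = stdev(profits)        # sample standard deviation

# The same variance computed directly from the definition.
n = len(profits)
s2_manual = sum((y - y_bar) ** 2 for y in profits) / (n - 1)

print(f"mean = {y_bar:.2f}, s^2 = {s2:.6f}, s = {s:.5f}")
print(f"s in dollars = {1000 * s:.2f}")
```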
Descriptive statistics (continued)
The sample variance s² and sample standard deviation s are estimates
of the population variance σ² and population standard deviation σ,
respectively. Here, σ² may be thought of as the sample variance of a
‘very large’ number of yi values. (More will be said about this later,
and also about why s² involves “n − 1”.)
A useful principle which involves μ and σ is the Empirical Rule:
- About 68% of the values lie in the interval μ ± σ
- About 95% of the values lie in the interval μ ± 2σ
- Almost all (> 99%) of the values lie in the interval μ ± 3σ
This Rule is most accurate when n is large and histograms of the
data are bell-shaped, but still has some validity otherwise. It also
applies when μ and/or σ are replaced by ȳ and s.
See R code Ch1_example2.R
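The R script is not reproduced here. As an informal illustration, the Python sketch below applies the Empirical Rule to the ten daily profit values, with ȳ and s in place of μ and σ. With only n = 10 observations, agreement with 68% and 95% can only be rough. The function name `share_within` is made up for this example.

```python
from statistics import mean, stdev

# Daily profits in units of $1000.
profits = [2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, 2.9]
y_bar, s = mean(profits), stdev(profits)

def share_within(k):
    """Proportion of observations lying in the interval y_bar +/- k * s."""
    lo, hi = y_bar - k * s, y_bar + k * s
    return sum(lo <= y <= hi for y in profits) / len(profits)

for k in (1, 2, 3):
    print(f"within {k} s of the mean: {share_within(k):.0%}")
```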
