Machine Learning Lecture Notes
Machine Learning Lecture Notes 1 [KCL Financial Mathematics]
1 Introduction
1.1 Machine Learning vs. Statistics
Definition
- Machine Learning: a field that takes an algorithmic approach to data analysis, processing and prediction.
- Algorithmic: an approach that produces good predictions or extracts useful information from data to solve a practical problem.
Venn diagram: Statistics / Data Science / AI
1.2 Applications
Supervised Learning
- Definition: automate decision-making processes by generalising from input-output pairs $(x_i, y_i)$, $i \in \{1, \dots, N\}$, for some $N \in \mathbb{N}$.
- Drawback: creating a dataset of inputs and outputs is often a laborious manual process.
- Advantage: supervised learning algorithms are well understood and their performance is easy to measure.
- Example: train ticket pricing by distance (see the sketch below).
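A minimal sketch of this supervised setting in plain NumPy. The distance/price pairs are invented purely for illustration, and any least-squares routine would do in place of `np.polyfit`:

```python
import numpy as np

# Hypothetical input-output pairs (x_i, y_i): distance in km -> ticket price.
# The numbers are invented for illustration only.
distance = np.array([10.0, 50.0, 120.0, 200.0, 300.0])
price = np.array([4.0, 12.0, 25.0, 38.0, 55.0])

# Fit price ≈ a * distance + b by least squares.
a, b = np.polyfit(distance, price, deg=1)

# The learned function generalises to an unseen input, e.g. a 150 km trip.
print(f"predicted price for 150 km: {a * 150.0 + b:.2f}")
```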
Unsupervised Learning
- Definition: only the input data is known; no output data is given to the algorithm.
- Drawback: harder to understand and evaluate than supervised learning.
- Advantage: only the input data is needed, so there is no process of "creating input-output pairs" involved.
- Example: trade portfolio (see the clustering sketch below).
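A minimal sketch of the unsupervised setting, assuming scikit-learn is available and a hypothetical portfolio in which each trade is described by two invented features; `KMeans` groups the trades without being given any labels:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical trades described by two features, e.g. (notional, holding period).
# No labels are provided; the values are invented for illustration.
rng = np.random.default_rng(0)
trades = np.vstack([
    rng.normal(loc=[1.0, 2.0], scale=0.3, size=(20, 2)),  # one style of trade
    rng.normal(loc=[5.0, 0.5], scale=0.3, size=(20, 2)),  # another style
])

# k-means discovers group structure from the inputs alone.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(trades)
print(labels)
```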
Data & Tasks & Algorithms
- data (explained with a case)
  - data point
  - feature
  - feature extraction/engineering
- task (figure: overview of ML tasks)
  - regression
  - classification
  - clustering
  - dimensionality reduction
  < Different tasks have different loss functions; see the Function part of 1.3. >
- algorithm
  - support vector machines (SVMs)
  - nearest neighbours
  - random forest
  - k-means
  - matrix factorisation/autoencoder
  - …
1.3 Deep Learning
Definition (supervised & unsupervised)
Deep learning solves problems by employing (artificial) neural networks: functions constructed by alternately composing affine and (simple) non-linear functions.
Task
- prediction
- classification
- image recognition
- speech recognition and synthesis
- simulation
- optimal decision making
- …
Applications in Finance
- detect fraud
- machine-read cheques
- perform credit scoring
Limits
- limited explainability of deep learning
- black-box nature of neural networks
Function
$f = (f_1, \dots, f_O) : \mathbb{R}^I \to \mathbb{R}^O$
- inputs: $x_1, \dots, x_I$ ($I \in \mathbb{N}$)
- outputs: $f_1(x_1, \dots, x_I), \dots, f_O(x_1, \dots, x_I)$ ($O \in \mathbb{N}$)
- loss function: $L(f) := \frac{1}{I} \sum_{i=1}^{I} l(\hat{f}_i, f_i)$ (e.g. squared loss, absolute loss; see the sketch below)
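A small sketch of evaluating this average loss, with invented predictions $\hat{f}_i$ and targets $f_i$, showing the squared and absolute choices of $l$:

```python
import numpy as np

# Hypothetical predictions f_hat and targets f (values invented for illustration).
f_hat = np.array([1.2, 0.4, 2.5])
f_true = np.array([1.0, 0.0, 3.0])

# Average loss L(f) with two common choices of per-component loss l:
squared_loss = np.mean((f_hat - f_true) ** 2)    # l(a, b) = (a - b)^2
absolute_loss = np.mean(np.abs(f_hat - f_true))  # l(a, b) = |a - b|
print(squared_loss, absolute_loss)
```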
1) Examples
- Regression problem: data fitting
- Binary classification problem: the direction of the next price change / credit risk analysis
2) Construction of a Class of Functions
The class should be:
- rich, in the sense that it encompasses "almost any" reasonable functional relationship between the outputs and inputs;
- parameterised by a finite set of parameters, so that we can actually work with it numerically;
- able to cope with high-dimensional inputs and outputs.
3) Selection of an Optimal $f$
The selection procedure should be:
- implementable numerically;
- efficient enough to cope with large numbers of samples;
- able to avoid the pitfall of overfitting, that is, producing a function $f$ that performs well on the training data but poorly on other data (see the sketch below).
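A minimal illustration of overfitting, with invented data: a high-degree polynomial matches the noisy training samples almost exactly but generalises worse than a lower-degree fit.

```python
import numpy as np

# Noisy samples of a smooth function; data invented for illustration.
rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=10)
x_test = np.linspace(0.0, 1.0, 100)
y_test = np.sin(2 * np.pi * x_test)

# A degree-9 polynomial can interpolate all 10 training points (near-zero
# training error) yet performs worse on unseen test points than degree 3.
for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train error {train_err:.4f}, test error {test_err:.4f}")
```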
4) Functions in Deep Learning
- Function: a composition of affine functions and activation functions,
  $f = \sigma_r \circ L_r \circ \cdots \circ \sigma_1 \circ L_1 : \mathbb{R}^I \to \mathbb{R}^O$,
  where
  $x = (x_1, \dots, x_{d_i}) \in \mathbb{R}^{d_i}$;
  $L_i : \mathbb{R}^{d_{i-1}} \to \mathbb{R}^{d_i}$, $i \in \{1, \dots, r\}$, is an affine function, transmitting $d_{i-1}$ signals to $d_i$ units or neurons;
  $\sigma_i : \mathbb{R} \to \mathbb{R}$ is an activation function, applied componentwise to the $d_i$ signals: $\sigma_i(x) := (\sigma_i(x_1), \dots, \sigma_i(x_{d_i}))$.
  < This satisfies the requirements of 2). >
  < Since the model imposes no specific structure on the data, it is general enough to fit diverse data. >
- Optimal $f$: stochastic gradient descent (SGD); see the sketch after this list
  - the matrices and vectors parameterise its layers
  - a randomly drawn subset of samples (a minibatch) is used at each step
  - the gradient is computed using a form of algorithmic differentiation (backpropagation)
- Loss function: generally the absolute value of the residual, though other losses are used for particular tasks.
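A minimal runnable sketch combining these ingredients, not the lecture's exact construction: a two-layer network $f = L_2 \circ \sigma \circ L_1$ with ReLU activation, trained by minibatch SGD on the squared loss. The data, layer widths, and learning rate are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented regression data: x in R^2, y = sum of squares plus noise.
X = rng.normal(size=(256, 2))
y = (X ** 2).sum(axis=1, keepdims=True) + rng.normal(scale=0.1, size=(256, 1))

# Parameters of the affine layers L1(x) = W1 x + b1 and L2(h) = W2 h + b2.
d_hidden = 16
W1 = rng.normal(scale=0.5, size=(2, d_hidden))
b1 = np.zeros(d_hidden)
W2 = rng.normal(scale=0.5, size=(d_hidden, 1))
b2 = np.zeros(1)

lr = 0.05
for step in range(2000):
    # Minibatch: a randomly drawn subset of the samples.
    idx = rng.choice(len(X), size=32, replace=False)
    xb, yb = X[idx], y[idx]

    # Forward pass: affine, componentwise non-linearity, affine.
    z1 = xb @ W1 + b1        # L1
    h = np.maximum(z1, 0.0)  # sigma = ReLU, applied componentwise
    pred = h @ W2 + b2       # L2

    # Backward pass (backpropagation) for the mean squared loss.
    grad_pred = 2.0 * (pred - yb) / len(xb)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_z1 = grad_h * (z1 > 0.0)  # ReLU derivative
    grad_W1 = xb.T @ grad_z1
    grad_b1 = grad_z1.sum(axis=0)

    # SGD update of all layer parameters.
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print("final minibatch loss:", float(np.mean((pred - yb) ** 2)))
```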
History of Deep Learning
- Heaviside function < problem sheet 1 > (sketched below)
  $f(x; w, b) := H(w' x + b)$, where
  $H(x) := \begin{cases} 0, & x < 0 \\ 1, & x \geq 0 \end{cases}$
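A sketch of this historical building block as code: a single Heaviside unit computing $H(w'x + b)$, with invented weights and inputs.

```python
import numpy as np

def heaviside_unit(x, w, b):
    """Return H(w'x + b): 0 if the affine score is negative, else 1."""
    return np.where(x @ w + b >= 0.0, 1, 0)

w = np.array([1.0, -1.0])  # hypothetical weights
b = -0.5                   # hypothetical bias
X = np.array([[2.0, 0.0], [0.0, 2.0]])
print(heaviside_unit(X, w, b))  # -> [1 0]
```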