Machine Learning Notes (Washington University) - Classification Specialization - Week 3
1. Quality metric
The quality metric for a decision tree is the classification error:
error = number of incorrect predictions / number of examples
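This metric can be sketched in a few lines of Python (the function name and list-based inputs are illustrative, not from the course code):

```python
def classification_error(predictions, labels):
    # Count mismatched predictions and divide by the number of examples.
    incorrect = sum(1 for p, y in zip(predictions, labels) if p != y)
    return incorrect / len(labels)
```

For example, two wrong predictions out of four examples gives an error of 0.5.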
2. Greedy algorithm
Procedure
Step 1: Start with an empty tree.
Step 2: Select a feature to split the data on:
    split the data on each feature,
    compute the classification error of each resulting decision stump,
    choose the feature with the lowest error.
For each split of the tree:
Step 3: If all data points in a node have the same y value,
or if all the features have already been used, stop.
Step 4: Otherwise, go back to step 2 and recurse on this split.
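The feature-selection step above can be sketched as follows, assuming binary features stored as dicts mapping feature names to 0/1 values (the data layout and function names are assumptions for illustration):

```python
from collections import Counter

def node_mistakes(labels):
    # Mistakes made by a node that predicts its majority class.
    if not labels:
        return 0
    majority_count = Counter(labels).most_common(1)[0][1]
    return len(labels) - majority_count

def best_splitting_feature(data, labels, features):
    # Try splitting on each feature and keep the one whose
    # decision stump has the lowest classification error.
    best_feature, best_error = None, float('inf')
    for f in features:
        left = [y for x, y in zip(data, labels) if x[f] == 0]
        right = [y for x, y in zip(data, labels) if x[f] == 1]
        error = (node_mistakes(left) + node_mistakes(right)) / len(labels)
        if error < best_error:
            best_feature, best_error = f, error
    return best_feature
```

A feature that perfectly separates the labels yields a stump error of 0 and is always selected.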
Algorithm
predict(tree_node, input)
    if current tree_node is a leaf:
        return majority class of data points in leaf
    else:
        next_node = child node of tree_node whose feature value agrees with input
        return predict(next_node, input)
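A runnable version of this recursion, assuming the tree is a nested dict (this structure is illustrative, not the course's representation): a leaf stores its majority-class prediction, and an internal node stores the feature it splits on plus one child per feature value.

```python
def predict(tree_node, x):
    if tree_node['leaf']:
        # Leaf: return the stored majority class.
        return tree_node['prediction']
    # Descend into the child whose branch value matches the input.
    value = x[tree_node['feature']]
    next_node = tree_node['children'][value]
    return predict(next_node, x)
```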
3. Threshold split
Threshold splits handle continuous inputs:
we pick a threshold value for the continuous feature and split the data on it.
Procedure:
Step 1: Sort the values of a feature hj(x): {v1, v2, ..., vN}
Step 2: For i = 1 ... N-1 (between consecutive data points):
    consider the split ti = (vi + vi+1) / 2
    compute the classification error of the split
Choose the ti with the lowest classification error.
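The threshold search above can be sketched like this, using midpoints between consecutive sorted values as candidate thresholds (the function names are assumptions for illustration):

```python
from collections import Counter

def node_mistakes(labels):
    # Mistakes made by predicting the majority class of a node.
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]

def best_threshold(values, labels):
    # Candidate thresholds are midpoints of consecutive distinct values.
    vs = sorted(set(values))
    best_t, best_err = None, float('inf')
    for a, b in zip(vs, vs[1:]):
        t = (a + b) / 2
        left = [y for v, y in zip(values, labels) if v < t]
        right = [y for v, y in zip(values, labels) if v >= t]
        err = (node_mistakes(left) + node_mistakes(right)) / len(labels)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err
```

When a threshold cleanly separates the two classes, the error drops to 0 and that midpoint is returned.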