Machine Learning Notes (Washington University) - Classification Specialization - week 3


1. Quality metric

The quality metric for a decision tree is the classification error:

error = number of incorrect predictions / number of examples
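The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `classification_error` and the toy labels are assumptions, not part of the course material.

```python
def classification_error(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one example")
    # Count the positions where the prediction disagrees with the label.
    mistakes = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mistakes / len(y_true)


# Hypothetical labels: 2 of the 5 predictions are wrong.
error = classification_error([1, -1, 1, 1, -1], [1, 1, 1, -1, -1])
```

An error of 0 means every prediction is correct; an error of 1 means every prediction is wrong.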

 

2. Greedy algorithm

Procedure

Step 1: Start with an empty tree

Step 2: Select a feature to split the data on

Explanation:

  Split the data on each candidate feature

  Compute the classification error of the resulting decision stump

  Choose the feature with the lowest error

For each split of the tree:

  Step 3: If all data points in the node have the same y value,

      or if all features have already been used, stop.

  Step 4: Otherwise, go to step 2 and recurse on this split
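The procedure above can be sketched roughly as follows for categorical features. This is a minimal sketch under stated assumptions: the dict-based node layout, the helper names (`majority_class`, `split_error`, `build_tree`) and the loan-style toy data are all made up for illustration.

```python
from collections import Counter


def majority_class(labels):
    """Most common label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def split_error(rows, labels, feature):
    """Classification error of the decision stump that splits on `feature`
    and predicts the majority class inside each branch."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    # In each branch, every point outside the majority class is a mistake.
    mistakes = sum(len(ys) - Counter(ys).most_common(1)[0][1]
                   for ys in groups.values())
    return mistakes / len(labels)


def build_tree(rows, labels, features):
    # Step 3: stop when the node is pure or no features are left.
    if len(set(labels)) == 1 or not features:
        return {"leaf": True, "prediction": majority_class(labels)}
    # Step 2: pick the feature whose stump has the lowest error.
    best = min(features, key=lambda f: split_error(rows, labels, f))
    remaining = [f for f in features if f != best]
    children = {}
    # Step 4: recurse on each branch of the chosen split.
    for value in sorted(set(row[best] for row in rows)):
        sub_rows = [r for r in rows if r[best] == value]
        sub_labels = [y for r, y in zip(rows, labels) if r[best] == value]
        children[value] = build_tree(sub_rows, sub_labels, remaining)
    return {"leaf": False, "feature": best, "children": children,
            "prediction": majority_class(labels)}


# Hypothetical toy data set.
rows = [
    {"credit": "excellent", "term": "3yr"},
    {"credit": "excellent", "term": "5yr"},
    {"credit": "fair", "term": "3yr"},
    {"credit": "fair", "term": "5yr"},
    {"credit": "poor", "term": "3yr"},
    {"credit": "poor", "term": "5yr"},
]
labels = ["safe", "safe", "risky", "safe", "risky", "risky"]
tree = build_tree(rows, labels, ["credit", "term"])
```

On this toy data the stump on `credit` makes 1 mistake out of 6 and the stump on `term` makes 2, so the root splits on `credit`; only the `fair` branch is impure and gets split again on `term`.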

Prediction algorithm

predict(tree_node, input):

  if tree_node is a leaf:

    return majority class of the data points in the leaf

  else:

    next_node = child of tree_node whose feature value agrees with input

    return predict(next_node, input)
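A runnable version of this traversal might look like the sketch below. The dict-based node layout and the hand-built tree are assumptions made for illustration. The fallback for a feature value that has no matching child is also an assumption; the notes do not say how to handle that case.

```python
def predict(tree_node, x):
    """Walk from `tree_node` down to a leaf and return its class."""
    if tree_node["leaf"]:
        # The leaf stores the majority class of its training points.
        return tree_node["prediction"]
    value = x[tree_node["feature"]]
    next_node = tree_node["children"].get(value)
    if next_node is None:
        # Assumption: for an unseen feature value, fall back on the
        # majority class stored at the current node.
        return tree_node["prediction"]
    return predict(next_node, x)


# Hypothetical hand-built tree: split on credit, then on term.
tree = {
    "leaf": False, "feature": "credit", "prediction": "safe",
    "children": {
        "excellent": {"leaf": True, "prediction": "safe"},
        "poor": {"leaf": True, "prediction": "risky"},
        "fair": {
            "leaf": False, "feature": "term", "prediction": "risky",
            "children": {
                "3yr": {"leaf": True, "prediction": "risky"},
                "5yr": {"leaf": True, "prediction": "safe"},
            },
        },
    },
}

result = predict(tree, {"credit": "fair", "term": "5yr"})
```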

3. Threshold split

A threshold split is used for continuous inputs.

We pick a threshold value for the continuous feature and split the data on whether the feature value falls below or above it.

Procedure:

Step 1: Sort the values of a feature h_j(x): {v_1, v_2, ..., v_N}

Step 2: For i = 1, ..., N-1 (each pair of adjacent data points)

      consider the split t_i = (v_i + v_{i+1}) / 2

      compute the classification error of the split

Step 3: Choose the t_i with the lowest classification error
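The threshold search can be sketched as below. The function name `best_threshold` and the income data are hypothetical. Two choices here are assumptions not stated in the notes: points with a value below the threshold go left and the rest go right, and each side predicts its majority class.

```python
from collections import Counter


def best_threshold(values, labels):
    """Return (threshold, error) for the best midpoint split of one feature."""
    # Step 1: sort the data points by feature value.
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_t, best_err = None, float("inf")
    # Step 2: try the midpoint between each pair of adjacent values.
    for i in range(n - 1):
        lo, hi = pairs[i][0], pairs[i + 1][0]
        if lo == hi:
            continue  # identical values cannot be separated
        t = (lo + hi) / 2
        left = [y for v, y in pairs if v < t]
        right = [y for v, y in pairs if v >= t]
        # Each side predicts its majority class; the rest are mistakes.
        mistakes = sum(len(side) - Counter(side).most_common(1)[0][1]
                       for side in (left, right))
        err = mistakes / n
        # Step 3: keep the threshold with the lowest error so far.
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err


# Hypothetical incomes (in thousands) with loan labels.
incomes = [50, 20, 80, 35, 65]
labels = ["safe", "risky", "safe", "risky", "safe"]
threshold, error = best_threshold(incomes, labels)
```

Here the sorted values are 20, 35, 50, 65, 80, and the midpoint between 35 and 50 separates the two classes perfectly.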

 
