
我的世界守则 · July 17, 2023

Could the teacher explain in detail how to use the entropy results to build the decision tree?

* For the full question details, please see the question stem.

NO.PZ202212020200001502

The question is as follows:

Build a decision tree for this problem.

Options:

Explanation:

Both features are binary, so there is no need to determine a split threshold, as there would be for a continuous variable. The first stage is to calculate the entropy that would result if the sample were split on each of the two features.

Examining the Car_owner feature first: among owners (feature = 1), two made a claim while four did not, leading to an entropy for this sub-set of

Entropy(owners) = −(2/6)·log2(2/6) − (4/6)·log2(4/6) ≈ 0.918
Among non-car owners (feature = 0), two made a claim and two did not, leading to an entropy of 1. The weighted entropy for splitting by car ownership is therefore

Weighted entropy = (6/10) × 0.918 + (4/10) × 1.000 ≈ 0.951
and the information gain is 0.971 − 0.951 = 0.020, where 0.971 is the entropy of the full sample (four claims and six non-claims) before any split.
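As a sanity check, these numbers can be reproduced with a few lines of Python. This is a minimal sketch; the `entropy` helper and the variable names are ours, not from the source material, while the class counts come straight from the explanation above.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) from a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

h_full      = entropy([4, 6])   # full sample: 4 claims, 6 no-claims -> ~0.971
h_owner     = entropy([2, 4])   # Car_owner = 1: 2 claims, 4 no-claims -> ~0.918
h_non_owner = entropy([2, 2])   # Car_owner = 0: 2 claims, 2 no-claims -> 1.000
weighted    = (6 / 10) * h_owner + (4 / 10) * h_non_owner   # ~0.951
print(round(weighted, 3), round(h_full - weighted, 3))      # 0.951 0.02
```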

We repeat this process, calculating the weighted entropy that would result if the split were made on the College_degree feature instead. Doing so gives a weighted entropy of 0.551 and an information gain of 0.420. Because the information gain is maximized (equivalently, the post-split weighted entropy is minimized) when the sample is first split by College_degree, this feature becomes the root node of the decision tree.
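The same check works for the College_degree split. The subgroup counts used here (four degree holders with no claims, and six non-holders of whom four claimed and two did not) are inferred from the pure split described below and the overall totals; they reproduce the stated 0.551 and 0.420. Again, the helper name is our own.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) from a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# College_degree = 1: 4 policyholders, none made a claim (pure node, entropy 0).
# College_degree = 0: 6 policyholders, 4 claims and 2 no-claims (inferred from the totals).
weighted  = (4 / 10) * entropy([0, 4]) + (6 / 10) * entropy([4, 2])   # ~0.551
info_gain = entropy([4, 6]) - weighted                                # ~0.420
print(round(weighted, 3), round(info_gain, 3))                        # 0.551 0.42
```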

For policyholders with a college degree (i.e., feature = 1), the split is already pure: four of them made no claim and none made a claim (in other words, nobody with a college degree made a claim). This means that no further splits are required along this branch. The other branch (no college degree) can then be split using the Car_owner feature, which is the only feature remaining.
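To show how this generalizes, below is a minimal ID3-style sketch of the procedure described in this explanation: choose the feature with the largest information gain, stop when a node is pure, and split the remaining branch with the remaining feature. The data layout (a list of dicts keyed by feature name) and all function names are our own assumptions, since the source gives only the worked numbers.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """Recursive ID3-style split: pick the feature with the largest
    information gain; stop when a node is pure or no features remain."""
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]            # leaf: the (majority) class
    base = entropy(labels)

    def info_gain(f):
        gain = base
        for v in (0, 1):                                       # both features are binary
            sub = [y for x, y in zip(rows, labels) if x[f] == v]
            if sub:
                gain -= len(sub) / len(labels) * entropy(sub)
        return gain

    best = max(features, key=info_gain)
    node = {"split_on": best}
    for v in (0, 1):
        sub_rows   = [x for x in rows if x[best] == v]
        sub_labels = [y for x, y in zip(rows, labels) if x[best] == v]
        remaining  = [f for f in features if f != best]
        node[v] = (build_tree(sub_rows, sub_labels, remaining)
                   if sub_rows else Counter(labels).most_common(1)[0][0])
    return node
```

Calling build_tree on the ten policyholder records with features ["College_degree", "Car_owner"] would return a nested dict with College_degree at the root, matching the tree described in this explanation.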

The tree structure is given below:

[Decision tree figure: College_degree at the root; the degree = 1 branch is a pure "no claim" leaf, and the degree = 0 branch splits on Car_owner.]
I read the answer but still don't understand it.

1 answer
Accepted answer

DD仔_品职助教 · July 19, 2023

Hi, hard-working PZer:


Hello,

This question uses whether the driver has a college degree and whether they are a car owner to predict whether a claim will be made in the first year; you need to use these two features to decide which one should serve as the first (root) node. This question is taken directly from the lecture notes, so I suggest you watch the course video. The teacher covers it very carefully in the basic-level class, certainly more clearly than I can explain by typing:

The relevant part is the following video in Chapter 12:


----------------------------------------------
Keep going, and let's meet our better selves together!
