12345678wdv · May 10, 2024

Shouldn't the weights in Gini weighted be 4/10 and 6/10? Why is it 5/10?

NO.PZ2024030508000093

The question is as follows:

A quantitative analyst supporting the acquisitions team of a European corporate real estate firm is using the decision tree technique to create a model for forecasting property prices. The analyst compiles a training data set comprised of information from 10 recent property sales, as shown in the following table:

The table also includes the target variable of the model: a class label indicating whether the property was sold for a price greater than EUR 8,000,000. The analyst selects the occupancy status as the feature that is used as the root node of the decision tree. What is the estimated information gain of the split put forward by this root node?

Options:

A. 0.09  B. 0.37  C. 0.44  D. 0.82

Explanation:

Explanation: A is correct. The information gain is $\text{Gini}_{\text{base}} - \text{Gini}_{\text{weighted}}$. Before calculating it, we first compute the base-level Gini measure from the output variable alone, before we know anything about the features.
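For reference, the quantities used in this explanation can be written in general form. This is a sketch in assumed notation: $p_k$ (the share of class $k$ in a node) and $K$ (the number of classes, here 2) are symbols introduced for illustration and do not appear in the original explanation.

```latex
\[
\text{Gini} = 1 - \sum_{k=1}^{K} p_k^{2},
\qquad
\text{Information Gain} = \text{Gini}_{\text{base}} - \text{Gini}_{\text{weighted}}
\]
```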

There are 5 properties that sold above EUR 8,000,000 and 5 that sold below.

$\text{Gini}_{\text{base}} = 1 - (5/10)^2 - (5/10)^2 = 0.50$

Using the feature “occupancy status” as the root node, we examine this feature and find that for the 4 properties that were occupied, 3 sold above the amount and only 1 sold below.

$\text{Gini}_{\text{occupied}} = 1 - (3/4)^2 - (1/4)^2 = 0.375$

In a similar fashion, we find that for the 6 properties that were not occupied, 2 sold above the amount and 4 sold below.

$\text{Gini}_{\text{not occupied}} = 1 - (2/6)^2 - (4/6)^2 = 4/9 \approx 0.444$
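The three node-level Gini measures can be checked with a few lines of Python. This is a minimal sketch: the helper name `gini` is hypothetical, and the class counts are the ones quoted in the explanation (the table itself is not reproduced in this thread).

```python
def gini(counts):
    """Gini impurity of a node, given the number of samples in each class."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Counts are (sold above EUR 8,000,000, sold below), as quoted in the explanation.
gini_base = gini([5, 5])          # all 10 properties
gini_occupied = gini([3, 1])      # the 4 occupied properties
gini_not_occupied = gini([2, 4])  # the 6 properties that were not occupied

print(f"base={gini_base:.4f}, occupied={gini_occupied:.4f}, "
      f"not occupied={gini_not_occupied:.4f}")
```

The printed values should line up with the 0.50, 0.375 and 0.444 used in the explanation.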

Thus, the weighted Gini measure for this feature is obtained as:

$\text{Gini}_{\text{weighted}} = (5/10)(0.375) + (5/10)(0.4444) \approx 0.4097$

Therefore, Information Gain = $\text{Gini}_{\text{base}} - \text{Gini}_{\text{weighted}} = 0.50 - 0.4097 = 0.0903$, or approximately 0.09.
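The sketch below compares the two weightings discussed in this thread. It assumes the class counts quoted in the explanation; the helper names `gini` and `weighted_gini` are hypothetical. Weights of 5/10 each reproduce the 0.4097 quoted above, while weights proportional to the node sizes stated in the explanation (4/10 and 6/10), which is the usual convention in decision-tree implementations, give a slightly different figure. Since the table is not reproduced here, which weights match the question's data cannot be verified from this thread alone.

```python
def gini(counts):
    """Gini impurity of a node, given the number of samples in each class."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def weighted_gini(children, weights):
    """Weighted sum of the child-node Gini measures."""
    return sum(w * gini(c) for c, w in zip(children, weights))

parent = [5, 5]              # (above, below) for all 10 properties
children = [[3, 1], [2, 4]]  # occupied, not occupied

# Weights as used in the quoted explanation: 5/10 each.
gini_equal = weighted_gini(children, [5 / 10, 5 / 10])
gain_equal = gini(parent) - gini_equal

# Assumption: weights proportional to the child-node sizes (4/10 and 6/10).
n = sum(parent)
size_weights = [sum(c) / n for c in children]
gini_sized = weighted_gini(children, size_weights)
gain_sized = gini(parent) - gini_sized

print(f"5/10 weights:       weighted={gini_equal:.4f}, gain={gain_equal:.4f}")
print(f"4/10, 6/10 weights: weighted={gini_sized:.4f}, gain={gain_sized:.4f}")
```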

B is incorrect. This is just the Gini measure for the sold properties that were occupied.

C is incorrect. This is just the Gini measure for the sold properties that were not occupied.

D is incorrect. This is the unweighted sum of the Gini measure for the sold properties that were occupied and the Gini measure for the sold properties that weren’t occupied (0.375 + 0.444).
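As a quick check on the three distractors, the following sketch recomputes each value from the class counts quoted in the explanation and prints it next to the listed option. The helper name `gini` and the dictionary layout are hypothetical, chosen only for illustration.

```python
def gini(counts):
    """Gini impurity of a node, given the number of samples in each class."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

gini_occupied = gini([3, 1])      # occupied: 3 above, 1 below
gini_not_occupied = gini([2, 4])  # not occupied: 2 above, 4 below

# Each entry pairs the recomputed value with the option as listed in the question.
candidates = {
    "B": (gini_occupied, 0.37),
    "C": (gini_not_occupied, 0.44),
    "D": (gini_occupied + gini_not_occupied, 0.82),
}

for label, (value, option) in candidates.items():
    print(f"{label}: computed {value:.4f}, listed as {option}")
```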

Learning Objective: Show how a decision tree is constructed and interpreted.

Reference: Global Association of Risk Professionals. Quantitative Analysis. New York, NY: Pearson, 2023, Chapter 15, Machine Learning and Prediction [QA-15].

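As in the title.

1 answer

品职答疑小助手雍 · May 11, 2024

Hello, that was the example from class; here you need to go by the conditions given in the question:

In the table given in the question, it is 5 out of 10, so the weight is 5/10.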
