
12345678wdv · May 10, 2024

Shouldn't the weights for Gini weighted be 4/10 and 6/10? Why are they 5/10?

NO.PZ2024030508000093

Question:

A quantitative analyst supporting the acquisitions team of a European corporate real estate firm is using the decision tree technique to create a model for forecasting property prices. The analyst compiles a training data set comprised of information from 10 recent property sales, as shown in the following table:

The table also includes the target variable of the model: a class label indicating whether the property was sold for a price greater than EUR 8,000,000. The analyst selects the occupancy status as the feature that is used as the root node of the decision tree. What is the estimated information gain of the split put forward by this root node?

Options:

A. 0.09
B. 0.37
C. 0.44
D. 0.82

Explanation:

Explanation: A is correct. The information gain is calculated as $\text{Gini}_{\text{base}} - \text{Gini}_{\text{weighted}}$. Before computing it, we first calculate the base-level Gini measure by looking at the output variable before we know anything about the features.
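For reference, the quantities used in this explanation can be written in general form. The notation here ($p_k$ for the share of class $k$ within a node, $w_j$ for the weight given to child node $j$) is an assumption introduced for illustration and does not appear in the original explanation.

```latex
\text{Gini} = 1 - \sum_{k} p_k^{2},
\qquad
\text{Gini}_{\text{weighted}} = \sum_{j} w_j \,\text{Gini}_j,
\qquad
\text{Information Gain} = \text{Gini}_{\text{base}} - \text{Gini}_{\text{weighted}}
```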

There are 5 properties that sold above EUR 8,000,000 and 5 that sold below.

$\text{Gini}_{\text{base}} = 1 - (5/10)^2 - (5/10)^2 = 0.50$

Using the feature “occupancy status” as the root node, we examine this feature and find that for the 4 properties that were occupied, 3 sold above the amount and only 1 sold below.

$\text{Gini}_{\text{occupied}} = 1 - (3/4)^2 - (1/4)^2 = 0.375$

In a similar fashion, we find that for the 6 properties that were not occupied, 2 sold above the amount and 4 sold below.

$\text{Gini}_{\text{not occupied}} = 1 - (2/6)^2 - (4/6)^2 = 0.4444$

Thus, the weighted Gini measure for this feature is obtained as:

$\text{Gini}_{\text{weighted}} = \tfrac{5}{10}(0.375) + \tfrac{5}{10}(0.4444) = 0.4097$

Therefore, Information Gain = $\text{Gini}_{\text{base}} - \text{Gini}_{\text{weighted}}$ = 0.50 − 0.4097 = 0.0903, or approximately 0.09.
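The arithmetic above can be reproduced with a short Python sketch. The function name `gini` and the variable names are illustrative assumptions, not part of the original explanation. The class counts are the ones stated in the explanation, and the 5/10 weights are the ones implied by its 0.4097 figure.

```python
def gini(counts):
    """Gini impurity of a node given its class counts: 1 - sum(p_k^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Class counts (sold above, sold below) as stated in the explanation.
gini_base = gini([5, 5])          # all 10 properties
gini_occupied = gini([3, 1])      # the 4 occupied properties
gini_not_occupied = gini([2, 4])  # the 6 unoccupied properties

# Weights of 5/10 each (assumption: inferred from the explanation's 0.4097).
gini_weighted = 0.5 * gini_occupied + 0.5 * gini_not_occupied
information_gain = gini_base - gini_weighted

print(round(gini_base, 4), round(gini_occupied, 4), round(gini_not_occupied, 4))
print(round(gini_weighted, 4), round(information_gain, 4))
```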

B is incorrect. This is just the Gini measure for the sold properties that were occupied.

C is incorrect. This is just the Gini measure for the sold properties that were not occupied.

D is incorrect. This is the unweighted sum of the Gini measure for the sold properties that were occupied and the Gini measure for the sold properties that weren’t occupied (0.375 + 0.444).

Learning Objective: Show how a decision tree is constructed and interpreted.

Reference: Global Association of Risk Professionals. Quantitative Analysis. New York, NY: Pearson, 2023, Chapter 15, Machine Learning and Prediction [QA-15].

As stated in the title.

1 answer

品职答疑小助手雍 · May 11, 2024

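Hello, that was an example from the lecture. Here you need to go by the conditions given in the question:

In the table given in the question, it is 5 out of 10, so the weight is 5/10.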
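To see how much the choice of weights matters, the sketch below computes the information gain under both sets of weights mentioned in this thread: 5/10 and 5/10 (as in the answer above) and 4/10 and 6/10 (as the question proposes, from the 4 occupied and 6 unoccupied properties). The helper names are hypothetical. The sketch only reports the numbers and does not settle which weighting the exam intends.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def information_gain(parent, children, weights):
    """Parent Gini minus the weighted sum of the child-node Gini values."""
    weighted = sum(w * gini(child) for w, child in zip(weights, children))
    return gini(parent) - weighted

# Class counts (sold above, sold below) taken from the quoted explanation.
parent = [5, 5]
children = [[3, 1], [2, 4]]  # occupied, not occupied

gain_equal = information_gain(parent, children, [5 / 10, 5 / 10])
gain_by_size = information_gain(parent, children, [4 / 10, 6 / 10])

print(f"weights 5/10, 5/10: gain = {gain_equal:.4f}")
print(f"weights 4/10, 6/10: gain = {gain_by_size:.4f}")
```

Under either weighting, the result is closer to option A (0.09) than to any of the other three choices.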

Related questions

NO.PZ2024030508000093
Could you explain how this question would be solved if entropy were used instead?

2024-11-04 06:47 · 2 answers

NO.PZ2024030508000093
I still don't quite understand why the weights should be 5/10. In the example in the lecture notes, the weights are based on the counts for the feature. On page 485 of the notes, when weighting by large cap, we use large cap/total and non-large cap/total, not pay dividend/total and no dividend/total. So why doesn't this question follow the same approach?

2024-10-15 03:20 · 1 answer