
锦鲤本鲤 · January 21, 2022

I'm not sure how to solve this question. Could you walk through it?


NO.PZ202108310100000207

Question:

Based on Exhibit 3, which threshold p-value indicates the best fitting model?

Options:

A. 0.57

B. 0.79

C. 0.84

Explanation:

B is correct. The higher the AUC, the better the model performance. For the threshold p-value of 0.79, the AUC is 91.3% on the training dataset and 89.7% on the cross-validation dataset, and the ROC curves are similar for model performance on both datasets. These findings suggest that the model performs similarly on both training and CV data and thus indicate a well-fitting model.

A is incorrect because for the threshold p-value of 0.57, the AUC is 56.7% on the training dataset and 57.3% on the cross-validation dataset. An AUC close to 50% signifies random guessing on both the training dataset and the cross-validation dataset. The implication is that for the threshold p-value of 0.57, the model is randomly guessing and is not performing well.

C is incorrect because for the threshold p-value of 0.84, there is a substantial difference between the AUC on the training dataset (98.4%) and the AUC on the cross-validation dataset (87.1%). This suggests that the model performs comparatively poorly (with a higher rate of error or misclassification) on the cross-validation dataset when compared with the training data. Thus, the implication is that the model is overfitted.
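To see why an AUC near 50% corresponds to random guessing, it can help to compute AUC directly as the share of (positive, negative) pairs that the model ranks in the right order. The snippet below is only an illustrative sketch: the function name `auc` and the toy labels and scores are made up for this example and do not come from Exhibit 3.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): a model that separates the classes perfectly,
# and one that gives every observation the same score.
perfect = auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
uninformative = auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

Here `perfect` is 1.0 and `uninformative` is 0.5: a model whose scores carry no information about the class ranks a positive above a negative only half the time, which is the situation described for option A.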

Is the concept tested in this question important?
1 answer

星星_品职助教 · January 21, 2022

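Hello,

1) When ROC is used to evaluate model training, the rule is that the larger the AUC, the better the model performs (on that dataset). So first look for a high AUC, which immediately rules out option A, whose AUC is low.

2) Compared with option B, option C shows a large gap between its AUC on the training set and its AUC on the CV set. This means the model performs well only on the training set and (relatively) poorly on the CV set. That is the signature of overfitting: the model fits the current data very well but predicts poorly on new data.

For option B, the two AUCs are close, which means the model performs well on both the training set and the CV set, and about equally well on each. That makes it the "best fitting model".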
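The two-step reasoning above (first require a high AUC, then require a small gap between training and CV) can be sketched as a small check. The AUC figures are the ones quoted in the explanation; the cutoffs `MIN_AUC` and `MAX_GAP` and the function `assess` are hypothetical choices made only for this illustration, since the question itself gives no numeric cutoffs.

```python
# AUC values quoted in the explanation, expressed as fractions,
# keyed by threshold p-value.
candidates = {
    0.57: {"train": 0.567, "cv": 0.573},
    0.79: {"train": 0.913, "cv": 0.897},
    0.84: {"train": 0.984, "cv": 0.871},
}

MIN_AUC = 0.70  # hypothetical: below this, treat the AUC as too low
MAX_GAP = 0.05  # hypothetical: largest acceptable train-minus-CV gap


def assess(train, cv, min_auc=MIN_AUC, max_gap=MAX_GAP):
    """Label a model from its training and cross-validation AUC."""
    if min(train, cv) < min_auc:
        return "low AUC"
    if train - cv > max_gap:
        return "overfitted"
    return "good fit"


verdicts = {p: assess(**aucs) for p, aucs in candidates.items()}
best = [p for p, verdict in verdicts.items() if verdict == "good fit"]
```

Under these assumed cutoffs, 0.57 is labeled "low AUC", 0.84 is labeled "overfitted" (a gap of about 11 percentage points), and only 0.79 is labeled "good fit", which matches answer B.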

--------

Memorize these conclusions and you have this topic covered.