
梨花0901 · December 13, 2022

Why isn't the model with the largest sum of the two AUCs the best model?

NO.PZ2021083101000016

Question:

Achler splits the DTM into training, cross-validation, and test datasets. Achler uses a supervised learning approach to train the logistic regression model in predicting sentiment. Applying the receiver operating characteristics (ROC) technique and area under the curve (AUC) metrics, Achler evaluates model performance on both the training and the cross-validation datasets. The trained model performance for three different logistic regressions’ threshold p-values is presented in Exhibit 3.

Based on Exhibit 3, which threshold p-value indicates the best fitting model?

Options:

A. 0.57
B. 0.79
C. 0.84

Explanation:

B is correct. The higher the AUC, the better the model performance. For the threshold p-value of 0.79, the AUC is 91.3% on the training dataset and 89.7% on the cross-validation dataset, and the ROC curves for model performance are similar on both datasets. These findings suggest that the model performs similarly on both training and CV data and thus indicate a well-fitting model.

A is incorrect because for the threshold p-value of 0.57, the AUC is 56.7% on the training dataset and 57.3% on the cross-validation dataset. An AUC close to 50% signifies random guessing on both the training dataset and the cross-validation dataset. The implication is that for the threshold p-value of 0.57, the model is randomly guessing and is not performing well.

C is incorrect because for the threshold p-value of 0.84, there is a substantial difference between the AUC on the training dataset (98.4%) and the AUC on the cross-validation dataset (87.1%). This suggests that the model performs comparatively poorly (with a higher rate of error or misclassification) on the cross-validation dataset when compared with the training data. Thus, the implication is that the model is overfitted.
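The selection rule described in the explanation can be sketched in a few lines of Python. The AUC figures below are the ones quoted from Exhibit 3; the cutoffs `max_gap` and `min_cv_auc` are illustrative assumptions, not values given in the question.

```python
# AUC figures quoted in the explanation (Exhibit 3), keyed by threshold p-value.
exhibit_3 = {
    0.57: {"train_auc": 0.567, "cv_auc": 0.573},
    0.79: {"train_auc": 0.913, "cv_auc": 0.897},
    0.84: {"train_auc": 0.984, "cv_auc": 0.871},
}

def best_fitting_threshold(results, max_gap=0.05, min_cv_auc=0.60):
    """Pick the threshold with the highest cross-validation AUC among
    candidates that (a) clearly beat random guessing (AUC ~ 50%) on the
    CV set and (b) show a train/CV AUC gap small enough to rule out
    obvious overfitting. max_gap and min_cv_auc are assumed cutoffs."""
    candidates = {
        p: r for p, r in results.items()
        if r["cv_auc"] >= min_cv_auc
        and abs(r["train_auc"] - r["cv_auc"]) <= max_gap
    }
    return max(candidates, key=lambda p: candidates[p]["cv_auc"])

print(best_fitting_threshold(exhibit_3))  # 0.79
```

With these cutoffs, 0.57 is dropped for being close to random guessing, 0.84 is dropped for its 11.3-point train/CV gap (overfitting), and 0.79 remains, matching answer B.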

Topic: Model Training: Performance Evaluation


1 answer

星星_品职助教 · December 14, 2022

Hi there,

1) The decision rule for AUC is that it should be as large as possible on both the training set and the validation set, which is essentially the same idea as "the largest sum of the two." Since you can already see how the AUC performs on the training set and the validation set separately, there is no need to add a summing step on top of that.

2) If you look only at the summed number, you lose sight of how the AUC performs on the training set and the validation set individually, which can easily lead to a wrong judgment.

3) Looking only at the sum also ignores the problem of overfitting. This is the same logic as why option C is not chosen in this question: the AUC on the training set is too large relative to the validation set.
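Points 2) and 3) can be checked directly with the Exhibit 3 numbers: the sum actually prefers the overfitted model at p = 0.84, which is exactly why the summing rule fails here.

```python
# Exhibit 3 AUC figures for the two candidate thresholds.
model_b = {"train_auc": 0.913, "cv_auc": 0.897}   # threshold p-value 0.79
model_c = {"train_auc": 0.984, "cv_auc": 0.871}   # threshold p-value 0.84

sum_b = model_b["train_auc"] + model_b["cv_auc"]  # 1.810
sum_c = model_c["train_auc"] + model_c["cv_auc"]  # 1.855

# The "largest sum" rule would pick the overfitted model C...
assert sum_c > sum_b

# ...even though C generalizes worse: lower CV AUC and a much
# larger train/CV gap, the textbook signature of overfitting.
gap_b = model_b["train_auc"] - model_b["cv_auc"]  # 0.016
gap_c = model_c["train_auc"] - model_c["cv_auc"]  # 0.113
assert model_c["cv_auc"] < model_b["cv_auc"]
assert gap_c > gap_b
```

So the sum hides exactly the information (the gap between the two AUCs) that identifies option C as overfitted.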

