小一一 · March 9, 2021

Regarding Statement I: why isn't the cutoff value in CART a hyperparameter?

NO.PZ2015120204000034

The question is as follows:

Paul suggests the following step which would be repeated every quarter.

Step 3 For each of the 20 different groups, we use labeled data to train a model that will predict the five stocks (in any given group) that are most likely to become acquisition targets in the next one year.

Comparing two ML models that could be used to accomplish Step 3, which statement(s) best describe(s) the advantages of using Classification and Regression Trees (CART) instead of K-Nearest Neighbor (KNN)?

Statement I For CART there is no requirement to specify an initial hyperparameter (like K).

Statement II For CART there is no requirement to specify a similarity (or distance) measure.

Statement III For CART the output provides a visual explanation for the prediction.

Options:

A. Statement I only

B. Statement III only

C. Statements I, II and III

Explanation:

C is correct. The advantages of using CART over KNN to classify companies into two categories (“not acquisition target” and “acquisition target”), include all of the following: For CART there are no requirements to specify an initial hyperparameter (like K) or a similarity (or distance) measure as with KNN, and CART provides a visual explanation for the prediction (i.e., the feature variables and their cut-off values at each node).

A is incorrect, because CART provides all of the advantages indicated in Statements I, II and III.

B is incorrect, because CART provides all of the advantages indicated in Statements I, II and III.

I don't quite understand why A is correct.

2 answers

Accepted answer

星星_品职助教 · March 9, 2021

Hello,

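In the CART algorithm, an initial hyperparameter means settings such as the maximum depth of the tree, the maximum number of nodes, and the minimum number of observations in each node. The cutoff value is not one of these: it is tied to a feature (the variable used to split the data), so there are as many cutoff values as there are features. It is not a single value that is set once and then governs how the whole model runs.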
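To make the distinction concrete, here is a minimal sketch of how a cutoff for one feature can be chosen from the data rather than set by the modeler. It assumes Gini impurity as the split criterion and uses hypothetical toy data; the names `gini` and `best_cutoff` are illustrative and do not come from the question or the curriculum.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_cutoff(values, labels):
    """Try midpoints between distinct feature values; return (cutoff, weighted impurity)."""
    distinct = sorted(set(values))
    best = (None, float("inf"))
    for lo, hi in zip(distinct, distinct[1:]):
        cutoff = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= cutoff]
        right = [y for x, y in zip(values, labels) if x > cutoff]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (cutoff, score)
    return best

# Hypothetical toy data: one feature and a 0/1 "acquisition target" label.
feature = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
label = [1, 1, 1, 0, 0, 0]

cutoff, impurity = best_cutoff(feature, label)
print(cutoff, impurity)
```

Nothing in the call to `best_cutoff` asks the user for a cutoff: the value comes out of the training data, which is why it behaves like a fitted result rather than a hyperparameter.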

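As for option A: CART can be given hyperparameters (such as those listed above), but they are not a requirement for using the algorithm. Only when the modeler believes CART may overfit would they choose to add such hyperparameters to reduce overfitting.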
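The point that such hyperparameters are optional can be sketched with a toy tree builder. This is a simplified illustration under stated assumptions (Gini impurity, greedy splits that must reduce impurity, hypothetical data), not a full CART implementation; `grow`, `best_split`, and `depth_of` are illustrative names.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, cutoff) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for j in range(len(rows[0])):
        distinct = sorted({r[j] for r in rows})
        for lo, hi in zip(distinct, distinct[1:]):
            cut = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= cut]
            right = [y for r, y in zip(rows, labels) if r[j] > cut]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (j, cut), score
    return best

def grow(rows, labels, max_depth=None, depth=0):
    """Grow a tree. max_depth is optional; None means no depth limit."""
    split = None
    if max_depth is None or depth < max_depth:
        split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    j, cut = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= cut]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > cut]
    return {
        "feature": j,
        "cutoff": cut,
        "left": grow([r for r, _ in left], [y for _, y in left], max_depth, depth + 1),
        "right": grow([r for r, _ in right], [y for _, y in right], max_depth, depth + 1),
    }

def depth_of(node):
    """Number of split levels below this node (a leaf has depth 0)."""
    if not isinstance(node, dict):
        return 0
    return 1 + max(depth_of(node["left"]), depth_of(node["right"]))

# Hypothetical toy data: a single feature, stored as 1-tuples.
rows = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,)]
labels = [0, 0, 1, 1, 0, 0]

full_tree = grow(rows, labels)                  # no hyperparameter supplied
shallow_tree = grow(rows, labels, max_depth=1)  # optional limit against overfitting
print(depth_of(full_tree), depth_of(shallow_tree))
```

The first call runs without any hyperparameter, while the second adds `max_depth` only as a deliberate choice to keep the tree small.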

Moreover, adding hyperparameters is not the only way to reduce overfitting. The tree can also be pruned after it has been built, which likewise does not require hyperparameters.

So the "no requirement" wording about hyperparameters in Statement I is correct.
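For contrast, a minimal KNN sketch shows why Statements I and II single out K and the distance measure: the prediction function cannot run without both. The function names, the Euclidean distance choice, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_rows, train_labels, query, k, distance):
    """Majority vote among the k training rows closest to query.

    Both k and distance are required arguments: KNN has no default answer
    for "how many neighbours" or "what counts as close".
    """
    ranked = sorted(zip(train_rows, train_labels),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: a single feature, stored as 1-tuples.
rows = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
labels = [1, 1, 1, 1, 0]
query = (5.2,)

pred_k1 = knn_predict(rows, labels, query, k=1, distance=euclidean)
pred_k3 = knn_predict(rows, labels, query, k=3, distance=euclidean)
print(pred_k1, pred_k3)
```

With this data the same query is classified differently for `k=1` and `k=3`, so the choice of K has to be made before any prediction is possible.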

廿三里 · March 17, 2021

Why isn't the cutoff value a hyperparameter? Doesn't it also have to be set before the model is built?

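星星_品职助教 · March 17, 2021

@廿三里

The first paragraph of the reply to this question explains exactly why the cutoff value is not a hyperparameter.

The reply reads: "In the CART algorithm, an initial hyperparameter means settings such as the maximum depth of the tree, the maximum number of nodes, and the minimum number of observations in each node. The cutoff value is not one of these: it is tied to a feature (the variable used to split the data), so there are as many cutoff values as there are features. It is not a single value that is set once and then governs how the whole model runs."

Being set in advance does not by itself make something a hyperparameter. In supervised learning, all of the features are set in advance.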
