
aileen20180623 · January 19, 2022

Meaning of Statement 3

* For the full question details, see the question stem

NO.PZ202108310100000102

Question:

Which of Bector’s statements regarding TF, IDF, and TF–IDF is correct?

Options:

A. Statement 1

B. Statement 2

C. Statement 3

Explanation:

C is correct.

Statement 3 is correct. TF–IDF values vary by the number of documents in the dataset, and therefore, the model performance can vary when applied to a dataset with just a few documents.

Statement 1 is incorrect because IDF is calculated as the log of the inverse, or reciprocal, of the document frequency measure.

Statement 2 is incorrect because TF at the sentence (not collection) level is multiplied by IDF to calculate TF–IDF.

A is incorrect because Statement 1 is incorrect. IDF is calculated as the log of the inverse, or reciprocal, of the document frequency (DF) measure.

B is incorrect because Statement 2 is incorrect. TF at the sentence (not collection) level is multiplied by IDF to calculate TF–IDF.
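The corrections to Statements 1 and 2 can be made concrete with a small sketch. It assumes a made-up three-sentence collection, whitespace tokenization, and the natural log for IDF; none of these come from the question itself, and the variable names are illustrative only.

```python
import math

# Hypothetical toy collection; each sentence is treated as one document.
sentences = [
    "the market rallied",
    "the market fell sharply today",
    "bond yields rose",
]
tokens = [s.split() for s in sentences]
word = "market"

# DF: share of sentences that contain the word.
df = sum(word in t for t in tokens) / len(tokens)

# IDF: log of the inverse (reciprocal) of DF, per the Statement 1 correction.
# The natural log is an assumption here.
idf = math.log(1 / df)

# TF at the sentence level: occurrences of the word / words in that sentence.
tf_sentence = [t.count(word) / len(t) for t in tokens]

# TF at the collection level: occurrences of the word / words in all sentences.
tf_collection = sum(t.count(word) for t in tokens) / sum(len(t) for t in tokens)

# TF-IDF uses the sentence-level TF, per the Statement 2 correction.
tfidf = [tf * idf for tf in tf_sentence]

print(df, idf, tf_sentence, tf_collection, tfidf)
```

Note that `tf_collection` is a single number for the whole collection, while `tfidf` has one value per sentence, which is why the sentence-level TF is the one multiplied by IDF.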

What does Statement 3 mean? Why is 3 the answer, and where is this covered in the course materials?

1 answer

星星_品职助教 · January 20, 2022

Hello,

Statement 3 means that the number of documents affects the computed TF-IDF values, so when the dataset contains only a few documents, model performance changes as well.

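For example, TF-IDF comes out differently depending on whether a document is defined as a paragraph, an entire article, or a single sentence, and different TF-IDF values correspond to different model performance.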

This statement is a conclusion taken directly from the original curriculum text; it is enough to be aware of it.
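The granularity example in the answer above can be sketched roughly as follows. The text, the helper `tf_idf_scores`, whitespace tokenization, and the natural log are all assumptions made for illustration; the IDF is written as log(N / n), which equals the log of 1 / DF.

```python
import math

def tf_idf_scores(documents, word):
    """Per-document TF-IDF for `word` (hypothetical helper, natural log assumed)."""
    tokenized = [d.lower().split() for d in documents]
    containing = sum(word in t for t in tokenized)
    if containing == 0:
        return [0.0] * len(tokenized)
    # IDF = log(number of documents / documents containing the word) = log(1 / DF)
    idf = math.log(len(tokenized) / containing)
    return [t.count(word) / len(t) * idf if t else 0.0 for t in tokenized]

# The same made-up text, split into documents at three granularities.
sentences = [
    "inflation surprised markets",
    "equities fell",
    "bonds sold off",
    "the dollar strengthened",
]
paragraphs = [" ".join(sentences[:2]), " ".join(sentences[2:])]
whole_text = [" ".join(sentences)]

by_sentence = tf_idf_scores(sentences, "inflation")    # 4 documents
by_paragraph = tf_idf_scores(paragraphs, "inflation")  # 2 documents
by_text = tf_idf_scores(whole_text, "inflation")       # 1 document

print(by_sentence[0], by_paragraph[0], by_text[0])
```

With a single document the word appears in every document, so its IDF is log(1) = 0 and the TF-IDF carries no information. This is the extreme case of the "just a few documents" situation that Statement 3 describes.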