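Teacher, does this question mean we need to remove all of the stop words in Group 1 as well as "naval", the lowest-frequency word in Group 2? - Q&A - 品职教育 (training courses for CFA, ESG, FRM, CPA, graduate entrance exams, and other finance qualifications)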

NO.PZ2021083101000014

The question is as follows:

As an additional part of the text exploration step, Achler conducts a term frequency analysis to identify outliers. Achler summarizes the analysis in Exhibit 2.

Based on Exhibit 2, Achler should exclude from further analysis words in:
A. only Group 1
B. only Group 2
C. both Group 1 and Group 2

C is correct. Achler should remove words that are in both Group 1 and Group 2. Term frequency values range between 0 and 1. Group 1 consists of the highest frequency values (e.g., "the" = 0.04935), and Group 2 consists of the lowest frequency values (e.g., "naval" = 1.0123e-05). Frequency analysis on the processed text data helps in filtering unnecessary tokens (or features) by quantifying how important tokens are in a sentence and in the corpus as a whole. The most frequent tokens (Group 1) strain the machine-learning model to choose a decision boundary among the texts as the terms are present across all the texts, which leads to model underfitting. The least frequent tokens (Group 2) mislead the machine-learning model into classifying texts containing the rare terms into a specific class, which leads to model overfitting. Identifying and removing noise features is critical for text classification applications.

A is incorrect because words in both Group 1 and Group 2 should be removed. The words with high term frequency value are mostly stop words, present in most sentences. Stop words do not carry a semantic meaning for the purpose of text analyses and ML training, so they do not contribute to differentiating sentiment.

B is incorrect because words in both Group 1 and Group 2 should be removed. Terms with low term frequency value are mostly rare terms, ones appearing only once or twice in the data. They do not contribute to differentiating sentiment.

Topic tested: Unstructured Data Exploration

What frequency counts as intermediate?
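The filtering described in the explanation can be illustrated with a minimal sketch. It assumes term frequency is a token's count divided by the total number of tokens in the corpus, which keeps every value between 0 and 1. The toy corpus, the whitespace tokenization, and the `low` and `high` cutoffs are all hypothetical; the question text gives no numeric cutoffs, so in practice they are a judgment call made after inspecting the frequency distribution.

```python
from collections import Counter


def term_frequencies(docs):
    """Return each token's share of all tokens in the corpus (values in 0..1)."""
    counts = Counter(token for doc in docs for token in doc.lower().split())
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}


def filter_outliers(tf, low, high):
    """Keep only tokens whose term frequency lies within [low, high]."""
    return {token: freq for token, freq in tf.items() if low <= freq <= high}


# Hypothetical toy corpus: "the" plays the role of Group 1 (very frequent),
# and words such as "naval" play the role of Group 2 (very rare).
docs = [
    "the bank raised the rate",
    "the bank cut the rate",
    "the naval contract lifted the bank",
]

tf = term_frequencies(docs)

# Hypothetical cutoffs chosen for this toy corpus only.
kept = filter_outliers(tf, low=0.1, high=0.3)

for token, freq in sorted(tf.items(), key=lambda item: -item[1]):
    status = "keep" if token in kept else "drop"
    print(f"{token:10s} {freq:.4f} {status}")
```

In this sketch both tails are dropped and only the mid-frequency tokens survive, which mirrors answer C. Note that a value such as 0.04935 for "the" can still be the highest in a real corpus, because the frequencies of all tokens together sum to 1.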

2024-08-21 23:36 · 1 answer

NO.PZ2021083101000014 (same question and explanation as above)

Isn't frequency between 0 and 1?! The frequencies of these Group 1 words are only 0.0-something, which doesn't count as high, does it?

2022-07-26 10:42 · 1 answer

NO.PZ2021083101000014 (same question and explanation as above)

Teacher, is Group 1 removed only because these are all stop words? If they were other words instead, how would we judge whether they need to be removed?

2022-04-07 19:03 · 1 answer

NO.PZ2021083101000014 Is Group 2 removed because its frequency is too low?

2022-02-04 17:23 · 1 answer

Teacher, does this question mean we need to remove all of the stop words in Group 1 as well as "naval", the lowest-frequency word in Group 2?

1 answer


Related questions