NO.PZ202304050200007202
Question:
Steele’s Step 2 can be best described as:
Options:
A. tokenization
B. lemmatization
C. standardization
Explanation:
A is correct. Tokenization is the process of splitting a given text into separate tokens. This step takes place after cleansing the raw text data (removing HTML tags, numbers, extra white spaces, etc.). The tokens are then normalized to create the bag-of-words (BOW).
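The three stages described above (cleansing, then tokenization, then normalization into a BOW) can be sketched in plain Python. This is a minimal illustration, not a full NLP pipeline; the function names and regex rules are assumptions for demonstration only.

```python
import re
from collections import Counter

def clean(raw):
    # Step 1 (cleansing): strip HTML tags, numbers, punctuation, extra whitespace
    text = re.sub(r"<[^>]+>", " ", raw)
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text):
    # Step 2 (tokenization): split the cleansed text into separate tokens
    return text.split(" ")

def bag_of_words(tokens):
    # Step 3 (normalization): lowercase the tokens, then count them to form the BOW
    return Counter(t.lower() for t in tokens)

raw = "<p>The 3 quick brown foxes  jumped!</p>"
tokens = tokenize(clean(raw))
print(tokens)             # ['The', 'quick', 'brown', 'foxes', 'jumped']
print(bag_of_words(tokens))
```

Note that lemmatization (reducing "foxes" to "fox", "jumped" to "jump") would belong to the later normalization stage, not to the splitting step itself.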
Follow-up question: B (lemmatization) reduces words to their base form — why is it incorrect here?