天气不错 · February 14, 2021

Doesn’t an “Interest Coverage Ratio” variable equal to EBIT divided by interest expense count as aggregation?

NO.PZ2015120204000043

Question:

After cleansing the data, Steele then preprocesses the dataset. She creates two new variables: an “Age” variable based on the firm’s IPO date and an “Interest Coverage Ratio” variable equal to EBIT divided by interest expense. She also deletes the “IPO Date” variable from the dataset.

Exhibit 1 Sample of Raw Structured Data Before Cleansing


During the preprocessing of the data in Exhibit 1, what type of data transformation did Steele perform during the data preprocessing step?

Options:

A. Extraction

B. Conversion

C. Aggregation

Explanation:

A is correct. During the data preprocessing step, Steele created a new “Age” variable based on the firm’s IPO date and then deleted the “IPO Date” variable from the dataset. She also created a new “Interest Coverage Ratio” variable equal to EBIT divided by interest expense. Extraction refers to a data transformation where a new variable is extracted from a current variable for ease of analyzing and using for training an ML model, such as creating an age variable from a date variable or a ratio variable. Steele also performed a selection transformation by deleting the IPO Date variable, which refers to deleting the data columns that are not needed for the project.

1 answer
Accepted answer

星星_品职助教 · February 14, 2021

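Hello,

Aggregation is defined as follows: similar variables can be summed or combined into a single new variable. For example, salary and other income are just two forms of income, so they can be combined into one “total income” variable.

So EBIT and interest expense are not similar variables, and the new variable is not the result of summing or combining them. It is a ratio.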
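To make the aggregation definition concrete, here is a minimal Python sketch of the salary example. The record, its field names, and the figures are hypothetical and chosen only for illustration; they are not from the exam question.

```python
# Hypothetical record: field names and amounts are illustrative assumptions.
record = {"salary": 80_000.0, "other_income": 5_000.0}


def aggregate_income(row):
    """Aggregation: sum two similar variables into one new variable."""
    return row["salary"] + row["other_income"]


# Both inputs are forms of income, so adding them gives a meaningful total.
record["total_income"] = aggregate_income(record)
print(record["total_income"])
```

Adding EBIT to interest expense in the same way would not give a meaningful combined measure, which is one way to see why the ratio is not an aggregation.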

-------------

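Extraction, by contrast, is defined as deriving a new variable from existing variables to make analysis easier. For example, a date of birth in month-day-year form is not intuitive to analyze directly, so a new “age” variable can be extracted from it.

So both creating the “Age” variable from “IPO Date” and creating the “Interest Coverage Ratio” variable from “EBIT” and “interest expense” are applications of extraction.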
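The preprocessing Steele performs could be sketched in Python roughly as follows. This is only an illustration under assumptions: the firm record, its field names, the figures, the reference date, and the helper names `years_since` and `preprocess` are all hypothetical, since Exhibit 1 is not reproduced here. Treating a zero interest expense as an undefined ratio is also my own choice.

```python
from datetime import date

# Hypothetical firm record: names and values are invented for illustration.
firm = {"ipo_date": date(2010, 6, 15), "ebit": 1200.0, "interest_expense": 300.0}


def years_since(start, as_of):
    """Whole years elapsed from start to as_of."""
    years = as_of.year - start.year
    # Subtract one if the anniversary has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (start.month, start.day):
        years -= 1
    return years


def preprocess(row, as_of):
    out = dict(row)  # copy so the raw record is left unchanged
    # Extraction: derive "age" from the IPO date.
    out["age"] = years_since(row["ipo_date"], as_of)
    # Extraction: derive a ratio from two existing variables.
    # Assumption: a zero interest expense yields None instead of an error.
    expense = row["interest_expense"]
    out["interest_coverage_ratio"] = row["ebit"] / expense if expense != 0 else None
    # Selection: drop the column that is no longer needed.
    del out["ipo_date"]
    return out


processed = preprocess(firm, as_of=date(2021, 2, 14))
print(processed)
```

The final `del` corresponds to the selection transformation mentioned in the official explanation, which is separate from the two extraction steps.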

天气不错 · February 14, 2021

Got it. “Similar variables” and “summing”: very clear, thanks!