西红柿面 · April 19, 2023

An outlier doesn't necessarily raise R², right?

NO.PZ2021061603000047

The question is as follows:

An economist collected the monthly returns for KDL's portfolio and a diversified stock index. The data collected are shown in the following table:


The economist calculated the correlation between the two returns and found it to be 0.996. The regression results with the KDL return as the dependent variable and the index return as the independent variable are given as follows:


When reviewing the results, Andrea Fusilier suspected that they were unreliable. She found that the returns for Month 2 should have been 7.21% and 6.49%, instead of the large values shown in the first table. Correcting these values resulted in a revised correlation of 0.824 and the following revised regression results:


Explain how the bad data affected the results.


Options:

Explanation:

The Month 2 data point is an outlier, lying far away from the other data values.

Because this outlier was caused by a data entry error, correcting the outlier improves the validity and reliability of the regression. In this case, the revised R² is lower (from 0.9921 to 0.6784). The outliers created the illusion of a better fit from the higher R²; the outliers altered the estimate of the slope. The standard error of the estimate is lower when the data error is corrected (from 2.861 to 2.0624), as a result of the lower mean square error. However, at a 0.05 level of significance, both models fit well. The difference in the fit is illustrated in Exhibit 1:


The outliers created the illusion of a better fit from the higher R²; the outliers altered the estimate of the slope.

This conclusion isn't universally applicable, right? It still has to be judged case by case?


1 answer
Accepted answer

星星_品职助教 · April 19, 2023

Hello,

Yes. It has to be analyzed case by case.
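To make the "case by case" point concrete, here is a minimal sketch with made-up numbers (hypothetical data, not the table from the question; the helper `simple_ols` is also just an illustrative name). It fits a one-variable least-squares line three times: on a small base sample, with an extra far-away point that happens to lie near the trend line, and with an extra point that lies far off the trend line. Under these assumptions the first outlier pushes R² up and the second pushes it down.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (slope, intercept, r_squared) for a one-variable least-squares fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # In simple linear regression, R^2 equals the squared sample correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical base sample: a moderately noisy upward relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

base = simple_ols(x, y)
# Extreme point that sits close to the existing trend line.
on_trend = simple_ols(x + [60.0], y + [60.0])
# Extreme y value at an ordinary x value, far off the trend line.
off_trend = simple_ols(x + [3.5], y + [60.0])

for label, (slope, intercept, r2) in [
    ("base", base),
    ("outlier near trend", on_trend),
    ("outlier off trend", off_trend),
]:
    print(f"{label:>20}: slope={slope:.4f} intercept={intercept:.4f} R^2={r2:.4f}")
```

In the question, the bad Month 2 values were large for both returns, which resembles the first case; that is why R² looked higher before the correction. An erroneous point of the second kind would have had the opposite effect.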

Related questions

NO.PZ2021061603000047 (same question as above): What does the standard error listed below the coefficient of determination R² refer to?

2023-07-17 13:34 · 1 answer

NO.PZ2021061603000047 (same question as above): What is the source of this question?

2022-10-20 10:58 · 1 answer

NO.PZ2021061603000047 (same question as above): Teacher He always says that when a p-value is given there is no need to look at the t-statistic, and also that the p-value is an area cut off under the distribution. So is that area a percentage that gets compared with α? But the p-values in the question are all numbers, and isn't the t-statistic also a computed value of the test statistic? What is the difference between the two? Could a teacher please explain? Thank you!

2022-10-08 11:34 · 2 answers

NO.PZ2021061603000047 (same question as above): What is the purpose of computing the correlation mentioned in the question?

2022-07-29 20:19 · 1 answer