dejiazheng · February 19, 2025

How do I tell which of default and non-default is positive and which is negative?

NO.PZ2024082801000116

Question:

Googol Mining 2 Case Scenario

James Quinn is the founder and CEO of GoogolMining Inc. (GMI). The company applies data science and machine learning to assist lenders in predicting the credit risk of their customers.

Today, Quinn is meeting with Judy Wu, the chief lending officer at onLineLoans.com. Wu’s company already uses some of Quinn’s machine learning (ML) methods to better screen loan applicants whose credit files are usually thin—that is, having limited or no past credit history and low incomes, making traditional credit evaluation difficult. Quinn is introducing Wu to the benefits of using text-based data that can be found in the loan applications to further aid the screening process.

Quinn states that the choice and pattern of words used by applicants in describing the loan request and its purpose and the use of pleading terms, excessive uppercase words, and spelling mistakes can all be very informative. He provides the following excerpts from two loan applications (Exhibit 1).

Exhibit 1:

Selected Text-Based Data from a Few Credit Applications

Wu tells Quinn that she has heard a little about text mining for clues about an individual’s behavior and recalls that text preparation must be carried out by removing such items as HTML tags, punctuation, numbers, and stop words and eliminating the distinction between uppercase and lowercase words by lowercasing them all.
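The preparation steps Wu lists can be sketched in a few lines of Python. This is only an illustration using the standard library: the stop-word list is a tiny made-up set, the sample sentence is hypothetical (it is not taken from Exhibit 1), and the order of the steps is an assumption.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "i", "my", "to", "for", "is", "of"}

def clean_text(raw: str) -> list:
    """Apply the preparation steps Wu lists and return the remaining tokens."""
    text = re.sub(r"<[^>]+>", " ", raw)   # remove HTML tags
    text = re.sub(r"[^\w\s]", " ", text)  # remove punctuation
    text = re.sub(r"\d+", " ", text)      # remove numbers
    text = text.lower()                   # lowercase everything
    # Split on whitespace and drop stop words.
    return [tok for tok in text.split() if tok not in STOP_WORDS]

# Hypothetical application text, for illustration only.
sample = "<p>I NEED $5000 for my wedding, PLEASE help!</p>"
print(clean_text(sample))  # ['need', 'wedding', 'please', 'help']
```

Note that lowercasing discards the "excessive uppercase" signal Quinn mentions, so a feature such as an uppercase-word count would have to be computed before this step.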

Quinn subdivides the word clouds generated from the document term matrix (DTM) into defaulting and non-defaulting loans and selectively illustrates how the words, n-grams, and more common themes differ between the two groups:

  • The wording in non-defaulting loans often has indications of a brighter future—for example, “graduate” and “wedding”—and reasons why the loan is needed, such as “student_loan” or “car_insur.”
  • The wording in defaulting loans often contains statements of desperation and pleading terms—for example, “and_need” and “your_help”—as well as detailed explanations as to why these circumstances have arisen, such as “loan_explain” and “loan_because.”
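Tokens such as "your_help" and "loan_because" are bigrams: two adjacent words joined by an underscore. A minimal sketch of counting them separately for each class, which is the kind of tally a word cloud visualizes, might look like this. The token lists are hypothetical and already cleaned; they are not the data behind Quinn's word clouds.

```python
from collections import Counter

def bigrams(tokens):
    """Join adjacent tokens with an underscore, e.g. your + help -> your_help."""
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

# Hypothetical (label, tokens) pairs: 1 = defaulted, 0 = did not default.
docs = [
    (1, ["and", "need", "your", "help"]),
    (1, ["need", "your", "help", "loan", "because"]),
    (0, ["student", "loan", "graduate"]),
]

# One bigram frequency table per class.
counts = {0: Counter(), 1: Counter()}
for label, tokens in docs:
    counts[label].update(bigrams(tokens))

print(counts[1]["your_help"], counts[0]["your_help"])  # 2 0
```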

Quinn uses logistic regression to train the model: Defaulting loans are classified as Class 1, and non-defaulting loans are classified as Class 0. After tuning, the threshold p-value of 0.60 (for Class 1) is used to predict the outcome for each loan application in the test data. Exhibit 2 indicates the result for one of those applications.
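The "threshold p-value" here is a cutoff on the predicted probability of Class 1, not a hypothesis-test p-value. A rough sketch of how such a cutoff turns a logistic-regression score into a class label is shown below. The linear scores are hypothetical, and treating a probability exactly equal to the threshold as Class 1 is an assumption the case does not settle.

```python
import math

THRESHOLD = 0.60  # cutoff for Class 1 (default), as stated in the case

def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(p_default: float, threshold: float = THRESHOLD) -> int:
    """Return 1 (default) if the probability meets the threshold, else 0."""
    # Assumption: a probability exactly at the threshold counts as Class 1.
    return 1 if p_default >= threshold else 0

# Hypothetical linear scores, for illustration only.
for z in (-1.0, 0.0, 1.0):
    p = sigmoid(z)
    print(f"z={z:+.1f}  p={p:.2f}  class={predict_class(p)}")
```

With a 0.60 cutoff, a probability of 0.50 is still assigned to Class 0. Raising the threshold above 0.50 makes the model slower to flag a default.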

Exhibit 2:


After running his model on the test set, Quinn produces a confusion matrix for evaluating the performance of the model (Exhibit 3). He reminds Wu that since the number of defaults in the dataset is likely much smaller than the number of non-defaults, this needs to be considered in evaluating model performance.
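Quinn's reminder about class imbalance can be made concrete: when defaults are rare, accuracy can look good even though most defaults are missed, which is why precision, recall, and F1 are usually reported as well. The counts below are hypothetical and are NOT the values in Exhibit 3.

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard confusion-matrix metrics, with Class 1 (default) as positive.

    Assumes tp + fp > 0 and tp + fn > 0, so no division by zero occurs.
    """
    precision = tp / (tp + fp)  # share of predicted defaults that defaulted
    recall = tp / (tp + fn)     # share of actual defaults that were caught
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts: 100 actual defaults among 1,000 loans.
m = metrics(tp=20, fp=10, fn=80, tn=890)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example accuracy is 0.910, yet recall is only 0.200: 80 of the 100 defaults are false negatives (Type II errors).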

Exhibit 3:

Confusion Matrix of Model Results for Test Data of Loan Default Classifications with Threshold p-Value = 0.60

Based on Exhibit 2, the most appropriate statement about the model’s performance for the selected credit applicant is that it results in:

Options:

A. a Type I error.
B. a Type II error.
C. the correct classification.

Explanation:

  • A is Incorrect. A Type I error is a false positive: It would have arisen if the loan did not default but was predicted to do so.

  • B is Correct. The threshold p-value for Class 1 (default) is 0.60, which has not been met (p = 0.41); thus, the final ML model predicts that the applicant would be a non-defaulter (Class 0). The loan has been misclassified as not being likely to default when it defaulted. This is a Type II error (a false negative).

  • C is Incorrect. A misclassification has occurred, resulting in a Type II error.

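I treated default as positive. Because the probability was below the threshold, I classified the result as a false positive (FP) and picked Type I error.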

1 answer

品职助教_七七 · February 20, 2025

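Hi, hard-working PZer:


Positive is defined as the Class 1 case and negative as the Class 0 case. This convention has to be memorized.

The vignette states that defaulted loans are Class 1, so "defaulted" is the positive class.

Positive or negative describes the predicted class; true or false describes whether that prediction matches the actual outcome. Here the predicted result is 0 and the actual result is 1, so (1) the prediction is false, and (2) because the predicted result is 0, it is negative. The prediction is therefore a false negative.

A false negative is defined as a Type II error.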
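The rule above can be checked mechanically. The sketch below uses the values given in the case and its explanation (threshold 0.60, p = 0.41, the loan actually defaulted). The function name and the tie-handling at the threshold are illustrative assumptions.

```python
def confusion_cell(actual: int, predicted: int) -> str:
    """Name the confusion-matrix cell, with Class 1 as the positive class."""
    # true/false: does the prediction match the actual outcome?
    correctness = "true" if actual == predicted else "false"
    # positive/negative: taken from the PREDICTED class, not the actual one.
    sign = "positive" if predicted == 1 else "negative"
    return f"{correctness} {sign}"

ERROR_TYPE = {"false positive": "Type I error", "false negative": "Type II error"}

threshold, p, actual = 0.60, 0.41, 1     # values from the case and its explanation
predicted = 1 if p >= threshold else 0   # 0.41 < 0.60, so predicted = 0
cell = confusion_cell(actual, predicted)
print(cell, "->", ERROR_TYPE.get(cell, "correct classification"))
# false negative -> Type II error
```

A false positive would need the opposite combination: a prediction of Class 1 (p at or above 0.60) for a loan that did not default.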

