STATSML 604: Car Loan Propensity Prediction using Logistic Regression
Overview
Predicting whether a customer is likely to take a car loan can significantly improve a bank’s ability to design targeted campaigns, manage credit risk, and optimize resource allocation. In this tutorial, we build a Logistic Regression model to classify a customer's car loan propensity using their profile and transaction behavior.
Objective
Build and deploy a classification model that predicts if a customer is likely to opt for a car loan, based on:
Demographic and behavioral data
Financial account balances
Derived income and loan features
Sample Dataset Structure
| Field | Type | Description |
| --- | --- | --- |
| customer_id | STRING | Unique identifier |
| age | INT | Customer age |
| gender | STRING | Male / Female / Other |
| marital_status | STRING | Married / Single / Divorced |
| employment_status | STRING | Employed / Self-employed / Retired etc. |
| annual_income | DECIMAL | Yearly income |
| credit_score | INT | Credit bureau score |
| checking_balance | DECIMAL | Balance in checking account |
| savings_balance | DECIMAL | Balance in savings account |
| monthly_debit_volume | DECIMAL | Monthly average spending |
| monthly_credit_volume | DECIMAL | Monthly average income |
| loan_history | ARRAY<STRUCT> | Past loans (type, amount, status) |
| existing_auto_loan | BOOLEAN | Existing car loan |
| owns_vehicle | BOOLEAN | Whether customer owns a car |
| propensity_car_loan | FLOAT | Target: Likelihood (0-1) of taking a loan |
Derived Features
These features are engineered to improve model performance:
| Feature | Definition |
| --- | --- |
| savings_to_income_ratio | savings_balance / annual_income |
| debt_to_income_ratio | (monthly_debit_volume * 12) / annual_income |
| avg_monthly_net_income | monthly_credit_volume - monthly_debit_volume |
| loan_count | COUNT(loan_history) |
| previous_auto_loans | COUNT WHERE loan_type = 'Auto' |
| good_credit_flag | credit_score >= 700 |
| high_cash_reserve_flag | checking_balance + savings_balance > 10000 |
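The derived features above can be sketched in plain Python for a single customer record. This is an illustration only: the record is assumed to be a dictionary keyed by the field names from the dataset table, each loan_history entry is assumed to carry a `loan_type` key, and the zero-income guard and the example values are assumptions, not part of the original pipeline.

```python
from typing import Any


def derive_features(customer: dict[str, Any]) -> dict[str, Any]:
    """Compute the derived features for one customer record."""
    income = customer["annual_income"]
    loans = customer.get("loan_history", [])
    return {
        # Assumption: ratios fall back to 0.0 when no income is recorded.
        "savings_to_income_ratio": customer["savings_balance"] / income if income else 0.0,
        "debt_to_income_ratio": customer["monthly_debit_volume"] * 12 / income if income else 0.0,
        "avg_monthly_net_income": customer["monthly_credit_volume"] - customer["monthly_debit_volume"],
        "loan_count": len(loans),
        # Assumption: each loan entry exposes its type under "loan_type".
        "previous_auto_loans": sum(1 for loan in loans if loan["loan_type"] == "Auto"),
        "good_credit_flag": customer["credit_score"] >= 700,
        "high_cash_reserve_flag": customer["checking_balance"] + customer["savings_balance"] > 10000,
    }


# Made-up record, for illustration only.
example = {
    "annual_income": 72000,
    "credit_score": 715,
    "checking_balance": 3500,
    "savings_balance": 18000,
    "monthly_debit_volume": 4200,
    "monthly_credit_volume": 6000,
    "loan_history": [
        {"loan_type": "Auto", "amount": 15000, "status": "Closed"},
        {"loan_type": "Personal", "amount": 5000, "status": "Active"},
    ],
}
features = derive_features(example)
print(features)
```

Note that debt_to_income_ratio, as defined here, annualizes monthly spending rather than debt payments, so it behaves more like a spending-to-income ratio.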
Feature Weight Intuition
| Factor | Weight | Rationale |
| --- | --- | --- |
| Not having an auto loan | 0.30 | More likely to consider buying |
| Doesn’t own a vehicle | 0.20 | May need a car, hence loan |
| Good credit score | 0.20 | More eligible for credit |
| Income > $60K | 0.10 | Likely to get approved |
| Checking balance > $2K | 0.10 | Has funds for down payment |
| Net income > $2K | 0.10 | Better repayment capacity |
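One way to read these weights is as an additive score: each condition that holds contributes its weight, and the weights sum to 1.0. The sketch below assumes that additive combination, which the text does not state explicitly. It also assumes that "good credit" means credit_score >= 700 (borrowed from good_credit_flag) and that "net income" is the monthly credit volume minus the monthly debit volume. The signal names and example record are hypothetical.

```python
# Weights copied from the intuition table; combining them by simple
# addition is an assumption made for illustration.
WEIGHTS = {
    "no_auto_loan": 0.30,
    "no_vehicle": 0.20,
    "good_credit": 0.20,
    "income_over_60k": 0.10,
    "checking_over_2k": 0.10,
    "net_income_over_2k": 0.10,
}


def heuristic_score(c: dict) -> float:
    """Sum the weights of every condition the customer satisfies."""
    net_income = c["monthly_credit_volume"] - c["monthly_debit_volume"]
    signals = {
        "no_auto_loan": not c["existing_auto_loan"],
        "no_vehicle": not c["owns_vehicle"],
        "good_credit": c["credit_score"] >= 700,   # assumed threshold
        "income_over_60k": c["annual_income"] > 60000,
        "checking_over_2k": c["checking_balance"] > 2000,
        "net_income_over_2k": net_income > 2000,   # assumed to be monthly
    }
    return round(sum(WEIGHTS[name] for name, on in signals.items() if on), 2)


# Made-up record: no auto loan, owns a vehicle, good credit,
# income and checking balance above the thresholds, net income below.
example = {
    "existing_auto_loan": False,
    "owns_vehicle": True,
    "credit_score": 715,
    "annual_income": 72000,
    "checking_balance": 3500,
    "monthly_credit_volume": 6000,
    "monthly_debit_volume": 4200,
}
print(heuristic_score(example))  # 0.30 + 0.20 + 0.10 + 0.10 = 0.7
```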
Model Definition
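The original model definition is not reproduced in this section. As a stand-in, the following is a minimal sketch of logistic regression fitted by batch gradient descent in plain Python. It is not the tutorial's actual training statement. The three binary input flags, the toy labels, the learning rate, and the epoch count are all assumptions chosen for illustration, and the sketch uses a binary label, whereas the dataset's propensity_car_loan target is a 0-1 float that would first need thresholding.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data, made up for illustration.
# Columns: [no_auto_loan, no_vehicle, good_credit]
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1],
     [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
print("weights:", w, "bias:", b)
print("P(loan | all flags set):", predict_proba(w, b, [1, 1, 1]))
```

In practice a library implementation with regularization and feature scaling would normally replace this hand-written loop; the sketch only shows what the model computes.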
Model Evaluation
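The evaluation step reports AUC ROC, accuracy, precision, and recall. The sketch below shows how these four metrics can be computed from true labels and predicted probabilities in plain Python. The 0.5 decision threshold, the function names, and the sample labels and probabilities are assumptions for illustration, not the tutorial's evaluation data.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, and recall at a fixed decision threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n^2), which is fine
    for a small illustration but not for large datasets.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

metrics = classification_metrics(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(metrics, auc)
```

Accuracy, precision, and recall depend on the chosen threshold, while AUC ROC summarizes ranking quality across all thresholds.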
Results
| Metric | Value |
| --- | --- |
| AUC ROC | 0.9362 |
| Accuracy | 0.9361 |
| Precision | 0.9367 |
| Recall | 0.9372 |
Predict on New Customers
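Scoring new customers amounts to applying the fitted coefficients to each customer's features and passing the result through the logistic function. The sketch below illustrates that step in plain Python. The coefficient values, the intercept, the feature names, the 0.5 cutoff, and the customer records are all hypothetical stand-ins, not output from the tutorial's model.

```python
import math

# Hypothetical coefficients standing in for a fitted model's parameters.
COEFFICIENTS = {"no_auto_loan": 1.2, "no_vehicle": 0.8, "good_credit": 0.9}
INTERCEPT = -1.5


def score_customer(features: dict) -> float:
    """Propensity = sigmoid(intercept + sum of coefficient * feature)."""
    z = INTERCEPT + sum(coef * float(features[name]) for name, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_customers(customers, threshold=0.5):
    """Return (customer_id, propensity) pairs at or above the threshold, highest first."""
    scored = [(c["customer_id"], score_customer(c)) for c in customers]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(cid, p) for cid, p in scored if p >= threshold]


# Made-up customers, for illustration only.
new_customers = [
    {"customer_id": "C001", "no_auto_loan": 1, "no_vehicle": 1, "good_credit": 1},
    {"customer_id": "C002", "no_auto_loan": 1, "no_vehicle": 0, "good_credit": 0},
    {"customer_id": "C003", "no_auto_loan": 0, "no_vehicle": 1, "good_credit": 1},
]

shortlist = rank_customers(new_customers)
for customer_id, propensity in shortlist:
    print(customer_id, round(propensity, 3))
```

The threshold is a business choice: lowering it widens the campaign audience at the cost of precision.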
Use Cases
Targeted Campaigns: Focus offers on high-propensity segments
Loan Eligibility Filtering: Pre-qualify candidates automatically
Customer Risk Profiling: Understand financial behavior deeply