Aryan Singh
BUSINESS UNDERSTANDING
Loans are one of the most important products in banking. Banks look for effective business strategies to persuade customers to apply for their loans. However, some customers behave negatively after their applications are approved. To prevent this, banks need methods to predict customers’ behaviour. Machine learning algorithms perform well at this task and are widely used in banking. Here, I work on loan behaviour prediction using machine learning models.
The data set I use contains several tables with detailed information about the bank's customer accounts, such as loans, transaction records and credit cards. My main purpose is to predict the loan behaviour of each account, so the most important table is “loan”. After checking the descriptions of all the features, I think “order”, “trans” and “card” also contain useful information for this purpose. I also need “account” and “disposition” to join these tables together. The required tables are highlighted in the following figure.
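The joins described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code used in this project: the toy rows and every column name (`account_id`, `disp_id`, `amount` and so on) are hypothetical stand-ins, and it assumes that a card reaches its account through the disposition table while transactions carry the account key directly. The “order” table is left out for brevity and would be aggregated in the same way as “trans”.

```python
from collections import defaultdict

# Toy rows; every column name and value here is a hypothetical stand-in.
loans = [
    {"loan_id": 1, "account_id": 10, "amount": 5000},
    {"loan_id": 2, "account_id": 11, "amount": 12000},
]
dispositions = [
    {"disp_id": 100, "account_id": 10},
    {"disp_id": 101, "account_id": 11},
]
cards = [
    {"card_id": 900, "disp_id": 100, "type": "classic"},
]
transactions = [
    {"trans_id": 1, "account_id": 10, "amount": 200.0},
    {"trans_id": 2, "account_id": 10, "amount": 300.0},
    {"trans_id": 3, "account_id": 11, "amount": 50.0},
]


def build_loan_features(loans, dispositions, cards, transactions):
    """Return one feature row per loan, joined on the loan's account."""
    # Map each disposition to its account so cards can be traced to accounts.
    disp_to_account = {d["disp_id"]: d["account_id"] for d in dispositions}

    # Count cards per account via the disposition link.
    cards_per_account = defaultdict(int)
    for card in cards:
        account_id = disp_to_account.get(card["disp_id"])
        if account_id is not None:
            cards_per_account[account_id] += 1

    # Collect transaction amounts per account.
    trans_amounts = defaultdict(list)
    for trans in transactions:
        trans_amounts[trans["account_id"]].append(trans["amount"])

    rows = []
    for loan in loans:
        account_id = loan["account_id"]
        amounts = trans_amounts.get(account_id, [])
        rows.append({
            "loan_id": loan["loan_id"],
            "account_id": account_id,
            "loan_amount": loan["amount"],
            "n_cards": cards_per_account.get(account_id, 0),
            "n_trans": len(amounts),
            "avg_trans_amount": sum(amounts) / len(amounts) if amounts else 0.0,
        })
    return rows


features = build_loan_features(loans, dispositions, cards, transactions)
```

Aggregating to one row per loan keeps the modelling table aligned with the prediction target. With a data-frame library such as pandas, the same joins would typically be written with `DataFrame.merge`.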
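After loading all the data into Python, I set aside a holdout test set from the full data. Evaluating on data the model has never seen makes it possible to detect overfitting. Since the data set is small, I also apply k-fold cross-validation to get a more reliable estimate of generalization: the training data is randomly split into k folds, and each fold in turn is held out for validation while the model is trained on the remaining folds.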
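The splitting scheme can be sketched as below. This is a minimal standard-library illustration under assumed settings, not the project's actual code: the 20% test fraction, k = 5, the seed, and the helper names `holdout_split` and `k_fold_indices` are all hypothetical choices.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and set aside a fraction as the holdout test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train, test)


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists, one pair per fold."""
    # The indices are already shuffled, so striding gives k random folds.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Assumed sizes: 100 samples, 20% held out, 5 folds on the remainder.
train_idx, test_idx = holdout_split(n_samples=100, test_fraction=0.2)
folds = list(k_fold_indices(train_idx, k=5))
```

The holdout indices are never passed to `k_fold_indices`, so the test set stays untouched until the final evaluation. In practice, scikit-learn's `train_test_split` and `KFold` provide the same functionality.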
CONCLUSION
Most binary classification models first predict a probability and then assign it to class 1 or 0 using a default threshold of 0.5. To improve the recall of the model, we can take the predicted probabilities and set the threshold ourselves. The threshold is chosen based on several factors, such as business objectives. In bank loan behaviour prediction, for example, banks want to keep losses at an acceptable level, so they may use a relatively low threshold. This means more customers are flagged as “potential bad customers”, and their profiles are then checked carefully by the credit risk management team. In this way, banks can detect default behaviour at an earlier stage and take corresponding action to reduce possible losses.
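The effect of lowering the threshold can be illustrated with a small sketch. The labels and probabilities below are made-up toy values (1 marks a bad customer), and the helper names `classify` and `recall` are hypothetical; no results from this project are reproduced here.

```python
def classify(probabilities, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def recall(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted as 1."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Toy data: 1 = bad customer, 0 = good customer (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.45, 0.35, 0.4, 0.2, 0.1]

default_pred = classify(probs)               # default threshold of 0.5
low_pred = classify(probs, threshold=0.3)    # lower, recall-oriented threshold

default_recall = recall(y_true, default_pred)
low_recall = recall(y_true, low_pred)
```

On this toy data the lower threshold catches all three bad customers instead of one, at the cost of also flagging one good customer for manual review. That is the trade-off the credit risk team accepts when it lowers the threshold.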