
jjadeb/DSCI310_Group-12_Credit-Risk-Classification

 
 


Predictive Modelling: German Credit Risk


Authors

  • Shahrukh Islam Prithibi
  • Sophie Yang
  • Yovindu Don
  • Jade Bouchard

About

The goal of our analysis is to classify whether someone is a good or bad credit risk using attributes such as Credit History, Duration, and Residence. Our best-performing model is a Random Forest. It achieved an accuracy of 0.8 on unseen data, a modest improvement over the dummy model's accuracy of 0.7. We also obtained a precision of 0.8, a recall of 0.95, and an F1 score of 0.87. The model performs reasonably well at identifying people who are a good credit risk. However, if it is to have a hand in real-world decision-making, precision should be improved to reduce the number of poor credit risks classified as good (false positives). In addition, more research is needed to ensure the model produces fair and equitable recommendations.
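The reported F1 score follows from the reported precision and recall, since F1 is their harmonic mean. The short check below uses only the rounded figures quoted above; the variable names are illustrative and are not taken from the project's code.

```python
# Rounded metrics quoted in this README (illustrative variable names).
precision = 0.80
recall = 0.95
model_accuracy = 0.80
dummy_accuracy = 0.70

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Absolute accuracy gain of the Random Forest over the dummy baseline.
accuracy_gain = model_accuracy - dummy_accuracy

print(f"F1 = {f1:.2f}")                                       # F1 = 0.87
print(f"Accuracy gain over baseline = {accuracy_gain:.2f}")   # 0.10
```

Because recall (0.95) is much higher than precision (0.8), the F1 score sits between the two, and the weaker of the two figures is the one the paragraph above flags for improvement.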

Data Description

The Statlog (German Credit Data) dataset, sourced from the UCI Machine Learning Repository, is used for classifying individuals as good or bad credit risks based on a variety of attributes. Evaluation requires a cost matrix that sets out the misclassification costs. The cost matrix indicates that classifying a customer as good when they are bad is worse than classifying a customer as bad when they are good. The dataset contains 1000 instances with 20 features. Each feature is documented with a role, a type, and, where applicable, demographic information.
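To make the role of the cost matrix concrete, here is a minimal sketch of cost-sensitive scoring in plain Python. The penalty values (5 for labelling a bad risk as good, 1 for labelling a good risk as bad) are, to our understanding, the ones given in the UCI documentation for this dataset. Treat them, along with the label strings and the `total_cost` helper, as assumptions for illustration rather than the project's own code.

```python
# Assumed misclassification costs, keyed by (actual, predicted).
# Correct predictions cost nothing; approving a bad risk is the costly error.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted, cost=COST):
    """Sum the cost of each (actual, predicted) pair."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(cost[(a, p)] for a, p in zip(actual, predicted))


# Toy labels, not taken from the dataset.
actual = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "good", "bad", "good"]

# One good->bad error (cost 1) plus one bad->good error (cost 5).
print(total_cost(actual, predicted))  # 6
```

Under a matrix like this, two models with the same accuracy can have very different total costs, which is why false positives matter more than false negatives for this problem.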

Report

The final report can be found here

Docker

Build and run the project using Docker by following these steps:

Setup

  1. First, ensure you have Docker installed and running on your machine.
  2. Clone this repository, and navigate to the root of the repository in a terminal window.

Choose one of the following two options for launching the Docker container.


The preferred way to run the Docker container is with docker-compose. Run the following command in the terminal to build and start the container; it follows the configuration specified in docker-compose.yml.

docker-compose up

Check the Developer Notes section of this README for details on how to run our analysis.

To stop the Docker container, first press Ctrl + C in the terminal where you launched the container, then run the following command:

docker-compose rm

Alternatively, run the Docker container by executing the following commands:

Build the Docker image (optional):

docker build -t yovindu/project --platform=linux/amd64 .

Run the Docker container:

docker run -it --rm -p 8888:8888 -v /"$(pwd)":/home/jovyan --platform=linux/amd64 yovindu/project

Check the Developer Notes section of this README for details on how to run our analysis.

To exit the container, press Ctrl + C in the terminal where you launched the container.

Developer Notes

Working with the project in the container using JupyterLab

(The instructions below are copied from this repository.)

After launching the Docker container, look in the terminal for a URL that starts with http://127.0.0.1:8888/lab?token= . Copy and paste that URL into your browser.

You should now see the JupyterLab IDE in your browser, with all the project files visible in the file browser pane on the left side of the screen.
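If the container prints a lot of output, the URL can be picked out programmatically. The sketch below is one way to do that with a regular expression; the `find_jupyter_url` helper, the sample log text, and the assumption that the token consists of hexadecimal characters are ours and are not part of this project.

```python
import re

# Matches the JupyterLab URL printed by the container, including its token.
# Assumes the token is made up of lowercase hexadecimal characters.
JUPYTER_URL = re.compile(r"http://127\.0\.0\.1:8888/lab\?token=[0-9a-f]+")


def find_jupyter_url(log_text):
    """Return the first JupyterLab URL found in the log text, or None."""
    match = JUPYTER_URL.search(log_text)
    return match.group(0) if match else None


# Made-up log excerpt with a placeholder token.
sample_log = """
    To access the server, copy and paste one of these URLs:
        http://127.0.0.1:8888/lab?token=0123abcd
"""

print(find_jupyter_url(sample_log))
```

In practice you would pass the captured terminal output of the container to `find_jupyter_url` instead of the sample string.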

Working with the project in the container using VS Code

If you prefer to work in VS Code, run the following command from the root of the project in a VS Code terminal to launch the container there:

docker compose run --rm myapp bash

To exit the container, type exit in the terminal.

Project Execution and Cleanup

Open a terminal at the project root in JupyterLab or VS Code. Use the command

make clean-all

to reset the project to a clean state (i.e., remove all files generated by previous runs of the analysis).

To run the analysis in its entirety, enter the command

make all

in the terminal in the project root.

Dependencies

Docker is a container solution used to manage the software dependencies for this project. The Docker image used for this project is based on the quay.io/jupyter/scipy-notebook:2024-02-24 image. Additional dependencies are specified in the Dockerfile.

Running the tests

Tests are run with pytest from the root of the project:

pytest tests/*

License

The Credit Risk Analysis report contained herein is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. See the license file for more information. If re-using or re-mixing, please provide attribution and link to this webpage. The software code contained within this repository is licensed under the MIT license. See the license file for more information.


About

A reproducible and auditable analysis of credit risk for The University of British Columbia's DSCI310 class. Created as a team of four.
