LinhNguyen2901/URL-classification

URL-classification

URL Classification for Phishing Detection

A machine-learning-based tool that classifies URLs as legitimate or potentially phishing.

Overview

This project implements a URL classification system that helps protect users from phishing attacks by analyzing and classifying URLs as either legitimate or potentially malicious. The system uses machine learning techniques to identify common patterns and characteristics associated with phishing URLs.

Members

  • Quan Pham
  • Linh Nguyen
  • Phat Tran
  • Kien Le

Tech Stack

  • Python
  • Scikit-learn
  • Pandas
  • PyTorch
  • Golang
  • HTML/JS/CSS
  • AWS Lambda
  • AWS API Gateway
  • MongoDB
  • Docker

Features

  • Real-time URL analysis and classification
  • Detection of common phishing patterns and techniques
  • Machine learning model trained on extensive phishing and legitimate URL datasets
  • Feature extraction from URLs including:
    • Domain characteristics
    • URL structure analysis
    • Special character frequency
    • Length-based features
    • TLD analysis
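
As an illustration of the feature categories listed above, here is a minimal sketch using only the Python standard library. It is not the project's actual extraction code: the function name `extract_features`, the feature names, and the set of special characters are assumptions chosen for this example, and the subdomain count is a naive heuristic that does not handle multi-part suffixes such as `co.uk`.

```python
from urllib.parse import urlsplit

# Hypothetical set of characters to count; the project's real list may differ.
SPECIAL_CHARS = "@-_?=&%."


def extract_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string."""
    # urlsplit only recognises the host when a scheme is present.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    # Rough check for a dotted-decimal IPv4 host such as 192.168.0.1.
    is_ip = host.replace(".", "").isdigit()

    features = {
        # Length-based features
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        # Domain characteristics (naive: assumes a single-label TLD)
        "num_subdomains": 0 if is_ip else max(len(labels) - 2, 0),
        "has_ip_host": is_ip,
        # URL structure
        "uses_https": parts.scheme == "https",
        "num_digits": sum(c.isdigit() for c in url),
        # TLD analysis: just the last host label here
        "tld": "" if is_ip or len(labels) < 2 else labels[-1],
    }
    # Special character frequency
    for ch in SPECIAL_CHARS:
        features[f"count_{ch}"] = url.count(ch)
    return features
```

Calling `extract_features("http://login.secure.example.com/verify?id=1@x")` returns a dictionary that a model could consume after encoding the categorical `tld` value.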

Installation

  1. Clone the repository:

git clone https://github.com/LinhNguyen2901/URL-classification.git

How It Works

  1. URL Preprocessing: Extracts and normalizes URL components
  2. Feature Engineering: Analyzes various URL characteristics
  3. Classification: Applies trained machine learning model to determine URL legitimacy
  4. Result Output: Provides classification result with confidence score
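
The four steps above can be sketched end to end as follows. This is a hedged outline, not the repository's implementation: `normalize_url`, `to_vector`, `classify`, and the 0.5 threshold are hypothetical names and values, and `dummy_model` is a hard-coded placeholder standing in for a trained model so that the example runs without any dependencies.

```python
from typing import Callable, List
from urllib.parse import urlsplit, urlunsplit


def normalize_url(raw: str) -> str:
    """Step 1: trim whitespace, add a default scheme, lowercase scheme and host."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "http://" + raw
    parts = urlsplit(raw)
    # Simplification: lowercases the whole network location and drops the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def to_vector(url: str) -> List[float]:
    """Step 2: turn a normalized URL into a small numeric feature vector."""
    host = urlsplit(url).hostname or ""
    return [
        float(len(url)),                       # total length
        float(host.count(".")),                # dots in the host
        float(url.count("@")),                 # '@' characters
        float(sum(c.isdigit() for c in url)),  # digit count
    ]


def classify(raw_url: str, model: Callable[[List[float]], float],
             threshold: float = 0.5) -> dict:
    """Steps 3 and 4: score the vector and report a label with a confidence."""
    url = normalize_url(raw_url)
    phishing_probability = model(to_vector(url))
    label = "phishing" if phishing_probability >= threshold else "legitimate"
    confidence = (
        phishing_probability if label == "phishing" else 1.0 - phishing_probability
    )
    return {"url": url, "label": label, "confidence": confidence}


def dummy_model(vector: List[float]) -> float:
    """Placeholder only, NOT a trained model: flags any URL containing '@'."""
    return 0.75 if vector[2] > 0 else 0.25
```

With this placeholder, `classify("http://paypal.com@evil.example/login", dummy_model)` is labeled `phishing`. In a real deployment the `model` argument would wrap the trained classifier's probability output instead.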

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/URL-classification)
  3. Commit your changes (git commit -m 'Add something')
  4. Push to the branch (git push origin feature/URL-classification)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Dataset sources
  • Contributors
  • Research papers and references

Disclaimer

This tool is meant to assist in identifying potential phishing URLs but should not be relied upon as the sole means of protection. Always exercise caution when clicking on unknown links and maintain proper cybersecurity practices.
