```
░█▀▄░█▀█░█░█░█▀▀░█▀█░░░█▀█░█▀▄░█▀█░█░█░█░█░░░█▀▀░█▀▀░█▀▄░█▀█░█▀█░█▀▀░█▀▄
░█▀▄░█▀█░▀▄▀░█▀▀░█░█░░░█▀▀░█▀▄░█░█░▄▀▄░░█░░░░▀▀█░█░░░█▀▄░█▀█░█▀▀░█▀▀░█▀▄
░▀░▀░▀░▀░░▀░░▀▀▀░▀░▀░░░▀░░░▀░▀░▀▀▀░▀░▀░░▀░░░░▀▀▀░▀▀▀░▀░▀░▀░▀░▀░░░▀▀▀░▀░▀
```
High-speed proxy harvesting tool for security professionals
Fast as a Raven, Deadly as a Scavenger
github.com/sirhadey/raven-proxy-scraper
Raven Proxy Scraper is a sophisticated, multi-threaded tool designed to harvest and validate proxies from multiple public sources. Built for penetration testers, security researchers, and privacy enthusiasts, it efficiently scrapes proxies from decaying internet sources and delivers clean, validated lists ready for immediate use with Proxychains.
"Gathering proxies from the digital decay since 2024"
✅ Multi-source scraping - Aggregates from dozens of public proxy sources
✅ Intelligent validation - Tests connectivity and measures response times
✅ Proxychains-ready output - Generates properly formatted configuration files
✅ Threaded performance - Parallel scraping and validation for speed
✅ Extensible architecture - Easy to add new proxy sources
✅ MSF-style interface - Familiar CLI for security professionals
✅ Duplicate removal - Clean, unique proxy lists
✅ User-agent rotation - Avoids blocking and detection
✅ Multiple output formats - TXT, JSON, and Proxychains config
✅ Configurable via files - Customize without touching code
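The user-agent rotation feature is simple to picture; a minimal sketch of the idea (the UA pool and function name here are illustrative, not the tool's actual code):

```python
import random

# Small illustrative pool; the real tool presumably ships a longer list
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
]

def random_headers():
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(USER_AGENTS)}
```

Each scrape request would then pass `headers=random_headers()` to `requests.get`, so repeated hits on the same source don't share a fingerprint.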
Requirements:
- Python 3.8 or higher
- pip package manager
```
# Clone repository
git clone https://github.com/sirhadey/raven-proxy-scraper.git
cd raven-proxy-scraper

# Install dependencies
pip3 install -r requirements.txt

# Make executable
chmod +x raven_scraper.py
```

requirements.txt:

```
requests>=2.31.0
beautifulsoup4>=4.12.0
```

Quick start:

```
# Basic usage (scrape + validate)
python3 raven_scraper.py

# Fast mode (no validation)
python3 raven_scraper.py --no-validate

# Use with Proxychains immediately
proxychains -f proxychains.conf curl http://ifconfig.me
```

Usage:

```
python3 raven_scraper.py [OPTIONS]
```
```
Options:
  -h, --help          Show help message
  -c, --config FILE   Configuration file (default: raven.conf)
  --no-validate       Skip proxy validation (much faster)
  --add-sites FILE    Add custom sites from text file
  --list-sites        List all configured scraping sites
  --test-url URL      Custom URL for proxy validation
  -o, --output FILE   Custom output file for Proxychains config
  -v, --verbose       Verbose output
```

Examples:

```
# Basic scrape with validation
python3 raven_scraper.py

# Fast reconnaissance (no validation)
python3 raven_scraper.py --no-validate

# Add custom proxy sources
python3 raven_scraper.py --add-sites my_sites.txt

# Custom validation test
python3 raven_scraper.py --test-url "http://ifconfig.me"

# List all configured sources
python3 raven_scraper.py --list-sites

# Custom output filename
python3 raven_scraper.py -o my_proxychains.conf
```

raven.conf:

```
[sites]
urls =
    https://api.proxyscrape.com/v2/?request=getproxies&protocol=socks5&timeout=10000
    https://www.proxy-list.download/api/v1/get?type=socks5
    https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/socks5.txt
    https://raw.githubusercontent.com/ShiftyTR/Proxy-List/master/socks5.txt
    https://spys.one/en/socks-proxy-list/
    https://www.socks-proxy.net/
    https://free-proxy-list.net/
    https://www.sslproxies.org/
    https://hidemy.name/en/proxy-list/
timeout = 10
max_workers = 20

[output]
proxychains_file = proxychains.conf
raw_file = proxies.txt
json_file = proxies.json

[validation]
test_url = http://httpbin.org/ip
timeout = 5
max_validation_workers = 50
```

Create custom_sites.txt:
```
# Add your custom proxy sources (one per line)
https://api.proxyscrape.com/v2/?request=getproxies&protocol=socks4
https://raw.githubusercontent.com/hookzof/socks5_list/master/proxy.txt
https://www.proxy-list.download/api/v1/get?type=http
http://proxydb.net/__data/proxy-list.json
```

Then run:

```
python3 raven_scraper.py --add-sites custom_sites.txt
```

Sample generated proxychains.conf:

```
# Proxychains configuration file
# Generated by RavenProxyScraper
# Generated on: 2024-01-15T14:30:00.123456

[ProxyList]
# add proxy here ...
# meanwhile
# defaults set to "tor"
socks5 45.77.56.113 1080  # 1.23s
socks5 138.197.157.44 1080  # 0.87s
socks5 167.99.123.123 1080  # 2.11s
socks4 192.241.145.92 4145  # 1.56s
```
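Generating lines like the sample above is a small formatting step; a hypothetical formatter (the dict fields mirror the JSON output sample, but the function names are illustrative, not the tool's code):

```python
def to_proxychains_line(proxy):
    """Format one validated proxy as a Proxychains [ProxyList] entry."""
    return "{type} {ip} {port}  # {response_time:.2f}s".format(**proxy)

def write_proxychains(proxies, path="proxychains.conf"):
    """Write a minimal Proxychains config from a list of proxy dicts."""
    with open(path, "w") as f:
        f.write("[ProxyList]\n")
        for p in proxies:
            f.write(to_proxychains_line(p) + "\n")
```

Proxychains ignores the trailing `# 1.23s` comment, so the response time can ride along as a human-readable hint without breaking the config.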
proxies.txt:

```
45.77.56.113:1080
138.197.157.44:1080
167.99.123.123:1080
192.241.145.92:4145
```

proxies.json:

```
{
  "total_found": 5421,
  "valid": 3124,
  "generated_at": "2024-01-15T14:30:00.123456",
  "proxies": [
    {
      "ip": "45.77.56.113",
      "port": "1080",
      "type": "socks5",
      "response_time": 1.23,
      "country": "Unknown",
      "validated_at": "2024-01-15T14:30:00.123456"
    }
  ]
}
```

```
# Direct usage
proxychains -f proxychains.conf nmap -sT -Pn target.com

# Replace system config
sudo cp proxychains.conf /etc/proxychains.conf
proxychains firefox

# Use with specific tool
proxychains -f proxychains.conf sqlmap -u "http://target.com"
```

```
# Daily proxy refresh (crontab)
0 2 * * * cd /opt/raven-proxy-scraper && python3 raven_scraper.py --no-validate

# Pipeline with other tools
python3 raven_scraper.py --no-validate | grep "socks5" > socks5_only.txt
```

```
raven_scraper.py
├── ProxyScraper Class
│   ├── Site Scrapers
│   │   ├── ProxyScrape API
│   │   ├── Spys.one Parser
│   │   ├── Free-Proxy-List Extractor
│   │   └── Raw Text Parser
│   ├── Threaded Validation Engine
│   ├── Configuration Manager
│   └── Output Formatters
├── Command-line Interface
└── Configuration System
```
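The threaded validation engine in the layout above boils down to a worker-pool pattern: deduplicate, fan candidates out over threads, keep what passes. A rough offline sketch of that pattern (the real check would issue a request through the proxy and time it; here it is stubbed with a syntactic check so the structure is visible without network access):

```python
import re
from concurrent.futures import ThreadPoolExecutor

IP_PORT = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$")

def check(proxy):
    """Stub for the real connectivity test.

    The actual engine would attempt a request through the proxy here and
    record the response time; this stand-in just validates the format.
    """
    return proxy if IP_PORT.match(proxy) else None

def validate_all(candidates, max_workers=50):
    """Deduplicate, then validate candidates in parallel."""
    unique = sorted(set(candidates))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, unique))
    return [p for p in results if p is not None]
```

Because each real check is dominated by network latency, a thread pool (rather than processes) is the natural fit, which matches the `max_validation_workers = 50` knob in raven.conf.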
- ✅ ProxyScrape.com (API)
- ✅ Proxy-List.download
- ✅ Spys.one
- ✅ Socks-Proxy.net
- ✅ Free-Proxy-List.net
- ✅ SSLProxies.org
- ✅ HideMy.name
- ✅ GitHub Raw Lists
- ✅ Custom Sources
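Several of these sources (the GitHub raw lists in particular) are plain `ip:port` text, so the corresponding scraper is essentially one regex pass. A hedged sketch of such a parser (names and default type are assumptions, not the tool's code):

```python
import re

# Matches ip:port anywhere in a line of raw text
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")

def parse_raw_list(text, proxy_type="socks5"):
    """Extract ip:port pairs from a raw text proxy list."""
    return [
        {"ip": ip, "port": port, "type": proxy_type}
        for ip, port in PROXY_RE.findall(text)
    ]
```

HTML sources like Spys.one or Free-Proxy-List.net need a real parser (hence the beautifulsoup4 dependency), but a regex like this covers every raw-text endpoint in one scraper.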
```
# Rotate IPs during reconnaissance (use -sT: SYN scans need raw
# sockets and bypass proxychains)
proxychains -f proxychains.conf nmap -sT -Pn target.com

# Bypass rate limiting
proxychains -f proxychains.conf ffuf -w wordlist.txt -u https://target.com/FUZZ

# Avoid IP bans
proxychains -f proxychains.conf python3 scraper.py

# Anonymous browsing
proxychains -f proxychains.conf firefox
```

- Analyze proxy network distributions
- Study proxy geographical patterns
- Monitor proxy availability over time
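The availability-over-time idea can be as simple as appending each run's summary to a CSV log. A sketch assuming the proxies.json layout shown earlier (file names are illustrative):

```python
import csv
import json
from datetime import datetime, timezone

def log_availability(json_path="proxies.json", log_path="availability.csv"):
    """Append one timestamped valid/total snapshot to a CSV log."""
    with open(json_path) as f:
        data = json.load(f)
    row = [datetime.now(timezone.utc).isoformat(), data["valid"], data["total_found"]]
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow(row)
    return row
```

Run it after each scheduled scrape (e.g. from the same cron entry) and the CSV becomes a time series of how many harvested proxies were actually alive.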
IMPORTANT: This tool is for legitimate security testing and research only.

Intended use:
- Testing your own systems
- Authorized penetration tests
- Academic research
- Security education

Prohibited use:
- Unauthorized system access
- Bypassing paid services
- Illegal activities
- Harassment or abuse
- Copyright infringement
By using this tool, you agree to:
- Only test systems you own or have explicit permission to test
- Respect website terms of service and robots.txt
- Comply with all applicable laws and regulations
- Accept full responsibility for your actions
We welcome contributions! Here's how:
- Fork the repository
- Clone your fork:
git clone https://github.com/YOUR-USERNAME/raven-proxy-scraper.git
- Create a feature branch:
git checkout -b feature/amazing-feature
- Commit your changes:
git commit -m 'Add amazing feature'
- Push to your branch:
git push origin feature/amazing-feature
- Open a Pull Request
Areas where contributions are especially welcome:
- New proxy source scrapers
- Performance improvements
- Additional output formats
- Documentation updates
- Bug fixes
sirhadey
Security Researcher & Tool Developer
"Building tools that make security research more efficient"
MIT License
Copyright (c) 2024 sirhadey
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
If you find this tool useful:
- ⭐ Star the repository
- 🐛 Report issues
- 💡 Suggest features
- 🔗 Share with colleagues
Like a raven in flight - silent, swift, and effective. 🦅
Raven Proxy Scraper - Gathering proxies from the digital decay since 2024