211 changes: 174 additions & 37 deletions README.md
@@ -1,11 +1,11 @@
# 🚨 Bucket Stream is no longer maintained. If you need support or consultation for your red teaming endeavours, drop me an e-mail paul@darkport.co.uk 🚨

# Bucket Stream

**Find interesting Amazon S3 Buckets by watching certificate transparency logs.**

This tool simply listens to various certificate transparency logs (via certstream) and attempts to find public S3 buckets from permutations of the certificate's domain name.

> **Note:** This project has been updated and modernized for Python 3. The original project is no longer maintained by the original author, but has been updated to work with current dependencies and Python versions.

![Demo](https://i.imgur.com/ZFkIYhD.jpg)

**Be responsible**. I mainly created this tool to highlight the risks associated with public S3 buckets and to put a different spin on the usual dictionary based attacks. Some quick tips if you use S3 buckets:
@@ -19,60 +19,197 @@ Thanks to my good friend David (@riskobscurity) for the idea.

## Installation

Python 3.4+ and pip3 are required. Then just:
**Requirements:** Python 3.7+ (Python 3.8+ recommended)

1. Clone the repository:
```bash
git clone https://github.com/eth0izzle/bucket-stream.git
cd bucket-stream
```

2. Create and activate a virtual environment (recommended):
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

1. `git clone https://github.com/eth0izzle/bucket-stream.git`
2. *(optional)* Create a virtualenv with `pip3 install virtualenv && virtualenv .virtualenv && source .virtualenv/bin/activate`
3. `pip3 install -r requirements.txt`
4. `python3 bucket-stream.py`
3. Install dependencies:
```bash
pip install -r requirements.txt
```

4. Configure (optional but recommended):
Edit `config.yaml` and add your AWS credentials to avoid rate limiting:
```yaml
aws_access_key: 'your-access-key'
aws_secret: 'your-secret-key'
```

## Usage

Simply run `python3 bucket-stream.py`.

If you provide AWS access and secret keys in `config.yaml`, Bucket Stream will attempt to access authenticated buckets and identify the bucket's owner. **Unauthenticated users are severely rate limited.**

usage: python bucket-stream.py

Find interesting Amazon S3 Buckets by watching certificate transparency logs.

optional arguments:
-h, --help Show this help message and exit
--only-interesting Only log 'interesting' buckets whose contents match
anything within keywords.txt (default: False)
--skip-lets-encrypt Skip certs (and thus listed domains) issued by Let's
Encrypt CA (default: False)
-t , --threads Number of threads to spawn. More threads = more power.
Limited to 5 threads if unauthenticated.
(default: 20)
--ignore-rate-limiting
If you ignore rate limits not all buckets will be
checked (default: False)
-l, --log Log found buckets to a file buckets.log (default:
False)
-s, --source Data source to check for bucket permutations. Uses
certificate transparency logs if not specified.
(default: None)
-p, --permutations Path of file containing a list of permutations to try
(see permutations/ dir). (default: permutations\default.txt)
### Basic Usage

Simply run:
```bash
python bucket-stream.py
```

If you provide AWS access and secret keys in `config.yaml`, Bucket Stream will attempt to access authenticated buckets and identify the bucket owner. **Unauthenticated users are severely rate limited (max 5 threads).**
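The unauthenticated thread cap can be sketched as follows. This is an illustrative assumption about the limiting logic, not the tool's actual internals; the function name is hypothetical:

```python
# Hypothetical sketch of the unauthenticated thread cap described above:
# without AWS credentials, the tool limits itself to at most 5 worker threads.
def effective_threads(requested: int, has_credentials: bool, cap: int = 5) -> int:
    """Return the number of worker threads that would actually be spawned."""
    return requested if has_credentials else min(requested, cap)

print(effective_threads(20, has_credentials=False))  # 5
print(effective_threads(20, has_credentials=True))   # 20
```

Adding credentials to `config.yaml` is the supported way to lift this cap.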

### Command Line Options

```
usage: python bucket-stream.py

Find interesting Amazon S3 Buckets by watching certificate transparency logs.

options:
-h, --help Show this help message and exit
--only-interesting Only log 'interesting' buckets whose contents match
anything within keywords.txt (default: False)
--skip-lets-encrypt Skip certs (and thus listed domains) issued by Let's
Encrypt CA (default: False)
-t, --threads Number of threads to spawn. More threads = more power.
Limited to 5 threads if unauthenticated. (default: 20)
--ignore-rate-limiting
If you ignore rate limits not all buckets will be
checked (default: False)
-l, --log Log found buckets to a file buckets.log (default: False)
-s, --source SOURCE Data source to check for bucket permutations. Uses
certificate transparency logs if not specified.
(default: None)
-p, --permutations PERMUTATIONS
Path of file containing a list of permutations to try
(see permutations/ dir). (default: permutations/default.txt)
```

### Usage Examples

**Basic scan with CertStream:**
```bash
python bucket-stream.py
```

**Use extended permutations list (more comprehensive but slower):**
```bash
python bucket-stream.py -p permutations/extended.txt
```

**Scan specific domains from a file:**
```bash
python bucket-stream.py --source domains.txt --threads 10
```

**Only log interesting buckets (matching keywords.txt):**
```bash
python bucket-stream.py --only-interesting --log
```

This will only report buckets that contain files matching keywords in `keywords.txt` (e.g., password files, database dumps, configuration files).

**Skip Let's Encrypt certificates:**
```bash
python bucket-stream.py --skip-lets-encrypt
```

### Permutations

The tool uses permutation files to generate potential bucket names. Two files are provided:

- **`permutations/default.txt`** - ~30 common permutations (fast, recommended for most use cases)
- **`permutations/extended.txt`** - 1000+ permutations (comprehensive but slower)

You can create custom permutation files. Each line should contain `%s` where the domain name will be inserted, for example:
```
%s-backup
backup-%s
%s-data
data-%s
```
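The expansion itself is plain printf-style `%` substitution, mirroring the `line.strip() % domain` logic in `bucket-stream.py`. The domain stem below is a hypothetical example:

```python
# Each permutation line is a printf-style template; the domain stem
# replaces the %s placeholder (mirrors `line.strip() % domain` in the tool).
domain = "example"  # hypothetical stem extracted from a certificate

permutation_lines = ["%s-backup", "backup-%s", "%s-data", "data-%s"]
candidates = [line.strip() % domain for line in permutation_lines]
print(candidates)
# ['example-backup', 'backup-example', 'example-data', 'data-example']
```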

### Keywords Filtering

The `keywords.txt` file contains a list of sensitive keywords and file extensions used to identify "interesting" buckets when using the `--only-interesting` flag. The file includes:

- **Sensitive keywords**: password, secret, token, api-key, credentials, etc.
- **Database files**: .sql, .db, .dump, .backup, etc.
- **Configuration files**: .env, .pem, .key, config files, etc.
- **Source code**: .git, .svn, source code files, etc.
- **Archives**: .zip, .tar, .rar, compressed files, etc.
- **Documents**: .xls, .csv, .pdf, spreadsheets, etc.
- **Log files**: .log, access logs, error logs, etc.
- **Virtual machines**: .ova, .vmdk, disk images, etc.
- **And many more...**

The file contains **200+ keywords** organized by category. You can customize it by adding or removing keywords. Lines starting with `#` are treated as comments and ignored.

**Example keywords.txt:**
```
password
secret
.sql
.env
backup
```
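A minimal sketch of how such a keyword match might behave. The function and the case-insensitive substring rule are assumptions for illustration, not the tool's actual implementation:

```python
# Hypothetical keyword filter: a bucket counts as "interesting" if any listed
# object key contains one of the keywords (case-insensitive substring match).
keywords = ["password", "secret", ".sql", ".env", "backup"]

def is_interesting(object_keys):
    return any(kw in key.lower() for key in object_keys for kw in keywords)

print(is_interesting(["logo.png", "db-backup.sql"]))  # True
print(is_interesting(["index.html", "style.css"]))    # False
```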

## Updates & Improvements

This version includes the following updates:
- ✅ Updated to Python 3.7+ (removed Python 2 compatibility)
- ✅ Updated all dependencies to latest compatible versions
- ✅ Fixed CertStream connection issues
- ✅ Improved error handling and reconnection logic
- ✅ Enhanced default permutations list (~30 common patterns)
- ✅ Expanded keywords.txt file (200+ keywords across 15+ categories)
- ✅ Added comment support in keywords.txt (lines starting with # are ignored)
- ✅ Code modernization and cleanup

## F.A.Qs

- **Nothing appears to be happening**

Patience! Sometimes certificate transparency logs can be quiet for a few minutes. Ideally provide AWS secrets in `config.yaml` as this greatly speeds up the checking rate.
Patience! Sometimes certificate transparency logs can be quiet for a few minutes. The tool will show "Waiting for Certstream events..." and then "Connected to CertStream!" when connected. Ideally provide AWS secrets in `config.yaml` as this greatly speeds up the checking rate.

- **I'm getting rate limited**

If you don't have AWS credentials, you're limited to 5 threads. Either:
- Add AWS credentials to `config.yaml` (recommended)
- Use `--ignore-rate-limiting` flag (may miss some buckets)
- Reduce threads with `-t 3`

- **CertStream connection errors**

The tool automatically retries on connection errors. If you see repeated errors, check your internet connection or try again later.

- **I found something highly confidential**

**Report it** - please! You can usually figure out the owner from the bucket name or by doing some quick reconnaissance. Failing that, contact Amazon's support teams.

## Troubleshooting

**Import errors:**
- Make sure you're using Python 3.7+
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Use a virtual environment to avoid conflicts

**Connection issues:**
- CertStream may be temporarily unavailable
- Check your firewall/proxy settings
- The tool will automatically retry

**Rate limiting:**
- Add AWS credentials to `config.yaml` for better performance
- Without credentials, you're limited to 5 threads

## Contributing

1. Fork it, baby!
Contributions are welcome! Please:

1. Fork the repository
2. Create your feature branch: `git checkout -b my-new-feature`
3. Commit your changes: `git commit -am 'Add some feature'`
4. Push to the branch: `git push origin my-new-feature`
5. Submit a pull request.
5. Submit a pull request

## License

56 changes: 31 additions & 25 deletions bucket-stream.py
@@ -2,36 +2,29 @@
# -*- coding: utf-8 -*-

import sys
PY2 = sys.version_info[0] == 2
PY3 = (sys.version_info[0] >= 3)

#import queue
if PY2:
    import Queue as queue
else:  # PY3
    import queue

import queue
import argparse
import logging
import os
import signal
import time
import json
from threading import Lock
from threading import Event
from threading import Thread
from threading import Lock, Event, Thread

import requests
import tldextract
import yaml
from boto3.session import Session
from certstream.core import CertStreamClient
import certstream
from requests.adapters import HTTPAdapter
from termcolor import cprint

ARGS = argparse.Namespace()
CONFIG = yaml.safe_load(open("config.yaml"))
KEYWORDS = [line.strip() for line in open("keywords.txt")]
with open("config.yaml", "r") as f:
    CONFIG = yaml.safe_load(f)
with open("keywords.txt", "r") as f:
    KEYWORDS = [line.strip() for line in f if line.strip() and not line.strip().startswith('#')]
S3_URL = "http://s3-1-w.amazonaws.com"
BUCKET_HOST = "%s.s3.amazonaws.com"
QUEUE_SIZE = CONFIG['queue_size']
@@ -68,18 +61,29 @@ def run(self):
class CertStreamThread(Thread):
    def __init__(self, q, *args, **kwargs):
        self.q = q
        self.c = CertStreamClient(
            self.process, skip_heartbeats=True, on_open=None, on_error=None)

        super().__init__(*args, **kwargs)

    def run(self):
        global THREAD_EVENT
        while not THREAD_EVENT.is_set():
            cprint("Waiting for Certstream events - this could take a few minutes to queue up...",
                   "yellow", attrs=["bold"])
            self.c.run_forever()
            THREAD_EVENT.wait(10)
            try:
                certstream.listen_for_events(
                    self.process,
                    "wss://certstream.calidog.io/",
                    skip_heartbeats=True,
                    on_open=self._on_open,
                    on_error=self._on_error
                )
            except KeyboardInterrupt:
                pass

    def _on_open(self):
        cprint("Connected to CertStream! Listening for certificate updates...", "green", attrs=["bold"])

    def _on_error(self, ex):
        if not isinstance(ex, KeyboardInterrupt):
            cprint("CertStream connection error: {} - Will retry...".format(ex), "yellow")

    def process(self, message, context):
        if message["message_type"] == "heartbeat":
@@ -246,7 +250,8 @@ def get_permutations(domain, subdomain=None):
        "%s-www" % domain,
    ]

    perms.extend([line.strip() % domain for line in open(ARGS.permutations)])
    with open(ARGS.permutations, "r") as f:
        perms.extend([line.strip() % domain for line in f])

    if subdomain is not None:
        perms.extend([
@@ -314,9 +319,10 @@ def main():
    if ARGS.source is None:
        THREADS.extend([CertStreamThread(q)])
    else:
        for line in open(ARGS.source):
            for permutation in get_permutations(line.strip()):
                q.put(BUCKET_HOST % permutation)
        with open(ARGS.source, "r") as f:
            for line in f:
                for permutation in get_permutations(line.strip()):
                    q.put(BUCKET_HOST % permutation)

    for t in THREADS:
        t.daemon = True