Automated half-hour scrapers for LUMA and Tesla energy data on macOS, plus a polars notebook that builds presentation-grade charts:
- Daily stacked supply (Base / Peak / Renewable / VPP Discharge)
- Peak-hour waterfall (zoom-in composition)
Why: quantify how Tesla’s Virtual Power Plant contributes to Puerto Rico’s supply mix—especially at the evening peak.
- Scrapers (every 30 minutes):
  - Tesla: Powerwall stats + Battery Power – Discharging
  - LUMA: Supply by generation plant and source_tab (Base / Pico / Renovable)
- Notebook: type-clean, de-duplicate intra-hour rows (earliest minute per hour), aggregate by hour, and plot.
- Outputs: clean PNG/SVG charts, saved to 06_outputs/02_plots/.

02_energy/
├─ 01_data/
│ ├─ 01_luma_data/
│ │ └─ luma_system_summary.parquet
│ └─ 02_tesla_data/
│ └─ tesla_powerwall_summary.parquet
├─ 02_scripts/
│ ├─ 01_luma_data.py
│ ├─ 02_tesla_data.py
│ ├─ KeepAwake/ # optional macOS helper
│ └─ run_energy_scrapers.sh
├─ 03_notebooks/
│ └─ 01_supply_vpp_daily.ipynb
├─ 04_logs/
│ └─ energy_scraper.log
├─ 05_docs/
├─ 06_outputs/
│ ├─ 01_tables/
│ └─ 02_plots/
│ ├─ supply_stack_YYYY-MM-DD.png
│ ├─ supply_stack_YYYY-MM-DD.svg
│ ├─ peak_waterfall_YYYY-MM-DD_HH00.png
│ └─ peak_waterfall_YYYY-MM-DD_HH00.svg
├─ requirements.txt
└─ README.md

    # from your projects folder
    git clone <your-repo-url> 02_energy
    cd 02_energy
    # use your base conda env or create a new one
    conda activate base
    pip install -r requirements.txt

Requirements

    polars>=1.0
    pandas>=2.2
    numpy>=1.26
    matplotlib>=3.8
    selenium>=4.20
    webdriver-manager>=4.0
    pyarrow>=16.0

Tip: If you prefer conda, create an environment.yml. Keep Selenium/WebDriver pinned if Chrome updates frequently.

✅ Manual Run
Use this command any time:
    ~/path/to/your/projects/02_energy/02_scripts/run_energy_scrapers.sh >> ~/path/to/your/projects/02_energy/04_logs/energy_scraper.log 2>&1

Expected results:
- Timestamped entries in 04_logs/energy_scraper.log
- Updated .parquet files with no duplicate entries.

- Open the crontab editor:

      crontab -e

- Add the following line to run the scraper every 30 minutes (replace the path accordingly):

      0,30 * * * * /bin/bash /path/to/your/projects/02_energy/02_scripts/run_energy_scrapers.sh >> /path/to/your/projects/02_energy/04_logs/energy_scraper.log 2>&1

- Save and exit:
  - Press Esc
  - Type :wq
  - Hit Enter

To verify:

    crontab -l

To remove later:

    crontab -e

(Then move to the line, type dd to delete it, and save with :wq.)

To keep your Mac awake in the background:

    caffeinate -dimsu &

Or create a .app:
- Open Script Editor
- Paste:

      do shell script "caffeinate -dimsu &"

- Save as KeepMacAwake.app inside the project folder.
- Enable it via System Settings > Login Items > Open at Login

You’ll see it running in the menu bar as a gear icon ⛭

Open 03_notebooks/01_supply_vpp_daily.ipynb. The notebook:
- Excludes the startup date (2025-08-03) and today, because both contain only partial hours.
- Builds a long table with hourly supply by source_tab + VPP discharge.
- Produces two charts, saved under 06_outputs/02_plots/:
  - supply_stack_YYYY-MM-DD.(png|svg)
  - peak_waterfall_YYYY-MM-DD_HH00.(png|svg)

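The date filter and output naming described above can be sketched in a few lines of standard-library Python. This is an illustration only, not the notebook's actual code: the helper names `complete_days` and `plot_paths` are hypothetical, and it assumes the peak hour is already known as an integer from 0 to 23.

```python
from datetime import date
from pathlib import Path

STARTUP_DATE = date(2025, 8, 3)           # first scraper day, per this README
PLOTS_DIR = Path("06_outputs/02_plots")   # output folder, per this README


def complete_days(days, today):
    """Keep only days with full hourly coverage: drop the startup date and today."""
    return sorted(d for d in set(days) if d != STARTUP_DATE and d < today)


def plot_paths(day, peak_hour):
    """Build the PNG and SVG paths for both charts of one analysis day."""
    stack = f"supply_stack_{day:%Y-%m-%d}"
    waterfall = f"peak_waterfall_{day:%Y-%m-%d}_{peak_hour:02d}00"
    return [
        PLOTS_DIR / f"{stem}.{ext}"
        for stem in (stack, waterfall)
        for ext in ("png", "svg")
    ]
```

For example, `plot_paths(date(2025, 8, 4), 19)` yields four paths, the third being `06_outputs/02_plots/peak_waterfall_2025-08-04_1900.png`.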
Chart examples (filenames will match your analysis day)
- Visual style: light grey for Base, dark blue for Peak, dark orange for Renewable, accent for VPP; white background; no chart junk.
- Intra-hour handling: keep the earliest minute per hour to avoid double-counting (scrapers run every 30 minutes).
- Coverage check: both sources must have at least one record per hour in the shared window; today is ignored.
- Harmonization: LUMA source_tab names mapped to English; Tesla "Battery Power – Discharging" mapped to VPP Discharge.
- Rounding: values shown as whole MW for clarity.
- Timestamp note (Tesla): The public dashboard updates roughly every 15 minutes and often lags real time by ~15 minutes. My scraper runs at :00 and :30 and, for each hour, keeps the earliest reading available (≈ :00 or :15). As a result, the “peak” time shown is approximate to the nearest 15 minutes.

No new logs or data?
- Ensure the Mac is not asleep
- Confirm cron is active (crontab -l)
- Try a manual run and check for log updates
- Inspect energy_scraper.log for errors
- Add diagnostic prints like python --version or which python inside the script

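If you would rather put the diagnostics inside the Python scrapers than in the shell wrapper, the same information is available from the standard library. This is a hedged sketch: `environment_report` is a hypothetical helper, not something the existing scripts define. Under cron, `PATH` is usually much shorter than in an interactive shell, so the interpreter reported here may differ from the one you get in a terminal.

```python
import shutil
import sys
from datetime import datetime


def environment_report():
    """Return log lines describing which Python is actually running."""
    return [
        f"time: {datetime.now().isoformat(timespec='seconds')}",
        f"python executable: {sys.executable}",        # like `which python` for this process
        f"python version: {sys.version.split()[0]}",   # like `python --version`
        f"python on PATH: {shutil.which('python')}",   # None if cron's PATH lacks it
    ]


if __name__ == "__main__":
    for line in environment_report():
        print(line, flush=True)  # flush so lines reach the log even if the run crashes
```

Because the wrapper's output is appended to 04_logs/energy_scraper.log, these lines show up next to any scraper errors.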
- Data from public dashboards; this repo is not affiliated with LUMA or Tesla.
- Code licensed under MIT (see LICENSE).
Maintained by Jesús Ortiz
Last updated: August 9, 2025