This guide covers the advanced AI-powered features that have been added to the Job Prospect Automation system, including Azure OpenAI integration, AI parsing capabilities, comprehensive product analysis, and automated email sending.
- Overview of Enhanced Features
- Azure OpenAI Integration
- AI-Powered Data Parsing
- Comprehensive Product Analysis
- Enhanced Email Personalization
- Automated Email Sending with Resend
- Enhanced Notion Data Storage
- Configuration Options
- Best Practices
- Performance Optimization
The enhanced version of the Job Prospect Automation system includes several powerful new capabilities:
- Multi-Provider AI Architecture: Support for OpenAI, Azure OpenAI, Anthropic Claude, Google Gemini, and DeepSeek through unified interface
- Unified Service Architecture: BaseService foundation with standardized error handling, rate limiting, and performance monitoring
- Advanced Caching System: Multi-tier caching with in-memory and persistent storage for optimal performance
- Azure OpenAI Integration: Enterprise-grade AI services with better reliability
- AI-Powered Data Parsing: Intelligent structuring of scraped data
- Comprehensive Product Analysis: Deep analysis of products and market position
- Enhanced Email Personalization: AI-driven personalization using structured data
- Automated Email Sending: Reliable email delivery with tracking via Resend
- Enhanced Data Storage: Optimized Notion database schema for AI-structured data
- Higher Quality Outreach: More personalized and relevant emails
- Better Data Organization: Structured data optimized for email generation
- Improved Reliability: Enterprise-grade AI services and email delivery
- Enhanced Performance: Multi-tier caching reduces API costs and improves response times
- Faster Processing: Optimized scraping delays (85% faster with 0.3s vs 2.0s default)
- Enhanced Insights: Comprehensive product and market analysis
- Automated Workflow: End-to-end automation from discovery to email sending
The system now features a unified AIService class that consolidates all AI operations:
- LinkedIn Profile Parsing: Structured extraction from HTML content
- Ultra-Fast LinkedIn Finding: Lightning-fast LinkedIn URL discovery with smart caching
- Email Generation: Personalized outreach emails with multiple templates
- Product Analysis: Business intelligence from product data
- Team Extraction: Team member identification and structuring
- Business Metrics: Company insights and metrics analysis
- Standardized Processing: Consistent AI patterns across all operations
- Result Caching: Expensive AI operations cached for performance
- Error Handling: Comprehensive error recovery and fallback strategies
- Performance Monitoring: Detailed metrics and operation tracking
Azure OpenAI provides several advantages over regular OpenAI:
- Enterprise Reliability: 99.9% uptime SLA
- Enhanced Security: Enterprise-grade security and compliance
- Predictable Pricing: Reserved capacity and volume discounts
- Regional Deployment: Data residency and latency optimization
- Better Rate Limits: Higher throughput for production workloads
# Prerequisites
# - Azure account with active subscription
# - Approved access to Azure OpenAI (apply at aka.ms/oai/access)
# Create resource group
az group create --name job-prospect-rg --location eastus
# Create Azure OpenAI resource
az cognitiveservices account create \
--name job-prospect-openai \
--resource-group job-prospect-rg \
--location eastus \
--kind OpenAI \
--sku S0# Deploy GPT-4 for email generation
az cognitiveservices account deployment create \
--name job-prospect-openai \
--resource-group job-prospect-rg \
--deployment-name gpt-4-email-gen \
--model-name gpt-4 \
--model-version "0613" \
--model-format OpenAI \
--scale-settings-scale-type "Standard"# Azure OpenAI Configuration
USE_AZURE_OPENAI=true
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=https://job-prospect-openai.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4-email-gen
AZURE_OPENAI_API_VERSION=2024-02-15-preview
# Model Configuration for Different Tasks
EMAIL_GENERATION_MODEL=gpt-4-email-gen
AI_PARSING_MODEL=gpt-4-parsing
PRODUCT_ANALYSIS_MODEL=gpt-4-parsingThe enhanced system now includes a centralized OpenAI Client Manager that provides:
Key Benefits:
- Unified Client Management: Single point of control for all OpenAI operations
- Connection Pooling: Efficient HTTP connection reuse reduces latency
- Error Standardization: Consistent error handling across all AI services
- Multi-Client Support: Support for different client configurations
- Thread Safety: Safe concurrent operations for parallel processing
Usage Example:
from services.openai_client_manager import (
get_client_manager,
configure_default_client,
CompletionRequest
)
from utils.config import Config
# Initialize the client manager
config = Config.from_env()
configure_default_client(config)
manager = get_client_manager()
# Make a structured completion request
request = CompletionRequest(
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Parse this LinkedIn profile data..."}
],
temperature=0.7,
max_tokens=1000
)
response = manager.make_completion(request)
if response.success:
print(f"Parsed content: {response.content}")
print(f"Tokens used: {response.usage}")
else:
print(f"Error: {response.error_message}")
# Simple completion for quick operations
simple_response = manager.make_simple_completion([
{"role": "user", "content": "Generate a professional email subject line"}
])
print(f"Generated subject: {simple_response}")Client Manager Features:
# Get client information
client_info = manager.get_client_info()
print(f"Client Type: {client_info['client_type']}")
print(f"Model: {client_info['model_name']}")
print(f"Configured: {client_info['configured']}")
# List all configured clients
clients = manager.list_clients()
print(f"Available clients: {clients}")
# Close clients when done (automatic cleanup)
manager.close_all_clients()
```## 🧠 AI-P
owered Data Parsing
### Overview
AI parsing transforms raw scraped data into structured, organized information optimized for email personalization.
### Parsing Capabilities
#### 1. LinkedIn Profile Parsing
Converts raw LinkedIn HTML into structured professional data with enhanced validation and fallback handling:
```python
# Example parsed LinkedIn data
{
"name": "John Smith",
"current_role": "Senior Software Engineer at TechCorp",
"experience": [
"Senior Software Engineer at TechCorp (2021-Present)",
"Software Engineer at StartupXYZ (2019-2021)",
"Junior Developer at WebCorp (2017-2019)"
],
"skills": [
"Python", "JavaScript", "React", "Node.js", "AWS",
"Docker", "Kubernetes", "Machine Learning"
],
"summary": "Experienced software engineer with 6+ years in full-stack development..."
}Enhanced Validation Features:
- Required Field Validation: Ensures critical fields (
name,current_role) are present and non-empty - Fallback Data Support: Uses provided fallback data when AI parsing fails or returns incomplete results
- Default Value Handling: Provides meaningful defaults ("Unknown Profile", "Unknown Role") when data is unavailable
- Data Quality Assurance: Validates and cleans extracted data before creating profile objects
Organizes product data for better understanding:
# Example structured product data
{
"name": "AI Analytics Platform",
"description": "Advanced analytics platform using AI to provide business insights",
"features": [
"Real-time data processing",
"Machine learning predictions",
"Custom dashboard creation",
"API integrations"
],
"pricing_model": "freemium",
"target_market": "B2B SaaS companies",
"competitors": ["Tableau", "PowerBI", "Looker"],
"market_analysis": "Growing market with focus on AI-driven insights..."
}# Enable AI parsing
ENABLE_AI_PARSING=true
AI_PARSING_MODEL=gpt-4
AI_PARSING_MAX_RETRIES=3
AI_PARSING_TIMEOUT=30#!/usr/bin/env python3
"""Example of using the unified AI service."""
from services.ai_service import AIService, AIOperationType, EmailTemplate
from models.data_models import Prospect
from utils.config import Config
def demonstrate_ai_service():
config = Config.from_env()
ai_service = AIService(config)
# Parse LinkedIn profile
linkedin_html = """<html><!-- LinkedIn profile HTML --></html>"""
result = ai_service.parse_linkedin_profile(linkedin_html)
if result.success:
profile = result.data
print(f"Parsed profile for {profile.name}")
print(f"Current role: {profile.current_role}")
print(f"Top skills: {', '.join(profile.skills[:5])}")
print(f"Confidence score: {result.confidence_score:.2f}")
print(f"Cached result: {result.cached}")
else:
print(f"Parsing failed: {result.error_message}")
# Example with fallback data for enhanced reliability
fallback_data = {
"name": "John Doe",
"current_role": "Software Engineer",
"experience": [],
"skills": [],
"summary": ""
}
result_with_fallback = ai_service.parse_linkedin_profile(
linkedin_html,
fallback_data=fallback_data
)
if result_with_fallback.success:
profile = result_with_fallback.data
print(f"Profile with fallback: {profile.name} - {profile.current_role}")
# Even if AI parsing fails, fallback ensures we have valid data
# Generate personalized email
prospect = Prospect(
name="John Doe",
role="Software Engineer",
company="TechCorp",
linkedin_url="https://linkedin.com/in/johndoe"
)
email_result = ai_service.generate_email(
prospect=prospect,
template_type=EmailTemplate.COLD_OUTREACH,
linkedin_profile=profile
)
if email_result.success:
email = email_result.data
print(f"Generated email subject: {email.subject}")
print(f"Personalization score: {email.personalization_score:.2f}")
if __name__ == "__main__":
demonstrate_ai_service()Resend integration provides reliable email delivery with real-time tracking and analytics.
- High Deliverability: Optimized for transactional email delivery
- Real-time Tracking: Track opens, clicks, bounces, and complaints
- Webhook Support: Automatic status updates via webhooks
- Domain Authentication: SPF, DKIM, and DMARC support
- Analytics Dashboard: Comprehensive sending statistics
# DNS records to add to your domain
# Replace 'yourdomain.com' with your actual domain
# SPF Record
TXT @ "v=spf1 include:_spf.resend.com ~all"
# DKIM Record
TXT resend._domainkey "v=DKIM1; k=rsa; p=YOUR_DKIM_PUBLIC_KEY"
# DMARC Record
TXT _dmarc "v=DMARC1; p=quarantine; rua=mailto:dmarc@yourdomain.com"# Resend API Configuration
RESEND_API_KEY=re_your_resend_api_key_here
SENDER_EMAIL=your-name@yourdomain.com
SENDER_NAME=Your Full Name
REPLY_TO_EMAIL=your-name@yourdomain.com
# Email Sending Settings
RESEND_REQUESTS_PER_MINUTE=100
AUTO_SEND_EMAILS=false
EMAIL_REVIEW_REQUIRED=true
# Delivery Tracking
ENABLE_DELIVERY_TRACKING=true
WEBHOOK_URL=https://your-domain.com/webhooks/resendThe system automatically tracks the complete email lifecycle in Notion with dedicated fields:
- Email Generation Status: Tracks email creation progress (Not Generated, Generated, Sent, Failed)
- Email Delivery Status: Monitors delivery state (Not Sent, Sent, Delivered, Bounced, etc.)
- Email Subject & Content: Stores generated email content for review and tracking
- Timestamps: Records email generation and sending dates for analytics
These fields are automatically populated when emails are generated and sent, providing complete visibility into your outreach campaigns directly within your Notion database.
# =============================================================================
# ENHANCED JOB PROSPECT AUTOMATION CONFIGURATION
# =============================================================================
# -----------------------------------------------------------------------------
# Required API Keys
# -----------------------------------------------------------------------------
NOTION_TOKEN=secret_your_notion_integration_token
HUNTER_API_KEY=your_hunter_io_api_key
# -----------------------------------------------------------------------------
# Azure OpenAI Configuration (Recommended)
# -----------------------------------------------------------------------------
USE_AZURE_OPENAI=true
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4-deployment
AZURE_OPENAI_API_VERSION=2024-02-15-preview
# Alternative: Regular OpenAI (if not using Azure)
# USE_AZURE_OPENAI=false
# OPENAI_API_KEY=sk-your_openai_api_key
# -----------------------------------------------------------------------------
# Email Sending Configuration (Resend)
# -----------------------------------------------------------------------------
RESEND_API_KEY=re_your_resend_api_key
SENDER_EMAIL=your-name@yourdomain.com
SENDER_NAME=Your Full Name
REPLY_TO_EMAIL=your-name@yourdomain.com
# -----------------------------------------------------------------------------
# Enhanced AI Features
# -----------------------------------------------------------------------------
ENABLE_AI_PARSING=true
ENABLE_PRODUCT_ANALYSIS=true
ENHANCED_PERSONALIZATION=true
# AI Model Configuration
AI_PARSING_MODEL=gpt-4
PRODUCT_ANALYSIS_MODEL=gpt-4
EMAIL_GENERATION_MODEL=gpt-4
# AI Processing Settings
AI_PARSING_MAX_RETRIES=3
AI_PARSING_TIMEOUT=30
PRODUCT_ANALYSIS_MAX_RETRIES=3
# -----------------------------------------------------------------------------
# Rate Limiting and Performance
# -----------------------------------------------------------------------------
SCRAPING_DELAY=0.3
HUNTER_REQUESTS_PER_MINUTE=10
RESEND_REQUESTS_PER_MINUTE=100
# Processing Limits
MAX_PRODUCTS_PER_RUN=50
MAX_PROSPECTS_PER_COMPANY=10
# -----------------------------------------------------------------------------
# Email Configuration
# -----------------------------------------------------------------------------
EMAIL_TEMPLATE_TYPE=professional
PERSONALIZATION_LEVEL=high
MAX_EMAIL_LENGTH=500
# Email Content Settings
INCLUDE_PRODUCT_CONTEXT=true
INCLUDE_BUSINESS_INSIGHTS=true
INCLUDE_LINKEDIN_CONTEXT=true
MENTION_DISCOVERY_SOURCE=true
# -----------------------------------------------------------------------------
# Workflow Configuration
# -----------------------------------------------------------------------------
ENABLE_ENHANCED_WORKFLOW=true
BATCH_PROCESSING_ENABLED=true
# Email Sending Workflow
AUTO_SEND_EMAILS=false
EMAIL_REVIEW_REQUIRED=true
ENABLE_DELIVERY_TRACKING=true
# -----------------------------------------------------------------------------
# Data Storage Configuration
# -----------------------------------------------------------------------------
NOTION_DATABASE_ID= # Leave empty to auto-create enhanced database
# Enhanced Data Storage
STORE_AI_STRUCTURED_DATA=true
STORE_PRODUCT_ANALYSIS=true
STORE_BUSINESS_METRICS=true
# -----------------------------------------------------------------------------
# Caching Configuration
# -----------------------------------------------------------------------------
ENABLE_CACHING=true
CACHE_MEMORY_MAX_ENTRIES=1000
CACHE_MEMORY_MAX_MB=100
CACHE_PERSISTENT_DIR=.cache
CACHE_DEFAULT_TTL=3600
# -----------------------------------------------------------------------------
# Logging and Monitoring
# -----------------------------------------------------------------------------
LOG_LEVEL=INFO
ENABLE_PERFORMANCE_MONITORING=true
ENABLE_ERROR_REPORTING=true# Use GPT-4 for best results
EMAIL_GENERATION_MODEL=gpt-4
AI_PARSING_MODEL=gpt-4
PRODUCT_ANALYSIS_MODEL=gpt-4# Use GPT-3.5-turbo for cost savings
EMAIL_GENERATION_MODEL=gpt-3.5-turbo
AI_PARSING_MODEL=gpt-3.5-turbo
PRODUCT_ANALYSIS_MODEL=gpt-3.5-turbo# Conservative approach (recommended for new users)
SCRAPING_DELAY=1.0
HUNTER_REQUESTS_PER_MINUTE=5
RESEND_REQUESTS_PER_MINUTE=50
# Optimized approach (default - balanced performance and reliability)
SCRAPING_DELAY=0.3
HUNTER_REQUESTS_PER_MINUTE=10
RESEND_REQUESTS_PER_MINUTE=100
# Aggressive approach (for experienced users with good API limits)
SCRAPING_DELAY=0.1
HUNTER_REQUESTS_PER_MINUTE=15
RESEND_REQUESTS_PER_MINUTE=100# High-quality personalization
PERSONALIZATION_LEVEL=high
ENHANCED_PERSONALIZATION=true
MAX_EMAIL_LENGTH=400
INCLUDE_PRODUCT_CONTEXT=true
INCLUDE_BUSINESS_INSIGHTS=trueEmail Format Guidelines (Updated for High-Converting Outreach):
- Subject Lines: Emotionally resonant and compelling, blending emotion with relevance
- Email Body: Brief (under 150 words), deeply personal with authentic language
- Structure: Bold "tl;dr" section, emotional opener, specific achievements, soft CTA
- Tone: Authentic, motivated, and obsessed builder language (avoiding corporate jargon)
- Personalization: AI-driven content with specific company insights and vulnerability/relatability
# Always review before sending
AUTO_SEND_EMAILS=false
EMAIL_REVIEW_REQUIRED=true
DRY_RUN_MODE=false # Only after thorough testingThe system includes a comprehensive caching service that provides both in-memory and persistent caching capabilities to optimize performance and reduce API costs.
- In-Memory Cache: Fast access with LRU eviction for frequently accessed data
- Persistent Cache: File-based storage for data persistence across restarts
- Automatic Promotion: Data automatically promoted from persistent to memory cache
- TTL Support: Configurable time-to-live for cache entries
- Cache Warming: Pre-populate cache with frequently accessed data
- Pattern-Based Invalidation: Invalidate multiple cache entries using wildcards
- Statistics Monitoring: Detailed cache performance metrics
- Memory Management: Automatic eviction based on size and memory limits
# Caching Configuration
ENABLE_CACHING=true # Enable caching system
CACHE_MEMORY_MAX_ENTRIES=1000 # Maximum entries in memory cache
CACHE_MEMORY_MAX_MB=100 # Maximum memory usage in MB
CACHE_PERSISTENT_DIR=.cache # Directory for persistent cache files
CACHE_DEFAULT_TTL=3600 # Default TTL in seconds (1 hour)from services.caching_service import CachingService
from utils.config import Config
# Initialize caching service
config = Config.from_env()
cache = CachingService(config)
# Store data with TTL
cache.set("linkedin_profile_123", profile_data, ttl=3600)
# Retrieve cached data
cached_profile = cache.get("linkedin_profile_123")
if cached_profile:
print("Using cached profile data")
# Get or set pattern
profile = cache.get_or_set(
"linkedin_profile_456",
lambda: expensive_linkedin_scraping_operation(),
ttl=3600
)# Define cache warming configuration
warming_config = {
"frequent_profiles": {
"factory": lambda: load_frequent_profiles(),
"ttl": 7200,
"priority": 10
},
"company_data": {
"factory": lambda: load_company_data(),
"ttl": 3600,
"priority": 5
}
}
# Warm the cache
cache.warm_cache(warming_config)# Get cache statistics
stats = cache.get_stats()
print(f"Cache hit rate: {stats.hit_rate:.2%}")
print(f"Memory usage: {stats.memory_usage_mb:.1f} MB")
print(f"Total entries: {stats.total_entries}")
# Pattern-based invalidation
cache.invalidate_pattern("linkedin_profile_*")
# Clean up expired entries
cache.cleanup_expired()- Reduced API Costs: Cache expensive AI parsing and API calls
- Faster Response Times: In-memory access is 100x faster than API calls
- Persistent Storage: Data survives application restarts
- Memory Efficient: Automatic eviction prevents memory exhaustion
- Thread Safe: Concurrent access from multiple services
The caching service is automatically integrated with AI services:
from services.ai_service import AIService
# AI service automatically uses caching
ai_service = AIService(config)
# First call makes API request and caches result
result1 = ai_service.parse_linkedin_profile(html_content)
print(f"Cached: {result1.cached}") # False
# Second call uses cached result
result2 = ai_service.parse_linkedin_profile(html_content)
print(f"Cached: {result2.cached}") # TrueThe system includes an optimized LinkedIn profile finder (LinkedInFinderOptimized) that provides dramatic performance improvements over traditional scraping methods. LinkedIn URL discovery is now handled as a separate, dedicated step rather than being automatically integrated into the ProductHunt scraping process.
- Speed: 0.5-2 seconds per profile (450-1800x faster than WebDriver)
- Rate Limiting: 0.5s delays vs 2.0s default (4x faster requests)
- No Browser Overhead: HTTP-only requests eliminate WebDriver dependencies
- Smart Caching: Failed searches cached to prevent repeated attempts
- Direct LinkedIn Search: Pattern-based URL construction with HEAD request validation
- Quick Google Search: Fast search with 3-second timeout
- Intelligent URL Generation: Fallback URL creation from name patterns
- Decoupled from ProductHunt Scraping: LinkedIn finding is now a separate service that can be called independently
- Selective Processing: Only searches for team members that don't already have LinkedIn URLs from ProductHunt
- On-Demand Execution: LinkedIn discovery runs when explicitly requested, not automatically during ProductHunt scraping
# Ultra-Fast LinkedIn Finder Configuration
LINKEDIN_FINDER_DELAY=0.5 # Request delay (reduced from 2.0s)
LINKEDIN_SEARCH_TIMEOUT=3 # Google search timeout
LINKEDIN_URL_CHECK_TIMEOUT=2 # URL validation timeout
LINKEDIN_CACHE_FAILED_SEARCHES=true # Cache failed searchesfrom services.linkedin_finder_optimized import LinkedInFinderOptimized
from models.data_models import TeamMember
from utils.config import Config
# Initialize optimized finder
config = Config.from_env()
finder = LinkedInFinderOptimized(config)
# Create team members (some may already have LinkedIn URLs from ProductHunt)
team_members = [
TeamMember(name="John Smith", role="CEO", company="TechCorp"),
TeamMember(name="Jane Doe", role="CTO", company="TechCorp", linkedin_url="https://linkedin.com/in/janedoe")
]
# Find LinkedIn URLs only for members without existing URLs
updated_members = finder.find_linkedin_urls_for_team(team_members)
# Check results
for member in updated_members:
if member.linkedin_url:
print(f"✅ Found: {member.name} -> {member.linkedin_url}")
else:
print(f"❌ Not found: {member.name}")from services.product_hunt_scraper import ProductHuntScraper
from services.linkedin_finder_optimized import LinkedInFinderOptimized
# Step 1: Extract team from ProductHunt (may include some LinkedIn URLs)
scraper = ProductHuntScraper(config)
team_members = scraper.extract_team_info(product_url)
# Step 2: Find missing LinkedIn URLs separately
finder = LinkedInFinderOptimized(config)
complete_team = finder.find_linkedin_urls_for_team(team_members)- 450-1800x Faster: Seconds instead of minutes per profile
- 4x Faster Requests: Reduced rate limiting delays
- Memory Efficient: No browser overhead or WebDriver dependencies
- Reliable: No browser crashes or timeout issues
- Smart Caching: Prevents repeated failed searches
- Selective Processing: Only processes team members that need LinkedIn URLs
The optimized LinkedIn finder is used as a separate service in the main workflow:
from controllers.prospect_automation_controller import ProspectAutomationController
# The controller handles LinkedIn discovery as a separate step
controller = ProspectAutomationController(config)
results = controller.run_discovery_pipeline(limit=10)
# LinkedIn URLs found through dedicated service when needed
print(f"LinkedIn profiles found: {results['summary'].get('linkedin_profiles_found', 0)}")The system includes a centralized WebDriver management system that provides unified browser automation across all scraping services with connection pooling, lifecycle management, and optimized resource usage.
- Connection Pooling: Reuse WebDriver instances across multiple operations
- Resource Management: Automatic cleanup and lifecycle management
- Thread Safety: Safe concurrent operations for parallel processing
- Performance Optimization: Reduced browser startup overhead
- Unified Chrome Options: Consistent browser settings across all scrapers
- Anti-Detection Features: Stealth mode configurations to avoid blocking
- Customizable Settings: Headless mode, window size, user agent, proxy support
- Performance Tuning: Image/JavaScript disabling for faster scraping
# WebDriver Configuration
WEBDRIVER_HEADLESS=true # Run browsers in headless mode
WEBDRIVER_POOL_SIZE=3 # Number of WebDriver instances in pool
WEBDRIVER_PAGE_LOAD_TIMEOUT=20 # Page load timeout in seconds
WEBDRIVER_IMPLICIT_WAIT=10 # Implicit wait timeout
WEBDRIVER_DISABLE_IMAGES=false # Disable image loading for performance
WEBDRIVER_DISABLE_JAVASCRIPT=false # Disable JavaScript execution
WEBDRIVER_PROXY= # Optional proxy server
WEBDRIVER_WINDOW_WIDTH=1920 # Browser window width
WEBDRIVER_WINDOW_HEIGHT=1080 # Browser window height
WEBDRIVER_USER_AGENT= # Custom user agent stringThe WebDriver manager is automatically used by scraping services:
from utils.webdriver_manager import get_webdriver_manager
# Get the global WebDriver manager
webdriver_manager = get_webdriver_manager(config)
# Use WebDriver with automatic cleanup
with webdriver_manager.get_driver("service_name") as driver:
driver.get("https://example.com")
# Perform scraping operations
# Driver is automatically returned to pool when done- 3-5x Faster: Reduced browser startup overhead through pooling
- Memory Efficient: Shared WebDriver instances across operations
- Resource Cleanup: Automatic cleanup prevents memory leaks
- Concurrent Operations: Thread-safe operations for parallel processing
# Get pool statistics
stats = webdriver_manager.get_pool_stats()
print(f"Active drivers: {stats['active_drivers']}")
print(f"Pool size: {stats['pool_size']}")
print(f"Max pool size: {stats['max_pool_size']}")This guide covers the core enhanced features of the Job Prospect Automation system. The new capabilities significantly improve the quality and effectiveness of your job prospecting workflow through AI-powered automation and intelligent personalization.
For additional support or questions about these enhanced features, refer to the troubleshooting guide or check the system logs for detailed error information.