A distributed social media simulation system using Ray for orchestration and LLM-based agent behaviors.
- Distributed Architecture: Server-client model using Ray for scalable simulation
- Multi-Database Support: SQLite, PostgreSQL, MySQL backends with optional Redis caching
- Configurable Parameters: JSON-based configuration for all simulation parameters
- LLM Integration: Support for Ollama and vLLM backends for realistic agent behaviors with batch inference
- Agent Profiles: User_mgmt-based agent system with Big Five personality traits
- Opinion Dynamics: Configurable models including bounded confidence and LLM-based evaluation for realistic opinion evolution and polarization
- Multi-Client Synchronization: Robust barrier-based coordination with heartbeat liveness detection
- Client-Side Step Management: Clients independently manage their simulation timelines
- Flexible Simulation: Configurable duration, agent population, and LLM parameters
- Structured Logging: Rotating JSON logs with timestamps and execution times
- UUID-Based IDs: Universal identifiers for distributed compatibility
- Performance Optimization: vLLM backend support for 8-30x faster LLM inference through batch processing
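As an illustration of the bounded confidence model named in the feature list, here is a minimal sketch of one pairwise update in the Deffuant style. The function name, the parameters `epsilon` (confidence bound) and `mu` (convergence rate), and the [0, 1] opinion scale are assumptions for illustration, not YSimulator's actual implementation or configuration keys.

```python
from typing import Tuple


def bounded_confidence_step(
    x_i: float, x_j: float, epsilon: float = 0.3, mu: float = 0.5
) -> Tuple[float, float]:
    """Apply one pairwise bounded-confidence update (hypothetical helper).

    Two agents influence each other only when their opinions differ by
    less than epsilon; otherwise both opinions stay unchanged.
    """
    if abs(x_i - x_j) >= epsilon:
        # Opinions are too far apart: no interaction takes place.
        return x_i, x_j
    # Both agents move toward each other by a fraction mu of the gap.
    shift = mu * (x_j - x_i)
    return x_i + shift, x_j - shift


# Close opinions converge; distant opinions are left as they are.
close_pair = bounded_confidence_step(0.4, 0.6)
distant_pair = bounded_confidence_step(0.1, 0.9)
```

With a small `epsilon`, repeated updates of this kind tend to leave separate opinion clusters, which is one way polarization can emerge in such models.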
Online Documentation - Complete documentation hosted on GitHub Pages
Tip: Browse the full documentation online with search functionality, or explore the markdown files directly in the docs/ directory.
Complete Documentation Index - Navigate all documentation organized by topic and use case
New in 2.1: Documentation has been reorganized into thematic subdirectories for better navigation and discoverability.
Getting Started:
- Configuration Guide - Complete guide to all configuration options (1,550 lines)
- Architecture Overview - System design and components (960+ lines)
Core Features:
- Recommendation Systems - Content & follow recommendations with 15 algorithms (1,200 lines)
- Opinion Dynamics - Bounded confidence and LLM evaluation models (1,200 lines)
- Interests & Topics - Interest management with attention windows (300 lines)
Agent System:
- Agent Actions - All available agent actions (700+ lines)
- Agent Types - Agent types and archetypes (670+ lines)
- Agent Temporal Activities - Temporal patterns and dynamics (990+ lines)
System & Performance:
- vLLM Integration Guide - High-performance LLM backend with batch inference (8x-30x speedup)
- vLLM Batch Inference - Comprehensive batch inference implementation (10x-50x speedup)
- Database & Storage - Redis/SQL hybrid architecture, 89% Redis coverage (480 lines)
- Redis Integration - Caching strategies and implementation (870 lines)
- Performance Optimization - Bottleneck analysis and optimization strategies
Development:
- Extension Guide - How to add new agent actions and features (1,210+ lines)
- System Diagrams - Visual architecture and interaction diagrams (800 lines)
- Code Formatting - Development guidelines and tooling
Monitoring:
- Logging Configuration - Comprehensive logging setup (420 lines)
- Server Logging - Server log analysis (380 lines)
- Action Logging - Client action tracking (160 lines)
Browse by Use Case: See the Documentation Index for recommended reading paths by role (Researcher, Developer, Admin) and feature cross-references.
The simulator uses JSON configuration files stored in a single directory. See docs/configuration/CONFIG.md for detailed documentation.
The configuration directory contains the following files:
- server_config.json - Server parameters (name, namespace, address, port, database)
- simulation_config.json - Client parameters, LLM settings, simulation duration
- agent_population.json - Agent profiles and distribution
- llm_prompts.json - LLM prompt templates and personas
- network.csv - (Optional) Initial social network topology defining follow relationships
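The single-directory layout can be checked before launching a client. The sketch below is a hypothetical helper, not part of YSimulator: the file names come from the list above, while `load_client_config`, the returned dictionary shape, and the `has_network` key are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Client-side file names as listed above; the loader itself is hypothetical.
CLIENT_FILES = (
    "simulation_config.json",
    "agent_population.json",
    "llm_prompts.json",
)


def load_client_config(config_dir: str = ".") -> Dict[str, Any]:
    """Read the client JSON files from one directory, failing early if any is missing."""
    base = Path(config_dir)
    missing = [name for name in CLIENT_FILES if not (base / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {base}: {', '.join(missing)}")

    config: Dict[str, Any] = {}
    for name in CLIENT_FILES:
        with open(base / name, encoding="utf-8") as handle:
            # Key each file by its stem, e.g. "simulation_config".
            config[Path(name).stem] = json.load(handle)

    # network.csv is optional, so only record whether it is present.
    config["has_network"] = (base / "network.csv").is_file()
    return config
```

Calling `load_client_config("my_config")` would raise `FileNotFoundError` naming any absent file, which is easier to act on than a failure partway through startup.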
pip install -r requirements.txt
If using PostgreSQL or MySQL, initialize the database schema:
python scripts/init_db.py --config my_config
# Copy example configuration to a directory
mkdir my_config
cp example_conf/*.json my_config/
# Edit as needed
nano my_config/server_config.json
python run_server.py --config my_config
python run_client.py --config my_config
You can start multiple clients to distribute the simulation load.
If --config is not specified, both server and client will use the current directory:
# Uses ./server_config.json
python run_server.py
# Uses ./simulation_config.json, ./agent_population.json, ./llm_prompts.json
python run_client.py
All output files are created in the configuration directory:
- simulation.db - SQLite database with simulation data
- logs/ - Rotating JSON logs for server and client
- ray_config.temp - Temporary Ray cluster address file (auto-created by server, used by clients)
Edit the JSON configuration files to customize:
- Number of agents and their characteristics
- LLM model and parameters
- Simulation duration
- Agent personas and behaviors
- Database location
See docs/configuration/CONFIG.md for full configuration options and examples.
YSimulator uses a distributed coordinator-worker pattern:
- Server (Orchestrator): Coordinates temporal progression and manages barriers
- Clients (Workers): Execute simulation steps independently
- Database Middleware: Abstracts storage (SQL + optional Redis)
- Ray: Enables distributed execution without manual networking
For detailed architecture information, including component diagrams and data flow, see docs/architecture/ARCHITECTURE.md and docs/architecture/DIAGRAMS.md.
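The barrier-based coordination between the orchestrator and the workers can be sketched without Ray. The example below swaps in `threading.Barrier` from the standard library as a stand-in: threads play the clients, and the barrier's action callback plays the server advancing simulated time. All names here are hypothetical, and the real system coordinates Ray actors rather than threads.

```python
import threading
from typing import List


def run_simulation(num_clients: int, num_steps: int) -> List[str]:
    """Run a toy barrier-synchronized simulation and return its event log."""
    events: List[str] = []
    lock = threading.Lock()
    clock = {"step": 0}

    def advance_clock() -> None:
        # Called once per barrier release, while all other clients are
        # still blocked: this plays the orchestrator's role.
        events.append(f"server: step {clock['step']} complete")
        clock["step"] += 1

    barrier = threading.Barrier(num_clients, action=advance_clock)

    def client(client_id: int) -> None:
        for _ in range(num_steps):
            step = clock["step"]
            with lock:
                events.append(f"client {client_id}: ran step {step}")
            # Block until every client has finished this step. The timeout
            # breaks the barrier instead of hanging if a client stalls.
            barrier.wait(timeout=5)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(num_clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events


log = run_simulation(num_clients=3, num_steps=2)
```

No client starts step N+1 until every client has finished step N, which is the property the server's barriers are described as enforcing. The timeout is only a rough analogue of liveness detection; it is not a model of the heartbeat mechanism itself.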
To add new agent actions or customize behavior:
- Define the data model (SQLAlchemy)
- Add storage methods to DatabaseMiddleware
- Create server handler method
- Implement client-side action logic
- Integrate into the action loop
See docs/development/EXTENDING.md for step-by-step instructions and examples.
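The five steps above can be sketched end to end with a made-up "bookmark" action. To stay self-contained, this sketch substitutes a dataclass and the standard-library `sqlite3` module for SQLAlchemy, and every class and function name (`Bookmark`, `SketchMiddleware`, `SketchServer`, `bookmark_action`, `run_step`) is hypothetical; the project's real interfaces are described in the extension guide.

```python
import sqlite3
import uuid
from dataclasses import dataclass


# Step 1: data model (a dataclass stands in for a SQLAlchemy model).
@dataclass
class Bookmark:
    id: str
    agent_id: str
    post_id: str
    step: int


# Step 2: storage methods (a tiny stand-in for DatabaseMiddleware).
class SketchMiddleware:
    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS bookmarks "
            "(id TEXT PRIMARY KEY, agent_id TEXT, post_id TEXT, step INTEGER)"
        )

    def add_bookmark(self, bookmark: Bookmark) -> None:
        self.conn.execute(
            "INSERT INTO bookmarks VALUES (?, ?, ?, ?)",
            (bookmark.id, bookmark.agent_id, bookmark.post_id, bookmark.step),
        )
        self.conn.commit()

    def count_bookmarks(self, agent_id: str) -> int:
        row = self.conn.execute(
            "SELECT COUNT(*) FROM bookmarks WHERE agent_id = ?", (agent_id,)
        ).fetchone()
        return row[0]


# Step 3: server handler that assigns a UUID and persists the record.
class SketchServer:
    def __init__(self, db: SketchMiddleware) -> None:
        self.db = db

    def handle_bookmark(self, agent_id: str, post_id: str, step: int) -> str:
        bookmark = Bookmark(str(uuid.uuid4()), agent_id, post_id, step)
        self.db.add_bookmark(bookmark)
        return bookmark.id


# Step 4: client-side action logic.
def bookmark_action(server: SketchServer, agent_id: str, post_id: str, step: int) -> str:
    return server.handle_bookmark(agent_id, post_id, step)


# Step 5: integration into the action loop via a dispatch table.
ACTIONS = {"bookmark": bookmark_action}


def run_step(server: SketchServer, agent_id: str, chosen: str, post_id: str, step: int) -> str:
    return ACTIONS[chosen](server, agent_id, post_id, step)
```

A dispatch table keeps step 5 to a one-line registration per new action, so the loop itself does not change as actions are added.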
YSimulator/
├── YSimulator/                       # Main package
│   ├── YServer/                      # Server orchestration logic
│   ├── YClient/                      # Client agent logic
│   └── tests/                        # Unit and integration tests
├── scripts/                          # Utility scripts
│   ├── init_db.py                    # Database initialization
│   ├── convert_ids_to_uuid.py        # ID migration utility
│   ├── validate_network_loading.py   # Network validation
│   └── postgresql_server.sql         # PostgreSQL schema
├── docs/                             # Documentation
├── example/                          # Example configurations
├── run_server.py                     # Server entry point
├── run_client.py                     # Client entry point
└── requirements.txt                  # Python dependencies
Run the test suite:
# Run all tests
python -m pytest YSimulator/tests/
# Run specific test file
python -m pytest YSimulator/tests/test_network_loading.py
# Run tests with better isolation (recommended for avoiding test interference)
# Install pytest-xdist first: pip install pytest-xdist
python -m pytest YSimulator/tests/ -n auto
# Run tests in parallel with 4 workers
python -m pytest YSimulator/tests/ -n 4
Note: Some tests use module-level mocking (e.g., test_server.py mocks ray.remote). When running the full test suite, use pytest-xdist with the -n flag to run tests in separate worker processes, which prevents mock interference between test modules.
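The interference described in the note comes from mocks that are installed at import time and never removed. The sketch below reproduces the effect with a made-up module named `fake_backend` standing in for a real dependency such as Ray; it is an illustration of the general mechanism, not a copy of what test_server.py does.

```python
import sys
import types
from unittest import mock

# A made-up module standing in for a real dependency; registered in
# sys.modules so that "import fake_backend" resolves to it.
fake = types.ModuleType("fake_backend")
fake.remote = lambda func: ("real", func)
sys.modules["fake_backend"] = fake


def scoped_mock() -> bool:
    """Patch inside a context manager: the original is restored on exit."""
    import fake_backend

    with mock.patch.object(fake_backend, "remote", mock.MagicMock(name="remote")):
        return isinstance(fake_backend.remote, mock.MagicMock)


def leaky_module_level_mock() -> None:
    """Overwrite the attribute with no cleanup, as an import-time mock would."""
    import fake_backend

    fake_backend.remote = mock.MagicMock(name="remote")


patched_inside_scope = scoped_mock()
restored_after_scope = not isinstance(fake.remote, mock.MagicMock)

leaky_module_level_mock()
# Any test module imported later in this process now sees the mock.
leaked = isinstance(fake.remote, mock.MagicMock)
```

Running test modules in separate worker processes gives each its own `sys.modules`, so a leaked mock cannot reach tests assigned to a different worker.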
The scripts/ directory contains utility scripts:
- init_db.py: Initialize database schema for PostgreSQL/MySQL
- convert_ids_to_uuid.py: Migrate existing data to UUID format
- validate_network_loading.py: Validate network topology files
- postgresql_server.sql: PostgreSQL database schema
Code contributions should follow the formatting guidelines in docs/development/FORMATTING.md.
- Install development dependencies:
  pip install -r requirements-dev.txt
- Install pre-commit hooks:
  pre-commit install
The pre-commit hooks will automatically run black, isort, and flake8 on every commit to ensure code quality and consistency.