HenrikMader/LogAnalyzer
Linux Log Analyzer with LLM Integration

A powerful log analysis tool for Linux systems (optimized for Power Systems) that combines traditional log collection with AI-powered insights using Large Language Models.

Features

  • Systemd Journal Integration: Collect logs from systemd journal with filtering by units and priority levels
  • File Monitoring: Monitor log files using inotify API for real-time updates
  • LLM Analysis: Analyze logs using OpenAI-compatible API endpoints
  • Multiple Analysis Types:
    • General Summary
    • Security Analysis
    • Performance Analysis
  • Interactive Web UI: Simple Streamlit-based interface for easy interaction
  • Export Capabilities: Export logs in JSON format

Architecture

┌─────────────────────────────────────────────────────────────┐
│                 Streamlit Frontend (app.py)                 │
│  - Interactive UI for log viewing and analysis              │
│  - Configuration management                                 │
│  - Export functionality                                     │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│              Log Collector (log_collector.py)               │
│  ┌──────────────────────┐  ┌──────────────────────┐         │
│  │ SystemdLogCollector  │  │ FileLogCollector     │         │
│  │ - journalctl         │  │ - inotify/watchdog   │         │
│  │ - Priority filtering │  │ - File monitoring    │         │
│  └──────────────────────┘  └──────────────────────┘         │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│               LLM Analyzer (llm_analyzer.py)                │
│  - OpenAI-compatible API integration                        │
│  - Log formatting for LLM consumption                       │
│  - Multiple analysis types                                  │
│  - Statistical insights extraction                          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                 LLM Backend (Power Systems)                 │
│  - OpenAI-compatible API endpoint                           │
│  - Running on Power architecture                            │
└─────────────────────────────────────────────────────────────┘

Installation

Prerequisites

  • Python 3.8 or higher
  • Linux system with systemd
  • Access to an OpenAI-compatible LLM API endpoint

Setup

  1. Clone or download this repository:

     cd /path/to/log-analyzer

  2. Install dependencies:

     pip install -r requirements.txt

  3. Configure the application: edit config.yaml to set your LLM API endpoint and preferences:

     llm:
       api_base: "http://your-llm-server:8000/v1"
       api_key: "your-api-key"
       model: "your-model-name"

Usage

Running the Application

Start the Streamlit application:

streamlit run app.py

The application opens in your default web browser at http://localhost:8501.

Using the Interface

  1. Logs Tab:

    • Click "Collect Logs" to gather system logs
    • Use filters to narrow down logs by priority, source, or search terms
    • Export logs to JSON format
  2. AI Analysis Tab:

    • Select analysis type (Summary, Security, or Performance)
    • Click "Analyze with AI" for single analysis
    • Use "Batch Analysis" to run all analysis types at once
  3. Statistics Tab:

    • View log distribution metrics
    • See error and warning counts
    • Analyze logs by source
  4. Settings Tab:

    • View and modify configuration
    • Save changes to config.yaml
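The counts shown in the Statistics tab can be approximated in a few lines of Python. The entry schema used here (dicts with "priority" and "source" keys) is an assumption for illustration, not necessarily the tool's internal format:

```python
from collections import Counter

def log_statistics(logs):
    """Tally log entries by priority and source (schema is assumed)."""
    by_priority = Counter(entry.get("priority", "unknown") for entry in logs)
    by_source = Counter(entry.get("source", "unknown") for entry in logs)
    return {
        "total": len(logs),
        "errors": by_priority.get("err", 0),
        "warnings": by_priority.get("warning", 0),
        "by_source": dict(by_source),
    }

sample = [
    {"priority": "err", "source": "sshd.service"},
    {"priority": "warning", "source": "sshd.service"},
    {"priority": "info", "source": "file:/var/log/syslog"},
]
stats = log_statistics(sample)
```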

Command-Line Usage

You can also use the modules programmatically:

import yaml
from log_collector import LogCollector
from llm_analyzer import LLMAnalyzer

# Load configuration
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Collect logs
collector = LogCollector(config)
logs = collector.collect_all_logs()

# Analyze with LLM
analyzer = LLMAnalyzer(config)
result = analyzer.analyze_logs(logs, analysis_type='summary')

print(result['analysis'])

(The class and method names above come from log_collector.py and llm_analyzer.py; see the Configuration section for the expected config.yaml keys.)

Configuration

config.yaml Structure

llm:
  api_base: "http://localhost:8000/v1"  # LLM API endpoint
  api_key: "your-api-key"                # API authentication key
  model: "gpt-3.5-turbo"                 # Model name
  max_tokens: 2000                       # Maximum response tokens
  temperature: 0.7                       # Response creativity (0-1)

logs:
  systemd:
    enabled: true
    units:                               # Systemd units to monitor
      - "sshd.service"
      - "systemd-logind.service"
    priority_levels:                     # Log priority levels
      - "err"
      - "warning"
      - "notice"
    max_entries: 1000
    
  file_monitoring:
    enabled: true
    paths:                               # Log files to monitor
      - "/var/log/syslog"
      - "/var/log/auth.log"
    
  time_range_hours: 24                   # How far back to collect logs

analysis:
  batch_size: 50                         # Logs per LLM request
  prompts:                               # Custom analysis prompts
    summary: "Analyze these logs..."
    security: "Review for security..."
    performance: "Identify performance issues..."

Log Sources

Systemd Journal

  • Collects logs using journalctl
  • Filters by unit, priority, and time range
  • Supports all systemd priority levels (emerg, alert, crit, err, warning, notice, info, debug)
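A journalctl invocation with these filters can be sketched as follows. The exact flags the collector passes live in log_collector.py; this is a hedged approximation using journalctl's documented options (-o json emits one JSON object per line):

```python
import json

def build_journalctl_cmd(units, priority="warning", hours=24):
    """Assemble a journalctl command with unit, priority, and time filters.
    (Illustrative; the real invocation is in log_collector.py.)"""
    cmd = ["journalctl", "-o", "json", "--no-pager",
           "-p", priority, "--since", f"-{hours}h"]
    for unit in units:
        cmd += ["-u", unit]
    return cmd

def parse_journal_line(line):
    """journalctl -o json emits one JSON object per line."""
    entry = json.loads(line)
    return {
        "message": entry.get("MESSAGE"),
        "unit": entry.get("_SYSTEMD_UNIT"),
        "priority": entry.get("PRIORITY"),
    }

cmd = build_journalctl_cmd(["sshd.service"], priority="err", hours=24)
```

The command list can then be run with subprocess.run and each output line fed through parse_journal_line.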

File Monitoring

  • Uses the watchdog library (backed by inotify on Linux) for real-time monitoring
  • Supports common log files: syslog, auth.log, kern.log, etc.
  • Reads recent entries and monitors for new ones

LLM Integration

The analyzer uses an OpenAI-compatible API, which means it works with:

  • OpenAI API
  • Local LLM servers (llama.cpp, vLLM, etc.)
  • Custom LLM deployments on Power Systems
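"OpenAI-compatible" means the analyzer sends standard chat-completions request bodies, so any backend that speaks that protocol works. A sketch of the payload shape, using the config.yaml keys from the Configuration section (the exact payload llm_analyzer.py builds is an assumption):

```python
def build_chat_request(config, log_text, prompt):
    """Shape a request body for an OpenAI-compatible
    /v1/chat/completions endpoint (illustrative)."""
    llm = config["llm"]
    return {
        "model": llm["model"],
        "max_tokens": llm.get("max_tokens", 2000),
        "temperature": llm.get("temperature", 0.7),
        "messages": [
            {"role": "system", "content": prompt},       # analysis instructions
            {"role": "user", "content": log_text},       # formatted log batch
        ],
    }

config = {"llm": {"model": "gpt-3.5-turbo", "max_tokens": 2000, "temperature": 0.7}}
payload = build_chat_request(
    config,
    "Oct 1 12:00 sshd[1]: Failed password",
    "Analyze these logs...",
)
```

The payload is then POSTed to f"{api_base}/chat/completions" with the api_key as a Bearer token.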

Analysis Types

  1. Summary Analysis: General overview of system health and events
  2. Security Analysis: Focus on authentication, access, and security events
  3. Performance Analysis: Identify bottlenecks and resource issues
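Each analysis type maps to a prompt, with config.yaml's analysis.prompts section overriding built-in defaults. A sketch of that lookup (the default prompt strings here are placeholders, not the tool's actual prompts):

```python
DEFAULT_PROMPTS = {
    "summary": "Analyze these logs and summarize overall system health.",
    "security": "Review these logs for authentication and security events.",
    "performance": "Identify performance bottlenecks and resource issues.",
}

def select_prompt(config, analysis_type):
    """Pick the prompt for an analysis type, preferring config.yaml
    overrides (analysis.prompts) over the defaults (illustrative)."""
    if analysis_type not in DEFAULT_PROMPTS:
        raise ValueError(f"unknown analysis type: {analysis_type}")
    overrides = config.get("analysis", {}).get("prompts", {})
    return overrides.get(analysis_type, DEFAULT_PROMPTS[analysis_type])
```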

Permissions

Some log collection features require elevated permissions:

# For systemd journal access
sudo usermod -a -G systemd-journal $USER

# For file monitoring (if needed)
sudo chmod +r /var/log/auth.log

Or run with sudo:

sudo streamlit run app.py

Troubleshooting

"journalctl not found"

  • Ensure systemd is installed on your system
  • Check that journalctl is in your PATH

"Permission denied" errors

  • Run with sudo or add user to appropriate groups
  • Check file permissions for monitored log files

LLM API connection errors

  • Verify the API endpoint is accessible
  • Check API key is correct
  • Ensure the LLM server is running

No logs collected

  • Check systemd units exist: systemctl list-units
  • Verify log files exist and are readable
  • Adjust time_range_hours in config.yaml

Performance Considerations

  • Log Volume: Large log volumes may take time to collect and analyze
  • LLM Tokens: Batch size affects API token usage and cost
  • Real-time Monitoring: File monitoring uses system resources (inotify)
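The batch_size setting caps how many log entries go into a single LLM request, which bounds per-request token usage. The chunking itself is one line (a sketch; the actual splitting logic lives in llm_analyzer.py):

```python
def batch_logs(logs, batch_size=50):
    """Split collected logs into chunks of at most batch_size entries,
    one chunk per LLM request (mirrors analysis.batch_size)."""
    return [logs[i:i + batch_size] for i in range(0, len(logs), batch_size)]

# 120 entries at batch_size=50 -> 3 requests (50, 50, 20 entries)
batches = batch_logs(list(range(120)), batch_size=50)
```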

Future Enhancements

  • Real-time log streaming
  • Alert system for critical events
  • Historical trend analysis
  • Custom log parsers for specific applications
  • Multi-system log aggregation
  • Advanced filtering and search
  • Log correlation and pattern detection

License

This project is provided as-is for use on Linux Power Systems.

Contributing

Contributions are welcome! Areas for improvement:

  • Additional log sources
  • Enhanced LLM prompts
  • Performance optimizations
  • UI improvements
  • Documentation

Support

For issues specific to:

  • Power Systems: Consult IBM Power documentation
  • LLM Integration: Check your LLM provider's documentation
  • Linux Logs: Refer to systemd and syslog documentation
