Add GR801 SoC radiation hardness simulation pipeline #62

Merged
danielnovais-tech merged 6 commits into main from copilot/add-radiation-model
Feb 8, 2026

Copilot AI commented Feb 8, 2026

Contributor
  • Create simulation_pipeline_gr801.py file with GR801 SoC radiation model
  • Create corresponding test file test_simulation_pipeline_gr801.py
  • Verify the code runs without errors
  • Run tests to ensure functionality (17 tests passing)
  • Run linters and CI checks (ruff, mypy all passing)
  • Address code review feedback
    • Replace print() with logging.getLogger(__name__)
    • Fix unused variable in test
    • Add two blank lines between top-level definitions (PEP 8)
    • Fix arithmetic operator spacing (1024*1024 → 1024 * 1024)
    • Move performance metric update to after fault injection/correction
    • Make correction_rate configurable via config parameter
    • Make error_threshold configurable via config parameter
    • Rename metrics to clarify cumulative nature (total_faults_injected, total_faults_corrected)
    • Fix flaky test by seeding random number generator
    • Adjust log level for safe shutdown (ERROR → WARNING)
  • Run CodeQL security scan (no issues found)

Summary

All review feedback has been addressed. The simulation now uses proper logging with appropriate log levels, has configurable parameters for correction_rate and error_threshold, and all tests pass reliably.

Original prompt

simulation_pipeline_gr801.py

import numpy as np
from typing import Dict, Any, List, Optional

# --- Data Structures ---

class SoC:
    """Model of the GR801 SoC."""
    def __init__(self, num_cores: int, memory_size: int, accelerator_present: bool = True):
        self.num_cores = num_cores
        self.memory = np.zeros(memory_size, dtype=np.uint8)
        self.accelerator_present = accelerator_present
        self.registers = [0] * 32 * num_cores # Assuming 32 registers per core
        self.cache = np.zeros(1024, dtype=np.uint8) # Simplified cache
        self.errors = 0
        self.performance = 0.0 # Some performance metric

class RadiationModel:
    """Models the radiation environment."""
    def __init__(self, particle_flux: float, upset_rate: float):
        self.particle_flux = particle_flux # particles per cm^2 per second
        self.upset_rate = upset_rate # probability of an upset per particle

class AIApplication:
    """Represents an AI application running on the SoC."""
    def __init__(self, task: str, input_data: np.ndarray):
        self.task = task # e.g., "image_classification"
        self.input_data = input_data
        self.output = None
        self.accuracy = 1.0 # Current accuracy of the application

class SimulationState:
    """Holds the current state of the simulation."""
    def __init__(self, soc: SoC, radiation: RadiationModel, app: AIApplication, time: float = 0.0):
        self.soc = soc
        self.radiation = radiation
        self.app = app
        self.time = time
        self.faults_injected = 0
        self.faults_corrected = 0

# --- Initialization ---

def initialize_soc(config: Dict[str, Any]) -> SoC:
    """Initialize the SoC with given configuration."""
    num_cores = config.get('num_cores', 4)
    memory_size = config.get('memory_size', 1024*1024) # 1 MB
    accelerator = config.get('accelerator', True)
    return SoC(num_cores, memory_size, accelerator)

def initialize_radiation_model(config: Dict[str, Any]) -> RadiationModel:
    """Initialize the radiation model."""
    particle_flux = config.get('particle_flux', 1.0) # particles/cm^2/s
    upset_rate = config.get('upset_rate', 1e-5) # upsets per particle
    return RadiationModel(particle_flux, upset_rate)

def initialize_ai_application(config: Dict[str, Any]) -> AIApplication:
    """Initialize the AI application."""
    task = config.get('task', 'image_classification')
    input_data = config.get('input_data', np.random.rand(100, 100))
    return AIApplication(task, input_data)

# --- Core Steps ---

def run_ai_application(soc: SoC, app: AIApplication) -> None:
    """Run the AI application on the SoC."""
    # In a real simulation, this would involve running the neural network on the SoC.
    # Here, we simulate by doing some computation and updating the application output.
    if soc.accelerator_present:
        # Use accelerator
        # Simulate processing by doing a matrix multiplication (e.g., convolution)
        # For simplicity, we'll just compute a dot product.
        processed_data = np.dot(app.input_data.flatten(), app.input_data.flatten())
    else:
        # Use CPU cores
        processed_data = np.sum(app.input_data)

    # Store the result in memory (simplified)
    soc.memory[0] = processed_data % 256
    app.output = processed_data

    # Update performance metric (e.g., operations per second)
    soc.performance = 1.0 / (1.0 + soc.errors)  # Simplified: errors reduce performance

def inject_faults(soc: SoC, radiation: RadiationModel, dt: float) -> int:
    """
    Inject radiation-induced faults into the SoC.
    Returns the number of faults injected.
    """
    # Calculate expected number of particles hitting the chip
    chip_area = 1.0 # cm^2 (simplified)
    expected_particles = radiation.particle_flux * chip_area * dt

    # Poisson distribution for number of particles
    num_particles = np.random.poisson(expected_particles)

    # Each particle has a chance to cause an upset (bit flip)
    faults = 0
    for _ in range(num_particles):
        if np.random.random() < radiation.upset_rate:
            faults += 1
            # Choose a random location to flip a bit
            fault_type = np.random.choice(['memory', 'register', 'cache'])
            if fault_type == 'memory':
                address = np.random.randint(0, len(soc.memory))
                bit = np.random.randint(0, 8)
                soc.memory[address] ^= (1 << bit)
            elif fault_type == 'register':
                reg = np.random.randint(0, len(soc.registers))
                soc.registers[reg] ^= 1
            else:  # cache
                address = np.random.randint(0, len(soc.cache))
                bit = np.random.randint(0, 8)
                soc.cache[address] ^= (1 << bit)

    soc.errors += faults
    return faults

def apply_fault_tolerance(soc: SoC) -> int:
    """
    Apply fault tolerance mechanisms to correct errors.
    Returns the number of faults cor...



Copilot AI and others added 3 commits February 8, 2026 02:03
Co-authored-by: danielnovais-tech <230455320+danielnovais-tech@users.noreply.github.com>
Co-authored-by: danielnovais-tech <230455320+danielnovais-tech@users.noreply.github.com>
Co-authored-by: danielnovais-tech <230455320+danielnovais-tech@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add radiation model to simulation pipeline" to "Add GR801 SoC radiation hardness simulation pipeline" Feb 8, 2026
danielnovais-tech marked this pull request as ready for review February 8, 2026 02:14
Copilot AI review requested due to automatic review settings February 8, 2026 02:14

Copilot AI left a comment

Contributor

Pull request overview

Adds a new, self-contained GR801 SoC “radiation hardness” simulation module plus a pytest suite to validate basic initialization, fault injection/correction, monitoring, and end-to-end execution.

Changes:

  • Introduces simulation_pipeline_gr801.py implementing SoC/app/radiation/state structures and a main loop with fault injection, ECC-like correction, and shutdown checks.
  • Adds tests/test_simulation_pipeline_gr801.py with unit and smoke tests for the new simulation pipeline.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| simulation_pipeline_gr801.py | New GR801 SoC radiation fault simulation pipeline (core logic + metrics + shutdown behavior). |
| tests/test_simulation_pipeline_gr801.py | New pytest coverage for initialization, fault injection/tolerance, monitoring, and simulation runs. |


Comment on lines +149 to +165
def log_state(metrics: dict[str, Any]) -> None:
    """Log the current state."""
    print(f"Time: {metrics['time']:.2f}s, Errors: {metrics['errors']}, Performance: {metrics['performance']:.2f}")

def safety_violation_detected(state: SimulationState) -> bool:
    """Check for safety violations (e.g., too many errors)."""
    # If errors exceed a threshold, trigger a shutdown.
    error_threshold = 1000
    if state.soc.errors > error_threshold:
        print(f"Safety violation: Too many errors ({state.soc.errors})")
        return True
    return False

def trigger_safe_shutdown(state: SimulationState) -> None:
    """Trigger a safe shutdown of the system."""
    print("Triggering safe shutdown.")
    # Save critical data, power down, etc.

Copilot AI Feb 8, 2026

This module uses print() for logging in log_state, safety_violation_detected, and trigger_safe_shutdown. Elsewhere in the repo (e.g., simulation_pipeline.py) simulation logging uses logging.getLogger(__name__) and LOGGER.info/error. To align with that convention and avoid noisy stdout in library use, switch these to a module logger (and ideally make logging frequency configurable).

Contributor Author

Switched to logging.getLogger(__name__) throughout the module in commit e7b5e45. All print() statements replaced with LOGGER.info/warning/error to match simulation_pipeline.py convention.
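
A minimal sketch of the convention this thread describes, assuming a module-level `LOGGER` and the metric keys from the reviewed `log_state`. The simplified, argument-free `trigger_safe_shutdown` is an assumption for illustration; the code merged in commit e7b5e45 may differ.

```python
import logging
from typing import Any

# Module-level logger, following the convention the review cites from simulation_pipeline.py.
LOGGER = logging.getLogger(__name__)


def log_state(metrics: dict[str, Any]) -> None:
    """Log the current state at INFO level instead of printing to stdout."""
    LOGGER.info(
        "Time: %.2fs, Errors: %d, Performance: %.2f",
        metrics["time"],
        metrics["errors"],
        metrics["performance"],
    )


def trigger_safe_shutdown() -> None:
    """Log the shutdown at WARNING level (the PR lowered this from ERROR)."""
    LOGGER.warning("Triggering safe shutdown.")
```

Passing the values as logger arguments rather than pre-formatting an f-string defers string formatting until a handler actually accepts the record, which keeps library use quiet when logging is disabled.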

sim.run_ai_application(soc, app)

assert app.output is not None
# Memory should be modified (output is stored at memory[0])

Copilot AI Feb 8, 2026

initial_memory = soc.memory[0] is assigned but never used, which will fail Ruff/pyflakes (F841). Either remove it or use it to assert that run_ai_application() actually changes memory[0].

Suggested change
- # Memory should be modified (output is stored at memory[0])
+ # Memory should be modified (output is stored at memory[0])
+ assert soc.memory[0] != initial_memory

Contributor Author

Fixed in commit e7b5e45. Now using initial_memory in assertion: assert soc.memory[0] != initial_memory
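
One way such an assertion could be kept stable is sketched here, under stated assumptions. The `SoC` class and `run_ai_application` below are hypothetical stand-ins reduced to the fields the test touches, not the PR's code. The fixed all-ones input is also an assumption: with random input, the low byte written to `memory[0]` could in principle be 0 and equal the initial value.

```python
import numpy as np


class SoC:
    """Hypothetical stand-in for the PR's SoC: only the fields this test needs."""

    def __init__(self, memory_size: int = 16, accelerator_present: bool = True):
        self.memory = np.zeros(memory_size, dtype=np.uint8)
        self.accelerator_present = accelerator_present


def run_ai_application(soc: SoC, input_data: np.ndarray) -> float:
    """Simplified stand-in: compute a result and store its low byte in memory[0]."""
    flat = input_data.flatten()
    if soc.accelerator_present:
        result = float(np.dot(flat, flat))
    else:
        result = float(np.sum(flat))
    soc.memory[0] = int(result) % 256
    return result


def test_run_ai_application_writes_memory() -> None:
    soc = SoC()
    initial_memory = soc.memory[0]  # scalar copy of the initial value (0)
    # Fixed input: the dot product of four ones is 4.0, so memory[0] becomes 4.
    output = run_ai_application(soc, np.ones((2, 2)))
    assert output == 4.0
    assert soc.memory[0] != initial_memory
    assert soc.memory[0] == 4
```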

Comment on lines +18 to +22
self.performance = 0.0 # Some performance metric

class RadiationModel:
    """Models the radiation environment."""
    def __init__(self, particle_flux: float, upset_rate: float):

Copilot AI Feb 8, 2026

Top-level class/function definitions need two blank lines between them to satisfy Ruff/pycodestyle (E302/E305). For example, there’s only one blank line between SoC and RadiationModel here; apply consistent 2-blank-line spacing throughout the module.

Contributor Author

Fixed in commit e7b5e45. Added two blank lines between all top-level class and function definitions per PEP 8.

Comment on lines +47 to +50
num_cores = config.get('num_cores', 4)
memory_size = config.get('memory_size', 1024*1024) # 1 MB
accelerator = config.get('accelerator', True)
return SoC(num_cores, memory_size, accelerator)

Copilot AI Feb 8, 2026

memory_size = config.get('memory_size', 1024*1024) will trigger Ruff E226 (missing whitespace around arithmetic operator). Update to 1024 * 1024 to satisfy the repo lint configuration (ruff.toml enables E rules).

Contributor Author

Fixed in commit e7b5e45. Changed to 1024 * 1024 with proper spacing.

Comment on lines +183 to +203
# Run the AI application
run_ai_application(soc, app)

# Inject faults due to radiation
faults = inject_faults(soc, radiation, dt)
state.faults_injected += faults

# Apply fault tolerance
corrected = apply_fault_tolerance(soc)
state.faults_corrected += corrected

# Update radiation model (if dynamic)
update_radiation_model(radiation, dt)

# Update time
state.time += dt

# Monitor and log
metrics = monitor_state(state)
metrics_history.append(metrics)

Copilot AI Feb 8, 2026

The performance metric recorded in metrics can be stale: run_ai_application() updates soc.performance before faults are injected/corrected, but monitor_state() reads it after those updates. Consider recomputing soc.performance after inject_faults/apply_fault_tolerance (or inside monitor_state) so the reported performance corresponds to the current soc.errors for that timestep.

Contributor Author

Fixed in commit e7b5e45. Removed performance update from run_ai_application() and moved it to after apply_fault_tolerance() in run_simulation(), ensuring performance reflects current error count after fault handling.
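
The ordering issue can be illustrated with a small sketch. `SoCState`, `update_performance`, and `step` are hypothetical names introduced here; only the formula `1.0 / (1.0 + errors)` comes from the original prompt. The sketch compares the value the old ordering would have reported with the value computed after fault handling.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SoCState:
    """Hypothetical stand-in holding only the fields needed here."""
    errors: int = 0
    performance: float = 0.0


def update_performance(soc: SoCState) -> None:
    """Formula from the original prompt: errors reduce performance."""
    soc.performance = 1.0 / (1.0 + soc.errors)


def step(soc: SoCState, injected: int, corrected: int) -> Dict[str, float]:
    """One timestep with fault counts supplied by the caller."""
    # Value the pre-fix ordering would have recorded (computed before injection).
    stale = 1.0 / (1.0 + soc.errors)
    soc.errors += injected   # fault injection
    soc.errors -= corrected  # fault tolerance
    update_performance(soc)  # refreshed after fault handling
    return {"stale_performance": stale, "performance": soc.performance}


metrics = step(SoCState(), injected=10, corrected=8)
```

Here `metrics["stale_performance"]` is 1.0 while `metrics["performance"]` is 1/3, because two uncorrected errors remain at the end of the step.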

Comment on lines +124 to +129
# Simplified: Assume we can correct some errors with ECC in memory and cache.
# We assume a fixed correction rate.
correction_rate = 0.8 # 80% of errors are corrected
corrected = int(soc.errors * correction_rate)
soc.errors -= corrected
return corrected

Copilot AI Feb 8, 2026

correction_rate is hard-coded to 0.8 here, but the PR description calls out a configurable ECC correction rate. Consider reading this from config (or making it a parameter with a documented default) so callers can tune the simulation without editing code.

Contributor Author

Made configurable in commit e7b5e45. Added a correction_rate parameter (default: 0.8) to apply_fault_tolerance(); run_simulation() reads the value from config.
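
A minimal sketch of a configurable correction rate. Two things here are assumptions and not the PR's API: the function takes and returns plain error counts instead of mutating an `SoC`, and the range check is added for illustration.

```python
from typing import Any, Dict, Tuple


def apply_fault_tolerance(errors: int, correction_rate: float = 0.8) -> Tuple[int, int]:
    """Return (remaining_errors, corrected) for an ECC-like fixed correction rate."""
    # Range check is an illustrative addition, not something the PR describes.
    if not 0.0 <= correction_rate <= 1.0:
        raise ValueError("correction_rate must be in [0, 1]")
    corrected = int(errors * correction_rate)  # truncates toward zero, as in the reviewed code
    return errors - corrected, corrected


# The caller reads the rate from config and falls back to the documented default.
config: Dict[str, Any] = {"correction_rate": 0.5}
remaining, corrected = apply_fault_tolerance(10, config.get("correction_rate", 0.8))
```

With this config, 5 of 10 errors are corrected; with an empty config the default of 0.8 would correct 8.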

Comment thread simulation_pipeline_gr801.py Outdated
Comment on lines +155 to +160
    # If errors exceed a threshold, trigger a shutdown.
    error_threshold = 1000
    if state.soc.errors > error_threshold:
        print(f"Safety violation: Too many errors ({state.soc.errors})")
        return True
    return False

Copilot AI Feb 8, 2026

error_threshold is hard-coded to 1000, but the PR description mentions a configurable error threshold for emergency shutdown. Consider making this configurable via config (or a function parameter) so different missions/environments can set appropriate safety limits.

Contributor Author

Made configurable in commit e7b5e45. Added an error_threshold parameter (default: 1000) to safety_violation_detected(); run_simulation() reads the value from config.
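
A sketch of the threshold check with the limit passed in, not hard-coded. Taking a plain error count in place of a `SimulationState`, and logging at WARNING level, are assumptions made for illustration.

```python
import logging
from typing import Any, Dict

LOGGER = logging.getLogger(__name__)


def safety_violation_detected(errors: int, error_threshold: int = 1000) -> bool:
    """Return True when the error count strictly exceeds the configured threshold."""
    if errors > error_threshold:
        # Log level is an assumption; the thread only says LOGGER.info/warning/error are used.
        LOGGER.warning("Safety violation: too many errors (%d > %d)", errors, error_threshold)
        return True
    return False


# A mission-specific limit read from config, with the documented default as fallback.
config: Dict[str, Any] = {"error_threshold": 10}
violation = safety_violation_detected(11, config.get("error_threshold", 1000))
```

The comparison stays strict (`>`), matching the reviewed snippet, so an error count exactly equal to the threshold does not trigger a shutdown.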

Comment on lines +188 to +192
state.faults_injected += faults

# Apply fault tolerance
corrected = apply_fault_tolerance(soc)
state.faults_corrected += corrected

Copilot AI Feb 8, 2026

state.faults_injected / state.faults_corrected are accumulated over the whole run (+= each step), but the PR description says metrics are “per-timestep”. Either rename these metrics to make it explicit they are cumulative (e.g., total_faults_injected), or record per-step values in the metrics history (or both) to match the stated API.

Suggested change
- state.faults_injected += faults
- # Apply fault tolerance
- corrected = apply_fault_tolerance(soc)
- state.faults_corrected += corrected
+ # Store per-timestep faults injected
+ state.faults_injected = faults
+ # Apply fault tolerance
+ corrected = apply_fault_tolerance(soc)
+ # Store per-timestep faults corrected
+ state.faults_corrected = corrected

Contributor Author

Fixed in commit e7b5e45. Renamed to total_faults_injected and total_faults_corrected in metrics dict to clearly indicate cumulative tracking over the simulation run.
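
The review also mentioned recording both kinds of value. A sketch of that option follows; the `step_faults_*` keys and the `build_history` helper are hypothetical, and only the `total_faults_*` names come from the PR.

```python
from typing import Dict, List, Tuple


def build_history(steps: List[Tuple[int, int]]) -> List[Dict[str, int]]:
    """Turn per-step (injected, corrected) counts into a metrics history."""
    total_injected = 0
    total_corrected = 0
    history: List[Dict[str, int]] = []
    for injected, corrected in steps:
        total_injected += injected
        total_corrected += corrected
        history.append({
            # Cumulative counters, named as in the PR's metrics dict.
            "total_faults_injected": total_injected,
            "total_faults_corrected": total_corrected,
            # Per-timestep values (hypothetical keys, not part of the PR).
            "step_faults_injected": injected,
            "step_faults_corrected": corrected,
        })
    return history


history = build_history([(3, 2), (1, 1)])
```

After the second step the cumulative counters read 4 and 3 while the per-step values read 1 and 1, so a consumer can tell a quiet timestep from a quiet run.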

Comment on lines +217 to +223
'num_cores': 4,
'memory_size': 1024*1024,
'accelerator': True,
'particle_flux': 5.0, # High radiation environment
'upset_rate': 1e-4,
'task': 'image_classification',
'input_data': np.random.rand(100, 100),

Copilot AI Feb 8, 2026

'memory_size': 1024*1024 will trigger Ruff E226 (missing whitespace around arithmetic operator). Update to 1024 * 1024 to satisfy the repo lint configuration.

Contributor Author

Fixed in commit e7b5e45. Changed to 1024 * 1024 with proper spacing in main block.

Comment thread tests/test_simulation_pipeline_gr801.py Outdated
Comment on lines +87 to +89
# With high flux and upset rate, we should see some faults
assert total_faults > 0
assert soc.errors == total_faults

Copilot AI Feb 8, 2026

This test is statistically flaky: with particle_flux=10, dt=0.1, upset_rate=0.1, the total faults over 10 iterations is ~Poisson(λ≈1), so there's ~37% chance total_faults is 0 and this assertion fails. Make it deterministic by seeding/mocking the RNG (e.g., patch np.random.poisson/random/choice) or by choosing parameters/assertions that don't rely on chance.
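
The arithmetic behind the 37% figure can be checked directly. The number of upsets is a thinned Poisson process, so the total over all iterations is Poisson with mean flux × area × dt × upset_rate × iterations. The parameter values are those quoted in the comment; `chip_area = 1.0` is taken from the original prompt's `inject_faults`.

```python
import math

particle_flux = 10.0  # particles/cm^2/s, from the review comment
chip_area = 1.0       # cm^2, as in the original prompt's inject_faults
dt = 0.1              # seconds per step, from the review comment
upset_rate = 0.1      # upset probability per particle, from the review comment
iterations = 10       # loop count, from the review comment

# Mean of the total fault count across all iterations.
lam = particle_flux * chip_area * dt * upset_rate * iterations
# Probability that a Poisson(lam) variable equals zero.
p_zero_faults = math.exp(-lam)

print(f"lambda = {lam:.2f}, P(total_faults == 0) = {p_zero_faults:.3f}")
# prints: lambda = 1.00, P(total_faults == 0) = 0.368
```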

Contributor Author

Fixed in commit e7b5e45. Made the test deterministic by:

  • Seeding np.random with np.random.seed(42)
  • Increasing particle_flux to 100.0 and upset_rate to 0.5 for higher fault probability
  • Adding a comment explaining the expected behavior

Test now passes reliably every run.
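
A sketch of what a seeded version of such a test might look like. `count_faults` is a hypothetical count-only stand-in for `inject_faults` (no bit flips), and `dt = 0.1` is assumed to be unchanged from the earlier parameters; the reply does not state it.

```python
import numpy as np


def count_faults(particle_flux: float, upset_rate: float, dt: float,
                 chip_area: float = 1.0) -> int:
    """Count-only stand-in: Poisson particle arrivals, each upsetting with upset_rate."""
    num_particles = np.random.poisson(particle_flux * chip_area * dt)
    return int(np.sum(np.random.random(num_particles) < upset_rate))


def run_ten_steps() -> int:
    """Reseed the global RNG, then total the faults over ten steps."""
    np.random.seed(42)  # same seed as the PR fix
    return sum(count_faults(100.0, 0.5, dt=0.1) for _ in range(10))


def test_fault_injection_is_deterministic() -> None:
    first = run_ten_steps()
    second = run_ten_steps()
    # Expected total is 100 * 1.0 * 0.1 * 0.5 * 10 = 50, so zero faults is vanishingly unlikely.
    assert first > 0
    # Same seed, same sequence of draws, same total.
    assert first == second
```

Seeding makes the outcome repeatable, and the larger parameters make a zero total implausible even without the seed (probability exp(-50) under the assumed `dt`). Passing a local `np.random.default_rng(seed)` generator into the function would avoid mutating global RNG state, at the cost of changing the function signature.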

@danielnovais-tech

Owner

@copilot apply changes based on the comments in this thread

…ements

Co-authored-by: danielnovais-tech <230455320+danielnovais-tech@users.noreply.github.com>

Copilot AI commented Feb 8, 2026

Contributor Author

@copilot apply changes based on the comments in this thread

All changes from the review have been applied in commit e7b5e45. Key updates:

  • Replaced print() with logging.getLogger(__name__)
  • Made correction_rate and error_threshold configurable via config
  • Fixed PEP 8 spacing issues (blank lines, arithmetic operators)
  • Renamed metrics to clarify cumulative tracking (total_faults_injected, total_faults_corrected)
  • Moved performance metric update to after fault handling
  • Fixed flaky test with deterministic seeding
  • Fixed unused variable warning

All tests passing, linting clean.

Co-authored-by: danielnovais-tech <230455320+danielnovais-tech@users.noreply.github.com>
danielnovais-tech merged commit 5c47c5d into main Feb 8, 2026
1 check passed

3 participants