Add GR801 radiation-hardened SoC simulation framework scaffold #63

Merged: danielnovais-tech merged 1 commit into main from copilot/gr801-simulation-framework on Feb 8, 2026

@danielnovais-tech

Owner

Adds a new module gr801_simulation_framework.py implementing a lightweight GR801 SoC simulation scaffold (cores, AI accelerator, sparse paged memory controller, radiation environment models, and mission simulator).

Key points:

  • Avoids multi-GB allocations by modeling DDR address space with sparse paged storage.
  • Avoids per-cycle Python loops by batching CPU cycles and scrub cycles.
  • Adds basic pytest coverage to ensure the module stays safe to import/use.

Tests:

  • ruff check .
  • pytest -q tests

Copilot AI review requested due to automatic review settings February 8, 2026 02:09
@danielnovais-tech danielnovais-tech merged commit 45b17d2 into main Feb 8, 2026
8 checks passed

Copilot AI left a comment

Contributor

Pull request overview

Adds a new standalone Python module that scaffolds a lightweight GR801 radiation-hardened SoC simulation (cores, AI accelerator, sparse paged memory, radiation environment models, and a mission simulator) along with minimal pytest coverage to ensure safe import and basic functionality.

Changes:

  • Introduces gr801_simulation_framework.py implementing GR801 system components and a mission simulation driver.
  • Implements sparse paged memory to avoid multi-GB DDR allocations.
  • Adds tests/test_gr801_simulation_framework.py with smoke tests for sparse memory behavior and batched inference I/O.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| gr801_simulation_framework.py | New GR801 simulation scaffold (cores/accelerator/memory/radiation/mission simulator) intended to be lightweight and import-safe. |
| tests/test_gr801_simulation_framework.py | Adds basic pytest coverage for sparse memory allocation behavior and batched inference result shape/structure. |

Comment on lines +589 to +591
    if int(self.uptime * 100) % 10 == 0:
        self._log_system_state()

Copilot AI Feb 8, 2026

The logging condition if int(self.uptime * 100) % 10 == 0 will be true for many consecutive timesteps at the beginning due to integer truncation (e.g., for dt=1e-3, int(uptime*100) stays 0 for the first ~10ms), causing log spam and unnecessary overhead. Track the last log time (or the next scheduled log time) instead of using a modulus on a truncated value.
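
One way to track the next scheduled log time is sketched below. `LogScheduler`, `due`, and `count_logs` are hypothetical names for illustration and are not part of the PR. The 0.1 s interval is an assumption based on the apparent intent of the original modulus check.

```python
class LogScheduler:
    """Decides when a periodic log is due, based on simulated time."""

    def __init__(self, interval: float = 0.1) -> None:
        self.interval = interval      # seconds of simulated time between logs
        self.next_log_time = 0.0      # the first log is due immediately

    def due(self, uptime: float) -> bool:
        if uptime < self.next_log_time:
            return False
        # Advance past `uptime` so a large dt cannot trigger a burst of logs.
        while self.next_log_time <= uptime:
            self.next_log_time += self.interval
        return True


def count_logs(steps: int = 1000, dt: float = 1e-3) -> tuple[int, int]:
    """Count firings of the original modulus check vs. the scheduler."""
    scheduler = LogScheduler(interval=0.1)
    old = new = 0
    for step in range(1, steps + 1):
        uptime = step * dt
        if int(uptime * 100) % 10 == 0:   # original condition from the PR
            old += 1
        if scheduler.due(uptime):
            new += 1
    return old, new
```

With dt=1e-3 over one simulated second, the original condition fires roughly ten times per 100 ms window, while the scheduler fires about once per window.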

Comment on lines +794 to +806
    phase_thresholds: dict[str, float] = {
        "LAUNCH": 300,
        "ORBIT_INSERTION": 1800,
        "SCIENCE_OPS": 3600,
        "COMMUNICATIONS": 7200,
        "DEORBIT": self.mission_duration - 1800,
    }

    for phase, threshold in phase_thresholds.items():
        if self.simulation_time <= threshold:
            self.mission_phase = phase
            break

Copilot AI Feb 8, 2026

_update_mission_phase() compares simulation_time <= threshold against a set of thresholds that include DEORBIT: mission_duration - 1800. With the current logic and insertion order, once simulation_time exceeds 7200s it will fall through to DEORBIT for the majority of the mission (until mission_duration - 1800), which is likely not intended. Consider representing thresholds as phase start times and selecting the latest start time <= current time, or otherwise adjusting the comparisons so DEORBIT only activates near the end.

Suggested change
    - phase_thresholds: dict[str, float] = {
    -     "LAUNCH": 300,
    -     "ORBIT_INSERTION": 1800,
    -     "SCIENCE_OPS": 3600,
    -     "COMMUNICATIONS": 7200,
    -     "DEORBIT": self.mission_duration - 1800,
    - }
    - for phase, threshold in phase_thresholds.items():
    -     if self.simulation_time <= threshold:
    -         self.mission_phase = phase
    -         break
    + # Interpret these values as phase *start times* in seconds.
    + phase_starts: dict[str, float] = {
    +     "LAUNCH": 0.0,
    +     "ORBIT_INSERTION": 300.0,
    +     "SCIENCE_OPS": 1800.0,
    +     "COMMUNICATIONS": 3600.0,
    +     # Start DEORBIT 1800 seconds before mission end (or at t=0 for very short missions).
    +     "DEORBIT": max(0.0, float(self.mission_duration) - 1800.0),
    + }
    + # Select the latest phase whose start time is <= current simulation time.
    + current_phase = self.mission_phase
    + for phase, start_time in sorted(phase_starts.items(), key=lambda item: item[1]):
    +     if self.simulation_time >= start_time:
    +         current_phase = phase
    +     else:
    +         break
    + self.mission_phase = current_phase
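
A standalone variant of the start-time idea, using `bisect` instead of a sorted loop, might look like the sketch below. `select_phase` is a hypothetical helper, not code from the PR. Checking DEORBIT first is an assumption made here so that, on short missions where the deorbit window overlaps earlier phases, a later phase cannot replace DEORBIT once it has begun.

```python
from bisect import bisect_right

# Fixed phase start times in seconds, taken from the suggested change.
_PHASE_STARTS = [0.0, 300.0, 1800.0, 3600.0]
_PHASE_NAMES = ["LAUNCH", "ORBIT_INSERTION", "SCIENCE_OPS", "COMMUNICATIONS"]


def select_phase(simulation_time: float, mission_duration: float) -> str:
    """Return the mission phase active at `simulation_time` (hypothetical helper)."""
    deorbit_start = max(0.0, float(mission_duration) - 1800.0)
    # Assumption: DEORBIT takes priority as soon as its start time is reached.
    if simulation_time >= deorbit_start:
        return "DEORBIT"
    # Index of the latest start time that is <= simulation_time.
    idx = bisect_right(_PHASE_STARTS, simulation_time) - 1
    return _PHASE_NAMES[max(idx, 0)]
```

For a 24 h mission this keeps the simulator in COMMUNICATIONS from t=3600 s until the last 1800 s, when DEORBIT begins.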

Comment on lines +20 to +21
    logger = logging.getLogger("GR801_Simulator")

Copilot AI Feb 8, 2026

Repository code consistently uses logging.getLogger(__name__) to keep logger names hierarchical (e.g. src/utils/analysis_utils.py:31). Using a fixed string here ("GR801_Simulator") breaks that convention and can make log configuration/filtering harder. Prefer logging.getLogger(__name__) (or a module-level LOGGER) unless there’s a specific need for a custom name.
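
The effect of hierarchical names can be illustrated with the standard `logging` module. The logger names "sim" and "sim.gr801" below are hypothetical and stand in for whatever dotted path `__name__` resolves to in the repository.

```python
import logging

# In a module, __name__ is the dotted import path, so the logger slots into
# the package hierarchy automatically.
logger = logging.getLogger(__name__)

# Hypothetical names to show the hierarchy at work.
parent = logging.getLogger("sim")
child = logging.getLogger("sim.gr801")
fixed = logging.getLogger("GR801_Simulator")

# Configuring the parent also affects the child, which has no level of its own.
parent.setLevel(logging.WARNING)
inherited_level = child.getEffectiveLevel()

# The fixed-name logger sits directly under the root logger, so package-level
# configuration on "sim" never reaches it.
fixed_parent_is_root = fixed.parent is logging.getLogger()
```

Here `inherited_level` equals `logging.WARNING`, while the fixed-name logger can only be configured individually or through the root logger.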

Comment on lines +896 to +923
    def test_gr801_radiation_tolerance() -> dict[str, Any]:
        """Test GR801 radiation tolerance in different environments (quick toy run)."""

        environments = [RadiationEnvironment.LEO, RadiationEnvironment.GEO, RadiationEnvironment.JUPITER]

        results: dict[str, Any] = {}
        for env in environments:
            logger.info("Testing GR801 in %s environment", env.value)
            system = GR801System(environment=env)

            # 10 seconds simulated time in 100 ms steps.
            for _ in range(100):
                system.execute_timestep(dt=0.1)

            model = NeuralNetworkModel.get_preset_model(AIWorkload.IMAGE_CLASSIFICATION)
            input_data = np.random.standard_normal((1, *model.input_shape)).astype(np.float32)

            result = system.run_inference(AIWorkload.IMAGE_CLASSIFICATION, input_data)

            results[env.value] = {
                "final_tid": system.total_ionizing_dose,
                "seu_count": len(system.seu_log),
                "inference_accuracy": result["estimated_accuracy"],
                "system_operational": system._system_operational(),
            }

        return results

Copilot AI Feb 8, 2026

The function name test_gr801_radiation_tolerance() lives in a non-test module and is invoked from __main__. Prefixing with test_ makes it look like a pytest test (and may confuse readers/tools) even though pytest.ini only collects test_*.py under tests/. Consider renaming to something like run_gr801_radiation_tolerance() / simulate_radiation_tolerance().

Comment on lines +766 to +783
    def run_mission(self):
        logger.info("Starting space mission simulation for %.0f seconds", self.mission_duration)

        dt = 0.1
        timesteps = int(max(1, self.mission_duration / dt))

        for step in range(timesteps):
            self.simulation_time += dt
            self._update_mission_phase()

            try:
                self.gr801_system.execute_timestep(dt)
                self._execute_mission_workload()
            except Exception as exc:  # noqa: BLE001
                logger.error("Mission failure at t=%.1fs: %s", self.simulation_time, exc)
                self.mission_success = False
                break

Copilot AI Feb 8, 2026

SpaceMissionSimulator.run_mission() runs an inference on every timestep (dt=0.1), which is ~864,000 inferences for a 24h mission. Given the accelerator path allocates a new random projection matrix each inference, this example can become extremely slow and memory-churn heavy. Consider throttling workload execution (e.g., run inference every N timesteps / per phase-specific cadence) and/or caching/reusing projection weights to keep the example practical.
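
Both mitigations can be sketched together in a few lines. Everything here is hypothetical: `projection_weights`, `run_mission_throttled`, and the 10 s inference period are illustrative choices, and the standard-library `random` module stands in for the NumPy projection matrix used in the PR.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=8)
def projection_weights(in_dim: int, out_dim: int, seed: int = 0) -> tuple[tuple[float, ...], ...]:
    """Build a random projection once per shape and reuse it afterwards."""
    rng = random.Random(seed)
    return tuple(tuple(rng.gauss(0.0, 1.0) for _ in range(out_dim)) for _ in range(in_dim))


def run_mission_throttled(duration_s: float, dt: float = 0.1, inference_period_s: float = 10.0) -> int:
    """Step the mission at `dt` but run inference only every `inference_period_s`."""
    steps = max(1, round(duration_s / dt))
    stride = max(1, round(inference_period_s / dt))   # timesteps between inferences
    inferences = 0
    for step in range(steps):
        # ... per-timestep system update would go here ...
        if step % stride == 0:
            weights = projection_weights(4, 2)        # cached after the first call
            inferences += 1
    return inferences
```

With these assumed defaults, a one-hour mission runs 360 inferences instead of 36,000, and the projection is generated only once.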

Comment on lines +402 to +418
    def read(self, address: int, size: int) -> np.ndarray:
        """Read memory with ECC checking (simplified)."""

        data = self._mem.read(address, size)

        # Simulate occasional ECC detection/correction. We don't store ECC bits;
        # we simulate the outcomes and counters.
        for i in range(0, int(size), 8):
            chunk_addr = int(address) + i
            if self._check_ecc(chunk_addr):
                if self._correct_ecc_error(chunk_addr):
                    self.errors_corrected += 1
                else:
                    self.errors_detected += 1
                    self._log_memory_error(chunk_addr, "UNCORRECTABLE")

        return data

Copilot AI Feb 8, 2026

MemoryController.read() performs an 8-byte chunk loop in Python (for i in range(0, size, 8)), which will scale linearly and become a bottleneck for large reads. If large reads are expected, consider modeling ECC outcomes in a more vectorized/batched way (e.g., sample the number of detected/corrected events for the read and update counters accordingly) rather than iterating per chunk.
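
One batched alternative is sketched below, assuming each 8-byte chunk has an independent ECC event with probability `p_error`. Instead of visiting every chunk, it samples the gaps between affected chunks from a geometric distribution, so the work scales with the number of events rather than the read size. This is a swapped-in technique, not what the PR does. The function names and the probabilities `p_error` and `p_correctable` are hypothetical.

```python
import math
import random


def sample_error_chunks(n_chunks: int, p_error: float, rng: random.Random) -> list[int]:
    """Return sorted indices of chunks with an ECC event, skipping clean chunks."""
    if n_chunks <= 0 or p_error <= 0.0:
        return []
    if p_error >= 1.0:
        return list(range(n_chunks))
    log_q = math.log1p(-p_error)          # log(1 - p), always negative here
    hits: list[int] = []
    i = -1
    while True:
        u = 1.0 - rng.random()            # uniform in (0, 1], avoids log(0)
        i += 1 + int(math.log(u) / log_q) # geometric gap to the next event
        if i >= n_chunks:
            return hits
        hits.append(i)


def simulate_ecc_for_read(address: int, size: int, rng: random.Random,
                          p_error: float = 1e-6, p_correctable: float = 0.99,
                          chunk_bytes: int = 8) -> tuple[int, list[int]]:
    """Return (corrected_count, addresses_of_uncorrectable_chunks) for one read."""
    n_chunks = -(-int(size) // chunk_bytes)   # ceiling division
    corrected = 0
    uncorrectable: list[int] = []
    for idx in sample_error_chunks(n_chunks, p_error, rng):
        if rng.random() < p_correctable:
            corrected += 1
        else:
            uncorrectable.append(int(address) + idx * chunk_bytes)
    return corrected, uncorrectable
```

Because the affected chunk indices are still available, uncorrectable errors can be logged per address as in the original loop.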

Comment on lines +608 to +628
    start_time = self.uptime

    if self.ai_accelerator is not None:
        result = self.ai_accelerator.execute_inference(model.operations, input_data)
        self.total_power += GR801Config.NOMINAL_POWER * 0.5
    else:
        result = self._cpu_inference(input_data)

    execution_time = self.uptime - start_time
    accuracy = self._calculate_radiation_accuracy()

    self.inference_count += 1
    self.total_operations += int(model.operations)

    return {
        "result": result,
        "execution_time": execution_time,
        "estimated_accuracy": accuracy,
        "power_consumed": self.total_power * max(0.0, execution_time),
        "radiation_effects": len([e for e in self.seu_log if not e.corrected]),
    }

Copilot AI Feb 8, 2026

run_inference() computes execution_time as self.uptime - start_time, but self.uptime is never advanced during inference, so execution_time (and thus power_consumed) will always be 0. This makes the returned telemetry misleading. Consider either advancing simulated time by the accelerator/CPU inference time, or computing execution time from a dedicated timing model/clock separate from uptime.
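
A minimal timing model along these lines could derive the execution time from the operation count and throughput, then advance simulated time by that amount. `InferenceTiming` and its fields are hypothetical, not the PR's API. The result key `energy_consumed_j` is also an assumption, chosen because power multiplied by time is an energy.

```python
from dataclasses import dataclass


@dataclass
class InferenceTiming:
    """Hypothetical timing model: time = operations / throughput."""

    throughput_ops_per_s: float
    active_power_w: float
    uptime: float = 0.0   # simulated seconds

    def run(self, model_ops: int) -> dict[str, float]:
        execution_time = float(model_ops) / self.throughput_ops_per_s
        # Advance simulated time so later telemetry sees the elapsed interval.
        self.uptime += execution_time
        return {
            "execution_time": execution_time,
            "energy_consumed_j": self.active_power_w * execution_time,
        }
```

For example, 2e9 operations at an assumed 1e12 ops/s yields a 2 ms execution time and a non-zero energy figure, rather than the constant zero described above.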

Comment on lines +257 to +275
    def execute_inference(self, model_ops: int, input_data: np.ndarray) -> np.ndarray:
        """Execute neural network inference."""

        if not self.active:
            raise RuntimeError("AI Accelerator not active")

        model_ops_i = int(model_ops)
        _execution_time = float(model_ops_i) / self.throughput

        self.power_consumption = GR801Config.NOMINAL_POWER * 1.5

        result = self._simulate_matrix_operations(input_data)

        self.operations_completed += model_ops_i
        self.mac_utilization = float(self._rng.uniform(0.7, 0.95))

        self._update_temperature()
        return result

Copilot AI Feb 8, 2026

AIAccelerator.execute_inference() calculates _execution_time but never uses it (and callers also don’t consume it). If the simulator intends to report or advance simulated time for inference, this value should be returned/propagated or used to update system state; otherwise it should be removed to avoid implying timing is modeled when it isn’t.

    if model is None:
        raise RuntimeError("No model selected")

    results = [np.random.standard_normal(model.output_shape).astype(np.float32) for _core in self.cores]

Copilot AI Feb 8, 2026

For loop variable '_core' is not used in the loop body.

Suggested change
    - results = [np.random.standard_normal(model.output_shape).astype(np.float32) for _core in self.cores]
    + results = [np.random.standard_normal(model.output_shape).astype(np.float32) for _ in range(len(self.cores))]
