Merged
29 commits
c0ba9f4
docs: command processor design + implementation proposals
tinebp May 17, 2026
210e112
runtime: introduce async vortex2.h API; legacy vortex.h becomes wrapper
tinebp May 17, 2026
e28cb59
runtime: allow size=0 in mem_access callback (legacy upload path)
tinebp May 17, 2026
b38765c
tests/runtime: add test_async — vortex2 async API conformance
tinebp May 17, 2026
157e7a1
runtime: per-queue worker thread + FIFO; fixes enqueue-gating deadlock
tinebp May 17, 2026
a1ab5d3
hw/cp: VX_cp_arbiter + verilator unit test
tinebp May 17, 2026
f16da81
hw/cp: VX_cp_engine FSM + bid interfaces + verilator unit test
tinebp May 17, 2026
6eb48a0
hw/cp: VX_cp_launch FSM + verilator unit test
tinebp May 17, 2026
7ee01f1
hw/cp: VX_cp_dcr_proxy FSM + verilator unit test
tinebp May 17, 2026
b7f0303
hw/cp: VX_cp_unpack + TB; XRT integration plan
tinebp May 17, 2026
535e060
hw/cp: AXI interfaces + regfile + fetch/completion/xbar bundle (commi…
tinebp May 17, 2026
d752346
hw/cp: VX_cp_dma + full VX_cp_core integration + cp_core end-to-end TB
tinebp May 17, 2026
1224788
docs/cp: update integration plan — RTL substantially done, all 4 back…
tinebp May 17, 2026
04971a2
tests/regression: rewrite vecadd + sgemm from scratch on vortex2.h
tinebp May 17, 2026
893c69c
runtime: push KMU descriptor + kernel-load helpers into vortex2.h
tinebp May 17, 2026
15440a5
xrt: integrate VX_cp_core end-to-end with VORTEX_USE_CP runtime path
tinebp May 17, 2026
8b4fdc8
opae: integrate VX_cp_core end-to-end with VORTEX_USE_CP runtime path
tinebp May 17, 2026
196c4e5
hw/cp: engine retires on resource done, not on arbiter grant
tinebp May 18, 2026
00aa42f
docs: pure-v2 callbacks_t + software CP for simx/rtlsim
tinebp May 18, 2026
16aa1ca
sim/common: software CommandProcessor C++ class + unit test
tinebp May 18, 2026
8bc2564
runtime: add cp_mmio_write/read callbacks; wire all 4 backends
tinebp May 18, 2026
94888e6
runtime: dispatcher owns CP ring submission; Queue routes through it …
tinebp May 18, 2026
a43822c
hw/cp: VX_cp_dcr_proxy latches addr+data on grant (was sampling zeros…
tinebp May 18, 2026
086d26b
runtime: strip legacy launch_*/dcr_* from callbacks_t (Phase E — pure…
tinebp May 18, 2026
e9fe17e
cp: release-style comments and consolidated design doc
tinebp May 18, 2026
1ce7231
hw/cp: add VX_cp_event_unit and VX_cp_profiling skeletons
tinebp May 18, 2026
74efe10
Merge tinebp-patch-2 (CI build fixes) into feature_cp
tinebp May 18, 2026
a618750
docs: proposal — VX_config.toml macro namespace cleanup (VX_CFG_ prefix)
tinebp May 18, 2026
49ae889
Merge tinebp-patch-2 CI fixes into feature_cp
tinebp May 18, 2026
747 changes: 747 additions & 0 deletions docs/designs/command_processor_design.md

Large diffs are not rendered by default.

1,607 changes: 1,607 additions & 0 deletions docs/proposals/command_processor_proposal.md

Large diffs are not rendered by default.

460 changes: 460 additions & 0 deletions docs/proposals/config_macro_namespace_proposal.md

Large diffs are not rendered by default.

317 changes: 317 additions & 0 deletions docs/proposals/cp_opae_integration_plan.md

Large diffs are not rendered by default.

375 changes: 375 additions & 0 deletions docs/proposals/cp_pure_v2_callbacks_proposal.md

Large diffs are not rendered by default.

951 changes: 951 additions & 0 deletions docs/proposals/cp_rtl_impl_proposal.md

Large diffs are not rendered by default.

1,059 changes: 1,059 additions & 0 deletions docs/proposals/cp_runtime_impl_proposal.md

Large diffs are not rendered by default.

475 changes: 475 additions & 0 deletions docs/proposals/cp_xrt_integration_plan.md

Large diffs are not rendered by default.

238 changes: 214 additions & 24 deletions hw/rtl/afu/opae/vortex_afu.sv

Large diffs are not rendered by default.

428 changes: 372 additions & 56 deletions hw/rtl/afu/xrt/VX_afu_wrap.sv

Large diffs are not rendered by default.

116 changes: 116 additions & 0 deletions hw/rtl/cp/VX_cp_arbiter.sv
@@ -0,0 +1,116 @@
// Copyright © 2019-2023
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

`include "VX_define.vh"

// ============================================================================
// VX_cp_arbiter — generic round-robin arbiter over N bidders.
//
// Instantiated 3x in VX_cp_core (one per shared resource: KMU, DMA, DCR).
// Each cycle, grants at most one bidder whose `valid` is asserted,
// rotating the starting position after every grant for fairness. A grant
// lasts a single cycle; the granted CPE is expected to hold its bid until
// the resource completes (the per-resource consumer module signals
// completion through a separate path; this arbiter does not track
// in-flight requests).
//
// `bid_priority` is accepted but currently unused: arbitration is plain
// round-robin with no preemption. The port is reserved for a future
// eligibility pre-filter that would visit high-priority bidders first.
// This keeps the implementation small; the only fairness guarantee is
// the one plain round-robin provides.
// ============================================================================

module VX_cp_arbiter
    import VX_cp_pkg::*;
#(
    parameter int N = 1
)(
    input  wire       clk,
    input  wire       reset,

    input  wire       bid_valid    [N],
    input  wire [1:0] bid_priority [N],
    output logic      bid_grant    [N]
);

    // Rotating pointer to the bidder that gets first look this cycle.
    // For N=1, $clog2(N)=0, so PTR_W collapses to 1 (we still need at least
    // one bit to hold the value 0).
    localparam int PTR_W = (N > 1) ? $clog2(N) : 1;
    // SUM_W is one bit wider than PTR_W so (rr_ptr + N - 1) fits without
    // wrap, even when N is a power of 2 (PTR_W'(N) would truncate to 0
    // and break the modulo).
    localparam int SUM_W = PTR_W + 1;

    logic [PTR_W-1:0] rr_ptr;
    logic [PTR_W-1:0] winner;
    logic             any_grant;

    always_comb begin
        winner    = '0;
        any_grant = 1'b0;
        bid_grant = '{default: 1'b0};

        if (N == 1) begin
            if (bid_valid[0]) begin
                bid_grant[0] = 1'b1;
                winner       = '0;
                any_grant    = 1'b1;
            end
        end else begin
            // One-pass scan: starting at rr_ptr, find the first valid bidder.
            // Sum in SUM_W bits then conditionally subtract N (faster than
            // synthesizing a divider and dodges the PTR_W'(N)==0 hazard).
            for (int unsigned i = 0; i < N; ++i) begin
                logic [SUM_W-1:0] sum;
                logic [PTR_W-1:0] idx;
                sum = SUM_W'({1'b0, rr_ptr}) + SUM_W'(i);
                idx = (sum >= SUM_W'(N)) ? PTR_W'(sum - SUM_W'(N))
                                         : PTR_W'(sum);
                if (!any_grant && bid_valid[idx]) begin
                    bid_grant[idx] = 1'b1;
                    winner         = idx;
                    any_grant      = 1'b1;
                end
            end
        end

    end

    // Plain round-robin; priority is reserved for a future eligibility
    // pre-filter pass. Suppress unused-bit warnings per-element so the macro
    // sees a packed logic instead of the unpacked array.
    generate
        for (genvar gi = 0; gi < N; ++gi) begin : g_unused_prio
            `UNUSED_VAR (bid_priority[gi])
        end
    endgenerate

    // Advance the round-robin pointer one past the winner so the next
    // cycle starts the scan after the bidder we just served. Same
    // wrap-by-subtract trick as the scan above.
    always_ff @(posedge clk) begin
        if (reset) begin
            rr_ptr <= '0;
        end else if (any_grant) begin
            if (N == 1) begin
                rr_ptr <= '0;
            end else begin
                logic [SUM_W-1:0] nxt;
                nxt = SUM_W'({1'b0, winner}) + SUM_W'(1);
                rr_ptr <= (nxt >= SUM_W'(N)) ? PTR_W'(nxt - SUM_W'(N))
                                             : PTR_W'(nxt);
            end
        end
    end

endmodule : VX_cp_arbiter
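
Review note: the scan-and-rotate behavior of VX_cp_arbiter can be cross-checked against a small software reference model. The sketch below is an assumption-laden illustration, not code from this PR: the class name `RoundRobinModel` and its methods are hypothetical, and it models only the `bid_valid` to `bid_grant` decision and the pointer update (priority is ignored, as in the RTL). It uses the same wrap-by-subtract step instead of a modulo.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference model of a plain round-robin arbiter.
// Not part of the PR; names are illustrative only.
class RoundRobinModel {
public:
    explicit RoundRobinModel(std::size_t n) : n_(n), rr_ptr_(0) {
        assert(n > 0);  // at least one bidder, mirroring N >= 1
    }

    // Evaluate one clock cycle. Returns the granted bidder index, or -1
    // when no bidder is valid. The pointer moves only when a grant occurs.
    int step(const std::vector<bool>& bid_valid) {
        assert(bid_valid.size() == n_);
        for (std::size_t i = 0; i < n_; ++i) {
            // rr_ptr_ < n_ and i < n_, so sum < 2 * n_: a single
            // conditional subtract is enough to wrap.
            std::size_t sum = rr_ptr_ + i;
            std::size_t idx = (sum >= n_) ? sum - n_ : sum;
            if (bid_valid[idx]) {
                // Next scan starts one past the winner.
                std::size_t nxt = idx + 1;
                rr_ptr_ = (nxt >= n_) ? nxt - n_ : nxt;
                return static_cast<int>(idx);
            }
        }
        return -1;
    }

    // Current scan start position (the model's view of rr_ptr).
    std::size_t pointer() const { return rr_ptr_; }

private:
    std::size_t n_;
    std::size_t rr_ptr_;
};
```

A Verilator test bench could, under these assumptions, drive the same `bid_valid` vector into both the DUT and `step()` each cycle and compare the granted index.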
110 changes: 110 additions & 0 deletions hw/rtl/cp/VX_cp_axi_m_if.sv
@@ -0,0 +1,110 @@
// Copyright © 2019-2023
// Licensed under the Apache License, Version 2.0.

`ifndef VX_CP_AXI_M_IF_SV
`define VX_CP_AXI_M_IF_SV

`include "VX_define.vh"

// ============================================================================
// VX_cp_axi_m_if.sv — AXI4 master interface bundle used inside rtl/cp/.
//
// Every CP module that needs to issue host-AXI transactions (VX_cp_fetch,
// VX_cp_dma, VX_cp_completion, VX_cp_event_unit, VX_cp_profiling) talks
// through one instance of this interface. VX_cp_axi_xbar fans them into
// the single upstream master that VX_cp_core exposes on its `axi_m` port.
//
// The bundle deliberately omits the optional AW/AR sideband signals
// (LOCK / CACHE / PROT / QOS / REGION); they are tied off at the
// cp_core boundary to whatever value the upstream shell expects
// (typically all zero, write-allocate cache attributes).
// ============================================================================

interface VX_cp_axi_m_if
    // Imported in the header so package constants are visible to the
    // parameter defaults below.
    import VX_cp_pkg::*;
#(
    parameter int ADDR_W = 64,
    parameter int DATA_W = 512,
    parameter int ID_W   = VX_CP_AXI_TID_WIDTH_C
);

    // ---- Write request address channel (AW) ----
    logic                awvalid;
    logic                awready;
    logic [ADDR_W-1:0]   awaddr;
    logic [ID_W-1:0]     awid;
    logic [7:0]          awlen;   // number of transfers - 1
    logic [2:0]          awsize;  // log2 bytes per transfer
    logic [1:0]          awburst; // 2'b01 = INCR

    // ---- Write data channel (W) ----
    logic                wvalid;
    logic                wready;
    logic [DATA_W-1:0]   wdata;
    logic [DATA_W/8-1:0] wstrb;
    logic                wlast;

    // ---- Write response channel (B) ----
    logic                bvalid;
    logic                bready;
    logic [ID_W-1:0]     bid;
    logic [1:0]          bresp;   // 2'b00 = OKAY

    // ---- Read request address channel (AR) ----
    logic                arvalid;
    logic                arready;
    logic [ADDR_W-1:0]   araddr;
    logic [ID_W-1:0]     arid;
    logic [7:0]          arlen;
    logic [2:0]          arsize;
    logic [1:0]          arburst;

    // ---- Read response channel (R) ----
    logic                rvalid;
    logic                rready;
    logic [DATA_W-1:0]   rdata;
    logic [ID_W-1:0]     rid;
    logic                rlast;
    logic [1:0]          rresp;

    // ---- Modports ----
    modport master (
        // AW
        output awvalid, awaddr, awid, awlen, awsize, awburst,
        input  awready,
        // W
        output wvalid, wdata, wstrb, wlast,
        input  wready,
        // B
        input  bvalid, bid, bresp,
        output bready,
        // AR
        output arvalid, araddr, arid, arlen, arsize, arburst,
        input  arready,
        // R
        input  rvalid, rdata, rid, rlast, rresp,
        output rready
    );

    modport slave (
        // AW
        input  awvalid, awaddr, awid, awlen, awsize, awburst,
        output awready,
        // W
        input  wvalid, wdata, wstrb, wlast,
        output wready,
        // B
        output bvalid, bid, bresp,
        input  bready,
        // AR
        input  arvalid, araddr, arid, arlen, arsize, arburst,
        output arready,
        // R
        output rvalid, rdata, rid, rlast, rresp,
        input  rready
    );

endinterface : VX_cp_axi_m_if

`endif // VX_CP_AXI_M_IF_SV
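
Review note: the `awlen` / `awsize` / `awburst` encodings noted in the interface comments (transfers minus one, log2 of bytes per transfer, 2'b01 for INCR) are easy to get off by one. The sketch below shows how a host-side test helper might compute them; `AxiBurst` and `make_incr_burst` are hypothetical names introduced here for illustration and do not appear in the PR. It assumes AXI4 limits of 1 to 256 beats per INCR burst and a power-of-two beat size of at most 128 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three burst fields of an AW/AR request.
struct AxiBurst {
    std::uint8_t len;    // AxLEN:   number of transfers - 1
    std::uint8_t size;   // AxSIZE:  log2(bytes per transfer)
    std::uint8_t burst;  // AxBURST: 0b01 = INCR
};

// Hypothetical helper: encode an INCR burst of `beats` full-width transfers
// on a data bus that is `data_w_bits` wide (e.g. 512 for DATA_W = 512).
inline AxiBurst make_incr_burst(std::uint32_t beats, std::uint32_t data_w_bits) {
    assert(beats >= 1 && beats <= 256);  // AXI4 INCR burst length range
    const std::uint32_t bytes = data_w_bits / 8;
    // Beat size must be a power of two between 1 and 128 bytes.
    assert(bytes >= 1 && bytes <= 128 && (bytes & (bytes - 1)) == 0);

    std::uint8_t size = 0;
    while ((1u << size) < bytes) {
        ++size;  // integer log2 of the beat size
    }
    return AxiBurst{static_cast<std::uint8_t>(beats - 1), size, 0x1};
}
```

For the default `DATA_W = 512`, a single-beat request would encode as `len = 0`, `size = 6` under these assumptions.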