
Make better use of VPM #113

@doe300

Description

Up to 2 VPM writes can be queued in the VPM write FIFO (QPU -> VPM); a write only blocks when the FIFO is full.
-> There is no need to stall/delay between VPM writes, as is currently done.
-> This information could be used to insert non-VPM accesses between pairs of VPM writes (e.g. write vpm; write vpm; something else to prevent the stall; write vpm; ...)
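The FIFO behaviour can be illustrated with a toy queue model (plain Python, not real QPU code; the 2-cycle service time and one-instruction-per-cycle issue rate are made-up parameters for illustration, not figures from the specification):

```python
def run(program, depth=2, service=2):
    """Count stall cycles for a list of 'vpm' (write) / 'alu' instructions.

    Toy model: the VPM completes one queued write every `service` cycles,
    the QPU issues one instruction per cycle, and a VPM write blocks only
    when all `depth` FIFO slots are occupied.
    """
    t = 0        # current cycle
    done = []    # completion times of queued writes, ascending
    last = 0     # completion time of the most recent write
    stalls = 0
    for insn in program:
        done = [c for c in done if c > t]   # drained entries free their slot
        if insn == "vpm":
            if len(done) == depth:          # FIFO full: block until a slot frees
                stalls += done[0] - t
                t = done[0]
                done = done[1:]
            last = max(t, last) + service   # VPM services one write at a time
            done.append(last)
        t += 1                              # each instruction issues in one cycle
    return stalls

print(run(["vpm", "vpm", "vpm", "vpm"]))         # -> 1: back-to-back writes hit a full FIFO
print(run(["vpm", "vpm", "alu", "vpm", "vpm"]))  # -> 0: interleaved work hides the stall
```

With these (assumed) parameters, the fourth back-to-back write finds both FIFO slots occupied, while placing a single unrelated instruction between the pairs of writes removes the stall entirely.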

Up to 2 VPM read setups can be queued in the VPM read FIFO (VPM -> QPU); further writes to the setup register are ignored, and outstanding VPM reads are cancelled when the program finishes.
-> We could queue up to 2 read setups before waiting for the data to become available. Also, for loops, we could issue the read setup for the next iteration in advance; this requires draining the FIFO after the loop ends (to discard the data read for the one-after-last iteration).
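The loop idea can be sketched against a toy model of the read-setup FIFO (`ToyVPM` and `pipelined_sum` are invented names for illustration; the real code would be QPU assembly):

```python
from collections import deque

class ToyVPM:
    """Toy model of the 2-deep read-setup FIFO: extra setup writes are ignored."""
    def __init__(self, memory):
        self.memory = memory
        self.setups = deque()

    def write_setup(self, addr):
        if len(self.setups) < 2:    # further writes to the setup register are dropped
            self.setups.append(addr)

    def read_data(self):
        return self.memory[self.setups.popleft()]

def pipelined_sum(vpm, n):
    """Sum n elements, issuing each read setup one iteration in advance."""
    vpm.write_setup(0)              # prologue: setup for iteration 0
    total = 0
    for i in range(n):
        vpm.write_setup(i + 1)      # setup for the NEXT iteration, in advance
        total += vpm.read_data()    # data for THIS iteration is already queued
    vpm.read_data()                 # epilogue: drain the one-after-last read
    return total

# The extra trailing element is what the drained one-after-last read touches.
print(pipelined_sum(ToyVPM([3, 1, 4, 1, 5, 0]), 5))  # -> 14
```

The FIFO depth is never exceeded here: at most one setup is in flight per consumed element, plus the one issued ahead.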

DMA load/store operations cannot be queued, but a DMA load and a DMA store can run concurrently.
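One thing this concurrency would enable is double-buffering: while block i is being stored back, block i+1 can already be loaded. A sketch of that schedule (the `schedule` helper is hypothetical, purely to show the overlap):

```python
def schedule(n_blocks):
    """Return per-step (dma_load, dma_store) pairs for n blocks."""
    steps = []
    for i in range(n_blocks + 1):
        load = i if i < n_blocks else None   # prefetch the next block
        store = i - 1 if i > 0 else None     # drain the previous block
        steps.append((load, store))
    return steps

print(schedule(3))  # -> [(0, None), (1, 0), (2, 1), (None, 2)]
```

Every middle step overlaps one load with one store, using the separate DMA directions at the same time.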

Does VPM access need to be synchronized between all QPUs?
The specification makes no statement for (or against) this. Is the VPM truly shared (in the sense that locking is required), or is it "shared" but still usable by every QPU at once (like the TMU, which requires no locking)?

https://github.com/nineties/py-videocore uses a mutex to lock VPM access in its parallel examples, https://github.com/mn416/QPULib does not seem to use a mutex, and https://github.com/maazl/vc4asm uses semaphores to lock VPM access.
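To illustrate why the answer matters, here is a toy model of the "locking required" reading (a single shared setup register; `SharedVPM` is invented for illustration): without mutual exclusion, one QPU's setup can be clobbered by another's before the data is read.

```python
class SharedVPM:
    """One shared setup register: the 'locking required' interpretation."""
    def __init__(self, memory):
        self.memory = memory
        self.setup = None

    def write_setup(self, addr):
        self.setup = addr       # last writer wins, whichever QPU it was

    def read_data(self):
        return self.memory[self.setup]

vpm = SharedVPM({0x10: "A's data", 0x20: "B's data"})
vpm.write_setup(0x10)    # QPU A programs its read ...
vpm.write_setup(0x20)    # ... QPU B's setup lands in between ...
print(vpm.read_data())   # -> B's data: A receives the wrong element
```

If instead each QPU has its own setup/FIFO state (the TMU-like reading), no lock is needed and the per-QPU locking in py-videocore and vc4asm is pure overhead.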

Sources:
VideoCore IV Specification, pages 55+
