Port bpf changes from PR #58. #60

tmiw · 2025-12-01T21:25:30Z

Per previous PLT discussion, this PR ports only the BPF changes from PR #58 as this is shared with RADEV2. Please see original PR for rationale and benchmarks.

drowe67 · 2025-12-02T23:26:08Z

radae/dsp.py

+
+      # Reallocate x_mem if x_baseband size changes
+      if len(self.x_mem) != (len(self.mem) + len(x_baseband)):
+          self.x_mem = np.zeros(len(self.mem) + len(x_baseband), dtype=np.csingle)


I like the idea of pre-allocating x_filt; however, let's do it once at init time, based on the maximum buffer length passed into the init function. Have an assert at run time to make sure we don't get passed a bigger n than we have room for.

Updated to allocate only in the constructor.

L75 performs a run time allocation of self.x_mem? The assert should check the length of x_mem.

The idea is no run time allocation of memory, sorry if I didn't make that clear.
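A minimal sketch of the pattern requested above: allocate once in the constructor, then assert on buffer length at run time. The class name `PreallocBuffer`, the method `load`, and the parameters `ntap` and `max_len` are hypothetical and are not taken from radae/dsp.py; the state length of `ntap - 1` is also an assumption.

```python
import numpy as np

class PreallocBuffer:
    """Hypothetical stand-in for the filter class; names are illustrative only."""

    def __init__(self, ntap, max_len):
        # All buffers are allocated once here, sized for the largest block.
        self.mem = np.zeros(ntap - 1, dtype=np.csingle)
        self.x_mem = np.zeros(ntap - 1 + max_len, dtype=np.csingle)
        self.x_filt = np.zeros(max_len, dtype=np.csingle)

    def load(self, x_baseband):
        n = len(x_baseband)
        n_mem = len(self.mem)
        # Run-time guard: check against the length of x_mem, never reallocate.
        assert n_mem + n <= len(self.x_mem), "block larger than max_len"
        # Copy into the existing buffer instead of creating a new array.
        self.x_mem[:n_mem] = self.mem
        self.x_mem[n_mem:n_mem + n] = x_baseband
        return self.x_mem[:n_mem + n]   # a view onto x_mem, not a copy
```

With this shape, passing a block longer than `max_len` fails loudly at the assert rather than silently triggering an allocation.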

drowe67 · 2025-12-02T23:28:40Z

radae/dsp.py

+      # The advantage of operating on the strided array is that we make only one transition between Python
+      # and the NumPy C code, reducing overhead.
+      self.x_filt[0:n] = np.dot(x_mem_slided, self.h)
+      self.mem = self.x_mem[-self.Ntap-1:]                                  # save filter state for next time


OK, well explained 👍
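For readers outside the diff, here is a sketch of the strided approach the code comment describes. It assumes the windows are built with `numpy.lib.stride_tricks.sliding_window_view`, which may differ from how `x_mem_slided` is constructed in the PR; the function name `fir_strided` is hypothetical, and `len(h)` is assumed to be at least 2.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def fir_strided(x, h, mem):
    """Filter block x with taps h; mem holds the previous len(h)-1 inputs."""
    x_mem = np.concatenate([mem, x])
    # Row i is the window x_mem[i : i + len(h)]; the view copies no samples.
    windows = sliding_window_view(x_mem, len(h))
    # A single matrix-vector product replaces a per-sample Python loop.
    # The taps are reversed so the result is a convolution, not a correlation.
    y = np.dot(windows, h[::-1])
    # Save the last len(h)-1 inputs as filter state for the next block.
    new_mem = x_mem[-(len(h) - 1):].copy()
    return y, new_mem
```

Filtering a signal in two consecutive blocks with the state carried across should match `np.convolve(x, h)` truncated to the input length, which is a convenient way to sanity-check an implementation like this.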

drowe67 · 2025-12-02T23:46:17Z

OK so we need a way to make sure these optimisations perform exactly the same as main. The complex_bpf_test test looks pretty good (actually 2 sub-tests). Can you pls:

Modify complex_bpf_test to write 2 complex 64 bit format .c64 files to disk, say complex_bpf_test1.c64, complex_bpf_test2.c64, that are the output of the filter.
Using these files, let's compare the output samples for the vanilla and optimised filter. I can do that if you email me the 4 files.

tmiw · 2025-12-03T09:08:17Z

OK so we need a way to make sure these optimisations perform exactly the same as main. The complex_bpf_test test looks pretty good (actually 2 sub-tests). Can you pls:

Modify complex_bpf_test to write 2 complex 64 bit format .c64 files to disk, say complex_bpf_test1.c64, complex_bpf_test2.c64, that are the output of the filter.

Using these files, let's compare the output samples for the vanilla and optimised filter. I can do that if you email me the 4 files.

Hopefully 08b9964 is what you meant. I emailed the results from this PR (and from main after I backported the changes to generate the .c64 files).

drowe67 · 2025-12-03T21:43:13Z

radae/dsp.py

   rx = np.cos(2*np.pi*centre_freq_Hz*np.arange(Fs_Hz)/Fs_Hz)    # 1 sec real sinewave
   rx_bpf = bpf.bpf(rx)
+   rx_bpf.tofile("complex_bpf_test1.c64")
   print(rx.shape,rx_bpf.shape)


I'm not getting sensible results when I load the emailed files into Octave:

orig_test1=load_c64('~/Downloads/complex_bpf_test/orig_complex_bpf_test1.c64',1);
orig_test1(1:10)
ans =
   8.6456e-21 + 1.2816e-02i
            0 - 1.2046e-04i
            0 + 8.4085e-01i
            0 + 8.4085e-01i
   1.5336e-28 + 8.1424e-01i
   2.4249e+07 + 9.5343e-01i
  -3.6893e+19 - 9.3648e-01i
   3.6893e+19 + 9.8109e-01i
   1.9996e+21 - 1.0402e+00i
  -1.0940e+18 + 8.6109e-01i

load_c64.m can be found in #42

Hmm, it looks like NumPy is using complex128 and not complex64. Might explain the issues I'm having with radae_rx_basic having a slightly higher loss figure now, too. I'll experiment some more.
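A short sketch of the dtype issue described above. NumPy complex arrays default to complex128 (16 bytes per sample), so `tofile` on such an array does not produce the 8-byte-per-sample layout a complex64 reader expects. The sample rate, tone frequency, and file name below are illustrative assumptions, not values from the test.

```python
import os
import tempfile
import numpy as np

fs_hz = 8000                               # assumed sample rate, for illustration
t = np.arange(fs_hz) / fs_hz
x = np.exp(2j * np.pi * 1500 * t)          # dtype is complex128 by default

# Cast explicitly before writing so each sample is two float32 values.
path = os.path.join(tempfile.mkdtemp(), "example.c64")
x.astype(np.complex64).tofile(path)
size_bytes = os.path.getsize(path)         # 8 bytes per sample

# Reading it back with the matching dtype recovers the signal.
x_read = np.fromfile(path, dtype=np.complex64)

# Interpreting complex128 bytes as complex64 yields twice as many "samples",
# most of them meaningless, which is the kind of garbage seen in Octave.
misread = np.frombuffer(x.tobytes(), dtype=np.complex64)
```

Checking the file size against 8 bytes times the expected sample count is a quick way to catch this before loading the file elsewhere.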

tmiw · 2025-12-05T17:52:12Z

OK, I was able to figure out the issue with allocating x_mem in place. Here's an updated set of files for comparison (actually in complex64 this time). Was able to confirm that load_c64 returned something that looked right with these files too.

complex_bpf_test_files.zip

(main_* = from main, optim_* = from this PR)

drowe67 · 2025-12-05T20:23:42Z

radae/dsp.py

+
+      # Store concatenated memory and baseband samples into x_mem
+      np.concatenate([self.mem, x_baseband], out=self.x_mem[0:len(x_baseband) + len(self.mem)])
+


Just an observation - I guess this is kinda where Python gets awkward and we might as well be using C as we're having to think about in place versus pointers etc 🤔 Oh well, it is what it is.
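To illustrate the in-place point, here is a small sketch of `np.concatenate` writing into a slice of a buffer that was allocated earlier. The array sizes and values are made up for the example. It also shows one pitfall worth keeping in mind, which is not necessarily the one hit in this PR: if the saved state is a view into `x_mem`, later writes to `x_mem` can change it, so the sketch takes a copy.

```python
import numpy as np

mem = np.arange(3, dtype=np.csingle)              # previous filter state
x_baseband = np.arange(10, 15, dtype=np.csingle)  # new input block
x_mem = np.zeros(16, dtype=np.csingle)            # allocated once, oversized

n_total = len(mem) + len(x_baseband)
dest = x_mem[:n_total]            # basic slicing gives a view onto x_mem
# out= must have exactly the shape of the concatenated result.
result = np.concatenate([mem, x_baseband], out=dest)

# Copy the tail so the saved state does not alias x_mem.
next_mem = x_mem[n_total - len(mem):n_total].copy()
```

Because `dest` is a view, the concatenated samples land directly in `x_mem` and the rest of the buffer is untouched.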

drowe67 · 2025-12-05T20:35:37Z

Thanks @tmiw. It looks like one file is empty:

 dr-radev2 $ ls ~/Downloads/*.c64 -l
-rw-r--r-- 1 user user 64000 Dec  6 04:15 /home/user/Downloads/main_complex_bpf_test1.c64
-rw-r--r-- 1 user user     0 Dec  6 04:15 /home/user/Downloads/main_complex_bpf_test2.c64
-rw-r--r-- 1 user user 64000 Dec  6 04:13 /home/user/Downloads/optim_complex_bpf_test1.c64
-rw-r--r-- 1 user user 61440 Dec  6 04:13 /home/user/Downloads/optim_complex_bpf_test2.c64

tmiw · 2025-12-06T07:39:32Z

Thanks @tmiw. It looks like one file is empty:

Oops, I added the write a bit too early for the second file. Try this:

complex_bpf_test_files_2.zip

drowe67 · 2025-12-06T21:07:41Z

Looks good @tmiw 👍

octave:60> mt1=load_c64("~/Downloads/main_complex_bpf_test1.c64",1);
octave:61> mt2=load_c64("~/Downloads/main_complex_bpf_test2.c64",1);
octave:62> figure(1);
octave:63> plot(mt1)
octave:64> plot(mt2)
octave:65> ot1=load_c64("~/Downloads/optim_complex_bpf_test1.c64",1);
octave:66> ot2=load_c64("~/Downloads/optim_complex_bpf_test2.c64",1);
octave:67> plot(ot1)
octave:68> plot(ot2)
octave:69> plot(mt1-ot1)
octave:70> plot(mt2-ot2)

Shows only some tiny differences. This sort of test could be automated in a C port, e.g. to compare C to Python, but a one-off manual test is OK for this PR.
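If this comparison were automated, it might look something like the sketch below. The function name `compare_c64`, the tolerance of 1e-6, and the synthetic file names are assumptions for illustration; the thread does not specify an acceptable error bound.

```python
import os
import tempfile
import numpy as np

def compare_c64(path_a, path_b, tol=1e-6):
    """Return (ok, max_abs_err) for two files of interleaved float32 I/Q samples."""
    a = np.fromfile(path_a, dtype=np.complex64)
    b = np.fromfile(path_b, dtype=np.complex64)
    if len(a) != len(b):
        # A length mismatch (for example an empty file) counts as a failure.
        return False, float("inf")
    max_err = float(np.max(np.abs(a - b))) if len(a) else 0.0
    return max_err <= tol, max_err

# Self-contained demo: synthetic data stands in for the real test outputs.
tmp = tempfile.mkdtemp()
main_path = os.path.join(tmp, "main_example.c64")
optim_path = os.path.join(tmp, "optim_example.c64")
ref = np.exp(2j * np.pi * 0.1 * np.arange(1000)).astype(np.complex64)
ref.tofile(main_path)
(ref + np.complex64(1e-7)).tofile(optim_path)   # tiny perturbation
ok, err = compare_c64(main_path, optim_path)
```

Returning the maximum absolute error alongside the pass/fail flag keeps the result useful when the tolerance needs tuning.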

tmiw added 2 commits December 1, 2025 13:23

Port bpf changes from previous PR.

510f062

Add aarch64 ctests.

a59c5df

tmiw mentioned this pull request Dec 1, 2025

Performance improvements to support the Flex waveform #58

Closed

drowe67 reviewed Dec 2, 2025

tmiw added 2 commits December 3, 2025 00:44

Allow x_filt max size to be specified in the constructor.

961bc9a

Write test outputs as .c64 files.

08b9964

tmiw added 3 commits December 3, 2025 01:39

Oops, need to pass length of rx in for tests to work properly.

ba6983a

Increase max sample size for BBFM BPF.

0f7833e

Increase BBFM BPF maximum again.

a0923f6

drowe67 reviewed Dec 3, 2025

tmiw added 5 commits December 3, 2025 17:00

Remove reallocation of x_mem. Note: this seems to fail radae_rx_basic.

aa6baf6

np.concat with out causes incorrect results for some reason, remove static alloc of x_mem.

fc1a35e

Store np.dot result in-place.

ef1fc96

Figured out why allocating x_mem only once caused test failures.

95f4f15

Calculate right hand side of phase_vec calculation only once.

5095514

drowe67 reviewed Dec 5, 2025

drowe67 merged commit 98094c4 into main Dec 6, 2025
2 checks passed

