@tmiw
Collaborator

@tmiw tmiw commented Dec 1, 2025

Per previous PLT discussion, this PR ports only the BPF changes from PR #58 as this is shared with RADEV2. Please see original PR for rationale and benchmarks.

radae/dsp.py Outdated

# Reallocate x_mem if x_baseband size changes
if len(self.x_mem) != (len(self.mem) + len(x_baseband)):
    self.x_mem = np.zeros(len(self.mem) + len(x_baseband), dtype=np.csingle)
Owner

I like the idea of pre-allocating x_filt, however let's do it once at init time, based on the maximum buffer length passed into the init function. Have an assert at run time to make sure we don't get passed a bigger n than we have room for.

Collaborator Author

Updated to allocate only in the constructor.

Owner

L75 performs a run time allocation of self.x_mem? The assert should check the length of x_mem.

Owner

@drowe67 drowe67 Dec 3, 2025

The idea is no run time allocation of memory, sorry if I didn't make that clear.
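
A minimal sketch of the pattern requested in this thread: allocate every buffer once in the constructor, sized for the largest block, and assert at run time instead of resizing. The class name `PreallocBPF`, the `max_len` parameter, and the moving-average taps are hypothetical stand-ins, not the code in radae/dsp.py.

```python
import numpy as np

class PreallocBPF:
    """Hypothetical filter skeleton; names and taps are illustrative only."""

    def __init__(self, ntap, max_len):
        self.ntap = ntap
        # Placeholder moving-average taps (assumption, not the real BPF design)
        self.h = np.ones(ntap, dtype=np.csingle) / ntap
        # Filter state carried between calls
        self.mem = np.zeros(ntap - 1, dtype=np.csingle)
        # Allocated once, sized for the largest block we will ever accept
        self.x_mem = np.zeros(ntap - 1 + max_len, dtype=np.csingle)
        self.x_filt = np.zeros(max_len, dtype=np.csingle)

    def bpf(self, x):
        n = len(x)
        n_mem = len(self.mem)
        # Guard the preallocated buffer rather than growing it
        assert n_mem + n <= len(self.x_mem), "input longer than max_len"
        # Copy state and new samples into the existing buffer
        self.x_mem[:n_mem] = self.mem
        self.x_mem[n_mem:n_mem + n] = x
        # Plain per-sample loop, kept simple for clarity
        for i in range(n):
            self.x_filt[i] = np.dot(self.x_mem[i:i + self.ntap], self.h)
        # Save the last ntap-1 samples in place; no new array is created
        self.mem[:] = self.x_mem[n:n + n_mem]
        return self.x_filt[:n]

f = PreallocBPF(ntap=4, max_len=16)
y = f.bpf(np.ones(8))
```

Note that the returned array is a view of `x_filt`, so in this sketch a caller that needs the samples after the next call would have to copy them.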

radae/dsp.py Outdated
# The advantage of operating on the strided array is that we make only one transition between Python
# and the NumPy C code, reducing overhead.
self.x_filt[0:n] = np.dot(x_mem_slided, self.h)
self.mem = self.x_mem[-self.Ntap-1:] # save filter state for next time
Owner

@drowe67 drowe67 Dec 2, 2025

OK, well explained 👍
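
For readers following along, here is a small sketch of the strided approach described in the code comment above. The diff does not show how `x_mem_slided` is built, so `numpy.lib.stride_tricks.sliding_window_view` is an assumption here, as are the random test vectors. It checks that one `np.dot` over the windowed view matches a per-sample loop.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
ntap = 5
h = rng.standard_normal(ntap) + 1j * rng.standard_normal(ntap)
x_mem = rng.standard_normal(20) + 1j * rng.standard_normal(20)
n = len(x_mem) - ntap + 1

# Reference: one Python-level np.dot call per output sample
y_loop = np.array([np.dot(x_mem[i:i + ntap], h) for i in range(n)])

# Strided version: windows has shape (n, ntap) and is a view, not a copy
windows = sliding_window_view(x_mem, ntap)
# A single transition into NumPy's C code computes every output sample
y_strided = np.dot(windows, h)

print(np.allclose(y_loop, y_strided))
```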

@drowe67
Owner

drowe67 commented Dec 2, 2025

OK so we need a way to make sure these optimisations perform exactly the same as main. The complex_bpf_test test looks pretty good (actually 2 sub-tests). Can you pls:

  1. Modify complex_bpf_test to write 2 complex 64 bit format .c64 files to disk, say complex_bpf_test1.c64, complex_bpf_test2.c64, that are the output of the filter.
  2. Using these files, let's compare the output samples for the vanilla and optimised filter. I can do that if you email me the 4 files.

@tmiw
Collaborator Author

tmiw commented Dec 3, 2025

Hopefully 08b9964 is what you meant. I emailed the results from this PR (and from main after I backported the changes to generate the .c64 files).

rx = np.cos(2*np.pi*centre_freq_Hz*np.arange(Fs_Hz)/Fs_Hz) # 1 sec real sinewave
rx_bpf = bpf.bpf(rx)
rx_bpf.tofile("complex_bpf_test1.c64")
print(rx.shape,rx_bpf.shape)
Owner

@drowe67 drowe67 Dec 3, 2025

I'm not getting sensible results when I load the emailed files into Octave:

orig_test1=load_c64('~/Downloads/complex_bpf_test/orig_complex_bpf_test1.c64',1);
orig_test1(1:10)
ans =

   8.6456e-21 + 1.2816e-02i
            0 - 1.2046e-04i
            0 + 8.4085e-01i
            0 + 8.4085e-01i
   1.5336e-28 + 8.1424e-01i
   2.4249e+07 + 9.5343e-01i
  -3.6893e+19 - 9.3648e-01i
   3.6893e+19 + 9.8109e-01i
   1.9996e+21 - 1.0402e+00i
  -1.0940e+18 + 8.6109e-01i

load_c64.m can be found in #42

Collaborator Author

Hmm, it looks like NumPy is using complex128 and not complex64. Might explain the issues I'm having with radae_rx_basic having a slightly higher loss figure now, too. I'll experiment some more.
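
A short sketch of the dtype mismatch described here, under stated assumptions: `Fs_Hz = 8000`, `centre_freq_Hz = 1500`, and the 8-tap complex filter are hypothetical stand-ins for the real test. `np.cos` yields float64, so filtering with complex taps gives complex128 (16 bytes per sample), and reading those bytes back as complex64 produces twice as many nonsensical samples. Casting before writing avoids that.

```python
import numpy as np

Fs_Hz = 8000             # assumed sample rate
centre_freq_Hz = 1500    # hypothetical centre frequency

rx = np.cos(2 * np.pi * centre_freq_Hz * np.arange(Fs_Hz) / Fs_Hz)  # float64
# Stand-in complex taps (assumption, not the real BPF design)
h = np.exp(1j * 2 * np.pi * centre_freq_Hz * np.arange(8) / Fs_Hz)
y = np.convolve(rx, h)[:len(rx)]

print(y.dtype, y.itemsize)   # dtype and bytes per sample of the filter output

# What a complex64 reader sees if the complex128 bytes are written as-is
misread = np.frombuffer(y.tobytes(), dtype=np.complex64)

# Cast explicitly before writing; tofile() would write these same bytes
y64 = y.astype(np.complex64)
roundtrip = np.frombuffer(y64.tobytes(), dtype=np.complex64)

print(len(y), len(misread), len(roundtrip), y64.nbytes)
```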

@tmiw
Collaborator Author

tmiw commented Dec 5, 2025

OK, I was able to figure out the issue with allocating x_mem in place. Here's an updated set of files for comparison (actually in complex64 this time). Was able to confirm that load_c64 returned something that looked right with these files too.

complex_bpf_test_files.zip

(main_* = from main, optim_* = from this PR)


# Store concatenated memory and baseband samples into x_mem
np.concatenate([self.mem, x_baseband], out=self.x_mem[0:len(x_baseband) + len(self.mem)])

Owner

Just an observation - I guess this is kinda where Python gets awkward and we might as well be using C as we're having to think about in place versus pointers etc 🤔 Oh well, it is what it is.
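
To illustrate the in-place versus pointer point, a minimal sketch with toy data (the array contents and sizes are made up). `np.concatenate` accepts an `out=` argument, and a basic slice of the preallocated buffer is a view, so the result lands in the existing memory. The second half shows the general view-versus-copy pitfall when saving state from such a buffer; it is not claimed to be the specific issue fixed in this PR.

```python
import numpy as np

mem = np.arange(3, dtype=np.csingle)              # toy filter state
x_baseband = np.arange(10, 15, dtype=np.csingle)  # toy input block
x_mem = np.zeros(16, dtype=np.csingle)            # preallocated, larger than needed

n_total = len(mem) + len(x_baseband)
# The slice is a view, so concatenate writes straight into x_mem
ret = np.concatenate([mem, x_baseband], out=x_mem[0:n_total])
print(np.shares_memory(ret, x_mem))

# Saving state: a slice aliases the buffer, .copy() does not
state_view = x_mem[n_total - 3:n_total]
state_copy = x_mem[n_total - 3:n_total].copy()

x_mem[:] = 0   # buffer reused for the next block
print(state_view, state_copy)   # the view follows the buffer, the copy does not
```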

@drowe67
Owner

drowe67 commented Dec 5, 2025

Thanks @tmiw. It looks like one file is empty:

 dr-radev2 $ ls ~/Downloads/*.c64 -l
-rw-r--r-- 1 user user 64000 Dec  6 04:15 /home/user/Downloads/main_complex_bpf_test1.c64
-rw-r--r-- 1 user user     0 Dec  6 04:15 /home/user/Downloads/main_complex_bpf_test2.c64
-rw-r--r-- 1 user user 64000 Dec  6 04:13 /home/user/Downloads/optim_complex_bpf_test1.c64
-rw-r--r-- 1 user user 61440 Dec  6 04:13 /home/user/Downloads/optim_complex_bpf_test2.c64

@tmiw
Collaborator Author

tmiw commented Dec 6, 2025

Oops, I added the write a bit too early for the second file. Try this:

complex_bpf_test_files_2.zip

@drowe67
Owner

drowe67 commented Dec 6, 2025

Looks good @tmiw 👍

octave:60> mt1=load_c64("~/Downloads/main_complex_bpf_test1.c64",1);
octave:61> mt2=load_c64("~/Downloads/main_complex_bpf_test2.c64",1);
octave:62> figure(1);
octave:63> plot(mt1)
octave:64> plot(mt2)
octave:65> ot1=load_c64("~/Downloads/optim_complex_bpf_test1.c64",1);
octave:66> ot2=load_c64("~/Downloads/optim_complex_bpf_test2.c64",1);
octave:67> plot(ot1)
octave:68> plot(ot2)
octave:69> plot(mt1-ot1)
octave:70> plot(mt2-ot2)

Shows only some tiny differences. This sort of test could be automated in a C port, e.g. to compare C to Python, but a one-off manual test is OK for this PR.
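
If this comparison were automated, a Python version might look like the sketch below. The helper name `max_abs_diff_c64`, the demo file names, and the 1e-5 threshold are all assumptions; the thread only says the differences were tiny. The demo writes its own stand-in files to a temporary directory rather than using the real test outputs.

```python
import os
import tempfile
import numpy as np

def max_abs_diff_c64(path_a, path_b):
    """Largest sample-wise difference between two complex64 files (hypothetical helper)."""
    a = np.fromfile(path_a, dtype=np.complex64)
    b = np.fromfile(path_b, dtype=np.complex64)
    # Catch the empty-file case seen earlier in this conversation
    if a.size == 0 or b.size == 0:
        raise ValueError("one of the files is empty")
    if a.shape != b.shape:
        raise ValueError(f"length mismatch: {a.size} vs {b.size} samples")
    return float(np.max(np.abs(a - b)))

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in data: a complex tone and a slightly perturbed copy
    main_out = np.exp(1j * 0.1 * np.arange(1000)).astype(np.complex64)
    optim_out = main_out + np.complex64(1e-6)
    main_path = os.path.join(tmp, "main_demo.c64")
    optim_path = os.path.join(tmp, "optim_demo.c64")
    main_out.tofile(main_path)
    optim_out.tofile(optim_path)

    diff = max_abs_diff_c64(main_path, optim_path)
    print("max abs difference:", diff)
    assert diff < 1e-5   # assumed tolerance, not taken from the thread
```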

@drowe67 drowe67 merged commit 98094c4 into main Dec 6, 2025
2 checks passed