Hi @evanmayer, do you have a way to run your code without hardware?
I've found a few places where it could be improved. If not, I've added some ideas below in case you ever want to try them.
- Input to the channelizer is read-only, so you might not need to initialize `x` to zeros:
```python
x = cp.empty(len(x)+len(x)%n_branches, dtype=np.complex128) + x  # Line 39
```
- Move to an elementwise kernel to reduce data transfer and launch overhead:
```python
xcorr_array = psd_0 * (cp.conj(psd_1) * cp.exp(2j * (cp.pi) * freqs * (total_lag / rate) ))  # Line 106
```
- Move to an elementwise kernel to reduce data transfer and launch overhead:
```python
xcorr *= cp.exp(2j * cp.pi * freqs * (integer_lag / rate) )  # Line 185
# Prepare to fit residual phase gradient:
phases = cp.angle(xcorr)  # Line 187
# Due to receiver bandpass shape, edge frequencies have less power => less certain phase
# Assign weights accordingly
weights = cp.abs(xcorr)  # Line 190
```
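
For the first idea, here is a rough sketch of what I had in mind. One caveat: adding `x` to a `cp.empty` buffer would mix the samples with uninitialized memory, so the sketch copies the samples in and zeros only the padded tail. `pad_to_branches` is a hypothetical helper, not something in your repo, and it assumes the goal is a length that is a multiple of `n_branches`, which is my guess at the intent. It falls back to NumPy when no CUDA device is found, so it can be tried without hardware.

```python
import numpy as np

# Use CuPy when a CUDA device is usable, otherwise fall back to NumPy.
try:
    import cupy as cp
    xp = cp if cp.cuda.runtime.getDeviceCount() > 0 else np
except Exception:
    xp = np


def pad_to_branches(x, n_branches):
    """Hypothetical helper: copy x into a buffer whose length is a
    multiple of n_branches (assumed intent), zeroing only the tail."""
    n = len(x)
    n_pad = (-n) % n_branches              # samples needed to reach a multiple
    out = xp.empty(n + n_pad, dtype=xp.complex128)
    out[:n] = xp.asarray(x)                # copy the input samples
    out[n:] = 0                            # initialize only the padding
    return out
```

This writes each output element once instead of zero-filling the whole buffer and then adding into it.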
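
For the elementwise-kernel idea, a minimal sketch of the Line 106 expression as a single `cp.ElementwiseKernel` could look like the following. The names `xcorr_with_lag`, `xcorr_with_lag_reference`, and `lag_over_rate` are mine, and I'm assuming `psd_0` and `psd_1` are `complex128` and `freqs` is `float64`. There is also a NumPy reference path, so the function can be called and compared without a GPU.

```python
import numpy as np

# Detect a usable CUDA device; otherwise run everything on the CPU.
try:
    import cupy as cp
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp, HAVE_GPU = None, False


def xcorr_with_lag_reference(psd_0, psd_1, freqs, total_lag, rate):
    """NumPy version of the Line 106 expression (CPU reference)."""
    return psd_0 * (np.conj(psd_1) * np.exp(2j * np.pi * freqs * (total_lag / rate)))


if HAVE_GPU:
    # One fused kernel: no intermediate arrays for conj(), exp(), or the products.
    _xcorr_kernel = cp.ElementwiseKernel(
        "complex128 psd_0, complex128 psd_1, float64 freqs, float64 lag_over_rate",
        "complex128 xcorr",
        """
        const double phase = 6.283185307179586 * freqs * lag_over_rate;
        xcorr = psd_0 * conj(psd_1) * complex<double>(cos(phase), sin(phase));
        """,
        "xcorr_with_lag",
    )


def xcorr_with_lag(psd_0, psd_1, freqs, total_lag, rate):
    """Hypothetical wrapper: fused GPU kernel if available, NumPy otherwise."""
    if HAVE_GPU:
        return _xcorr_kernel(psd_0, psd_1, freqs, total_lag / rate)
    return xcorr_with_lag_reference(psd_0, psd_1, freqs, total_lag, rate)
```

Lines 185 to 190 could probably be handled the same way, since `ElementwiseKernel` accepts several output parameters: one kernel could apply the phase rotation and write `phases` and `weights` in the same pass.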