Elementwise Kernel optimizations.

Hi @evanmayer, do you have a way to run your code without hardware?
I've found a few places to make improvements. If not, I've added the ideas below in case you ever want to try them.

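1. The input to the channelizer is read-only, so you might not need to initialize `x` to zeros. Since `cp.empty` returns uninitialized memory, the input has to be copied into the new buffer rather than added to it (`x_padded` is just a placeholder name):
```python
x_padded = cp.empty(len(x) + len(x) % n_branches, dtype=np.complex128)  # Line 39
x_padded[:len(x)] = x  # copy the input in; any padded tail stays uninitialized
```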

2. Move this expression to an elementwise kernel (`cp.ElementwiseKernel`) to reduce memory traffic from temporary arrays and kernel launch overhead:
```python
xcorr_array = psd_0 * (cp.conj(psd_1) * cp.exp(2j * cp.pi * freqs * (total_lag / rate)))  # Line 106
```

3. Move these three steps to a single elementwise kernel as well, for the same reason (fewer temporary arrays and kernel launches):
```python
xcorr *= cp.exp(2j * cp.pi * freqs * (integer_lag / rate))  # Line 185
# Prepare to fit residual phase gradient:
phases = cp.angle(xcorr)  # Line 187
# Due to receiver bandpass shape, edge frequencies have less power => less certain phase
# Assign weights accordingly
weights = cp.abs(xcorr)  # Line 190
```
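
For items 2 and 3, here is a rough sketch of what the fused versions could look like with `cp.ElementwiseKernel`. It assumes `psd_0`, `psd_1` and `xcorr` are `complex128`, `freqs` is `float64`, and the lags and `rate` are plain scalars; I haven't checked those against the repo. The names `fused_xcorr`, `fused_rotate`, `HAVE_GPU` and `phase_per_hz` are placeholders of mine, and the NumPy branch is only there so the logic can be checked on a machine without a CUDA device.

```python
import numpy as np

try:
    import cupy as cp
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy not installed, or no usable CUDA device
    cp = None
    HAVE_GPU = False

if HAVE_GPU:
    # Item 2: psd_0 * conj(psd_1) * exp(i * theta) in a single kernel launch.
    _xcorr_kernel = cp.ElementwiseKernel(
        "complex128 psd_0, complex128 psd_1, float64 freqs, float64 phase_per_hz",
        "complex128 out",
        """
        double theta = freqs * phase_per_hz;
        out = psd_0 * conj(psd_1) * complex<double>(cos(theta), sin(theta));
        """,
        "fused_xcorr",
    )
    # Item 3: phase rotation, angle and magnitude in a single kernel launch.
    _rotate_kernel = cp.ElementwiseKernel(
        "complex128 xcorr, float64 freqs, float64 phase_per_hz",
        "complex128 rotated, float64 phases, float64 weights",
        """
        double theta = freqs * phase_per_hz;
        rotated = xcorr * complex<double>(cos(theta), sin(theta));
        phases = arg(rotated);
        weights = abs(rotated);
        """,
        "fused_rotate",
    )


def fused_xcorr(psd_0, psd_1, freqs, total_lag, rate):
    """Same result as the Line 106 expression (assumed scalar lag and rate)."""
    phase_per_hz = 2.0 * np.pi * float(total_lag) / float(rate)
    if HAVE_GPU:
        return _xcorr_kernel(psd_0, psd_1, freqs, phase_per_hz)
    # CPU reference path, for machines without a GPU
    return psd_0 * (np.conj(psd_1) * np.exp(1j * phase_per_hz * freqs))


def fused_rotate(xcorr, freqs, integer_lag, rate):
    """Return (rotated xcorr, phases, weights), as in Lines 185-190."""
    phase_per_hz = 2.0 * np.pi * float(integer_lag) / float(rate)
    if HAVE_GPU:
        return _rotate_kernel(xcorr, freqs, phase_per_hz)
    # CPU reference path, for machines without a GPU
    rotated = xcorr * np.exp(1j * phase_per_hz * freqs)
    return rotated, np.angle(rotated), np.abs(rotated)
```

Unlike the in-place `xcorr *= ...` on Line 185, `fused_rotate` returns a new `rotated` array, so the caller would need to rebind `xcorr` to it. Whether the fusion actually pays off depends on the array sizes, so it would be worth timing both versions before switching.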
