[Bug] ExpandMixedKernel fails with "Tensor view not found" for V→C pattern in split=UP_DOWN block #965

### Component

Codegen

### Description

When a block opened with `pl.at(level=pl.Level.CORE_GROUP, split=pl.SplitMode.UP_DOWN)` (or `pl.incore(split=...)`) contains vector ops that write to a GM tensor via `pl.assemble`, followed by cube ops that read the same GM tensor via `pl.slice` as matmul input, the `ExpandMixedKernel` pass fails with:

```
Tensor view not found for parameter: mid__tile
```

The pattern is: **vector writes GM tensor → cube reads same GM tensor** within a single split block. The compiler does not generate a tensor view mapping for the GM tensor on the cube (AIC) side.

This is a separate issue from #963 (which involves slicing matmul output). Here the matmul *input* comes from a GM tensor that was just written by vector ops in the same block.

### Steps to Reproduce

Minimal reproducer (`examples/beginner/vc_mixed_test.py` in pypto-lib):

```python
import pypto.language as pl

M, K, N = 16, 128, 64

@pl.program
class VCTest:
    @pl.function(type=pl.FunctionType.Opaque)
    def vc_test(
        self,
        x: pl.Tensor[[M, K], pl.FP32],
        w: pl.Tensor[[K, N], pl.BF16],
        out: pl.Out[pl.Tensor[[M, N], pl.FP32]],
    ) -> pl.Tensor[[M, N], pl.FP32]:
        # Intermediate GM tensor handed from the vector side to the cube side.
        mid = pl.create_tensor([M, K], dtype=pl.BF16)
        with pl.at(level=pl.Level.CORE_GROUP, split=pl.SplitMode.UP_DOWN):
            # Vector side: scale x, cast to BF16, and store the result to GM.
            x_tile = pl.slice(x, [M, K], [0, 0])
            scaled = pl.mul(x_tile, 0.5)
            scaled_bf16 = pl.cast(scaled, target_type=pl.BF16)
            mid = pl.assemble(mid, scaled_bf16, [0, 0])
            # Cube side: load the tensor just written, then run the matmul.
            a = pl.slice(mid, [M, K], [0, 0])
            b = pl.slice(w, [K, N], [0, 0])
            c = pl.matmul(a, b, out_dtype=pl.FP32)
            out = pl.assemble(out, c, [0, 0])
        return out
```

Run: `python vc_mixed_test.py -p a2a3`

**Note:** Splitting into two separate `pl.at()` blocks (one for vector, one for cube) compiles and runs correctly.

### Expected Behavior

The `ExpandMixedKernel` pass should handle the V→C GM tensor handoff within a single split block: vector side writes `mid` via assemble (store to GM), cube side reads `mid` via slice (load from GM). The kernel should compile successfully.
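
Once the kernel compiles, its output could be checked against a host-side model of the reproducer's arithmetic, `matmul(bf16(x * 0.5), w)`. The sketch below is one way to do that with the standard library only. The helper names `to_bf16` and `reference_vc` are hypothetical and not part of pypto. It also assumes that `pl.cast` rounds to nearest-even, which this report does not confirm.

```python
import struct


def to_bf16(value: float) -> float:
    """Round a finite FP32 value to BF16 (nearest-even), returned as a float."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Adding 0x7FFF plus the lowest kept bit rounds ties to even;
    # the mask then drops the 16 low mantissa bits.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def reference_vc(x, w):
    """Host-side model of the reproducer: matmul(bf16(x * 0.5), w).

    x is an M x K nested list; w is a K x N nested list whose values are
    assumed to be BF16-representable already.
    """
    # Vector side: scale and cast (assumed rounding mode, see above).
    mid = [[to_bf16(v * 0.5) for v in row] for row in x]
    # Cube side: plain matmul, accumulated in Python floats (double precision).
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in mid]
```

Because the model accumulates in double precision and the hardware accumulation order is unknown, a comparison against the kernel's FP32 output should use a tolerance rather than exact equality.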

### Actual Behavior

```
Failed to compile group 'vc_test_incore_0':
Tensor view not found for parameter: mid__tile
```

### Git Commit ID

babf1585df02a7e6413a86f8d24d9d7bfb54bea5

### NPU Kind

Ascend 910C

### Host Platform

Linux (aarch64)

### Additional Context

- Related: #963 (ExpandMixedKernel drops matmul when output is sliced — C→V direction)
- This issue is the **V→C direction**: vector writes a GM tensor, cube reads it
- pypto branch: `feat/incore-split-param`
- pypto-lib commit: `8cfb7c0`
- Also encountered in `qwen3_32b_decode_scope2.py` when attempting to merge softmax (Stage 3, vector) + SV matmul (Stage 4, cube) into a single split block
