[bug] : Default Config Parallelism Strategy Issue for Qwen3.5-VL 122B-A10B SFT

### Problem


## Description

The default configuration returned by `qwen35_vl_122b_a10b_sft_config` is inconsistent: the parallelism settings passed in the code do not fit the GPU count stated in its docstring.

### Configuration Analysis

From the code:
```python
def qwen35_vl_122b_a10b_sft_config(hf_path: str = "Qwen/Qwen3.5-122B-A10B") -> ConfigContainer:
    """Return a full SFT config for Qwen3.5-VL 122B-A10B (MoE).

    Default configuration: 4 nodes, 32 GPUs
    - TP=2, PP=6, EP=8
    - LR=2e-5 (full SFT)
    - Sequence length: 4096

    Args:
        hf_path: HuggingFace model ID or local path to model directory.
    """
    cfg = _sft_common_vlm()
    _qwen35_vl_apply_common(cfg, hf_path, tp=2, pp=6, max_lr=2e-5, min_lr=2e-6, gbs=36)
    _qwen35_vl_apply_moe(cfg, ep=8)
    _qwen35_vl_enable_recompute(cfg)
    return cfg
```

However, PP=6 and EP=8 cannot be mapped evenly onto 32 GPUs: 32 is not divisible by 6, and PP × EP = 48 already exceeds 32.
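The mismatch can be checked with a few lines of arithmetic. This is a minimal sketch, not the framework's own validation: `check_parallelism` is a hypothetical helper, and both divisibility rules are assumptions (world size divisible by TP × PP, and by PP × EP as reasoned in this issue). The real constraints may differ, for example if expert tensor parallelism is also involved.

```python
def check_parallelism(world_size: int, tp: int, pp: int, ep: int) -> list[str]:
    """Return the violated rules; an empty list means the layout fits.

    Both rules below are assumptions made for this issue, not the
    framework's actual validation logic.
    """
    problems = []
    # Assumed rule 1: every pipeline stage needs a full tensor-parallel group.
    if world_size % (tp * pp) != 0:
        problems.append(
            f"world_size={world_size} is not divisible by TP*PP={tp * pp}"
        )
    # Assumed rule 2 (the reasoning in this issue): each pipeline stage
    # needs a full expert-parallel group.
    if world_size % (pp * ep) != 0:
        problems.append(
            f"world_size={world_size} is not divisible by PP*EP={pp * ep}"
        )
    return problems


# Values taken from the docstring (32 GPUs) and the function body (tp=2, pp=6, ep=8).
problems = check_parallelism(world_size=32, tp=2, pp=6, ep=8)
for p in problems:
    print(p)
```

Under these assumed rules the documented 32-GPU default fails both checks, while `check_parallelism(48, 2, 6, 8)` returns an empty list.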

### Minimal repro

```shell
nope
```

### Expected behavior

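This config needs 48 GPUs or more (PP × EP = 6 × 8 = 48), so either the docstring's "4 nodes, 32 GPUs" default or the parallelism settings should be corrected.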
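The 48-GPU figure can be derived as follows. This is only a sketch: `min_world_size` is a hypothetical helper, the divisibility rules (TP × PP and PP × EP must both divide the world size) are assumptions taken from this issue's reasoning, and 8 GPUs per node is inferred from the docstring's "4 nodes, 32 GPUs". It requires Python 3.9+ for `math.lcm`.

```python
import math


def min_world_size(tp: int, pp: int, ep: int, gpus_per_node: int = 8) -> int:
    """Smallest GPU count that fills whole nodes and satisfies the assumed rules."""
    # Assumed rules: world size must be a multiple of TP*PP and of PP*EP.
    base = math.lcm(tp * pp, pp * ep)
    # Round up to a whole number of nodes (8 GPUs per node is an assumption).
    return math.lcm(base, gpus_per_node)


gpus = min_world_size(tp=2, pp=6, ep=8)
nodes = gpus // 8
print(f"minimum GPUs: {gpus}, nodes: {nodes}")
```

With the values from the config this gives 48 GPUs, i.e. 6 nodes of 8 GPUs instead of the documented 4.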

### Affected area

area:misc

### Regression?

No

### Environment

_No response_

### Logs

```shell

```

[bug] : Default Config Parallelism Strategy Issue for Qwen3.5-VL 122B-A10B SFT #3747
