Inquiry about DeltaFlow: 2-frame and 5-frame inputs give very close results #27

@turswiming

Problem Statement

We conducted an ablation study comparing model performance with 2-frame versus 5-frame inputs. Surprisingly, the results show only minimal differences (<0.1% on most metrics), despite the significant difference in input information.

Experimental Setup

Model: DeltaFlow

Configuration:

Voxel size: ${voxel_size}

Point cloud range: ${point_cloud_range}

Planes: [16, 32, 64, 128, 256, 256, 128, 64, 32, 16]

Num_layer: [2, 2, 2, 2, 2, 2, 2, 2, 2] (MinkUnet 18)

Decay factor: 0.4

Decoder option: default
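To make the possible effect of the decay factor concrete, here is a minimal sketch. It assumes, purely for illustration, that the k-th history frame (k = 1 being the most recent) is scaled by decay^k, and that the 5-frame input carries three history frames. `decay_weights` is a hypothetical helper, not a DeltaFlow function, and the actual weighting in the codebase may differ.

```python
from typing import List


def decay_weights(num_history: int, decay: float) -> List[float]:
    """Return one weight per history frame, most recent first.

    ASSUMPTION: frame k (k = 1 is the most recent history frame) is
    scaled by decay ** k. This is an illustrative guess, not the
    confirmed DeltaFlow formula.
    """
    return [decay ** k for k in range(1, num_history + 1)]


# Three history frames (5-frame input minus the 2 current frames), decay = 0.4.
weights = decay_weights(num_history=3, decay=0.4)
for k, w in enumerate(weights, start=1):
    print(f"history frame t-{k}: weight ~ {w:.4f}")
```

Under this assumption, the oldest of the three history frames is scaled by roughly 0.064, so older frames would contribute comparatively little to the fused features.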

Results

2-Frame Input (without history frames):
[Image: 2-Frame Results]

5-Frame Input (with history frames):
[Image: 5-Frame Results]

Key Observations

Both configurations achieve nearly identical performance across most evaluation metrics.

For most metrics, the difference stays within 1%.

Additional experiments show that the 2-frame version converges to performance levels very close to the 5-frame version within 5 training epochs.
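Since "within 1%" can be read as either a relative or an absolute difference, a small helper like the one below could make the comparison explicit. The metric names and values are placeholders, not our measured results, and `relative_diff_percent` is a hypothetical helper.

```python
from typing import Dict


def relative_diff_percent(
    baseline: Dict[str, float], other: Dict[str, float]
) -> Dict[str, float]:
    """Relative difference of `other` w.r.t. `baseline`, in percent.

    Only metrics present in both dicts are compared; metrics whose
    baseline value is zero are skipped to avoid division by zero.
    """
    shared = sorted(baseline.keys() & other.keys())
    return {
        name: 100.0 * abs(other[name] - baseline[name]) / abs(baseline[name])
        for name in shared
        if baseline[name] != 0
    }


# PLACEHOLDER numbers for illustration only (not results from this issue).
two_frame = {"metric_a": 0.200, "metric_b": 0.050}
five_frame = {"metric_a": 0.201, "metric_b": 0.050}

for name, diff in relative_diff_percent(two_frame, five_frame).items():
    print(f"{name}: {diff:.2f}% relative difference")
```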

Question/Concern

Given the substantial difference in input information (2 frames vs 5 frames), we expected a larger performance gap. The minimal observed difference raises questions about:

Whether the model is effectively utilizing the additional temporal information from 5-frame inputs

Potential issues with the implementation or configuration
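One way to probe the first question would be to replace the history frames with noise at inference time and measure how much the prediction changes. The sketch below uses toy stand-in models and plain lists rather than DeltaFlow itself. The frame ordering (oldest first, last two entries being the current pair) and all function names are assumptions made for illustration.

```python
import random
from typing import Callable, List

Frame = List[float]
Model = Callable[[List[Frame]], List[float]]


def history_sensitivity(model: Model, frames: List[Frame], seed: int = 0) -> float:
    """Mean absolute output change when history frames are replaced by noise.

    ASSUMPTION: `frames` is ordered oldest first and the last two entries
    are the current frame pair; everything before them is history.
    """
    rng = random.Random(seed)
    history, pair = frames[:-2], frames[-2:]
    noisy_history = [[rng.gauss(0.0, 1.0) for _ in frame] for frame in history]
    baseline = model(history + pair)
    perturbed = model(noisy_history + pair)
    return sum(abs(a - b) for a, b in zip(baseline, perturbed)) / len(baseline)


def toy_two_frame_model(frames: List[Frame]) -> List[float]:
    """Toy model that looks only at the last two frames."""
    prev, curr = frames[-2], frames[-1]
    return [c - p for p, c in zip(prev, curr)]


def toy_history_model(frames: List[Frame]) -> List[float]:
    """Toy model that also mixes in the history frames."""
    base = toy_two_frame_model(frames)
    return [b + 0.4 * sum(h[i] for h in frames[:-2]) for i, b in enumerate(base)]


frames = [[0.0, 1.0], [0.5, 1.5], [1.0, 2.0], [1.5, 2.5], [2.0, 3.0]]
ignored = history_sensitivity(toy_two_frame_model, frames)
used = history_sensitivity(toy_history_model, frames)
print("history ignored:", ignored)
print("history used:", used)
```

A sensitivity near zero for the trained 5-frame model would suggest that it largely ignores the history frames. A clearly non-zero value would point instead toward the extra frames being used but adding little accuracy.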
