[Bug] Pipeline parameters (ParameterInteger, ParameterString) fail in ModelTrainer hyperparameters due to safe_serialize #5504

@lopezfelipe

PySDK Version

  • PySDK V2 (2.x)
  • PySDK V3 (3.x)

Describe the bug
Pipeline parameters (e.g., ParameterInteger, ParameterString) cannot be used in ModelTrainer hyperparameters because the safe_serialize function in sagemaker-train/src/sagemaker/train/utils.py doesn't handle PipelineVariable objects, causing a TypeError when building the pipeline.

To reproduce
Pass max_depth as a pipeline parameter in the hyperparameters of a ModelTrainer that uses the XGBoost container:

from sagemaker.core.workflow.parameters import ParameterInteger
from sagemaker.train import ModelTrainer
from sagemaker.core.training.configs import Compute
from sagemaker.core.workflow.pipeline_context import PipelineSession
from sagemaker.mlops.workflow.steps import TrainingStep
from sagemaker.mlops.workflow.pipeline import Pipeline

# Create pipeline session and parameter
# (`role` is assumed to hold an IAM execution role ARN defined earlier)
pipeline_session = PipelineSession()
max_depth = ParameterInteger(name="MaxDepth", default_value=5)

# Create ModelTrainer with pipeline parameter in hyperparameters
model_trainer = ModelTrainer(
    training_image="683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.0-1-cpu-py3",
    compute=Compute(instance_type="ml.m5.xlarge", instance_count=1),
    sagemaker_session=pipeline_session,
    role=role,
    hyperparameters={
        "max_depth": max_depth,  # Pipeline parameter
    },
)

train_args = model_trainer.train()
step_train = TrainingStep(name="TrainStep", step_args=train_args)

# Create and upsert pipeline
pipeline = Pipeline(
    name="test-pipeline",
    parameters=[max_depth],
    steps=[step_train],
    sagemaker_session=pipeline_session,
)

Building the pipeline then fails at this step:

pipeline.upsert(role_arn=role)

Expected behavior
Pipeline parameters should be serialized correctly and the pipeline should be created successfully, allowing hyperparameters to be parameterized at pipeline execution time.

Screenshots or logs
Error logs

│ /opt/conda/lib/python3.12/site-packages/sagemaker/train/utils.py:191 in safe_serialize           │
│                                                                                                  │
│   188 │   try:                                                                                   │
│   189 │   │   return json.dumps(data)                                                            │
│   190 │   except TypeError:                                                                      │
│ ❱ 191 │   │   return str(data)                                                                   │
│   192                                                                                            │
│   193                                                                                            │
│   194 def _run_clone_command_silent(repo_url, dest_dir):                                         │
│                                                                                                  │
│ /opt/conda/lib/python3.12/site-packages/sagemaker/core/helper/pipeline_variable.py:38 in __str__ │
│                                                                                                  │
│   35 │                                                                                           │
│   36 │   def __str__(self):                                                                      │
│   37 │   │   """Override built-in String function for PipelineVariable"""                        │
│ ❱ 38 │   │   raise TypeError(                                                                    │
│   39 │   │   │   "Pipeline variables do not support __str__ operation. "                         │
│   40 │   │   │   "Please use `.to_string()` to convert it to string type in execution time "     │
│   41 │   │   │   "or use `.expr` to translate it to Json for display purpose in Python SDK."     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: Pipeline variables do not support __str__ operation. Please use `.to_string()` to convert it to string type in execution time or use `.expr` to translate it to Json for display purpose in Python SDK.
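
The traceback suggests a two-stage failure: json.dumps cannot encode the pipeline variable, and the str() fallback then hits the __str__ override that raises TypeError. The sketch below reproduces that mechanism without the SDK and shows one possible guard. It is an illustration under assumptions, not the SDK's actual fix: the PipelineVariable class here is a stand-in that only mirrors the __str__ behavior shown in the logs, safe_serialize_current reproduces only the lines visible in the traceback, and safe_serialize_guarded is a hypothetical variant that assumes downstream code can resolve a pipeline variable passed through unchanged.

```python
import json


class PipelineVariable:
    """Stand-in (assumption) for the SDK's PipelineVariable base class."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Mirrors the behavior shown in the traceback: str() is not allowed.
        raise TypeError("Pipeline variables do not support __str__ operation.")


def safe_serialize_current(data):
    """Reproduces only the lines visible in the traceback (utils.py:188-191)."""
    try:
        return json.dumps(data)
    except TypeError:
        return str(data)


def safe_serialize_guarded(data):
    """Hypothetical variant: return pipeline variables untouched.

    Assumes the pipeline definition builder resolves the variable later.
    """
    if isinstance(data, PipelineVariable):
        return data
    try:
        return json.dumps(data)
    except TypeError:
        return str(data)


max_depth = PipelineVariable("MaxDepth")

# The current logic fails: json.dumps raises TypeError, then str() raises again.
try:
    safe_serialize_current(max_depth)
    current_failed = False
except TypeError:
    current_failed = True

# The guarded variant hands the variable back and still serializes plain values.
passed_through = safe_serialize_guarded(max_depth)
plain_value = safe_serialize_guarded(5)
```

Whether passing the object through is sufficient depends on how the training request is later converted into the pipeline definition, which this issue does not show.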

System information

  • SageMaker Python SDK version: 3.3.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): XGBoost
  • Framework version: 1.0
  • Python version: 3.12.9
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N
