This guide shows you how to add a custom optimizer to DeepFense.
Optimizers in DeepFense are registered functions that create PyTorch optimizer instances. They must be registered with the `@register_optimizer` decorator from `deepfense/utils/registry.py`. Optimizers are typically wrappers around PyTorch's built-in optimizers with custom parameter configurations.

Add your optimizer to `deepfense/training/optimizers/utils.py`:
```python
import torch
from deepfense.utils.registry import register_optimizer


@register_optimizer("my_optimizer")
def MyOptimizer(params, config):
    """
    Custom optimizer builder function.

    Args:
        params: Model parameters (iterable) to optimize
        config: Dictionary containing optimizer configuration
            - lr: Learning rate (required)
            - momentum: Momentum parameter (optional)
            - weight_decay: Weight decay (optional)
            - Other optimizer-specific parameters

    Returns:
        PyTorch optimizer instance
    """
    lr = config.get("lr", 0.001)
    momentum = config.get("momentum", 0.9)
    weight_decay = config.get("weight_decay", 1e-4)
    nesterov = config.get("nesterov", False)

    # Create and return optimizer
    return torch.optim.SGD(
        params,
        lr=lr,
        momentum=momentum,
        weight_decay=weight_decay,
        nesterov=nesterov
    )
```

Alternatively, if you prefer to keep it in the main registry file, add it to `deepfense/utils/registry.py` at the end of the file, near any existing optimizer registrations or in a dedicated section.
The optimizer is automatically registered when the module is imported. Check that it's registered:
```bash
deepfense list --component-type optimizers
```

Or programmatically:
```python
from deepfense.training.optimizers import utils  # Import to register
from deepfense.utils.registry import OPTIMIZER_REGISTRY

# Check if registered
if "my_optimizer" in OPTIMIZER_REGISTRY:
    print("Optimizer registered successfully!")
print("Available optimizers:", OPTIMIZER_REGISTRY.list())Use your optimizer in a YAML configuration file:
```yaml
training:
  optimizer:
    type: "my_optimizer"  # Your registered name
    lr: 0.001
    momentum: 0.9
    weight_decay: 1e-4
    nesterov: true
```
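Conceptually, the keys under `optimizer` end up as the config dictionary passed to your builder. Here is a sketch of that mapping using the same `build_optimizer` helper as the testing example below; it is illustrative, not DeepFense's internal dispatch code:

```python
import torch
from deepfense.training.optimizers import utils  # Import to register
from deepfense.utils.registry import build_optimizer

# Sketch only: the builder is looked up by the "type" key and the remaining
# keys (lr, momentum, ...) are forwarded to it as the config dict.
model = torch.nn.Linear(10, 1)  # stand-in for your real model
optimizer_cfg = {
    "type": "my_optimizer",
    "lr": 0.001,
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "nesterov": True,
}
optimizer = build_optimizer(optimizer_cfg["type"], model.parameters(), optimizer_cfg)
```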
Here's a complete example for SGD with custom parameter groups:

```python
import torch
from deepfense.utils.registry import register_optimizer


@register_optimizer("custom_sgd")
def CustomSGDOptimizer(params, config):
    """
    SGD optimizer with separate learning rates for different parameter groups.

    Supports:
    - Different learning rates for frontend/backend
    - Layer-wise learning rates
    - Custom momentum and weight decay
    """
    base_lr = config.get("lr", 0.001)
    momentum = config.get("momentum", 0.9)
    weight_decay = config.get("weight_decay", 1e-4)
    nesterov = config.get("nesterov", False)

    # Option 1: Simple - same LR for all parameters
    if not config.get("use_param_groups", False):
        return torch.optim.SGD(
            params,
            lr=base_lr,
            momentum=momentum,
            weight_decay=weight_decay,
            nesterov=nesterov
        )

    # Option 2: Different LRs for different parameter groups.
    # This branch expects (name, parameter) pairs, e.g. model.named_parameters().
    named_params = list(params)  # materialize so the iterable can be scanned twice
    param_groups = []

    # Group 1: Frontend parameters (typically frozen, but if not)
    frontend_lr = config.get("frontend_lr", base_lr * 0.1)
    frontend_params = [p for n, p in named_params if "frontend" in n]
    if frontend_params:
        param_groups.append({
            "params": frontend_params,
            "lr": frontend_lr,
            "momentum": momentum,
            "weight_decay": weight_decay
        })

    # Group 2: Backend parameters
    backend_params = [p for n, p in named_params if "backend" in n or "loss" in n]
    if backend_params:
        param_groups.append({
            "params": backend_params,
            "lr": base_lr,
            "momentum": momentum,
            "weight_decay": weight_decay
        })

    # If no groups matched, use all params
    if not param_groups:
        param_groups = [{"params": [p for _, p in named_params]}]
    # lr here is the default for any group that does not set its own
    return torch.optim.SGD(param_groups, lr=base_lr, momentum=momentum, nesterov=nesterov)
```
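Note that the parameter-group branch above matches on parameter names, so when `use_param_groups` is enabled the builder expects `(name, parameter)` pairs rather than a bare parameter iterable. A usage sketch, assuming `build_optimizer` forwards the iterable you pass it unchanged:

```python
import torch.nn as nn
from deepfense.utils.registry import build_optimizer

# Stand-in model; real models would expose "frontend"/"backend" parameter names.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))
config = {"type": "custom_sgd", "use_param_groups": True, "lr": 0.001, "frontend_lr": 0.0001}

# Pass named_parameters() so the name-based grouping has names to match against.
optimizer = build_optimizer(config["type"], model.named_parameters(), config)
```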
Here's an example that registers Adam with custom beta values under the name `adam_warmup` (learning-rate warmup itself is typically handled by schedulers rather than by the optimizer builder):

```python
import torch
from deepfense.utils.registry import register_optimizer


@register_optimizer("adam_warmup")
def AdamWarmupOptimizer(params, config):
    """
    Adam optimizer with custom beta values.
    """
    lr = config.get("lr", 0.001)
    betas = config.get("betas", (0.9, 0.999))
    weight_decay = config.get("weight_decay", 1e-4)
    eps = config.get("eps", 1e-8)
    amsgrad = config.get("amsgrad", False)

    return torch.optim.Adam(
        params,
        lr=lr,
        betas=betas,
        weight_decay=weight_decay,
        eps=eps,
        amsgrad=amsgrad
    )
```
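If you do want warmup with this optimizer, the usual approach is to attach a scheduler rather than baking warmup into the builder; for example, a linear warmup with `torch.optim.lr_scheduler.LambdaLR` (a generic PyTorch sketch, independent of DeepFense's scheduler registry):

```python
import torch

model = torch.nn.Linear(10, 1)  # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Linearly scale the LR from ~0 up to its configured value over warmup_steps.
warmup_steps = 500
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

# In the training loop: call optimizer.step() then scheduler.step() each iteration.
```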
Example of adding RAdam (Rectified Adam) if you have the timm library:

```python
import torch
from deepfense.utils.registry import register_optimizer

try:
    from timm.optim import RAdam
except ImportError:
    RAdam = None


@register_optimizer("radam")
def RAdamOptimizer(params, config):
    """
    RAdam (Rectified Adam) optimizer.

    Requires: pip install timm
    """
    if RAdam is None:
        raise ImportError("RAdam requires 'timm' package. Install with: pip install timm")

    lr = config.get("lr", 0.001)
    betas = config.get("betas", (0.9, 0.999))
    weight_decay = config.get("weight_decay", 1e-4)
    eps = config.get("eps", 1e-8)

    return RAdam(
        params,
        lr=lr,
        betas=betas,
        weight_decay=weight_decay,
        eps=eps
    )
```

Example of wrapping an optimizer with Lookahead:
```python
import torch
from deepfense.utils.registry import register_optimizer

try:
    from pytorch_optimizer import Lookahead
except ImportError:
    Lookahead = None


@register_optimizer("lookahead_adam")
def LookaheadAdamOptimizer(params, config):
    """
    Adam optimizer wrapped with Lookahead.

    Requires: pip install pytorch-optimizer
    """
    if Lookahead is None:
        raise ImportError("Lookahead requires 'pytorch-optimizer' package")

    # Base optimizer config
    base_lr = config.get("lr", 0.001)
    betas = config.get("betas", (0.9, 0.999))
    weight_decay = config.get("weight_decay", 1e-4)

    # Create base optimizer
    base_optimizer = torch.optim.Adam(
        params,
        lr=base_lr,
        betas=betas,
        weight_decay=weight_decay
    )

    # Wrap with Lookahead
    lookahead_k = config.get("lookahead_k", 5)
    lookahead_alpha = config.get("lookahead_alpha", 0.5)

    return Lookahead(
        base_optimizer,
        k=lookahead_k,
        alpha=lookahead_alpha
    )
```

Keep these points in mind when writing an optimizer builder:

- Use `@register_optimizer` decorator: Register with a unique string name
- Function signature: Must accept `(params, config)`, where `params` is an iterable of model parameters
- Return optimizer: Must return a PyTorch optimizer instance
- Config parameters: Extract parameters from the config dictionary
- Default values: Provide sensible defaults for optional parameters
- Automatic registration: Optimizers register themselves as soon as their module is imported; no manual registration call is needed
Your optimizer function should follow this pattern:
```python
@register_optimizer("optimizer_name")
def MyOptimizer(params, config):
    """
    Args:
        params: Iterable of model parameters (typically model.parameters())
        config: Dictionary with optimizer configuration
            - lr: Learning rate (typically required)
            - Other optimizer-specific parameters

    Returns:
        torch.optim.Optimizer: PyTorch optimizer instance
    """
    lr = config.get("lr", 0.001)  # Extract with defaults
    # ... other parameters
    return torch.optim.SomeOptimizer(params, lr=lr, ...)
```

Test your optimizer before using it in training:
```python
import torch
import torch.nn as nn
from deepfense.training.optimizers import utils  # Import to register
from deepfense.utils.registry import build_optimizer

# Create a simple model
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 1)
)

# Create optimizer config
optimizer_config = {
    "type": "my_optimizer",
    "lr": 0.001,
    "momentum": 0.9,
    "weight_decay": 1e-4
}

# Build optimizer
optimizer = build_optimizer(optimizer_config["type"], model.parameters(), optimizer_config)

# Test optimizer step
dummy_input = torch.randn(2, 10)
dummy_target = torch.randn(2, 1)

output = model(dummy_input)
loss = nn.MSELoss()(output, dummy_target)

optimizer.zero_grad()
loss.backward()
optimizer.step()

print("Optimizer step completed successfully!")
print(f"Optimizer type: {type(optimizer)}")DeepFense supports any PyTorch optimizer. Common ones include:
- `torch.optim.SGD` - Stochastic Gradient Descent
- `torch.optim.Adam` - Adam
- `torch.optim.AdamW` - Adam with decoupled weight decay
- `torch.optim.RMSprop` - RMSprop
- `torch.optim.Adagrad` - Adagrad
- `torch.optim.Adadelta` - Adadelta
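For instance, a minimal `AdamW` registration follows the same pattern as the examples above (the `"adamw"` name is only an illustration, not an optimizer that necessarily ships with DeepFense):

```python
import torch
from deepfense.utils.registry import register_optimizer


@register_optimizer("adamw")
def AdamWOptimizer(params, config):
    """AdamW with decoupled weight decay."""
    return torch.optim.AdamW(
        params,
        lr=config.get("lr", 0.001),
        betas=config.get("betas", (0.9, 0.999)),
        weight_decay=config.get("weight_decay", 0.01),
    )
```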
You can also use optimizers from external libraries like `timm` or `pytorch-optimizer`.
- See Adding Schedulers for learning rate schedulers
- See Training Guide for how to use optimizers in training
- See Configuration Reference for full config options
- See existing optimizers in `deepfense/training/optimizers/utils.py` for reference