@JingXuu JingXuu commented Oct 10, 2025

  1. Set the receptor as the motif and treat the peptide as the scaffold to be designed.
  2. Change features: res_seq_pdb_idx (peptide residue indices offset by +200) and chain_break_per_res.
  3. Add metrics calculation code.
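
Item 2 does not spell out the exact convention, so the following is only a minimal sketch of one plausible reading: the peptide's residue numbers continue after the receptor but are shifted by 200, and the residue directly before the chain boundary is flagged as a chain break. The helper name `build_index_features`, the 1-based numbering, and the placement of the break flag are all assumptions for illustration, not the PR's actual code.

```python
def build_index_features(n_receptor, n_peptide, peptide_offset=200):
    """Return (res_seq_pdb_idx, chain_break_per_res) as plain lists.

    Hypothetical helper: numbering and flag placement are assumptions.
    """
    # Receptor residues are numbered 1..n_receptor.
    receptor_idx = list(range(1, n_receptor + 1))
    # Peptide numbering continues after the receptor, shifted by the offset.
    peptide_idx = [n_receptor + peptide_offset + i for i in range(1, n_peptide + 1)]
    res_seq_pdb_idx = receptor_idx + peptide_idx

    # Flag the last receptor residue, i.e. the residue right before the break.
    chain_break_per_res = [0.0] * (n_receptor + n_peptide)
    if n_receptor > 0 and n_peptide > 0:
        chain_break_per_res[n_receptor - 1] = 1.0
    return res_seq_pdb_idx, chain_break_per_res


idx, breaks = build_index_features(n_receptor=4, n_peptide=3)
print(idx)     # [1, 2, 3, 4, 205, 206, 207]
print(breaks)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

A large gap in the residue index is a common way to signal to a sequence-position encoding that two segments are not covalently linked; whether 200 is the right size for this model is a choice the PR author would need to confirm.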

@JingXuu JingXuu requested a review from Copilot October 10, 2025 09:02
Copilot AI left a comment

Pull Request Overview

This PR implements peptide design for the Proteina model by treating the receptor as the motif and designing the peptide as the scaffold. The key modifications are chain-aware PDB writing, peptide-specific feature engineering, and metrics calculation for evaluation.

Key changes:

  • Enhanced PDB writing to support multi-chain structures with proper chain indexing
  • Modified training pipeline to handle peptide-specific features including residue indexing with offsets and chain break markers
  • Added comprehensive metrics calculation infrastructure including RMSD computation, structure diversity analysis, and designability assessment
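
As a rough illustration of the RMSD computation mentioned in the last bullet, the sketch below computes the root-mean-square deviation between two equally sized coordinate lists. It assumes the structures are already superimposed; it does not perform any alignment (for example Kabsch superposition), and the function name `rmsd` is hypothetical rather than taken from the PR.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two lists of (x, y, z) tuples, without superposition."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    # Sum of squared per-atom distances.
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, shifted))  # 1.0
```

The result is in whatever unit the input coordinates use, so both structures need to be in the same unit before comparison.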

Reviewed Changes

Copilot reviewed 29 out of 31 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| proteinfoundation/utils/ff_utils/pdb_utils.py | Enhanced PDB writing with chain support and motif-based structure organization |
| proteinfoundation/proteinflow/model_trainer_base.py | Added peptide features and validation RMSD tracking |
| proteinfoundation/motif_inference.py | Enhanced inference with PDB patching and RMSD calculation |
| test/cal_metrics/ | Complete metrics calculation suite including RMSD, diversity, and designability |
| configs/ | Configuration files for peptide training and inference |

Comment on lines 272 to +273
  boundaries = torch.linspace(
-     0.0, max_dist_boundary, num_dist_buckets - 1, device=pair_pred.device
+     0.1, max_dist_boundary, num_dist_buckets - 1, device=pair_pred.device
Copilot AI Oct 10, 2025

Hardcoded value 0.1 for distance boundary start may cause issues with very short distances. Consider using 0.0 or making this configurable.
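
To make the effect of the first boundary concrete, here is a small pure-Python sketch. It stands in for `torch.linspace` and assumes the boundaries are later used with the `torch.bucketize` default convention (`boundaries[i-1] < value <= boundaries[i]`); whether the surrounding code bins distances that way is an assumption, and the numeric values are illustrative rather than taken from the PR's configs.

```python
from bisect import bisect_left


def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (pure-Python stand-in)."""
    if num < 2:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def bucketize(value, boundaries):
    """Return index i such that boundaries[i-1] < value <= boundaries[i]."""
    return bisect_left(boundaries, value)


max_dist_boundary = 1.0  # illustrative value, not taken from the PR
num_dist_buckets = 6     # illustrative value, not taken from the PR

from_zero = linspace(0.0, max_dist_boundary, num_dist_buckets - 1)
from_tenth = linspace(0.1, max_dist_boundary, num_dist_buckets - 1)

# Compare the bin index of a few short distances under both settings.
for d in (0.0, 0.05, 0.3):
    print(d, bucketize(d, from_zero), bucketize(d, from_tenth))
```

Under this convention, a first boundary of 0.0 leaves bin 0 for distances that are exactly zero, while a first boundary of 0.1 puts every distance up to 0.1 into bin 0. Exposing the start value as a config option would let both behaviours be tested.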

n=n, shape=batch_shape, device=self.device, dtype=dtype, mask=mask
)

batch['x_1'] = x_1 #Ensure that the unit used during training is nanometers (nm) not ang!!!!!
Copilot AI Oct 10, 2025

Nonstandard abbreviation: 'ang' should be written as 'Å' or 'angstroms'.

Suggested change
- batch['x_1'] = x_1 #Ensure that the unit used during training is nanometers (nm) not ang!!!!!
+ batch['x_1'] = x_1 # Ensure that the unit used during training is nanometers (nm), not angstroms (Å)!
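
Since the code comment hinges on the nm versus Å distinction, a minimal sketch of the conversion may help (1 nm = 10 Å). The helper names below are hypothetical and operate on plain tuples; the PR itself works on tensors and may handle units elsewhere.

```python
ANGSTROM_PER_NM = 10.0  # 1 nanometer equals 10 angstroms


def angstrom_to_nm(coords):
    """Convert a list of (x, y, z) tuples from angstroms to nanometers."""
    return [tuple(c / ANGSTROM_PER_NM for c in xyz) for xyz in coords]


def nm_to_angstrom(coords):
    """Convert a list of (x, y, z) tuples from nanometers to angstroms."""
    return [tuple(c * ANGSTROM_PER_NM for c in xyz) for xyz in coords]


# PDB files store coordinates in angstroms; convert before feeding a model
# that is assumed to be trained in nanometers.
ca_angstrom = [(38.0, 0.0, 0.0), (0.0, 15.0, 0.0)]
ca_nm = angstrom_to_nm(ca_angstrom)
print(ca_nm)
```

Converting once at the data boundary, and converting back only when writing PDB output, keeps the unit assumption in one place instead of relying on a comment.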

if random.random() > 0.5 and self.cfg_exp.training.self_cond:
x_pred_sc, _ = self.predict_clean(batch)
batch["x_sc"] = self.detach_gradients(x_pred_sc)
#import pdb; pdb.set_trace()
Copilot AI Oct 10, 2025

Commented-out debug statement should be removed from production code.

Suggested change
- #import pdb; pdb.set_trace()

motif_min_length = cfg.motif_min_length,
motif_max_length = cfg.motif_max_length,
motif_task_name = cfg.motif_task_name)
#import pdb;pdb.set_trace()
Copilot AI Oct 10, 2025

Commented-out debug statement should be removed from production code.

Suggested change
- #import pdb;pdb.set_trace()

Returns:
List of sequences (strings)
"""
# import pdb;pdb.set_trace()
Copilot AI Oct 10, 2025

Commented-out debug statement should be removed from production code.

Suggested change
- # import pdb;pdb.set_trace()

Comment on lines +55 to +56
# cal_metrics.py (only code & comments in English)

Copilot AI Oct 10, 2025

Orphaned comment that doesn't relate to surrounding code should be removed or clarified.

Suggested change
- # cal_metrics.py (only code & comments in English)

chain_index=np.concatenate([prot_rec.chain_index, prot_pep.chain_index], axis=0),
b_factors=np.concatenate([prot_rec.b_factors, prot_pep.b_factors], axis=0),
)
pdb_prot = to_pdb(full_prot, model=1, add_end=True) # Now `add_end=True` for automatic ENDMDL and END
Copilot AI Oct 10, 2025

Comment mentions 'Now add_end=True' but doesn't explain why this change was made or when it should be used.

Suggested change
- pdb_prot = to_pdb(full_prot, model=1, add_end=True) # Now `add_end=True` for automatic ENDMDL and END
+ # Use add_end=True to ensure ENDMDL and END records are included in the PDB output.
+ # This is necessary when writing combined structures (e.g., motif and peptide as separate chains)
+ # to properly terminate the PDB file for downstream compatibility. In other cases, set add_end=False
+ # if termination records are not required.
+ pdb_prot = to_pdb(full_prot, model=1, add_end=True)
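
For readers unfamiliar with the records involved, the sketch below shows what terminating a single-model PDB file with ENDMDL and END looks like. `wrap_pdb_model` is a hypothetical helper written for illustration; it is not the repository's `to_pdb`, whose internals are not shown in this review, and the atom lines are shortened placeholders rather than valid fixed-column ATOM records.

```python
def wrap_pdb_model(atom_lines, model=1, add_end=True):
    """Wrap pre-formatted coordinate lines in MODEL / ENDMDL / END records.

    Hypothetical helper: only illustrates the role of an add_end-style flag.
    """
    # MODEL record: name in columns 1-6, serial right-justified in columns 11-14.
    lines = [f"MODEL     {model:>4d}"]
    lines.extend(atom_lines)
    if add_end:
        lines.append("ENDMDL")  # closes the MODEL block
        lines.append("END")     # marks the end of the file
    return "\n".join(lines) + "\n"


# Placeholder lines standing in for the receptor (chain A) and peptide (chain B).
placeholder_atoms = ["ATOM  (receptor, chain A)", "TER", "ATOM  (peptide, chain B)", "TER"]

terminated = wrap_pdb_model(placeholder_atoms, model=1, add_end=True)
unterminated = wrap_pdb_model(placeholder_atoms, model=1, add_end=False)
print(terminated)
```

Leaving the terminators off is mainly useful when several models are concatenated into one file and the final END is written once by the caller.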

Comment on lines +324 to +325
pep_seqs = pep_seqss[1:]
print(pep_seqs)
Copilot AI Oct 10, 2025

Debug print statement should be removed or replaced with proper logging.
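
One possible replacement, sketched with the standard `logging` module: the wrapper function `drop_first_sequence` and the sample input are hypothetical, and only the `pep_seqss[1:]` slice comes from the reviewed lines.

```python
import logging

logger = logging.getLogger(__name__)


def drop_first_sequence(pep_seqss):
    """Return all sequences except the first, logging them at DEBUG level."""
    pep_seqs = pep_seqss[1:]
    # Lazy %-style formatting: the message is only built if DEBUG is enabled.
    logger.debug("Parsed %d peptide sequences: %s", len(pep_seqs), pep_seqs)
    return pep_seqs


# Hypothetical input; the first entry is skipped as in the reviewed code.
sequences = drop_first_sequence(["RECEPTORSEQ", "ACDE", "FGHI"])
```

The message stays silent by default and appears once the caller opts in, for example with `logging.basicConfig(level=logging.DEBUG)`.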

gt_base = os.path.basename(gt_pdb_path)
m_case = re.search(r'([0-9][A-Za-z0-9]{3})', gt_base)
case_id = m_case.group(1) if m_case else os.path.splitext(gt_base)[0]
#import pdb;pdb.set_trace()
Copilot AI Oct 10, 2025

Another commented-out debug statement should be removed from production code.

Suggested change
- #import pdb;pdb.set_trace()
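
Because the same leftover was flagged in several files, a small check could catch it before review. The following is a minimal sketch using only the standard library; the pattern covers the three spellings quoted in this review, and the function name `find_debug_leftovers` is hypothetical.

```python
import re

# Matches lines that consist solely of a commented-out "import pdb; pdb.set_trace()".
DEBUG_LEFTOVER = re.compile(r"^\s*#\s*import\s+pdb\s*;\s*pdb\.set_trace\(\)\s*$")


def find_debug_leftovers(source):
    """Return 1-based line numbers of commented-out pdb breakpoints."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if DEBUG_LEFTOVER.match(line)
    ]


sample = (
    "x = 1\n"
    "#import pdb; pdb.set_trace()\n"
    "y = 2\n"
    "    # import pdb;pdb.set_trace()\n"
    "#import pdb;pdb.set_trace()\n"
)
print(find_debug_leftovers(sample))  # [2, 4, 5]
```

Run over each changed file's text, a non-empty result would indicate lines to delete before merging.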
