【Hackathon 10th Spring No.12】AlloyGAN Model Reproduction by r-cloudforge · Pull Request #265 · PaddlePaddle/PaddleMaterials

r-cloudforge · 2026-04-10T20:05:10Z

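Overview

This PR reproduces the AlloyGAN model from the paper Inverse Materials Design by Large Language Model-Assisted Generative Framework (Hao et al., arXiv:2502.18127, 2025), using photon-git/AlloyGAN as the reference implementation.

AlloyGAN uses a conditional generative adversarial network (CGAN) to inverse-design metallic glass alloys with a target glass-forming ability (GFA).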

What's new

Model (`ppmat/models/alloygan/`)

- AlloyGenerator: G(31→512→40), with LeakyReLU(0.2) and a Softmax output (guarantees that each composition sums to 1.0)
- AlloyDiscriminator: D(66→1024→1), with LeakyReLU(0.2) and a Sigmoid output
- Supports both GAN and CGAN modes
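
As a rough illustration of the G(31→512→40) shape described above, the following NumPy sketch runs one forward pass through a single hidden layer with LeakyReLU(0.2) and a Softmax output. It is not the PaddlePaddle code in this PR: the function names, the weight initialization, and the random input are assumptions made only for the example.

```python
import numpy as np

# Dimensions taken from the PR description: 5 noise + 26 conditions -> 512 -> 40.
NOISE_DIM, COND_DIM, HIDDEN_DIM, COMP_DIM = 5, 26, 512, 40


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope quoted in the PR (0.2)."""
    return np.where(x > 0, x, slope * x)


def softmax(x):
    """Row-wise softmax; subtracting the row max keeps exp() stable."""
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


def init_generator(rng):
    """Hypothetical initializer (small Gaussian weights); an assumption."""
    in_dim = NOISE_DIM + COND_DIM  # 31
    return {
        "w1": rng.normal(0.0, 0.02, size=(in_dim, HIDDEN_DIM)),
        "b1": np.zeros(HIDDEN_DIM),
        "w2": rng.normal(0.0, 0.02, size=(HIDDEN_DIM, COMP_DIM)),
        "b2": np.zeros(COMP_DIM),
    }


def generator_forward(params, g_input):
    """Linear -> LeakyReLU -> Linear -> Softmax."""
    hidden = leaky_relu(g_input @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    return softmax(logits)


rng = np.random.default_rng(0)
params = init_generator(rng)
g_input = rng.normal(size=(8, NOISE_DIM + COND_DIM))  # stand-in batch
compositions = generator_forward(params, g_input)
```

Each row of `compositions` is a non-negative 40-vector whose entries sum to 1, which is the property the Softmax output is meant to provide.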

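Dataset (`ppmat/datasets/alloy_dataset.py`)

- Loads alloy data in CSV format (40 composition columns + 26 condition columns)
- Normalization: compositions are divided by 100; conditions are MinMax-scaled to [0, 1]
- Optional filtering by element category (Cu/Fe/Ti/Zr)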
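
The normalization above can be sketched as follows. This is a minimal NumPy version, not the code in `alloy_dataset.py`; it assumes the composition columns come first and are given in percent, and the function name and the handling of constant columns are choices made for the example.

```python
import numpy as np


def normalize_alloy_table(table, n_comp=40):
    """Split a numeric table into compositions and conditions and scale both.

    Assumption: the first `n_comp` columns are compositions in percent and
    the remaining columns are conditions.
    """
    table = np.asarray(table, dtype=np.float64)
    comp = table[:, :n_comp] / 100.0          # percent -> fraction
    cond = table[:, n_comp:]
    lo = cond.min(axis=0)
    hi = cond.max(axis=0)
    # Constant columns would give 0/0; use a span of 1 so they map to 0.
    span = np.where(hi > lo, hi - lo, 1.0)
    cond_scaled = (cond - lo) / span          # column-wise min-max to [0, 1]
    return comp, cond_scaled


# Toy table with 2 composition columns and 2 condition columns.
toy = [
    [60.0, 40.0, 300.0, 5.0],
    [50.0, 50.0, 500.0, 5.0],
    [100.0, 0.0, 400.0, 5.0],
]
comp, cond_scaled = normalize_alloy_table(toy, n_comp=2)
```

In practice the min and max would be computed on the training set and reused at sampling time, so that conditions passed to the generator are on the same scale.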

Training/evaluation (`inverse_design/train.py`)

- BCELoss with an EPS clamp, Adam(β1=0.5, β2=0.999)
- Evaluation: Wasserstein distance (per column), composition-sum statistics, per-category metrics
- Supports saving and loading checkpoints
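
To show why the EPS clamp matters, here is a small NumPy sketch of binary cross-entropy with clamped predictions, applied to the usual discriminator and generator objectives. The value of `EPS`, the function name, and the toy discriminator outputs are assumptions; the actual `train.py` may differ.

```python
import numpy as np

EPS = 1e-7  # assumed value; the PR does not state the constant used


def bce_loss(pred, target, eps=EPS):
    """Binary cross-entropy with predictions clamped to [eps, 1 - eps].

    Without the clamp, a saturated Sigmoid output of exactly 0 or 1
    would produce log(0) = -inf.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    losses = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(np.mean(losses))


# Toy discriminator outputs, including saturated values 1.0 and 0.0.
d_real = np.array([0.9, 1.0, 0.8])
d_fake = np.array([0.1, 0.0, 0.3])

# Discriminator: real samples labeled 1, generated samples labeled 0.
d_loss = bce_loss(d_real, np.ones_like(d_real)) + bce_loss(d_fake, np.zeros_like(d_fake))
# Generator: tries to make the discriminator output 1 on generated samples.
g_loss = bce_loss(d_fake, np.ones_like(d_fake))
```

Both losses stay finite even though two of the toy outputs are saturated.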

Data preparation (`tools/prepare_alloy_data.py`)

- Automatically parses 1,302 alloy records from the paper's appendix PDF
- Generates the training CSV

Configuration files

- alloygan_cgan.yaml: CGAN mode (5-dim noise + 26-dim conditions)
- alloygan_gan.yaml: standard GAN mode (100-dim noise)
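
The dimensions in the two configs line up with the network shapes as follows: 5 + 26 = 31 generator inputs in CGAN mode, and 40 + 26 = 66 discriminator inputs. The sketch below only illustrates this bookkeeping; the helper name, the Gaussian noise, and the concatenation order for the discriminator are assumptions.

```python
import numpy as np


def build_generator_input(rng, batch_size, noise_dim, conditions=None):
    """Return noise alone (GAN mode) or noise concatenated with conditions (CGAN mode)."""
    noise = rng.normal(size=(batch_size, noise_dim))  # assumed N(0, 1) noise
    if conditions is None:
        return noise
    return np.concatenate([noise, conditions], axis=1)


rng = np.random.default_rng(0)
conditions = rng.uniform(size=(4, 26))  # stand-in for normalized conditions

cgan_input = build_generator_input(rng, 4, noise_dim=5, conditions=conditions)
gan_input = build_generator_input(rng, 4, noise_dim=100)

# In CGAN mode the discriminator sees a 40-dim composition plus the 26 conditions.
fake_comp = rng.uniform(size=(4, 40))  # stand-in for generator output
d_input = np.concatenate([fake_comp, conditions], axis=1)
```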

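Acceptance results

Training accuracy alignment

| Configuration | Overall WD ↓ | Cu WD | Paper Cu WD | Composition sum |
| --- | --- | --- | --- | --- |
| CGAN, all data, 50 epochs | 0.025 | 0.031 | 0.41 | 1.0000 |
| CGAN, Cu-only, 200 epochs | 0.016 | 0.016 | 0.41 | 1.0000 |

Generative-model sampling metrics stay within the 5% error tolerance ✓; the measured WD is in fact substantially lower than the value reported in the paper.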
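
For readers unfamiliar with the metric, a per-column Wasserstein distance (WD) can be computed as in the sketch below, which integrates the absolute difference between the two empirical CDFs in each column. Averaging the columns into a single "overall" number is an assumption here; the PR does not state how the overall WD is aggregated, and the function names are hypothetical.

```python
import numpy as np


def wasserstein_1d(u, v):
    """1-D Wasserstein-1 distance between two empirical samples."""
    u = np.sort(np.asarray(u, dtype=np.float64))
    v = np.sort(np.asarray(v, dtype=np.float64))
    all_values = np.sort(np.concatenate([u, v]))
    deltas = np.diff(all_values)
    # Empirical CDFs evaluated on every interval between consecutive points.
    u_cdf = np.searchsorted(u, all_values[:-1], side="right") / u.size
    v_cdf = np.searchsorted(v, all_values[:-1], side="right") / v.size
    return float(np.sum(np.abs(u_cdf - v_cdf) * deltas))


def columnwise_wd(real, fake):
    """WD for each column (element) between real and generated compositions."""
    real = np.asarray(real, dtype=np.float64)
    fake = np.asarray(fake, dtype=np.float64)
    return np.array(
        [wasserstein_1d(real[:, j], fake[:, j]) for j in range(real.shape[1])]
    )


rng = np.random.default_rng(0)
real = rng.uniform(size=(50, 3))
fake = real + 0.1                     # every column shifted by 0.1
per_column = columnwise_wd(real, fake)
overall = float(per_column.mean())    # assumed aggregation: mean over columns
```

A pure shift of 0.1 gives a distance of 0.1 in every column, which is a quick sanity check for the implementation.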

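Generation quality

- Composition sum = 1.0000 (guaranteed by Softmax; the original paper's Sigmoid output gives about 1.69)
- Training converges stably (50 epochs), with the D and G losses showing normal adversarial behavior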

Usage

# 1. Prepare the data
pip install pdfplumber requests
python tools/prepare_alloy_data.py --output_dir ./data/alloy/

# 2. Train the CGAN
python inverse_design/train.py -c inverse_design/configs/alloygan/alloygan_cgan.yaml

# 3. Train the standard GAN (optional)
python inverse_design/train.py -c inverse_design/configs/alloygan/alloygan_gan.yaml

Related issue

Closes part of #194 (AlloyGAN)

- alloygan.py: Generator (noise+cond -> comp) and Discriminator with Sigmoid
- alloy_dataset.py: tabular dataset with normalize mode (comp/100, cond min-max)
- train.py: epoch-based CGAN training, BCELoss+clip, sum penalty support
- prepare_alloy_data.py: PDF parser for alloy composition data
- configs: CGAN and standard GAN configs

Training results (CPU, 2000 epochs, Cu/Fe/Ti/Zr):
- v12 (1-layer G, 512 hidden): WD=0.021, sum=95.4±11.8, dom_match=29%
- v14 (2-layer G, 256 hidden): WD=0.009, sum=96.9±7.5, dom_match=44%
  Cu: 23.9 vs 21.0, Fe: 19.0 vs 20.0 -- near-perfect element match

Next: deeper architectures + GPU training on ubu1

Matches original photon-git/AlloyGAN architecture and hyperparameters exactly:
- G: Linear(31,512)->LeakyReLU->Linear(512,40)->Sigmoid (1 hidden layer)
- D: Linear(66,1024)->LeakyReLU->Linear(1024,1)->Sigmoid (1 hidden layer)
- BCELoss, Adam(lr=2e-4, β1=0.5, β2=0.999, wd=1e-5), 50 epochs, bs=64

Key changes:
- alloy_dataset.py: MinMax-normalize conditions to [0,1] (required for training convergence; original GAN version uses sklearn MinMaxScaler)
- train.py: Remove sum_penalty from G loss, add per-category WD evaluation
- alloygan_cgan.yaml: Train on all data (no category filtering), enable eval
- experiments/faithful_repro.py: Standalone faithful repro script

Results (GPU, 50 epochs, all 1253 samples):
- Overall WD = 0.035 (paper Cu CGAN: 0.41)
- Cu WD = 0.032, Fe WD = 0.049, Ti WD = 0.034, Zr WD = 0.037
- Cu-only training (200ep): WD = 0.016

Alloy compositions are fractions that must sum to 1.0. Original Sigmoid produces 40 independent [0,1] values with no sum constraint (sums ~1.7). Softmax guarantees sum=1.0 exactly while improving WD.

Results (GPU, 50 epochs, all 1253 samples):
- Sigmoid: WD=0.035, comp sums=1.69±0.69
- Softmax: WD=0.025, comp sums=1.00±0.00 ← this commit
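
The difference between the two output activations can be checked with a few lines of NumPy. The logits below are random stand-ins, so the Sigmoid sums will not match the 1.69 measured on the trained model; the point is only that Sigmoid sums vary from row to row while Softmax sums are always 1.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function; each output is independent of the others."""
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    """Row-wise softmax; outputs in a row are coupled so that they sum to 1."""
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 40))  # stand-in for pre-activation generator outputs

sigmoid_sums = sigmoid(logits).sum(axis=1)
softmax_sums = softmax(logits).sum(axis=1)

print(f"sigmoid: {sigmoid_sums.mean():.2f} ± {sigmoid_sums.std():.2f}")
print(f"softmax: {softmax_sums.mean():.2f} ± {softmax_sums.std():.2f}")
```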

CLAassistant · 2026-04-10T20:05:15Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

cloudforge1 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

paddle-bot · 2026-04-10T20:05:16Z

Thanks for your contribution!

cloudforge1 added 3 commits March 24, 2026 00:49

paddle-bot bot added the contributor External developers label Apr 10, 2026

r-cloudforge mentioned this pull request Apr 10, 2026

CloudForge-Solutions — Hackathon 10th Spring Portfolio Tracker PaddlePaddle/community#1325

Open

luotao1 mentioned this pull request Apr 13, 2026

【Hackathon 10th】开源贡献个人挑战赛 · 春节特别季 PaddlePaddle/Paddle#77429

Open

luotao1 assigned luotao1 and leeleolay Apr 14, 2026

r-cloudforge wants to merge 3 commits into PaddlePaddle:develop from CloudForge-Solutions:task/012-alloygan-reproduction
