GroundingBooth: Grounding Text-to-Image Customization

TRANSACTIONS ON MACHINE LEARNING RESEARCH (TMLR)

👥 Authors

Zhexiao Xiong, Wei Xiong, Jing Shi, He Zhang, Yizhi Song, Nathan Jacobs

🛠️ Installation

# Clone the repository
git clone https://github.com/your-username/GroundingBooth.git
cd GroundingBooth

# Create and activate conda environment  
conda env create -f environment.yaml
conda activate groundingbooth

📥 Model Downloads

Required Pretrained Models

GroundingBooth Pretrained Model:
- Download from: SharePoint Link
- Place in: ./checkpoints/
DINOv2 Pretrained Model (ViT-g/14):
- Download from: https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_pretrain.pth
- Place in your model directory
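
The DINOv2 download can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `download_if_missing` and the `pretrained_models` directory are hypothetical, since this README does not fix where the DINOv2 weights should live. The GroundingBooth checkpoint is not covered because its link is not a direct file URL.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL copied from the list above.
DINOV2_URL = "https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_pretrain.pth"


def download_if_missing(url, dest_dir):
    """Download `url` into `dest_dir` unless a file with that name already exists.

    Hypothetical helper, not part of the GroundingBooth code base.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Use the last path component of the URL as the local file name.
    target = dest_dir / Path(urlparse(url).path).name
    if not target.exists():
        urlretrieve(url, str(target))
    return target


# Example call ("pretrained_models" is a placeholder directory name):
# download_if_missing(DINOV2_URL, "pretrained_models")
```

The call is left commented out because the checkpoint is a large file; run it only once and point the inference configuration at the resulting path.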

Directory Structure

GroundingBooth/
├── checkpoints/
│   └── checkpoint.pth    # GroundingBooth model
├── configs/                       # Configuration files
├── dataset/                       # Dataset implementations
├── ldm/                          # Latent Diffusion Model components
├── grounding_input/              # Grounding tokenizer inputs
└── dinov2/                       # DINOv2 model components
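
Before running inference, it can help to confirm that the layout above is in place. This is a small sketch, not a script shipped with the repository; `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, and the DINOv2 weights are left out because their location is not specified here.

```python
from pathlib import Path

# Paths taken from the directory tree above.
EXPECTED_ENTRIES = [
    "checkpoints/checkpoint.pth",
    "configs",
    "dataset",
    "ldm",
    "grounding_input",
    "dinov2",
]


def missing_entries(repo_root):
    """Return the expected paths that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```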

🎯 Quick Start

Basic Inference

Run the default inference pipeline:

bash infer.sh

This executes:

python inference_single.py \
    --batch_size 1 \
    --guidance_scale 3 \
    --folder OUTPUT_test \
    --dataset dreambench \
    --background \
    --ckpt_path checkpoints/checkpoint.pth
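
To compare several guidance scales, the command above can be assembled programmatically. The sketch below only builds and prints the command lines; `build_inference_command`, the scale values 5 and 7, and the `OUTPUT_scale_*` folder names are illustrative assumptions, while the flags themselves are the ones shown above.

```python
import shlex


def build_inference_command(guidance_scale, folder,
                            ckpt_path="checkpoints/checkpoint.pth",
                            dataset="dreambench", batch_size=1, background=True):
    """Build the argument list for inference_single.py (hypothetical helper)."""
    cmd = [
        "python", "inference_single.py",
        "--batch_size", str(batch_size),
        "--guidance_scale", str(guidance_scale),
        "--folder", folder,
        "--dataset", dataset,
    ]
    if background:
        cmd.append("--background")
    cmd += ["--ckpt_path", ckpt_path]
    return cmd


# Print one command per guidance scale; pass a list to subprocess.run to execute it.
for scale in (3, 5, 7):
    print(shlex.join(build_inference_command(scale, f"OUTPUT_scale_{scale}")))
```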

Customized Inference

To control where the subject is placed, pass an explicit bounding box with --position:

python infer_customized_all.py \
    --batch_size 1 \
    --guidance_scale 5 \
    --folder OUTPUT_custom \
    --dataset dreambench \
    --ckpt_path checkpoints/checkpoint.pth \
    --position 0.1 0.1 0.9 0.9
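
The values passed to `--position` appear to be normalized to the range [0, 1] in (x1, y1, x2, y2) order. Under that assumption, the sketch below shows how such a box maps to pixel coordinates; `box_to_pixels` is a hypothetical helper, and the 512×512 image size is chosen only for illustration.

```python
def box_to_pixels(position, width, height):
    """Convert a normalized (x1, y1, x2, y2) box to integer pixel coordinates.

    Assumes the top-left corner comes first and all values lie in [0, 1].
    """
    x1, y1, x2, y2 = position
    if not (0.0 <= x1 < x2 <= 1.0 and 0.0 <= y1 < y2 <= 1.0):
        raise ValueError(f"invalid normalized box: {position}")
    return (round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height))


# The box from the command above, on an assumed 512x512 image.
print(box_to_pixels((0.1, 0.1, 0.9, 0.9), 512, 512))  # (51, 51, 461, 461)
```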

Command Line Arguments

| Argument | Type | Default | Description |
|---|---|---|---|
| `--folder` | str | `generation_samples` | Output directory for generated images |
| `--guidance_scale` | float | `3.0` | Classifier-free guidance scale |
| `--dataset` | str | `dreambench` | Dataset type (`dreambench`) |
| `--ckpt_path` | str | Required | Path to model checkpoint |
| `--position` | float×4 | `(0.25,0.25,0.75,0.75)` | Bounding box coordinates (x1,y1,x2,y2) |
| `--negative_prompt` | str | Auto | Negative prompt for generation |
| `--background` | flag | False | Enable grounded generation of background objects |
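
For reference, the arguments above can be expressed with `argparse` roughly as follows. This is a reconstruction from the table, not the parser used by the inference scripts; in particular, `None` stands in for the "Auto" negative prompt, whose actual default text is not given here.

```python
import argparse


def build_parser():
    """Approximate the documented CLI; a sketch, not the repository's parser."""
    parser = argparse.ArgumentParser(description="GroundingBooth inference (sketch)")
    parser.add_argument("--folder", type=str, default="generation_samples",
                        help="Output directory for generated images")
    parser.add_argument("--guidance_scale", type=float, default=3.0,
                        help="Classifier-free guidance scale")
    parser.add_argument("--dataset", type=str, default="dreambench",
                        help="Dataset type")
    parser.add_argument("--ckpt_path", type=str, required=True,
                        help="Path to model checkpoint")
    parser.add_argument("--position", type=float, nargs=4,
                        default=(0.25, 0.25, 0.75, 0.75),
                        metavar=("X1", "Y1", "X2", "Y2"),
                        help="Bounding box coordinates")
    # Assumption: None means "let the script pick its automatic negative prompt".
    parser.add_argument("--negative_prompt", type=str, default=None,
                        help="Negative prompt for generation")
    parser.add_argument("--background", action="store_true",
                        help="Enable grounded generation of background objects")
    return parser


args = build_parser().parse_args(
    ["--ckpt_path", "checkpoints/checkpoint.pth",
     "--position", "0.1", "0.1", "0.9", "0.9"]
)
print(args.position, args.guidance_scale, args.background)
```

The `--batch_size` flag used in the commands above is omitted because the table does not document its default.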

🏋️ Training

Note: Training code will be open-sourced soon. Stay tuned for updates!

🔧 Configuration

Model Configurations

The configs/ directory contains:

- inference.yaml: Inference settings

🤝 Contributing

We welcome contributions! Please follow these steps:

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit changes (git commit -m 'Add amazing feature')
4. Push to branch (git push origin feature/amazing-feature)
5. Open a Pull Request

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

📚 Citation

If you find GroundingBooth useful in your research, please consider citing:

@article{xiong2024groundingbooth,
  title={Groundingbooth: Grounding text-to-image customization},
  author={Xiong, Zhexiao and Xiong, Wei and Shi, Jing and Zhang, He and Song, Yizhi and Jacobs, Nathan},
  journal={arXiv preprint arXiv:2409.08520},
  year={2024}
}

📞 Support

Email: x.zhexiao@wustl.edu

Note: Training code will be released soon. Follow this repository for updates!
