
mrj65/CAP-GEN-EXP


CAP-GEN-EXP

A pipeline that combines BLIP ViT (https://huggingface.co/Salesforce/blip-image-captioning-large), a fine-tuned version of SmolLM2 360M (https://huggingface.co/HuggingFaceTB/SmolLM2-360M), and Stable-diffusion-v1-4 (https://huggingface.co/CompVis/stable-diffusion-v1-4) to caption images, provide Grad-CAM overlays and self-attention visualizations, and generate new images from the extracted caption. The app forwards each generated caption to SmolLM2, which explains a word/object in the image in more detail, and to Stable Diffusion, which generates new images. XAI (Grad-CAM, self-attention) is included for the whole process. The Finet notebook provides a simple workflow for fine-tuning the models, along with an integrated wandb logger.
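The caption, explain, and generate flow described above can be sketched roughly as follows. This is a minimal sketch built on the public `transformers` and `diffusers` APIs, not the repository's `app.py`. The function names and the prompt template are hypothetical, and `LLM_ID` points at the base SmolLM2 checkpoint because this README does not name the fine-tuned weights.

```python
"""Sketch of the caption -> explain -> generate flow (assumed, not the app's code)."""

BLIP_ID = "Salesforce/blip-image-captioning-large"
LLM_ID = "HuggingFaceTB/SmolLM2-360M"  # assumption: base model, not the fine-tuned one
SD_ID = "CompVis/stable-diffusion-v1-4"


def build_explain_prompt(caption: str, word: str) -> str:
    """Hypothetical prompt template asking the LLM to explain one word of a caption."""
    return f'Caption: "{caption}"\nExplain "{word}" in more detail:'


def caption_image(image) -> str:
    """Caption a PIL image with BLIP."""
    # Heavy imports are kept local so the module loads without the libraries.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(BLIP_ID)
    model = BlipForConditionalGeneration.from_pretrained(BLIP_ID)
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def explain_word(caption: str, word: str) -> str:
    """Ask the language model to elaborate on one word from the caption."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(LLM_ID)
    model = AutoModelForCausalLM.from_pretrained(LLM_ID)
    inputs = tokenizer(build_explain_prompt(caption, word), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=60)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_ids = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_ids, skip_special_tokens=True)


def generate_image(caption: str):
    """Generate a new PIL image from the caption with Stable Diffusion."""
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(SD_ID)
    return pipe(caption).images[0]
```

Calling `caption_image(img)`, then passing the result to `explain_word` and `generate_image`, reproduces the three stages in order. Each call reloads its model for brevity; a real app would load the models once and reuse them.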

The app can be tested at the following link: https://huggingface.co/spaces/Fine-Tuning-DLSE-Smol2/dlasw-pipeline-deploy?logs=container. Alternatively, run it inside a Docker container locally:

docker run -it -p 7860:7860 --gpus all --platform=linux/amd64  registry.hf.space/fine-tuning-dlse-smol2-dlasw-pipeline-deploy:latest python app.py

USER INTERFACE

image image image

WORKFLOW BLOCK DIAGRAM

image
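The Grad-CAM overlay used in the XAI part of the workflow can be illustrated with a small NumPy sketch. It shows the standard Grad-CAM weighting (channel weights taken as spatially averaged gradients), not the repository's implementation. It assumes the activations and gradients are already arranged as `(channels, height, width)` (for a ViT encoder the patch tokens would first have to be reshaped into a grid) and that the heatmap has already been resized to the image resolution. The function names are hypothetical.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap in [0, 1].

    activations, gradients: float arrays of shape (C, H, W) taken from one layer.
    """
    # Channel importance: gradient averaged over the spatial dimensions.
    weights = gradients.mean(axis=(1, 2))              # shape (C,)
    # Weighted sum of the activation maps over channels.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # Keep only regions with a positive influence on the target score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def overlay(image: np.ndarray, cam: np.ndarray, alpha: float = 0.4) -> np.ndarray:
    """Blend a heatmap into the red channel of an image.

    image: float array (H, W, 3) in [0, 1]; cam: float array (H, W) in [0, 1].
    """
    heat = np.zeros_like(image)
    heat[..., 0] = cam
    return (1 - alpha) * image + alpha * heat
```

A production overlay would typically use a colormap and interpolation for the heatmap. The single-channel blend here keeps the example dependency-free beyond NumPy.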

FINE-TUNE

image image

ADDITIONAL EXAMPLES

image image
