diff --git a/README.md b/README.md
index b988f9b3..66f08a00 100755
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@ English | [简体中文](./README_zh-CN.md) | [日本語](./README_ja-JP.md)
 - [License](#license)
 
 # Introduction
-VideoX-Fun is a video generation pipeline that can be used to generate AI images and videos, as well as to train baseline and Lora models for Diffusion Transformer. We support direct prediction from pre-trained baseline models to generate videos with different resolutions, durations, and FPS. Additionally, we also support users in training their own baseline and Lora models to perform specific style transformations.
+VideoX-Fun is a video generation pipeline that can be used to generate AI images and videos, as well as to train baseline and LoRA models for Diffusion Transformer. We support direct prediction from pre-trained baseline models to generate videos with different resolutions, durations, and FPS. Additionally, we also support users in training their own baseline and LoRA models to perform specific style transformations.
 
 We will support quick pull-ups from different platforms, refer to [Quick Start](#quick-start).
 
@@ -30,7 +30,7 @@ What's New:
 - Update Wan2.1-Fun-V1.1: Support for 14B and 1.3B model Control + Reference Image models, support for camera control, and the Inpaint model has been retrained for improved performance. [2025.04.25]
 - Update Wan2.1-Fun-V1.0: Support I2V and Control models for 14B and 1.3B models, with support for start and end frame prediction. [2025.03.26]
 - Update CogVideoX-Fun-V1.5: Upload I2V model and related training/prediction code. [2024.12.16]
-- Reward Lora Support: Train Lora using reward backpropagation techniques to optimize generated videos, making them better aligned with human preferences. [More Information](scripts/README_TRAIN_REWARD.md). New version of the control model supports various control conditions such as Canny, Depth, Pose, MLSD, etc. [2024.11.21]
+- Reward LoRA Support: Train LoRA using reward backpropagation techniques to optimize generated videos, making them better aligned with human preferences. [More Information](scripts/README_TRAIN_REWARD.md). New version of the control model supports various control conditions such as Canny, Depth, Pose, MLSD, etc. [2024.11.21]
 - Diffusers Support: CogVideoX-Fun Control is now supported in diffusers. Thanks to [a-r-r-o-w](https://github.com/a-r-r-o-w) for contributing support in this [PR](https://github.com/huggingface/diffusers/pull/9671). Check out the [documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogvideox) for more details. [2024.10.16]
 - Update CogVideoX-Fun-V1.1: Retrain i2v model, add Noise to increase the motion amplitude of the video. Upload control model training code and Control model. [2024.09.29]
 - Update CogVideoX-Fun-V1.0: Initial code release! Now supports Windows and Linux. Supports video generation at arbitrary resolutions from 256x256x49 to 1024x1024x49 for 2B and 5B models. [2024.09.18]
@@ -78,7 +78,7 @@ cd VideoX-Fun
 mkdir models/Diffusion_Transformer
 mkdir models/Personalized_Model
 
-# Please use the hugginface link or modelscope link to download the model.
+# Please use the Hugging Face link or ModelScope link to download the model.
 # CogVideoX-Fun
 # https://huggingface.co/alibaba-pai/CogVideoX-Fun-V1.1-5b-InP
 # https://modelscope.cn/models/PAI/CogVideoX-Fun-V1.1-5b-InP
@@ -111,7 +111,7 @@ The detailed of Linux:
 We need about 60GB available on disk (for saving weights), please check!
 
 #### b. Weights
-We'd better place the [weights](#model-zoo) along the specified path:
+We recommend placing the [weights](#model-zoo) along the specified path:
 
 **Via ComfyUI**:
 Put the models into the ComfyUI weights folder `ComfyUI/models/Fun_Models/`:
@@ -428,7 +428,7 @@ For details, refer to [ComfyUI README](comfyui/README.md).
 - Modify `control_video`, `validation_image_end`, `prompt`, `neg_prompt`, `guidance_scale`, and `seed` in the file `examples/cogvideox_fun/predict_v2v_control.py`.
 - `control_video` is the control video extracted using operators such as Canny, Pose, or Depth. You can use the following demo video: [Demo Video](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/cogvideox_fun/asset/v1.1/pose.mp4).
 - Run the file `examples/cogvideox_fun/predict_v2v_control.py` and wait for the results. The generated videos will be saved in the folder `samples/cogvideox-fun-videos_v2v_control`.
-- **Step 3**: If you want to integrate other backbones or Loras trained by yourself, modify `lora_path` and relevant paths in `examples/{model_name}/predict_t2v.py` or `examples/{model_name}/predict_i2v.py` as needed.
+- **Step 3**: If you want to integrate other backbones or LoRAs trained by yourself, modify `lora_path` and relevant paths in `examples/{model_name}/predict_t2v.py` or `examples/{model_name}/predict_i2v.py` as needed.
 
 ##### ii. Multi-GPU Inference:
 When using multi-GPU inference, please make sure to install the xfuser. We recommend installing xfuser==0.4.2 and yunchang==0.6.2.
@@ -468,7 +468,7 @@ A complete model training pipeline should include data preprocessing and Video D