
Collaborative Score Distillation for Consistent Visual Synthesis (NeurIPS 2023)


subin-kim-cv/CSD


CSD

Official implementation of the paper Collaborative Score Distillation for Consistent Visual Editing (NeurIPS 2023).

Subin Kim*1, Kyungmin Lee*1, June Suk Choi1, Jongheon Jeong1, Kihyuk Sohn2, Jinwoo Shin1.
1KAIST, 2Google Research
paper | project page | arXiv

TL;DR: Consistent zero-shot visual synthesis across diverse and complex visual modalities.

Requirements

Environments

Install the required packages as follows:

conda create -n csd python=3.8
conda activate csd
pip install torch==2.0.1 torchvision==0.15.2
pip install diffusers==0.20.0 transformers accelerate mediapy
# for consistency decoder
pip install git+https://github.com/openai/consistencydecoder.git
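
After installation, a quick check can confirm that the packages above are present at the expected versions. The following is a minimal sketch and not part of the repository: the helper names `installed_versions` and `report` are hypothetical, and the script only reads package metadata through the Python standard library.

```python
from importlib import metadata

# Packages named in the install commands above; the pins mirror those commands.
EXPECTED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "diffusers": "0.20.0",
    "transformers": None,  # no version pinned in the README
    "accelerate": None,
    "mediapy": None,
}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(expected=EXPECTED):
    """Return one human-readable status line per package."""
    lines = []
    for name, version in installed_versions(expected).items():
        pin = expected[name]
        if version is None:
            status = "MISSING"
        elif pin is not None and not version.startswith(pin):
            status = f"{version} (README pins {pin})"
        else:
            status = version
        lines.append(f"{name}: {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

The prefix comparison is deliberate, so that a local build tag appended to the version string (for example a CUDA suffix on the torch wheel) still counts as matching the pin.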

Image Editing

Run the following script with a single GPU.

python csdedit_image.py --device=0 --svgd --fp16 --stride=16 \
--save_path='output/' --data_path='data/river.jpg' \
--batch=4 --tgt_prompt='turn into van gogh style painting' \
--guidance_scale=7.5 --image_guidance_scale=5

python csdedit_image.py --device=0 --svgd --fp16 --stride=16 \
--save_path='output/' --data_path='data/sheeps.jpg' \
--batch=4 --tgt_prompt='turn the sheeps into wolves' \
--guidance_scale=7.5 --image_guidance_scale=5 
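
To queue several edits in one go, the command line can be assembled programmatically. The sketch below is a hypothetical convenience wrapper (`build_edit_command` and `run_jobs` are not part of the repository); it uses only the script name and flags shown in the commands above and assumes nothing else about the script's interface.

```python
import shlex
import subprocess
import sys


def build_edit_command(data_path, tgt_prompt, save_path="output/", device=0,
                       batch=4, stride=16, guidance_scale=7.5,
                       image_guidance_scale=5):
    """Build the argument list for csdedit_image.py from the flags shown above."""
    return [
        sys.executable, "csdedit_image.py",
        f"--device={device}", "--svgd", "--fp16", f"--stride={stride}",
        f"--save_path={save_path}", f"--data_path={data_path}",
        f"--batch={batch}", f"--tgt_prompt={tgt_prompt}",
        f"--guidance_scale={guidance_scale}",
        f"--image_guidance_scale={image_guidance_scale}",
    ]


# The two edits from the examples above.
JOBS = [
    ("data/river.jpg", "turn into van gogh style painting"),
    ("data/sheeps.jpg", "turn the sheeps into wolves"),
]


def run_jobs(jobs, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for data_path, prompt in jobs:
        cmd = build_edit_command(data_path, prompt)
        print(shlex.join(cmd))  # the equivalent shell command
        if not dry_run:
            # Must be launched from the repository root, where the script lives.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_jobs(JOBS, dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes. Call `run_jobs(JOBS, dry_run=False)` to actually launch the edits.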

To edit high-resolution images, the encoding and decoding are performed patch-wise. To enable this, add the '--stride_vae' flag:

python csdedit_image.py --device=0 --svgd --fp16 --stride=16 \
--save_path='output/' --data_path='data/michelangelo.jpeg' \
--batch=8 --tgt_prompt='Re-imagine people are in galaxy' \
--guidance_scale=15 --image_guidance_scale=5 --stride_vae --lr=4.0
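
Patch-wise processing can be pictured as sliding a fixed-size window over the image with a given stride. The sketch below is a generic illustration and not the repository's implementation: the function names and the patch size of 64 are assumptions, and only the stride of 16 is taken from the commands above. It computes the top-left corners of the windows and counts how many windows overlap each position.

```python
def patch_origins(length, patch, stride):
    """Start offsets of windows of size `patch` that cover [0, length).

    Windows advance by `stride`; a final window is aligned to the end
    so that no positions are left uncovered.
    """
    if patch >= length:
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def patch_grid(height, width, patch=64, stride=16):
    """All (top, left) corners of patches tiling a height x width array."""
    return [(top, left)
            for top in patch_origins(height, patch, stride)
            for left in patch_origins(width, patch, stride)]


def coverage(height, width, patch=64, stride=16):
    """Count how many patches overlap each position."""
    counts = [[0] * width for _ in range(height)]
    for top, left in patch_grid(height, width, patch, stride):
        for y in range(top, min(top + patch, height)):
            for x in range(left, min(left + patch, width)):
                counts[y][x] += 1
    return counts


if __name__ == "__main__":
    grid = patch_grid(96, 120)
    print(len(grid), "patches; first:", grid[0], "last:", grid[-1])
```

In a scheme like this, each patch would be processed independently and the overlapping outputs averaged using the per-position counts, which is one common way to avoid visible seams between patches.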

Compositional Image Editing

To edit the image with region-wise prompts while ensuring smooth transitions between patches with different instructions, do the following:

python csdedit_image_region.py --device 0 --svgd --fp16 \
--save_path 'output/' --data_path 'data/vienna.jpg' \
--tgt_prompt 'turn into sunny weather' 'turn into cloudy weather' 'turn into rainy weather' 'turn into snowy weather' \
--stride 16 --batch 4 --guidance_scale 15 --image_guidance_scale 5

Video Editing

python csdedit_video.py --device 0 --svgd --fp16 \
--save_path 'output/break/' --data_path 'data/break' \
--tgt_prompt="Change the color of his T-shirt to yellow" \
--guidance_scale=7.5 --image_guidance_scale=1.5 --lr=0.5 \
--rows 2 --cols 12 --num_steps 100

3D Scene Editing

To obtain 3D editing results, follow the Instruct-NeRF2NeRF codebase with a few lines of adaptation; in particular, replace this line with its CSD-Edit counterpart.

Citation

@inproceedings{
    kim2023collaborative,
    title={Collaborative score distillation for consistent visual editing},
    author={Kim, Subin and Lee, Kyungmin and Choi, June Suk and Jeong, Jongheon and Sohn, Kihyuk and Shin, Jinwoo},
    booktitle={Advances in Neural Information Processing Systems},
    year={2023},
}
