partial models spec in model_index.json #10413
This sounds like a power-user feature and IMO is already possible. Users can load a pipeline partially from a pre-trained ckpt and use different components:

```python
from diffusers import StableDiffusionXLPipeline, AutoencoderKL
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16),
    torch_dtype=torch.float16,
).to("cuda")
```

This way, all components from "stabilityai/stable-diffusion-xl-base-1.0" will be downloaded except for the VAE.
It's possible if the user knows exactly where to load components from - that is what I'm talking about: standardize a way for model publishers so each component has a well-known, defined "default" location, and state explicitly which components are present in the DDUF vs. which are not (but would still be required). Even if the proposed change is a no-op as far as the diffusers implementation goes (although it could be handled automatically in …).
DDUFs are packaged with all the pipeline components, not a partial set. This is what will ship in v0 and will be refined in subsequent iterations. Cc: @SunMarc
We will be shipping a lot of "auto" classes at the model level, and I believe that should make it quite a bit easier to implement what you're envisioning. Cc: @yiyixuxu
Maybe it makes sense to add some background information for this feature request. There was a discussion a while back on Discord about a proposed new model format that could solve a few pain points we currently have.

Let's take Sana as an example of a modern diffusion model. Its DiT has 1.6B parameters, while its text encoder is roughly twice as large. Models are commonly fine-tuned while leaving the text encoder (and VAE) frozen, but to distribute that fine-tuned model, we still need to distribute the text encoder and VAE, even if their weights haven't changed from the base model. This proposal would allow the distribution of a partial model file that just points to already-existing models. Without this information, any user of that model would need to guess where they can get the missing parts.

The plan would be to start switching to DDUF as a new standardized file format for most tools (model output of fine-tuning tools, and model input for inference tools). Right now there is absolutely no standard, and everyone just does their own thing. With this addition, we would have most of the issues solved and could start standardizing the ecosystem a bit more.
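To make the redistribution overhead concrete, here is a rough back-of-the-envelope calculation. The DiT size comes from the comment above; the "twice as large" text encoder follows from it; the VAE size and the fp16 (2 bytes/param) assumption are illustrative, not measured:

```python
# Rough estimate of how much of a fine-tune's download is redundant
# when the text encoder and VAE are frozen. The DiT count is from the
# discussion above; the VAE size is illustrative; bytes assume fp16.
dit_params = 1.6e9           # the fine-tuned DiT
text_encoder_params = 3.2e9  # "roughly twice as large", frozen
vae_params = 0.3e9           # illustrative VAE size, frozen

total = dit_params + text_encoder_params + vae_params
frozen = text_encoder_params + vae_params
redundant_fraction = frozen / total

print(f"total fp16 size: {total * 2 / 1e9:.1f} GB")
print(f"redundant per fine-tune: {frozen * 2 / 1e9:.1f} GB ({redundant_fraction:.0%})")
```

Under these assumptions, roughly two-thirds of every fine-tune's download would be byte-identical to the base model.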
What exactly is your plan for this new file format? Is it only supposed to be used inside the diffusers library, or do you think it could be used outside of it? Those classes would only be useful within diffusers, but some of the more popular tools don't use this library and would still benefit from a standardized model format (similar to how the HF Hub can be used outside of the transformers/diffusers libraries).

As for the specific format proposal, I think this needs a bit more thought. The proposed format would only allow models to be downloaded automatically from the HF Hub. But what happens if the user already has a version of that model downloaded manually? In that case it would be useful to have some kind of type identifier in there as well. Also, can this only point to a subfolder in an HF repo, or would it be useful if it could point to a different DDUF file that's hosted somewhere else?
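One way the resolution logic raised above could look. To be clear, the `type`/`source`/`subfolder` schema here is purely hypothetical, sketched for this discussion; no such fields exist in DDUF or diffusers today:

```python
# Hypothetical sketch: resolve a partial-model component reference to
# either a local copy or a hub download plan. The schema is illustrative
# only, not part of any existing spec.
from pathlib import Path

def resolve_component(entry: dict, local_cache: dict) -> tuple:
    """Return ("local", path) if the referenced model is already on disk,
    otherwise ("hub", "<repo>[/<subfolder>]") as a download plan."""
    ref_type = entry.get("type", "hub_repo")
    if ref_type == "local_path":
        return ("local", entry["path"])
    if ref_type == "hub_repo":
        repo, sub = entry["source"], entry.get("subfolder", "")
        # Prefer a manually downloaded copy if the user already has one.
        if repo in local_cache:
            return ("local", str(Path(local_cache[repo]) / sub))
        return ("hub", f"{repo}/{sub}" if sub else repo)
    raise ValueError(f"unknown reference type: {ref_type!r}")

# Example: an SDXL fine-tune that ships only the UNet and points the
# frozen VAE back at the base repo.
index = {
    "unet": {"type": "local_path", "path": "./unet"},
    "vae": {"type": "hub_repo",
            "source": "stabilityai/stable-diffusion-xl-base-1.0",
            "subfolder": "vae"},
}
cache = {"stabilityai/stable-diffusion-xl-base-1.0": "/models/sdxl-base"}
plans = {name: resolve_component(e, cache) for name, e in index.items()}
```

The explicit `type` field is what lets a tool answer the "already downloaded manually" question: a local-path reference and a hub reference are distinguishable without heuristics.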
Sorry to hear that, as it makes DDUF almost dead-on-arrival: new models are just huge, with massive frozen components like an LLM that do NOT need to be redistributed in each and every fine-tune of the model.
@sayakpaul - To be very clear, this isn’t a “power user” feature request. This is an attempt to help turn the DDUF format into “the standard” that has been actively discussed/debated in the OMI discord. Nerogar and the Onetrainer team have done some of the deepest thinking about this, but the aim is for the standard to serve:
Partial model distribution and composability are problems faced often during both training and deployment.
I mentioned "power-user" because:
The original issue was about … Thanks for providing the DDUF-related comments. As mentioned in the docs as well, the v0 will be extremely minimal and we will improve it as we go. I will let @julien-c @LysandreJik @Wauplin shed more light if I am missing something.
As-is, DDUF does have benefits, but it would be great if it could cover one of the most common use-cases: partial models. E.g., why not have a model config that can point each model component not just to a subfolder, but to another repo as well? So, for example, an SDXL model could be UNet-only, with links to the VAE, TE1, and TE2 in a reference repo if they are unchanged. For example:
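The original example was not captured in this thread; a hypothetical `model_index.json` along the lines described might look like the sketch below (the `source`/`subfolder` object form is illustrative only; the real format lists components as `["library", "ClassName"]` pairs):

```json
{
  "_class_name": "StableDiffusionXLPipeline",
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae": {
    "source": "stabilityai/stable-diffusion-xl-base-1.0",
    "subfolder": "vae"
  },
  "text_encoder": {
    "source": "stabilityai/stable-diffusion-xl-base-1.0",
    "subfolder": "text_encoder"
  },
  "text_encoder_2": {
    "source": "stabilityai/stable-diffusion-xl-base-1.0",
    "subfolder": "text_encoder_2"
  }
}
```

Here only the UNet weights would ship with the repo; the frozen components resolve to well-known locations in the base repo.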
Originally posted by @vladmandic in #10037 (comment)