[Bug] Encoder in diffusers.models.autoencoders.vae's forward method return type mismatch leads to AttributeError #10382

Open
mq-yuan opened this issue Dec 25, 2024 · 2 comments
Labels
bug Something isn't working

mq-yuan commented Dec 25, 2024

Describe the bug

Issue Description:
When an Encoder from the diffusers.models.autoencoders.vae module is built with DownBlock2D blocks, calling its forward method raises an AttributeError. A down block inside forward returns a tuple, while the code that consumes its output expects a tensor.

Reproduction

Please use the following code to reproduce the issue:

from diffusers.models.autoencoders.vae import Encoder
import torch

encoder = Encoder(
    down_block_types=["DownBlock2D", "DownBlock2D"],
    block_out_channels=[64, 64],
)

encoder(torch.randn(1, 3, 256, 256)).shape

Expected Behavior:
The Encoder's forward method in diffusers.models.autoencoders.vae should return a tensor for further processing.

Actual Behavior:
Running the above code results in the following error:

AttributeError: 'tuple' object has no attribute 'dim'

Additional Information:

  • Error log:
Traceback (most recent call last):
  File "main.py", line 9, in <module>
    encoder(torch.randn(1, 3, 256, 256)).shape
    ...
  File "python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 172, in forward
    sample = down_block(sample)
    ...
  File "python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 172, in forward
    hidden_states = resnet(hidden_states, temb)
    ...
  File "python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 172, in forward
    hidden_states = self.norm1(hidden_states)
  File "python3.11/site-packages/torch/nn/modules/normalization.py", line 313, in forward
    return F.group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
  File "python3.11/site-packages/torch/nn/functional.py", line 2947, in group_norm
    if input.dim() < 2:
AttributeError: 'tuple' object has no attribute 'dim'
  • Relevant code snippet:

    • In diffusers/models/autoencoders/vae.py, lines 171-173:
    for down_block in self.down_blocks:
        sample = down_block(sample)
    • DownBlock2D's forward method declaration:
    def forward(
        self, hidden_states: torch.Tensor, temb: Optional[torch.Tensor] = None, *args, **kwargs
    ) -> Tuple[torch.Tensor, Tuple[torch.Tensor, ...]]:

Logs

No response

System Info

  • 🤗 Diffusers version: 0.31.0
  • Platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
  • Running on Google Colab?: No
  • Python version: 3.11.11
  • PyTorch version (GPU?): 2.5.1 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.26.5
  • Transformers version: 4.47.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.14.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 3090, 24576 MiB
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

@DN6 @sayakpaul

mq-yuan added the bug Something isn't working label Dec 25, 2024

Maharnab-Saikia commented Jan 9, 2025

The issue arises because the down blocks in diffusers.models.unets.unet_2d_blocks do not share one return type: some return a single tensor (hidden_states), while others return tuples such as (hidden_states, output_states) or (hidden_states, output_states, skip_sample). The loop in the Encoder implementation that runs the down_blocks, however, assumes a single tensor output:

for down_block in self.down_blocks:
    sample = down_block(sample)

This behavior causes errors when the down_block returns a tuple instead of a single tensor. Specifically, the tuple is passed to the next layer, which expects only a tensor, leading to runtime errors.
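
The failure mode described above can be reproduced without diffusers. What follows is a minimal sketch with toy stand-ins, not the library's code: plain floats play the role of tensors, and `tensor_block`, `tuple_block`, `unwrap_hidden_states`, and `run_blocks` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Sequence


def tensor_block(x: float) -> float:
    """Toy block that returns only hidden_states (a float stands in for a tensor)."""
    return x * 0.5


def tuple_block(x: float):
    """Toy block that returns (hidden_states, output_states)."""
    hidden = x * 0.5
    return hidden, (hidden,)


def unwrap_hidden_states(result):
    """Hypothetical helper: keep the first element when a block returns a tuple."""
    return result[0] if isinstance(result, tuple) else result


def run_blocks(blocks: Sequence[Callable], sample: float, unwrap: bool = False):
    """Chain blocks so that each output becomes the next block's input."""
    for block in blocks:
        out = block(sample)
        sample = unwrap_hidden_states(out) if unwrap else out
    return sample


# Blocks that return a bare value chain without trouble.
ok = run_blocks([tensor_block, tensor_block], 8.0)

# Tuple-returning blocks chain only when the output is unwrapped first.
fixed = run_blocks([tuple_block, tuple_block], 8.0, unwrap=True)
```

Without `unwrap=True`, the second `tuple_block` receives a tuple and fails with a TypeError, which is analogous to the norm layer receiving a tuple in the traceback. Note that unwrapping discards the extra outputs; whether that is appropriate for Encoder is not established in this thread.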

Collaborator

hlky commented Jan 14, 2025

Encoder uses DownEncoderBlock2D, not DownBlock2D.

from diffusers.models.autoencoders.vae import Encoder
import torch

encoder = Encoder(
    down_block_types=["DownEncoderBlock2D", "DownEncoderBlock2D"],
    block_out_channels=[64, 64],
)

encoder(torch.randn(1, 3, 256, 256)).shape
# torch.Size([1, 6, 128, 128])
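
A caller who wants a clearer error than the deep AttributeError could check the block types before building the Encoder. This is only a sketch under stated assumptions: `check_encoder_block_types` and `ALLOWED_ENCODER_BLOCKS` are hypothetical names, and the allowed set contains only the block type named in the comment above rather than a list taken from the library.

```python
from typing import Iterable, Sequence

# Assumption: only the block type named in the comment above is accepted here.
# This tuple is illustrative, not a list taken from the library.
ALLOWED_ENCODER_BLOCKS = ("DownEncoderBlock2D",)


def check_encoder_block_types(
    down_block_types: Iterable[str],
    allowed: Sequence[str] = ALLOWED_ENCODER_BLOCKS,
) -> None:
    """Raise ValueError if any requested block type is outside the allowed set."""
    unsupported = [name for name in down_block_types if name not in allowed]
    if unsupported:
        raise ValueError(
            f"Unsupported encoder block types {unsupported}; "
            f"expected one of {list(allowed)}"
        )


# Passes silently for the configuration from the comment above.
check_encoder_block_types(["DownEncoderBlock2D", "DownEncoderBlock2D"])
```

Called with `["DownBlock2D", "DownBlock2D"]`, the helper raises a ValueError that names the offending block types before any forward pass runs.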
