fix bugs
wenhuach21 committed Jun 3, 2024
1 parent 063009b commit dd40f17
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion auto_round/auto_quantizer.py
@@ -311,7 +311,7 @@ def convert_model(self, model: nn.Module):
         return model
 
     def _dynamic_import_inference_linear(self, bits, backend):
-        if bits == 4 and self.exllama2_available and "exllama2" in backend:
+        if bits == 4 and self.exllama2_available and "exllamav2" in backend:
             from auto_round_extension.cuda.qliner_exllamav2 import QuantLinear
         else:
             from auto_round_extension.cuda.qliner_triton import QuantLinear
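
Note on the change: the old condition tested for the substring "exllama2", which does not occur in a backend name spelled "exllamav2", so the ExLlamaV2 branch could presumably never be taken and the Triton kernel was always imported. The following is a minimal sketch of that behavior, not the library's code: `pick_kernel` and its `needle` parameter are hypothetical helpers, and the backend value "gptq:exllamav2" is an assumed example string.

```python
def pick_kernel(bits: int, backend: str, exllamav2_available: bool, needle: str) -> str:
    """Mimic the branch in the diff; `needle` is the substring being tested.

    Hypothetical helper for illustration only; returns a label instead of
    importing a QuantLinear class.
    """
    if bits == 4 and exllamav2_available and needle in backend:
        return "exllamav2"
    return "triton"


backend = "gptq:exllamav2"  # assumed example backend string

# Old check: "exllama2" is not a substring of "exllamav2" (the "v" breaks the match).
old_choice = pick_kernel(4, backend, True, "exllama2")

# New check: the substring matches, so the ExLlamaV2 branch is selected.
new_choice = pick_kernel(4, backend, True, "exllamav2")

print(old_choice)  # triton
print(new_choice)  # exllamav2
```

With any bit width other than 4, or with ExLlamaV2 unavailable, both versions fall through to the Triton branch, so the fix only affects the 4-bit ExLlamaV2 case.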