Disable dot/mul optimizations when there is int4 weights #3645

pfultz2 · 2024-11-20T17:23:11Z

No description provided.

codecov · 2024-11-20T22:42:45Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.18%. Comparing base (f11472c) to head (da1eadd).
Report is 12 commits behind head on develop.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #3645   +/-   ##
========================================
  Coverage    92.18%   92.18%           
========================================
  Files          513      513           
  Lines        21576    21596   +20     
========================================
+ Hits         19889    19908   +19     
- Misses        1687     1688    +1


lakhinderwalia · 2024-11-21T00:30:03Z

src/simplify_algebra.cpp

+            if(contains({"reshape", "dequantizelinear"}, alias->name()))
+                return self(alias->inputs().front());
+            if(alias->name() == "concat")
+                return all_of(alias->inputs(), self);


Shouldn't this be any_of? Also, the name of this function/lambda from_int4 should say any or all, just to make its intent clear. Thanks.

pfultz2 · 2024-11-22

> Shouldn't this be any_of?

No, if it's mixed then we can't get int4 out of it.
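
To make the point in this exchange concrete, here is a minimal, self-contained sketch of why the concat case uses all_of rather than any_of. It does not use the MIGraphX API: the node struct, the make_node helper, and the unpack_int4 base case are assumptions made for illustration. Only the reshape, dequantizelinear, and concat cases come from the quoted diff.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a graph instruction; not the MIGraphX type.
struct node
{
    std::string name;
    std::vector<std::shared_ptr<node>> inputs;
};
using node_ptr = std::shared_ptr<node>;

// Hypothetical helper to build small test graphs.
node_ptr make_node(std::string name, std::vector<node_ptr> inputs = {})
{
    return std::make_shared<node>(node{std::move(name), std::move(inputs)});
}

// True only if the value traces back entirely to int4 data.
// The "unpack_int4" base case is an assumption; the diff shows only
// the recursive cases.
bool from_int4(const node_ptr& n)
{
    assert(n != nullptr);
    if(n->name == "unpack_int4")
        return true;
    // Pass-through ops: follow the first (data) input.
    if(n->name == "reshape" || n->name == "dequantizelinear")
        return !n->inputs.empty() && from_int4(n->inputs.front());
    // A concat counts as int4 only if every input does; a mixed concat
    // cannot be treated as int4 weights as a whole.
    if(n->name == "concat")
        return !n->inputs.empty() &&
               std::all_of(n->inputs.begin(), n->inputs.end(), from_int4);
    return false;
}
```

In this sketch, a concat whose inputs all trace back to unpack_int4 is reported as int4, while a concat that mixes one int4-derived input with any other input is not, which is the behavior described in the reply above.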

migraphx-bot · 2024-11-22T20:27:30Z

Test	Batch	Rate new (da1ead)	Rate old (0ad807)	Diff	Compare
torchvision-resnet50	64	3,257.59	3,239.44	0.56%	✅
torchvision-resnet50_fp16	64	6,983.44	6,989.98	-0.09%	✅
torchvision-densenet121	32	2,434.09	2,435.40	-0.05%	✅
torchvision-densenet121_fp16	32	4,061.07	4,083.28	-0.54%	✅
torchvision-inceptionv3	32	1,627.04	1,630.36	-0.20%	✅
torchvision-inceptionv3_fp16	32	2,747.44	2,743.98	0.13%	✅
cadene-inceptionv4	16	765.05	765.72	-0.09%	✅
cadene-resnext64x4	16	810.37	811.01	-0.08%	✅
slim-mobilenet	64	7,460.35	7,471.20	-0.15%	✅
slim-nasnetalarge	64	208.48	208.50	-0.01%	✅
slim-resnet50v2	64	3,442.10	3,438.88	0.09%	✅
bert-mrpc-onnx	8	1,150.93	1,147.12	0.33%	✅
bert-mrpc-tf	1	493.27	469.98	4.96%	🔆
pytorch-examples-wlang-gru	1	423.16	416.34	1.64%	✅
pytorch-examples-wlang-lstm	1	380.10	463.26	-17.95%	🔴
torchvision-resnet50_1	1	760.87	770.23	-1.21%	✅
cadene-dpn92_1	1	396.47	399.89	-0.85%	✅
cadene-resnext101_1	1	383.09	382.62	0.12%	✅
onnx-taau-downsample	1	345.98	345.09	0.26%	✅
dlrm-criteoterabyte	1	33.33	33.34	-0.04%	✅
dlrm-criteoterabyte_fp16	1	52.78	52.72	0.12%	✅
agentmodel	1	8,219.49	10,261.49	-19.90%	🔴
unet_fp16	2	58.91	58.87	0.07%	✅
resnet50v1_fp16	1	933.46	982.51	-4.99%	🔴
resnet50v1_int8	1	1,003.14	1,007.17	-0.40%	✅
bert_base_cased_fp16	64	1,168.84	1,169.34	-0.04%	✅
bert_large_uncased_fp16	32	363.18	363.46	-0.08%	✅
bert_large_fp16	1	198.68	198.15	0.27%	✅
distilgpt2_fp16	16	2,202.56	2,201.47	0.05%	✅
yolov5s	1	526.76	536.48	-1.81%	✅
tinyllama	1	43.43	43.38	0.12%	✅
vicuna-fastchat	1	176.92	172.97	2.28%	✅
whisper-tiny-encoder	1	418.34	418.04	0.07%	✅
whisper-tiny-decoder	1	427.78	423.83	0.93%	✅

This build is not recommended to merge 🔴

migraphx-bot · 2024-11-22T20:27:33Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

pfultz2 added 4 commits November 20, 2024 08:17

Skip matrices with int4 weights for dot/mul optimizations

0acc927

Format

ed2492c

Use recursion

5c7f825

Format

f173033

pfultz2 requested a review from causten as a code owner November 20, 2024 17:23

pfultz2 requested review from CharlieL7, shivadbhavsar and turneram and removed request for causten November 20, 2024 17:26

causten requested a review from lakhinderwalia November 20, 2024 18:01

This was referenced Nov 20, 2024

Improve fusions with dequantizelinear #3551

Closed

[INT4] Compress model by quantizing weights to int4 #3307

Closed

pfultz2 self-assigned this Nov 20, 2024

shivadbhavsar approved these changes Nov 20, 2024


lakhinderwalia reviewed Nov 21, 2024


CharlieL7 approved these changes Nov 21, 2024


pfultz2 added 2 commits November 22, 2024 11:24

Fix cppcheck

9f01418

Format

da1eadd

causten merged commit d688810 into develop Nov 22, 2024
43 of 45 checks passed

causten deleted the skip-in4-weights-dot-mul branch November 22, 2024 21:20

shivadbhavsar pushed a commit that referenced this pull request Dec 18, 2024

Disable dot/mul optimizations when there is int4 weights (#3645)

fac0cb2
