Skip to content

Commit

Permalink
Fix memory_coloring pass when MIGRAPHX_NSTREAMS > 2 (#3757)
Browse files Browse the repository at this point in the history
I noticed allocation segments reaching close to uint64 max value, which clearly would throw an out of memory error when trying to allocate on the GPU. This happened when MIGRAPHX_NSTREAMS was set to 2 or greater and the model somehow was large enough to trigger it.

Changing from auto to size_t seems to fix the issue.

After some further investigation, the segment start value is the culprit which becomes int32. I've updated n as size_t for clarity.
  • Loading branch information
kahmed10 authored Jan 16, 2025
1 parent 9b6b58f commit 889fabc
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 4 deletions.
6 changes: 3 additions & 3 deletions src/memory_coloring.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2024 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2025 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
Expand Down Expand Up @@ -177,9 +177,9 @@ struct allocation_segment
{
assert(ins->get_shape().bytes() > 0);
// Compute alignment
auto n = 1 + (ins->get_shape().bytes() - 1) / alignment;
std::size_t n = 1 + (ins->get_shape().bytes() - 1) / alignment;
assert(n > 0);
auto start = 0;
std::size_t start = 0;
// Insert at end if it cant fit at the begining
if(segments.empty() or segments.begin()->first <= n)
{
Expand Down
16 changes: 15 additions & 1 deletion test/memory_coloring_test.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/*
* The MIT License (MIT)
*
* Copyright (c) 2015-2023 Advanced Micro Devices, Inc. All rights reserved.
* Copyright (c) 2015-2025 Advanced Micro Devices, Inc. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
Expand Down Expand Up @@ -3805,4 +3805,18 @@ TEST_CASE(test_tuple)
CHECK(is_disjoint({a1, a2}));
}

TEST_CASE(test_large_offsets)
{
migraphx::module m;

auto a1 = add_alloc(m, {migraphx::shape::float_type, {10000000000}});
auto m1 = m.add_instruction(pass_op{}, a1);
auto a2 = add_alloc(m, {migraphx::shape::float_type, {10000000000}});
m.add_instruction(pass_op{}, a2, m1);
run_pass(m);
CHECK(m.get_parameter_shape("scratch").bytes() == 80000000000);
CHECK(no_allocate(m));
CHECK(is_disjoint({a1, a2}));
}

int main(int argc, const char* argv[]) { test::run(argc, argv); }

0 comments on commit 889fabc

Please sign in to comment.