Bridges - Add improved congestion control mechanism #6231

Draft
wants to merge 97 commits into
base: master
from

Conversation

bkontur
Contributor

@bkontur bkontur commented Oct 25, 2024

Closes: #5551
Closes: #5550

Context

Before permissionless lanes, bridges only supported hard-coded, static lanes. The congestion mechanism was based on sending Transact(report_bridge_status(is_congested)) from pallet-xcm-bridge-hub to pallet-xcm-bridge-hub-router. Depending on is_congested, we adjusted the fee factor to increase or decrease fees. This congestion mechanism relied on monitoring XCMP queues, which could cause issues like suspending the entire XCMP queue rather than just the affected bridge.

Additionally, we are progressing with deploying bridge message pallets/routing directly on AssetHub, where we don’t interact with XCMP to perform ExportXcm locally.

Description

This PR re-introduces and improves congestion control for bridges:

  • Enhanced Bridge Congestion Mechanism: The bridge queue mechanism has been restructured to operate independently of XCMP, with a refined protocol for congestion detection and suspension management.

  • Bridge-Specific Channel Suspension: pallet-xcm-bridge-hub and pallet-xcm-bridge-hub-router now use BridgeId to identify specific bridges, enabling selective suspension and resumption of individual bridge channels.

  • Dynamic Congestion Detection: pallet-xcm-bridge-hub now includes callbacks for fn suspend_bridge and fn resume_bridge based on congestion status:

    • For sibling chains, the router sends xcm::Transact(report_bridge_status(bridge_id, is_congested)) using the stored callback information.
    • For local chain deployments, the router manages state directly.
  • New Stop Threshold: A stop_threshold limit in pallet-xcm-bridge-hub enables or disables ExportXcm::validate, providing a fallback mechanism when the router does not adhere to the suspend signal.

  • Flexible Message Routing: pallet-xcm-bridge-hub-router has been refactored to support message routing for both sibling chains (ExportMessage) and local deployment (ExportXcm).

These updates improve modularity, allow more granular bridge congestion handling, and support diverse deployment scenarios.
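The per-bridge suspend/resume signalling and the stop_threshold fallback described above can be sketched roughly as follows. This is a simplified, self-contained model rather than the pallet code: the `BridgeId` alias, the `Router` and `Exporter` structs, the queue-length bookkeeping, and all threshold values are hypothetical, and the real pallets work with XCM types and runtime storage.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `BridgeId` type.
type BridgeId = u32;

/// Router-side view: which bridges are currently reported as congested.
#[derive(Default)]
struct Router {
    congested: HashMap<BridgeId, bool>,
}

impl Router {
    /// Models `report_bridge_status(bridge_id, is_congested)`.
    fn report_bridge_status(&mut self, bridge_id: BridgeId, is_congested: bool) {
        self.congested.insert(bridge_id, is_congested);
    }

    fn is_congested(&self, bridge_id: BridgeId) -> bool {
        self.congested.get(&bridge_id).copied().unwrap_or(false)
    }
}

/// Exporter-side view: one outbound queue length per bridge.
/// All three thresholds are assumed values chosen by the caller.
struct Exporter {
    suspend_threshold: usize,
    resume_threshold: usize,
    stop_threshold: usize,
    queued: HashMap<BridgeId, usize>,
}

impl Exporter {
    /// Models message validation: rejects once `stop_threshold` is reached,
    /// and signals congestion for this bridge only at `suspend_threshold`.
    fn enqueue(&mut self, router: &mut Router, bridge_id: BridgeId) -> Result<(), &'static str> {
        let len = self.queued.entry(bridge_id).or_insert(0);
        if *len >= self.stop_threshold {
            return Err("bridge queue reached stop_threshold");
        }
        *len += 1;
        if *len >= self.suspend_threshold && !router.is_congested(bridge_id) {
            router.report_bridge_status(bridge_id, true);
        }
        Ok(())
    }

    /// Models a delivered message: resumes the bridge once the queue drains.
    fn delivered(&mut self, router: &mut Router, bridge_id: BridgeId) {
        let len = self.queued.entry(bridge_id).or_insert(0);
        *len = (*len).saturating_sub(1);
        if *len <= self.resume_threshold && router.is_congested(bridge_id) {
            router.report_bridge_status(bridge_id, false);
        }
    }
}
```

In this model a congested bridge never affects the status of any other `BridgeId`, and the stop threshold rejects messages even if the router keeps sending after the suspend signal.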

Open questions

  • When the router receives is_congested = false in fn do_update_bridge_status, can we remove the bridge (and its fee factor) from Bridges directly, so that the next message fee is not affected by the increased fee factor? Or should the fee factor decrease slowly while the bridge is idle (this is how it is implemented now)? Original comment
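To make the two options in this question concrete, here is a minimal sketch. It assumes a fee factor stored in parts per thousand and a hypothetical decay of roughly 1/1.1 per idle step; the `BridgeState` struct, the function names, and the constants are illustrative only and are not taken from the pallet.

```rust
/// Fee factor in parts per thousand; 1000 means "no surcharge" (assumed scale).
const MINIMAL_FEE_FACTOR: u32 = 1000;

#[derive(Debug, Clone, PartialEq)]
struct BridgeState {
    is_congested: bool,
    delivery_fee_factor: u32,
}

/// Option 1: drop the bridge state as soon as `is_congested = false` arrives.
fn on_uncongested_remove(state: &mut Option<BridgeState>) {
    *state = None;
}

/// Option 2, step 1: only clear the congestion flag and keep the fee factor.
fn on_uncongested_mark(state: &mut Option<BridgeState>) {
    if let Some(s) = state.as_mut() {
        s.is_congested = false;
    }
}

/// Option 2, step 2: on every idle step, decay the factor of an uncongested
/// bridge and remove the state once the minimal factor is reached.
fn on_idle_decay(state: &mut Option<BridgeState>) {
    let current = match state.as_ref() {
        Some(s) if !s.is_congested => s.delivery_fee_factor,
        _ => return,
    };
    // Hypothetical decay rate: divide by 1.1 per idle step.
    let decayed = (current * 10 / 11).max(MINIMAL_FEE_FACTOR);
    if decayed == MINIMAL_FEE_FACTOR {
        *state = None;
    } else if let Some(s) = state.as_mut() {
        s.delivery_fee_factor = decayed;
    }
}

/// Fee charged for the next message under the current state.
fn fee(base: u32, state: &Option<BridgeState>) -> u32 {
    let factor = state
        .as_ref()
        .map_or(MINIMAL_FEE_FACTOR, |s| s.delivery_fee_factor);
    base * factor / 1000
}
```

With option 1 the next message pays the base fee immediately; with option 2 it keeps paying a surcharge that shrinks over the following idle steps.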

@bkontur bkontur added the T15-bridges This PR/Issue is related to bridges. label Oct 25, 2024
@bkontur bkontur self-assigned this Oct 25, 2024
@bkontur
Contributor Author

bkontur commented Oct 25, 2024

bot fmt
/cmd prdoc --audience runtime_dev --bump patch

@bkontur bkontur force-pushed the bko-bridges-congestion branch 2 times, most recently from 659be89 to b48b8a5 Compare October 25, 2024 21:14
prdoc/pr_6231.prdoc (outdated, resolved)
@bkontur bkontur force-pushed the bko-bridges-congestion branch from a663bc2 to cbc6ae7 Compare October 26, 2024 20:29
@bkontur
Contributor Author

bkontur commented Oct 26, 2024

bot fmt

@bkontur bkontur force-pushed the bko-bridges-congestion branch from c78f9bc to 501a5c0 Compare October 28, 2024 15:01
@bkontur
Contributor Author

bkontur commented Oct 28, 2024

bot fmt

@bkontur bkontur force-pushed the bko-bridges-congestion branch 7 times, most recently from c78e707 to 152389a Compare November 5, 2024 12:33
@bkontur
Contributor Author

bkontur commented Nov 5, 2024

bot fmt

@bkontur bkontur force-pushed the bko-bridges-congestion branch 3 times, most recently from edd9c5c to 38f1bb3 Compare November 7, 2024 13:33
@bkontur
Contributor Author

bkontur commented Nov 7, 2024

/cmd bench --runtime asset-hub-westend asset-hub-rococo --pallet pallet_xcm_bridge_hub_router

@bkontur
Contributor Author

bkontur commented Nov 7, 2024

bot bench cumulus-assets --runtime=asset-hub-westend --pallet=pallet_xcm_bridge_hub_router
bot bench cumulus-assets --runtime=asset-hub-rococo --pallet=pallet_xcm_bridge_hub_router

@bkontur bkontur force-pushed the bko-bridges-congestion branch from f06433a to d329dec Compare November 7, 2024 16:44
@bkontur
Contributor Author

bkontur commented Nov 7, 2024

bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-assets --runtime=asset-hub-westend --pallet=pallet_xcm_bridge_hub_router
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-assets --runtime=asset-hub-rococo --pallet=pallet_xcm_bridge_hub_router

bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-rococo --pallet=pallet_bridge_messages
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-westend --pallet=pallet_bridge_messages

@bkontur
Contributor Author

bkontur commented Nov 7, 2024

bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-rococo --pallet=pallet_bridge_messages
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-westend --pallet=pallet_bridge_messages
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --subcommand=xcm --runtime=bridge-hub-rococo --pallet=pallet_xcm_benchmarks::generic
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --subcommand=xcm --runtime=bridge-hub-westend --pallet=pallet_xcm_benchmarks::generic

@bkontur bkontur force-pushed the bko-bridges-congestion branch from 21baa7f to c00ff6b Compare November 8, 2024 17:44
@bkontur
Contributor Author

bkontur commented Nov 8, 2024

bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-rococo --pallet=pallet_bridge_messages
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-westend --pallet=pallet_bridge_messages
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-rococo --pallet=pallet_xcm_bridge_hub
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --runtime=bridge-hub-westend --pallet=pallet_xcm_bridge_hub

bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --subcommand=xcm --runtime=bridge-hub-rococo --pallet=pallet_xcm_benchmarks::generic
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-bridge-hubs --subcommand=xcm --runtime=bridge-hub-westend --pallet=pallet_xcm_benchmarks::generic

bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-assets --runtime=asset-hub-westend --pallet=pallet_xcm_bridge_hub_router
bot bench -v PIPELINE_SCRIPTS_REF=bko-fix cumulus-assets --runtime=asset-hub-rococo --pallet=pallet_xcm_bridge_hub_router

@bkontur bkontur added the A4-needs-backport Pull request must be backported to all maintained releases. label Nov 11, 2024
Command "fmt" has started 🚀 See logs here

Command "fmt" has finished ✅ See logs here

Contributor

@acatangiu acatangiu left a comment

All of the XCM transport (pallet-messages layer) will be deployed to Asset Hub instead of Bridge Hub, so would it be easier to not go through all of the effort of upgrading Bridge Hub with all this only to deprecate it on the next iteration?

What if instead of changing and renaming xcm-bridge-hub to support permissionless lanes, we just create a new pallet xcm-bridge which is what you basically have in this PR, and we revert xcm-bridge-hub to the same code actually deployed today on Bridge Hub?

That way we keep legacy code with legacy lane on BH without any migrations or risk, and all of this new code goes straight to AH, then at some point we switch the AH exporter from legacy to new?

I know this is late and maybe we should've had this idea earlier, so I'm leaving it up to you to decide which way is easiest to do.

@@ -1127,7 +1126,6 @@ mod tests {
Option<PreDispatchData<ThisChainAccountId, BridgedChainBlockNumber, TestLaneIdType>>,
TransactionValidityError,
> {
sp_tracing::try_init_simple();
Contributor

nit: why remove this? it's useful in debugging, no?

bridges/modules/xcm-bridge-hub-router/src/impls.rs (outdated, resolved)
bridged_dest.clone()
}
} else {
// if `bridged_dest` does not contain `GlobalConsensus`, let's prepend one
Contributor

is there a case when ensure_is_remote(UniversalLocation::get(), dest.clone()) returns Ok, but dest does not contain GlobalConsensus?
should we even support such a case?

bridges/modules/xcm-bridge-hub-router/src/lib.rs (outdated, resolved)
bridges/primitives/xcm-bridge-hub/src/lib.rs (outdated, resolved)
bridges/modules/xcm-bridge-hub/src/exporter.rs (outdated, resolved)
bridges/modules/xcm-bridge-hub/src/exporter.rs (outdated, resolved)
bridges/modules/xcm-bridge-hub/src/exporter.rs (outdated, resolved)
prdoc/pr_6231.prdoc (outdated, resolved)
@acatangiu acatangiu changed the title from "Bridges - revert-back and improve congestion" to "Bridges - add new congestion control protocol for dedicated lanes" Dec 9, 2024
github-merge-queue bot pushed a commit that referenced this pull request Dec 10, 2024
Closes: #5551

## Description

With [permissionless lanes
PR#4949](#4949), the
congestion mechanism based on sending
`Transact(report_bridge_status(is_congested))` from
`pallet-xcm-bridge-hub` to `pallet-xcm-bridge-hub-router` was replaced
with a congestion mechanism that relied on monitoring XCMP queues.
However, this approach could cause issues, such as suspending the entire
XCMP queue instead of isolating the affected bridge. This PR reverts
back to using `report_bridge_status` as before.

## TODO
- [x] benchmarks
- [x] prdoc

## Follow-up

#6231

---------

Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: command-bot <>
Co-authored-by: Adrian Catangiu <adrian@parity.io>
bkontur added a commit that referenced this pull request Dec 10, 2024
(cherry picked from commit 8f4b99c)

# Conflicts:
#	Cargo.lock
#	cumulus/parachains/runtimes/assets/asset-hub-rococo/tests/tests.rs
#	cumulus/parachains/runtimes/assets/asset-hub-westend/src/lib.rs
Ank4n pushed a commit that referenced this pull request Dec 15, 2024
Comment on lines +229 to 244
for (bridge_id, previous_value, bridge_state) in bridges_to_update.into_iter() {
    let new_value = bridge_state.delivery_fee_factor;
    log::info!(
        target: LOG_TARGET,
        "Bridge channel with id {:?} is uncongested. Decreasing fee factor from {} to {}!",
        bridge_id,
        previous_value,
        new_value,
    );
    Bridges::<T, I>::insert(&bridge_id, bridge_state);
    Self::deposit_event(Event::DeliveryFeeFactorDecreased {
        previous_value,
        new_value,
        bridge_id,
    });
}
Contributor

Just curious: why not guard the weight usage with the meter in the second loop as well? What if there are a lot of bridges to update/remove here while the weight has already been used up in the first loop?
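A rough sketch of what metering both loops could look like is below. The `Meter` type is a hypothetical, minimal stand-in and not the FRAME weight meter API; the per-read and per-write costs, the tuple layout, and the decay rule are assumptions made for illustration.

```rust
/// Hypothetical, minimal stand-in for a weight meter.
struct Meter {
    remaining: u64,
}

impl Meter {
    /// Consumes `w` if it still fits and reports whether it did.
    fn try_consume(&mut self, w: u64) -> bool {
        if w <= self.remaining {
            self.remaining -= w;
            true
        } else {
            false
        }
    }
}

const READ_COST: u64 = 1; // assumed cost of inspecting one bridge
const WRITE_COST: u64 = 2; // assumed cost of updating one bridge

/// `bridges` holds `(bridge_id, fee_factor)` pairs, with the factor in
/// parts per thousand. Returns how many bridges were actually updated.
fn on_idle(bridges: &mut [(u32, u32)], meter: &mut Meter) -> usize {
    // First loop: select bridges with an elevated factor, one read each.
    let mut to_update = Vec::new();
    for (index, (_id, factor)) in bridges.iter().enumerate() {
        if !meter.try_consume(READ_COST) {
            break;
        }
        if *factor > 1000 {
            to_update.push(index);
        }
    }

    // Second loop: apply the updates, guarded by the same meter so that
    // writes stop as soon as the remaining weight is exhausted.
    let mut written = 0;
    for index in to_update {
        if !meter.try_consume(WRITE_COST) {
            break;
        }
        bridges[index].1 = (bridges[index].1 * 10 / 11).max(1000);
        written += 1;
    }
    written
}
```

Bridges skipped because the meter ran out simply keep their current factor and would be picked up on a later idle pass.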

@bkontur bkontur changed the title from "Bridges - add new congestion control protocol for dedicated lanes" to "Bridges - Add improved congestion control mechanism" Dec 19, 2024
@paritytech-review-bot paritytech-review-bot bot requested a review from a team December 20, 2024 21:34
@bkontur
Contributor Author

bkontur commented Dec 22, 2024

/cmd fmt

Command "fmt" has started 🚀 See logs here

Command "fmt" has finished ✅ See logs here

dudo50 pushed a commit to paraspell-research/polkadot-sdk that referenced this pull request Jan 4, 2025
@acatangiu
Contributor

@bkontur please ping when this is ready for review

@bkontur bkontur marked this pull request as draft January 13, 2025 12:25
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/12750200852
Failed job name: fmt

Labels
A4-needs-backport Pull request must be backported to all maintained releases. T15-bridges This PR/Issue is related to bridges.
Projects
Status: In Progress
Status: Backlog
Development

Successfully merging this pull request may close these issues.

Add benchmarks for pallet-xcm-bridge-hub
Add LocalXcmChannelManager impls for XcmpQueue and BridgeHubs
6 participants