-
Notifications
You must be signed in to change notification settings - Fork 771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] malus-collator: implement initial version of malicious collator submitting same collation to all backing groups #6924
base: master
Are you sure you want to change the base?
Conversation
…or submitting same collation to all backing groups
All GitHub workflows were cancelled due to failure one of the required jobs. |
…o Collator implementation
…extend test scenario
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good overall!
/// If true, the collator will behave maliciously by submitting | ||
/// the same collations to all assigned backing groups. | ||
#[arg(long, default_value_t = false)] | ||
pub malus: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could have this malus flag map to an enum of variants, describing what malicious behaviour it's triggering
@@ -19,7 +19,7 @@ | |||
use polkadot_cli::{Error, Result}; | |||
use polkadot_node_primitives::CollationGenerationConfig; | |||
use polkadot_node_subsystem::messages::{CollationGenerationMessage, CollatorProtocolMessage}; | |||
use polkadot_primitives::Id as ParaId; | |||
use polkadot_primitives::{self, Id as ParaId}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the self
used?
let scheduled_cores: Vec<CoreIndex> = claim_queue | ||
.iter() | ||
.filter_map(move |(core_index, paras)| { | ||
Some((*core_index, *paras.get(claim_queue_offset.0 as usize)?)) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a look at ClaimQueueSnapshot
to reuse some of the code there with iter_claims_at_depth
let claim_queue = match client.runtime_api().claim_queue(relay_parent) { | ||
Ok(claim_queue) => | ||
if claim_queue.is_empty() { | ||
log::info!(target: LOG_TARGET, "Claim queue is empty."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this case because it'll be caught by scheduled_cores.is_empty()
}) | ||
.collect(); | ||
|
||
if scheduled_cores.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this log is super spammy until the para is scheduled. It's also not an error.
We should wait for a new slot before retrying (the 6 seconds). Same for all the other continue
s here.
I would also log something if scheduled_cores
only has one core assigned, as in that case, this malus collator will not do anything malicious
|
||
overseer_handle | ||
.send_msg( | ||
CollatorProtocolMessage::DistributeCollation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment about why we're not using SubmitCollation
commitments: commitments.clone(), | ||
}; | ||
|
||
let candidate_receipt = ccr.to_plain(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid building a CommittedCandidateReceiptV2
if you only need the plain receipt
charlie: js-script ./assign-core.js with "2,2000,57600" return is 0 within 600 seconds | ||
|
||
# Ensure parachains made progress. | ||
alice: parachain 2000 block height is at least 10 within 300 seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to add some better assertions, including some log checks that indeed validators got invalid candidates.
Also, if you want this test being run in the CI, it needs to be added to the gitlab config in polkadot.yml
# Register three cores assigned to this parachain. | ||
alice: js-script ./assign-core.js with "0,2000,57600" return is 0 within 600 seconds | ||
bob: js-script ./assign-core.js with "1,2000,57600" return is 0 within 600 seconds | ||
charlie: js-script ./assign-core.js with "2,2000,57600" return is 0 within 600 seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
core 2 is assigned by default already when force registering the parachain
@@ -0,0 +1,54 @@ | |||
[settings] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DSL is deprecated, we have a goal to port tests to Zombienet SDK.
@@ -86,6 +90,8 @@ pub struct GraveyardState { | |||
pub zombies: u64, | |||
// Grave seal. | |||
pub seal: [u8; 32], | |||
// Increasing sequence number for core selector. | |||
pub core_selector_number: u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting we use the block number (LSB), is there anything preventing us to do so ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we use upward messages to place UMP signals and specify the core index, we needed this core_selector_number for the core selector. Do you mean doing something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Andrei means that we typically use the block number for the core selector in real parachains.
The undying collator's state does not store the block number, so this core_selector_number
is actually the block number
} | ||
|
||
// Wait before repeating the process. | ||
sleep(Duration::from_secs(6 as u64)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to wait 6s ? If relay chain skips a slot, we build another candidate, which is different than what the honest collator is doing. We should be building on each relay parent we import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean no waiting at all, or should the waiting time somehow depend on the imports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current code is very roughly mimicking what the slot based collator does (every slot, pick the best relay chain block)
What Andrei is suggesting is to simulate what the lookahead collator does and subscribe to relay chain block import notifications instead. This would indeed be closer to what the honest collator is doing. I think for a test collator either is fine
@@ -119,6 +125,7 @@ pub fn execute_transaction(mut block_data: BlockData) -> GraveyardState { | |||
// Chain hash the seals and burn CPU. | |||
block_data.state.seal = hash_state(&block_data.state); | |||
} | |||
block_data.state.core_selector_number += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to use wrapping_add.
some info about overflows in rust https://doc.rust-lang.org/reference/behavior-not-considered-unsafe.html#integer-overflow
Issues
Description
Modified the undying collator to include a malus mode, in which it submits the same collation to all assigned backing groups.
TODO
same-collation-to-all-assigned-cores.toml
using theassign-core.js
script;same-collation-to-all-assigned-cores.zndsl
Zombienet test;