Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make subnet nodes configurable and fix https outcall cost calculation #351

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ninegua
Copy link
Member

@ninegua ninegua commented Jan 6, 2025

XC-257

Fix the cycle computation according to the latest document on the cost for https outcalls https://internetcomputer.org/docs/current/developer-docs/gas-cost/#https-outcalls

Also add nodesInSubnet as an optional initialization parameter when creating the EVM RPC canister. It defaults to 34 if not set, which is also the default used previously.

@ninegua ninegua force-pushed the paulliu/fix-http-request-cost-calculation branch from 1259708 to b4d4885 Compare January 7, 2025 04:54
@ninegua ninegua force-pushed the paulliu/fix-http-request-cost-calculation branch from b4d4885 to e98a9d3 Compare January 20, 2025 13:47
@ninegua ninegua force-pushed the paulliu/fix-http-request-cost-calculation branch from f0f049d to dc90e48 Compare January 23, 2025 14:09
@ninegua ninegua marked this pull request as ready for review January 23, 2025 14:28
@ninegua ninegua requested a review from a team as a code owner January 23, 2025 14:28
Copy link
Contributor

@lpahlavi lpahlavi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ninegua Thanks a lot for this PR! Some small nits, otherwise LGTM!

candid/evm_rpc.did Show resolved Hide resolved
src/accounting.rs Show resolved Hide resolved
@@ -1,15 +1,5 @@
// HTTP outcall cost calculation
// See https://internetcomputer.org/docs/current/developer-docs/gas-cost#special-features
// Constant used in HTTP outcall cost calculation
pub const INGRESS_OVERHEAD_BYTES: u128 = 100;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there perhaps a link where this is documented? I'm not super familiar with all of this so I appreciate when there is documentation available :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea why this gets the value 100 in the first place. But I guess it was some heuristics?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I also have the impression that this shouldn't be there otherwise this would mean that the formula in the developer docs is not correct. So I would remove it. WDYT?

@@ -1042,6 +1049,38 @@ fn candid_rpc_should_err_without_cycles() {
);
}

#[test]
fn candid_rpc_should_err_with_insufficient_cycles() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would add a test for when the number of nodes in the subnet is not specified in the install args.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the explicit setting of 34 nodes. So the default will be used in all tests (if not explicitly set).

Copy link
Member

@gregorydemay gregorydemay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @ninegua for this PR! Some minor comments and suggestions. One question relates to whether we could get rid of this heuristic to approximate the request size.

candid/evm_rpc.did Show resolved Hide resolved
@@ -14,7 +14,7 @@ shared ({ caller = installer }) actor class Main() {

// (`subnet name`, `nodes in subnet`, `expected cycles for JSON-RPC call`)
type SubnetTarget = (Text, Nat32, Nat);
let fiduciarySubnet : SubnetTarget = ("fiduciary", 34, 642_627_200);
let fiduciarySubnet : SubnetTarget = ("fiduciary", 34, 544_244_800);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cycle cost reduction by almost 20% 💰

tests/tests.rs Outdated Show resolved Hide resolved
fn candid_rpc_should_err_with_insufficient_cycles() {
let setup = EvmRpcSetup::with_args(InstallArgs {
demo: Some(true),
nodes_in_subnet: Some(33),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also add the happy path as part of this test, meaning

  1. First try with 33 nodes and fail (already done)
  2. Then, try with the correct number of nodes with the exact same request and ensure it's succesful

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Added

let env = Arc::new(PocketIc::new());
// The `with_fiduciary_subnet` setup below requires that `nodes_in_subnet`
// setting (part of InstallArgs) to be set appropriately. Otherwise
// http outcall will fail due to insufficient cycles, even when `demo` is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a fix here would be add a lot of cycles to the canister when the canister is installed (after L.125), so that even in demo mode the canister can pay for the calls. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't think it needs a fix. It is just how "demo" is being interpreted, as in "the canister pays all the cost, the caller doesn't pay".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes exactly, but to ensure that the canister in demo mode has enough cycles to pay for the calls, we could just set it up with u64::MAX cycles from the beginning. Maybe I misunderstood your comment and it's also ok to leave as is.

) if regex.is_match(&message)
);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional test idea: one thing we want to ensure is that the canister is sufficiently charging users so that it's not loosing cycles:
For a bunch of methods (eth_getTransactionReceipt, ...), do

  1. get canister cycle balance
  2. do call, should be successful.
  3. ensure current canister cycle balance is no smaller than the one before the call.

WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least currently the canister is not charging the user cycle cost of its own computation, but only the http outcalls cost (despite it being over calculated).

If this is a goal, we should tell the user that we intend to over charge them to recover running cost.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a goal, we should tell the user that we intend to over charge them to recover running cost.

I think this is already the case with this COLLATERAL_CYCLES_PER_NODE, no?

@@ -1,15 +1,5 @@
// HTTP outcall cost calculation
// See https://internetcomputer.org/docs/current/developer-docs/gas-cost#special-features
// Constant used in HTTP outcall cost calculation
pub const INGRESS_OVERHEAD_BYTES: u128 = 100;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I also have the impression that this shouldn't be there otherwise this would mean that the formula in the developer docs is not correct. So I would remove it. WDYT?

payload_size_bytes: u64,
max_response_bytes: u64,
) -> u128 {
let n = nodes_in_subnet as u128;
let ingress_bytes =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the impression that this heuristic is very brittle. Why not take as argument instead a &CanisterHttpRequestArgument and then serialize it to find out the exact request size?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually thinking about it again, this might not be such a good idea, since this could open the door to a cycle draining attack, as serialization is a costly operation and this would need to be done before charging for cycles.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ninegua : The cycle cost computation on the replica side actually doesn't take into account Candid encoding so that I think the method to compute cost could

  1. take as an argument &CanisterHttpRequestArgument as initially suggested
  2. Compute the exact cost without needing to do Candid serialization, use the same logic as variable_parts_size to compute n
  3. Maybe add a margin as a security factor (e.g. 5%)

WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This canister also provides a public method request_cost to help users with cost calculation, which cannot use the exact CanisterHttpRequestArgument object.

I think we could over-charge a bit and refund the user with cycles not used. The above request_cost method calculates the cost using get_cost_with_collateral function and we do reject if users pay any less, but the actual charge is without the collateral. So I'm confused because the surplus user provided wasn't charged and therefore not put in any use. I think this could be a mistake.

I'll propose we charge user by the amount from get_cost_with_collateral and then pass on the refund back to the user. Or we can skip the refund and keep the surplus for the evm canister so that it hopefully can sustain itself. WDYT, and which do you prefer? @gregorydemay

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Paul for looking into it! I also think that the canister should sustain itself so I would go with

Or we can skip the refund and keep the surplus for the evm canister so that it hopefully can sustain itself.

So I'm confused because the surplus user provided wasn't charged and therefore not put in any use. I think this could be a mistake.

I'm not entirely sure what happened here to be honest. I think it makes sense to have some collateral and that we charge the user with, since ideally they should also pay for the API keys.

regarding request_cost, I think I would still create a dummy CanisterHttpRequestArgument, because for example we do know that there sill be some headers (e.g. content type) and like this we could follow the same logic as within the replica. The collateral should be big enough so that I don't think additional security margins are needed. Does that make sense?

src/accounting.rs Show resolved Hide resolved
) if regex.is_match(&message)
);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a goal, we should tell the user that we intend to over charge them to recover running cost.

I think this is already the case with this COLLATERAL_CYCLES_PER_NODE, no?

let env = Arc::new(PocketIc::new());
// The `with_fiduciary_subnet` setup below requires that `nodes_in_subnet`
// setting (part of InstallArgs) to be set appropriately. Otherwise
// http outcall will fail due to insufficient cycles, even when `demo` is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes exactly, but to ensure that the canister in demo mode has enough cycles to pay for the calls, we could just set it up with u64::MAX cycles from the beginning. Maybe I misunderstood your comment and it's also ok to leave as is.

request
.max_response_bytes
.unwrap_or(DEFAULT_MAX_RESPONSE_BYTES),
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: here for a lack of a better place. Regarding L.91

ic_cdk::api::call::msg_cycles_accept128(cycles_cost);

Shouldn't we check the return value to ensure that it's at least cycles_cost?

payload_size_bytes: u64,
max_response_bytes: u64,
) -> u128 {
let n = nodes_in_subnet as u128;
let ingress_bytes =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Paul for looking into it! I also think that the canister should sustain itself so I would go with

Or we can skip the refund and keep the surplus for the evm canister so that it hopefully can sustain itself.

So I'm confused because the surplus user provided wasn't charged and therefore not put in any use. I think this could be a mistake.

I'm not entirely sure what happened here to be honest. I think it makes sense to have some collateral and that we charge the user with, since ideally they should also pay for the API keys.

regarding request_cost, I think I would still create a dummy CanisterHttpRequestArgument, because for example we do know that there sill be some headers (e.g. content type) and like this we could follow the same logic as within the replica. The collateral should be big enough so that I don't think additional security margins are needed. Does that make sense?

) -> RpcResult<HttpResponse> {
let cycles_cost = get_http_request_cost(
get_num_subnet_nodes(),
request
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a mistake, not only the body matters for n but also headers and transform see variable_parts_size

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants