True cost of function calls #7227

jakmeier · 2022-07-21T11:02:01Z

Measure the "true" cost of setting up a function call. This is about everything from processing the transaction all the way to persisting results. Excluded is the dynamic part, i.e. what happens inside the guest VM.

The model itself is discussed in #6992. This issue here is about finding the correct gas numbers for each parameter.

jakmeier · 2022-07-21T13:02:57Z

With all the changes around the estimator in the past two weeks, I was now able to estimate these "true" function call costs without instability or inaccuracy. I ran the tests on our estimator hardware setup on gcloud, with a state that is preloaded with 1M accounts and RocksDB stored on an SSD.

I found that icount measurement are consistently 20%-70% lower lower than time measurements. So I decided to ignore icount measurement and go with time instead, as this reflects better how long a node really takes to process a function call.

Here is the summary of what changes to parameters the estimator suggests:

	[Ggas]		[Ggas]	Parameter Change Factor	Note
action_receipt_creation_send_not_sir	108.060	->	237.154	2.1947	* This affects all actions
action_receipt_creation_execution	108.060	->	237.154	2.1947	* This affects all actions
action_function_call_send_not_sir	2319.862	->	20.676	0.0089
action_function_call_execution	2319.862	->	4337.353	1.8697	** Due to reading 4MiB contract from DB
wasm_contract_loading_base	0.0354460	->	25.844976214	729.1374	***
wasm_contract_loading_bytes	0.00021675	->	0.034621497	159.7301	***

* Might be reduced with flat storage
** Contract prefetching could reduce this, or storing size of contract. (The problem here is that we have to cover the DB fetching cost in this parameter without knowing the size, so this cost is for a 4MiB contract. If we know the size ahead of time, we could charge less for smaller contracts. Even better, if we prefetch the code ahead of time, we can reduce this to just 20Ggas.)
*** Contract loading ahead of time might solve this

These changes translate to per fn-call minimum overhead gas cost as follows, depending on contract code size.

Fn Overhead [Ggas]	10kB	Sweatcoin (202kB)	Aurora (1MB)	4MiB
today	4858.044945963	4899.713399463	5078.211792963	5764.992837963
new	5204.397639685	11860.103465959	40371.667918393	150071.266022773

jakmeier · 2022-07-21T13:42:59Z

cc @akhi3030 @bowenwang1996 @matklad @nagisa

I don't think we can move forward with the plan to "ship the new fn base costs". Some re-prioritisation is probably required here.

As the table above shows, we could keep total fn base costs for small contracts at around the same cost as today, but not decrease it. Even worse, we would have to increase costs for large contracts, e.g. Aurora would go from 5 Tgas currently to 40 Tgas afterwards.

There are two main problems.

DB reads amount to roughly 5ms of costs, and that's not even worst-case. This alone already prevents us from reducing the costs, as we currently charge 4.855 Tgas for all steps combined.
For large contracts, the time spent on contract loading is too long. Base cost is almost irrelevant here, the per-byte cost is important. Let's take Aurora as example, the base loading cost is estimated to be at 0.026 Tgas, or 26us. The cost based on size is 35 Tgas, or 35ms.

For number 1, we could prefetch DB reads for account data, access key data, and contract code. We have a potential of 4.1ms we could shave off this way, for all contracts regardless of size. This would make the smallest contracts up to 85% cheaper compared to today.

For number 2, it's important to understand that estimation is based on a contract that adds more and more functions. This is a proxy for what we assume to be a close to worst-case contract for loading.
We could either try to understand this case better and see if we can optimise for that, but we would also have to consider other possible worst-case contracts. But the bottom line is, as long as we know of valid contracts that are so slow to compile, we cannot ship a cheaper gas cost.
Alternatively, we could start loading all contracts ahead of time in a background thread, which would also allow to reduce gas costs. Prefetching the code from the DB would also be a requirement for that.

akhi3030 · 2022-07-21T13:49:15Z

There is quite a big reduction in action_function_call_send_not_sir. Do we know what happened there? Did we overestimate it originally?

akhi3030 · 2022-07-21T13:53:19Z

Do we have a design already for prefetching contracts?

jakmeier · 2022-07-21T14:01:43Z

There is quite a big reduction in action_function_call_send_not_sir. Do we know what happened there? Did we overestimate it originally?

Yeah, we know that we have been overcharging this part to cover undercharged costs in function loading. I'd say that was the main motivation behind looking all of these costs in combination and fixing the gas model first, before changing any single parameter.

Do we have a design already for prefetching contracts?

Not that I am aware of. But it wouldn't be substantially different from accounts + access key prefetching, for which we have a Jira issue STR-4.

matklad · 2022-07-21T14:18:22Z

Contract prefetching could reduce this, or storing size of contract.

we already store the size of the contract, so this should be "easy" to do I think

diff --git a/core/store/src/lib.rs b/core/store/src/lib.rs
index eeeb4cf3e..e9b890f9b 100644
--- a/core/store/src/lib.rs
+++ b/core/store/src/lib.rs
@@ -555,8 +555,12 @@ pub fn get_code(
     code_hash: Option<CryptoHash>,
 ) -> Result<Option<ContractCode>, StorageError> {
     state_update
-        .get(&TrieKey::ContractCode { account_id: account_id.clone() })
-        .map(|opt| opt.map(|code| ContractCode::new(code, code_hash)))
+        .get_ref(&TrieKey::ContractCode { account_id: account_id.clone() })
+        .map(|opt| opt.map(|ptr| {
+            let l = ptr.len();
+            gas_counter.add(l * loading_cost_per_byte)?;
+            ContractCode::new(ptr.deref_value()?, code_hash)
+        }))
 }
 
 /// Removes account, code and all access keys associated to it.

filed https://near.zulipchat.com/#narrow/stream/295306-pagoda.2Fcontract-runtime/topic/When.20to.20charge.20loading.20cost for discussing this

akhi3030 · 2022-07-22T09:21:56Z

Summary from the zulip thread + VC call between @matklad and @jakmeier 👍

@matklad to revert the stabilization (feat: stabilize precharged contract loading gas #7177)
After @jakmeier is back, we'll think carefully how we charge costs: there's a bunch of considerations we surfaced in just five minutes. E.g.:
- it seems logical to say that loading the contract is just storage_read op, which should be charged accordingly
- except that if we do that, we'll hit a problem where we could load WASM, but run out of gas when loading potentially larger compiled code
- so perhaps we need an extra parameter here?

matklad · 2022-07-22T12:33:41Z

I think wasm_contract_loading_bytes is measured incorrectly: it seems we don't do warm-up iterations there. With warm-up, the cost becomes couple of orders of magnitudes less.

$ cargo run -r -- --costs ContractLoadingPerByte --metric time --warmup-iters 1
ContractLoadingPerByte                                     192_248 gas

$ cargo run -r -- --costs ContractLoadingPerByte --metric time --warmup-iters 0
ContractLoadingPerByte                                  55_820_001 gas

$ cargo run -r -- --costs ContractLoadingPerByte --metric time
ContractLoadingPerByte                                  65_831_844 gas

It's interesting that we don't flag this as uncertain, because we measure aggregate time there, rather than average.

jakmeier · 2022-07-24T12:41:07Z

Pretty sure I was running 3 warm-up iterations and then 10 measured iterations,

I suspect the warmup only makes a difference because you are running with caches enabled here. I was running without caches, a.k.a. --features=required. (Also release mode sometimes makes a big difference. LTO isn't that important here but in my reported measurments, that was also used. I didn't use page cache drops, it only makes results less stable in this case.)

But it's still an interesting finding, I would have thought pure loading costs are not affected by caches. I think we need to find out:

what exactly are we caching
should it be cached or not for a realistic worst-case estimation?

This reverts commit f7aca03. Reverting based on discussion at #7227 The TL;DR is while the current impl of the feature charges correct cost for contract loading at *logically* correct place, we'd want to move the charging a little bit earlier physically, even before we load the contract from the database, so that the cost can be lower without adding an extra protocol feature here.

jakmeier · 2022-08-24T13:21:11Z

It seems the numbers presented here for current "true costs" are mostly accurate. Closing this issue now.

jakmeier added T-contract-runtime Team: issues relevant to the contract runtime team A-params-estimator Area: runtime params estimator labels Jul 21, 2022

jakmeier self-assigned this Jul 21, 2022

matklad mentioned this issue Jul 22, 2022

Revert "feat: stabilize precharged contract loading gas (#7177)" #7241

Merged

matklad mentioned this issue Aug 18, 2022

Profiling wasmer 2 to find bottlenecks in cold start #5014

Closed

jakmeier closed this as completed Aug 24, 2022

matklad mentioned this issue Aug 30, 2022

Charge contract_loading_cost before loading data from the database #7511

Open

matklad mentioned this issue Sep 20, 2022

protocol_feature_fix_contract_loading_cost tracking issue #5962

Closed

1 task

This was referenced Sep 30, 2022

New Function Call Cost Model (Without parameter estimation) #6992

Closed

Base gas costs for function calls are too high #7741

Open

Improve function call fee estimation #4826

Closed

aborg-dev mentioned this issue Mar 19, 2024

Investigate the effects of decreasing function call gas/compute cost #10829

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

True cost of function calls #7227

True cost of function calls #7227

jakmeier commented Jul 21, 2022 •

edited by exalate-issue-sync bot

Loading

jakmeier commented Jul 21, 2022 •

edited

Loading

jakmeier commented Jul 21, 2022

akhi3030 commented Jul 21, 2022

akhi3030 commented Jul 21, 2022

jakmeier commented Jul 21, 2022

matklad commented Jul 21, 2022 •

edited

Loading

akhi3030 commented Jul 22, 2022

matklad commented Jul 22, 2022 •

edited

Loading

jakmeier commented Jul 24, 2022

jakmeier commented Aug 24, 2022

True cost of function calls #7227

True cost of function calls #7227

Comments

jakmeier commented Jul 21, 2022 • edited by exalate-issue-sync bot Loading

jakmeier commented Jul 21, 2022 • edited Loading

jakmeier commented Jul 21, 2022

akhi3030 commented Jul 21, 2022

akhi3030 commented Jul 21, 2022

jakmeier commented Jul 21, 2022

matklad commented Jul 21, 2022 • edited Loading

akhi3030 commented Jul 22, 2022

matklad commented Jul 22, 2022 • edited Loading

jakmeier commented Jul 24, 2022

jakmeier commented Aug 24, 2022

jakmeier commented Jul 21, 2022 •

edited by exalate-issue-sync bot

Loading

jakmeier commented Jul 21, 2022 •

edited

Loading

matklad commented Jul 21, 2022 •

edited

Loading

matklad commented Jul 22, 2022 •

edited

Loading