Race condition #10
On mobile for a week so I don't have links handy, but this is a known issue (though it should be better documented) and I believe I already created an EIP to get this added to the RPC protocol. Unfortunately, the EIP process is painfully slow, so I have no idea how long until this gets addressed. Feel free to submit a PR to update the Readme with a known issues section. If not, I'll try to remember to update the Readme when I'm back at a computer next week.
Looking through the source code, it looks like I already had an idea on how to fix this: just check the block hash of the returned block to validate it is correct before processing logs. However, it will require a bit of work to implement since the current design doesn't allow an easy way of exposing the inconsistency back up the call stack. Perhaps the best option is to just drop the logs if that condition is reached? Will need to research more to see if the reorg will be dealt with correctly if we drop the logs here.
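A minimal sketch of that hash check, assuming a generic JSON-RPC `getLogs` helper; the names and shapes below are illustrative assumptions, not the library's actual internals:

```typescript
interface Block { hash: string; number: string; }
interface Log { blockHash: string; }

// Fetch logs for a block and verify they actually belong to it. A mismatched
// blockHash means a reorg happened between the block fetch and the log fetch.
async function fetchLogsChecked(
  getLogs: (filter: { fromBlock: string; toBlock: string }) => Promise<Log[]>,
  block: Block,
): Promise<Log[] | null> {
  const logs = await getLogs({ fromBlock: block.number, toBlock: block.number });
  if (logs.some(log => log.blockHash !== block.hash)) {
    return null; // caller decides whether to drop, retry, or roll back
  }
  return logs;
}
```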
I found the EIP: ethereum/EIPs#597. Someone needs to write an implementation for Geth/Parity as a next step.
I took a bit of time to dig into how hard implementing the above would be and remembered why I didn't implement it at the time. The root problem is that the block processor does not have a mechanism for the log processor to say, "the block you just had me process is invalid, please roll it back" other than by throwing an exception. The problem is that by throwing an exception, the block processor will unwind its call stack, and therefore any queued-up block processing. If we are in the middle of dealing with a reorg when this happens, the reorg will be partially announced, but we won't announce the removals, which is arguably just as bad a bug as the one we already have. A comment about this can be seen at
Thinking 🤔 some more on the issue, perhaps it is "safe" to simply swallow any logs that don't match the expected block hash? This would cause them to not be announced, but that is fine because we know that a reorg is about to happen. However, it is also possible that we'll reorg the block back in before actually witnessing the first reorg, which would then cause us to never announce the logs for that block. All of this brings me back to the conclusion I think I came to originally, which is that there is no great solution to this problem given the current architecture, so we need to pick which problem we want to have until such time as someone can sit down and come up with a new architecture, or a good solution to this problem. Is it worse to receive double logs, no logs, log announcements without rollbacks, or block announcements without rollbacks? I believe I settled on the current solution because it is the least code, and therefore the least likely to introduce any unwanted side effects, and none of the options are particularly great.
EIP accepted as a draft, still needs Geth/Parity implementation: https://github.com/ethereum/EIPs/blob/master/EIPS/eip-234.md
You can also just get the logs from the transaction receipt of each transaction hash in the block, which you get when you first fetch block B. Then aggregate the logs from all the receipts to report the logs for the block. It means more requests (mitigated by batch requests), but if any request in the bunch fails you can drop the block and try again to get the block by number, presumably ending up with C, since a failure to get a receipt would likely be due to a reorg.
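A sketch of that receipt-aggregation approach, assuming a generic JSON-RPC batch helper (`rpcBatch` is a placeholder, not any specific library's API):

```typescript
interface Receipt { blockHash: string; logs: unknown[]; }

// Fetch every transaction receipt for a block in one JSON-RPC batch and
// aggregate the logs. Returns null when any receipt is missing or points at
// a different block, which signals a likely reorg.
async function logsFromReceipts(
  rpcBatch: (calls: { method: string; params: unknown[] }[]) => Promise<(Receipt | null)[]>,
  block: { hash: string; transactions: string[] },
): Promise<unknown[] | null> {
  const receipts = await rpcBatch(
    block.transactions.map(txHash => ({
      method: 'eth_getTransactionReceipt',
      params: [txHash],
    })),
  );
  if (receipts.some(r => r === null || r.blockHash !== block.hash)) return null;
  return receipts.flatMap(r => r!.logs);
}
```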
Yeah, I considered that but it would result in a ton of data being shipped from the node to the client since you would have to dig into the transaction receipts for every transaction, even those that didn't have any logs. 😢
It's really not that bad though if you're only fetching them once and using the batch RPC so you only send a single request. I went that route and it works great. BTW https://ethercast.io is live on mainnet and Kovan now if you'd like to try it.
@moodysalem I'm assuming that you are working on a server? My concern is that in a browser or on mobile it doesn't take much to overload the client and/or the connection. If you are monitoring for some uncommon event, fetching full blocks for every single block is a significant increase in data volume compared to fetching blocks without transactions. It isn't about the absolute size, as that depends on your environment; it is about the relative difference in size requirements.
Re-opening the issue as I think it is very valuable to keep some kind of documentation about this problem, and a place for people to aggregate thoughts on it. Feel free to unsubscribe of course. 😁
I believe the correct architectural change is to make it so that when something goes wrong, blockstream fires a rollback for any log/block that it fired since the last known-good state. The last known-good state is basically any time the block history or log history promises finalize. This will require tracking all announced changes, rolling them back in order, and swapping out
More notes to future implementer (me): rather than firing block/log callbacks immediately, we can queue them up in a data structure internal to
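A sketch of that buffered-announcement idea; everything here is a hypothetical illustration of the design being described, not the library's actual internals:

```typescript
type Announcement =
  | { kind: 'block-added' | 'block-removed'; block: unknown }
  | { kind: 'logs-added' | 'logs-removed'; logs: unknown[] };

// Buffer announcements while reconciling; flush them in order once the whole
// operation succeeds, or drop them wholesale if a failure is detected before
// anything was fired to subscribers.
class AnnouncementBuffer {
  private pending: Announcement[] = [];

  queue(announcement: Announcement): void {
    this.pending.push(announcement);
  }

  // Last known-good state reached: fire everything in order.
  commit(fire: (a: Announcement) => void): void {
    for (const a of this.pending) fire(a);
    this.pending = [];
  }

  // Failure detected mid-reconcile: nothing was announced yet, so the queued
  // announcements can simply be discarded.
  abort(): void {
    this.pending = [];
  }
}
```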
While #18 helps with this issue, we still need EIP-234 to be implemented in all major clients before we can actually fix the issue mentioned here. The remaining unhandled case is when the following sequence occurs:
Once EIP-234 is implemented, we can ask the server for logs for a specific block hash and get an error if the block is not found, rather than being given logs for an incorrect block.
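For illustration, the EIP-234 version of the call accepts a `blockHash` filter field; the `send` helper below is an assumed generic JSON-RPC transport:

```typescript
// Ask for logs by block hash (EIP-234). The node errors out if the hash is
// not on its canonical chain, instead of silently returning logs for
// whichever block now occupies that height.
async function getLogsByHash(
  send: (method: string, params: unknown[]) => Promise<unknown>,
  blockHash: string,
): Promise<unknown[]> {
  return (await send('eth_getLogs', [{ blockHash }])) as unknown[];
}
```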
EIP-234 implemented on geth.
openethereum/parity-ethereum#9256 implements this on Parity.
Now that EIP-234 has been implemented in geth and parity, is it safe to say all the pieces are there to patch this?
Ah. Sorry. Version 7.0.0 was published to NPM a while ago, which fully fixes this. I didn't realize this issue was still open. On mobile now; I'll read over the issue again when I'm at a computer, verify nothing is left, and then close.
Note that 7.0.0 has some other breaking changes in it; in particular, all of the logs for a block are sent in a single callback with an array of logs rather than in a separate callback per log. It should be pretty easy to fan out with a simple adapter if you prefer the old behavior, but it will require some code changes. Also, the signature for the
If you only want the hash change, then version 6.0.1 will get you that, but I do recommend using 7.0.0+ since that is the line that will receive support.
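A sketch of the fan-out adapter mentioned above, for callers who want the old per-log behavior back; the subscription method name in the usage comment is a guess, not necessarily the library's exact API:

```typescript
// Adapt the 7.0.0 per-block array callback back into per-log callbacks.
function fanOut<TLog>(onLog: (log: TLog) => void): (logs: TLog[]) => void {
  return (logs: TLog[]) => {
    for (const log of logs) onLog(log);
  };
}

// Hypothetical usage:
// blockAndLogStreamer.subscribeToOnLogsAdded(fanOut(log => console.log(log)));
```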
Blockstream fetches the logs by block number. If a reorg happens between the block request and the log request, it will lead to a wrong log history and possibly skipped events.
```
B (2)   C (2)
    \   /
    A (1)
```
Let's say someone calls `reconcileBlock` with block B.
Blockstream requests logs for block 2. It gets the logs from block C because the reorg happened. Then another reorg happens and some block mined on top of A is the new head. Blockstream will not request logs from B because it missed the first reorg. The user will never receive those logs.
At first sight this involves perfectly timed reorgs, two of them, which seems impractical; but if we claim that this is a 'reliable' block stream, then this should not happen.
The solution would be to request logs by hash, not by number. Unfortunately, nodes don't support that.
We should either ask nodes to support that, or support returning logs as part of the getBlock request. It already has a boolean flag to return transactions; it might as well include logs.