external: normalize unambiguous file id lookup#2992
Conversation
There was a problem hiding this comment.
Pull request overview
This PR aims to make external regular-file id matching more robust across equivalent numeric representations (decimal vs hex) and improve restore diagnostics when CRIU falls back from --inherit-fd to path-based reopen.
Changes:
- Add normalization logic for
file[mnt_id:inode]matching in external id lookup (keeping exact string compare as a fast path). - Emit a warning during restore when an external file has no
--inherit-fdmapping and CRIU falls back to reopening by path. - Update CLI help and documentation to describe accepted numeric formats for
file[mnt_id:inode].
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
criu/files-reg.c |
Adds a warning when --inherit-fd mapping is missing for an external file. |
criu/external.c |
Introduces parsing/normalization for file[mnt_id:inode] during external id lookup. |
criu/crtools.c |
Updates --external / --inherit-fd help text for file[mnt_id:inode] formats. |
Documentation/criu.txt |
Clarifies documentation for --external file[mnt_id:inode] numeric formats. |
11276ef to
7bf7613
Compare
|
please review when possible @avagin |
7bf7613 to
0f87717
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## criu-dev #2992 +/- ##
============================================
- Coverage 57.76% 57.26% -0.51%
============================================
Files 142 154 +12
Lines 37664 40486 +2822
Branches 0 8874 +8874
============================================
+ Hits 21758 23185 +1427
- Misses 15906 17037 +1131
- Partials 0 264 +264 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@cheese-cakee please update the commit message with more details. |
|
I get the goal here, but we need to be careful. Since the file[hex:hex] format lacks the 0x prefix, any hex value consisting only of numbers could be parsed as a decimal either... |
f3b1e9f to
4499038
Compare
I addressed this by making CRIU emit the 0x prefix (changed the snprintf in dump_one_reg_file from %x to 0x%x), so the format is unambiguous going forward. The parser now requires 0x for hex |
I don't understand how it will help us. We can't break existing users, so we need to proper handle the old format... |
|
Adding a note that I believe this is also related to the comments in #2951 (comment) and #2973 |
| int base; | ||
|
|
||
| /* | ||
| * 0x prefix means hexadecimal; everything else is decimal. |
There was a problem hiding this comment.
I believe this is the breaking change that Andrei is referring to. There is no reliable way for us to parse decimal numbers here. For example, 0x11 == 17. When 0x is removed, we have no way of knowing if the input is 11 or 17.
There was a problem hiding this comment.
I'll drop the emission change and instead make the parser try decimal first (matching user expectation for --inherit-fd arguments) and fall back to hex only when the decimal parse fails
There was a problem hiding this comment.
again, it can break backward compatibility. Please explain in the commit message in details why existing users will be not affected by this change. Try to think on a case when there are two external resources with ids '11' and '17'.
There was a problem hiding this comment.
Now normalization is applied only when both components (mnt_id and inode) are unambiguous:
- 0x... => unambiguous hex
- contains a-f/A-F => unambiguous hex
- digits-only => ambiguous, no normalization
So file[11:17] and file[17:23] keep literal/exact semantics and won’t be cross-matched by numeric normalization.
external_lookup_id() still does exact strcmp() first, then numeric compare only for unambiguous IDs
There was a problem hiding this comment.
@cheese-cakee Were you able to test these changes? It sounds overcomplicated, and I am not sure the logic is correct. Note that AI chatbots can make mistakes.
There was a problem hiding this comment.
I have tested the new changes and I have tried to simplify the implementation :
Normalization only when both mnt_id AND inode are unambiguous:
- 0x... or contains a-f → hex
- digits-only (like 11) → treated as decimal, no normalization
please review if its alright
43b95cb to
6409fc5
Compare
ea45cc5 to
079e01b
Compare
external_lookup_id() uses strict string comparison, so file[0xa:0xa] and file[10:10] (the same file) wont match when one side emits hex and the other decimal. Add a numeric comparison path, but only for unambiguous forms to preserve backward compatibility. The key insight from avagins review: a component like 11 is ambiguous it could mean decimal 11 or hex (0x11 = 17 decimal). Without the explicit 0x prefix, there is no way to know the users intent. file[11:17] and file[17:23] could silently match incorrectly. Parse rules: - 0x... => unambiguous hex (explicit prefix) - contains hex digits (a-f/A-F) => unambiguous hex - digits only => ambiguous, do NOT normalize-match Only normalize when BOTH the mnt_id AND inode components are unambiguous. Ambiguous IDs retain their literal/exact semantics. This narrow fix addresses backward compatibility concerns: - CRIU emits bare hex from %x (e.g., file[a:a]): unambiguous, normalizes - CRIU emits bare digits from %x (e.g., file[10:10]): ambiguous, doesnt normalize - User provides file[0xa:0xa]: unambiguous, matches to stored file[0xa:0xa] - User provides file[10:10]: ambiguous, matches to stored file[10:10] - file[10:10] will NOT normalize-match file[0xa:0xa] because the latter is unambiguous Fixes: checkpoint-restore#2951 Signed-off-by: Farzan Aman Khan <farzanaman99@gmail.com>
079e01b to
aeba1ac
Compare
Why this PR is necessary
external_lookup_id() used strict string comparison, so logically identical file[mnt_id:inode] IDs could fail to
match when represented differently.
What this PR changes
Scope
Testing
git diff --check HEAD~1..HEADpython3 -m py_compile test/zdtm.py test/zdtm/criu_config.py