Skip to content

fix(cpp): Handle npos in PathToDirectory for S3 URIs without query strings#888

Open
shivendra-dev54 wants to merge 7 commits intoapache:mainfrom
shivendra-dev54:881-pathtodir-can-throw
Open

fix(cpp): Handle npos in PathToDirectory for S3 URIs without query strings#888
shivendra-dev54 wants to merge 7 commits intoapache:mainfrom
shivendra-dev54:881-pathtodir-can-throw

Conversation

@shivendra-dev54
Copy link
Contributor

Reason for this PR

Fixes #881

PathToDirectory was crashing with a std::out_of_range exception when parsing valid S3 URIs that do not contain a query string (?). This happens because find_last_of('?') returns std::string::npos, which was being passed directly into path.substr().

What changes are included in this PR?

Added an explicit check for std::string::npos in PathToDirectory. When an S3 URI without a query string is processed, it now correctly identifies the entire path as the prefix and sets the suffix to an empty string, safely avoiding the out-of-bounds crash.

Are these changes tested?

Yes, the C++ test suite was compiled and run locally, and all tests pass successfully.

Are there any user-facing changes?

No.

Critical Fix: Fixes a bug that causes a crash (std::out_of_range exception) when parsing valid S3 URIs without query strings.

@shivendra-dev54
Copy link
Contributor Author

@Sober7135
check the code and also the title of the PR

@codecov-commenter
Copy link

codecov-commenter commented Feb 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.67%. Comparing base (163123f) to head (0a634c1).

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #888      +/-   ##
============================================
+ Coverage     76.83%   80.67%   +3.84%     
  Complexity      615      615              
============================================
  Files            84       94      +10     
  Lines          8897    10707    +1810     
  Branches       1055     1055              
============================================
+ Hits           6836     8638    +1802     
- Misses         1821     1829       +8     
  Partials        240      240              
Flag Coverage Δ
cpp 71.04% <100.00%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Sober7135
Copy link
Contributor

Yes, the C++ test suite was compiled and run locally, and all tests pass successfully.

This is not quite accurate, since no test case was added for this bug.

Testing this bug is also somewhat complicated, because PathToDirectory is a static helper inside an anonymous namespace rather than a public API. In practice, we cannot test it directly in the current test suite.

More importantly, this bug is only exercised through GraphInfo::Load(const std::string& path): Load() first reads the graph metadata file, and then calls PathToDirectory(path). So to reproduce this bug in a test, we need to drive the full GraphInfo::Load code path rather than test the helper in isolation.

Result<std::shared_ptr<GraphInfo>> GraphInfo::Load(const std::string& path) {
std::string no_url_path;
GAR_ASSIGN_OR_RAISE(auto fs, FileSystemFromUriOrPath(path, &no_url_path));
GAR_ASSIGN_OR_RAISE(auto yaml_content,
fs->ReadFileToValue<std::string>(no_url_path));
GAR_ASSIGN_OR_RAISE(auto graph_meta, Yaml::Load(yaml_content));
std::string default_name = "graph";
std::string default_prefix = PathToDirectory(path);
no_url_path = PathToDirectory(no_url_path);
return ConstructGraphInfo(graph_meta, default_name, default_prefix, fs,
no_url_path);
}

I can think of two possible solutions:

  • Set up a mock S3 server so that we can exercise the full workflow and reliably trigger this bug.
  • Expose this function — for example, move it into a namespace like graphar::details / graphar::internal, and avoid installing that header into the user system.

Actually, I am not familiar with this kind of problem.

Any advice? CC @yangxk1

@shivendra-dev54
Copy link
Contributor Author

shivendra-dev54 commented Mar 3, 2026

@Sober7135
I have added test as you mentioned in the 2nd possible solution

here is what I did

  • moved the PathToDirectory function to namespace util to expose it for tests.
  • updated util.h with the function declaration.
  • added test as you suggested.
  • added it to the cpp/test/CMakeLists.txt.
  • checked with the new tests.

But I have added that function to graphar::util namespace
tell me if you want me to change that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PathToDirectory can throw for valid S3 URIs without ?

3 participants