Skip to content

fix(cpp): Handle npos in PathToDirectory for S3 URIs without query strings#888

Open
shivendra-dev54 wants to merge 1 commit intoapache:mainfrom
shivendra-dev54:881-pathtodir-can-throw
Open

fix(cpp): Handle npos in PathToDirectory for S3 URIs without query strings#888
shivendra-dev54 wants to merge 1 commit intoapache:mainfrom
shivendra-dev54:881-pathtodir-can-throw

Conversation

@shivendra-dev54
Copy link
Contributor

Reason for this PR

Fixes #881

PathToDirectory was crashing with a std::out_of_range exception when parsing valid S3 URIs that do not contain a query string (?). This happens because find_last_of('?') returns std::string::npos, which was being passed directly into path.substr().

What changes are included in this PR?

Added an explicit check for std::string::npos in PathToDirectory. When an S3 URI without a query string is processed, it now correctly identifies the entire path as the prefix and sets the suffix to an empty string, safely avoiding the out-of-bounds crash.

Are these changes tested?

Yes, the C++ test suite was compiled and run locally, and all tests pass successfully.

Are there any user-facing changes?

No.

Critical Fix: Fixes a bug that causes a crash (std::out_of_range exception) when parsing valid S3 URIs without query strings.

@shivendra-dev54
Copy link
Contributor Author

@Sober7135
check the code and also the title of the PR

@codecov-commenter
Copy link

codecov-commenter commented Feb 28, 2026

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.86%. Comparing base (d535a43) to head (60923ec).

Files with missing lines Patch % Lines
cpp/src/graphar/graph_info.cc 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #888      +/-   ##
============================================
+ Coverage     79.81%   79.86%   +0.04%     
  Complexity      615      615              
============================================
  Files            93       93              
  Lines         10296    10296              
  Branches       1055     1055              
============================================
+ Hits           8218     8223       +5     
+ Misses         1838     1833       -5     
  Partials        240      240              
Flag Coverage Δ
cpp 70.89% <0.00%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Sober7135
Copy link
Contributor

Yes, the C++ test suite was compiled and run locally, and all tests pass successfully.

This is not quite accurate, since no test case was added for this bug.

Testing this bug is also somewhat complicated, because PathToDirectory is a static helper inside an anonymous namespace rather than a public API. In practice, we cannot test it directly in the current test suite.

More importantly, this bug is only exercised through GraphInfo::Load(const std::string& path): Load() first reads the graph metadata file, and then calls PathToDirectory(path). So to reproduce this bug in a test, we need to drive the full GraphInfo::Load code path rather than test the helper in isolation.

Result<std::shared_ptr<GraphInfo>> GraphInfo::Load(const std::string& path) {
std::string no_url_path;
GAR_ASSIGN_OR_RAISE(auto fs, FileSystemFromUriOrPath(path, &no_url_path));
GAR_ASSIGN_OR_RAISE(auto yaml_content,
fs->ReadFileToValue<std::string>(no_url_path));
GAR_ASSIGN_OR_RAISE(auto graph_meta, Yaml::Load(yaml_content));
std::string default_name = "graph";
std::string default_prefix = PathToDirectory(path);
no_url_path = PathToDirectory(no_url_path);
return ConstructGraphInfo(graph_meta, default_name, default_prefix, fs,
no_url_path);
}

I can think of two possible solutions:

  • Set up a mock S3 server so that we can exercise the full workflow and reliably trigger this bug.
  • Expose this function — for example, move it into a namespace like graphar::details / graphar::internal, and avoid installing that header into the user system.

Actually, I am not familiar with this kind of problem.

Any advice? CC @yangxk1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PathToDirectory can throw for valid S3 URIs without ?

3 participants