Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of SpatialJoin #1248

Merged
merged 63 commits into from
Sep 28, 2024

Conversation

Jonathan24680
Copy link
Collaborator

@Jonathan24680 Jonathan24680 commented Jan 28, 2024

Implement a an Operation SpatialJoin that takes two inputs, one variable per input, and a maximal distance d. This operation joins each row pair [left, right] where the spatial distance between the value of the two variables is less or equal than d.
It can currently be called by adding a triple using the special predicate <max-distance-in-meters:x>, where x is a positive integer that is used as the maximal distance. For example:

SELECT * WHERE {
 ?a <is-a> <tram-stop> .
 ?b <is-a> <restaurnat> .
 ?a <max-distance-in-meters:100> ?b
}

Current limitations (will be fixed in the future)

  • Only supports points (no linestrings, polygons, etc).
  • The implementation is a simple quadratic loop over the cartesian product of both inputs.
  • The spatial join should not be called via a special predicate, but via FILTER(geo:distance(?a, ?b) < 100).

Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR. Three things I noted from looking over it quickly:

  1. In SpatialJoin::computeResult you have two very similar code blocks for computing the Cartesian product twice (with and without condition). You probably want a common implementation for this (with a template parameter for the filtering condition).

  2. Please use // ... for comments and not /* ... */

  3. A newline is missing at the end of some of the new files.

P.S. No need to call the fork on your account qlever-fork, you can just call it qlever.

Copy link

codecov bot commented Jan 28, 2024

Codecov Report

Attention: Patch coverage is 87.90323% with 30 lines in your changes missing coverage. Please review.

Project coverage is 88.13%. Comparing base (85793e3) to head (de2c3f4).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
src/engine/SpatialJoin.cpp 87.43% 14 Missing and 12 partials ⚠️
src/engine/QueryPlanner.cpp 88.57% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #1248    +/-   ##
========================================
  Coverage   88.12%   88.13%            
========================================
  Files         357      359     +2     
  Lines       26764    27012   +248     
  Branches     3606     3649    +43     
========================================
+ Hits        23587    23808   +221     
- Misses       1941     1954    +13     
- Partials     1236     1250    +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed everything but the computeResult and the VarColumnMaps.
Please also fix the build failures (mostly the warnings) and please start writing unit tests, start with your spatialJoin class (We have several examples in tests for other operations as to how you can input stuff, otherwise contact me).

src/engine/QueryPlanner.h Outdated Show resolved Hide resolved
src/engine/QueryPlanner.cpp Outdated Show resolved Hide resolved
src/engine/QueryPlanner.cpp Outdated Show resolved Hide resolved
src/engine/QueryPlanner.cpp Outdated Show resolved Hide resolved
src/engine/QueryPlanner.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some additional requests for the missing functions.

src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A round of requests on everything but SpatialJoinTest (which I will look at next).

Comment on lines 66 to 70
std::shared_ptr<QueryExecutionTree> onlyForTestingGetLeftChild() const {
return childLeft_;
}

std::shared_ptr<QueryExecutionTree> onlyForTestingGetRightChild() {
std::shared_ptr<QueryExecutionTree> onlyForTestingGetRightChild() const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is const then it should return shared_ptr<const QueryExecutionTree> ("shallow constness")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't work as then the call onlyForTestingGetRightChild()->getSizeEstimate() doesn't work, as this operation of the superclass isn't const.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, but then you probably shouldn't mark this function const. If sonarcloud complains, that is fine in this case, as it is not smarter than the compiler.

src/engine/SpatialJoin.h Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
test/QueryPlannerTest.cpp Outdated Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A review of SpatialJoinTest.cpp up to line 1116

test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
#include "engine/SpatialJoin.h"
#include "parser/data/Variable.h"

using namespace ad_utility::testing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the global helpers that only appear in a .cppfile should be in the anonymous namespace

namespace {
// functions and variables here
}

This makes them inaccessible outside this .cpp files and avoids linker problems.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i have done this (but named the namespace localTestHelpers). I'm not completely sure, if i have done this correctly. I have (already before this comment) created a lot of namespaces to structure the test file. The only functions which weren't contained in a namespace, where the ones in the beginning. The reason for the many namespaces is the following: if you are developing a test for a certain function, you can collapse all namespaces, which aren't required for this function, which minimizes the amount of scrolling through the source code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just found out, that you can do both: (and should)

namespace { // avoid linking problems
namespace localTestHelpers { // because the file is long
// put stuff here
void f() {return 42;}
}
}

// outside of namspace, you can use localTestHel0pers:::f()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the TEST(...) macros need to be outside the anonymouse namespace to work correctly.

test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Outdated Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Show resolved Hide resolved
test/engine/SpatialJoinTest.cpp Show resolved Hide resolved
expectedMaxDist10000000_rows_diff, columnNames, false);
}

} // end of Namespace computeResultTest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked at this until here.
I see you put a lot of effort into those tests which I like very much.
The last part (the manual setting up of inputs) is okay like this, because it is crucial for the correctness, so it might be a bit longer.

I will continue to look at the remainder (but you can also apply the same principles for your remaining tests).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :)
so the comment from above with the refactoring of the setting up of inputs doesn't apply anymore?

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now much cleaner.

src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoin.h Outdated Show resolved Hide resolved
#include "engine/SpatialJoin.h"
#include "parser/data/Variable.h"

using namespace ad_utility::testing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just found out, that you can do both: (and should)

namespace { // avoid linking problems
namespace localTestHelpers { // because the file is long
// put stuff here
void f() {return 42;}
}
}

// outside of namspace, you can use localTestHel0pers:::f()

#include "engine/SpatialJoin.h"
#include "parser/data/Variable.h"

using namespace ad_utility::testing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the TEST(...) macros need to be outside the anonymouse namespace to work correctly.

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this initial infrastructure, I am looking forward to the upcoming more efficient implementations.

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this initial infrastructure, I am looking forward to the more efficient implementations.

Copy link

sonarcloud bot commented Sep 27, 2024

@joka921 joka921 changed the title infrastructure for the SpatialJoin class First implementation of SpatialJoin Sep 28, 2024
@joka921 joka921 changed the title First implementation of SpatialJoin Initial implementation of SpatialJoin Sep 28, 2024
@joka921 joka921 merged commit 2e0ac35 into ad-freiburg:master Sep 28, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants