wrong column pkl filename with DistributionBased matching #53

Closed
cchristodoulaki opened this issue Apr 5, 2023 · 2 comments

@cchristodoulaki

Hello, and thank you for the library! 🍾

When using DistributionBased matching, I have the following use case:

  • I create an instance of the matcher. I have two source tables (source_1 and source_2), and one target table target.
  • I call matcher.get_matches(source_1, target). Pickle files for columns of source_1 and target tables are written to e.g., /tmp/tmpkpakbdjz, and the same files are read back with clustering_utils.get_column_from_store. Matches are generated.
  • I call matcher.get_matches(source_2, target). Pickle files for the columns of the source_2 and target tables are written to e.g. /tmp/tmp41gf90n2. HOWEVER, clustering_utils.get_column_from_store attempts to read the pkl files created for the columns of source_1 from directory /tmp/tmp41gf90n2.
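
The symptom in the steps above can be illustrated with a small, self-contained toy. It is a sketch under assumptions, not valentine's code: ToyMatcher, its column_names attribute, and the dict-based tables are hypothetical names invented for illustration, and the stale per-instance state is only one possible way such behaviour could arise. Here get_matches returns the pickle names that could not be found, rather than real matches.

```python
import pickle
import tempfile
from pathlib import Path


class ToyMatcher:
    """Hypothetical stand-in for a matcher; NOT valentine's implementation."""

    def __init__(self):
        # Assumption: column names are remembered on the instance between calls.
        self.column_names = []

    def get_matches(self, source, target):
        """Write one pickle per column, then report remembered columns that are missing."""
        # Each call gets a fresh temporary directory, as in the report.
        with tempfile.TemporaryDirectory() as tmp_dir:
            store = Path(tmp_dir)
            for table in (source, target):
                for name, values in table.items():
                    with open(store / f"{name}.pkl", "wb") as fh:
                        pickle.dump(values, fh)
                    # The list is never reset, so names from earlier calls pile up.
                    self.column_names.append(name)
            # Look up every remembered column in the *current* directory only.
            missing = [
                name for name in self.column_names
                if not (store / f"{name}.pkl").exists()
            ]
        return missing


source_1 = {"s1_col": [1, 2]}
source_2 = {"s2_col": [3, 4]}
target = {"t_col": [5, 6]}

matcher = ToyMatcher()
first = matcher.get_matches(source_1, target)
second = matcher.get_matches(source_2, target)
print(first)   # []
print(second)  # ['s1_col']
```

In this toy, the second call looks for the source_1 pickle in the new temporary directory, which mirrors the behaviour described above. Creating a new matcher instance per call avoids the problem in the toy; whether that also works as a workaround for the library itself is not confirmed in this thread.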
@kPsarakis kPsarakis self-assigned this Apr 5, 2023
@kPsarakis kPsarakis added the bug Something isn't working label Apr 5, 2023
@kPsarakis
Member

Hello! Thank you for reporting this.

I will try to push a fix in the coming days.

@kPsarakis
Member

Hi! I have pushed a fix and made a new release, which can be installed with: pip install valentine==0.1.6

Could you check if this solves the issue?
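
Before re-running the use case, it may help to confirm which release is actually installed. The following is a minimal sketch using only the standard library; parse_release and at_least are hypothetical helper names, and the comparison only looks at the leading numeric components of the version string.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for part in text.split("."):
        if not part.isdigit():
            break  # stop at the first non-numeric component (e.g. 'post1')
        parts.append(int(part))
    return tuple(parts)


def at_least(installed, minimum="0.1.6"):
    """True if `installed` is the same as or newer than `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("valentine")
    status = "is at least 0.1.6" if at_least(installed) else "is older than 0.1.6"
    print(f"valentine {installed} {status}")
except PackageNotFoundError:
    print("valentine is not installed in this environment")
```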
