wrong column pkl filename with DistributionBased matching #53

cchristodoulaki · 2023-04-05T19:37:06Z

Hello, and thank you for the library! 🍾

When using DistributionBased matching, I have the following use case:

I create an instance of the matcher. I have two source tables (source_1 and source_2) and one target table (target).
I call matcher.get_matches(source_1, target). Pickle files for the columns of source_1 and target are written to a temporary directory, e.g. /tmp/tmpkpakbdjz, and the same files are read back with clustering_utils.get_column_from_store. Matches are generated.
I call matcher.get_matches(source_2, target). Pickle files for the columns of source_2 and target are written to a different directory, e.g. /tmp/tmp41gf90n2. However, clustering_utils.get_column_from_store then attempts to read the pkl files created for the columns of source_1 from the directory /tmp/tmp41gf90n2.
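The symptom above can be illustrated with a minimal, self-contained sketch. It does not use valentine's code: ColumnStore, start_run, write, read_all and run_twice are hypothetical names, and the assumed cause (a list of file names that survives between runs while the temporary directory changes) is only one plausible way to end up with this behaviour.

```python
import os
import pickle
import tempfile


class ColumnStore:
    """Hypothetical per-run pickle store (an illustration, not valentine's code)."""

    def __init__(self):
        self.tmp_dir = None
        self.file_names = []  # state that outlives a run unless it is reset

    def start_run(self, reset):
        # Each run writes into a fresh temporary directory.
        self.tmp_dir = tempfile.mkdtemp()
        if reset:
            self.file_names = []

    def write(self, table, column, values):
        file_name = f"{table}__{column}.pkl"
        with open(os.path.join(self.tmp_dir, file_name), "wb") as f:
            pickle.dump(values, f)
        self.file_names.append(file_name)

    def read_all(self):
        # Reads every remembered file name from the *current* directory.
        loaded = {}
        for file_name in self.file_names:
            with open(os.path.join(self.tmp_dir, file_name), "rb") as f:
                loaded[file_name] = pickle.load(f)
        return loaded


def run_twice(reset):
    """Reuse one store for two runs, as one matcher is reused in the report."""
    store = ColumnStore()

    store.start_run(reset)
    store.write("source_1", "id", [1, 2])
    store.write("target", "id", [1, 2])
    store.read_all()  # the first run works either way

    store.start_run(reset)
    store.write("source_2", "id", [3, 4])
    store.write("target", "id", [3, 4])
    try:
        return sorted(store.read_all())
    except FileNotFoundError as exc:
        return f"stale lookup: {os.path.basename(exc.filename)}"


print(run_twice(reset=False))
print(run_twice(reset=True))
```

Without the reset, the second run looks for the source_1 file in the second run's directory; with the reset, only the files written in the second run are read.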


kPsarakis · 2023-04-05T19:53:18Z

Hello! Thank you for reporting this.

I will try to push a fix in the coming days.

kPsarakis · 2023-04-11T10:12:57Z

Hi! I have pushed a fix and published a new release, which can be installed with pip install valentine==0.1.6

Could you check if this solves the issue?

kPsarakis self-assigned this Apr 5, 2023

kPsarakis added the bug label Apr 5, 2023

kPsarakis added a commit that referenced this issue Apr 11, 2023

fix issue #53 reading previous run's files if the filename was the same

0cfc190

kPsarakis closed this as completed Apr 28, 2023
