Assertions prevent using joindiff on bigquery tables in different projects, even though it works fine. #302
Description
Request: If I explicitly call data_diff.diff_tables and pass in algorithm=data_diff.Algorithm.JOINDIFF, it'd be wonderful if the tool would let me use it, even if it thinks I'm applying it to two different DBs (maybe just a warning message that I can ignore?).
Context:
In BigQuery you can always (permissions aside) join tables in different datasets and projects. However, since data_diff considers two different projects to be two different databases, a few of the assertions in JoinDiff fail.
Unlike in Snowflake or Postgres, where prod and dev would be different schemas in the same DB, when using BigQuery I've found it to be pretty common to have a prod project and then each dev gets their own playground project.
I went in and commented out three assertions (and had to change the default value of SegmentInfo.rowcounts to {1: 0, 2: 0}), but once that was done the code worked great and was substantially faster than the hash diff.
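To make the "warning I can ignore" idea concrete, here is a rough sketch of the behavior being asked for. It is not data_diff's actual code: `check_same_database` and its `strict` flag are hypothetical names made up for illustration, and `ValueError` stands in for the failing assertion.

```python
import warnings


def check_same_database(db1_name: str, db2_name: str, strict: bool = True) -> bool:
    """Hypothetical helper: return True if both names match.

    On a mismatch, raise when strict is True, otherwise warn and return False.
    """
    if db1_name == db2_name:
        return True

    message = (
        f"JOINDIFF requested across {db1_name!r} and {db2_name!r}, "
        "which look like different databases"
    )
    if strict:
        # Stand-in for the hard failure described in this issue.
        raise ValueError(message)

    # Requested behavior: tell the caller, then let them proceed at their own risk.
    warnings.warn(message, UserWarning, stacklevel=2)
    return False


if __name__ == "__main__":
    # Two BigQuery projects (made-up names) that can still be joined in one query.
    check_same_database("prod-project", "dev-project", strict=False)
```

With something like this, an explicit request for JOINDIFF could map to `strict=False`, while the automatic algorithm choice could keep the strict check.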