UNION should propagate identical input attributes to its output #270
Description
Consider the following case:

        (a, b, c)
         /     \
        .       .
    SELECT a=   .
        .       .
        .   SELECT b=
        .       .
         \     /
          UNION
            |
      (a', b', c')

The UNION combines two data frames that are created from the same shared data frame (a, b, c) by applying different transformation chains.
To preserve both the left and right paths in the lineage of the resulting (a', b', c') data frame, we create new synthetic attributes for the UNION output and connect each of them to all corresponding attributes in the input data frames.
The problem arises when some of those input attributes are identical. In the example above, attribute c isn't touched by either branch and is simply propagated everywhere. In the resulting data frame, the attribute c is therefore expected to appear as itself, not as c' = f(c), which is what Spline currently shows.
Technically speaking, c' = f(c) is not an incorrect representation: f could be the identity function, in which case the expression reduces to c' = c, as expected. But it creates unnecessary clutter in the graph structure, with duplicated intermediate attribute names that make the lineage difficult to reason about.
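The expected behavior can be illustrated with a minimal Python sketch. It is not Spline's actual code: the `Attribute` class, the `union_output` function, and the id counter are hypothetical names introduced only for this example, and the sketch assumes the two UNION inputs are aligned by position.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Attribute:
    """Hypothetical lineage attribute; `sources` holds the attributes it derives from."""
    id: int
    name: str
    sources: tuple = ()


# Hypothetical id generator for synthetic attributes.
_ids = count(100)


def union_output(left, right):
    """Build UNION output attributes from two positionally aligned input lists."""
    output = []
    for l, r in zip(left, right):
        if l == r:
            # The same attribute arrives on both branches: propagate it unchanged.
            output.append(l)
        else:
            # The branches diverged: create a synthetic attribute linked to both inputs.
            output.append(Attribute(next(_ids), l.name + "'", (l, r)))
    return output


# Shared source data frame (a, b, c).
a, b, c = Attribute(1, "a"), Attribute(2, "b"), Attribute(3, "c")
a_left = Attribute(4, "a", (a,))    # redefined by the left SELECT
b_right = Attribute(5, "b", (b,))   # redefined by the right SELECT

out = union_output([a_left, b, c], [a, b_right, c])
print([attr.name for attr in out])
```

In this sketch the first two output attributes are synthetic (a' and b'), each linked to both of its inputs, while the third is the original c itself, with no intermediate c' node in the graph.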