Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue 15960 - deprecate setUnion in favor of merge #4249

Merged
merged 1 commit into from
Aug 29, 2016

Conversation

wilzbach
Copy link
Member

Have a look at:

http://en.cppreference.com/w/cpp/algorithm/merge
http://en.cppreference.com/w/cpp/algorithm/set_union

-> setUnion should filter for duplicates

[1,2].merge([1,3]) == [1,1,2,3]
[1,2].setUnion([1,3]) == [1,2,3]

I propose to rename setUnion to merge and let setUnion filter (like in C++). Moreover I added more tests and a better ddoc test for the user.

I know that @andralex isn't quite a fan of such "house keeping", but sharing the same name to C++ and doing something different is quite astonishing to the user.
Let's fix this while the usage is still pretty low.

If this get's the OK pulled in, I will add a proper changelog with a long explanation.

@dlang-bot
Copy link
Contributor

Fix Bugzilla Description
15960 SetUnion should filter duplicates

@wilzbach wilzbach force-pushed the fix_15960 branch 4 times, most recently from d0c53b7 to 7439725 Compare April 27, 2016 04:40
@ntrel
Copy link
Contributor

ntrel commented Apr 27, 2016

Personally I think setUnion should be renamed to merge, although this has been raised & defended before; there is also nWayUnion:
https://issues.dlang.org/show_bug.cgi?id=6718#c1

One thing that is definitely out is silently changing the behaviour of setUnion. At the very least it should have one release where setUnionis deprecated.

@wilzbach
Copy link
Member Author

One thing that is definitely out is silently changing the behaviour of setUnion. At the very least it should have one release where setUnionis deprecated.

makes sense. PR now deprecates setUnion in favor of merge.

@wilzbach wilzbach changed the title fix issue 15960 - replace setUnion with merge and let setUnion filter for uniques fix issue 15960 - deprecate setUnion in favor of merge Apr 27, 2016
@wilzbach
Copy link
Member Author

@andralex tag needed, maybe we can even have a special deprecation one?

@andralex
Copy link
Member

Pardon my hazy memory, does setUnion currently preserve dupes or not? Generally I agree it's nice to have correspondence with homonym functions in other libs, but I don't think this adds enough value. One nice thing to do would be to pass a Flag to setUnion to preserve duplicates or not.

@wilzbach
Copy link
Member Author

Pardon my hazy memory, does setUnion currently preserve dupes or not? Generally I agree it's nice to have correspondence with homonym functions in other libs, but I don't think this adds enough value.

The issue was that setUnion in C++ filters for duplicates, whereas in D it doesn't. My argument was that "sharing the same name to C++ and doing something different is quite astonishing to the user, plus the usage is still pretty low."

[1,2].setUnion([1,3]) == [1,1,2,3] // D
[1,2].set_union([1,3]) == [1,2,3] // C++

One nice thing to do would be to pass a Flag to setUnion to preserve duplicates or not.

It would still have a different behavior than C++ in the default case without Flags?

Apart from that I think that the Flags as currently used are not very nice, because templates require order and thus they imply unneeded redundancy when multiple Flags are available, but only one is needed. Thus I would prefer if we move away from this pattern.

@nordlow
Copy link
Contributor

nordlow commented Jun 24, 2016

I agree with @wilzbach that D's setUnion should have same behaviour as C++'s set_union.

I think a deprecation phase is the way forward with this.

I'm fine with an extra Flags parameter for now.

BTW: @wilzbach did you look at my previous rejected PR at #3315?

Note that Github says it's merged eventhough it was rejected.

@dlang-bot
Copy link
Contributor

@wilzbach, thanks for your PR! By analyzing the annotation information on this pull request, we identified @quickfur, @andralex and @WalterBright to be potential reviewers. @quickfur: The PR was automatically assigned to you, please reassign it if you were identified mistakenly.

(The DLang Bot is under development. If you experience any issues, please open an issue at its repo.)

@wilzbach
Copy link
Member Author

I agree with @wilzbach that D's setUnion should have same behaviour as C++'s set_union.
I think a deprecation phase is the way forward with this.

After looking again at this, I still think that introducing merge and deprecate the misleading behavior of setUnion is the best way forward.

To give a quick summary, the issue is that setUnion has a different behavior between C++ and D:

[1,2].setUnion([1,3]) // C++: [1,2,3], D:  [1,1,2,3]

The proposed way out is to rename setUnion to merge, which will have the same behavior as in C++ and trigger deprecation warnings for setUnion:

[1,2].merge([1,3]) // C++/D: 1,1,2,3]

After an appropriate deprecation phase setUnion may be recycled to filter for duplicates as in C++.

BTW: @wilzbach did you look at my previous rejected PR at #3315?

Sorry lost track of this PR. I like, but AFAICT it just adds bidirectionality to setUnion, which we should add independently to this PR ;-)

Last but not least I saw that I am not the only one who complained, on the other PR @lomereiter said:

but maybe functions from std.algorithm.setops should also mirror the respective C++ counterparts and not include duplicated elements in the output.

@andralex
Copy link
Member

Eh, fine

@andralex
Copy link
Member

Auto-merge toggled on

@andralex andralex merged commit 5c5dda3 into dlang:master Aug 29, 2016
@wilzbach wilzbach deleted the fix_15960 branch August 29, 2016 00:26
@JackStouffer
Copy link
Member

JackStouffer commented Aug 29, 2016

@wilzbach merge seems like a duplicate of roundRobin, what's the difference?

@wilzbach
Copy link
Member Author

@wilzbach merge seems like a duplicate of roundRobin, what's the difference?

This was just a rename from setUnion to merge, nothing changed except the name (see above for the motivation).

what's the difference?

The difference between setUnion / mergeand roundRobin is that roundRobin doesn't look at the sortedness, it just makes a "round" between all ranges. A short example:

void main(string[] args)
{
    auto a = [1, 4, 9];
    auto b = [1, 2, 3, 4, 5];

    import std.algorithm, std.range, std.stdio;
    writeln(setUnion(a, b)); // [1, 1, 2, 3, 4, 4, 5, 9]
    writeln(roundRobin(a, b)); // [1, 1, 4, 2, 9, 3, 4, 5]
}

You should also have a look at #3315, which is in fact even more similar to roundRobin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants