Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional iterator to _.uniq #194

Merged
merged 1 commit into from
Aug 3, 2011
Merged

Conversation

albemuth
Copy link
Contributor

@albemuth albemuth commented May 4, 2011

Hi,

I needed to be able to pass in an iterator to _.uniq like in ruby's Array.uniq to do things like:

_.uniq([{name:'moe'}, {name:'curly'}, {name:'larry'}, {name:'curly'}], false, function (value) { return value.name; })
=>[{name:'moe'}, {name:'curly'}, {name:'larry'}]

Added test and updated docs too, any feedback welcome.

Cheers!
Alfredo Mesén

@bradleybuda
Copy link

+1

2 similar comments
@edtsech
Copy link

edtsech commented Jul 7, 2011

+1

@fgalassi
Copy link

fgalassi commented Aug 2, 2011

+1

@jimschubert
Copy link

This looks excellent

@ngauthier
Copy link

+1

jashkenas added a commit that referenced this pull request Aug 3, 2011
@jashkenas jashkenas merged commit b930716 into jashkenas:master Aug 3, 2011
@jashkenas
Copy link
Owner

It's a shame about the api, having isSorted before iterator ... but I don't think it's worth breaking backwards-compatibility over this.

@ngauthier
Copy link

Alternatively, it could become uniqBy (like sortBy) and then you could deprecate uniq.

@andrewplummer
Copy link

@albemuth I'm curious about the use case here... uniquing on a property returns the first object for which that property uniquely occurs, yes? If that is the case, you have no guarantee that the other omitted data will be unique or not. I can understand needing a unique array of the properties themselves, but in that case you would simply map the properties into their own array first using _.map, no?

@ngauthier
Copy link

@adrewplummer no, uniquing on a property returns all the objects that are unique when compared using that property. So in the example code at the top, he is uniquing object by their name (as opposed to the default which would be equality between the objects themselves).

@andrewplummer
Copy link

I guess I'm not clear on the phrase "all the objects that are unique". What if the objects aren't unique by the name property but are unique by other properties? Examples are worth a thousand words:

var people = [ { name: 'John', age: 20 }, { name: 'Mary', age: 31 }, { name: 'Kevin', age: 20 }]; 
_.uniq(people, false, function(p){ return p.age; });

I was going to ask "what happens when I do this", but realized I should just try for myself... as I thought, it returns only "John" and "Mary". So I'm just curious what the use case is of needing just John here (or maybe a better example if mine is too contrived)? Now of course I could understand wanting the ages themselves uniqued, but in that case this would seem more appropriate:

_.uniq(_.map(people, function(p){ return p.age; }));

@ngauthier
Copy link

often it's not the case that the attributes differ but the primary key (name, in this case) do not. The usual use case is:

var people = [{ name: 'John', age: 20 }, { name: 'John', age: 20 }, { name: 'Kevin', age: 20 }];
_.uniq(people)
// =>  [{ name: 'John', age: 20 }, { name: 'John', age: 20 }, { name: 'Kevin', age: 20 }]
_.uniq(people, false, function(p){ return p.name; });
// => [{ name: 'John', age: 20 }, { name: 'Kevin', age: 20 }]

It's really useful to specify the key when uniquing objects because uniq uses === so two objects that are equivalent are not actually uniq. It would inefficient to do a deep compare when as a programmer we know what the primary key is.

@andrewplummer
Copy link

Ah I see. But this kind of case does make the assumption that the objects are identical if the keys they are uniqueing on are identical, is that not true? I suppose that would follow from the "primary key" paradigm. But in your experience, would it really be so common to have multiple identical entries in the same array?

@ngauthier
Copy link

Yes.

Although the true power here is allowing the programmer to define their own comparator so they can choose what is important and what is not.

Note that uniq calls include and include uses === to compare. In another language, we could redefine the comparator between objects, but in javascript we can't override ===. Meaning we can't define object comparison.

Allowing uniq to accept a function as a comparator lets us do this.

@andrewplummer
Copy link

Ok... interesting... that makes a lot more sense now... thanks!

@ngauthier
Copy link

no problem! I think I learned more about underscore and javascript comparison in the process. :-)

@davidgoli
Copy link

It's too bad this iterator is idiosyncratic and not a comparison iterator like Javascript's native sort iterator, otherwise it would be so easy to use underscore's own comparison methods like:

_.uniq([{'foo': 'bar', 'a':1}, {'foo': 'bar', 'a':1}, {'foo': 'bar', 'a': 2}], false, _.isEqual)

As it is, you have to "serialize" the object in order to have a single value type to return, eg.:

function (item) { return item.foo + item.a; }

Probably too late now, though you could use iterator.arguments.length to see if the iterator expects 1 or 2 arguments.

@jdalton
Copy link
Contributor

jdalton commented Sep 22, 2012

@davidgoli I really dig that use case but it's trickier because currently the iterator for unique is passed 3 arguments, value, index, and array.

@CWSpear
Copy link

CWSpear commented Feb 28, 2013

@davidgoli I was thinking that getting it to work with _.isEqual was how it was supposed to work, and so it totally threw me off my game. Finally, I found myself here and your idea to serialize it and it was kind of a lightbulb moment, haha. Thanks for your comment!

An implementation like you suggest, if possible, would make it a lot more powerful, because then you could compare any array of collections instead of having to know what the keys are, etc.

@jdalton
Copy link
Contributor

jdalton commented Feb 28, 2013

Comparing like Array#sort would be expensive, also I'm not entirely sure how that would work. Uniqueness isn't between two values its uniqueness of elements in the entire array regardless of element order.

@CWSpear
Copy link

CWSpear commented Feb 28, 2013

Oh yeah, I am aware of the cost. I just had a relatively small array with objects with three fields. The concatenation situation is perfectly adequate, and yes, faster. It would just be nifty with isEqual, haha.

@jdalton
Copy link
Contributor

jdalton commented Feb 28, 2013

@CWSpear Ah! So like a _.uniqWhere instead of strict equality using object comparisons to simply like another object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.