Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEATURE: add tools for merging of maps #4017

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

hosuaby
Copy link

@hosuaby hosuaby commented Sep 13, 2020

The present PR adds methods for merging of maps.

Merging of maps is an operation that combines content of two or more maps into a new map while handles entries with duplicate key. Example:

    Map<String, Set<String>> tags1 = ImmutableMap.of(
            "guava", newHashSet("java", "collections"),
            "junit", newHashSet("tests"));
    Map<String, Set<String>> tags2 = ImmutableMap.of(
            "spring", newHashSet("di", "boot"),
            "junit", newHashSet("runner", "platform"));

    Map<String, Set<String>> merged = Maps.merge(tags1, tags2, Sets::union);

    // Result: {
    //    "guava": ["java", "collections"]
    //    "spring": ["di", "boot"]
    //    "junit": ["tests", "runner", "platform"]
    // }

The present PR add new methods to class Maps:

  • Map merge(firstMap, secondMap, mergeFunction) - merges two maps
  • Map mergeAll(mergeFunction, firstMap, Map... otherMaps) - merges multiple maps, using signature with varagrs
  • Map mergeAll(mergeFunction, Iterable maps) - merges multiple maps
  • Collector mergeCollector(Collector<V, ?, V> downstream) - creates a collector able to merge stream of maps

Real life use-case:
Assuming we have multiple data sources that returns data in form of maps. We would like to aggregate those data by merging those maps. Unit tests give us an example using measurements provided by different sources.

@jbduncan
Copy link
Contributor

@hosuaby I'm not a member of the Guava team or even a Googler, just so that you're aware, but have you considered merging your maps together like the following?

// If you're able to turn your Map<String, Set<String>>s into SetMultimap<String, String>s beforehand:
var merged = 
    ImmutableSetMultimap.builder()
        .putAll(multimapTags1)
        .putAll(multimapTags2)
        .build();

// Otherwise:
var merged =
    Stream.of(tags1, tags2)
        .flatMap(tags -> tags.entries().stream())
        .collect(toImmutableSetMultimap(Entry::getKey, Entry::getValue));

@jbduncan
Copy link
Contributor

Oops, sorry, I didn't realise that one of my examples above prints the wrong value! Here's my updated code snippet:

// If you're able to turn your Map<String, Set<String>>s into SetMultimap<String, String>s beforehand:
var merged =
    ImmutableSetMultimap.builder()
        .putAll(multiTags1)
        .putAll(multiTags2)
        .build();

// Otherwise:
var merged =
    Stream.of(tags1, tags2)
        .flatMap(tags -> tags.entrySet().stream())
        .collect(flatteningToImmutableSetMultimap(Entry::getKey, entry -> entry.getValue().stream()));

@hosuaby
Copy link
Author

hosuaby commented Sep 13, 2020

Hello, @jbduncan
Thanks for review!

It's all the question of simplicity. In your example, the user must "prepare" the maps by converting them into SetMultimap beforehand. Feature provided by present PR doesn't have such requirements. All methods accept input maps "as is", mutable or immutable.

The second point, in your example, We are looking to merge maps of Map<K, Set<V>> or Map<K, List<V>>. Methods in the present PR are able to merge values of any compatible types. It unit tests, I merge maps of Map<String, ? extend Number>.

But, thank you for your suggestion of MultiMaps. I refactored implementation of methods using ImmutableListMultiMap and it became much shorter.

If you have, any questions, suggestions, objections do not hesitate to comment. I try my best to make those tools the most generic and convenient to as much use-cases as possible.

@jbduncan
Copy link
Contributor

@hosuaby Okay, thanks for explaining!

If or when the Guava team get around to reviewing this, the first thing that they'll probably say is they'll need to internally discuss the API first, as mentioned in the contribution docs and the project philosophy. And even then there's no guarantee it will end up in Guava, because they may think that existing APIs, like in my example above, are good enough.

Just so that you're warned. :)

@ogregoire
Copy link

The biggest question would come from the resolution of conflicts. You are conveniently speaking about Map<X, Set<Y>> which has a rather trivial resolution, but what about any Map<X, Collection<Y>>. Do you want one entry, several entries?

I believe that you'd be best using Maps::difference. Ok, it's just a bit of work compared to a simple helper method, but it'll get your work done rather easily.

@hosuaby
Copy link
Author

hosuaby commented Sep 14, 2020

@ogregoire That's what mergeFunction made for. We can specify which collection we want (in case, we merge Map<X, Collection<Y>>:

  public void testGithub() {
    Map<String, Collection<String>> tags1 = ImmutableMap.of(
            "guava", newHashSet("java", "collections"));

    Map<String, Collection<String>> tags2 = ImmutableMap.of(
            "guava", newHashSet("optionals", "java", "library"));

    Map<String, Collection<String>> asSet = Maps.merge(tags1, tags2, (c1, c2) -> newHashSet(Iterables.concat(c1, c2)));
    assertThat(asSet.get("guava")).hasSize(4);
    assertThat(asSet.get("guava")).containsExactly("java", "collections", "optionals", "library");

    Map<String, Collection<String>> asList = Maps.merge(tags1, tags2, (c1, c2) -> newArrayList(Iterables.concat(c1, c2)));
    assertThat(asList.get("guava")).hasSize(5);
    assertThat(asList.get("guava")).containsExactly("java", "collections", "optionals", "java", "library");
  }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants