Implement a generalisation for nodematch() that tests if the two nodes have any attributes or properties in common or counts them? #481

Open
@krivit

Description

Term description

This stems from this question on Stack Overflow. Generalising it, suppose that each node i has some set A[i] of properties (I am avoiding "attributes", since we use that term elsewhere). We wish to specify a dyadic predictor that, in pseudocode, can be represented as x[i,j] = length(intersect(A[i], A[j])) (the number of properties i and j have in common) or as x[i,j] = length(intersect(A[i], A[j])) > 0 (whether i and j have any properties in common).

Some examples:

  • A[i] is the set of languages i speaks, and we wish to use an indicator of whether i and j speak at least one common language as a predictor of their interaction. (This is from Stack Overflow.)
  • A[i] is a list of i's hobbies, and we wish to use the number of hobbies i and j have in common to predict acquaintance.
  • A[i] is a list of places i visited over the course of a day (e.g., from a contact diary), and we wish to use the number of common areas visited by i and j to predict whether they had a contact.

This seems like something that can be useful in a variety of circumstances.
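
To make the two pseudocode predictors concrete, here is a minimal sketch in plain Python (used only for illustration; it is not ergm code). The function name `shared_property_matrix`, the `binary` flag, and the example data are all hypothetical, and the diagonal is left at 0 on the assumption that self-ties are not of interest.

```python
from itertools import combinations


def shared_property_matrix(props, binary=False):
    """Build a symmetric n-by-n matrix (list of lists) of shared-property scores.

    x[i][j] is the number of properties i and j have in common, or a 0/1
    indicator of any overlap when binary=True. The diagonal stays at 0.
    """
    n = len(props)
    sets = [set(p) for p in props]
    x = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        common = len(sets[i] & sets[j])
        value = int(common > 0) if binary else common
        x[i][j] = x[j][i] = value
    return x


# Hypothetical data: the languages spoken by four actors.
languages = [
    {"en", "fr"},
    {"fr", "de", "en"},
    {"de"},
    set(),
]

counts = shared_property_matrix(languages)
any_common = shared_property_matrix(languages, binary=True)
```

Either matrix could then be handed to a dyadic-covariate term; a dedicated term would mainly save the user from building it by hand.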

A further generalisation of this concept is to make A[i] a mapping that maps property k to some value (e.g., proficiency in a language) so that, e.g., x[i,j] = max[k](min(A[i][k], A[j][k])) (or some other "interacting" and "combining" functions in place of min() and max[k](), respectively). In the language example, this predictor represents the proficiency of the less-proficient actor in the two actors' best common language (where "best common language" is the language in which the less-proficient actor has the highest proficiency).
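
A rough sketch of the mapping version, again in plain Python for illustration only. The names `mapped_property_matrix`, `interact`, `combine`, and `default` are hypothetical, as is the data. The sketch assumes the interacting function is symmetric in its two arguments, and that a pair with no property in common gets the `default` value.

```python
def mapped_property_matrix(maps, interact=min, combine=max, default=0):
    """Build a symmetric n-by-n matrix from per-node property -> value mappings.

    For each pair (i, j), apply `interact` to the two values of every property
    both nodes have, then reduce the results with `combine`. Pairs with no
    shared property, and the diagonal, keep `default`.
    """
    n = len(maps)
    x = [[default] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = maps[i].keys() & maps[j].keys()
            if shared:
                value = combine(interact(maps[i][k], maps[j][k]) for k in shared)
                x[i][j] = x[j][i] = value
    return x


# Hypothetical data: language proficiency on an arbitrary 1-5 scale.
proficiency = [
    {"en": 5, "fr": 2},
    {"en": 3, "fr": 4},
    {"fr": 1, "de": 5},
]

# Proficiency of the less-proficient actor in the pair's best common language.
best_common = mapped_property_matrix(proficiency)

# A different combining function: total "minimum proficiency" over shared languages.
total_common = mapped_property_matrix(proficiency, combine=sum)
```

With all values equal to 1, `combine=sum` reduces to the count of common properties and `combine=max` to the any-in-common indicator, so the set-based predictors are special cases of this form.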

In all cases, this would be a dyad-independent term, so it is in principle representable with edgecov() by precomputing the matrix of x[i,j] values.

Questions

  1. How broadly useful would this be? I suspect @CarterButts and @mbojan might have some applications I haven't thought of.
  2. Would the generalisation to a mapping be useful? What "interacting" and "combining" functions would be useful?
  3. What would be an efficient way to implement these?
  4. What kind of a user interface (required data format and syntax) would we want for this term?
