Venn Diagram Syntax #2583

Open
arjansingh opened this issue Dec 22, 2021 · 90 comments · May be fixed by #5932

Comments

@arjansingh

Problem

I'm always frustrated when I have to manually create Venn diagrams of more than 4 intersecting circles. It might sound trivial, but arranging complex Venn diagrams in a legible manner is annoying and hard with most of the standard diagramming tools I've used.

Request

This was already brought up many years ago in #200, but a Venn diagram syntax in Mermaid would be great.

Venn Diagrams are basically just a visualization of set theory. So the syntax would just have to implement basic set operations:

  • Unions
  • Intersections
  • Complements
  • Differences

Potential Syntax

Something like the following could be a good syntax:

vennDiagram
  set A
  set B
  set C

  intersect A B
  union B C
  relativeComplement A C

I know everyone is busy. I am just wondering what the general interest is from the community on prioritizing this versus other new diagram formats.

@arjansingh added the labels "Status: Triage" (needs to be verified, categorized, etc.) and "Type: Enhancement" (new feature or request) on Dec 22, 2021
@wolfspyre

It would be useful, for sure. There have been multiple times when I looked to see if the gods had finally conspired to make Venn diagrams in mermaid a thing…

that being said is venn more useful than…..
That, idk

@arjansingh
Author

Yup, I agree. I think a useful strategy here would be to ask anyone who wants it to 👍 the issue. If it's got enough community interest, I'd be happy to look at thinking it through more thoroughly in 2022 (among the million other things I gotta do 😂 )

@deslee

deslee commented Jan 22, 2022

I would love venn diagrams :) I'd be interested in helping when I start having more free time

@wilsonjholmes

Yes! Venn diagrams would be lovely!

@bobturneruk

I, too, would like Venn diagrams!

@serpro69

+1 this would be a killer feature.

@philbo87

philbo87 commented Mar 4, 2022

+1 I am also interested!

@MathiasSven

+1 for sure!

@MoniqueSmiling

+1 Would be amazing.

@mathiasgheno

mathiasgheno commented Apr 3, 2022

+1 I was writing some documentation in Notion and asked myself if Mermaid could help me with that. It would be awesome to have this feature.

@kimploo

kimploo commented Apr 18, 2022

+1, any workaround?

@PogiNate

+1, this would be helpful so frequently!

@JonLoose

JonLoose commented May 1, 2022

+1

@joniek

joniek commented May 5, 2022

⬆️ Upvote

@j-r-e-i-d

yes please!

@LucVidal360

+1 👍

@Casyfill

+1

@FrodoChen

+1

@FelipeAdachi

+1

@pedramamini

👍

@tyten

tyten commented Jun 24, 2022

+1

@ElectricSwan

Definitely a +1 from me

@intzaaa

intzaaa commented Aug 10, 2022

+1

@I7T5

I7T5 commented Aug 15, 2022

+1

@rsdlt

rsdlt commented Sep 6, 2022

+1

@AhmedThahir

Any updates on this?

@pete-debiase

+1 !

@gillyspy

+1

@brandondrew

brandondrew commented Sep 29, 2023

@aboy021
That was a very helpful overview of the issues. Thanks.

I think we also need to decide if this is going to be a Venn diagram or an Euler diagram, or if it is feasible to make that a setting. My impression is that for many use cases a Venn diagram is very useful, but in practical terms this means a maximum of 3 sets (see Venn diagram on Wikipedia).

When the average person says "Venn diagram" they really have in mind an Euler diagram. So—using the more correct terminology—I think we want Euler diagrams. I think the more difficult decision is which nomenclature to use. Be purists and have only 1% of the population understand? Or go with common parlance and be technically incorrect?

With regard to the number of sets, even if we could only have 3 sets, that still covers people's needs probably more than 80% of the time. And it's better than what we have now.

@the-solipsist

When the average person says "Venn diagram" they really have in mind an Euler diagram.

Multiple people have said this in this thread. I don't know if it is true. It definitely isn't true in my case. When I studied Venn diagrams in school, it was only Venn diagrams (non-intersection of two sets is shown by non-shading of the overlapping area between intersecting circles, and not by non-intersection of the circles), not Euler diagrams. Taking this example from Wikipedia, in school I only ever learnt the second way of representing sets, not the first.

[Image: the example from Wikipedia showing both ways of representing sets]

In graduate studies, we had to learn to use Venn diagrams to represent categorical syllogisms. As you can see from this example, these are Venn diagrams, not Euler diagrams:
[Image: Venn diagrams representing categorical syllogisms]

While many people might refer to Euler diagrams as "Venn diagrams", I don't think it can be taken for granted that when people say, "Venn diagrams" they're actually referring to Euler diagrams.

I think the term Venn diagram is used colloquially to refer to both Venn and Euler diagrams, and not just to Euler diagrams.

@brunolnetto

Please, I recommend using the Python library eule. Regardless of that, please give me a paid remote job. I am begging you!

@jgwinner

jgwinner commented Oct 2, 2023

I'd take either Euler or Venn - but I can see from the discussion, this may be difficult for anything but trivial data sets.

Still ...

I'm doing some work with a company that has multiple, overlapping cloud services, and thought if I could describe the overlaps via code, then view it in an Euler diagram, it could be useful.

@brunolnetto

@jgwinner the library https://pypi.org/project/eule/ provides the service you need.

@earonesty

venn and euler can be two different directives.

using this MIT licensed library, it should be easy to add venn: https://github.com/benfred/venn.js ... can leave it as a lib, or just absorb it, depending on what the mermaid maintainers prefer

the expression of sets, labels, intersections and intersection-labels is all very mermaid-like already, just need to parse the syntax and use the lib

venn
  "Backstreet Boys" size 1 alias A
  "Algebra Teachers" alias B
  "Toddlers"
  A B Toddlers: "Tell Me Why?" size 1

syntax:

venn [title <optional title goes here>]
    "<set 1 label>" [size <int>]  [alias <word>]
    ...more labels followed by one or more intersection labels...
    <list of set labels or aliases>: "Intersection Label" [size X]

without sizes specified, the sizes of all sets (circles) are equal by default
without overlap sizes specified, the sizes of all overlaps are equal (set to 1) by default

additional classes added to inner items should make mermaid-syntax styling easy

@migueltorrescosta

Has there been an update on this? It would be an awesome feature to have 🥇

@brunolnetto

@migueltorrescosta It is a challenging one. Do you need a visualization or data analysis in this regard?

@migueltorrescosta

Yes. My goal would be to define Venn and/or Euler diagrams as code, as I generate a lot of them. My current example is a layout of time complexity classes ( P vs NP vs NP-hard vs NP-complete vs ... ), but I'm building others as needed.

@brunolnetto

Your example is rather an Euler diagram, as colleagues have mentioned already. Once you have the key-value structure (also called a 'dictionary') that maps each class category to the list of elements in that category, you may use the library eule to describe the intersections among categories.

@d-led

d-led commented Mar 28, 2024

Check out the input format in this one:
https://docs.anychart.com/Basic_Charts/Venn_Diagram
With this choice it’s probably possible to move the analysis to the user a bit.

it’s also an interesting Euler/Venn hybrid

@migueltorrescosta

migueltorrescosta commented Mar 29, 2024

Your example is rather an Euler diagram, as colleagues have mentioned already.

Definitely, sorry for the confusion

you may use the library eule

eule is a Python package. Isn't Python independent from mermaid-js? The list of visualizations supported by mermaid-js is documented here, and in this list I cannot find any mention of Euler ( or Venn ) diagrams.

Check out the input format in this one:
https://docs.anychart.com/Basic_Charts/Venn_Diagram

Thank you for the link @d-led 🙏 AnyChart requires us to specify and setup the diagram with javascript, instead of using a "pure text" description, as is done with a sequenceDiagram, for example. Am I correct that there is no text only option in mermaid-js to describe a Euler diagram, i.e. users need to code in JavaScript if they want Euler diagrams?

@d-led

d-led commented Mar 29, 2024

@migueltorrescosta sorry, I didn't elaborate much from the phone. What I meant with the AnyChart link is that the task of providing a Venn/Euler syntax could be simplified by not solving the set operations. I haven't looked at how AnyChart does it, but their input is as follows, from the link above:

var data = [
  {x: "A", value: 100},
  {x: "B", value: 100},
  {x: "C", value: 100},
  {x: ["A", "B"],	value: 20},
  {x: ["B", "C"], value: 20},
  {x: ["A", "B", "C"], value: 20}
];

you can see that the input is in a shape that already hints at the resulting picture, including the intersection areas. Taking this as a starting point for the syntax could probably lead to faster results (and less code) without having to solve the set operations.
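As a rough illustration of what consuming that input shape could look like, here is a minimal sketch. The helper name `normalizeEntries` and the "every intersection must reference a declared set" rule are assumptions for illustration; they are not part of AnyChart or Mermaid.

```javascript
// Hypothetical helper: normalizes AnyChart-style entries ({x, value}) into
// uniform records and checks that every intersection only references sets
// that were also declared on their own.
function normalizeEntries(data) {
  const regions = data.map((entry) => ({
    // A single set is written as a string, an intersection as an array.
    sets: Array.isArray(entry.x) ? [...entry.x] : [entry.x],
    value: entry.value,
  }));

  // Names of all sets that appear as a standalone entry.
  const declared = new Set(
    regions.filter((r) => r.sets.length === 1).map((r) => r.sets[0])
  );

  for (const region of regions) {
    for (const name of region.sets) {
      if (!declared.has(name)) {
        throw new Error(`Intersection references undeclared set "${name}"`);
      }
    }
  }
  return regions;
}

const exampleData = [
  { x: "A", value: 100 },
  { x: "B", value: 100 },
  { x: ["A", "B"], value: 20 },
];
console.log(normalizeEntries(exampleData));
```

No set operations are solved here: the user states the regions directly and the code only validates and reshapes them.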

@migueltorrescosta

migueltorrescosta commented Apr 1, 2024

@jgreywolf I noticed that the following labels have been added on the 6th April 2023

  • Status: Approved
  • Type: New Diagram
  • Contributor needed

I am happy to be said contributor. The Community Contributing Guidelines are clearly written. It is unclear what the venn/euler diagram specification is. Are there issues that specify it, or should I design it as part of this contribution? If I am to design it, @d-led 's comment and @aboy021's comment provide good starting points to building up the syntax.

Regarding the visualisation of Euler Diagrams

I suggest we

  1. Represent each set combination as a node in a graph. With $n$ sets, we have $2^n$ combinations.
  2. Remove nodes that do not occur in the data. Removing non occurring nodes should vastly reduce the number of regions to represent.
  3. Connect by an edge any node in which a region is shared. This would be regions of the same colour in this paper.
  4. Check if the graph is planar: by Kuratowski's theorem, this can be done by checking for subgraphs that are subdivisions of $K_5$ or $K_{3,3}$. The Python NetworkX package already provides a planarity check ( relevant docs ). If needed the algorithm can be converted to JS.
  5. If the graph is planar, generate the relevant Euler diagram from it.
  6. If the graph is not planar, it cannot be represented as an Euler diagram. In this case we can either split the graph and use colours to represent regions ( might get nasty quickly ), or we can display a message stating that no Euler diagram exists matching the required description.
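Steps 1 and 2 above could be sketched roughly as follows. Instead of enumerating all $2^n$ combinations and pruning, this collects only the combinations that actually occur. The input shape (an object mapping each element to the sets that contain it) and the name `occurringRegions` are assumptions for illustration.

```javascript
// Hypothetical helper covering steps 1 and 2: collect only the set
// combinations that actually occur in the data.
// `memberships` maps each element to the names of the sets containing it.
// Assumes set names do not contain commas, since "," is used as a key separator.
function occurringRegions(memberships) {
  const regions = new Map(); // key such as "A,B" -> number of elements in that exact region
  for (const sets of Object.values(memberships)) {
    if (sets.length === 0) continue; // element lies outside every set
    // Deduplicate and sort so that ["B", "A"] and ["A", "B"] share one key.
    const key = [...new Set(sets)].sort().join(",");
    regions.set(key, (regions.get(key) || 0) + 1);
  }
  return regions;
}

const exampleRegions = occurringRegions({
  x: ["A"],
  y: ["B", "A"],
  z: ["A", "B"],
  w: [],
});
console.log([...exampleRegions.entries()]);
```

The number of keys in the returned map is the number of nodes that survive step 2, which is at most the number of elements and usually far below $2^n$.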

@NicolasNewman
Collaborator

@migueltorrescosta

I'd love to see things start moving on this issue.

Reading over the full conversation here, I think we're really overcomplicating what needs to happen to make this a reality. Looking at the venn.js library mgaitan mentioned, there's a more actively maintained fork which provides an interface to the actual (x,y) coordinates for creating your own rendered diagram.

While I haven't tested the library myself thoroughly, if it works as I envision, the process would then become:

  1. Implement the parser in Langium for converting from the chosen syntax to a representation similar to what d-led described
  2. Plug that representation into venn.js which outputs the information needed to layout the SVG
  3. Use that information to render the diagram within Mermaid
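To make step 3 a little more concrete, here is a minimal sketch that turns a circle layout into SVG markup. The `{ x, y, radius }` shape per set is an assumption about what the layout step would hand back, and `renderCircles` is a hypothetical name; neither is taken from the venn.js or Mermaid APIs.

```javascript
// Escape characters that are special in XML so labels cannot break the markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer for step 3 only.
// `layout` maps a set id to { x, y, radius } (assumed shape),
// `labels` maps a set id to its display text.
function renderCircles(layout, labels, width, height) {
  const shapes = Object.entries(layout).map(([id, circle]) => {
    const label = escapeXml(labels[id] ?? id); // fall back to the id itself
    return (
      `<circle cx="${circle.x}" cy="${circle.y}" r="${circle.radius}" fill-opacity="0.25"/>` +
      `<text x="${circle.x}" y="${circle.y}" text-anchor="middle">${label}</text>`
    );
  });
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    shapes.join("") +
    `</svg>`
  );
}

console.log(
  renderCircles(
    { A: { x: 100, y: 100, radius: 80 }, B: { x: 180, y: 100, radius: 80 } },
    { A: "Blue", B: "Green" },
    280,
    200
  )
);
```

Intersection labels are left out on purpose; placing them needs the intersection geometry, which this sketch does not compute.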

This approach is similar to what I did for generating Architecture diagrams and would completely skip having to implement our own layout engine that involves set theory.

As for the actual requirements, I personally think the average user of this diagram type wouldn't need more than 4 or 5 sets, nor would they care about the difference between an Euler and a Venn diagram. The library I linked to doesn't even clearly state what type it actually generates, and at a quick glance it seems to pick the one best suited to the data provided.

Let me know your thoughts on this approach and if you're still interested in being the contributor. I'm also happy to collaborate with you and divvy up the work if you'd prefer to focus on a particular part of the implementation.

@migueltorrescosta

@NicolasNewman that sounds ideal. I'd appreciate help on getting the syntax right: the given example looks good but I'd need to standardise it. We also need to decide what happens with non-planar graphs: maybe mgaitan already has a good solution. I am a bit overscheduled until the 14th September, but will happily work on it after that.

vennDiagram
  set A
  set B
  set C

  intersect A B
  union B C
  relativeComplement A C

@NicolasNewman
Collaborator

NicolasNewman commented Sep 3, 2024

@migueltorrescosta

Given the format of the input for venn.js, we first need to decide whether it is important for areas in the diagram to be proportional to the sets' sizes. I personally don't think that functionality will be important for our use case (so we would default to passing 1), and it could potentially be used internally to tweak the size of the generated SVG paths based on the length of the labels (the need to do this will depend on how flexible the output of the library is).

The library expects input in the format:

const sets = [
  { sets: ['A'], size: 12 },
  { sets: ['B'], size: 12 },
  { sets: ['A', 'B'], size: 2 },
];

so this could be translated as (assuming we're fixing size to 1 and the labels will be handled by ourselves)

vennDiagram
  set A "Blue"
  set B "Green"

  intersect A B "Cyan"
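A minimal sketch of that translation, assuming every line carries a quoted label and every size is fixed to 1 as described. The function name `parseVenn` and the returned `{ sets, labels }` shape are hypothetical; a real implementation would go through Mermaid's parser tooling rather than regular expressions.

```javascript
// Hypothetical line-based parser for the draft syntax above.
// Returns the venn.js-style input plus a separate label lookup.
function parseVenn(source) {
  const lines = source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  if (lines[0] !== "vennDiagram") {
    throw new Error('Expected "vennDiagram" on the first line');
  }

  const sets = [];
  const labels = {};
  const known = new Set(); // ids declared with `set`

  for (const line of lines.slice(1)) {
    // keyword, one or more ids, then a quoted label
    const match = /^(set|intersect)\s+(.+?)\s+"([^"]*)"$/.exec(line);
    if (!match) throw new Error(`Cannot parse line: ${line}`);
    const [, keyword, idPart, label] = match;
    const ids = idPart.split(/\s+/);

    if (keyword === "set") {
      if (ids.length !== 1) throw new Error(`"set" takes exactly one id: ${line}`);
      known.add(ids[0]);
    } else {
      if (ids.length < 2) throw new Error(`"intersect" needs at least two ids: ${line}`);
      for (const id of ids) {
        if (!known.has(id)) throw new Error(`Unknown set "${id}" in: ${line}`);
      }
    }

    sets.push({ sets: ids, size: 1 }); // size fixed to 1, as assumed above
    labels[ids.join(",")] = label; // labels are kept separately for rendering
  }
  return { sets, labels };
}

const parsed = parseVenn(
  ["vennDiagram", '  set A "Blue"', '  set B "Green"', "", '  intersect A B "Cyan"'].join("\n")
);
console.log(JSON.stringify(parsed, null, 2));
```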

I personally don't think using the language of $\cap$, $\cup$, and $-$ in the syntax is necessary as what I envision to be the average user of this diagram won't be thinking in terms of set theory.

As for non-planar graphs, could you clue me in on how they're relevant to Venn diagrams? It's been quite some time since I took a discrete/graph theory course.

@migueltorrescosta

In the described notation, how do you differentiate the size of the intersection from the size of the union? I assume that sets: ['A', 'B'] is the intersection, but then the union would be inferred rather than explicitly set.

Regarding the relevancy of planar graphs, these are essential to know whether it can be plotted at all or not. If we have 6 sets, with all possible combinations occurring, we will never be able to plot it as a Venn diagram. The thinking for this is to

  1. Set each set with non zero area as a node
  2. create an edge between two nodes that differ by just one element
    In a Venn diagram, these would be neighbours ( i.e. share a border ), at which point the existence of a Venn diagram to represent it is equivalent to the associated graph being planar ( i.e. can it be plotted with no overlapping edges ).

I think the above is correct, and if so then when the associated graph is not planar, we need to decide either to show an error, or to find an adaptation of a Venn diagram that represents the same information
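The graph described above could be built roughly like this. The sketch does not implement a full planarity test. It only applies a cheap necessary condition: nodes that differ by one element always have different numbers of sets, so the graph is bipartite and therefore triangle-free, and a planar triangle-free graph with $V \ge 3$ vertices has at most $2V - 4$ edges. Failing the check rules planarity out; passing it proves nothing. The function names are hypothetical.

```javascript
// True when the two regions differ by exactly one set.
function differsByOne(a, b) {
  const inA = new Set(a);
  const inB = new Set(b);
  let diff = 0;
  for (const name of inA) if (!inB.has(name)) diff++;
  for (const name of inB) if (!inA.has(name)) diff++;
  return diff === 1;
}

// Hypothetical builder: one node per occurring region (an array of set
// names), one edge between regions that differ by exactly one set.
function buildRegionGraph(regions) {
  const nodes = regions.map((region) => [...new Set(region)].sort());
  const edges = [];
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      if (differsByOne(nodes[i], nodes[j])) edges.push([i, j]);
    }
  }
  return { nodes, edges };
}

// Necessary (not sufficient) condition for planarity of a triangle-free
// graph: E <= 2V - 4 when V >= 3.
function passesPlanarEdgeBound(graph) {
  const v = graph.nodes.length;
  if (v < 3) return true;
  return graph.edges.length <= 2 * v - 4;
}

const exampleGraph = buildRegionGraph([["A"], ["B"], ["A", "B"]]);
console.log(exampleGraph.edges, passesPlanarEdgeBound(exampleGraph));
```

A graph that passes would still need a real planarity test, for example a port of an existing algorithm, before drawing any conclusion.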

@NicolasNewman
Collaborator

NicolasNewman commented Sep 3, 2024

Personally I don't think the implementation added to Mermaid should concern itself with the size of a set, just that the labels of the intersection between sets can (mostly) fit within the region. I'd be interested to hear from others who've added to this issue's discussion over the years and how they envision a Venn/Euler diagram in Mermaid, though, as what I'm envisioning could be quite different.

From my perspective, I picture something like this:
[Image: example diagram]
As a user who'd like to create this with Mermaid, I shouldn't need to worry about how to represent these relations with set theory. Additionally, the sizes of the various sets are irrelevant here. Since the end-user won't be providing them, we can use that to our advantage to manually adjust the size ourselves to ensure the labels the user gives can fit into the diagram.

There's a writeup of the algorithm that venn.js uses. The updated fork I linked adds a separate API for creating a layout and it takes additional options that I'd like to investigate further. It could vastly simplify fitting the labels into the diagram. When I get the chance in the next few days I'll play around with venn.js locally so I have a better idea of how it works and its limitations. I'll update you with my findings once I'm done.

Regarding determining if a Venn diagram is non-planar, I believe venn.js will handle that on its own and I'll let you know once I have a concrete answer.

@d-led

d-led commented Sep 4, 2024

There’s one reason one might want to control the sizes of the intersections and circles: labels inside them. Another challenge: fitting text. Newlines in the input could be valuable to make the diagram legible. If one cannot see where the label belongs to, the point of having such a diagram vanishes.

@NicolasNewman
Collaborator

I was thinking from the angle of computing the sizes behind the scenes to make sure everything fits. You make a good point, and I think you're right that even if that approach is feasible, users should still have the option to specify their own sizes if they don't like the default fit of the labels.

@brunolnetto

I think, using an adaptation of libraries provided by @NicolasNewman, excluding the zero-measure intersections to make them more Euler-diagram-like is a great way to go. I designed a library in javascript called eulejs and in python called eule.

@migueltorrescosta

@brunolnetto as you already designed libraries related to this and I won't be able to take this for a couple weeks, would you be happy to take on this work?

@brunolnetto

I don't think I am the right guy for this task; I have barely any frontend experience. On the other hand, @NicolasNewman seems to have better starting points than mine.

@migueltorrescosta

Fair. @NicolasNewman , with the caveat that I won't be able to pick this for a couple weeks, I am happy to have this issue assigned to myself

@NicolasNewman
Collaborator

NicolasNewman commented Sep 6, 2024

Happy to hear it! In the meantime, I'll jot down some of my findings with venn.js to hopefully give you a clearer picture once you're able to start. Once I'm satisfied with testing the library, I'll draft up the syntax.

Test 1

[Image: reference diagram for Test 1]

could be replicated with venn.js, with a couple of caveats

const sets = [
        { sets: ['M'], size: 64},
        { sets: ['S'], size: 64}, 
        { sets: ['P'], size: 64},
        { sets: ['E'], size: 64},
        { sets: ['SA'], size: 64},
        { sets: ['O'], size: 64},
        { sets: ['S', 'P'], size: 16},
        { sets: ['E', 'P'], size: 16},
        { sets: ['M', 'P'], size: 16},
        { sets: ['M', 'E'], size: 16},
        // 
        { sets: ['E', 'SA'], size: 8},
        { sets: ['E', 'O'], size: 8},
        { sets: ['SA', 'O'], size: 8},
        { sets: ['M', 'S', 'P'], size: 2},
];

Produced the following diagram:
image

In the console, the message WARNING: area M,S,P not represented on screen was displayed (we'll want to change this to throwing a catch-able error so we may need to open a PR/fork the fork). I managed to correct this by changing the size of sets [M], [S] to 128
image

Unfortunately, the region of set [M,S,P] is quite small. As I increased the size from 2 $\rightarrow$ 128, other regions started to become squished. Increasing the size of those then caused the warning to be printed again.

Conclusion

  1. We're going to want to expose the sizes to the user
  2. From initial testing, $2^{(N-n)}$ where $N$ equals the size of the largest set and $n$ is the size of the current set seems to be a good initial value for the sizes to auto generate
  3. End-users getting their desired output will be a finicky process of tweaking sizes. This needs to be clearly reflected in the documentation for the diagram.

Test 2

image

This diagram couldn't be replicated. While this feature shouldn't be a priority in the initial draft, we may want to consider supporting putting a Venn diagram into a Venn diagram. I'll include the ability to do this in the syntax once I'm done investigating.

Test 3

image

const sets = [
    { sets: ['A'], size: 64},
    { sets: ['B'], size: 64},
    { sets: ['C'], size: 64},
    { sets: ['D'], size: 64},
    { sets: ['A','B'], size: 32},
    { sets: ['A','C'], size: 32},
    // These two combinations reach across the diagram and cannot be accurately displayed
    // { sets: ['A','D'], size: 32},
    // { sets: ['B','C'], size: 32},
    { sets: ['B','D'], size: 32},
    { sets: ['C','D'], size: 32},
    { sets: ['A','B','C'], size: 16},
    { sets: ['A','B','D'], size: 16},
    { sets: ['A','C','D'], size: 16},
    { sets: ['B','C','D'], size: 16},
    { sets: ['A','B','C','D'], size: 8}
];

No warning is thrown if the two commented out sets are included. The layout engine puts the labels so they overlap set [A,B,C,D]. Additionally, two variations of the diagram would be shown, meaning Mermaid wouldn't always output the same diagram each time.

image
image

In this particular case, we should do some analysis on our end to detect this and show an error.
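One possible form of that analysis, as a sketch under two stated assumptions: the layout draws every set as a circle, and every declared combination is meant to get its own visible region. $n$ circles split the plane into at most $n^2 - n + 2$ regions including the outside, so at most $n^2 - n + 1$ combinations can be shown. The function name `checkCircleCapacity` is hypothetical.

```javascript
// Hypothetical pre-check run before handing the input to the layout engine.
// `sets` uses the same [{ sets: [...], size }] shape as the tests above.
function checkCircleCapacity(sets) {
  const names = new Set();
  const combos = new Set();
  for (const entry of sets) {
    for (const name of entry.sets) names.add(name);
    // Sort so that ['B', 'A'] and ['A', 'B'] count as one combination.
    combos.add([...entry.sets].sort().join(","));
  }
  const n = names.size;
  // n circles create at most n^2 - n + 2 regions, one of which is the outside.
  const limit = n * n - n + 1;
  return { ok: combos.size <= limit, declared: combos.size, limit };
}

const capacity = checkCircleCapacity([
  { sets: ["A"], size: 64 },
  { sets: ["B"], size: 64 },
  { sets: ["A", "B"], size: 32 },
]);
console.log(capacity);
```

With four sets the limit is 13, so the input as posted (13 entries) passes, while adding the two commented-out pairs (15 entries) is flagged. This is only a counting argument; it cannot tell which combinations are the problematic ones.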

@luq7

luq7 commented Sep 7, 2024

I thought that the discussion was stale... to find out that a new comment was added 7 hours ago. Thanks for the great work, it is looking very promising.

@NicolasNewman
Collaborator

NicolasNewman commented Sep 11, 2024

Here are my thoughts on what the syntax for the diagram should look like:

venn
    set B "Blue" size 8
    set G "Green"
    set R "Red"
    intersect G B "Cyan"
    intersect R G B "White" size 12

A couple notes:

  • intersect and intersection should both be supported as the keyword
  • Line breaks should be supported in the labels with <br />
  • I feel there could be a use-case for adding KaTeX support for this diagram. Check out flowcharts to see how it's implemented.
  • Specifying a size is optional. If none is specified, I found giving the intersections with the most sets a value of 2 and then incrementing by powers of 2 to be a good starting point but it will most likely need some refinement.
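The default sizing in the last note could be sketched like this. It reads "incrementing by powers of 2" as: the combination with the most sets gets 2, and each level with one set fewer gets double that. That reading, the function name `applyDefaultSizes`, and the input shape are assumptions, and the rule itself is only the suggested starting point.

```javascript
// Hypothetical helper: fill in missing sizes. A region with the maximum
// number of sets gets 2; each step towards fewer sets doubles the size.
// Sizes given by the user are left untouched.
function applyDefaultSizes(regions) {
  const maxArity = Math.max(...regions.map((region) => region.sets.length));
  return regions.map((region) => ({
    ...region,
    size: region.size ?? 2 ** (maxArity - region.sets.length + 1),
  }));
}

const sized = applyDefaultSizes([
  { sets: ["B"], label: "Blue", size: 8 },
  { sets: ["G"], label: "Green" },
  { sets: ["R"], label: "Red" },
  { sets: ["G", "B"], label: "Cyan" },
  { sets: ["R", "G", "B"], label: "White", size: 12 },
]);
console.log(sized.map((region) => `${region.sets.join(",")}: ${region.size}`));
```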

Let me know if there's anything I may have missed.

Also, I mentioned previously that supporting putting a Venn diagram inside of a Venn diagram may be something to consider supporting to achieve a diagram like this. I'm still thinking of the best way to accomplish this in the syntax but if anyone has any suggestions feel free to share them!

@broofa

broofa commented Sep 15, 2024

venn
    set A "Users who just want two frickin' circles and some text!" size 100
    set B "Users who want more than that" size 10
    intersect A B 100%

(And who do I have to throw money at for this to work in github markdown?!? 😉 )

@exoego linked a pull request on Oct 4, 2024 that will close this issue