Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ideas for visualisation tools #78

Open
duncanMR opened this issue Feb 15, 2024 · 1 comment
Open

Ideas for visualisation tools #78

duncanMR opened this issue Feb 15, 2024 · 1 comment

Comments

@duncanMR
Copy link
Collaborator

duncanMR commented Feb 15, 2024

We need to develop tools to visualise GIGs, and the algorithms used to simulate them. As I starting point, I wanted to visualise how the random_matching_positions algorithm worked:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np


def add_rect(ax, x, y, width, color):
    rect = patches.Rectangle((x, y), width, 0.8, facecolor=color)
    ax.add_patch(rect)

def plot_mrcas(mrcas_structure, rng, figsize=(10, 5)):
    fig, ax = plt.subplots(figsize=figsize)
    tot_len = sum(x[1] - x[0] for v in mrcas_structure.values() for x in v.keys())
    loc = rng.integers(tot_len)
    y_pos = 3
    x_max = 0
    found_breakpoint = False
    for mrca_node, mrca_intervals in mrcas_structure.items():
        for x in mrca_intervals.keys():
            x_max = max(x_max, x[1])
            width = x[1] - x[0]
            if loc < width and found_breakpoint == False:
                u, v = mrca_intervals[x]
                assert len(u) != 0
                assert len(v) != 0
                u = u[0] if len(u) == 1 else rng.choice(u)
                v = v[0] if len(v) == 1 else rng.choice(v)
                u_width = u[1] - u[0]
                v_width = v[1] - v[0]

                # u interval
                add_rect(ax, u[0], 0.5, u_width, "#b0acac")
                ax.text(-2, 0.8, f"U", va="center", ha="right", fontsize=14)

                # v interval
                add_rect(ax, v[0], 1.5, v_width, "#b0acac")
                ax.text(-2, 1.8, f"V", va="center", ha="right", fontsize=14)

                # highlight selected comparible points
                u_0 = u[0] + loc if u[0] < u[1] else -(u[1] - loc - 1)
                v_0 = v[0] + loc if v[0] < v[1] else -(v[1] - loc - 1)
                add_rect(ax, u_0, 0.5, 1, "red")
                add_rect(ax, v_0, 1.5, 1, "red")

                # MRCA interval
                add_rect(ax, x[0], y_pos, width, "red")
                ax.text(
                    -2,
                    y_pos + 0.4,
                    f"MRCA: {mrca_node}",
                    va="center",
                    ha="right",
                    fontsize=14,
                    color="red",
                )

                found_breakpoint = True
            else:
                add_rect(ax, x[0], y_pos, width, "gray")
                ax.text(
                    -2,
                    y_pos + 0.4,
                    f"MRCA: {mrca_node}",
                    va="center",
                    ha="right",
                    fontsize=14,
                )
            loc -= width
        y_pos += 1

    ax.plot([0, x_max], [2.5, 2.5], color="black", linestyle="-", linewidth=2)
    ax.set_ylim(0, y_pos)
    ax.set_xlim(0, x_max)
    ax.set_xlabel("Position")
    ax.yaxis.set_visible(False)

    plt.tight_layout()
    plt.show()

Here is an example, from this simulation:

sim = gigutil.DTWF_one_break_no_rec_inversions_slow_sim()
gig = sim.run(num_diploids=7, seq_len=100, gens = 20)

I extracted the last MRCA structure used in the algorithm and plotted it:

rng = np.random.default_rng(1)
mrcas_structure = {
    227: {(16, 18): ([(16, 18)], [(16, 18)])},
    212: {(31, 42): ([(31, 42)], [(31, 42)])},
    165: {(22, 24): ([(22, 24)], [(22, 24)])},
    148: {(10, 13): ([(10, 13)], [(10, 13)])},
    127: {(13, 16): ([(13, 16)], [(13, 16)]), (18, 22): ([(18, 22)], [(18, 22)])},
    40: {(51, 68): ([(51, 68)], [(51, 68)])},
    24: {(9, 10): ([(9, 10)], [(9, 10)])},
    7: {(68, 78): ([(68, 78)], [(68, 78)])},
    0: {
        (0, 9): ([(0, 9)], [(0, 9)]),
        (24, 31): ([(24, 31)], [(24, 31)]),
        (42, 51): ([(42, 51)], [(42, 51)]),
        (78, 100): ([(78, 100)], [(78, 100)]),
    },
}
plot_mrcas(mrcas_structure, rng)

image

We can see how fragmented the MRCA nodes are, and how the MRCA node's interval matches the nodes U and V. It might be useful for debugging once SVs are added.

Not sure how useful it is in this form, though, since it's difficult to integrate plotting into the simulation code. @hyanwong, maybe it would be helpful to make the mrcas_structure a class? We could add a simplified version of this code as a plot method perhaps, and also add methods to extract useful data conveniently. Probably only worthwhile at a later stage, since the data structure might change.

@hyanwong
Copy link
Owner

Thanks a lot for this @duncanMR: it's really helpful to have someone else looking at all this. I'm sure there are things I've missed or done wrong.

I was indeed wondering about making the mrcas structure a class, which could help elsewhere too. It needs to be pretty lightweight, but that's fine. We could even make it a namedtuple, which is about as lightweight as you can get (and still allows methods to be attached to the class).

I'll open another issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants