Ordering clusters in plotClusterTrajectories() #144

Closed
hichew22 opened this issue Nov 21, 2023 · 9 comments

@hichew22

Hello Niek,

I am using the dtw method in latrend to plot some clusters like so:
plotClusterTrajectories(dtw_model_2)

or plot(dtw_model_2)

Is there a way within plotClusterTrajectories() to specify that clusters are listed from highest to lowest frequency, so that the largest cluster is always cluster "A"?

Or would I need to reorder them manually, something like:

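library(dplyr)  # provides %>%, mutate() and left_join()

# Create data frame with cluster assignments and trajectory IDs
# (the ID column is named "id" so that the join with df_long below works)
df_cluster_dtw <- data.frame(
  id = ids(dtw_model_2),
  cluster_dtw = trajectoryAssignments(dtw_model_2)
)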

# Recode clusters from largest (1) to smallest (2)
cluster_freq <- table(df_cluster_dtw$cluster_dtw)
ordered_clusters <- names(sort(cluster_freq, decreasing = TRUE))
df_cluster_dtw$cluster_dtw <-
  factor(
    df_cluster_dtw$cluster_dtw,
    levels = ordered_clusters,
    labels = seq_along(ordered_clusters)
  ) 

# Create column with cluster labels and percentages
cluster_percentages <- prop.table(table(df_cluster_dtw$cluster_dtw))
cluster_labels <- sprintf("%s (%d%%)", names(cluster_percentages), round(cluster_percentages * 100))

df_cluster_dtw <- df_cluster_dtw %>%
  mutate(cluster_label = factor(
    cluster_dtw,
    levels = names(cluster_percentages),
    labels = cluster_labels
  ))

# Add cluster assignments to df_long
df_long <- df_long %>%
  left_join(df_cluster_dtw, by = "id")

# Plot cluster trajectories
latrend::plotClusterTrajectories(
  df_long,
  response = "value",
  cluster = "cluster_label",
  trajectories = TRUE,
  facet = TRUE,
  size = 2
)

?

Thank you!

niekdt added the enhancement label Nov 21, 2023
@niekdt
Collaborator

niekdt commented Nov 21, 2023

Hi @hichew22, thanks for the suggestion. It's a useful feature to have, especially when comparing similar cluster solutions.

I'll start by adding an argument to plotClusterTrajectories for specifying which clusters to plot, and in what order.
Ultimately, a wrapper lcModel class for which you can specify the ordering logic would be best, but that will take more effort.

niekdt self-assigned this Nov 24, 2023
niekdt added a commit that referenced this issue Jan 15, 2024
@niekdt
Collaborator

niekdt commented Jan 15, 2024

plotClusterTrajectories now has a clusterOrder argument that lets you specify which clusters to plot and in what order, either by name or by index.

niekdt closed this as completed Jan 15, 2024
@hichew22
Author

Awesome, thanks, Niek! Do you have an example, and does this also allow ordering clusters from most to least frequent?

@niekdt
Collaborator

niekdt commented Jan 15, 2024

You're welcome!

data(latrendData)
method <- lcMethodLMKM(Y ~ Time, id = "Id", time = "Time")
model <- latrend(method, latrendData, nClusters = 3)

# change cluster order
plotClusterTrajectories(model, clusterOrder = c('B', 'C', 'A'))

# show only specific clusters
plotClusterTrajectories(model, clusterOrder = c('B', 'C'))

It's intended as a quick way to set a custom order, but to dynamically order by cluster size, you can use:

plotClusterTrajectories(model, clusterOrder = order(-clusterSizes(model)))
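To see why this expression orders clusters from largest to smallest, here is a minimal base-R sketch. The `sizes` vector is made up for illustration and only stands in for the output of `clusterSizes(model)`; the variable names are hypothetical as well.

```r
# Hypothetical cluster sizes, standing in for clusterSizes(model)
sizes <- c(A = 20, B = 50, C = 30)

# order() sorts ascending, so negating the sizes yields the
# positions of the clusters from largest to smallest
size_order <- order(-sizes)
print(size_order)     # 2 3 1

# The same order expressed as cluster names
ordered_names <- names(sizes)[size_order]
print(ordered_names)  # "B" "C" "A"
```

Since clusterOrder accepts either names or indices, passing `size_order` or `ordered_names` should select the same clusters in the same order.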

In the future I intend to add some lcModel wrapper classes that would automatically relabel clusters based on some criterion, so the ordering would then be handled during the latrend fitting procedure.

@hichew22
Author

I tried doing that for a 4-cluster DTW model like so:
plotClusterTrajectories(dtw_model_4, clusterOrder = order(-clusterSizes(dtw_model_4)))

However, the clusters do not appear ordered. I did make sure to install the most recent version of latrend. Could you help me with this?

@niekdt
Collaborator

niekdt commented Jan 16, 2024

Did you install the latest commit (not release)?

remotes::install_github('philips-software/latrend')

@hichew22
Author

Yes, I just did, but it seems the function still does not work.

─ preparing ‘latrend’: (555ms)
✔ checking DESCRIPTION meta-information ...
─ installing the package to process help pages (811ms)
Loading required namespace: latrend
─ saving partial Rd database (3.1s)
─ checking for LF line-endings in source and make files and shell scripts (335ms)
─ checking for empty or unneeded directories
─ building ‘latrend_1.5.1.tar.gz’

@niekdt
Collaborator

niekdt commented Jan 17, 2024

I can't spot any issues in the source code. Could you let me know what the output is of:

latrend:::make.orderedClusterNames(clusterNames(dtw_model_4), order(-clusterSizes(dtw_model_4)))

niekdt reopened this Jan 17, 2024
niekdt added this to the 1.6.0 milestone Jan 17, 2024
@hichew22
Author

I just tried re-running it and it works now! Perhaps I had to restart my R session. Thank you so much!

niekdt closed this as completed Jan 17, 2024