Skip to content
This repository was archived by the owner on May 20, 2025. It is now read-only.

Conversation

nichael-meilson
Copy link

I wasn't sure how to effectively make this PR smaller because so many of the commits are intertwined.
In essence, very few of the existing components of CAMEL were altered (only a switch from MariaDB to MySQL which doesn't seem to have any major effect). Mutation data functionality was added to the CAMEL API, SQL database, and a data querying interface - similar to the "experiments" - for the front end

@MaybeJustJames
Copy link
Member

Hi @BramVDBe ,
thanks for this. No worries that you couldn't split it up, I'll get through it. The most important question is for you to explain why we need to switch from MariaDB to MySQL. I am strongly against this change so I need a very good reason to accept it.

@BramVDBe
Copy link
Collaborator

Hi @MaybeJustJames,

thanks for getting back at this. As for why @nichael-meilson shifted from MariaDB to MySQL, I'll leave that to him to explain to avoid making wrong assumptions.

@nichael-meilson
Copy link
Author

nichael-meilson commented Aug 17, 2021

Hi @MaybeJustJames @BramVDBe

I switched from MariaDB to MySQL because joining and subquerying larger tables in MariaDB was slow enough to timeout GET requests from the browser. There's a paper on it here regarding join performance. I chose MySQL because it's mostly compatible with MariaDB as a "drop-in replacement". I didn't really know how to go about scaling the database to handle mutation data without switching away from MariaDB - do you know if there's another way?

@MaybeJustJames
Copy link
Member

Hi @nichael-meilson, @BramVDBe,
sorry looks like I got you both mixed up. Thanks for tagging me for a review @BramVDBe. Indeed I did not notice the PR earlier.

@nichael-meilson I do remember you had performance issues. In fact we discussed using simulated materialised views as a potential solution:

[...] GET request [...] API will timeout [...] SQL query being executed takes ~60-90 seconds due to joining large tables.

Ouch!!! Indeed that does sound like a likely culprit. In web apps like this we generally optimise for read performance because
reads happen far more regularly than writes. And in situations like yours we often use "materialised views"
(https://en.wikipedia.org/wiki/Materialized_view). Unfortunately we're using MySQL/MariaDB which doesn't do materialised
views :( All is not lost though, that Wikipedia article talks about simulating them with triggers...

pre-join these tables into one table

Indeed, this is exactly what you should do. Here's a very nice blog/article you should read about how to do it properly.
"METHOD 1" is the method to use IMHO. http://www.coding-dude.com/wp/databases/creating-mysql-materialized-views/

What came of this? I'm pushing back hard on this because switching database server on the R910 is, to dramatically understate it, not a trivial task. For perspective, that's true wherever you host CAMEL: switching SQL implementations even ones as compatible as Maria and MySQL is not a decision to be taken lightly.

Copy link
Member

@MaybeJustJames MaybeJustJames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok @nichael-meilson, these comments are a brief look over your work. It's clear you've put a lot of effort into this and I'm confident these changes work for you. My job here is to ensure that we can continue to maintain them and ensure we can safely deploy them.
To this end there are a couple of major things I'd like you to consider doing for me please (in order of priority):

  1. Use a code formatter on the js: my suggestion
  2. Add some unit tests. This isn't going to be easy since there are no existing tests and your code wasn't written with testing in mind. The trade-off is that it they make me more confident that I can safely accept your changes. Ideally I'd like you to test only your changes using frameworks like Py.Test for the server, and Jest for the client.
  3. Do some linting. ESLint which integrates with Prettier for the client, and PyLint for the server.

The above requests only apply to your changes. Please don't feel you have to change any of the existing code. Please don't change existing code (in this PR; you're extremely welcome to change existing code in a different PR).

These comments seem harsh. I don't mean to belittle the huge effort you've put in to get as far as you have. Really good job. I hope you can put some more effort in so we can deploy your work :)

update: {method: 'PUT'}
});
})
.factory("Mutation", function MutationFactory($resource, config) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful with consistent formatting. You're using TAB characters in a space indented file.

return $resource(config.apiUrl+"/mutation/:id", {id: '@id'}, {
update: {method: 'PUT'}
});
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting

update: {method: 'PUT'}
});
})
.factory("FieldMut", function FieldMutFactory($resource, config) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting

return $resource(config.apiUrl+"/field_mut/:id", {id: '@id'}, {
update: {method: 'PUT'}
});
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting

controller: 'ExperimentsController',
controllerAs: 'experiments'
})
.when('/mutation/:id', {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting

if (State.expFields.length == 0){
ctrl.fields = FieldMut.query(function(){
var i = 0;
for (var field_i in ctrl.fields){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this supposed to do?

ctrl.toggleFilterItem = function(field){
if (field.filter){
field.filter = false;
if (field.hasOwnProperty('type_column')){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is all this logic supposed to do?

field.show = !field.show;
}

ctrl.filterUpdate = function(filter_id){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation

scope.itemsPerPage = 20;

//Sort functionality
if (!State.expOrder.hasOwnProperty('key')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not using State.expOrder.key, why are you checking for it?

scope.orderParams = State.expOrder;
}

scope.sortExperiments = function (key, keyRealm = 'exp') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really prefer you not use default parameter values, please be explicit

@MaybeJustJames
Copy link
Member

@nichael-meilson if you're at all confused about how to work on my suggestions please don't try to guess your way through. Let me know and we can pair on it.

@MaybeJustJames
Copy link
Member

Ping @nichael-meilson ?

@nichael-meilson
Copy link
Author

Hey @MaybeJustJames, I'm a bit busy with work (+ I have to re-sit some exams) so I was planning on going over your comments on September 4th. Do you have a deadline in mind for this?

@MaybeJustJames
Copy link
Member

Do you have a deadline in mind for this?

No worries from our side @nichael-meilson . You should talk to @BramVDBe about a deadline. Best of luck with your exams :)

@BramVDBe
Copy link
Collaborator

Hi @nichael-meilson,

any update on this?

kind regards
Bram

@nichael-meilson
Copy link
Author

Hi @BramVDBe,

Progress has been slow as I was setting up a development environment on a new computer but things seem to be working now. I'm nervous to see how buggy my code is when I start testing but it's all necessary

@MaybeJustJames
Copy link
Member

I'm nervous to see how buggy my code is when I start testing but it's all necessary

Don't be nervous about this @nichael-meilson . You are not your code and finding bugs now means users won't see them in prod.

By the way, can you respond to my MySQL materialized views query above? I want to know why triggering updates to a cache table (simulated materialized views) wasn't an option?

@MaybeJustJames
Copy link
Member

Ping @nichael-meilson , any progress? Please let me know if you're struggling

@MaybeJustJames
Copy link
Member

Ping @nichael-meilson

@nichael-meilson
Copy link
Author

Hi All, sorry for the delayed reply - I've been swamped with work lately.

Fortunately I think I've figured out how to appropriately test the API, but unfortunately, there needs to be some massive overhauling to the API. For instance, making a post request means sending a pandas df object retrieved from the mutation prediction tools like this:

CHROM_ID Start POS End POS TYPE ...
NC_000913 506 506 DEL ...
... ... ... ... ...

And convert them to a format like this to match the DB schema which is made with JavaScript functionality in mind, like this:

experiment_id mutation_id field_id value_VARCHAR value_TEXT value_INT ...
784 1 41 NC_000913 ...
784 1 42 506 ...
784 1 43 506 ...
784 1 44 DEL ...
... ... ... ... ... ... ...

Which, to put lightly, is a pain, and to put heavily, slows down data retrieval performance exponentially

Regarding MariaDB, it doesn't support nested querying (current solution) or materialized views. Apparently it only supports (non-materialized) views which I don't think is the solution to this issue. Additionally, I'm not extremely versed in more complex concepts of SQL so I would have to learn how to do all of this before determining if it's an option

As for the JavaScript testing, I don't actually know how to write JavaScript and basically learned enough of the angular concepts to decipher existing UI and replicate it so I genuinely don't know how/where one would begin to test

So yeah, it's still a ton of work from my perspective and I'm even having a hard time figuring out how many things need to be fixed and in what order they need to be done

@BramVDBe
Copy link
Collaborator

BramVDBe commented Nov 8, 2021

Hi @nichael-meilson,

Is there anywhere @MaybeJustJames could help out to make it a more efficient transition? I know James is also swamped likely but perhaps there things that he could perform swiftly?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants