-
Notifications
You must be signed in to change notification settings - Fork 26
Use UAST columns from FilePairs table #206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
server/repository/file_pairs.go
Outdated
err := queryRow.Scan(&pair.ID, | ||
&pair.Left.BlobID, &pair.Left.RepositoryID, &pair.Left.CommitHash, | ||
&pair.Left.Path, &pair.Left.Content, &pair.Left.Hash, | ||
&pair.Left.Path, &pair.Left.Content, &pair.Left.UAST, &pair.Left.Hash, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases, UAST can be really big, and it is not needed for printing FilePairs, so it could be removed from the query.
Let's keep it in mind for the future in case we notice that getting the next file pair is delaying too much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaving this red cross here to avoid merging this PR till we have considered/solved/scheduled migration
update: → #214
blob_id_a TEXT, repository_id_a TEXT, commit_hash_a TEXT, path_a TEXT, content_a TEXT, | ||
blob_id_b TEXT, repository_id_b TEXT, commit_hash_b TEXT, path_b TEXT, content_b TEXT, | ||
blob_id_a TEXT, repository_id_a TEXT, commit_hash_a TEXT, path_a TEXT, content_a TEXT, uast_a BLOB, | ||
blob_id_b TEXT, repository_id_b TEXT, commit_hash_b TEXT, path_b TEXT, content_b TEXT, uast_b BLOB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type blob is not supported in postgres
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is an internal tool, and (internally) we're not using PostgreSQL but SQLite, I'd not block this PR because of this problem with PostgreSQL.
If Blobs cannot be easily solved for PostgreSQL, I'd merge as it is, and create a new issue to work on it when facing v1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do something similar to this:
https://github.com/src-d/code-annotation/blob/master/server/dbutil/dbutil.go#L38
And use bytea
type for postgres. We'd have to check if it works different or has any other side effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it requires extra work, I'd not block the merge as explained at #206 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thank you for pointing it out @carlosms
So, if that (and that's how it seems from example above) would require some custom code, to change the query at the runtime by replacing the type, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. For the <INCREMENT_TYPE>
the substitution is done in this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done by:
- 836d8db Add PostgreSQL support for BLOB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering PostgreSQL issue pointed out by #206 (review) if we decide to merge without providing support to PostgreSQL, we should remove from our docs the mentions to the support of PostgreSQL.
On the other hand, if the PostgreSQL issue is solved, we can disregard this comment
the plans for the migration were defined by #214 |
dfbc202
to
8874abd
Compare
I rebased and fixed a bug in the original branch with 04f34a0 @carlosms @bzz could you review this PR considering #206 (review) ? |
|
||
The pieces of code to be labeled are called _file pairs_. They must be provided via an [SQLite](https://sqlite.org/) database. The database **must follow the expected schema**, please [follow this link](./cli/import/examples/example.sql) to see an example. | ||
|
||
The `import` command will use those file pairs to create a new [SQLite](https://sqlite.org/) or [PostgreSQL](https://www.postgresql.org/) database that will be used internally by the Annotation Tool. The destination database does not need to be empty, new imported file pairs can be added to previous imports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a discussion on postgresql support but at any rate, even if this PR breaks it - removing existing Posgresql support in this PR does not seem like a good idea to me.
If merging fast of this work is required, I would propose create a new issue for adding the support of switching a type on top of this work, and then implementing it asap after this is merged, rather then having both - adding new feature and removing an old one in the same PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me explain the current situation:
- I don't see the benefit of blocking the new feature (UASTs in FilePairs) till we fix a problem that we do not have yet (nobody is using PostgreSQL yet)
- If we don't block this PR, it would be merged, so the PostgreSQL support would be broken the facto. If we leave the docs (and the code) as they currently are, it could confuse people trying to use PostgreSQL from now on.
- I don't see the benefit of releasing something we already know it is broken.
- I'm not purposing to drop definitively the PostgreSQL support; It can be resumed considering the issue Blob is not supported in PostgreSQL #222 . The commit dropping PostgreSQL support 94d41ab can be reverted; nothing is lost.
Thanks a lot for fixing bug with an order of columns. I must be missed it. Though I would disagree with the last commit. Same thoughts as of @bzz above. If we need to merge this PR now, just merge it as it is with BLOB type column without removing code for postgres altogether. And later we can fix it using bytea type. |
6e6169f
to
edf2bbb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you @bzz @smacker @carlosms review the PR again considering the comments below?
I Rebased and restored the offending commit dropping PostgreSQL support as you suggested. I found some other problems with PostgreSQL that should be solved separately.
I rebased over master, and fixed some other problems that I found in the original branch:
- 7f05627 Avoid star select to avoid problems with some sources
- edf2bbb Reorder columns to ensure that export command works
I also added support for BLOB data type if PostgreSQL DB is used. I tried it locally, and I could only import the first 100 FilePairs, that were properly processed and its content was the same that when they are imported into an SQLite database. The rest 1900 FilePairs could not be imported because of an unrelated problem as described by #224
- 836d8db Add PostgreSQL support for BLOB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you @dpordomingo for taking care, addressing all feedback. 👍 for having tests
Signed-off-by: Maxim Sukharev <maxim@sourced.tech>
edf2bbb
to
fa2f839
Compare
Fixes #197
Fixes #222
Blocks #221