Skip to content

Conversation

carlosms
Copy link
Contributor

@carlosms carlosms commented Jan 31, 2018

Implements #41.

Please take a look @dpordomingo, @smacker.

TODOs:

  • Update README with new env variable (done in e23160c)
  • Check if DB strings need to be escaped

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
@carlosms carlosms requested a review from bzz January 31, 2018 14:47

// start the router
router := server.Router(logger, jwt, oauth, conf.UIDomain, userRepo, "build")
router := server.Router(logger, jwt, oauth, conf.UIDomain, db.SQLDB, "build")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it look weird to pass DB connection to a router only for me?
(no changes required, just asking)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with moving the repositories initialization back to the server.go file. I just did that to avoid having lots of arguments here. But of course we could always store all the repos in a single struct...

Let me know if I should change this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, please don't. It simplifies the code.

// DB groups a sql.DB and the driver used to initialize it
type DB struct {
sqlDB *sql.DB
SQLDB *sql.DB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about composition here?
It would allow not write db.SQLDB.method everywhere, just db.method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 77f23a1

requestedAssignmentID := chi.URLParam(r, "assignmentId")
assignmentID, err := strconv.Atoi(requestedAssignmentID)
requestedPairID := chi.URLParam(r, "pairId")
pairID, err := strconv.Atoi(requestedPairID)
Copy link
Contributor

@smacker smacker Jan 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we do it often. Just an idea for the future:

func URLParamInt(r, key) (int, error) {
  str := chi.URLParam(r, "pairId")
  return strconv.Atoi(str)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 01f1cbc

}

if user == nil {
user, err = userRepo.Create(ghUser.Login, ghUser.Username, ghUser.AvatarURL, model.Requester)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this pattern with passing fields of structs a few times already.
No changes are required now, but I prefer to pass a model to a repository.
It makes life easier when the model starts growing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. But the problem, in this specific case, is that there is a field of the User model that is not up to you, the ID will be chosen by the DB.

You could create a User model and leave the ID as 0, but that can lead to bugs in my opinion... you may be tempted to continue using the User object you created. In the current code it is clear for the caller that the complete User object will be provided by Create.

Any other opinions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't get the problem with ID.

u := User{login,name,...}
err := repo.save(&u)
fmt.Println(u.ID) // prints user id

keep id as zero value for objects that aren't saved yet is totally normal.

You can see in for example in most popular ORM for go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in b09114c

return nil, fmt.Errorf("Error getting file_pairs from the DB: %v", err)
}

answer := ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to #2 it should be nil. Frontend also expects nil there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a null value in the DB, but isn't it going to end up as an empty string when we read it back?
https://github.com/src-d/code-annotation/pull/52/files#diff-b01fac8859e3c130c7e1a433072473edR24

Copy link
Contributor

@smacker smacker Jan 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no if you use sql.NullString

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok... what about in serializers.go, how does the final json field look like?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be *string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 07e2f9c


// FilePairs repository
type FilePairs struct {
DB *sql.DB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it should be private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DB is initialized in router.go. Or should it be a private field with a setter method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually, people do it with a constructor


userID := service.GetUserID(r.Context())
assignments, err := repository.GetAssignmentsFor(userID, experimentID)
if userID == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do it often. How about moving it to a helper?

func checkUserInCtx(ctx) (id, error)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in c9d26ae

@dpordomingo dpordomingo self-requested a review February 1, 2018 07:08
@dpordomingo
Copy link
Contributor

@carlosms your work looks great,
please, ping me once you address the Maxim feedback to perform a second review :D

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
return nil, fmt.Errorf("DB error: %v", err)
}

rows, err := repo.DB.Query(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defer rows.Close()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in e3886d2

// GetAll returns all the Assignments for the given user and experiment IDs.
// Returns an ErrNoAssignmentsInitialized if they do not exist yet
func (repo *Assignments) GetAll(userID, experimentID int) ([]*model.Assignment, error) {
rows, err := repo.DB.Query(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defer rows.Close()


// FilePairs repository
type FilePairs struct {
DB *sql.DB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually, people do it with a constructor

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Copy link
Contributor

@dpordomingo dpordomingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! that's great, I only found some things (repeated in some places), here is the list:

  • one FilePair can be assigned to the same User many times in different experiments, so it can not be PRIMARY KEY
  • avoid named returns
  • use http status 500 when user can not be fetched
  • what FilePair.Score is?
  • the error returned by repository.Whatever.Create(Whatever) must be the the error returned by the insert query
  • do we really need `repository.Experiment.GetByName(string)?

w.Write(content)
}

// URLParamInt returns the url parameter from a http.Request object. If the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/URLParamInt/urlParamInt
Anyhow, private functions use to be undocummented.

func urlParamInt(r *http.Request, key string) (val int, err error) {
str := chi.URLParam(r, key)
val, err = strconv.Atoi(str)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're trying to avoid named returns.
The code below this comment can be replaced by:

if err != nil {
    err = serializer.NewHTTPError(
        http.StatusBadRequest,
        fmt.Sprintf("Wrong format for URL parameter %q; received %q", key, str)
    )
}

return val, err;

It is only needed to change the returned because val is already 0 when an error occurs.

server/router.go Outdated
r.Get("/", handler.Get(handler.GetAssignmentsForUserExperiment()))
r.Put("/{assignmentId}", handler.Get(handler.SaveAssignment()))
r.Get("/", handler.Get(handler.GetAssignmentsForUserExperiment(assignmentRepo)))
r.Put("/{pairId}", handler.Get(handler.SaveAssignment(assignmentRepo)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not saved a FilePair but an Assignment, that's because it should be:
s/pairId/assignmentId

return func(r *http.Request) (*serializer.Response, error) {
requestedAssignmentID := chi.URLParam(r, "assignmentId")
assignmentID, err := strconv.Atoi(requestedAssignmentID)
pairID, err := urlParamInt(r, "pairId")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not saved a FilePair but an Assignment, that's because it should be:
s/pairId/assignmentId

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense assuming the current primary key of the assignments table is correct.
Right now there isn't an assignment ID (again, following the schema).

If this looks bad, we could actually change the route to be PUT /experiments/{experimentId}/, and get hte pairID from the body.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add assignmentId to schema to follow rest conventions and don't rewrite frontend (I believe it's easier to add id, than change FE)

}

if err := repository.UpdateAssignment(assignmentID, assignmentRequest.Answer, assignmentRequest.Duration); err != nil {
err = repo.Update(userID, pairID, assignmentRequest.Answer, assignmentRequest.Duration)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not saved a FilePair but an Assignment, that's because it should be:
s/pairId/assignmentId

We considered that the same FilePair can be assigned to the same User many times in different experiments, so it can not be PRIMARY KEY

}

cmd := fmt.Sprintf(
"UPDATE assignments SET answer='%v', duration=%v WHERE user_id=%v AND pair_id=%v",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We considered that the same FilePair can be assigned to the same User many times in different experiments, so it can not be PRIMARY KEY

The idea of checking the user_id is redundant if it is used the assignmentId, but it can be a good idea to ensure that the assignment being saved, belongs to the user in session.

"INSERT INTO experiments (name, description) VALUES ($1, $2)",
exp.Name, exp.Description)

if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem
Can the insertion succeed, but the Get fail?

return err

"INSERT INTO users (login, username, avatar_url, role) VALUES ($1, $2, $3, $4)",
user.Login, user.Username, user.AvatarURL, user.Role)

if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem
Can the insertion succeed, but the Get fail?

return err

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it's unlikely, the Get may fail if the DB connection breaks, or someone deletes the entry at that very exact moment.
In that case the user pointer is not correctly populated with the assigned ID, and I'd say the method should notify of this returning an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm... I see two patterns:

  • error := userRepo.Create(user) -> it should not modify the user, but return an error if it failed storing in the DB
  • error := user.Save() -> it stores the user, and assigns its ID

If we go with the first one, the ID should be assigned from the outside of the Create method, I mean: instead of hiding the .Get call inside the .Create method, doing it from the outside:

if error := userRepo.Create(userToStore); err != nil { 
    return err
}
// in some cases, we won't need to continue, so the Get call won't be
// neccessary, but if we need a user with ID, we can continue:
user, err := userRepo.Get(user.Login)
if err != nil { 
    return err
}
//at this point, we have a valid user
fmt.Println(user)

^ would you approve that @smacker ?

I saw that many times to avoid hiding complexity in methods that seem to be "low level" (and userRepo.Create seems to be that kind of methods)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS:
if it is assumed a possible race condition and the necessity of calling .Get from the .Create to modify the user retrieved as an input parameter, it can happen unexpected things;
It can happen that the final (modified) user will be quite different than the input one (another Name, Login...) so the caller function will operate with that user without noticing it was "changed in behind" by the .Create func.

(that's another reason because it could be a bad idea to modify the input user inside the .Create func)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with current solution and your proposal is okay too. Both approaches are common. For error in get and not in exec people invented transactions if we really need it.

I would suggest to merge it as is now.


// Get returns the Experiment with the given name. If the Experiment does not
// exist, it returns nil, nil
func (repo *Experiments) Get(name string) (*model.Experiment, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we assume this method is going to be needed in the future?
I can not imagine with which purpose... Any idea?

If it makes sense to need it, I'd rename by GetByName (same as GetById)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would propose to avoid Not implemented methods at all as long as we don't need them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that non implemented methods are a good way to define the architecture and the API, but only when it is really possible to need them. But in this case, I don't see when are we using repo.Experiment.GetByName

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's exactly what I mean. If we don't see where we are going to use them soon ("don't need" in my previous comment), let's avoid them.


type githubUser struct {
// GithubUser represents the user response returned by the GitHub auth.
type GithubUser struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does not need to be public, because handlers does not build it, just use it

@bzz
Copy link
Contributor

bzz commented Feb 6, 2018

@carlosms to speed up the reviews, could you please summarize the current status of the PR?

Asking, as In PR description there are some TODO items that are not checked and then if I'm not mistaken, there seems to be some un-addressed feedback.

If addressing the reviews requires substantial time, I find very useful putting together a comment with markdown checkboxes, to visualize the progress better and this way to communicate the status with reviewers.

@carlosms
Copy link
Contributor Author

carlosms commented Feb 6, 2018

@bzz the description checkboxes were TODOs I found myself to remind me of doing that when the final version before the merge was ready.

  • Update README with new env variable (done in e23160c)
  • Check if DB strings need to be escaped

The current un-adressed feedback are the following two changes:

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Remove named return values
Use unexported types
Remove unused methods

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
@carlosms
Copy link
Contributor Author

carlosms commented Feb 6, 2018

Do you @smacker & @dpordomingo think we can leave the requirement 'Allow file-pairs to belong to several experiments' out of this PR?

I propose to open an issue to do it later it in a different PR once this branch is in master. Reasoning:

}

if userID != assignment.UserID {
return nil, serializer.NewHTTPError(http.StatusUnauthorized,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. StatusForbitten may be better here. It's a common approach to logout user on StatusUnauthorized when there is a separate frontend application. Because you usually return this error in auth middleware. (for example when token is outdated)
  2. is it better to Get(id, userID)?

not a blocker though

func (repo *FilePairs) getWithQuery(queryRow *sql.Row) (*model.FilePair, error) {
var pair model.FilePair

var a, b model.File
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something weird is going on here. You never use a&b beside scanning values to them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this should be pair.Left and Right. I missed it because the serializer does nothing with these two values.

r.users = append(r.users, m)
return nil

user, err = repo.Get(user.Login)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need this get to get id?
this way it's not going to work. Simplified example of what you are doing: https://play.golang.org/p/dlEmRgTL1uN

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Copy link
Contributor

@dpordomingo dpordomingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; you did a great work @carlosms
Many thanks!!!

I have only one concern about the final implementation as explained in the comment #52 (comment) and #52 (comment), but if @smacker disagrees I can stand without it.

The thinks you marked as "TODO", imo can be done in a separated PR, and they're not blocker:

  • Update README with new env variable -> just agree with @smacker how to push to #29
  • Allow file-pairs to belong to several experiments -> your current implementation seems to follow our specs, and it works; it could be done in the future if really needed.

Before merging I'd only require:

  • open an issue explaining the (possible) problem with unescaped things,
  • rebase and squash to obtain a history that makes sense to you.

"INSERT INTO users (login, username, avatar_url, role) VALUES ($1, $2, $3, $4)",
user.Login, user.Username, user.AvatarURL, user.Role)

if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm... I see two patterns:

  • error := userRepo.Create(user) -> it should not modify the user, but return an error if it failed storing in the DB
  • error := user.Save() -> it stores the user, and assigns its ID

If we go with the first one, the ID should be assigned from the outside of the Create method, I mean: instead of hiding the .Get call inside the .Create method, doing it from the outside:

if error := userRepo.Create(userToStore); err != nil { 
    return err
}
// in some cases, we won't need to continue, so the Get call won't be
// neccessary, but if we need a user with ID, we can continue:
user, err := userRepo.Get(user.Login)
if err != nil { 
    return err
}
//at this point, we have a valid user
fmt.Println(user)

^ would you approve that @smacker ?

I saw that many times to avoid hiding complexity in methods that seem to be "low level" (and userRepo.Create seems to be that kind of methods)

"INSERT INTO users (login, username, avatar_url, role) VALUES ($1, $2, $3, $4)",
user.Login, user.Username, user.AvatarURL, user.Role)

if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS:
if it is assumed a possible race condition and the necessity of calling .Get from the .Create to modify the user retrieved as an input parameter, it can happen unexpected things;
It can happen that the final (modified) user will be quite different than the input one (another Name, Login...) so the caller function will operate with that user without noticing it was "changed in behind" by the .Create func.

(that's another reason because it could be a bad idea to modify the input user inside the .Create func)

@dpordomingo
Copy link
Contributor

Since @smacker is ok with both approaches #52 (comment) lets merge when @bzz approves the PR and:

  • open an issue explaining the (possible) problem with unescaped things, if necessary,
  • rebase and squash to obtain a history that makes sense to you.

@bzz
Copy link
Contributor

bzz commented Feb 7, 2018

Both, handling 'Allow file-pairs to belong to several experiments' out of this PR and adding usage of the transactions for complex failure modes in subsequent PRs makes sense to me.

Great job @carlosms !

Probably a nitpick but 2 things:

  • One thing I find useful is to have string literal extracted to the constants inside the code.
    The reasons are:

    • it makes SQL queries more visible
    • they are grouped in the single place
    • it simplifies the subsequent refactoring
    • and facilitates re-use (easier to notice if you already have it defined)
  • Another minor but nice practice is
    In case you are using //TODO to label some short-term subsequent fixes is to save some time for a readers by adding //TODO(gh-username) so no need for git glame while looking at it later.

Would you mind adding those?

One very last and very minor thing, but for me, as a reviewer, it's a bit confusing to merge PR with todo checkboxes not checked in PR description/comments of PR author - they are communicating the idea of more work needed to be done. It would help to avoid the confusion if you could update #52 (comment) and #52 (comment) .

Feel free to use strikethough for items that are intended not to be checked in this PR, as they have a link to elsewhere, where this item is been taken care of.

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
carlosms added a commit to smacker/code-annotation that referenced this pull request Feb 7, 2018
This new env var is introduced by code in
src-d#52

Signed-off-by: Carlos Martín <carlos.martin.sanchez@gmail.com>
Copy link
Contributor

@bzz bzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Thank you Carlos!

@carlosms
Copy link
Contributor Author

carlosms commented Feb 7, 2018

Replying @dpordomingo

open an issue explaining the (possible) problem with unescaped things, if necessary,

With the final code strings are inserted using the $N query placeholders, which should be safe. The only place where a "raw" string is used is here, but the string is validated to be one of model.Answers. So I won't open an issue, no work is needed.

In reply to @bzz

Both, handling 'Allow file-pairs to belong to several experiments' out of this PR

New issue created: #66

and adding usage of the transactions for complex failure modes in subsequent PRs makes sense to me.

Are you refering to this comment #52 (comment) ? I don't think a transaction is the proposed solution, as @smacker suggested to leave the current code as it is.

@bzz
Copy link
Contributor

bzz commented Feb 7, 2018

Sounds good to me. Let's :shipit:

@carlosms carlosms merged commit 2fd37c9 into src-d:master Feb 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants