Skip to content

WIP: Add GET_OID_GENTLY #504

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

WIP: Add GET_OID_GENTLY #504

wants to merge 6 commits into from

Conversation

ungps
Copy link
Contributor

@ungps ungps commented Jun 7, 2018

No description provided.

@ungps ungps force-pushed the get_oid_gently branch 7 times, most recently from 5fc4ce1 to ade03a8 Compare June 9, 2018 20:37
Copy link
Contributor

@tgummerer tgummerer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if some of this has been discussed before, but I can't find the old comments.

I'm not very familiar with this code, but hope this helps anyway :)

sha1-name.c Outdated
if (ret == -1) {
return -1;
}
if (ret) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can still die in this if condition, even after this change I think, in line 833. Or am I missing something, and that case would only be hit if ret == -1? I feel like it would be clearer to just guard against the die here, to not leave readers such as myself wondering about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it should be guarded against the die, just to be safe and make code more easy to be maintained.

sha1-name.c Outdated
return NULL;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the curlies here is an unrelated change, and probably shouldn't be done if there's no good reason for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Old habits die hard.

sha1-name.c Outdated
if (!is_inside_work_tree())
if (!is_inside_work_tree()) {
if (gently) {
return (char*) resolve_error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns a char *, which in C often means: "I transfer ownership over this return value to you, and you're now responsible for freeing it", while returning const char * means "I'm keeping ownership of this, please do not free it", and existing callers do indeed free the value.

So here the callers of resolve_relative_path_gently are supposed ot free what they get returned from this function, so this would probably crash if we hit this code path. (edit: I have now looked at the callsites, and if used like that it actually seems okay)

I also find it a bit weird that we're returning an error this way here. Would it be better to change the function signature to return an int as an indication whether or not the function call succeeded, and have an out parameter to return the string? Dunno what's best here.

Copy link
Contributor Author

@ungps ungps Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also in doubts which would be the best way to approach this, since I started to make the gently of the function. I guess that, by going with the int resolve_relative_path_gently(), there shouldn't be any problem regarding the free() calls in the future.

@dscho , what is your opinion regarding this matter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contract of this function is to return NULL if nothing was resolved. I think we should keep this behavior, and output the message instead.

sha1-name.c Outdated
startup_info->prefix ? strlen(startup_info->prefix) : 0,
rel);
startup_info->prefix ? strlen(startup_info->prefix) : 0,
rel);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unrelated indentation change. If you want to make it this should be in a separate commit, although I'm not convinced it's worth making the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right.

tree-walk.c Outdated
}
return -1;
}

int get_tree_entry(const struct object_id *tree_oid, const char *name, struct object_id *oid, unsigned *mode)
int get_tree_entry_gently(const struct object_id *tree_oid, const char *name, struct object_id *oid, unsigned *mode, int gently)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was already long, and is now getting even longer. Maybe it's time to break it up, and put parameters on a new line?

Although I do see a lot of function signatures that are all long and on a single line, so maybe we can shrug this off as "follows existing style in this file".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that the best way would be to do as Documentation/CodingGuidelines states. In a future patch (since it is not related to GET_OID_GENTLY), I will fix the existing tiny mistakes regarding code styling.

I will edit this.

tree-walk.c Outdated
if (gently) {
if (init_tree_desc_gently(&t, tree, size)) {
free(tree);
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to set retval to -1 here, and then goto done. If at some point this function grows, and we allocate some more memory before here, when we make the change we would probably forget to free it. Although I see we do things the same way in line 557-558, so maybe that's not a valid concern.

Copy link
Contributor Author

@ungps ungps Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense. I am not sure though what is git org's opinion on using goto statements. In most cases, such a statement would make the code more unreadable; but I think in this case it would be good alternative and actually make the code future-proof.

Thanks for the review!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ git grep goto | wc -l
1032

While there may be a few false positives in there, from a quick look it looks like mostly these are goto statements. So I'd say it's fairly widespread, and I do feel like it would make sense to use it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fairly common to use goto in Git's source code to simplify resource management (free, close) in error code paths.

Copy link
Contributor

@tgummerer tgummerer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also in doubts which would be the best way to approach this, since I started to make the gently of the function. I guess that, by going with the int resolve_relative_path_gently(), there shouldn't be any problem regarding the free() calls in the future.

Ah I didn't realize there was some previous discussion on this. So while I find returning a return value easier to understand personally, I'll trust @dscho's opinion on it, he has a lot more experience in this codebase :)

(Have to reply outside of the thread, as GitHubs UI seemingly doesn't allow continuing that thread after a force push, one of the more annoying parts of the GitHub UI)

tree-walk.c Outdated
if (gently) {
if (init_tree_desc_gently(&t, tree, size)) {
free(tree);
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ git grep goto | wc -l
1032

While there may be a few false positives in there, from a quick look it looks like mostly these are goto statements. So I'd say it's fairly widespread, and I do feel like it would make sense to use it here.

@dscho
Copy link
Member

dscho commented Jun 16, 2018

A word on the commit messages: please use the imperative, i.e. Add .... In most cases, Junio prefers something like tree-walk: add two new gentle helpers (i.e. starting by a word or term followed by a colon, then continuing in lower-case).

Also, I wonder how those two commits came about that are not authored by you, but committed by you... was this a funny rebase?

@ungps ungps force-pushed the get_oid_gently branch 3 times, most recently from e8ed335 to 3aac2d4 Compare June 16, 2018 15:37
@ungps
Copy link
Contributor Author

ungps commented Jun 16, 2018

I changed the name of the commits to be more alike to the format you suggested. I think that I should also squash WIP: tree-walk: add two new gentle helpers and WIP: tree-walk: add one new gentle helper into one commit.. What do you say?

About the rebase, I do not remember what I did (I bet it was something stupid), but now it's all good! 👍

@dscho
Copy link
Member

dscho commented Jun 17, 2018

I changed the name of the commits to be more alike to the format you suggested.

Thanks!

I think that I should also squash WIP: tree-walk: add two new gentle helpers and WIP: tree-walk: add one new gentle helper into one commit.. What do you say?

That sounds like a good idea.

About the rebase, I do not remember what I did (I bet it was something stupid), but now it's all good! 👍

:-) I still surprise myself at times with entertaining rebase results...

BTW how about running this through git rebase --autosquash? By now, I think I forgot enough about this PR that I would like to start a fresh review from scratch, sound okay?

@ungps ungps force-pushed the get_oid_gently branch 3 times, most recently from a476b2f to 232656d Compare June 17, 2018 11:48
@ungps
Copy link
Contributor Author

ungps commented Jun 17, 2018

Done!

@ungps ungps force-pushed the get_oid_gently branch 3 times, most recently from 0058bbf to 08a3f4c Compare June 17, 2018 15:13
@dscho
Copy link
Member

dscho commented Jun 20, 2018

The test failure looks like some connectivity issue on Travis' side ("No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.")

@@ -1314,12 +1314,14 @@ struct object_context {
#define GET_OID_FOLLOW_SYMLINKS 0100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the commit subject should use gentle instead of gently...

Also, it would be good to have a commit message body that goes a little bit into detail, say, talking about library functions and how callers may have callers may have callers that want to tell the user something helpful upon failure, such as advice how to get out of this mess.

cache.h Outdated
@@ -1314,12 +1314,14 @@ struct object_context {
#define GET_OID_FOLLOW_SYMLINKS 0100
#define GET_OID_RECORD_PATH 0200
#define GET_OID_ONLY_TO_DIE 04000
#define GET_OID_GENTLY 010000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the alignment of the numbers looks off only in GitHub's web view...

cache.h Outdated

#define GET_OID_DISAMBIGUATORS \
(GET_OID_COMMIT | GET_OID_COMMITTISH | \
GET_OID_TREE | GET_OID_TREEISH | \
GET_OID_BLOB)

extern int get_oid_gently(const char *str, struct object_id *oid, int gently);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My instinctive reaction was that the gently is correct in the function name, but it would be gentle in the parameter name.

It seems that Git's source code agrees, by 6:3 votes (compare with git grep -c ', int gentle' vs git grep -c ', int gently').

sha1-name.c Outdated
return get_oid_with_context(name, 0, oid, &unused);
if (gently)
gently = GET_OID_GENTLY;
return get_oid_with_context(name, gently, oid, &unused);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or put it directly into the call:

return get_oid_with_context(name, gently ? GET_OID_GENTLY : 0, oid, &unused);

(It a good practice to avoid changing parameters, in particular when you change their role (from a Boolean to a bitfield of flags, in this case)).

sha1-name.c Outdated
if (is_missing_file_error(errno)) {
char *fullname = xstrfmt("%s%s", prefix, filename);

if (gently)
return -1;
if (!get_tree_entry(tree_oid, fullname, &oid, &mode)) {
die("Path '%s' exists, but not '%s'.\n"
"Did you mean '%.*s:%s' aka '%.*s:./%s'?",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we need those error messages (as warnings in the gentle case). Maybe note this in the commit message as something we might want to consider to do differently?

tree-walk.c Outdated
}
} else {
init_tree_desc(&t, tree, size);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here again, the simpler arm of the if ... else should come first.

}
free(tree);
return retval;
}

int get_tree_entry(const struct object_id *tree_oid, const char *name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this function is used in too many places (23, if I counted correctly) to change the signature: You would have to add tons of , 0, making the patch harder to read. So I'd be okay with renaming get_tree_entry() (but I would suggest describing the reason in the commit message).

tree-walk.c Outdated
tree = read_object_with_reference(&current_tree_oid,
tree_type, &size,
&root);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again: simpler conditional block first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'll not repeat this for the rest of the file, it really applies to all if...else conditionals ;-))

tree-walk.c Outdated
@@ -752,6 +825,12 @@ enum follow_symlinks_result get_tree_entry_follow_symlinks(struct object_id *tre
return retval;
}

enum follow_symlinks_result get_tree_entry_follow_symlinks(struct object_id *tree_oid, const char *name, struct object_id *result, struct strbuf *result_path, unsigned *mode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only three callers of this function, currently. I think we can change the signature to accept that gentle flag.

@@ -1773,12 +1773,12 @@ static int get_oid_with_context_1(const char *name,
if (new_filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit subject: make ... more gentle (it's an adjective in this form, not an adverb).

sha1-name.c Outdated
ret = get_tree_entry(&tree_oid, filename, oid,
&oc->mode);
ret = get_tree_entry_gently(&tree_oid, filename, oid,
&oc->mode, flags & GET_OID_GENTLY);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to invert the commit order, I guess, to make lower-level functions optionally gentle first, and only then use that in subsequent commits to make callers more gentle.

(I know, this is not the way it was developed, but it sounds more logical to me, as a reader of the patches.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I know, this is not the way it was developed, but it sounds more logical to me, as a reader of the patches.)

I totally agree with this. I guess it would make even more sense to split the first commit since it does more things than it states. This way the commit order can be also changed.

In my opinion, this would be the (right) order:

  1. WIP: sha1-name: Add GET_OID_GENTLY flag
  2. WIP: tree-walk: Add three new gentle helpers
  3. WIP: refs.c: Teach read_ref_at() to accept GET_OID_GENTLY flag
  4. WIP: sha1-name: Teach get_oid_basic() to be gentle
  5. WIP: sha1-name: Teach get_oid_with_context[_1]() to be gentle
  6. WIP: sha1-name: Add gentle alternative for get_oid()

First commit only introduces the GET_OID_GENTLY flag; second and third teach low-level functions to accept GET_OID_GENTLY as flag and act upon. Fourth, fifth and sixth make get_oid.*() functions gentle. I guess this would be the correct order, since get_oid_basic() is called by get_oid_with_context_1() which is called by get_oid_with_context() which is called by get_oid().

The first commit also has more details about the process of adding a gentle option get_oid() in the commit message.

@ungps ungps force-pushed the get_oid_gently branch from 08a3f4c to 22da316 Compare July 2, 2018 15:20
The current API does not provide a method to call
`get_oid()` and to avoid `exit()` to be called. This commit
intention is to introduce a flag in order to make `get_oid()`
able to get the sha1 safely, without exiting the program.

Since `get_oid()` calls a lot of functions, which call other
functions as well (and so on), there are a lot of cases in which
`exit()` could be called. To make this idea more clear, here
is one example, which could cause `get_oid()` to die.

  get_oid() -> get_oid_with_context() -> get_oid_with_context_1()
  -> get_oid_1() -> read_ref_at() -> exit()

Where `function1() -> function2()` means that `function1()` might
call `function2()` at some point.

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
@ungps ungps force-pushed the get_oid_gently branch from 22da316 to 225bd20 Compare July 2, 2018 15:22
Add `get_tree_entry_gently()`, `find_tree_entry_gently()`
and `get_tree_entry_follow_symlinks_gently()`, which will
make `get_oid()` to be more gently.

Since `get_tree_entry()` is used in more than 20 places,
adding a new parameter will make this commit harder to read.
In every place it is called there will need to be an additional
0 parameter at the end of the call. The solution to avoid this is
to rename the function in `get_tree_entry_gently()` which gets
an additional `flags` variable. A new `get_tree_entry()`
will call `get_tree_entry_gently()` with `flags` being 0.
This way, no additional changes will be needed.

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
ungps added 4 commits July 16, 2018 23:58
This commit introduces a way to call `read_ref_at()` without
exiting on failure.

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
After teaching `read_ref_at()` we need to teach `get_oid_basic()`
that `read_ref_at()` might not call `exit()`, but to report an
error by the return value.

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
This commit makes `get_oid_with_context()` and `get_oid_with_context_1()`
to recognize the `GET_OID_GENTLY` flag.

The `gentle` flag does not imply `quiet` and we might need to reconsider
whether we should display any message if `GET_OID_GENTLY` is given.

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
Add `get_oid_gently()` to be a gentle alternative to `get_oid()`.

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
@dscho
Copy link
Member

dscho commented Jun 3, 2019

@ungps I guess this is still an ongoing project, eh?

@dscho
Copy link
Member

dscho commented Jun 24, 2019

@ungps ping?

@ungps
Copy link
Contributor Author

ungps commented Jun 25, 2019

@ungps ping?

Sorry for late reply; I was caught with exams and wasn't so active online. But yes, I hope I will resume the work in a week.

@dscho
Copy link
Member

dscho commented Jun 25, 2019

I hope I will resume the work in a week.

Excellent!

@dscho
Copy link
Member

dscho commented Sep 3, 2019

@ungps ping?

@ungps
Copy link
Contributor Author

ungps commented Sep 10, 2019

Hello.

It will take more time than I thought. I started an internship on 1st of July and my schedule is pretty messed up (mostly because of me, not of the internship). Also, I find this issue to be a little bit difficult to work on because it takes a lot time to explore the codebase (go a long way back in the caller graph, through all callees and understand what happens there). I need to find a big window of free time instead of trying to work "fragmented" because I lose the flow in between.

@dscho
Copy link
Member

dscho commented Sep 11, 2019

Makes sense. So let's keep this PR open.

@dscho
Copy link
Member

dscho commented Jun 25, 2020

@ungps are you still planning on working on this? Otherwise, I think it would be okay to simply close this PR.

@ungps
Copy link
Contributor Author

ungps commented Jun 26, 2020

Hi @dscho .
For the moment I think it is better to just close it since I do not know exactly when I will find enough time. I hope that I will be able to work as soon as possible.

@ungps ungps closed this Jun 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants