Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shallow cloning can create pull requests with one million commits #2556

Closed
marc-hb opened this issue Nov 6, 2020 · 11 comments
Closed

Shallow cloning can create pull requests with one million commits #2556

marc-hb opened this issue Nov 6, 2020 · 11 comments
Assignees
Labels
P3 Low-impact bugs or features

Comments

@marc-hb
Copy link
Collaborator

marc-hb commented Nov 6, 2020

Summary: the gap left when shallow cloning can cause git merge-base to report a totally wrong merge base. This wrong merge base makes the pull request look like it's submitting all git commits since the start of the project.

Experienced in failing https://travis-ci.org/github/thesofproject/linux/jobs/740309392, https://travis-ci.org/github/thesofproject/linux/jobs/739977895 and others.

Bug announced in #2547 and tentative #2543

Cc: @kv2019i , @plbossart , @ranj063 , @xiulipan

Plug'n'play reproduction with the 100-lines long shallow_backfire.sh script. This reproduction script is purely made of git commands only: no github, no Travis,...

# After git clone --depth 20 target_branch :


#                     I---------------------------------->             upstream        
#                   grafted                               \                            
#                                                          \                           
#                                                           \                          
#                                                            \  frequent "back merges" 
#                                                             \                        
#                                                              \                       
#                                                               \                      
#                                                                v                     
#                                                        I-------------> target branch 
#                                                      grafted                         



#  After git fetch pull_request :


#                          Spurious                                                    
#                          pull request                                                
#                          merge base                                                  
#    ---------         I---------------------------------->             upstream       
#              \     grafted   \                           \                           
#               \               \                           \                          
#                \               \                           \                         
#   1 million     \               \                           \
#   commits!       \               \                           \                       
#                   \               \      Actual               \                      
#                    \               \     pull request          \                     
#                     v               v    merge base             v                    
#   --------------------------------------------          I-------------> target branch
#                                                \      grafted                        
#                                                 \                                    
#                                                  \                                   
#                                                   -  pull_request                    

# ASCII art made with the excellent textik.com

cc:

@keyonjie
Copy link

keyonjie commented Nov 9, 2020

@marc-hb this should be reported on the CI repo?

@keyonjie keyonjie added the Travis CI Travis CI issues label Nov 9, 2020
@marc-hb marc-hb removed the Travis CI Travis CI issues label Nov 9, 2020
@marc-hb marc-hb changed the title Shallow cloning in CI can create pull requests with one million commits Shallow cloning can create pull requests with one million commits Nov 9, 2020
@marc-hb
Copy link
Collaborator Author

marc-hb commented Nov 9, 2020

Not sure what the "CI repo" is. I removed "CI" from the title. Anyone or anything using shallow clones is affected, not just CI.

Only Linux is affected because other repos use git in much simpler ways.

It's probably not a bug, just a strange but probably intended git fetch behaviour/limitation.

Maybe there's some magit git fetch option to workaround this issue?

@keyonjie
Copy link

@marc-hb I change this to P3 as it is not our driver related or at least nobody is actually looking into it, feel free to change it if needed.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Feb 22, 2021

No, problem it was not P1 or P2 either :-)

One possible fix: fetching the PR commits directly:
https://docs.github.com/en/rest/reference/pulls#list-commits-on-a-pull-request

Github now allows fetching SHA1s directly , see zephyrproject-rtos/west#475 and https://git-scm.com/docs/git-fetch-pack

@marc-hb marc-hb self-assigned this Feb 22, 2021
marc-hb added a commit to marc-hb/sof that referenced this issue Mar 4, 2021
Let's try to run checkpatch in an open-source and fast way while
leveraging Github's very good user interface.

Run checkpatch both with and without --strict; this has been a source of
confusion in the past, see
thesofproject/linux#1988

Note this attempt does not rely on git merge-bases which has been found
to be not compatible with shallow cloning:
thesofproject/linux#2556

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
marc-hb added a commit to marc-hb/sof that referenced this issue Mar 4, 2021
Let's try to run checkpatch in an open-source and fast way while
leveraging Github's very good user interface.

Run checkpatch both with and without --strict; this has been a source of
confusion in the past, see
thesofproject/linux#1988

Note this attempt does not rely on git merge-bases which has been found
to be not compatible with shallow cloning:
thesofproject/linux#2556

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb
Copy link
Collaborator Author

marc-hb commented Mar 4, 2021

This different approach should fix it: thesofproject/sof#3899

3899 is submitted to thesofproject/sof for now but I tested it locally with the large #2772 and it ran successfully in less than 10 minutes without trying to checkpatch every linux commit since the dawn of time. So 3899 could be copied (sorry) to thesofproject/linux later.

lgirdwood pushed a commit to thesofproject/sof that referenced this issue Mar 5, 2021
Let's try to run checkpatch in an open-source and fast way while
leveraging Github's very good user interface.

Run checkpatch both with and without --strict; this has been a source of
confusion in the past, see
thesofproject/linux#1988

Note this attempt does not rely on git merge-bases which has been found
to be not compatible with shallow cloning:
thesofproject/linux#2556

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@plbossart
Copy link
Member

@marc-hb do we still have an issue or did you fix this with your work on GitHub actions

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jun 10, 2021

I prototyped something in thesofproject/sof and it's been working great.
https://github.com/thesofproject/sof/blob/73df64444b752/.github/workflows/codestyle.yml#L38

It only needs a couple minor fixes and a way to re-use in this repo here without copy/paste/diverge. Afraid the latter would mean creating a new Github Action, something I've never done before.

I would like to keep this issue open as low priority because that is its accurate status right now but no big deal if you prefer to close it. Thanks for asking.

@plbossart
Copy link
Member

I'll close @marc-hb, you're the only one working on this and we don't need to track this as an issue. Thanks for your work btw, I promise that on a rainy day I'll finally try to understand the git clones :-)

@marc-hb
Copy link
Collaborator Author

marc-hb commented Mar 2, 2023

Looks like much better options are now available in git:
https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/

@plbossart
Copy link
Member

What is the suggested improvement @marc-hb ?

@marc-hb
Copy link
Collaborator Author

marc-hb commented Mar 6, 2023

Clone using one of the (new?) --filter=XX options described in the article. I don't know which one, it requires some testing.

On a different performance topic, this could help share CI scripts and CI optimizations across multiple github projects: https://docs.github.com/en/actions/using-workflows/reusing-workflows. Again this requires investigation and testing.

cc:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 Low-impact bugs or features
Projects
None yet
Development

No branches or pull requests

3 participants