A mildly simple approach at introducing the 'combined diff format' #8

Open. Wants to merge 6 commits into base: master

Handle empty commit headers
In some cases, especially for some merges, a commit can contain zero files. This can throw off the header parser, which then interprets the headers that follow as the body text of the first commit. Downstream, this causes the GitLeaks tool to associate the leaked data with the wrong commit ID.

This patch currently fails the empty-commit check.  A minor issue with the expected data eludes me at the moment.
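
The failure mode described above can be reproduced in isolation. The sketch below is an illustration only, not the patch's code: `splitIndentedBody` is a hypothetical helper, and it assumes the pretty-printed log indents message lines with four spaces while header lines start at column zero. It uses a plain `strings.HasPrefix` check rather than the comparison made in `scanMessageBody`.

```go
package main

import (
	"fmt"
	"strings"
)

// splitIndentedBody is a hypothetical helper, not part of the library.
// It collects lines that start with indent as the message body and returns
// everything from the first non-blank, non-indented line onward as remainder.
func splitIndentedBody(text, indent string) (body, remainder string) {
	lines := strings.Split(text, "\n")
	var b []string
	for i, raw := range lines {
		trimmed := strings.TrimRight(raw, " \t\r")
		if trimmed == "" {
			continue // blank lines are skipped in this sketch
		}
		if !strings.HasPrefix(raw, indent) {
			// A non-indented line means a new header has started.
			return strings.Join(b, "\n"), strings.Join(lines[i:], "\n")
		}
		b = append(b, strings.TrimPrefix(trimmed, indent))
	}
	return strings.Join(b, "\n"), ""
}

func main() {
	// Assumed input: the tail of an empty commit followed by the next header.
	log := "    With a body.\n\n" +
		"commit 61f5cd90bed4d204ee3feb3aa41ee91d4734855b\n" +
		"Author: Morton Haypenny <mhaypenny@example.com>\n"
	body, rest := splitIndentedBody(log, "    ")
	fmt.Printf("body=%q\n", body)
	fmt.Printf("next header starts with %q\n", strings.SplitN(rest, "\n", 2)[0])
}
```

Returning the unconsumed text lets the caller decide whether to parse it as another header, which is the same shape as the `remainder` value this commit adds.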
groboclown committed Jun 15, 2024
commit a79f9cd52874ad2c40de7d509d1f42d423aadd43
44 changes: 43 additions & 1 deletion gitdiff/parser_test.go
@@ -630,7 +630,49 @@ Date: Tue Apr 2 22:55:40 2019 -0700
TextFragments: textFragments,
},
},
Preamble: binaryPreamble,
Preamble: textPreamble,
},
"emptyCommit": {
InputFile: "testdata/empty-commit.patch",
Output: []*File{
{
PatchHeader: &PatchHeader{
SHA: "48f13f61afb200ad8386573632cba3abd1703af2",
Author: &PatchIdentity{
Name: "Morton Haypenny",
Email: "mhaypenny@example.com",
},
AuthorDate: asTime("2017-11-03T12:29:49Z"),
Title: "A simple change",
Body: "The change is simple.",
},
OldName: "dir/file1.txt",
NewName: "dir/file1.txt",
OldMode: os.FileMode(0100644),
OldOIDPrefix: "422cce7",
NewOIDPrefix: "24b39ed",
TextFragments: textFragments,
},
{
PatchHeader: &PatchHeader{
SHA: "183d9cdbecda47e6e95acf9e4e23fa3a71ba99ad",
Author: &PatchIdentity{
Name: "Regina Smithee",
Email: "rsmithee@example.com",
},
AuthorDate: asTime("2017-09-14T19:46:12Z"),
Title: "Simple change",
Body: "Simple change body.",
},
OldName: "d1/file2.txt",
NewName: "d1/file2.txt",
OldMode: os.FileMode(0100755),
OldOIDPrefix: "c6513ff",
NewOIDPrefix: "bf4c6cc",
TextFragments: textFragments,
},
},
Preamble: textPreamble,
},
}

28 changes: 22 additions & 6 deletions gitdiff/patch_header.go
@@ -294,10 +294,14 @@ func parseHeaderPretty(prettyLine string, r io.Reader) (*PatchHeader, error) {

 	if title != "" {
 		// Don't check for an appendix
-		body, _ := scanMessageBody(s, indent, false)
+		body, _, remainder := scanMessageBody(s, indent, false)
 		if s.Err() != nil {
 			return nil, s.Err()
 		}
+		if remainder != "" {
+			// There was another header immediately after this one.
+			return ParsePatchHeader(remainder)
+		}
 		h.Body = body
 	}

@@ -326,22 +330,34 @@ func scanMessageTitle(s *bufio.Scanner) (title string, indent string) {
 	return b.String(), indent
 }

-func scanMessageBody(s *bufio.Scanner, indent string, separateAppendix bool) (string, string) {
+func scanMessageBody(s *bufio.Scanner, indent string, separateAppendix bool) (string, string, string) {
 	// Body and appendix
 	var body, appendix strings.Builder
 	c := &body
 	var empty int
 	for i := 0; s.Scan(); i++ {
-		line := s.Text()
+		baseLine := s.Text()

-		line = strings.TrimRightFunc(line, unicode.IsSpace)
+		line := strings.TrimRightFunc(baseLine, unicode.IsSpace)
 		line = strings.TrimPrefix(line, indent)

 		if line == "" {
 			empty++
 			continue
 		}

+		if baseLine == line && indent != "" {
+			// The line does not start with the indent.
+			var remainder strings.Builder
+			remainder.WriteString(baseLine)
+			remainder.WriteByte('\n')
+			for ; s.Scan(); i++ {
+				remainder.WriteString(s.Text())
+				remainder.WriteByte('\n')
+			}
+			return body.String(), appendix.String(), remainder.String()
+		}
+
 		// If requested, parse out "appendix" information (often added
 		// by `git format-patch` and removed by `git am`).
 		if separateAppendix && c == &body && line == "---" {
@@ -359,7 +375,7 @@ func scanMessageBody(s *bufio.Scanner, indent string, separateAppendix bool) (st

 		c.WriteString(line)
 	}
-	return body.String(), appendix.String()
+	return body.String(), appendix.String(), ""
 }

 func parseHeaderMail(mailLine string, r io.Reader) (*PatchHeader, error) {
@@ -402,7 +418,7 @@ func parseHeaderMail(mailLine string, r io.Reader) (*PatchHeader, error) {
 	h.SubjectPrefix, h.Title = parseSubject(subject)

 	s := bufio.NewScanner(msg.Body)
-	h.Body, h.BodyAppendix = scanMessageBody(s, "", true)
+	h.Body, h.BodyAppendix, _ = scanMessageBody(s, "", true)
 	if s.Err() != nil {
 		return nil, s.Err()
 	}
33 changes: 33 additions & 0 deletions gitdiff/patch_header_test.go
@@ -441,6 +441,39 @@ Author: Morton Haypenny <mhaypenny@example.com>
Title: expectedTitle,
},
},
"emptyCommit": {
Input: `commit 989f970ba7e43cd0eac6fcf71acfd9c92effc047
Author: Morton Haypenny <mhaypenny@example.com>
Date: Sat Apr 12 05:20:49 2020 -0700

An empty commit

With a body.


commit 61f5cd90bed4d204ee3feb3aa41ee91d4734855b
Author: Morton Haypenny <mhaypenny@example.com>
Date: Sat Apr 11 15:21:23 2020 -0700

A sample commit to test header parsing


The medium format shows the body, which
may wrap on to multiple lines.


Another body line.


`,
Header: PatchHeader{
SHA: expectedSHA,
Author: expectedIdentity,
AuthorDate: expectedDate,
Title: expectedTitle,
Body: expectedBody,
},
},
}

for name, test := range tests {
83 changes: 83 additions & 0 deletions gitdiff/testdata/empty-commit.patch
@@ -0,0 +1,83 @@

commit 48f13f61afb200ad8386573632cba3abd1703af2
Author: Morton Haypenny <mhaypenny@example.com>
Date: Thu Nov 3 12:29:49 2017 +0000

A simple change

The change is simple.

diff --git a/dir/file1.txt b/dir/file1.txt
index 422cce7..24b39ed 100644
--- a/dir/file1.txt
+++ b/dir/file1.txt
@@ -3,6 +3,8 @@ fragment 1
context line
-old line 1
-old line 2
context line
+new line 1
+new line 2
+new line 3
context line
-old line 3
+new line 4
+new line 5
@@ -31,2 +33,2 @@ fragment 2
context line
-old line 4
+new line 6

commit dbbc5f1e926391d737e397ec60cc1ff94787e105
Author: Regina Smithee <rsmithee@example.com>
Date: Thu Sep 29 12:58:35 2017 +0000

Merged commit 3

Merged body 3.


commit f90c2e23af6618388cb4082f39849476e39105af
Merge: 989f970 e84c7ab
Author: Regina Smithee <rsmithee@example.com>
Date: Thu Sep 15 16:37:30 2017 +0000

Merged commit 2


commit 989f970ba7e43cd0eac6fcf71acfd9c92effc047
Merge: 183d9cd e84c7ab
Author: Regina Smithee <rsmithee@example.com>
Date: Thu Sep 15 14:41:57 2017 +0000

Merged commit 1


commit 183d9cdbecda47e6e95acf9e4e23fa3a71ba99ad
Author: Regina Smithee <rsmithee@example.com>
Date: Wed Sep 14 19:46:12 2017 +0000

Simple change

Simple change body.

diff --git a/d1/file2.txt b/d1/file2.txt
index c6513ff..bf4c6cc 100755
--- a/d1/file2.txt
+++ b/d1/file2.txt
@@ -3,6 +3,8 @@ fragment 1
context line
-old line 1
-old line 2
context line
+new line 1
+new line 2
+new line 3
context line
-old line 3
+new line 4
+new line 5
@@ -31,2 +33,2 @@ fragment 2
context line
-old line 4
+new line 6
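
A fixture like `empty-commit.patch` can be sanity-checked by listing which commits carry no file diffs. This is a rough sketch, assuming `git log -p` style text in which each commit starts with a `commit <sha>` line at column zero and each file diff starts with `diff --git `. `emptyCommits` is a hypothetical helper, not part of the library, and it would misfire on input whose message indentation has been stripped and whose body contains a line beginning with `commit `.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// emptyCommits is a hypothetical helper for checking fixtures. It returns
// the SHAs of commits that are not followed by any "diff --git" line before
// the next "commit" line (or the end of the input).
func emptyCommits(log string) []string {
	var empty []string
	current, hasDiff := "", false
	flush := func() {
		if current != "" && !hasDiff {
			empty = append(empty, current)
		}
	}
	s := bufio.NewScanner(strings.NewReader(log))
	for s.Scan() {
		line := s.Text()
		switch {
		case strings.HasPrefix(line, "commit "):
			flush() // close out the previous commit
			current, hasDiff = strings.TrimPrefix(line, "commit "), false
		case strings.HasPrefix(line, "diff --git "):
			hasDiff = true
		}
	}
	flush() // close out the last commit
	return empty
}

func main() {
	// Assumed sample: one merge with no files, then one commit with a diff.
	log := strings.Join([]string{
		"commit 989f970ba7e43cd0eac6fcf71acfd9c92effc047",
		"Author: Regina Smithee <rsmithee@example.com>",
		"",
		"    Merged commit 1",
		"",
		"commit 183d9cdbecda47e6e95acf9e4e23fa3a71ba99ad",
		"Author: Regina Smithee <rsmithee@example.com>",
		"",
		"    Simple change",
		"",
		"diff --git a/d1/file2.txt b/d1/file2.txt",
	}, "\n")
	for _, sha := range emptyCommits(log) {
		fmt.Println("no file diffs:", sha)
	}
}
```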