Skip to content

sarif: initial implementation of csdiff fingerprints #168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
May 3, 2024

Conversation

kdudka
Copy link
Member

@kdudka kdudka commented Mar 29, 2024

csdiff/v0 fingerprints

It hashes the data that csdiff uses in its matching algorithm.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

csdiff/v1 fingerprints

It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces. For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by csgrep --embed-context.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: #98

@kdudka kdudka self-assigned this Mar 29, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Mar 29, 2024
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
Closes: csutils#168
@kdudka kdudka force-pushed the sarif-fingerprints branch from 73afae8 to 9ef8967 Compare March 29, 2024 21:00
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch from 4364c7d to 4e4c284 Compare April 1, 2024 13:36
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch from 4e4c284 to d6af004 Compare April 1, 2024 13:50
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch 2 times, most recently from 53621d0 to 05db189 Compare April 1, 2024 13:55
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 1, 2024
@kdudka kdudka marked this pull request as ready for review April 1, 2024 14:12
@kdudka kdudka force-pushed the sarif-fingerprints branch from 05db189 to 1197c75 Compare April 9, 2024 14:51
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 9, 2024
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 19, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch from 1197c75 to 0308d93 Compare April 19, 2024 13:22
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 23, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch from 0308d93 to 596864f Compare April 23, 2024 08:19
kdudka added a commit to kdudka/csdiff that referenced this pull request Apr 23, 2024
@kdudka kdudka force-pushed the sarif-fingerprints branch from 596864f to b463599 Compare April 23, 2024 08:26
@jamacku
Copy link
Member

jamacku commented Apr 26, 2024

Great example when codeql fingerprints work:

✋ Defects, NEEDS INSPECTION
Error: SHELLCHECK_WARNING:
./src/functions.sh:309:46: warning[SC2154]: GITHUB_REF is referenced but not assigned.
#  306|   uploadSARIF () {
#  307|     is_debug && local verbose=--verbose
#  308|   
#  309|->   echo '{"commit_oid":"'"${HEAD}"'","ref":"'"${GITHUB_REF//merge/head}"'","analysis_key":"differential-shellcheck","sarif":"'"$(gzip -c output.sarif | base64 -w0)"'","tool_names":["differential-shellcheck"]}' > payload.json
#  310|   
#  311|     local curl_args=(
#  312|       "${verbose:---silent}"

Error: SHELLCHECK_WARNING:
./src/functions.sh:331:24: info[SC2312]: Consider invoking this command separately to avoid masking its return value (or use '|| true' to ignore).
#  328|   
#  329|   get_shellcheck_version () {
#  330|     local shellcheck_version
#  331|->   shellcheck_version=$(shellcheck --version | grep -w "version:" | cut -s -d ' ' -f 2)
#  332|   
#  333|     echo "${shellcheck_version}"
#  334|   }

Error: SHELLCHECK_WARNING:
./src/functions.sh:331:47: info[SC2312]: Consider invoking this command separately to avoid masking its return value (or use '|| true' to ignore).
#  328|   
#  329|   get_shellcheck_version () {
#  330|     local shellcheck_version
#  331|->   shellcheck_version=$(shellcheck --version | grep -w "version:" | cut -s -d ' ' -f 2)
#  332|   
#  333|     echo "${shellcheck_version}"
#  334|   }

The code hasn't changed for the last two defects, but they were still reported by csdiff.

I'll try to check if csdiff fingerprints would give the same results as codeql.

@jamacku
Copy link
Member

jamacku commented Apr 26, 2024

Here is the csgrep output from before the change and after:

You should be able to reproduce the above situation form:

kdudka pushed a commit to kdudka/csdiff that referenced this pull request Apr 26, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
@kdudka kdudka force-pushed the sarif-fingerprints branch from b463599 to 95e35d3 Compare April 26, 2024 15:51
@kdudka
Copy link
Member Author

kdudka commented Apr 26, 2024

@jamacku Nice! I have added it as a regression test in 95e35d3. It passes with ca7f32a but not without it.

Copy link
Member

@jamacku jamacku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kdudka

@jamacku jamacku requested a review from lzaoral April 26, 2024 16:24
kdudka pushed a commit to kdudka/csdiff that referenced this pull request May 2, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
@kdudka kdudka force-pushed the sarif-fingerprints branch from 95e35d3 to 95d7570 Compare May 2, 2024 15:15
@kdudka kdudka requested a review from lzaoral May 2, 2024 15:46
@lzaoral
Copy link
Member

lzaoral commented May 2, 2024

Somehow, the comment I've just written is not here. Reposting:

Copy link
Member

@lzaoral lzaoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to you if you want to use a move or copy constructor for the elements of the map. Otherwise, LGTM.

jamacku added a commit to jamacku/differential-shellcheck that referenced this pull request May 3, 2024
This is preparation for Fingerprints that are coming in next version of csdiff.

csutils/csdiff#168

> It hashes the data that csdiff uses in its matching algorithm
> and the line content without spaces. For this fingerprint to
> be computed, the results need to include the line content for
> the key event in the format produced by `csgrep --embed-context`.
jamacku added a commit to redhat-plumbers-in-action/differential-shellcheck that referenced this pull request May 3, 2024
This is preparation for Fingerprints that are coming in next version of csdiff.

csutils/csdiff#168

> It hashes the data that csdiff uses in its matching algorithm
> and the line content without spaces. For this fingerprint to
> be computed, the results need to include the line content for
> the key event in the format produced by `csgrep --embed-context`.
kdudka added 9 commits May 3, 2024 12:52
It hashes the data that csdiff uses in its matching algorithm.
The interfaces are already prepared for csdiff/v1, which will
also take the line content into account when available.

From the updated tests it is obvious that these hashes already
have numerous collisions on the existing test data.

Related: https://issues.redhat.com/browse/OSH-9
Related: csutils#98
It hashes the data that csdiff uses in its matching algorithm
and the line content without spaces.  For this fingerprint to
be computed, the results need to include the line content for
the key event in the format produced by `csgrep --embed-context`.

Related: https://issues.redhat.com/browse/OSH-9
Resolves: csutils#98
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
... when the line content in the `csgrep --embed-context` format
is available.

Out of the 1783 fingerprints generated for the csgrep regression tests
we got 115 collisions, which need to be analyzed.  Some of them look
undesired as, for example:
```
Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:256: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 254|             release_header (hdr);
 # 255|             hdr->name = (void *)name;
 # 256|->           hdr->value = (void *)value;
 # 257|             hdr->release_policy = release_policy;
 # 258|             return;

Error: CERT EXP40-C (CWE-758):
wget-1.21.1/src/http.c:271: cert_exp40_c_violation: Casting pointer "value" with type "char const *" to type "void *" allows an object defined with a const-qualified type to be modified through use of an lvalue with non-const-qualified type.
 # 269|     hdr = &req->headers[req->hcount++];
 # 270|     hdr->name = (void *)name;
 # 271|->   hdr->value = (void *)value;
 # 272|     hdr->release_policy = release_policy;
 # 273|   }
```

Related: csutils#98
No change in behavior intended with this commit.

Related: csutils#98
... to make the code easier to follow.  No change in behavior intended
with this commit.

Related: csutils#98
kdudka pushed a commit to kdudka/csdiff that referenced this pull request May 3, 2024
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
Closes: csutils#168
@kdudka kdudka force-pushed the sarif-fingerprints branch from 972ab54 to 860502f Compare May 3, 2024 10:52
jamacku and others added 2 commits May 3, 2024 13:04
... taken from a real-world example encountered
by differential-shellcheck:
redhat-plumbers-in-action/differential-shellcheck#376

Related: csutils#98
We use boost-1.69 in our EPEL-7 builds.

Related: csutils#98
Closes: csutils#168
@kdudka kdudka force-pushed the sarif-fingerprints branch from 860502f to 4ae879f Compare May 3, 2024 11:06
@kdudka
Copy link
Member Author

kdudka commented May 3, 2024

@jamacku @lzaoral Thank you both for review!

@kdudka kdudka merged commit 4ae879f into csutils:main May 3, 2024
33 checks passed
@kdudka kdudka deleted the sarif-fingerprints branch May 3, 2024 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SARIF: Please add support for fingerprints in SARIF format 🐾
3 participants