first cut cve pipeline and refactor fetcher #2

Scanteianu · 2023-11-22T08:49:14Z

This PR is an MVP/POC-level VDR (Vulnerability Disclosure Report) creation pipeline for Temurin.

The finished report is in data/vdr.json.
Data downloaded from NIST and OJVG is also saved in the data directory, sometimes as intermediate representations, to allow offline testing and report re-creation when the original site is unavailable.

tools to fetch OJVG reports and parse them into JSON, which then gets converted to CycloneDX vulnerabilities
tools to talk to the NIST API to fetch extra information about CVEs when available
unit tests for the tools
pipeline files to create a VDR from the NIST/OJVG websites

Things which are not yet here:

CVE deduplication in case a CVE shows up twice
better parsing of NIST affected versions, and decoupling from the Oracle JDK affected versions
backup plans for when data is not found or when NIST access is denied
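
As a rough illustration of the missing deduplication step, the sketch below collapses records that share a CVE id and merges their affected versions. It assumes each vulnerability is held as a plain dict with `id` and `versions` keys; that shape, the function name `dedupe_by_id`, and the placeholder ids are hypothetical and not taken from this PR.

```python
def dedupe_by_id(records):
    """Collapse records that share a CVE id, merging their affected versions.

    Assumes each record is a dict with an "id" key and an optional
    "versions" list (a hypothetical shape, not the one used in this PR).
    """
    merged = {}
    for rec in records:
        existing = merged.get(rec["id"])
        if existing is None:
            # Copy the record so the caller's input is not mutated.
            merged[rec["id"]] = {**rec, "versions": list(rec.get("versions", []))}
            continue
        # Same id seen again: keep the first record, add any new versions.
        for version in rec.get("versions", []):
            if version not in existing["versions"]:
                existing["versions"].append(version)
    return list(merged.values())


# Placeholder ids, for illustration only.
sample = [
    {"id": "CVE-0000-0001", "versions": ["8"]},
    {"id": "CVE-0000-0001", "versions": ["8", "11"]},
    {"id": "CVE-0000-0002", "versions": ["17"]},
]
deduped = dedupe_by_id(sample)
```

Keeping the first record and only merging versions is one possible policy; a real implementation would also need to decide which description and ratings win when the duplicates disagree.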

Code organization:

individual components for reaching out to APIs and interpreting responses are in the cvereporter package
pipelines are in the top level directory
tests are in the tests directory

tellison

Wondering if, in the long term, you will be better served with an intermediate representation of the vulnerabilities before you build a BOM model. It may allow us to visualize and debug the incoming data (OpenJDK list, NIST data, affected ranges, etc) before you finally combine them into a single BOM for consumption.

tellison · 2023-11-24T11:37:34Z

cvePipeline.py

+
+bom = report.get_base_bom()
+#todo: take date as arg or figure out other way to seed 
+vulns = fetch_vulnerabilities.fetch_cves('2023-01-17')


I'd love to see the fetching of CVEs result in a serialized JSON file. That way we can study it to check this part of the pipeline (which is highly likely to be affected by external OpenJDK website changes) is working as expected.

The pipeline can continue as before, and eventually gain the option of running from the serialized file of fetched vulnerabilities, so we don't have to run the full pipeline each time, or we can run from a patched file, etc.

There is now an intermediate JSON representation, which we can dump to a file.
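
A minimal sketch of the serialize-then-resume idea discussed in this thread, assuming the fetched vulnerabilities are already JSON-serializable dicts. The helper names `save_fetched` and `load_or_fetch`, and the `refresh` flag, are hypothetical and do not exist in this PR.

```python
import json
from pathlib import Path


def save_fetched(records, path):
    """Write fetched vulnerability records to a JSON file for later inspection."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "w", encoding="utf-8") as f:
        # Stable key order and indentation make diffs between runs readable.
        json.dump(records, f, indent=2, sort_keys=True)


def load_or_fetch(path, fetch, refresh=False):
    """Return records from the saved file if present, otherwise call fetch().

    `fetch` is any zero-argument callable returning a list of dicts.
    Pass refresh=True to ignore the saved file and hit the network again.
    """
    target = Path(path)
    if target.exists() and not refresh:
        with open(target, encoding="utf-8") as f:
            return json.load(f)
    records = fetch()
    save_fetched(records, path)
    return records
```

With something like this, the rest of the pipeline can start from the saved file (or a hand-patched copy of it) without contacting the OpenJDK site.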

tellison · 2023-11-24T11:39:54Z

cvereporter/fetch_vulnerabilities.py

+            affects = BomTarget(
+                ref=component
+            )
+            for v in affected_versions:
+                affects.versions.add(v)
+            vuln = Vulnerability(
+                id=id,
+                source=VulnerabilitySource(name="National Vulnerability Database", url=link),
+                #todo: dummy date
+                published=datetime.fromisoformat(date),
+                updated=datetime.fromisoformat(date),
+                description="",
+                recommendation=""
+            )
+            vuln.affects.add(affects)
+            vulnerabilities.append(vuln)
+            print(vuln)
+    return vulnerabilities


Do you want to build a BOM here, or just return a simple data structure that captures the info scraped from the website?

If BOMs provide all you need then fine, but simplify where you can at this stage IMHO.
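
One way to read this suggestion is a small record type that holds only what was scraped, leaving BOM construction to a later step. The sketch below is an assumption about what such a structure could look like; the class name `ScrapedCve` and its field names are hypothetical, loosely mirroring the `id`, `link`, `date`, and `affected_versions` values visible in the diff above.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ScrapedCve:
    """Plain record of what was scraped for one CVE (hypothetical shape)."""
    id: str
    url: str
    advisory_date: str  # ISO date string, e.g. the advisory page date
    affected_versions: list = field(default_factory=list)


# Placeholder values, for illustration only.
record = ScrapedCve(
    id="CVE-0000-0001",
    url="https://example.invalid/advisory",
    advisory_date="2023-01-17",
    affected_versions=["8", "11"],
)

# asdict() gives a JSON-serializable dict, so the record can be dumped to a
# file and inspected before any CycloneDX objects are built from it.
as_plain_dict = asdict(record)
```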

Scanteianu · 2023-12-07T09:30:47Z

cvereporter/nist_enhance.py

+    for metrics in cve["metrics"]["cvssMetricV31"]:
+        #todo: do we need recommendations from NIST as well?
+        relevant = {}
+        relevant["source"] = metrics["source"]


Hi @tellison, is this the kind of intermediate data structure you were thinking about for representing the data before populating it into the BOM itself? (I know this is on the NIST side; I can eventually move the OJVG side to something similar as well.)
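
For context, a sketch of how the loop in the diff above might be completed into a standalone function that returns plain dicts. It assumes the NVD CVE API 2.0 layout, where each `cvssMetricV31` entry carries a `source` and a nested `cvssData` object; the function name `extract_ratings` and the output keys are hypothetical, not the code in this PR.

```python
def extract_ratings(cve):
    """Pull the CVSS v3.1 ratings out of one NVD CVE record as plain dicts.

    Uses .get() throughout so a record without v3.1 metrics yields an
    empty list instead of raising KeyError.
    """
    ratings = []
    for metric in cve.get("metrics", {}).get("cvssMetricV31", []):
        data = metric.get("cvssData", {})
        ratings.append({
            "source": metric.get("source"),
            "score": data.get("baseScore"),
            "severity": data.get("baseSeverity"),
            "vector": data.get("vectorString"),
        })
    return ratings


# Hand-written sample in the assumed NVD shape; values are illustrative.
sample_cve = {
    "metrics": {
        "cvssMetricV31": [
            {
                "source": "nvd@nist.gov",
                "cvssData": {
                    "baseScore": 5.3,
                    "baseSeverity": "MEDIUM",
                    "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:N",
                },
            }
        ]
    }
}
sample_ratings = extract_ratings(sample_cve)
```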

Scanteianu · 2024-01-24T23:07:04Z

cvereporter/nist_enhance.py

+    resp_dict["description"] = description
+    resp_dict["versions"] = extract_versions(cve["configurations"])
+    return resp_dict
+def extract_versions(cve_configs):


@tellison I think this is dubious, but I don't have a better way of finding an affected version, and I'm open to suggestions here. (I've basically manually parsed out the Oracle JDK version, minus the update, which we can special-case, and I'm assuming it maps 1:1 to OpenJDK.) There's code to extract it from OJVG, but they publish it at the top of the webpage, and the webpage can contain multiple CVEs, so I'm not sure that's the best place to get the information.
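
To make the approach described here concrete, the sketch below shows one way to read versions out of the CPE strings in an NVD `configurations` list. It is not the `extract_versions` from this PR: it assumes the NVD CVE API 2.0 layout (`nodes` containing `cpeMatch` entries with a `criteria` CPE 2.3 string), the `vendor` and `product` parameters are added for illustration, and it splits on `:` without handling escaped colons.

```python
def extract_versions(cve_configs, vendor="oracle", product="jdk"):
    """Collect the version field of vulnerable CPE entries for one product.

    CPE 2.3 strings look like:
        cpe:2.3:<part>:<vendor>:<product>:<version>:<update>:...
    so after splitting on ":" the version is at index 5. The update field
    (index 6) is deliberately ignored, matching the "minus update" approach.
    """
    versions = []
    for config in cve_configs:
        for node in config.get("nodes", []):
            for match in node.get("cpeMatch", []):
                if not match.get("vulnerable", False):
                    continue
                parts = match.get("criteria", "").split(":")
                if len(parts) < 6 or parts[3] != vendor or parts[4] != product:
                    continue
                if parts[5] not in versions:
                    versions.append(parts[5])
    return versions


# Hand-written sample in the assumed NVD shape; entries are illustrative.
sample_configs = [
    {
        "nodes": [
            {
                "cpeMatch": [
                    {"vulnerable": True,
                     "criteria": "cpe:2.3:a:oracle:jdk:1.8.0:update351:*:*:*:*:*:*"},
                    {"vulnerable": True,
                     "criteria": "cpe:2.3:a:oracle:jdk:11.0.17:*:*:*:*:*:*:*"},
                    {"vulnerable": True,
                     "criteria": "cpe:2.3:a:oracle:graalvm:22.3.0:*:*:*:enterprise:*:*:*"},
                ]
            }
        ]
    }
]
sample_versions = extract_versions(sample_configs)
```

Whether Oracle JDK versions really map 1:1 to OpenJDK versions remains the open question raised in this comment; the sketch only shows the parsing step.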

first cut cve pipeline and refactor fetcher

50fb462

tellison reviewed Nov 24, 2023

tim comments round 1

469576b

Scanteianu mentioned this pull request Dec 3, 2023

use cyclonedx lib to generate dummy pom #1

Closed

Scanteianu added 3 commits December 4, 2023 09:51

make api call to nist

af93137

begin extracting info from nist

1b82256

poplulate first cut rating info in cyclonedx

470f364

Scanteianu commented Dec 7, 2023

Scanteianu added 7 commits December 26, 2023 21:48

add basic pytest for ojvg fetch

43083f8

add super basic nist tests, save nist data for unit testing

3d22edf

extract description

1478ef6

intermediate dict rep

3692ae1

add affected versions

f945cba

extract versions

06ffbbe

make it work from commandline (remove extra imports)

428483f

Scanteianu commented Jan 24, 2024

Scanteianu and others added 6 commits January 24, 2024 23:22

refactor for bulk fetch

925ccd6

date iterator so we prepare to download all dates of vulnerabilities

78aee9c

add dummy headers to placate ojvg site

6316073

fetch all open jvg website, encode in openjvg_summary.json

b724e21

create a first cut vdr

924b7b9

add documentation to each file

ba9cfc1

Scanteianu requested a review from tellison February 15, 2024 14:02

Scanteianu mentioned this pull request Mar 13, 2024

Add initial VDR generation pipeline adoptium/temurin-vdr-generator#1

Merged
