Resolve Vulnerabilities in Runtime Image #1491

Open
codyharris-h2o-ai opened this issue Mar 20, 2024 · 24 comments
Labels
area/security Security issues

Comments

@codyharris-h2o-ai

Hello!
As part of our ongoing efforts to ensure the security of our products, one or more vulnerabilities requiring remediation have been identified.

Vulnerability Severity Image Package Description
CVE-2023-37920 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 certifi:2022.12.7 Certifi is a curated collection of Root Certificates for validating the trustworthiness of SSL certificates while verifying the [...]
CVE-2023-48022 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 ray:2.9.3 Anyscale Ray 2.6.3 and 2.8.0 allows a remote attacker to execute arbitrary code via the job submission API. NOTE: the vendor's p[...]

To resolve this, we recommend the following approach:

  1. Install trivy (https://aquasecurity.github.io/trivy)
  2. Scan the current version of the image using a command like trivy image --scanners vuln --severity CRITICAL,HIGH --timeout 60m [...image address...]
  3. Validate that the CVEs are detected using trivy. The provided scans were taken using a different scanner (ECR), so the first step should be to validate that trivy can see them as well.
  4. Iterate to resolve the vulnerabilities. Trivy lets you scan images locally without pushing them, which should help when working toward a resolution.
  5. Test and publish the fix version, and let us know where we can find the fixed image(s) so we can validate the fixes on our side as well.
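
Steps 2 to 4 can be scripted so that each rebuild is scanned the same way. The following is a minimal sketch, assuming trivy is already installed and on PATH; the function names `build_trivy_command` and `scan_image` are illustrative and not part of any existing tooling. The flags are the ones quoted in step 2.

```python
import shutil
import subprocess


def build_trivy_command(image, severities=("CRITICAL", "HIGH"), timeout="60m"):
    """Return the argv list for the scan command quoted in step 2."""
    return [
        "trivy", "image",
        "--scanners", "vuln",
        "--severity", ",".join(severities),
        "--timeout", timeout,
        image,
    ]


def scan_image(image):
    """Run the scan against a local or remote image and return trivy's exit code."""
    if shutil.which("trivy") is None:
        # Step 1 has not been done yet on this machine.
        raise RuntimeError("trivy is not on PATH")
    completed = subprocess.run(build_trivy_command(image), check=False)
    return completed.returncode
```

Calling `scan_image(...)` with the tag of a freshly built local image should be enough to iterate without pushing, since trivy can read images from the local Docker daemon.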

Note that we disregard the severity levels assigned by various tools and operate solely on CVSS, in line with NIST guidelines. Also note that this scan was performed by ECR, so Trivy's results will likely differ. In our experience, Trivy produces more results than ECR or Prisma.
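
To make the CVSS-based triage concrete, here is a small sketch that maps a CVSS v3.x base score to its qualitative rating. It assumes the standard v3.x bands published with the specification and used by NVD (0.0 none, 0.1 to 3.9 low, 4.0 to 6.9 medium, 7.0 to 8.9 high, 9.0 to 10.0 critical); the function name `cvss_v3_rating` is illustrative.

```python
def cvss_v3_rating(score):
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"
```

With a helper like this, findings from different scanners can be ranked by their base score rather than by each tool's own severity label.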

@codyharris-h2o-ai codyharris-h2o-ai added the area/security Security issues label Mar 20, 2024
@pseudotensor
Collaborator

pseudotensor commented Mar 20, 2024

  • The certifi package is 2024.2.2 in image 0.2.0-408. The older vulnerable version being detected is in a "pkgs" folder that is unused and is just part of the conda base installation, created before other packages are installed. So the notice is a false positive against the wrong version.

  • There's no resolution for the ray package: no fixed version is specified, and no action can be taken because it's a required part of vLLM. Ray is not exposed directly; only the vLLM port is, and that is not Ray itself, so there's no real issue.
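
For the certifi finding in the first bullet, one way to confirm that a stale copy is what the scanner sees is to list every copy of a package's metadata under the image filesystem. This is a rough sketch, assuming the copies ship a `<name>-<version>.dist-info` directory (some conda packages use `.egg-info` instead, which this does not cover); `find_installed_copies` is an illustrative name.

```python
from pathlib import Path


def find_installed_copies(root, package):
    """Return sorted (version, path) pairs for each '<package>-<version>.dist-info' under root."""
    # dist-info directory names use underscores in place of hyphens.
    prefix = package.lower().replace("-", "_") + "-"
    suffix = ".dist-info"
    hits = []
    for candidate in Path(root).rglob("*" + suffix):
        name = candidate.name
        if candidate.is_dir() and name.lower().startswith(prefix):
            version = name[len(prefix):-len(suffix)]
            hits.append((version, str(candidate)))
    return sorted(hits)
```

If the function returns more than one version for the same package, the extra paths (for example under a conda "pkgs" folder) are candidates for removal during the build.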

@codyharris-h2o-ai
Author

@pseudotensor Thanks! For certifi, can we then remove it from the filesystem during the build process?

@codyharris-h2o-ai
Author

There are also a handful of HIGH-severity findings; some of these may not be real:

Vulnerability Severity Image Package Description
CVE-2022-3996 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 If an X.509 certificate contains a malformed policy constraint and policy processing is enabled, then a write lock will be taken[...]
CVE-2022-40898 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 wheel:0.37.1 An issue discovered in Python Packaging Authority (PyPA) Wheel 0.37.1 and earlier allows remote attackers to cause a denial of s[...]
CVE-2022-4450 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 The function PEM_read_bio_ex() reads a PEM file from a BIO and parses and decodes the "name" (e.g. "CERTIFICATE"), any header da[...]
CVE-2023-0215 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 The public API function BIO_new_NDEF is a helper function used for streaming ASN.1 data via a BIO. It is primarily used internal[...]
CVE-2023-0216 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 An invalid pointer dereference on read can be triggered when an application tries to load malformed PKCS7 data with the d2i_PKCS[...]
CVE-2023-0217 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 An invalid pointer dereference on read can be triggered when an application tries to check a malformed DSA public key by the EVP[...]
CVE-2023-0286 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 There is a type confusion vulnerability relating to X.400 address processing inside an X.509 GeneralName. X.400 addresses were p[...]
CVE-2023-0401 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 A NULL pointer can be dereferenced when signatures are being verified on PKCS7 signed or signedAndEnveloped data. In case the ha[...]
CVE-2023-38325 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 The cryptography package before 41.0.2 for Python mishandles SSH certificates that have critical options.
CVE-2023-43804 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 urllib3:1.26.14 urllib3 is a user-friendly HTTP client library for Python. urllib3 doesn't treat the Cookie HTTP header special or provide any[...]
CVE-2023-4807 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 Issue summary: The POLY1305 MAC (message authentication code) implementation contains a bug that might corrupt the internal stat[...]
CVE-2023-49083 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 cryptography is a package designed to expose cryptographic primitives and recipes to Python developers. Calling `load_pem_pkcs7_[...]
CVE-2023-50782 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 A flaw was found in the python-cryptography package. This issue may allow a remote attacker to decrypt captured messages in TLS [...]
CVE-2023-5363 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 cryptography:38.0.4 Issue summary: A bug has been identified in the processing of key and initialisation vector (IV) lengths. This can lead to pote[...]
CVE-2023-6730 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 transformers:4.28.1 Deserialization of Untrusted Data in GitHub repository huggingface/transformers prior to 4.36.
CVE-2023-7018 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-408 transformers:4.28.1 Deserialization of Untrusted Data in GitHub repository huggingface/transformers prior to 4.36.

@pseudotensor
Collaborator

@achraf-mer Can you add the removal of pkgs folders for the h2ogpt/vllm installs like we have for DAI?

@pseudotensor
Collaborator

@codyharris-h2o-ai One more thing: for transformers, I only see 4.38.2 in the image, not 4.28.1. I don't know where the scanner is getting these versions.

@codyharris-h2o-ai
Author

It's picking it up from workspace/spaces/demo/requirements.txt
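
Since the scanner evidently reads pinned versions out of requirements files, it may help to list every such pin left in the image tree. A minimal sketch, assuming only exact `name==version` pins matter and that the files are named `requirements*.txt`; `pinned_requirements` is an illustrative name.

```python
import re
from pathlib import Path

# Matches lines such as "transformers==4.28.1" or "ray[default]==2.9.3".
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)")


def pinned_requirements(root):
    """Return (file, package, version) for every exact pin in requirements*.txt under root."""
    pins = []
    for req_file in sorted(Path(root).rglob("requirements*.txt")):
        text = req_file.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            match = PIN.match(line)
            if match:
                pins.append((str(req_file), match.group(1).lower(), match.group(2)))
    return pins
```

Any pin reported here that does not match what is actually installed is a likely source of a false positive.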

@codyharris-h2o-ai
Author

codyharris-h2o-ai commented Mar 20, 2024

findings.json
Attaching the raw report from ECR

Search for "filePath" in the JSON
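
Searching the report by hand works, but the same lookup can be scripted. The sketch below walks any JSON document and collects every value stored under a given key; it deliberately makes no assumption about the report's schema, which is not shown in this thread. The names `collect_values` and `file_paths_in_report` are illustrative.

```python
import json


def collect_values(node, key="filePath"):
    """Recursively collect every value stored under `key` in nested dicts and lists."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            found.extend(collect_values(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_values(item, key))
    return found


def file_paths_in_report(report_path):
    """Load a JSON report from disk and return all of its "filePath" values."""
    with open(report_path, encoding="utf-8") as handle:
        return collect_values(json.load(handle), "filePath")
```

Running `file_paths_in_report("findings.json")` on the attached report should show which file on disk triggered each finding.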

@pseudotensor
Collaborator

OK, that's old code. It could be updated, but it's not really part of the image.

@pseudotensor
Collaborator

@codyharris-h2o-ai I pushed those changes to remove those unnecessary files. Try again tomorrow on 0.2.0-410

@pseudotensor
Collaborator

@codyharris-h2o-ai Please check again.

@codyharris-h2o-ai
Author

@pseudotensor thanks,
I scanned 412 with the following results:

Vulnerability Severity Image Package Description
CVE-2023-48022 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 ray:2.9.3 Anyscale Ray 2.6.3 and 2.8.0 allows a remote attacker to execute arbitrary code via the job submission API. NOTE: the vendor's p[...]
CVE-2024-0964 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 gradio:3.50.2 A local file include could be remotely triggered in Gradio due to a vulnerable user-supplied JSON value in an API request.
SNYK-PYTHON-GRADIO-6263801 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 gradio:3.50.2 ## Overview gradio is a Python library for easily interacting with trained machine learning m[...]
CVE-2022-3996 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 If an X.509 certificate contains a malformed policy constraint and policy processing is enabled, then a write lock will be taken[...]
CVE-2022-40898 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 wheel:0.37.1 An issue discovered in Python Packaging Authority (PyPA) Wheel 0.37.1 and earlier allows remote attackers to cause a denial of s[...]
CVE-2022-4450 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 The function PEM_read_bio_ex() reads a PEM file from a BIO and parses and decodes the "name" (e.g. "CERTIFICATE"), any header da[...]
CVE-2023-0215 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 The public API function BIO_new_NDEF is a helper function used for streaming ASN.1 data via a BIO. It is primarily used internal[...]
CVE-2023-0216 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 An invalid pointer dereference on read can be triggered when an application tries to load malformed PKCS7 data with the d2i_PKCS[...]
CVE-2023-0217 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 An invalid pointer dereference on read can be triggered when an application tries to check a malformed DSA public key by the EVP[...]
CVE-2023-0286 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 There is a type confusion vulnerability relating to X.400 address processing inside an X.509 GeneralName. X.400 addresses were p[...]
CVE-2023-0401 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 A NULL pointer can be dereferenced when signatures are being verified on PKCS7 signed or signedAndEnveloped data. In case the ha[...]
CVE-2023-38325 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 The cryptography package before 41.0.2 for Python mishandles SSH certificates that have critical options.
CVE-2023-4807 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 Issue summary: The POLY1305 MAC (message authentication code) implementation contains a bug that might corrupt the internal stat[...]
CVE-2023-49083 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 cryptography is a package designed to expose cryptographic primitives and recipes to Python developers. Calling `load_pem_pkcs7_[...]
CVE-2023-50782 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 A flaw was found in the python-cryptography package. This issue may allow a remote attacker to decrypt captured messages in TLS [...]
CVE-2023-51449 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 gradio:3.50.2 Gradio is an open-source Python package that allows you to quickly build a demo or web application for your machine learning mod[...]
CVE-2023-5363 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 cryptography:38.0.4 Issue summary: A bug has been identified in the processing of key and initialisation vector (IV) lengths. This can lead to pote[...]
CVE-2023-6572 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-412 gradio:3.50.2 Command Injection in GitHub repository gradio-app/gradio prior to main.

@pseudotensor
Collaborator

Sorry, 412 is the gradio 3 build for k8, and 413 failed during push due to a network issue. Those gradio 3 builds, which we make for the k8 issue, should be avoided.

@codyharris-h2o-ai
Author

Ok will try 410

@pseudotensor
Collaborator

pseudotensor commented Mar 21, 2024

I'm building a new one, 414.

@achraf-mer
Collaborator

> @achraf-mer Can you add the removal of pkgs folders for the h2ogpt/vllm installs like we have for DAI?

I see this was done in 98e390b and that you are building a new image, so I will wait and see how to address any new findings. Thanks.

@pseudotensor
Collaborator

@achraf-mer I already removed the items, so I unassigned you. Thanks!

@codyharris-h2o-ai
Author

codyharris-h2o-ai commented Mar 22, 2024

Latest scan of 414:

Vulnerability Severity Image Package Description
CVE-2023-48022 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 ray:2.10.0 Anyscale Ray 2.6.3 and 2.8.0 allows a remote attacker to execute arbitrary code via the job submission API. NOTE: the vendor's p[...]
CVE-2022-3996 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 If an X.509 certificate contains a malformed policy constraint and policy processing is enabled, then a write lock will be taken[...]
CVE-2022-40898 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 wheel:0.37.1 An issue discovered in Python Packaging Authority (PyPA) Wheel 0.37.1 and earlier allows remote attackers to cause a denial of s[...]
CVE-2022-4450 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 The function PEM_read_bio_ex() reads a PEM file from a BIO and parses and decodes the "name" (e.g. "CERTIFICATE"), any header da[...]
CVE-2023-0215 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 The public API function BIO_new_NDEF is a helper function used for streaming ASN.1 data via a BIO. It is primarily used internal[...]
CVE-2023-0216 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 An invalid pointer dereference on read can be triggered when an application tries to load malformed PKCS7 data with the d2i_PKCS[...]
CVE-2023-0217 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 An invalid pointer dereference on read can be triggered when an application tries to check a malformed DSA public key by the EVP[...]
CVE-2023-0286 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 There is a type confusion vulnerability relating to X.400 address processing inside an X.509 GeneralName. X.400 addresses were p[...]
CVE-2023-0401 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 A NULL pointer can be dereferenced when signatures are being verified on PKCS7 signed or signedAndEnveloped data. In case the ha[...]
CVE-2023-38325 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 The cryptography package before 41.0.2 for Python mishandles SSH certificates that have critical options.
CVE-2023-4807 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 Issue summary: The POLY1305 MAC (message authentication code) implementation contains a bug that might corrupt the internal stat[...]
CVE-2023-49083 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 cryptography is a package designed to expose cryptographic primitives and recipes to Python developers. Calling `load_pem_pkcs7_[...]
CVE-2023-50782 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 A flaw was found in the python-cryptography package. This issue may allow a remote attacker to decrypt captured messages in TLS [...]
CVE-2023-5363 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-414 cryptography:38.0.4 Issue summary: A bug has been identified in the processing of key and initialisation vector (IV) lengths. This can lead to pote[...]

With respect to ray, we must either mitigate the vulnerable functionality by removing the offending source files in the package (for example by overwriting, deleting, or stubbing out the relevant functions), or remove ray altogether.

@pseudotensor
Collaborator

Where is cryptography==38.0.4 coming from? As far as I can see, we install the latest version, unconstrained. It should be 42.0.5.

@codyharris-h2o-ai
Author

@pseudotensor, hey it appears to be coming from h2ogpt_conda/lib/python3.10/site-packages/cryptography-38.0.4.dist-info/METADATA

@pseudotensor
Collaborator

I think it's because the docker build was using a fixed miniconda version, not the latest, so it should be OK tomorrow.

@codyharris-h2o-ai
Author

Vulnerability Severity Image Package Description
CVE-2023-48022 critical 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-446 ray:2.10.0 Anyscale Ray 2.6.3 and 2.8.0 allows a remote attacker to execute arbitrary code via the job submission API. NOTE: the vendor's p[...]
SNYK-PYTHON-PILLOW-6514866 high 223008754879.dkr.ecr.us-east-1.amazonaws.com/h2ogpt-runtime:0.2.0-446 pillow:10.2.0 ## Overview Affected versions of this package are vulnerable to Buffer Overflow via the strcpy function in _imagingcms.c, d[...]

@achraf-mer
Collaborator

achraf-mer commented Apr 2, 2024

@codyharris-h2o-ai Is the ray:2.10.0 issue a case of a bad report?
According to https://nvd.nist.gov/vuln/detail/CVE-2023-48022 and https://bishopfox.com/blog/ray-versions-2-6-3-2-8-0, the CVE only applies to 2.6.3 and 2.8.0.

@codyharris-h2o-ai
Author

I discussed this with @YogevMaty and it sounds like it is still an issue

@YogevMaty

Apparently this CVE is very similar to the one we had in h2o3.
The default installation does not require authentication and listens on 0.0.0.0.
The company behind Ray says this is not a CVE but behavior by design, which is why it is not visible in some scanners.
Currently they are not planning to fix this issue.

What to do:
Security and isolation must be enforced outside of the Ray Cluster. Ray expects to run in a safe network environment and to act upon trusted code. Developers and platform providers must maintain the following invariants to ensure the safe operation of Ray Clusters.

https://docs.ray.io/en/latest/ray-security/index.html#best-practices

More info: https://www.oligo.security/blog/shadowray-attack-ai-workloads-actively-exploited-in-the-wild
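
Because the mitigation has to be enforced at the network level, a quick reachability probe run from outside the cluster can confirm whether Ray's port is exposed. This is a minimal sketch, assuming the Ray dashboard and job submission API are on their default port 8265; the function name `is_port_reachable` is illustrative.

```python
import socket


def is_port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from a machine outside the trusted network, `is_port_reachable(<container address>, 8265)` should return False if the isolation described above is in place.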
