Commit c5da0e9: Added on-demand bucket protection for gcp project

carlosmmatos committed Oct 19, 2022 (1 parent: ec4e19a)
Showing 4 changed files with 472 additions and 1 deletion.
4 changes: 4 additions & 0 deletions cloud-storage-protection/README.md
@@ -22,6 +22,10 @@ This solution leverages the same APIs and logic that is implemented by the serve

To read more about this component, review the documentation located [here](on-demand).

You can also launch a tutorial by clicking the following button:

[![Open in Cloud Shell](https://gstatic.com/cloudssh/images/open-btn.svg)](https://shell.cloud.google.com/cloudshell/editor?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2FCrowdStrike%2FCloud-GCP&cloudshell_workspace=cloud-storage-protection%2Fon-demand&cloudshell_tutorial=tutorial.md)

## Deploying to an existing bucket
A helper routine is provided as part of this integration that assists with deploying protection to an existing bucket. This helper leverages Terraform, and can be started by executing the `existing.sh` script.

65 changes: 64 additions & 1 deletion cloud-storage-protection/on-demand/README.md
@@ -6,4 +6,67 @@
This example provides a stand-alone solution for scanning a Cloud Storage bucket before implementing protection.
While similar to the serverless function, this solution will only scan the bucket's _existing_ file contents.

## Coming Soon...
> This example requires the `google-cloud-storage` and `crowdstrike-falconpy` (v0.8.7+) packages.
## Running the program
[![Open in Cloud Shell](https://gstatic.com/cloudssh/images/open-btn.svg)](https://shell.cloud.google.com/cloudshell/editor?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2FCrowdStrike%2FCloud-GCP&cloudshell_workspace=cloud-storage-protection%2Fon-demand&cloudshell_tutorial=tutorial.md)

In order to run this solution, you will need:
+ The name of the target GCP Cloud Storage bucket
+ The Project ID associated with the target bucket
+ Access to CrowdStrike API keys with the following scopes (see the FalconPy sketch after the table):
| Service Collection | Scope |
| :---- | :---- |
| Quick Scan | __READ__, __WRITE__ |
| Sample Uploads | __READ__, __WRITE__ |
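
The script wires these credentials into FalconPy roughly as follows. This is a minimal sketch with placeholder credential strings, mirroring the service classes used in `quickscan_target.py`:

```python
from falconpy import OAuth2, QuickScan, SampleUploads

# A single OAuth2 token is shared by both service collections
auth = OAuth2(client_id="CROWDSTRIKE_API_KEY", client_secret="CROWDSTRIKE_API_SECRET")
samples = SampleUploads(auth_object=auth)  # Sample Uploads: READ, WRITE
scanner = QuickScan(auth_object=auth)      # Quick Scan: READ, WRITE
```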

### Execution syntax
The following command will execute the solution against the bucket you specify using default options.

```shell
python3 quickscan_target.py -k CROWDSTRIKE_API_KEY -s CROWDSTRIKE_API_SECRET -t gs://TARGET_BUCKET_NAME -p PROJECT_ID
```

A small command-line syntax help utility is available using the `-h` flag.

```shell
python3 quickscan_target.py -h
usage: Falcon Quick Scan [-h] [-l LOG_LEVEL] [-d CHECK_DELAY] [-b BATCH] -p PROJECT_ID -t TARGET -k KEY -s SECRET

options:
-h, --help show this help message and exit
-l LOG_LEVEL, --log-level LOG_LEVEL
Default log level (DEBUG, WARN, INFO, ERROR)
-d CHECK_DELAY, --check-delay CHECK_DELAY
Delay between checks for scan results
-b BATCH, --batch BATCH
The number of files to include in a volume to scan.
-p PROJECT_ID, --project PROJECT_ID
Project ID the target bucket resides in
-t TARGET, --target TARGET
Target folder or bucket to scan. Bucket must have 'gs://' prefix.
-k KEY, --key KEY CrowdStrike Falcon API KEY
-s SECRET, --secret SECRET
CrowdStrike Falcon API SECRET
```

### Example output

```shell
2022-10-19 16:37:56,904 Quick Scan INFO Process startup complete, preparing to run scan
2022-10-19 16:37:59,962 Quick Scan INFO Assembling volumes from target bucket (test_sample_bucket) for submission
2022-10-19 16:38:02,078 Quick Scan INFO Uploaded README.md to 7f3efe17610c09e537c2494ad8d251ac300573f1c0f3ad4be500d650c9de5e7b
2022-10-19 16:38:03,934 Quick Scan INFO Uploaded README.md to 5252d7c5b99506a6a7b1fe8819485ca9847f7528476a4bb9f5d8b869a8c8726c
2022-10-19 16:38:06,563 Quick Scan INFO Uploaded youtube.png to 47af72b75c35839a381bf91f03f4d3b87eb4283af58ff4809e137eff2f06cb40
2022-10-19 16:38:08,479 Quick Scan INFO Uploaded .gitignore to ce2de08a3889bf39fcd4cdb43d9f83197fcf17ab5c5707b1c4490e9b6cede8f4
...
...
2022-10-19 16:38:50,466 Quick Scan INFO Unscannable file container/gke-implementation-guide.md: verdict unknown
2022-10-19 16:38:50,467 Quick Scan INFO Unscannable file container/pull-secret-override.md: verdict unknown
2022-10-19 16:38:50,467 Quick Scan INFO Verdict for safe1.bin: no specific threat
2022-10-19 16:38:50,467 Quick Scan INFO Unscannable file test.pdf: verdict unknown
...
...
2022-10-19 16:38:50,467 Quick Scan INFO Removing artifacts from Sandbox
2022-10-19 16:39:55,389 Quick Scan INFO Scan completed
```
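
The per-file verdicts above come from polling the Quick Scan API until the submitted volume reports `done`. Simplified from the logic in `quickscan_target.py` (the function name here is illustrative), the pattern is roughly:

```python
import time

def wait_for_scan(scanner, scan_id, delay=3):
    """Poll the Quick Scan API until the volume finishes, then return per-sample results."""
    while True:
        scan = scanner.get_scans(ids=scan_id)
        resources = scan["body"].get("resources", [])
        if resources and resources[0]["status"] == "done":
            return resources[0]["samples"]
        # Not done yet (or results not populated), wait before checking again
        time.sleep(delay)
```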
315 changes: 315 additions & 0 deletions cloud-storage-protection/on-demand/quickscan_target.py
@@ -0,0 +1,315 @@
# pylint: disable=W1401
# flake8: noqa
"""Scan a GCP bucket with the CrowdStrike Quick Scan API.
_______ ___ ___ ___ _______ ___ ___ _______ _______ _______ ______ _______ _______ ___
| _ | Y | | _ | Y ) | _ | _ | _ | _ \ | _ | _ | |
|. | |. | |. |. 1___|. 1 / | 1___|. 1___|. 1 |. | | |. 1 |. 1 |. |
|. | |. | |. |. |___|. _ \ |____ |. |___|. _ |. | | |. _ |. ____|. |
|: 1 |: 1 |: |: 1 |: | \ |: 1 |: 1 |: | |: | | |: | |: | |: |
|::.. |::.. . |::.|::.. . |::.| . ) |::.. . |::.. . |::.|:. |::.| | |::.|:. |::.| |::.|
`----|:.`-------`---`-------`--- ---' `-------`-------`--- ---`--- ---' `--- ---`---' `---'
`--'
Scans a GCP Cloud Storage bucket using the CrowdStrike Falcon Quick Scan and Sample Uploads APIs.
Created // 10.19.22: carlos.matos@CrowdStrike
Based on // jshcodes@CrowdStrike - S3 Quick Scan Script
===== NOTES REGARDING THIS SOLUTION ============================================================
A VOLUME is a collection of files that are uploaded and then scanned as a single batch.
The bucket contents are inventoried, and then the contents are downloaded to local memory and
uploaded to the Sandbox API in a linear fashion. This method does NOT store the files on the local
file system. Due to the nature of this solution, the method is heavily impacted by data transfer
speeds. The recommended deployment pattern involves running in GCP within a container, a Compute
instance, or as a serverless Cloud Function. The entire bucket is scanned whether you like it or
not. You must specify a target that includes the string "gs://" in order to perform a scan.
The log file rotates because cool kids don't leave messes on other people's file systems.
This solution is dependent upon Google's Cloud Storage library and CrowdStrike FalconPy >= v0.8.7.
python3 -m pip install google-cloud-storage crowdstrike-falconpy
"""
# pylint: disable=E0401,R0903
import io
import os
import time
import argparse
import logging
from logging.handlers import RotatingFileHandler
# GCP Storage library
from google.cloud import storage
# !!! Requires FalconPy v0.8.7+ !!!
# Authorization, Sample Uploads and QuickScan Service Classes
from falconpy import OAuth2, SampleUploads, QuickScan


class Analysis:
"""Class to hold our analysis and status."""
def __init__(self):
self.uploaded = []
self.files = []
self.scanning = True
# Dynamically create our payload using the contents of our uploaded list
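        # dict.fromkeys() de-duplicates the SHA256s while preserving upload order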
self.payload = lambda: {"samples": list(dict.fromkeys(self.uploaded))}


class Configuration: # pylint: disable=R0902
"""Class to hold our running configuration."""
def __init__(self, args):
        # Map the requested log level onto a logging constant, defaulting to INFO
        self.log_level = logging.INFO
        levels = {"DEBUG": logging.DEBUG, "WARN": logging.WARN, "ERROR": logging.ERROR}
        if args.log_level and args.log_level.upper() in levels:
            self.log_level = levels[args.log_level.upper()]

        self.batch = 1000
        if args.batch:
            try:
                self.batch = int(args.batch)
            except ValueError:
                # Non-numeric batch size provided, keep the default
                pass
self.scan_delay = 3
if args.check_delay:
try:
self.scan_delay = int(args.check_delay)
except ValueError:
# They gave us garbage, ignore it
pass
# Will stop processing if you give us a bucket and no project
self.project = None
if args.project_id:
self.project = args.project_id
        # Target directory or bucket to be scanned
        self.bucket = False
        if "gs://" in args.target:
            self.target_dir = args.target.replace("gs://", "")
            self.bucket = True
# CrowdStrike API credentials
self.falcon_client_id = args.key
self.falcon_client_secret = args.secret


def submit_scan(incoming_analyzer: Analysis):
"""Submit the collected file batch for analysis."""
scanned = Scanner.scan_samples(body=incoming_analyzer.payload())
if scanned["status_code"] < 300:
# Submit our volume for analysis and grab the id of our scan submission
scan_id = scanned["body"]["resources"][0]
# Inform the user of our progress
logger.info("Scan %s submitted for analysis", scan_id)
# Retrieve our scan results from the API and report them
report_results(scan_uploaded_samples(incoming_analyzer, scan_id), incoming_analyzer)
else:
if "errors" in scanned["body"]:
logger.warning("%s. Unable to submit volume for scan.", scanned["body"]["errors"][0]["message"])
else:
# Rate limit only
logger.warning("Rate limit exceeded.")
    # Clean up our uploaded files from the Sandbox API
clean_up_artifacts(incoming_analyzer)

def upload_bucket_samples():
"""Retrieve keys from a bucket and then uploads them to the Sandbox API."""
if not Config.project:
logger.error("You must specify a project ID in order to scan a bucket target")
raise SystemExit(
"Target project ID not specified. Use -p or --project to specify the target project ID."
)
# Connect to GCP in our target project
gcs = storage.Client(project=Config.project)
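    # storage.Client() authenticates with Application Default Credentials
    # (e.g. Cloud Shell, gcloud auth, or an attached service account)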
# Connect to our target bucket
try:
bucket = gcs.get_bucket(Config.target_dir)
    except Exception as err:  # pylint: disable=W0703
        logger.error("Unable to connect to bucket %s. %s", Config.target_dir, err)
        raise SystemExit(
            f"Unable to connect to bucket {Config.target_dir}. {err}"
        ) from err
# Retrieve a list of all objects in the bucket
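    # (list_blobs() is recursive by default, so objects under any folder prefix are included)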
summaries = list(bucket.list_blobs())
# Inform the user as this may take a while
logger.info("Assembling volumes from target bucket (%s) for submission", Config.target_dir)
    # Loop through our list of files, downloading each to memory and uploading it to the Sandbox
analyzer = None
analyzed = []
for item in summaries:
if not analyzer:
analyzer = Analysis()
# Grab the file name
filename = os.path.basename(item.name)
# Upload the file to the CrowdStrike Falcon Sandbox
response = Samples.upload_sample(file_name=filename,
file_data=io.BytesIO(
item.download_as_bytes()
)
)
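        # The object's bytes were held in memory only; nothing was written to local disk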
# Retrieve our uploaded file SHA256 identifier
sha = response["body"]["resources"][0]["sha256"]
# Add this SHA256 to the upload payload element
analyzer.uploaded.append(sha)
# Track the upload so we recognize the file when we're done
analyzer.files.append([filename, item.name, sha])
# Inform the user of our progress
logger.info("Uploaded %s to %s", filename, sha)
        if len(analyzer.uploaded) == Config.batch:
            analyzed.append(analyzer)
            submit_scan(analyzer)
            analyzer = None

    # Submit any partial batch left after the loop, skipping the submission
    # entirely if the last file landed exactly on a batch boundary
    if analyzer:
        analyzed.append(analyzer)
        submit_scan(analyzer)


def scan_uploaded_samples(incoming_analyzer: Analysis, scan_id: str) -> list:
    """Retrieve a scan using the ID of the scan provided by the scan submission."""
    results = []
    while incoming_analyzer.scanning:
        # Retrieve the scan results
        scan_results = Scanner.get_scans(ids=scan_id)
        try:
            if scan_results["body"]["resources"][0]["status"] == "done":
                # Scan is complete, retrieve our results
                results = scan_results["body"]["resources"][0]["samples"]
                # and break out of the loop
                incoming_analyzer.scanning = False
            else:
                # Not done yet, sleep for a bit
                time.sleep(Config.scan_delay)
        except IndexError:
            # Results aren't populated yet, wait before checking again
            time.sleep(Config.scan_delay)

    return results


def report_results(results: list, incoming_analyzer: Analysis):
    """Report the verdict for every file in the submitted scan."""
    # Loop through our results; each tracked file is [filename, object name, sha256]
for result in results:
for item in incoming_analyzer.files:
if result["sha256"] == item[2]:
if "no specific threat" in result["verdict"]:
# File is clean
logger.info("Verdict for %s: %s", item[1], result["verdict"])
else:
if "error" in result:
# Unscannable
logger.info("Unscannable file %s: verdict %s", item[1], result["verdict"])
else:
# Mitigation would trigger from here
logger.warning("Verdict for %s: %s", item[1], result["verdict"])


def clean_up_artifacts(incoming_analyzer: Analysis):
"""Remove uploaded files from the Sandbox."""
logger.info("Removing artifacts from Sandbox")
for item in incoming_analyzer.uploaded:
# Perform the delete
response = Samples.delete_sample(ids=item)
if response["status_code"] > 201:
# File was not removed, log the failure
logger.warning("Failed to delete %s", item)
else:
logger.debug("Deleted %s", item)
logger.debug("Artifact cleanup complete")


def parse_command_line():
"""Parse any inbound command line arguments and set defaults."""
parser = argparse.ArgumentParser("Falcon Quick Scan")
parser.add_argument("-l", "--log-level",
dest="log_level",
help="Default log level (DEBUG, WARN, INFO, ERROR)",
required=False
)
parser.add_argument("-d", "--check-delay",
dest="check_delay",
help="Delay between checks for scan results",
required=False
)
parser.add_argument("-b", "--batch",
dest="batch",
help="The number of files to include in a volume to scan.",
required=False
)
parser.add_argument("-p", "--project",
dest="project_id",
help="Project ID the target bucket resides in",
required=True
)
parser.add_argument("-t", "--target",
dest="target",
help="Target folder or bucket to scan. Bucket must have 'gs://' prefix.",
required=True
)
parser.add_argument("-k", "--key",
dest="key",
help="CrowdStrike Falcon API KEY",
required=True
)
parser.add_argument("-s", "--secret",
dest="secret",
help="CrowdStrike Falcon API SECRET",
required=True
)

return parser.parse_args()


def load_api_config():
"""Return an instance of the authentication class using our provided credentials."""
return OAuth2(client_id=Config.falcon_client_id,
client_secret=Config.falcon_client_secret
)


def enable_logging():
"""Configure logging."""
logging.basicConfig(level=Config.log_level,
format="%(asctime)s %(name)s %(levelname)s %(message)s"
)
# Create our logger
log = logging.getLogger("Quick Scan")
# Rotate log file handler
rfh = RotatingFileHandler("falcon_quick_scan.log", maxBytes=20971520, backupCount=5)
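    # 20971520 bytes = 20 MiB per log file, with up to five rotated backups kept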
# Log file output format
f_format = logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s')
# Set the log file output level
rfh.setLevel(Config.log_level)
# Add our log file formatter to the log file handler
rfh.setFormatter(f_format)
# Add our log file handler to our logger
log.addHandler(rfh)

return log


if __name__ == '__main__':
# Parse the inbound command line parameters and setup our running Config object
Config = Configuration(parse_command_line())
# Activate logging
logger = enable_logging()
# Grab our authentication object
auth = load_api_config()
# Connect to the Samples Sandbox API
Samples = SampleUploads(auth_object=auth)
# Connect to the Quick Scan API
Scanner = QuickScan(auth_object=auth)
# Log that startup is done
logger.info("Process startup complete, preparing to run scan")
# Upload our samples to the Sandbox
if Config.bucket:
# GCP bucket
upload_bucket_samples()
else:
NOT_A_BUCKET = "Invalid bucket name specified. Please include 'gs://' in your target."
raise SystemExit(NOT_A_BUCKET)
# We're done, let everyone know
logger.info("Scan completed")