[dlp] testing: fix Pub/Sub notifications #3925

Merged: 7 commits, Jun 3, 2020
34 changes: 18 additions & 16 deletions dlp/README.rst
@@ -14,6 +14,15 @@ This directory contains samples for Google Data Loss Prevention.

.. _Google Data Loss Prevention: https://cloud.google.com/dlp/docs/

To run the sample, you need to enable the API at: https://console.cloud.google.com/apis/library/dlp.googleapis.com


To run the sample, you need to have the following roles:
* `DLP Administrator`
* `DLP API Service Agent`
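The API-enablement and role requirements above can also be satisfied from the command line. The commands below are one possible way to do that, not part of this PR; `MY_PROJECT` and the member address are placeholders, and the mapping of the role names to the `roles/dlp.admin` and `roles/dlp.serviceAgent` ids is an assumption to verify against the IAM roles reference.

```bash
# Enable the DLP API (equivalent to visiting the console URL above).
gcloud services enable dlp.googleapis.com

# Grant the roles listed above; role ids are assumed mappings for
# "DLP Administrator" and "DLP API Service Agent".
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member="user:you@example.com" --role="roles/dlp.admin"
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member="user:you@example.com" --role="roles/dlp.serviceAgent"
```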



Setup
-------------------------------------------------------------------------------

@@ -58,15 +67,6 @@ Install Dependencies
.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

#. For running *_test.py files, install test dependencies

.. code-block:: bash

$ pip install -r requirements-test.txt
$ pytest inspect_content_test.py

** *_test.py files are demo wrappers and make API calls. You may get rate-limited when making a high number of requests. **

Samples
-------------------------------------------------------------------------------

@@ -83,7 +83,7 @@ To run this sample:

.. code-block:: bash

$ python quickstart.py <project-id>
$ python quickstart.py


Inspect Content
@@ -101,15 +101,16 @@ To run this sample:

$ python inspect_content.py

usage: inspect_content.py [-h] {string,file,gcs,datastore,bigquery} ...
usage: inspect_content.py [-h] {string,table,file,gcs,datastore,bigquery} ...

Sample app that uses the Data Loss Prevention API to inspect a string, a local
file or a file on Google Cloud Storage.

positional arguments:
{string,file,gcs,datastore,bigquery}
{string,table,file,gcs,datastore,bigquery}
Select how to submit content to the API.
string Inspect a string.
table Inspect a table.
file Inspect a local file.
gcs Inspect files on Google Cloud Storage.
datastore Inspect files on Google Datastore.
@@ -135,13 +136,14 @@ To run this sample:

$ python redact.py

usage: redact.py [-h] [--project PROJECT] [--info_types INFO_TYPES]
usage: redact.py [-h] [--project PROJECT]
[--info_types INFO_TYPES [INFO_TYPES ...]]
[--min_likelihood {LIKELIHOOD_UNSPECIFIED,VERY_UNLIKELY,UNLIKELY,POSSIBLE,LIKELY,VERY_LIKELY}]
[--mime_type MIME_TYPE]
filename output_filename

Sample app that uses the Data Loss Prevent API to redact the contents of a
string or an image file.
Sample app that uses the Data Loss Prevention API to redact the contents of an
image file.

positional arguments:
filename The path to the file to inspect.
@@ -151,7 +153,7 @@ To run this sample:
-h, --help show this help message and exit
--project PROJECT The Google Cloud project id to use as a parent
resource.
--info_types INFO_TYPES
--info_types INFO_TYPES [INFO_TYPES ...]
Strings representing info types to look for. A full
list of info categories and types is available from
the API. Examples include "FIRST_NAME", "LAST_NAME",
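The updated usage line `--info_types INFO_TYPES [INFO_TYPES ...]` shows the flag now accepts one or more values. A minimal sketch of how such a flag is typically declared with argparse; this parser is illustrative, not the actual redact.py source:

```python
import argparse

# Hypothetical parser reproducing the flag shape from the usage string:
# nargs="+" yields the "INFO_TYPES [INFO_TYPES ...]" form, i.e. one or
# more space-separated values collected into a list.
parser = argparse.ArgumentParser(description="redact.py flag sketch")
parser.add_argument(
    "--info_types", nargs="+",
    help="Strings representing info types to look for.")

args = parser.parse_args(["--info_types", "FIRST_NAME", "LAST_NAME"])
print(args.info_types)  # ['FIRST_NAME', 'LAST_NAME']
```

With the old single-value form, only one info type could be passed per invocation; `nargs="+"` is the idiomatic argparse way to lift that limit without changing the flag name.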
8 changes: 7 additions & 1 deletion dlp/README.rst.in
@@ -4,7 +4,7 @@ product:
name: Google Data Loss Prevention
short_name: Data Loss Prevention
url: https://cloud.google.com/dlp/docs/
description: >
description: >
`Google Data Loss Prevention`_ provides programmatic access to a powerful
detection engine for personally identifiable information and other
privacy-sensitive data in unstructured data streams.
@@ -13,6 +13,12 @@ setup:
- auth
- install_deps

required_api_url: https://console.cloud.google.com/apis/library/dlp.googleapis.com

required_roles:
- DLP Administrator
- DLP API Service Agent

samples:
- name: Quickstart
file: quickstart.py
20 changes: 0 additions & 20 deletions dlp/conftest.py

This file was deleted.

21 changes: 12 additions & 9 deletions dlp/inspect_content.py
@@ -459,11 +459,12 @@ def inspect_gcs_file(
url = "gs://{}/{}".format(bucket, filename)
storage_config = {"cloud_storage_options": {"file_set": {"url": url}}}

# Convert the project id into a full resource id.
parent = dlp.project_path(project)
# Convert the project id into full resource ids.
topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
parent = dlp.location_path(project, 'global')

# Tell the API where to send a notification when the job is complete.
actions = [{"pub_sub": {"topic": "{}/topics/{}".format(parent, topic_id)}}]
actions = [{"pub_sub": {"topic": topic}}]

# Construct the inspect_job, which defines the entire inspect content task.
inspect_job = {
@@ -623,11 +624,12 @@ def inspect_datastore(
}
}

# Convert the project id into a full resource id.
parent = dlp.project_path(project)
# Convert the project id into full resource ids.
topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
parent = dlp.location_path(project, 'global')

# Tell the API where to send a notification when the job is complete.
actions = [{"pub_sub": {"topic": "{}/topics/{}".format(parent, topic_id)}}]
actions = [{"pub_sub": {"topic": topic}}]

# Construct the inspect_job, which defines the entire inspect content task.
inspect_job = {
@@ -790,11 +792,12 @@ def inspect_bigquery(
}
}

# Convert the project id into a full resource id.
parent = dlp.project_path(project)
# Convert the project id into full resource ids.
topic = google.cloud.pubsub.PublisherClient.topic_path(project, topic_id)
parent = dlp.location_path(project, 'global')

# Tell the API where to send a notification when the job is complete.
actions = [{"pub_sub": {"topic": "{}/topics/{}".format(parent, topic_id)}}]
actions = [{"pub_sub": {"topic": topic}}]

# Construct the inspect_job, which defines the entire inspect content task.
inspect_job = {
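The core fix repeated across `inspect_gcs_file`, `inspect_datastore`, and `inspect_bigquery`: the old code appended `"/topics/{topic_id}"` to the parent, which only produced a valid topic name while the parent was a project path; once the parent becomes a location path, the topic name must be built independently. A self-contained sketch of the resulting resource names — the project and topic ids are hypothetical, and the helpers mirror what `PublisherClient.topic_path` and `dlp.location_path` return:

```python
# Stand-ins for the client-library helpers used in the fixed code.
def topic_path(project, topic_id):
    # Mirrors google.cloud.pubsub.PublisherClient.topic_path
    return "projects/{}/topics/{}".format(project, topic_id)

def location_path(project, location):
    # Mirrors dlp.location_path
    return "projects/{}/locations/{}".format(project, location)

project = "my-project"   # hypothetical project id
topic_id = "dlp-notify"  # hypothetical topic id

# The topic is now built from the project id directly, not from `parent`.
topic = topic_path(project, topic_id)
parent = location_path(project, "global")

# The Pub/Sub action receives a proper topic resource name.
actions = [{"pub_sub": {"topic": topic}}]

print(topic)   # projects/my-project/topics/dlp-notify
print(parent)  # projects/my-project/locations/global
```

Under the old scheme, `"{}/topics/{}".format(parent, topic_id)` with the new parent would have yielded `projects/my-project/locations/global/topics/dlp-notify`, which is not a valid Pub/Sub topic name — hence notifications were never delivered.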
37 changes: 23 additions & 14 deletions dlp/inspect_content_test.py
@@ -40,6 +40,8 @@
BIGQUERY_DATASET_ID = "dlp_test_dataset" + UNIQUE_STRING
BIGQUERY_TABLE_ID = "dlp_test_table" + UNIQUE_STRING

TIMEOUT = 300 # 5 minutes


@pytest.fixture(scope="module")
def bucket():
@@ -298,6 +300,7 @@ def cancel_operation(out):
client.cancel_dlp_job(operation_id)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_gcs_file(bucket, topic_id, subscription_id, capsys):
try:
inspect_content.inspect_gcs_file(
@@ -307,15 +310,16 @@ def test_inspect_gcs_file(bucket, topic_id, subscription_id, capsys):
topic_id,
subscription_id,
["EMAIL_ADDRESS", "PHONE_NUMBER"],
timeout=1
timeout=TIMEOUT
)

out, _ = capsys.readouterr()
assert "Inspection operation started" in out
assert "Info type: EMAIL_ADDRESS" in out
finally:
cancel_operation(out)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_gcs_file_with_custom_info_types(
bucket, topic_id, subscription_id, capsys):
try:
@@ -331,15 +335,16 @@ def test_inspect_gcs_file_with_custom_info_types(
[],
custom_dictionaries=dictionaries,
custom_regexes=regexes,
timeout=1)
timeout=TIMEOUT)

out, _ = capsys.readouterr()

assert "Inspection operation started" in out
assert "Info type: EMAIL_ADDRESS" in out
finally:
cancel_operation(out)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_gcs_file_no_results(
bucket, topic_id, subscription_id, capsys):
try:
@@ -350,15 +355,16 @@ def test_inspect_gcs_file_no_results(
topic_id,
subscription_id,
["EMAIL_ADDRESS", "PHONE_NUMBER"],
timeout=1)
timeout=TIMEOUT)

out, _ = capsys.readouterr()

assert "Inspection operation started" in out
assert "No findings" in out
finally:
cancel_operation(out)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_gcs_image_file(bucket, topic_id, subscription_id, capsys):
try:
inspect_content.inspect_gcs_file(
@@ -368,14 +374,15 @@ def test_inspect_gcs_image_file(bucket, topic_id, subscription_id, capsys):
topic_id,
subscription_id,
["EMAIL_ADDRESS", "PHONE_NUMBER"],
timeout=1)
timeout=TIMEOUT)

out, _ = capsys.readouterr()
assert "Inspection operation started" in out
assert "Info type: EMAIL_ADDRESS" in out
finally:
cancel_operation(out)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_gcs_multiple_files(bucket, topic_id, subscription_id, capsys):
try:
inspect_content.inspect_gcs_file(
@@ -385,15 +392,16 @@ def test_inspect_gcs_multiple_files(bucket, topic_id, subscription_id, capsys):
topic_id,
subscription_id,
["EMAIL_ADDRESS", "PHONE_NUMBER"],
timeout=1)
timeout=TIMEOUT)

out, _ = capsys.readouterr()

assert "Inspection operation started" in out
assert "Info type: EMAIL_ADDRESS" in out
finally:
cancel_operation(out)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_datastore(
datastore_project, topic_id, subscription_id, capsys):
try:
@@ -404,14 +412,15 @@ def test_inspect_datastore(
topic_id,
subscription_id,
["FIRST_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"],
timeout=1)
timeout=TIMEOUT)

out, _ = capsys.readouterr()
assert "Inspection operation started" in out
assert "Info type: EMAIL_ADDRESS" in out
finally:
cancel_operation(out)


@pytest.mark.flaky(max_runs=2, min_passes=1)
def test_inspect_datastore_no_results(
datastore_project, topic_id, subscription_id, capsys):
try:
@@ -422,10 +431,10 @@ def test_inspect_datastore_no_results(
topic_id,
subscription_id,
["PHONE_NUMBER"],
timeout=1)
timeout=TIMEOUT)

out, _ = capsys.readouterr()
assert "Inspection operation started" in out
assert "No findings" in out
finally:
cancel_operation(out)
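The tests above all adopt one pattern: a realistic `TIMEOUT` instead of `timeout=1` (which expired before any Pub/Sub notification could arrive), one retry via `@pytest.mark.flaky(max_runs=2, min_passes=1)`, and a `finally` block that cancels the DLP job by parsing its id back out of captured stdout. A sketch of what that parse step might look like — the exact printed format is an assumption based on the `"Inspection operation started"` strings the tests assert on, not the actual `cancel_operation` source:

```python
import re

TIMEOUT = 300  # seconds; replaces the old timeout=1, which gave up immediately

def extract_job_name(out):
    # Hypothetical helper: recover the job name a sample prints so the
    # finally block can cancel the job even when an assertion fails.
    match = re.search(r"Inspection operation started: (\S+)", out)
    return match.group(1) if match else None

sample = "Inspection operation started: projects/my-project/dlpJobs/i-123\n"
print(extract_job_name(sample))  # projects/my-project/dlpJobs/i-123
```

Cancelling in `finally` keeps long-running jobs from piling up across retries, which matters once the flaky marker can run each test twice.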
