Add Prioritized CUDs attribution tool. #395

Merged
merged 30 commits on Apr 13, 2020
Commits
f2dcfd4
Add Priortized CUDs attribution tool.
arifkasim Jan 24, 2020
a88ed7e
Applying the python formatting
bipinupd Mar 27, 2020
053152f
Formatting and rremoving dead code
bipinupd Mar 27, 2020
32eb335
Fixed linking to location and gsutil update
kardiff18 Mar 30, 2020
567135d
Changed from Python operator to BQOperator
kardiff18 Mar 30, 2020
36d2ff9
Updated gcp airflow pypi package
kardiff18 Mar 30, 2020
3b38602
Delete Table operator
kardiff18 Mar 31, 2020
fef3295
File based test
bipinupd Apr 1, 2020
b3ca13a
Run test in all the samples
bipinupd Apr 2, 2020
debbb2e
shell cleanup
arifkasim Apr 2, 2020
756191d
pytest clean
arifkasim Apr 2, 2020
5966967
testing cost option b tests properly.
arifkasim Apr 6, 2020
cb11c1f
Standardize style.
arifkasim Apr 6, 2020
06d18f8
rewrite to use CSV writer and standard tempfile
arifkasim Apr 7, 2020
e165f5e
Simplify combine schedule logic
arifkasim Apr 7, 2020
c3fecd4
Adding Dockerfile for automated testing
arifkasim Apr 7, 2020
313a843
Point Dockerfile to the correct GoogleCloudPlatform PSO github repo.
arifkasim Apr 7, 2020
3d93e91
update main.py to cud_correction_dag.py, fix requirements.txt, change…
kardiff18 Apr 9, 2020
3e66377
updated new directory location
kardiff18 Apr 9, 2020
b0703c5
added .airflowignore file and fix file error
kardiff18 Apr 9, 2020
c329c78
Code cleanup.
arifkasim Apr 9, 2020
52474a0
Pin TF versioning, add .airflowignore
kardiff18 Apr 9, 2020
a02c400
Merge branch 'master' of https://github.com/arifkasim/professional-se…
kardiff18 Apr 9, 2020
811066f
Merge branch 'master' into master
Apr 9, 2020
b249188
fix gcs upload deployment; update 0.12 language configs
kardiff18 Apr 10, 2020
3ebb79f
delete json config
kardiff18 Apr 10, 2020
1e5f540
updating to use jinja templating
kardiff18 Apr 11, 2020
6de9b88
simplify gcs upload on gcloud
kardiff18 Apr 11, 2020
6b413e8
Refactor tests to be compatible with Jinja templates.
arifkasim Apr 11, 2020
369b473
Merge branch 'master' into master
Apr 13, 2020
1 change: 1 addition & 0 deletions README.md
@@ -63,6 +63,7 @@ The tools folder contains ready-made utilities which can simplify Google Cloud P
Python package that provides support tools for Cloud AI Vision. Currently
there are a few scripts for generating an AutoML Vision dataset CSV file from
either raw images or image annotation files in PASCAL VOC format.
* [CUD Prioritized Attribution](tools/cuds-prioritized-attribution) - A tool that allows GCP customers who purchased Committed Use Discounts (CUDs) to prioritize a specific scope (e.g. project or folder) to attribute CUDs first before letting any unconsumed discount float to other parts of an organization.
* [DNS Sync](tools/dns-sync) - Sync a Cloud DNS zone with GCE resources. Instances and load balancers are added to the cloud DNS zone as they start from compute_engine_activity log events sent from a pub/sub push subscription. Can sync multiple projects to a single Cloud DNS zone.
* [GCE Disk Encryption Converter](tools/gce-google-keys-to-cmek) - A tool that converts disks attached to a GCE VM instance from Google-managed keys to a customer-managed key stored in Cloud KMS.
* [GCE Quota Sync](tools/gce-quota-sync) - A tool that fetches resource quota usage from the GCE API and synchronizes it to Stackdriver as a custom metric, where it can be used to define automated alerts.
1 change: 1 addition & 0 deletions helpers/exclusion_list.txt
@@ -8,6 +8,7 @@ tools/bq-visualizer
tools/cloudconnect
tools/cloudera-parcel-gcsconnector
tools/cloud-vision-utils
tools/cuds-prioritized-attribution
tools/dataproc-edge-node
tools/dns-sync
tools/gce-google-keys-to-cmek
402 changes: 402 additions & 0 deletions tools/cuds-prioritized-attribution/README.md

Large diffs are not rendered by default.

@@ -0,0 +1,14 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

@@ -0,0 +1,313 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

query = """
CREATE TEMP FUNCTION string_to_label_array(labels_str STRING)
RETURNS ARRAY<STRUCT<key STRING, value STRING>>
LANGUAGE js
AS '''
return JSON.parse(labels_str);
''';

CREATE TEMP FUNCTION
commitmentSKUToNegationSKU(sku_desc STRING)
RETURNS STRING AS ( IF(REGEXP_CONTAINS(sku_desc, r"[cC]ommitment v[0-9]: [a-zA-Z]+ in [a-zA-Z0-9\\-]+ for [0-9]+ [_a-zA-Z]+"),
CONCAT(
--prefix
"Reattribution_Negation_CUD_",
--number
REGEXP_EXTRACT(sku_desc, r"[cC]ommitment v[0-9]: [a-zA-Z]+ in [a-zA-Z0-9\\-]+ for ([0-9]+) [_a-zA-Z]+"),
--timeframe
REGEXP_EXTRACT(sku_desc, r"[cC]ommitment v[0-9]: [a-zA-Z]+ in [a-zA-Z0-9\\-]+ for [0-9]+ ([_a-zA-Z]+)"), "_",
--UPPER(type)
UPPER(REGEXP_EXTRACT(sku_desc, r"[cC]ommitment v[0-9]: ([a-zA-Z]+) in [a-zA-Z0-9\\-]+ for [0-9]+ [_a-zA-Z]+")), "_COST_",
--region
REGEXP_EXTRACT(sku_desc, r"[cC]ommitment v[0-9]: [a-zA-Z]+ in ([a-zA-Z0-9\\-]+) for [0-9]+ [_a-zA-Z]+") ),
NULL));

CREATE TEMP FUNCTION
regionMapping(gcp_region STRING)
RETURNS STRING AS (
CASE
WHEN gcp_region IS NULL THEN NULL
WHEN gcp_region LIKE "us-%"
OR gcp_region LIKE "northamerica%"
OR gcp_region LIKE "southamerica%" THEN "Americas"
WHEN gcp_region LIKE "europe-%" THEN "EMEA"
WHEN gcp_region LIKE "australia-%"
OR gcp_region LIKE "asia-%" THEN "APAC" END);

CREATE TEMP FUNCTION
ratio(numerator FLOAT64, denominator FLOAT64)
AS (IF(denominator = 0,
0,
numerator / denominator));
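
-- Illustrative examples for the temp functions above (the SKU string below is
-- hypothetical, not taken from a real billing export):
--   commitmentSKUToNegationSKU("Commitment v1: Cpu in us-central1 for 1 year")
--     returns "Reattribution_Negation_CUD_1year_CPU_COST_us-central1"
--   regionMapping("europe-west1") returns "EMEA"; regionMapping("asia-east1") returns "APAC"
--   ratio(5, 0) returns 0, guarding downstream math against division-by-zero errors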

(
WITH
billing_export_table AS (
SELECT
*
FROM
`{billing_export_table_name}`
WHERE
CAST(DATETIME(usage_start_time, "America/Los_Angeles") AS DATE) >= "2018-09-20"),

correct_cud_costs AS (
SELECT
billing_account_id AS billing_account_id,
STRUCT ( service_id AS id,
service_description AS description) AS service,
STRUCT (CONCAT("Reattribution_Addition_CUD_", IF(LOWER(unit_type) LIKE "ram",
"RAM_COST",
"CORE_COST"), "_", regionMapping(region)) AS id,
CONCAT("Reattribution_Addition_CUD_", IF(LOWER(unit_type) LIKE "ram",
"RAM_COST",
"CORE_COST"), "_", regionMapping(region)) AS description) AS sku,
TIMESTAMP_ADD(TIMESTAMP(usage_date), INTERVAL ((3600*23)+3599) SECOND) AS usage_start_time,
TIMESTAMP_ADD(TIMESTAMP(usage_date), INTERVAL ((3600*23)+3599) SECOND) AS usage_end_time,
STRUCT (project_id AS id,
project_name AS name,
ARRAY<STRUCT<key STRING,
value STRING>> [] AS labels,
ancestry_numbers AS ancestry_numbers) AS project,
string_to_label_array(d.labels) as labels,
ARRAY<STRUCT<key STRING,value STRING>> [] AS system_labels,
STRUCT ( "" AS location,
"" AS country,
region AS region,
"" AS zone ) AS location,
CURRENT_TIMESTAMP() AS export_time,
P_alloc_commitment_cost_{cud_cost_attribution_option} AS cost,
"USD" AS currency,
1.0 AS currency_conversion_rate,
STRUCT ( 0.0 AS amount,
IF(LOWER(unit_type) LIKE "ram", "byte-seconds", "gibibyte hour") AS unit,
0.0 AS amount_in_pricing_units,
IF(LOWER(unit_type) LIKE "ram", "seconds", "hour") AS pricing_unit ) AS usage,
ARRAY<STRUCT<name STRING,
amount FLOAT64>> [] AS credits,
STRUCT ( FORMAT_DATE("%Y%m", usage_date) AS month) AS invoice,
cost_type
FROM
`{project_id}.{corrected_dataset_id}.{distribute_commitments_table}` d
WHERE
{enable_cud_cost_attribution}
AND P_alloc_commitment_cost_{cud_cost_attribution_option} <> 0
),

correct_cud_credits AS (
SELECT
billing_account_id AS billing_account_id,
STRUCT ( service_id AS id,
service_description AS description) AS service,
STRUCT ( CONCAT("Reattribution_Addition_CUD_", IF(LOWER(unit_type) LIKE "ram","RAM",
"CORE"), "_CREDIT_", regionMapping(region)) AS id,
CONCAT("Reattribution_Addition_CUD_", IF(LOWER(unit_type) LIKE "ram","RAM",
"CORE"), "_CREDIT_", regionMapping(region)) AS description) AS sku,
TIMESTAMP_ADD(TIMESTAMP(usage_date), INTERVAL ((3600*23)+3599) SECOND) AS usage_start_time,
TIMESTAMP_ADD(TIMESTAMP(usage_date), INTERVAL ((3600*23)+3599) SECOND) AS usage_end_time,
STRUCT ( project_id AS id,
project_name AS name,
ARRAY<STRUCT<key STRING,value STRING>> [] AS labels,
ancestry_numbers AS ancestry_numbers) AS project,
string_to_label_array(d.labels) as labels,
ARRAY<STRUCT<key STRING,value STRING>> [] AS system_labels,
STRUCT ( region AS location,
"" AS country,
region AS region,
"" AS zone ) AS location,
CURRENT_TIMESTAMP() AS export_time,
0.0 AS cost,
"USD" AS currency,
1.0 AS currency_conversion_rate,
STRUCT ( 0.0 AS amount,
IF(LOWER(unit_type) LIKE "ram", "byte-seconds", "seconds") AS unit,
0.0 AS amount_in_pricing_units,
IF(LOWER(unit_type) LIKE "ram", "byte-seconds", "seconds") AS pricing_unit
) AS usage,
ARRAY<STRUCT<name STRING,
amount FLOAT64>>[(IF(LOWER(unit_type) LIKE "ram",
"Committed Usage Discount: RAM",
"Committed Usage Discount: CPU"),
P_alloc_cud_credit_cost
)] AS credits,
STRUCT ( FORMAT_DATE("%Y%m", usage_date) AS month) AS invoice,
cost_type
FROM
`{project_id}.{corrected_dataset_id}.{distribute_commitments_table}` d
WHERE
P_alloc_cud_credit_cost <> 0

),
cancelled_credits AS (
SELECT
billing_account_id,
service AS service,
sku,
usage_start_time,
usage_end_time,
project AS project,
labels,
system_labels,
location AS location,
export_time,
0.0 AS cost,
currency,
currency_conversion_rate,
STRUCT( 0.0 AS amount,
usage.unit AS unit,
0.0 AS amount_in_pricing_units,
usage.pricing_unit AS pricing_unit) AS usage,
ARRAY<STRUCT<name STRING,amount FLOAT64>> [(cs.name,-1*cs.amount)] AS credits,
invoice,
cost_type
FROM
billing_export_table,
UNNEST(credits) AS cs
WHERE
service.description = "Compute Engine"
AND (
FALSE
OR (LOWER(sku.description) LIKE "%instance%"
OR LOWER(sku.description) LIKE "% intel %")
OR LOWER(sku.description) LIKE "%memory optimized core%"
OR LOWER(sku.description) LIKE "%memory optimized ram%"
OR LOWER(sku.description) LIKE "%commitment%")
-- Filter out Sole Tenancy skus that do not represent billable compute instance usage
AND NOT
( FALSE
-- the VMs that run on sole tenancy nodes are not actually billed. Just the sole tenant node is
OR LOWER(sku.description) LIKE "%hosted on sole tenancy%"
-- sole tenancy premium charge is not eligible instance usage
OR LOWER(sku.description) LIKE "sole tenancy premium%"
)
AND (LOWER(cs.name) LIKE "%committed%" OR LOWER(cs.name) LIKE "%sustained%")
),

cancelled_cud_costs AS (
SELECT
billing_account_id,
service AS service,
STRUCT ( commitmentSKUToNegationSKU(sku.description ) AS id,
commitmentSKUToNegationSKU(sku.description) AS description
) AS sku,
usage_start_time,
usage_end_time,
project AS project,
labels,
system_labels,
location AS location,
export_time,
-1.0*cost AS cost,
currency,
currency_conversion_rate,
STRUCT( 0.0 AS amount,
usage.unit AS unit,
0.0 AS amount_in_pricing_units,
usage.pricing_unit AS pricing_unit) AS usage,
ARRAY<STRUCT<name STRING,amount FLOAT64>> [] AS credits,
invoice,
cost_type
FROM
billing_export_table
WHERE
{enable_cud_cost_attribution}
AND service.description = "Compute Engine"
AND LOWER(sku.description) LIKE "%commitment%"
AND cost <> 0
),

correct_sud_credits AS (
SELECT
billing_account_id AS billing_account_id,
STRUCT ( service_id AS id,
service_description AS description) AS service,
STRUCT ( "Reattribution_Addition_SUD_CREDIT" AS id,
"Reattribution_Addition_SUD_CREDIT" AS description) AS sku,
TIMESTAMP_ADD(TIMESTAMP(usage_date), INTERVAL ((3600*23)+3599) SECOND) AS usage_start_time,
TIMESTAMP_ADD(TIMESTAMP(usage_date), INTERVAL ((3600*23)+3599) SECOND) AS usage_end_time,
STRUCT ( project_id AS id,
project_name AS name,
ARRAY<STRUCT<key STRING,
value STRING>> [] AS labels,
ancestry_numbers AS ancestry_numbers)
AS project,
string_to_label_array(d.labels) as labels,
ARRAY<STRUCT<key STRING,
value STRING>> [] AS system_labels,
STRUCT ( "" AS location,
"" AS country,
region AS region,
"" AS zone ) AS location,
CURRENT_TIMESTAMP() AS export_time,
0.0 AS cost,
"USD" AS currency,
1.0 AS currency_conversion_rate,
STRUCT ( 0.0 AS amount,
IF(LOWER(unit_type) LIKE "ram", "byte-seconds", "seconds") AS unit,
0.0 AS amount_in_pricing_units,
IF(LOWER(unit_type) LIKE "ram", "byte-seconds", "seconds") AS pricing_unit
) AS usage,
ARRAY<STRUCT<name STRING,
amount FLOAT64>> [("Sustained Usage Discount",
P_alloc_sud_credit_cost)] AS credits,
STRUCT ( FORMAT_DATE("%Y%m", usage_date) AS month) AS invoice,
cost_type
FROM
`{project_id}.{corrected_dataset_id}.{distribute_commitments_table}` d
WHERE
P_alloc_sud_credit_cost <> 0)

SELECT
*
FROM
correct_sud_credits

UNION ALL

SELECT
*
FROM
correct_cud_credits

UNION ALL

SELECT
*
FROM
cancelled_credits

UNION ALL

SELECT
*
FROM
correct_cud_costs

UNION ALL

SELECT
*
FROM
cancelled_cud_costs

UNION ALL

SELECT
*
FROM
`{billing_export_table_name}`
)
"""
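
The `{...}` placeholders in the query above (`billing_export_table_name`, `cud_cost_attribution_option`, and so on) are filled in at render time; later commits in this PR move rendering to Jinja templating inside the Airflow DAG. A minimal sketch of the substitution using Python's `str.format`, with illustrative values — the placeholder names match the query, but the table name and option values here are hypothetical:

```python
# Minimal sketch of placeholder substitution for the templated query above.
# The real DAG renders these via Jinja/BigQueryOperator; plain str.format
# behaves the same for simple {placeholder} fields. All values are examples.

query_template = """
SELECT *
FROM `{billing_export_table_name}`
WHERE {enable_cud_cost_attribution}
  AND P_alloc_commitment_cost_{cud_cost_attribution_option} <> 0
"""

params = {
    # Hypothetical billing export table, not a real resource.
    "billing_export_table_name": "my-project.billing.gcp_billing_export_v1",
    # Substituted as a literal SQL predicate (TRUE/FALSE).
    "enable_cud_cost_attribution": "TRUE",
    # Selects which allocation column the query reads (option "a" or "b").
    "cud_cost_attribution_option": "a",
}

rendered = query_template.format(**params)
print(rendered)
```

Because the substituted strings are spliced directly into SQL, they must come from trusted configuration, not user input.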