New: [AEA-4487] - Service Search Alerts #1318

Merged: 7 commits, Sep 30, 2024
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -117,6 +117,7 @@ jobs:
CREATE_INT_RELEASE_NOTES: true
CREATE_PROD_RELEASE_NOTES: true
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.DEV_CLOUD_FORMATION_DEPLOY_ROLE }}
@@ -173,6 +174,7 @@ jobs:
LOG_LEVEL: DEBUG
LOG_RETENTION_DAYS: 30
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.QA_CLOUD_FORMATION_DEPLOY_ROLE }}
1 change: 1 addition & 0 deletions .github/workflows/pull_request.yml
@@ -73,6 +73,7 @@ jobs:
LOG_LEVEL: DEBUG
LOG_RETENTION_DAYS: 30
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: false
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.DEV_CLOUD_FORMATION_DEPLOY_ROLE }}
5 changes: 5 additions & 0 deletions .github/workflows/release.yml
@@ -131,6 +131,7 @@ jobs:
CREATE_INT_RELEASE_NOTES: true
CREATE_PROD_RELEASE_NOTES: true
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.DEV_CLOUD_FORMATION_DEPLOY_ROLE }}
@@ -187,6 +188,7 @@ jobs:
LOG_LEVEL: DEBUG
LOG_RETENTION_DAYS: 30
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.REF_CLOUD_FORMATION_DEPLOY_ROLE }}
@@ -216,6 +218,7 @@ jobs:
LOG_LEVEL: DEBUG
LOG_RETENTION_DAYS: 30
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.QA_CLOUD_FORMATION_DEPLOY_ROLE }}
@@ -240,6 +243,7 @@ jobs:
CREATE_INT_RELEASE_NOTES: true
CREATE_INT_RC_RELEASE_NOTES: true
TOGGLE_GET_STATUS_UPDATES: true
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.INT_CLOUD_FORMATION_DEPLOY_ROLE }}
@@ -294,6 +298,7 @@ jobs:
CREATE_PROD_RELEASE_NOTES: true
TOGGLE_GET_STATUS_UPDATES: true
RUN_REGRESSION_TESTS: false
ENABLE_ALERTS: true
secrets:
REGRESSION_TESTS_PEM: ${{ secrets.REGRESSION_TESTS_PEM }}
CLOUD_FORMATION_DEPLOY_ROLE: ${{ secrets.PROD_CLOUD_FORMATION_DEPLOY_ROLE }}
4 changes: 4 additions & 0 deletions .github/workflows/sam_release_code.yml
@@ -63,6 +63,9 @@ on:
RUN_REGRESSION_TESTS:
type: boolean
default: true
ENABLE_ALERTS:
type: boolean
default: true
secrets:
CLOUD_FORMATION_DEPLOY_ROLE:
required: true
@@ -142,6 +145,7 @@ jobs:
DOMAIN_NAME_EXPORT: ${{ inputs.DOMAIN_NAME_EXPORT }}
ZONE_ID_EXPORT: ${{ inputs.ZONE_ID_EXPORT }}
TOGGLE_GET_STATUS_UPDATES: ${{ inputs.TOGGLE_GET_STATUS_UPDATES }}
ENABLE_ALERTS: ${{ inputs.ENABLE_ALERTS }}
run: ./release_code.sh

- name: create_int_release_notes
9 changes: 6 additions & 3 deletions Makefile
@@ -34,7 +34,8 @@ sam-sync: guard-AWS_DEFAULT_PROFILE guard-stack_name compile download-get-secret
--parameter-overrides \
EnableSplunk=false\
TargetSpineServer=$$TARGET_SPINE_SERVER \
- TargetServiceSearchServer=$$TARGET_SERVICE_SEARCH_SERVER
+ TargetServiceSearchServer=$$TARGET_SERVICE_SEARCH_SERVER \
+ EnableAlerts=false

sam-sync-sandbox: guard-stack_name compile download-get-secrets-layer
sam sync \
@@ -50,7 +51,8 @@ sam-deploy: guard-AWS_DEFAULT_PROFILE guard-stack_name
--parameter-overrides \
EnableSplunk=false \
TargetSpineServer=$$TARGET_SPINE_SERVER \
- TargetServiceSearchServer=$$TARGET_SERVICE_SEARCH_SERVER
+ TargetServiceSearchServer=$$TARGET_SERVICE_SEARCH_SERVER \
+ EnableAlerts=false

sam-delete: guard-AWS_DEFAULT_PROFILE guard-stack_name
sam delete --stack-name $$stack_name
@@ -105,7 +107,8 @@ sam-deploy-package: guard-artifact_bucket guard-artifact_bucket_prefix guard-sta
Env=$$TARGET_ENVIRONMENT \
DomainNameExport=$$DOMAIN_NAME_EXPORT \
ZoneIDExport=$$ZONE_ID_EXPORT \
- ToggleGetStatusUpdates=$$TOGGLE_GET_STATUS_UPDATES
+ ToggleGetStatusUpdates=$$TOGGLE_GET_STATUS_UPDATES \
+ EnableAlerts=$$ENABLE_ALERTS

compile-node:
npx tsc --build tsconfig.build.json
84 changes: 84 additions & 0 deletions SAMtemplates/alarms/main.yaml
@@ -0,0 +1,84 @@
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: |
  PfP CloudWatch alarms and related resources

Parameters:
  StackName:
    Type: String
    Default: none

  GetMyPrescriptionsFunctionName:
    Type: String
    Default: none

  EnableAlerts:
    Type: String

Resources:
  ServiceSearchErrorsLogsMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      FilterName: GetMyPrescriptionsErrors
      # Match WARN logs whose $.message matches the regex "call to service search unsuccessful"
      FilterPattern: !Sub '{ ($.level = "WARN") && ($.function_name = "${GetMyPrescriptionsFunctionName}") && $.message = %call to service search unsuccessful% }' # function_name included to allow it to be set as a dimension on the metric
      LogGroupName:
        Fn::ImportValue: !Sub ${StackName}:functions:${GetMyPrescriptionsFunctionName}:LambdaLogGroupName
      MetricTransformations:
        - MetricNamespace: LambdaLogFilterMetrics
          MetricName: ErrorCount
          MetricValue: 1
          Unit: Count
          Dimensions: # dimensions for a logs filter metric can only be a field/value from the filter pattern
            - Key: FunctionName
              Value: $.function_name

  ServiceSearchErrorsAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Count of Service Search errors
      AlarmName: !Sub ${StackName}_ServiceSearch_Errors
      Namespace: LambdaLogFilterMetrics
      MetricName: ErrorCount
      Dimensions:
        - Name: FunctionName
          Value: !Ref GetMyPrescriptionsFunctionName
      Period: 60 # seconds
      EvaluationPeriods: 1
      Statistic: Sum
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Threshold: 1
      Unit: Count
      TreatMissingData: notBreaching
      ActionsEnabled: !Ref EnableAlerts
      AlarmActions:
        - !ImportValue lambda-resources:SlackAlertsSnsTopicArn
      InsufficientDataActions:
        - !ImportValue lambda-resources:SlackAlertsSnsTopicArn
      OKActions:
        - !ImportValue lambda-resources:SlackAlertsSnsTopicArn

  ServiceSearchUnhandledErrorsAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Count of Service Search unhandled errors
      AlarmName: !Sub ${StackName}_ServiceSearch_UnhandledErrors
      Namespace: AWS/Lambda # built-in Lambda metrics such as Errors are published under AWS/Lambda
      MetricName: Errors
      Dimensions:
        - Name: FunctionName
          Value: !Ref GetMyPrescriptionsFunctionName
      Period: 60 # seconds
      EvaluationPeriods: 1
      Statistic: Sum
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Threshold: 1
      Unit: Count
      TreatMissingData: notBreaching
      ActionsEnabled: !Ref EnableAlerts
      AlarmActions:
        - !ImportValue lambda-resources:SlackAlertsSnsTopicArn
      InsufficientDataActions:
        - !ImportValue lambda-resources:SlackAlertsSnsTopicArn
      OKActions:
        - !ImportValue lambda-resources:SlackAlertsSnsTopicArn
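
Reviewer note: to make the metric filter in this file easier to reason about, here is a small set of hypothetical log events, written as YAML for readability (the Lambda would presumably emit them as JSON lines). The field names `level`, `function_name`, and `message` come from the FilterPattern above; the function name `example-stack-GetMyPrescriptions` and the unrelated message text are assumptions made for illustration only.

```yaml
# Hypothetical log events, shown as YAML for readability.
# Real events would be JSON lines; every value here is illustrative.

# Expected to be counted: level is WARN, function_name equals the
# GetMyPrescriptionsFunctionName parameter, and message matches the regex.
- level: WARN
  function_name: example-stack-GetMyPrescriptions # assumed function name
  message: call to service search unsuccessful

# Expected to be ignored: same message, but the level is not WARN.
- level: INFO
  function_name: example-stack-GetMyPrescriptions
  message: call to service search unsuccessful

# Expected to be ignored: WARN level, but the message does not match the regex.
- level: WARN
  function_name: example-stack-GetMyPrescriptions
  message: service search returned no results # assumed unrelated message
```

Each counted event adds 1 to `ErrorCount` in the `LambdaLogFilterMetrics` namespace, so with `Statistic: Sum`, `Threshold: 1`, and `Period: 60`, a single matching event within a minute should be enough to move `ServiceSearchErrorsAlarm` into the ALARM state.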
6 changes: 6 additions & 0 deletions SAMtemplates/functions/lambda_resources.yaml
@@ -134,3 +134,9 @@ Outputs:
    Value: !GetAtt ExecuteLambdaManagedPolicy.PolicyArn
    Export:
      Name: !Sub ${StackName}:functions:${LambdaName}:ExecuteLambdaPolicyArn

  LogGroupName:
    Description: Lambda log group name
    Value: !Ref LambdaLogGroup
    Export:
      Name: !Sub ${StackName}:functions:${LambdaName}:LambdaLogGroupName
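
Reviewer note: the new export is what `SAMtemplates/alarms/main.yaml` consumes through `Fn::ImportValue`. The sketch below shows how the two names would line up, assuming a stack called `example-stack` and a `LambdaName` of `example-stack-GetMyPrescriptions`. Both values, and the keys `exported_name` and `imported_name`, are hypothetical and used only to show the resolved strings side by side.

```yaml
# Illustrative resolution only; the stack and function names are assumed.

# Export declared in SAMtemplates/functions/lambda_resources.yaml:
#   !Sub ${StackName}:functions:${LambdaName}:LambdaLogGroupName
exported_name: "example-stack:functions:example-stack-GetMyPrescriptions:LambdaLogGroupName"

# Import requested in SAMtemplates/alarms/main.yaml:
#   !Sub ${StackName}:functions:${GetMyPrescriptionsFunctionName}:LambdaLogGroupName
imported_name: "example-stack:functions:example-stack-GetMyPrescriptions:LambdaLogGroupName"
```

The import resolves only if the `LambdaName` passed to `lambda_resources.yaml` is the same string as `Functions.Outputs.GetMyPrescriptionsFunctionName`, and if both templates receive the same `StackName`. Neither is visible in this diff, so it may be worth confirming before relying on the alarm.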
16 changes: 16 additions & 0 deletions SAMtemplates/main_template.yaml
@@ -75,6 +75,13 @@ Parameters:
    Type: String
    Default: false

  EnableAlerts:
    Type: String
    Default: true
    AllowedValues:
      - true
      - false

Resources:
  Apis:
    Type: AWS::Serverless::Application
@@ -121,3 +128,12 @@ Resources:
        EnrichPrescriptionsFunctionArn: !GetAtt Functions.Outputs.EnrichPrescriptionsFunctionArn
        LogRetentionInDays: !Ref LogRetentionInDays
        EnableSplunk: !Ref EnableSplunk

  Alarms:
    Type: AWS::Serverless::Application
    Properties:
      Location: alarms/main.yaml
      Parameters:
        StackName: !Ref AWS::StackName
        GetMyPrescriptionsFunctionName: !GetAtt Functions.Outputs.GetMyPrescriptionsFunctionName
        EnableAlerts: !Ref EnableAlerts