Java dev #52

Merged
merged 7 commits into from
Oct 14, 2022
Changes from 5 commits
239 changes: 239 additions & 0 deletions examples/existing-cluster-java/README.md
@@ -0,0 +1,239 @@
# Existing Cluster with the AWS Observability Accelerator base module and Java monitoring


This example demonstrates how to use the AWS Observability Accelerator Terraform
modules with Java monitoring enabled.
The current example deploys the [AWS Distro for OpenTelemetry Operator](https://docs.aws.amazon.com/eks/latest/userguide/opentelemetry.html) for Amazon EKS with its requirements and makes use of existing
Amazon Managed Service for Prometheus and Amazon Managed Grafana workspaces.

It is based on the `java module`, one of our [workloads modules](../../modules/workloads/)
to provide an existing EKS cluster with an OpenTelemetry collector,
curated Grafana dashboards, Prometheus alerting and recording rules with multiple
configuration options on the cluster infrastructure.


## Prerequisites

Ensure that you have the following tools installed locally:

1. [aws cli v2](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
2. [kubectl](https://kubernetes.io/docs/tasks/tools/)
3. [terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli)


## Setup

This example uses a local Terraform state. If you need states to be saved remotely,
on Amazon S3 for example, visit the [terraform remote states](https://www.terraform.io/language/state/remote) documentation.

1. Clone the repo using the command below

```
git clone https://github.com/aws-observability/terraform-aws-observability-accelerator.git
```

2. Initialize terraform

```console
cd examples/existing-cluster-java
terraform init
```

3. AWS Region

Specify the AWS Region where the resources will be deployed. Edit the `terraform.tfvars` file and modify `aws_region="..."`. You can also use an environment variable: `export TF_VAR_aws_region=xxx`.

4. Amazon EKS Cluster

To run this example, you need to provide your EKS cluster name.
If you don't have a cluster ready, visit [this example](../eks-cluster-with-vpc)
first to create a new one.

Add your cluster name for `eks_cluster_id="..."` to the `terraform.tfvars` or use an environment variable `export TF_VAR_eks_cluster_id=xxx`.

5. Amazon Managed Service for Prometheus workspace (optional)

If you have an existing workspace, add `managed_prometheus_workspace_id=ws-xxx`
or use an environment variable `export TF_VAR_managed_prometheus_workspace_id=ws-xxx`.

If you don't specify anything, a new workspace will be created for you.

6. Amazon Managed Grafana workspace

If you have an existing workspace, create an environment variable `export TF_VAR_managed_grafana_workspace_id=g-xxx`.

7. <a name="apikey"></a> Grafana API Key

Amazon Managed Grafana provides a control plane API for generating Grafana API keys. We will provide Terraform
with a short-lived API key to run the `apply` or `destroy` command.
Ensure you have the necessary IAM permissions (`CreateWorkspaceApiKey`, `DeleteWorkspaceApiKey`).

```sh
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```
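
Before moving on to the deploy step, it can help to confirm that the variables from the steps above are actually set. The following is a minimal sketch and not part of the repository: `require_tf_vars` is a hypothetical helper, and it only looks at `TF_VAR_`-prefixed environment variables, so it does not see values stored in `terraform.tfvars`.

```sh
# Hypothetical helper (an assumption, not shipped with this example):
# checks that each named Terraform variable is exported as TF_VAR_<name>.
require_tf_vars() {
  missing=0
  for name in "$@"; do
    # Read TF_VAR_<name> indirectly; an unset variable counts as empty.
    eval "value=\${TF_VAR_${name}:-}"
    if [ -z "$value" ]; then
      echo "missing: TF_VAR_${name}" >&2
      missing=1
    fi
  done
  # Returns 0 when everything is set, 1 otherwise.
  return "$missing"
}
```

For instance, `require_tf_vars aws_region eks_cluster_id managed_grafana_workspace_id grafana_api_key` returns a non-zero status and lists the variables that still need a value.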

## Deploy

```sh
terraform apply -var-file=terraform.tfvars
```

or, if you have only set up environment variables, run

```sh
terraform apply
```

## Visualization

1. Prometheus datasource on Grafana

Open your Grafana workspace and under Configuration -> Data sources, you should see `aws-observability-accelerator`. Open it and click `Save & test`. You should see a notification confirming that the Amazon Managed Service for Prometheus workspace is ready to be used on Grafana.

2. Grafana dashboards

Go to the Dashboards panel of your Grafana workspace. You should see a list of dashboards under the `Observability Accelerator Dashboards` folder.

<img width="832" alt="image" src="https://user-images.githubusercontent.com/97046295/194903648-57c55d30-6f90-4b03-9eb6-577aaba7dc22.png">

Open a specific dashboard and you should be able to view its visualization.

<img width="874" alt="image" src="https://user-images.githubusercontent.com/97046295/194922672-d037c0e5-851d-4d8b-bd2e-066cd1e2d118.png">


3. Amazon Managed Service for Prometheus rules and alerts

Open the Amazon Managed Service for Prometheus console and view the details of your workspace. Under the `Rules management` tab, you should find new rules deployed.

<img width="1314" alt="image" src="https://user-images.githubusercontent.com/97046295/194904104-09a28577-d149-478e-b0a1-dc21cb7effc1.png">


To set up your alert receiver with Amazon SNS, follow [this documentation](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-alertmanager-receiver.html).


## Deploy an Example Java Application

In this section we will reuse an example from the AWS OpenTelemetry collector [repository](https://github.com/aws-observability/aws-otel-collector/blob/main/docs/developers/container-insights-eks-jmx.md). For convenience, the steps can be found below.

1. Clone [this repository](https://github.com/aws-observability/aws-otel-test-framework) and navigate to the `sample-apps/jmx/` directory.

2. Authenticate to Amazon ECR

```sh
export AWS_ACCOUNT_ID=`aws sts get-caller-identity --query Account --output text`
export AWS_REGION={region}
aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
```

3. Create an Amazon ECR repository

```sh
aws ecr create-repository --repository-name prometheus-sample-tomcat-jmx \
--image-scanning-configuration scanOnPush=true \
--region $AWS_REGION
```

4. Build Docker image and push to ECR.

```sh
docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/prometheus-sample-tomcat-jmx:latest .
docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/prometheus-sample-tomcat-jmx:latest
```

5. Install sample application

```sh
export SAMPLE_TRAFFIC_NAMESPACE=javajmx-sample
curl https://raw.githubusercontent.com/aws-observability/aws-otel-test-framework/terraform/sample-apps/jmx/examples/prometheus-metrics-sample.yaml > metrics-sample.yaml
sed -i "s/{{aws_account_id}}/$AWS_ACCOUNT_ID/g" metrics-sample.yaml
sed -i "s/{{region}}/$AWS_REGION/g" metrics-sample.yaml
sed -i "s/{{namespace}}/$SAMPLE_TRAFFIC_NAMESPACE/g" metrics-sample.yaml
kubectl apply -f metrics-sample.yaml
```

Verify that the sample application is running:

```sh
kubectl get pods -n $SAMPLE_TRAFFIC_NAMESPACE

NAME READY STATUS RESTARTS AGE
tomcat-bad-traffic-generator 1/1 Running 0 11s
tomcat-example-7958666589-2q755 0/1 ContainerCreating 0 11s
tomcat-traffic-generator 1/1 Running 0 11s
```
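
The `sed -i` calls in step 5 follow GNU sed syntax; BSD sed (the macOS default) expects an explicit backup suffix after `-i`. As a sketch of a more portable variant, the hypothetical helper below (`render_manifest` is an assumption, not part of either repository) writes the substituted manifest to a separate file instead of editing in place, and refuses to run when one of the three variables is empty.

```sh
# Hypothetical helper: fill in the {{aws_account_id}}, {{region}} and
# {{namespace}} placeholders without using `sed -i`.
# $1: template file, $2: output file
render_manifest() {
  # Stop early if any of the values used in step 5 is empty or unset.
  for v in "${AWS_ACCOUNT_ID:-}" "${AWS_REGION:-}" "${SAMPLE_TRAFFIC_NAMESPACE:-}"; do
    if [ -z "$v" ]; then
      echo "AWS_ACCOUNT_ID, AWS_REGION and SAMPLE_TRAFFIC_NAMESPACE must be set" >&2
      return 1
    fi
  done
  # Same substitutions as step 5, applied in one pass to a new file.
  sed -e "s/{{aws_account_id}}/${AWS_ACCOUNT_ID}/g" \
      -e "s/{{region}}/${AWS_REGION}/g" \
      -e "s/{{namespace}}/${SAMPLE_TRAFFIC_NAMESPACE}/g" \
      "$1" > "$2"
}
```

Assuming the downloaded file was saved under a different name such as `metrics-sample.tpl.yaml`, `render_manifest metrics-sample.tpl.yaml metrics-sample.yaml && kubectl apply -f metrics-sample.yaml` would replace the three `sed -i` lines.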

## Advanced configuration

1. Cross-region Amazon Managed Prometheus workspace

If your existing Amazon Managed Prometheus workspace is in another AWS Region,
set `managed_prometheus_workspace_region=xxx` on the base module in `main.tf` and provide `managed_prometheus_workspace_id=ws-xxx`.

2. Cross-region Amazon Managed Grafana workspace

If your existing Amazon Managed Grafana workspace is in another AWS Region,
set the workspace Region on the base module and provide `managed_grafana_workspace_id=g-xxx`.

## Destroy resources

If you leave this stack running, you will incur charges. To remove all resources
created by Terraform, [refresh your Grafana API key](#apikey) and run:

```sh
terraform destroy -var-file=terraform.tfvars
```
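
Because the API key from the setup section is short-lived, it has usually expired by the time you destroy the stack. The sketch below wraps the same `aws grafana create-workspace-api-key` call in a hypothetical function (`refresh_grafana_api_key` is an assumption, not part of the repository) so the refresh can be repeated before any `apply` or `destroy`.

```sh
# Hypothetical helper: request a new short-lived Grafana API key and export it
# for Terraform. $1 is the key lifetime in seconds (1200 matches the setup step).
refresh_grafana_api_key() {
  ttl="${1:-1200}"
  key=$(aws grafana create-workspace-api-key \
    --key-name "observability-accelerator-$(date +%s)" \
    --key-role ADMIN \
    --seconds-to-live "$ttl" \
    --workspace-id "$TF_VAR_managed_grafana_workspace_id" \
    --query key --output text) || return 1
  # Treat an empty response as a failure rather than exporting a blank key.
  [ -n "$key" ] || return 1
  export TF_VAR_grafana_api_key="$key"
}
```

A possible use is `refresh_grafana_api_key && terraform destroy -var-file=terraform.tfvars`, which only runs the destroy when a fresh key was obtained.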


<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.1.0, < 1.3.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 4.0.0 |
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | >= 1.25.0 |
| <a name="requirement_helm"></a> [helm](#requirement\_helm) | >= 2.4.1 |
| <a name="requirement_kubectl"></a> [kubectl](#requirement\_kubectl) | >= 1.14 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 2.10 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 4.0.0 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_eks_observability_accelerator"></a> [eks\_observability\_accelerator](#module\_eks\_observability\_accelerator) | ../../ | n/a |
| <a name="module_workloads_java"></a> [workloads\_java](#module\_workloads\_java) | ../../modules/workloads/java | n/a |

## Resources

| Name | Type |
|------|------|
| [aws_eks_cluster.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster) | data source |
| [aws_eks_cluster_auth.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_aws_region"></a> [aws\_region](#input\_aws\_region) | AWS Region | `string` | n/a | yes |
| <a name="input_eks_cluster_id"></a> [eks\_cluster\_id](#input\_eks\_cluster\_id) | Name of the EKS cluster | `string` | n/a | yes |
| <a name="input_grafana_api_key"></a> [grafana\_api\_key](#input\_grafana\_api\_key) | API key for authorizing the Grafana provider to make changes to Amazon Managed Grafana | `string` | `""` | no |
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | `""` | no |
| <a name="input_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#input\_managed\_prometheus\_workspace\_id) | Amazon Managed Service for Prometheus Workspace ID | `string` | `""` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_aws_region"></a> [aws\_region](#output\_aws\_region) | AWS Region |
| <a name="output_eks_cluster_id"></a> [eks\_cluster\_id](#output\_eks\_cluster\_id) | EKS Cluster Id |
| <a name="output_eks_cluster_version"></a> [eks\_cluster\_version](#output\_eks\_cluster\_version) | EKS Cluster version |
| <a name="output_grafana_dashboard_urls"></a> [grafana\_dashboard\_urls](#output\_grafana\_dashboard\_urls) | URLs for dashboards created |
| <a name="output_managed_prometheus_workspace_endpoint"></a> [managed\_prometheus\_workspace\_endpoint](#output\_managed\_prometheus\_workspace\_endpoint) | Amazon Managed Prometheus workspace endpoint |
| <a name="output_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#output\_managed\_prometheus\_workspace\_id) | Amazon Managed Prometheus workspace ID |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
100 changes: 100 additions & 0 deletions examples/existing-cluster-java/main.tf
@@ -0,0 +1,100 @@
provider "aws" {
  region = local.region
}

data "aws_eks_cluster_auth" "this" {
  name = var.eks_cluster_id
}

data "aws_eks_cluster" "this" {
  name = var.eks_cluster_id
}

provider "kubernetes" {
  host                   = local.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

provider "helm" {
  kubernetes {
    host                   = local.eks_cluster_endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}

locals {
  region               = var.aws_region
  eks_cluster_endpoint = data.aws_eks_cluster.this.endpoint
  create_new_workspace = var.managed_prometheus_workspace_id == ""
  tags = {
    Source = "github.com/aws-observability/terraform-aws-observability-accelerator"
  }
}

# deploys the base module
module "eks_observability_accelerator" {
  # source = "aws-observability/terraform-aws-observability-accelerator"
  source = "../../"

  aws_region     = var.aws_region
  eks_cluster_id = var.eks_cluster_id

  # deploys AWS Distro for OpenTelemetry operator into the cluster
  enable_amazon_eks_adot = true

  # installs cert-manager; set to false to reuse an existing installation, defaults to true
  enable_cert_manager = true

  # creates a new Amazon Managed Prometheus workspace, defaults to true
  enable_managed_prometheus = local.create_new_workspace

  # reusing existing Amazon Managed Prometheus if specified
  managed_prometheus_workspace_id     = var.managed_prometheus_workspace_id
  managed_prometheus_workspace_region = null # defaults to the current region, useful for cross region scenarios (same account)

  # sets up the Amazon Managed Prometheus alert manager at the workspace level
  enable_alertmanager = true

  # reusing existing Amazon Managed Grafana workspace
  enable_managed_grafana       = false
  managed_grafana_workspace_id = var.managed_grafana_workspace_id
  grafana_api_key              = var.grafana_api_key

  tags = local.tags
}

# https://www.terraform.io/language/modules/develop/providers
# A module intended to be called by one or more other modules must not contain
# any provider blocks.
# This allows forcing dependency between base and workloads module
provider "grafana" {
  url  = module.eks_observability_accelerator.managed_grafana_workspace_endpoint
  auth = var.grafana_api_key
}

module "workloads_java" {
  source = "../../modules/workloads/java"

  eks_cluster_id = module.eks_observability_accelerator.eks_cluster_id

  dashboards_folder_id            = module.eks_observability_accelerator.grafana_dashboards_folder_id
  managed_prometheus_workspace_id = module.eks_observability_accelerator.managed_prometheus_workspace_id

  managed_prometheus_workspace_endpoint = module.eks_observability_accelerator.managed_prometheus_workspace_endpoint
  managed_prometheus_workspace_region   = module.eks_observability_accelerator.managed_prometheus_workspace_region

  # optional, defaults to 60s interval and 15s timeout
  prometheus_config = {
    global_scrape_interval = "60s"
    global_scrape_timeout  = "15s"
    scrape_sample_limit    = 2000
  }

  tags = local.tags

  depends_on = [
    module.eks_observability_accelerator
  ]
}
29 changes: 29 additions & 0 deletions examples/existing-cluster-java/outputs.tf
@@ -0,0 +1,29 @@
output "eks_cluster_id" {
  description = "EKS Cluster Id"
  value       = module.eks_observability_accelerator.eks_cluster_id
}

output "aws_region" {
  description = "AWS Region"
  value       = module.eks_observability_accelerator.aws_region
}

output "eks_cluster_version" {
  description = "EKS Cluster version"
  value       = module.eks_observability_accelerator.eks_cluster_version
}

output "managed_prometheus_workspace_endpoint" {
  description = "Amazon Managed Prometheus workspace endpoint"
  value       = module.eks_observability_accelerator.managed_prometheus_workspace_endpoint
}

output "managed_prometheus_workspace_id" {
  description = "Amazon Managed Prometheus workspace ID"
  value       = module.eks_observability_accelerator.managed_prometheus_workspace_id
}

output "grafana_dashboard_urls" {
  description = "URLs for dashboards created"
  value       = module.workloads_java.grafana_dashboard_urls
}
24 changes: 24 additions & 0 deletions examples/existing-cluster-java/variables.tf
@@ -0,0 +1,24 @@
variable "eks_cluster_id" {
  description = "Name of the EKS cluster"
  type        = string
}
variable "aws_region" {
  description = "AWS Region"
  type        = string
}
variable "managed_prometheus_workspace_id" {
  description = "Amazon Managed Service for Prometheus Workspace ID"
  type        = string
  default     = ""
}
variable "managed_grafana_workspace_id" {
  description = "Amazon Managed Grafana Workspace ID"
  type        = string
  default     = ""
}
variable "grafana_api_key" {
  description = "API key for authorizing the Grafana provider to make changes to Amazon Managed Grafana"
  type        = string
  default     = ""
  sensitive   = true
}