 * Ability to run Terraform with your AWS Account. You must use Terraform 0.11 or higher.
 * A subnet within a VPC for the EMR cluster to run in.
-*[S3 Bucket](https://github.com/terraform-aws-modules/terraform-aws-s3-bucket) to send data from Segment to and to store logs.
+* An [S3 Bucket](https://github.com/terraform-aws-modules/terraform-aws-s3-bucket) for Segment to load data into. You can create a new one just for this, or re-use an existing one you already have.

 ## VPC

@@ -50,6 +50,33 @@ The repository is split into multiple modules, and each can be used independentl

 # Usage

+## Terraform Installation
+*Note* - Skip this section if you already have a working Terraform setup.
+### OSX:
+`brew` on OSX should install the latest version of Terraform.
+```
+brew install terraform
+```
+
+### CentOS/Ubuntu:
+* Follow the instructions [here](https://phoenixnap.com/kb/how-to-install-terraform-centos-ubuntu) to install Terraform on CentOS or Ubuntu.
+* Ensure that the installed version is 0.11.x or higher.
+
+Verify that the installation works by running:
+```
+terraform help
+```
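The minimum-version requirement stated in the prerequisites can also be enforced in the configuration itself, so that Terraform refuses to run on an older release. A minimal sketch, assuming the constraint is placed in the root `main.tf`; this block is not part of the change under review, and the `>= 0.11.0` bound simply mirrors the stated minimum:

```hcl
# Illustrative only; not part of this diff.
terraform {
  # Mirrors the README's stated minimum of Terraform 0.11.
  required_version = ">= 0.11.0"
}
```

Running `terraform version` prints the installed release, which can be compared against this constraint.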
+
+## Set up Project
+* Create project directory
+```
+mkdir segment-datalakes-tf
+```
+* Create `main.tf` file
+* Update the `segment_sources` variable in the `locals` block to the sources you want to sync
+* Update the `name` in the `aws_s3_bucket` resource to the desired name of your S3 bucket
+* Update the `subnet_id` in the `emr` module to the subnet in which to create the EMR cluster
+
 ```hcl
 provider "aws" {
   region = "us-west-2" # Replace this with the AWS region your infrastructure is set up in.
modules/emr/README.md: 56 additions & 0 deletions
@@ -48,6 +48,62 @@ Type: `string`

 Default: `""`

+### master\_instance\_type
+
+Description: EC2 Instance Type for Master
+
+Type: `string`
+
+Default: `"m5.xlarge"`
+
+### core\_instance\_type
+
+Description: EC2 Instance Type for Core Nodes
+
+Type: `string`
+
+Default: `"m5.xlarge"`
+
+### task\_instance\_type
+
+Description: EC2 Instance Type for Task Nodes
+
+Type: `string`
+
+Default: `"m5.xlarge"`
+
+### core\_instance\_count
+
+Description: Number of instances of Core Nodes
+
+Type: `string`
+
+Default: `"2"`
+
+### core\_instance\_max\_count
+
+Description: Max number of Core Nodes used on autoscale
+
+Type: `string`
+
+Default: `"4"`
+
+### task\_instance\_count
+
+Description: Number of instances of Task Nodes
+
+Type: `string`
+
+Default: `"2"`
+
+### task\_instance\_max\_count
+
+Description: Max number of Task Nodes used on autoscale
+
+Type: `string`
+
+Default: `"4"`
+
 ### tags

 Description: A map of tags to add to all resources. A vendor=segment tag will be added automatically (which is also used by the IAM policy to provide Segment access to submit jobs).
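
As a rough illustration of how the inputs documented above might be set, the sketch below overrides the EMR sizing from a root module. The `source` path, the `subnet_id` value, and the example tag are placeholders assumed for illustration, not values taken from this repository. The variable names and defaults are the ones listed in this README, and the module may require other inputs that are not shown in this excerpt.

```hcl
# Illustrative sketch only; source path, subnet ID, and tag values are assumptions.
module "emr" {
  source = "./modules/emr" # hypothetical path to the EMR module

  subnet_id = "subnet-XXXXXXXX" # placeholder: subnet in which to create the cluster

  # Instance types (each defaults to "m5.xlarge").
  master_instance_type = "m5.xlarge"
  core_instance_type   = "m5.xlarge"
  task_instance_type   = "m5.xlarge"

  # Counts are documented with type `string`, so they are quoted here.
  core_instance_count     = "2"
  core_instance_max_count = "4" # upper bound for core nodes on autoscale
  task_instance_count     = "2"
  task_instance_max_count = "4" # upper bound for task nodes on autoscale

  # Extra tags; per the README, a vendor=segment tag is added automatically.
  tags = {
    team = "data-platform" # hypothetical tag
  }
}
```

Keeping each `*_max_count` at or above its matching `*_instance_count` leaves the autoscaling range non-empty.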