Commit

Adding in the call to the EMR client
alexanderdean committed Jun 10, 2012
1 parent 3d16d6f commit a402cb0
Showing 3 changed files with 36 additions and 8 deletions.
10 changes: 2 additions & 8 deletions .gitignore
@@ -1,8 +1,2 @@
-.idea/
-.manager/
-workspace.xml
-*.iml
-*.sublime-*
-.DS_Store
-*~
-*.swp
+# Only project- and language-specific ignores in here. Use global .gitignore for editors etc
+.*.yml
1 change: 1 addition & 0 deletions hive/etl/config.yml
@@ -1,6 +1,7 @@
aws:
  access_key_id: ADD HERE
  secret_access_key: ADD HERE
  emr_client_path: ADD HERE
buckets:
  jar: ADD HERE
  in: ADD HERE
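
Since the new `emr_client_path` key has to be filled in before the script can load the EMR client, a small check for unfilled placeholders may be useful. The following is only a sketch: `missing_aws_keys` and `REQUIRED_AWS_KEYS` are hypothetical names that do not appear in this commit, the inline YAML stands in for `hive/etl/config.yml`, and the nesting of keys under `aws:` is assumed from how `daily-etl.rb` reads them.

```ruby
require 'yaml'

# Hypothetical helper, not part of the commit: lists the "aws" keys that are
# absent, blank, or still set to the "ADD HERE" placeholder.
REQUIRED_AWS_KEYS = %w[access_key_id secret_access_key emr_client_path].freeze

def missing_aws_keys(config)
  aws = config.fetch('aws', {}) || {}
  REQUIRED_AWS_KEYS.select do |key|
    value = aws[key]
    value.nil? || value.to_s.strip.empty? || value == 'ADD HERE'
  end
end

# Inline YAML standing in for hive/etl/config.yml (values are made up).
sample = YAML.load(<<~YAML)
  aws:
    access_key_id: EXAMPLEKEY
    secret_access_key: ADD HERE
    emr_client_path: /opt/elastic-mapreduce-ruby
  buckets:
    jar: ADD HERE
    in: ADD HERE
YAML

puts missing_aws_keys(sample).inspect
```

With the sample above, only `secret_access_key` is reported, because it still holds the placeholder value.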
33 changes: 33 additions & 0 deletions hive/etl/daily-etl.rb
@@ -66,6 +66,39 @@
# Determine yesterday's date
yesterday = (Date.today - 1).strftime('%Y-%m-%d')

# Now load the Ruby EMR Client
$LOAD_PATH << config["aws"]["emr_client_path"]
require 'amazon/coral/elasticmapreduceclient'
require 'amazon/retry_delegator'

aws_config = {
  :endpoint => "https://elasticmapreduce.amazonaws.com",
  :ca_file => File.join(config["aws"]["emr_client_path"], "cacert.pem"),
  # Key names must match those defined in config.yml
  :aws_access_key => config["aws"]["access_key_id"],
  :aws_secret_key => config["aws"]["secret_access_key"],
  :signature_algorithm => :V2
}
client = Amazon::Coral::ElasticMapReduceClient.new_aws_query(aws_config)

# Use the retry delegator to make your client retry if it gets connection failures.
is_retryable_error_response = Proc.new do |response|
  if response == nil then
    false
  else
    ret = false
    if response['Error'] then
      # don't retry on 'Timeout' because the call might have succeeded
      ret ||= ['InternalFailure', 'Throttling', 'ServiceUnavailable'].include?(response['Error']['Code'])
    end
    ret
  end
end

client = Amazon::RetryDelegator.new(client, :retry_if => is_retryable_error_response)

# Debug TODO: remove
puts client.DescribeJobFlows.inspect

# Runs a daily ETL job for the specific day.
# Uses the Elastic MapReduce Command Line Tool.
# Parameters:
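
To see what the retry predicate added in this commit accepts and rejects, it can be exercised on its own without the EMR client. The sketch below restates the same logic in a standalone form. The response hashes are hypothetical shapes chosen for illustration (assumed to mirror the `response['Error']['Code']` structure the predicate inspects), and `RETRYABLE_CODES` is a name introduced here, not one from the commit.

```ruby
# Error codes the commit treats as safe to retry. 'Timeout' is deliberately
# excluded because the original call might have succeeded.
RETRYABLE_CODES = ['InternalFailure', 'Throttling', 'ServiceUnavailable'].freeze

# Same decision as the Proc in daily-etl.rb, written more compactly.
is_retryable_error_response = Proc.new do |response|
  if response.nil? || !response['Error']
    false
  else
    RETRYABLE_CODES.include?(response['Error']['Code'])
  end
end

# Hypothetical responses, for illustration only.
throttled = { 'Error' => { 'Code' => 'Throttling' } }
timed_out = { 'Error' => { 'Code' => 'Timeout' } }
succeeded = { 'JobFlows' => [] }

puts is_retryable_error_response.call(throttled)  # true
puts is_retryable_error_response.call(timed_out)  # false
puts is_retryable_error_response.call(succeeded)  # false
puts is_retryable_error_response.call(nil)        # false
```

Only the three listed error codes trigger a retry. A missing response, a successful response, and a `Timeout` error are all left alone.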