Active Elastic Job


You have your Rails application deployed on the Amazon Elastic Beanstalk platform, and now your application needs to offload work, like sending emails, into asynchronous background jobs. Or you want to perform jobs periodically, similar to cron jobs. Then Active Elastic Job is the right gem. It provides an adapter for Rails' Active Job framework that allows your application to queue jobs as messages in an Amazon SQS queue. Elastic Beanstalk provides worker environments that automatically pull messages from the queue and transform them into HTTP requests. This gem knows how to handle these requests. It comes with a Rack middleware that intercepts these requests and transforms them back into jobs, which are subsequently executed.

Architecture Diagram

Why use this gem?

  • It is easy to set up.
  • It makes your application ready for worker environments that are highly integrated in the Elastic Beanstalk landscape.
  • It is based on Amazon SQS, a fast, fully managed, scalable, and reliable queue service. You do not need to operate and maintain your own messaging cluster.
  • It is easy to deploy. You simply push your application code to a worker environment, the same way that you push your application code to your web environment.
  • It scales. The worker environments come with auto-scale capability. Additional worker instances will spawn automatically and process jobs from the queue if the load increases above a preconfigured threshold.

Usage

  1. Add this line to your application's Gemfile:

     gem 'active_elastic_job'
    
  2. Create an SQS queue:

  • Log into your Amazon Web Service Console and select SQS from the services menu.

  • Create a new queue. Choose any name, but remember to use the same name in your Active Job class definition:

    class YourJob < ActiveJob::Base
      queue_as :name_of_your_queue
    end

    Also use that same name in your Action Mailer configuration (if you send emails in background jobs):

    # config/application.rb
    module YourApp
      class Application < Rails::Application
        config.action_mailer.deliver_later_queue_name = :name_of_your_queue
      end
    end
  • Choose a visibility timeout that exceeds the maximum amount of time a single job will take.

  3. Give your EC2 instances permission to send messages to SQS queues:
  • Stay logged in and select the IAM service from the services menu.
  • Select the Roles submenu.
  • Find the role that you selected as the instance profile when creating the Elastic Beanstalk web environment.
  • Attach the AmazonSQSFullAccess policy to this role.
  • Make yourself familiar with AWS Service Roles, Instance Profiles, and User Policies.
  4. Tell the gem the region of the SQS queue that you created in Step 2:
  • Select the web environment that is currently hosting your application and open the Software Configuration settings.
  • Add the environment variable AWS_REGION and set it to the region of the SQS queue created in Step 2.
  5. Create a worker environment:
  • Stay logged in and select the Elastic Beanstalk option from the services menu.
  • Select your application, click the Actions button and select Launch New Environment.
  • Click the create worker button and select the same platform that you chose for your web environment.
  • In the Worker Details form, select the queue that you created in Step 2 as the worker queue, and leave the MIME type as application/json. The visibility timeout setting should exceed the maximum time that you expect a single background job to take. The HTTP path setting can be left as it is (it will be ignored).
  6. Configure the worker environment for processing jobs:
  • Select the worker environment that you have just created and open the Software Configuration settings.
  • Add the environment variable PROCESS_ACTIVE_ELASTIC_JOBS and set it to true.
  7. Configure Active Elastic Job as the queue adapter:

    # config/application.rb
    module YourApp
      class Application < Rails::Application
        config.active_job.queue_adapter = :active_elastic_job
      end
    end
  8. Verify that both environments (web and worker) have the same secret key base:

  • In the Software Configuration settings of the web environment, copy the value of the SECRET_KEY_BASE variable.
  • Open the Software Configuration settings of the worker environment and add the SECRET_KEY_BASE variable. Paste the value from the web environment, so that both environments have the same secret key base.
  9. Deploy the application to both environments (web and worker).

Set up periodic tasks (cron jobs)

Elastic Beanstalk worker environments support the execution of periodic tasks similar to cron jobs. We recommend that you familiarize yourself with Elastic Beanstalk's official documentation first.

You don't need this gem to use Elastic Beanstalk's periodic tasks feature. However, this gem takes care of intercepting the POST requests from the SQS daemon (explained in the official documentation). If the gem detects a POST request from the daemon caused by a periodic task definition, it creates a corresponding Active Job instance and triggers its execution. To make use of the gem, follow these conventions when writing your periodic task definitions in cron.yaml:

  • Set name to the class name of the (ActiveJob) job that should be performed.
  • Set url to /periodic_tasks.

This is an example of a cron.yaml file which sets up a periodic task that is executed at 11pm UTC every day. The url setting leads to requests which will be intercepted by the gem. It then looks at the name setting, passed as a request header value by the SQS daemon, and instantiates a PeriodicTaskJob job object. Subsequently it triggers its execution by calling the #perform_now method.

version: 1
cron:
 - name: "PeriodicTaskJob"
   url: "/periodic_tasks"
   schedule: "0 23 * * *"

FIFO Queues

FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated. FIFO queues also provide exactly-once processing but have a limited number of transactions per second (TPS).

The message group id will be set to the job type, and the message deduplication id will be set to the job id.
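
As a rough illustration of that mapping, the following sketch builds the parameters for an SQS SendMessage call. The helper `fifo_message_params` and the example queue URL are hypothetical; only the parameter names (`message_group_id`, `message_deduplication_id`) follow the SQS API, and the way the gem serializes the job body is not shown here.

```ruby
require 'json'

# Hypothetical helper: build SendMessage parameters for a FIFO queue.
# The job type becomes the message group id and the job id becomes the
# message deduplication id, as described above.
def fifo_message_params(queue_url, job_class_name, job_id, payload)
  {
    queue_url: queue_url,
    message_body: JSON.generate(payload),
    message_group_id: job_class_name,
    message_deduplication_id: job_id
  }
end

params = fifo_message_params(
  'https://sqs.example.invalid/123456789012/jobs.fifo', # placeholder URL
  'YourJob',
  'a1b2c3',
  { 'job_class' => 'YourJob', 'job_id' => 'a1b2c3', 'arguments' => [] }
)
puts params[:message_group_id]
```

One consequence of this mapping is that jobs of the same class share a message group, so SQS delivers them in order relative to each other.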

Note: Periodic tasks don't work for worker environments that are configured with Amazon SQS FIFO queues.

Optional configuration

This gem is configurable in case your setup requires different settings than the defaults. The snippet below shows the various configurable settings and their defaults.

Rails.application.configure do
  config.active_elastic_job.process_jobs = ENV['PROCESS_ACTIVE_ELASTIC_JOBS'] == 'true'
  config.active_elastic_job.aws_credentials = lambda { Aws::InstanceProfileCredentials.new } # allows lambdas for lazy loading
  config.active_elastic_job.aws_region # no default
  config.active_elastic_job.secret_key_base = Rails.application.secrets[:secret_key_base]
  config.active_elastic_job.periodic_tasks_route = '/periodic_tasks'.freeze
end

If you don't want to provide AWS credentials by using EC2 instance profiles, but via environment variables, you can do so:

Rails.application.configure do
  config.active_elastic_job.aws_credentials = Aws::Credentials.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
end

Suggested Elastic Beanstalk configuration

Extended Nginx read timeout

By default, Nginx has a read timeout of 60 seconds. If a job takes more than 60 seconds to complete, Nginx will close the connection, making AWS SQS think the job failed. However, the job will continue running until it completes (or errors out), and SQS will re-queue the job to be processed again, which is typically not desirable.

The most basic way to make this change is to add the following to a file within nginx/conf.d:

fastcgi_read_timeout 1800; # 30 minutes
proxy_read_timeout 1800; # 30 minutes

However, one of the best parts about active-elastic-job is that you can use the same code base for your web environment and your worker environment. You probably don't want your web environment to have a read_timeout longer than 60 seconds. So here's an Elastic Beanstalk configuration file to only add this to your worker environments.

Amazon Linux 2

Create two files with the same content, one for application deployments and one for configuration deployments:

  • .platform/hooks/predeploy/nginx_read_timeout.sh
  • .platform/confighooks/predeploy/nginx_read_timeout.sh

#!/usr/bin/env bash
set -xe

if [ "$PROCESS_ACTIVE_ELASTIC_JOBS" ]
then
  cat >/var/proxy/staging/nginx/conf.d/read_timeout.conf <<EOL
fastcgi_read_timeout 1800;
proxy_read_timeout 1800;
EOL
fi

Pre-Amazon Linux 2

Coming soon

Experimental

Multiple Queues with Single Worker

The default aws-sqsd daemon supports only one queue at a time, which is determined by the Elastic Beanstalk configuration. However, as of 3.1.0, we've introduced an experimental feature for also handling requests made by other sqsd daemons. One option is sqsd (https://github.com/mogadanez/sqsd), but any daemon that makes localhost requests with the user-agent sqsd should work.

Potential Setup

In .platform/hooks/postdeploy put a shell script like this:

#!/usr/bin/env bash
set -xe

if [ "$PROCESS_ACTIVE_ELASTIC_JOBS" ]
then
  npm install -g sqsd
  nohup "$(npm bin -g)/sqsd" --queue-url "$SQS_URL" --web-hook '/' --worker-health-url '/health' --ssl-enabled false --daemonized false >> /var/log/sqsd.log 2>&1 &
fi
  • worker-health-url is optional, but recommended
  • ssl-enabled is set to false because the default Elastic Beanstalk setup terminates SSL at the load balancer and not at the application
  • daemonized is set to false otherwise sqsd would stop once the queue was empty (this seems backwards from the sqsd README, but it works this way)
  • user-agent is technically optional as the default value is sqsd, but there's potential to expand features based on this field
  • Everything starting with the >> is optional unless you want output from the daemon logged

Potential Problems

aws-sqsd cannot coordinate resources with sqsd; therefore, it can't properly "load balance" tasks as it normally would. If this becomes a problem, you can lower the number of concurrent workers and max messages to help keep resources in check.

FAQ

A summary of frequently asked questions:

What are the advantages in comparison to popular alternatives like Resque, Sidekiq or DelayedJob?

You decided to use Elastic Beanstalk because it facilitates deploying and operating your application. Active Elastic Job embraces this approach and keeps deployment and maintenance simple. To use Resque, Sidekiq or DelayedJob as a queuing backend, you would need to set up at least one extra EC2 instance that runs your queue application. This complicates deployment. Furthermore, you would need to monitor your queue and make sure that it is in a healthy state.

Can I run Resque or DelayedJob in my web environment which already exists?

It is possible but not recommended. Your jobs will be executed on the same instance that is hosting your web server, which handles your users' HTTP requests. Therefore, the web server and the worker processes will fight for the same resources. This leads to slower responses of your application. But a fast response time is actually one of the main reasons to offload tasks into background jobs.

Is there a possibility to prioritize certain jobs?

Amazon SQS does not support prioritization. In order to achieve faster processing of your jobs you can add more instances to the worker environment or create a separate queue with its own worker environment for your high-priority jobs.

Can jobs be delayed?

You can schedule jobs not more than 15 minutes into the future. See the Amazon SQS API reference. If you need to postpone the execution of a job further into the future, then consider the possibility of setting up a periodic task.
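
The 15-minute limit corresponds to the SQS maximum of 900 for the DelaySeconds parameter. A minimal sketch of the arithmetic, using a hypothetical helper `sqs_delay_seconds` (the gem's own handling of this limit may differ):

```ruby
# SQS accepts a DelaySeconds value between 0 and 900 (15 minutes).
MAX_SQS_DELAY_SECONDS = 900

# Hypothetical helper: convert a scheduled time into an SQS delay in seconds.
# Raises ArgumentError when the job is scheduled too far into the future.
def sqs_delay_seconds(run_at, now = Time.now)
  delay = (run_at - now).ceil
  delay = 0 if delay.negative?
  if delay > MAX_SQS_DELAY_SECONDS
    raise ArgumentError,
          "cannot delay a job by #{delay}s; SQS allows at most #{MAX_SQS_DELAY_SECONDS}s"
  end
  delay
end

now = Time.now
puts sqs_delay_seconds(now + 600, now) # ten minutes is within the limit
```

With Active Job, a call such as `YourJob.set(wait: 10.minutes).perform_later` falls inside this window, while `wait: 1.hour` does not.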

Can I monitor and inspect failed jobs?

Amazon SQS provides dead-letter queues. These queues can be used to isolate and sideline unsuccessful jobs.

Is my internet-facing web environment protected against being spoofed into processing jobs?

The Rails application will treat requests presenting a user agent value aws-sqsd/* as requests from the SQS daemon; therefore, it tries to un-marshal the request body back into a job object for further execution. This adds a potential attack vector, since anyone can fabricate a request with this user agent and might try to spoof the application into processing jobs or even malicious code. This gem takes several countermeasures to block this attack vector.

  • The middleware that processes the requests from the SQS daemon is disabled per default. It has to be enabled deliberately by setting the environment variable PROCESS_ACTIVE_ELASTIC_JOBS to true, as instructed in the Usage section.
  • Messages that represent the jobs are signed before they are enqueued. The signature is verified before the job is executed. This is why both environments (web and worker) need to have the same value for the environment variable SECRET_KEY_BASE (see Step 8 in the Usage section): the secret key base is used to generate and verify the signature.
  • Only requests that originate from the same host (localhost) are considered to be requests from the SQS daemon. SQS daemons are installed in all instances running in a worker environment and will only send requests to the application running in the same instance. Because of these safety measures it is possible to deploy the same codebase to both environments, which keeps the deployment simple and reduces complexity.
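
The checks listed above can be sketched as a single decision function. This is an illustration under stated assumptions, not the gem's implementation: HMAC-SHA256 from Ruby's OpenSSL library stands in for whatever signing scheme the gem actually uses, and `sign_message`, `constant_time_equal?`, and `job_request_status` are hypothetical names.

```ruby
require 'openssl'

# Assumption: HMAC-SHA256 stands in for the gem's actual signing scheme.
def sign_message(secret_key_base, body)
  OpenSSL::HMAC.hexdigest('SHA256', secret_key_base, body)
end

# Compare two strings without returning early on the first differing byte.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

LOCALHOST_ADDRESSES = ['127.0.0.1', '::1'].freeze

# Hypothetical decision function. Returns nil when the request does not
# claim to come from the SQS daemon (it is passed on to the application),
# 403 when a check fails, and 200 when the job may be executed.
def job_request_status(user_agent:, remote_addr:, body:, digest:, secret_key_base:)
  return nil unless user_agent.to_s.start_with?('aws-sqsd')
  return 403 unless LOCALHOST_ADDRESSES.include?(remote_addr)
  return 403 unless constant_time_equal?(sign_message(secret_key_base, body), digest.to_s)

  200
end

secret = 'example-secret-key-base' # placeholder value
body = '{"job_class":"YourJob"}'
puts job_request_status(
  user_agent: 'aws-sqsd/2.0', remote_addr: '127.0.0.1',
  body: body, digest: sign_message(secret, body), secret_key_base: secret
)
```

A request that carries the right user agent but was signed with a different secret, or that arrives from another host, ends up with status 403, which matches the troubleshooting hints further down.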

Can jobs get lost?

Active Elastic Job will raise an error if a job has not been sent successfully to the SQS queue. It expects the queue to return an MD5 digest of the message contents, which it verifies for correctness. Amazon advertises SQS as reliable, and messages are stored redundantly. If a job is not executed successfully, the corresponding message becomes visible in the queue again. Depending on the queue's settings, the worker environment will pull the message again and make another attempt to execute the job.
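
The digest check can be illustrated with Ruby's standard library. This is a sketch only: `MD5MismatchError` and `verify_md5!` are hypothetical names, and the value passed as `md5_from_sqs` stands for the digest that SQS returns for the message body after a successful send.

```ruby
require 'digest'

# Hypothetical error class for a digest mismatch.
class MD5MismatchError < StandardError; end

# Hypothetical helper: compare the locally computed MD5 digest of the
# message body with the digest reported back by SQS.
def verify_md5!(message_body, md5_from_sqs)
  expected = Digest::MD5.hexdigest(message_body)
  return true if expected == md5_from_sqs

  raise MD5MismatchError,
        "MD5 digest returned by SQS (#{md5_from_sqs}) does not match #{expected}"
end

body = '{"job_class":"YourJob"}'
# Simulate a correct response by computing the digest locally.
puts verify_md5!(body, Digest::MD5.hexdigest(body))
```

Raising at enqueue time means a corrupted or rejected message surfaces as an error in the web environment instead of silently disappearing.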

What can be the reason if jobs are not executed?

Inspect the log files of your worker tier environment. They should contain entries for the requests that are performed by the AWS SQS daemon. Look out for POST requests from user agents starting with aws-sqsd/. If the logs do not contain any, then make sure that there are messages enqueued in the SQS queue which is attached to your worker tier. You can do this from your AWS console.

When you have found the requests, check their response codes which give a clue on why a job is not executed:

  • status code 500: something went wrong. The job might have raised an error.
  • status code 403: the request seems to originate from a host other than localhost, or the message which represents the job has not been verified successfully. Make sure that both environments, web and worker, use the same SECRET_KEY_BASE.
  • status code 404 or 301: the gem is not included in the bundle, or PROCESS_ACTIVE_ELASTIC_JOBS is not set to true in the worker environment (see step 6), or the worker environment uses an outdated platform with version 1 of the AWS SQS daemon. Check the user agent again: if it looks like aws-sqsd/1.*, then the environment uses the old version. This gem works only with daemon version 2 or newer.

Bugs - Questions - Improvements

Whether you catch a bug, have a question or a suggestion for improvement, I sincerely appreciate any feedback. Please feel free to create an issue and I will follow up as soon as possible.

Contribute

Running the complete test suite requires launching Elastic Beanstalk environments. Travis builds triggered by a pull request will launch the needed Elastic Beanstalk environments and subsequently run the complete test suite. You can run all specs that do not depend on running Elastic Beanstalk environments by setting an environment variable:

EXCEPT_DEPLOYED=true bundle exec rspec spec

Feel free to issue a pull request if this subset of specs passes.

Development environment with Docker

We recommend running the test suite in a controlled and predictable environment. If your development machine has Docker installed, you can use the Dockerfile that comes with this package. Build an image and run the tests in a container of that image.

docker build -t active-elastic-job-dev .
docker run -e EXCEPT_DEPLOYED=true -v "$(pwd)":/usr/src/app active-elastic-job-dev bundle exec rspec spec