Here's the thing about sending emails at scale: your API will always be faster than your email provider. You build a user registration endpoint that sends welcome emails, and suddenly you're making users wait 2-3 seconds just because SendGrid or AWS SES needs time to process. Even worse, if the email service hiccups, your entire registration flow fails.
I ran into this exact problem and realized I needed a better approach. That's where this project came from.
It's a decoupled email system, built with BullMQ and Redis, that separates the concern of "queuing an email" from "actually sending it." The result? Instant API responses, and emails that survive provider outages.
Here's the architecture:
```
User Request → API Endpoint → Queue Job → Return Success
                                  ↓
                      Background Worker → Send Email → Done
```
The API doesn't wait for the email anymore. It queues the job and moves on. Meanwhile, dedicated worker processes handle the actual sending in the background, with built-in retry logic if something goes wrong.
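To make the request path concrete, here's a minimal sketch assuming an Express app. The route, `createUser`, and port are illustrative, not from this repo; the repo's producer is a standalone script.

```js
const express = require('express');
const { Queue } = require('bullmq');

const app = express();
app.use(express.json());

// One queue instance, shared across requests
const emailQueue = new Queue('email', {
  connection: { host: 'localhost', port: 6379 },
});

app.post('/register', async (req, res) => {
  const user = await createUser(req.body); // hypothetical persistence step

  // Enqueue the welcome email and respond immediately --
  // no waiting on SendGrid/SES in the request path
  await emailQueue.add('welcome-email', {
    to: user.email,
    subject: 'Welcome!',
    body: 'Thanks for signing up',
  });

  res.status(201).json({ id: user.id });
});

app.listen(3000);
```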
- Speed: API responses went from 2000ms to under 50ms. Users don't wait for emails to send anymore.
- Reliability: If SendGrid is down, jobs stay in the queue and retry automatically (see the retry sketch after this list). No lost emails, no error screens for users.
- Scalability: Need to handle more emails? Spin up more workers. The queue distributes the load across multiple processes or even multiple servers.
- Visibility: BullMQ tracks every job through its states: pending, processing, completed, failed. You know exactly what's happening at any moment.
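That retry behavior is job options, not custom code. Reusing the `emailQueue` from the sketch above, with BullMQ's documented `attempts` and `backoff` options (the numbers here are illustrative):

```js
// Retry up to 5 times, backing off 1s, 2s, 4s, ... between attempts
await emailQueue.add('welcome-email', {
  to: user.email,
  subject: 'Welcome!',
  body: 'Thanks for signing up',
}, {
  attempts: 5,
  backoff: { type: 'exponential', delay: 1000 },
});
```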
This isn't just about emails. The same pattern works for processing uploaded files, generating reports, sending notifications, or any task that's too slow or unreliable to handle inline.
The system has two main components.

The producer receives requests and adds jobs to the Redis queue:

```js
const { Queue } = require('bullmq');

const emailQueue = new Queue('email', {
  connection: { host: 'localhost', port: 6379 },
});

await emailQueue.add('welcome-email', {
  to: user.email,
  subject: 'Welcome!',
  body: 'Thanks for signing up'
});
```

The worker processes jobs from the queue with automatic retries:

```js
const { Worker } = require('bullmq');

// BullMQ uses a Worker instance (queue.process() is the older Bull API)
const emailWorker = new Worker('email', async (job) => {
  await sendEmail(job.data);
}, { connection: { host: 'localhost', port: 6379 } });
```

Between them sits Redis, acting as the message broker. It stores jobs, tracks their state, and ensures workers don't process the same job twice.
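The scalability claim is mostly configuration too. Each worker process can run jobs in parallel via BullMQ's `concurrency` option (10 here is arbitrary), and you can start as many copies of the process as you need:

```js
const { Worker } = require('bullmq');

// One process, up to 10 jobs in flight at once; run more copies of
// this process (or more machines) to scale out horizontally
const emailWorker = new Worker('email', async (job) => {
  await sendEmail(job.data);
}, {
  connection: { host: 'localhost', port: 6379 },
  concurrency: 10,
});
```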
| Metric | Before Queue | With Queue |
|---|---|---|
| API Response Time | 2000ms | 45ms |
| Failed Email Impact | User sees error | Transparent retry |
| Concurrent Processing | 1 at a time | 10+ workers |
| System Coupling | Tight | Decoupled |
Prerequisites:
- Node.js 16+
- Redis running on `localhost:6379`
Setup:
```bash
git clone https://github.com/dharamdan01/async-job-queue-system.git
cd async-job-queue-system
npm install
```

Start the worker (processes jobs):

```bash
node worker.js
```

Start the producer (creates jobs):

```bash
node producer.js
```

The producer will add sample jobs to the queue, and the worker will process them. Check the console logs to see jobs moving through the system.
- BullMQ: Job queue built on Redis with first-class TypeScript support
- Redis: In-memory data store acting as the message broker
- Node.js: Runtime for both producer and worker processes
- ioredis: Redis client with robust connection handling (wired up in the sketch below)
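For completeness, here's roughly how those pieces wire together. BullMQ can build connections from a plain options object (as in the snippets above) or accept an ioredis instance directly; note that recent BullMQ versions require `maxRetriesPerRequest: null` on connections used by workers:

```js
const IORedis = require('ioredis');
const { Queue } = require('bullmq');

// Shared ioredis connection; maxRetriesPerRequest: null keeps ioredis
// from aborting the blocking commands workers rely on
const connection = new IORedis({
  host: 'localhost',
  port: 6379,
  maxRetriesPerRequest: null,
});

const emailQueue = new Queue('email', { connection });
```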
Building this taught me about:
- Distributed systems and the producer-consumer pattern
- When to decouple services (spoiler: whenever I/O is slow or unreliable)
- Message brokers and why Redis works so well for this
- Job retry strategies and handling failures gracefully
- The difference between vertical scaling (bigger servers) and horizontal scaling (more workers)
Things I'm planning to add:
- Docker Compose setup so you can run Redis + workers with one command
- Dead letter queue for jobs that fail repeatedly
- Rate limiting to avoid overwhelming email providers
- Web dashboard using Bull Board to visualize the queue in real-time
- Job prioritization so critical emails (password resets) jump ahead of marketing emails (sketched below)
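That last item barely needs building: BullMQ already supports a `priority` job option, where a lower number means the job is picked up sooner. A rough sketch (job names and payloads are illustrative):

```js
// Lower priority number = picked up sooner
async function enqueueEmails(queue, user) {
  // Password resets jump the line...
  await queue.add('password-reset', { to: user.email }, { priority: 1 });
  // ...while marketing mail waits its turn
  await queue.add('marketing-blast', { to: user.email }, { priority: 10 });
}
```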
This isn't a toy project. The patterns here are used by companies processing millions of jobs per day:
- Stripe uses queues for webhooks and async payment processing
- Shopify queues background jobs for inventory updates
- Airbnb processes booking confirmations asynchronously
Understanding job queues means understanding how modern backends stay fast and reliable under load.
Dharam Dan
GitHub: @dharamdan01
LinkedIn: Dharam Dan
If this helped you understand distributed systems better, I'd appreciate a ⭐