ECS Architecture

Overview

This document describes the production architecture for the Smart360 platform running on Amazon ECS. The system consists of nine microservices (auth, ax, bx, cx, mx, onboarding, rx, notification, and wx) deployed in the smart360-prod cluster. Each microservice follows a three-service pattern: a Django web backend, a Celery worker for background tasks, and a Celery Beat scheduler for periodic tasks.

Architecture

ecs-arch-june-2025.drawio.png


ECS Cluster Details

  • Cluster Name: smart360-prod
  • Compute Type: EC2 workers
  • Environment: Production
  • Region: us-east-1

Service Architecture Pattern

Web Service (Django Backend)

  • Purpose: REST API endpoints and web application backend
  • Technology: Django
  • Service Auto-scaling: Enabled, targeting 90% CPU utilization and 90% memory utilization
  • Target Groups: Connected to Application Load Balancer
  • Resource Configuration: Defined in bifrost GitHub repository, prod branch (https://github.com/BynryGit/bifrost/tree/prod/task-definations/)
  • Environment Configuration: Defined in bifrost GitHub repository, prod branch (https://github.com/BynryGit/bifrost/tree/prod/environment-variables/prod)

Celery Service (Background Tasks)

  • Purpose: Executes asynchronous background tasks
  • Technology: Celery
  • Message Broker: ElastiCache Redis (same VPC)
  • Scaling: Service auto-scaling configured
  • Task Source: Tasks queued by web services and scheduled by celery-beat
  • Resource Configuration: Defined in bifrost GitHub repository, prod branch (https://github.com/BynryGit/bifrost/tree/prod/task-definations/)
  • Environment Configuration: Defined in bifrost GitHub repository, prod branch (https://github.com/BynryGit/bifrost/tree/prod/environment-variables/prod)

Celery Beat Service (Task Scheduler)

  • Purpose: Schedules periodic and recurring tasks
  • Technology: Celery Beat
  • Message Broker: ElastiCache Redis (same VPC)
  • Function: Adds scheduled tasks to Redis queues for Celery workers to process
  • Resource Configuration: Defined in bifrost GitHub repository, prod branch (https://github.com/BynryGit/bifrost/tree/prod/task-definations/)
  • Environment Configuration: Defined in bifrost GitHub repository, prod branch (https://github.com/BynryGit/bifrost/tree/prod/environment-variables/prod)
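
As a rough illustration of how the Celery and Celery Beat services described above can share one Redis broker, the following sketch builds a settings dictionary using Celery's `broker_url` and `beat_schedule` setting names. The `REDIS_HOST` variable, the host name, the schedule entry name, and the task path are all hypothetical placeholders; the real values live in the bifrost environment configuration.

```python
import os


def build_celery_config(redis_host: str, redis_port: int = 6379, db: int = 0) -> dict:
    """Return Celery settings shared by the worker and beat services (illustrative)."""
    return {
        # Both the celery worker and celery-beat point at the same Redis broker.
        "broker_url": f"redis://{redis_host}:{redis_port}/{db}",
        # Beat reads this table and enqueues each task on its interval;
        # the workers then pick the tasks up from the Redis queue.
        "beat_schedule": {
            "example-periodic-task": {            # hypothetical entry name
                "task": "example.tasks.cleanup",  # hypothetical task path
                "schedule": 300.0,                # seconds between runs
            },
        },
    }


# REDIS_HOST and the fallback host name are assumptions for this sketch.
config = build_celery_config(os.environ.get("REDIS_HOST", "redis.example.internal"))
print(config["broker_url"])
```

A dictionary like this could be applied to a Celery app with `app.conf.update(config)`; whether the services configure Celery this way is not stated in this document.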


Auto Scaling Group (ASG) Configuration

  • ASG Name: Infra-ECS-Cluster-smart360-prod-ebd9d770-ECSAutoScalingGroup-i82Z93MZY6Xi
  • Capacity Settings:
    • Minimum: 1 instance
    • Maximum: 2 instances
    • Desired: 2 instances
  • Instance Warmup Period: 300 seconds
  • Availability Zones: Multi-AZ deployment across the us-east-1 region
  • Purchase Option: On-Demand instances only

Scaling Policies

  • Scaling Trigger: CPU-based auto-scaling
  • Target Metric: Average CPU utilization at 90%
  • Scaling Behavior: The policy executes as required to keep average CPU utilization at the target
  • Integration: ASG functions as the ECS Capacity Provider for automatic container placement
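
To make the effect of the 90% CPU target and the 1 to 2 instance limits concrete, here is a simplified proportional model of a target-based scaling decision. This is an approximation for intuition only and not the exact algorithm AWS uses (real scale-in, for example, is more conservative); the function name and parameters are made up for this sketch.

```python
import math


def estimate_desired_instances(
    current: int,
    avg_cpu: float,
    target_cpu: float = 90.0,
    minimum: int = 1,
    maximum: int = 2,
) -> int:
    """Estimate the instance count that would bring average CPU back to the target."""
    if current < 1:
        raise ValueError("current must be at least 1")
    # Proportional estimate: total load is current * avg_cpu, spread it so
    # that each instance sits at or below target_cpu.
    raw = math.ceil(current * avg_cpu / target_cpu)
    # The ASG never goes below its minimum or above its maximum capacity.
    return max(minimum, min(maximum, raw))


print(estimate_desired_instances(1, 95.0))   # 2: one instance above target, scale out
print(estimate_desired_instances(2, 40.0))   # 1: load fits on a single instance
print(estimate_desired_instances(2, 100.0))  # 2: would want 3, capped at the maximum
```

The last case shows the practical consequence of a maximum of 2: once both instances are saturated, the ASG cannot add capacity.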

Capacity Provider Strategy

The Auto Scaling Group is configured as an ECS Capacity Provider, enabling:

  • Automatic EC2 instance scaling based on container resource requirements
  • Container placement across the available instances
  • Coordination between ECS service scaling and the underlying EC2 capacity

Service Auto-Scaling Configuration

  • Target Metrics: 90% CPU utilization and 90% memory utilization
  • Applied To: Web service backends
  • Scaling Type: Service-level auto-scaling within ECS
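
The service-level targets above map naturally onto two target tracking policies per service, one for CPU and one for memory. The sketch below only builds the request parameters, using the Application Auto Scaling field names and the predefined ECS metric types; the service name `auth-web` and the policy naming scheme are hypothetical, and the actual policies may have been created differently (for example, through the console).

```python
def target_tracking_policies(cluster: str, service: str, target: float = 90.0) -> list[dict]:
    """Build one target tracking policy definition per metric (CPU and memory)."""
    resource_id = f"service/{cluster}/{service}"
    metrics = {
        "cpu": "ECSServiceAverageCPUUtilization",
        "memory": "ECSServiceAverageMemoryUtilization",
    }
    return [
        {
            "PolicyName": f"{service}-{name}-target-tracking",  # hypothetical naming
            "ServiceNamespace": "ecs",
            "ResourceId": resource_id,
            "ScalableDimension": "ecs:service:DesiredCount",
            "PolicyType": "TargetTrackingScaling",
            "TargetTrackingScalingPolicyConfiguration": {
                "TargetValue": target,
                "PredefinedMetricSpecification": {"PredefinedMetricType": metric},
            },
        }
        for name, metric in metrics.items()
    ]


# "auth-web" is an assumed service name used only for illustration.
for policy in target_tracking_policies("smart360-prod", "auth-web"):
    print(policy["PolicyName"], policy["TargetTrackingScalingPolicyConfiguration"]["TargetValue"])
```

Each dictionary could be passed as keyword arguments to `put_scaling_policy` on a boto3 `application-autoscaling` client, after the service has been registered as a scalable target.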

Network Architecture

VPC Configuration

  • VPC: Default
  • Subnet Type: Public subnets
  • Load Balancer: Application Load Balancer (ALB)
  • Internal Communication: Services communicate within the same VPC

Data Layer

  • Primary Database: Amazon RDS instance (same VPC)
  • Database Proxy: PgBouncer connection pooling (same VPC)
  • Cache/Message Broker: ElastiCache Redis cluster (same VPC)
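
Since the Django backends reach RDS through PgBouncer, a database settings block might look roughly like the sketch below. All environment variable names, default host names, and the database name are hypothetical, and the two pooling-related options assume PgBouncer runs in transaction pooling mode, which this document does not confirm.

```python
import os


def build_databases(env=None) -> dict:
    """Build a Django DATABASES setting that connects through PgBouncer (illustrative)."""
    env = os.environ if env is None else env
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "appdb"),            # hypothetical variable and name
            "USER": env.get("DB_USER", "app"),              # hypothetical
            "PASSWORD": env.get("DB_PASSWORD", ""),         # hypothetical
            # Point Django at PgBouncer, not at the RDS endpoint directly.
            "HOST": env.get("PGBOUNCER_HOST", "pgbouncer.example.internal"),
            "PORT": env.get("PGBOUNCER_PORT", "6432"),      # PgBouncer's default listen port
            # Let PgBouncer own connection reuse instead of Django.
            "CONN_MAX_AGE": 0,
            # Assumes transaction pooling, where server-side cursors are unsafe.
            "DISABLE_SERVER_SIDE_CURSORS": True,
        }
    }


print(build_databases({})["default"]["PORT"])  # 6432
```

In a settings module this would be used as `DATABASES = build_databases()`.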

Deployment Pipeline

CI/CD Process

  • Pipeline Tool: Jenkins
  • Container Registry: Amazon ECR
  • Deployment Strategy: ECS Rolling deployments
  • Source Code: Configuration managed in bifrost GitHub repository (prod branch)
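
During an ECS rolling deployment, the number of tasks that may run at once is bounded by the service's minimum healthy percent and maximum percent. The sketch below works through that arithmetic; the default values of 100 and 200 and the example numbers are assumptions for illustration, since the actual deployment settings are not listed in this document.

```python
import math


def rolling_bounds(
    desired: int, min_healthy_percent: int = 100, max_percent: int = 200
) -> tuple[int, int]:
    """Return (minimum running tasks, maximum running tasks) during a rolling update."""
    # The lower bound is rounded up, the upper bound is rounded down.
    lower = math.ceil(desired * min_healthy_percent / 100)
    upper = math.floor(desired * max_percent / 100)
    return lower, upper


print(rolling_bounds(4, 50, 150))  # (2, 6)
print(rolling_bounds(3, 50, 150))  # (2, 4)
print(rolling_bounds(2))           # (2, 4)
```

With a minimum healthy percent of 100, new tasks must start before old ones stop, so the cluster needs spare CPU and memory for the extra tasks while a deployment is in progress.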

Resource Configuration

  • CPU and memory allocations for each service type are defined explicitly in the task definitions in the prod branch of the bifrost repository

Key Design Principles

Microservice Isolation

Each microservice operates independently with its own set of web, celery, and celery-beat services, enabling independent scaling and deployment.
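
To show the scale this pattern implies, the sketch below enumerates one ECS service per microservice and role. The `<microservice>-<role>` naming scheme is a hypothetical convention for this example; the real service names in the smart360-prod cluster may differ.

```python
MICROSERVICES = ["auth", "ax", "bx", "cx", "mx", "onboarding", "rx", "notification", "wx"]
ROLES = ["web", "celery", "celery-beat"]


def ecs_service_names() -> list[str]:
    """List one service name per (microservice, role) pair, using an assumed naming scheme."""
    return [f"{microservice}-{role}" for microservice in MICROSERVICES for role in ROLES]


names = ecs_service_names()
print(len(names))  # 27
print(names[:3])   # ['auth-web', 'auth-celery', 'auth-celery-beat']
```

Nine microservices with three services each gives 27 independently deployable and scalable ECS services.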

Shared Infrastructure

All services share common infrastructure components (RDS, ElastiCache, VPC) while maintaining logical separation through service boundaries.