Deploying on AWS (ECS Fargate)

Introduction

This guide covers deploying the Composable Agentic Platform (CAP) on Amazon Web Services using a fully managed container infrastructure. The CAP Agent runtime is packaged as a Docker container and deployed to AWS ECS Fargate, giving you a scalable, serverless hosting environment without managing EC2 instances.

Deployment is self-service via a provided AWS CloudFormation template that provisions the complete infrastructure stack in a single operation.

AWS Marketplace customers: If you have not yet subscribed, visit the TomorrowX listing on AWS Marketplace before proceeding. A valid subscription is required for the container image to start correctly.

Architecture Overview

The CloudFormation template provisions a production-ready full-stack environment in your AWS account:

┌─────────────────────────────────────────────────────────────┐
│ Your AWS Account                                            │
│                                                             │
│  Internet → Application Load Balancer (ALB)                 │
│                 │                                           │
│                 └── Port 80/443  → CAP Agent (web/solution) │
│                                                             │
│  ECS Fargate (CAP Agent container, private IP per task)    │
│       │             ↑                                       │
│       │      CAP Console → Task Private IP:20001 (direct)   │
│       │                                                     │
│       ├── RDS PostgreSQL (application data)                 │
│       ├── EFS (shared file storage / uploads)               │
│       └── CloudWatch Logs (agent logs)                      │
└─────────────────────────────────────────────────────────────┘

Infrastructure provisioned by the CloudFormation template:

| Component | Service | Purpose |
| --- | --- | --- |
| Container runtime | ECS Fargate | Runs the CAP Agent with no EC2 instances to manage |
| Load balancer | Application Load Balancer | Routes web traffic (port 80/443) to the agent solution endpoint |
| Database | RDS PostgreSQL | Persistent application data storage |
| Shared file storage | EFS | File uploads, shared across container restarts |
| Secrets | Secrets Manager | Database credentials (never in plaintext) |
| Logs | CloudWatch | Agent log streaming and retention |
| Container image | AWS Marketplace ECR | TomorrowX-managed, subscription-gated |

Prerequisites

Before deploying, you'll need:

  • An active AWS Marketplace subscription to a TomorrowX CAP platform tier. Verify at AWS Marketplace → Manage subscriptions.

  • An AWS account with permissions to create: CloudFormation stacks, ECS clusters, RDS instances, EFS file systems, ALBs, IAM roles, VPCs, and Secrets Manager entries.

  • AWS CLI installed and configured, or access to the AWS Management Console.

  • A nominated AWS region — the stack can be deployed to any region that supports ECS Fargate, RDS, and EFS.

Deployment

Step 1: Obtain the CloudFormation Template

The CloudFormation template (cap-fullstack.yaml) is provided by your TomorrowX delivery partner, or is available from the TomorrowX support portal at tomorrowx.dev.

Step 2: Deploy the Stack

Via AWS Console:

  1. Open CloudFormation → Stacks → Create stack → With new resources

  2. Upload cap-fullstack.yaml

  3. Enter a Stack name (e.g. cap-production). The ECS service will be named <stack-name>-svc.

  4. Complete the parameter fields:

| Parameter | Description | Example |
| --- | --- | --- |
| VpcId | Your VPC ID | vpc-xxxxxxxxxxxxxxxxx |
| SubnetIds | At least 2 subnets (different AZs) | subnet-xxx,subnet-yyy |
| WebAccessCidr | CIDR range for web access (ports 80/8080). Set to x.x.x.x/32 for one IP, 0.0.0.0/0 for all, or another CIDR range | 0.0.0.0/0 |
| ManagementCidr | CIDR range for Console management access (port 20001). Restrict to your Console server IP for production | 10.0.0.5/32 |
| DatabaseAccessCidr | CIDR range for direct PostgreSQL access (port 5432). Use 127.0.0.1/32 to disable external access | 127.0.0.1/32 |
| DBPassword | RDS database password | Choose a strong password |
| DBName | PostgreSQL database name | aispark |
| AgentName | CAP Agent identifier (must match Console definition) | Agent |

  5. Acknowledge IAM resource creation and deploy (see AWS Resources Created below).

  6. Wait for the stack status to reach CREATE_COMPLETE (typically 10–15 minutes).

Via AWS CLI:
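
A minimal sketch of the equivalent CLI deployment, assuming cap-fullstack.yaml is in the current directory and accepts exactly the parameters listed in the table above. The function name cap_deploy_stack and all parameter values are placeholders, not part of the TomorrowX tooling. CAPABILITY_NAMED_IAM is passed on the assumption that the template may name its IAM resources; it also covers templates that do not.

```shell
# Sketch: deploy the CAP stack from the CLI.
# $1 = stack name (e.g. cap-production), $2 = RDS password.
# Every parameter value below is a placeholder; substitute your own.
cap_deploy_stack() {
  aws cloudformation deploy \
    --stack-name "$1" \
    --template-file cap-fullstack.yaml \
    --capabilities CAPABILITY_NAMED_IAM \
    --parameter-overrides \
      "VpcId=vpc-xxxxxxxxxxxxxxxxx" \
      "SubnetIds=subnet-xxx,subnet-yyy" \
      "WebAccessCidr=0.0.0.0/0" \
      "ManagementCidr=10.0.0.5/32" \
      "DatabaseAccessCidr=127.0.0.1/32" \
      "DBPassword=$2" \
      "DBName=aispark" \
      "AgentName=Agent"
}
```

For example, `cap_deploy_stack cap-production "$DB_PASSWORD"`, with the password read from a secure source rather than typed on the command line. The deploy subcommand waits until stack creation completes or fails.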

Step 3: Retrieve the Stack Outputs

Once the stack is deployed, retrieve the output values — these are the URLs and connection details you'll need:

| Output Key | Description | Example |
| --- | --- | --- |
| AgentWebURL | Public-facing solution URL (ALB) | http://cap-production-alb-xxxxx.eu-west-1.elb.amazonaws.com |
| AgentManagementEndpoint | ALB DNS name; not used for Console registration (see below) | cap-production-alb-xxxxx.eu-west-1.elb.amazonaws.com |
| RDSEndpoint | Database host and port | cap-db.xxxxx.eu-west-1.rds.amazonaws.com:5432 |
| RDSJdbcUrl | Full JDBC connection string | jdbc:postgresql://host:5432/aispark |
| ECSClusterName | ECS cluster name | cap-production-cluster |
| ECSServiceName | ECS service name | cap-production-svc |
| SnapshotBucketName | S3 bucket for configuration snapshots | cap-production-snapshots-xxxxxxxx |
| LogGroupName | CloudWatch log group | /ecs/cap-production |
| DeployUserCredentialsArn | Secrets Manager ARN for the Console deploy user credentials | arn:aws:secretsmanager:eu-west-1:123456789012:secret:cap-production/deploy-user-credentials-AbCdEf |
| DeployUserCredentialsCommand | CLI command to retrieve the deploy user access key and secret | (run in AWS CLI to get JSON with AccessKeyId and SecretAccessKey) |
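
The outputs listed above can also be read from the CLI. The sketch below assumes only the stack name; the helper names cap_stack_outputs and cap_stack_output are illustrative.

```shell
# Print every output of the stack as a key/value table.
# $1 = stack name (e.g. cap-production)
cap_stack_outputs() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].Outputs[*].[OutputKey,OutputValue]' \
    --output table
}

# Print the value of a single output.
# $1 = stack name, $2 = output key (e.g. AgentWebURL)
cap_stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}
```

For example, `cap_stack_output cap-production ECSClusterName` prints the cluster name for use in later commands.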

First-Time Setup

AWS Resources Created

The CloudFormation template creates the following IAM resources in your account:

| Resource | Type | Purpose |
| --- | --- | --- |
| TaskExecutionRole | IAM Role | Allows ECS Fargate to pull container images from ECR and read database credentials from AWS Secrets Manager |
| TaskRole | IAM Role | Grants running containers access to CloudWatch Logs, the stack's S3 snapshot bucket, ECS Exec for debugging, and AWS Marketplace metering |
| SnapshotScriptLambdaRole | IAM Role | Allows a one-time Lambda function to deploy a helper script to the S3 bucket during stack creation |
| DeployUser | IAM User | Scoped credentials for CAP Console's remote agent deployment feature (rolling updates). Policy is restricted to ECS task discovery (scoped to this stack's cluster) and ALB target group operations |
| DeployUserAccessKey | IAM Access Key | Long-term credentials for the deploy user, stored securely in AWS Secrets Manager (secret name: <stack-name>/deploy-user-credentials). Retrieve via the DeployUserCredentialsCommand stack output |
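
As a sketch of what retrieving the deploy user credentials might look like, assuming the secret name follows the <stack-name>/deploy-user-credentials pattern given above. The authoritative command is the one in the DeployUserCredentialsCommand stack output; the helper name here is illustrative.

```shell
# Print the deploy user secret (JSON with AccessKeyId and SecretAccessKey).
# $1 = stack name (e.g. cap-production)
cap_deploy_user_credentials() {
  aws secretsmanager get-secret-value \
    --secret-id "$1/deploy-user-credentials" \
    --query SecretString \
    --output text
}
```

Treat the output as sensitive: avoid writing it to shell history, logs, or shared terminals.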

All IAM resources are scoped to the minimum permissions required for their function. The deploy user credentials are never exposed in task definition environment variables — they are stored exclusively in AWS Secrets Manager.

This guide assumes you already have a CAP Console running (for example, from the TomorrowX AMI on AWS Marketplace). The CloudFormation stack deploys a CAP Agent (PDA) runtime only — your existing Console is used to manage and deploy configurations to it.

Register the Agent in Your CAP Console

The CAP Console connects directly to each ECS task on port 20001 using the task's private IP address. The ALB is for web (solution) traffic only — routing management commands through the ALB would distribute them randomly across tasks, so only one task would receive each configuration push while the others stay stale.

Find the running task's private IP:
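
A minimal sketch, assuming the cluster and service names come from the ECSClusterName and ECSServiceName stack outputs. The helper name cap_task_private_ip is illustrative, and it returns the IP of the first running task only.

```shell
# Print the private IPv4 address of the first running task in the service.
# $1 = ECS cluster name, $2 = ECS service name
cap_task_private_ip() {
  cluster="$1"
  service="$2"
  task_arn="$(aws ecs list-tasks --cluster "$cluster" --service-name "$service" \
    --desired-status RUNNING --query 'taskArns[0]' --output text)"
  # The CLI prints "None" when the list is empty.
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running task found for $service" >&2
    return 1
  fi
  aws ecs describe-tasks --cluster "$cluster" --tasks "$task_arn" \
    --query "tasks[0].attachments[0].details[?name=='privateIPv4Address'].value" \
    --output text
}
```

For example, `cap_task_private_ip cap-production-cluster cap-production-svc`. The private IP changes whenever the task is replaced, so rerun this after deployments or restarts.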

Then in your CAP Console:

  1. Navigate to Administration → Agent Definitions → Add

  2. Enter the Agent ID matching the AgentName parameter (e.g. Agent)

  3. Set Host to the ECS task private IP retrieved above (e.g. 10.0.2.47)

  4. Set Port to 20001

  5. Save — the agent should immediately show as Online

Verify the Agent Endpoint

The AgentWebURL stack output is the public-facing URL of the solution deployed to the agent — not a CAP Console login page. Once a configuration is deployed from your Console, this is where end users or downstream systems will reach it.
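
A quick reachability check can be sketched with curl, assuming the URL is taken from the AgentWebURL output. The helper name is illustrative, and the status code to expect depends on the configuration deployed to the agent, so none is stated here.

```shell
# Print the HTTP status code returned by the agent's public URL.
# $1 = value of the AgentWebURL stack output
cap_check_web_url() {
  curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 "$1"
}
```

A timeout or connection error usually points at the ALB listener, security groups, or WebAccessCidr rather than at the agent configuration.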

Install Extensions

Extensions provide the rule libraries available in The Editor. TomorrowX provides a base extension package:

  1. Navigate to Administration → Extensions → Upload

  2. Upload the RulesBaseFactory-EXTENSION.zip (and any other extensions provided with your subscription)

  3. Extensions activate automatically — no container restart is required

Deploy Your First Agent Configuration

  1. In The Editor, create a Repository and build a Ruleset

  2. Navigate to Repositories → Deploy to Agent → Select your agent

  3. The ruleset activates on the running agent within seconds

  4. Test via the AgentWebURL in your browser

Configuration Persistence (Gold Master Pattern)

The CAP Agent container is stateless by design — configuration is not baked into the container image. Instead, a snapshot of the live agent configuration can be captured to S3 at any time. When the container starts (or restarts), it automatically restores from this snapshot.

How it works:

| Container Start | Behaviour |
| --- | --- |
| S3_GOLD_MASTER env var set | Downloads configuration snapshot from S3, restores it, then starts Jetty |
| S3_GOLD_MASTER env var not set | Starts with a blank configuration (fresh deployment mode) |

This means:

  • Container replacements (deployments, scaling, restarts) automatically restore your configuration

  • Multiple container tasks share the same configuration from S3

  • Rolling back is as simple as restoring a previous snapshot

Snapshots are captured by your delivery partner using the TomorrowX snapshot tool. After a new snapshot has been captured, use the update-stack command to point the ECS service at the new S3_GOLD_MASTER path.

For managed deployments, your TomorrowX delivery partner will handle Gold Master configuration and updates. Contact [email protected] for guidance.

Viewing Agent Logs

Logs are streamed to CloudWatch automatically. To view them:

AWS Console:

  1. Open CloudWatch → Log groups

  2. Select the log group from the LogGroupName stack output

  3. Select the latest log stream

AWS CLI:
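
A minimal sketch, assuming AWS CLI v2 (which provides the logs tail subcommand) and the log group name from the LogGroupName stack output. The helper name is illustrative.

```shell
# Stream recent agent logs and keep following new entries.
# $1 = log group name (e.g. /ecs/cap-production)
# $2 = optional look-back window, defaults to 1h
cap_tail_logs() {
  aws logs tail "$1" --follow --since "${2:-1h}"
}
```

For example, `cap_tail_logs /ecs/cap-production 30m`. Press Ctrl+C to stop following.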

Logs are also accessible via the CAP Console UI — navigate to Agent → View Agent Logs to browse date-stamped log files.

Scaling

To run multiple agent tasks for high availability or load distribution, update the ECS service desired count:
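
A minimal sketch, assuming the cluster and service names from the stack outputs. The helper name is illustrative. If the CloudFormation template also sets the desired count, a later stack update may reset a value changed this way.

```shell
# Set the number of agent tasks the ECS service should run.
# $1 = ECS cluster name, $2 = ECS service name, $3 = desired task count
cap_scale() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --desired-count "$3"
}
```

For example, `cap_scale cap-production-cluster cap-production-svc 2`. Each additional task gets its own private IP and needs to be reachable from the Console on port 20001.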

All tasks share the same RDS database and EFS storage, and restore from the same S3 Gold Master snapshot on startup. The ALB distributes web traffic across healthy tasks automatically.

For session-stateful configurations, enable ALB sticky sessions (duration-based, 1 hour recommended) so that a client is consistently routed to the same container task.
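
Duration-based stickiness can be enabled on the target group as sketched below. The target group ARN is an assumption: it is not among the documented stack outputs, so look it up first (for example with aws elbv2 describe-target-groups). The helper name is illustrative.

```shell
# Enable duration-based (load balancer cookie) stickiness for one hour.
# $1 = ARN of the ALB target group that fronts the agent tasks
cap_enable_sticky_sessions() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes \
      Key=stickiness.enabled,Value=true \
      Key=stickiness.type,Value=lb_cookie \
      Key=stickiness.lb_cookie.duration_seconds,Value=3600
}
```

If the template manages these attributes, a change made outside CloudFormation may be overwritten by the next stack update.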

Updating the Container Image

When a new CAP version is released via AWS Marketplace, update the ECS service to pull the new image:
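
A minimal sketch, assuming the task definition references an image tag that is re-resolved on each deployment. If the image version is pinned in the template, a stack update with the new version would be needed instead. The helper name is illustrative.

```shell
# Start a new deployment of the service and wait until it is stable.
# $1 = ECS cluster name, $2 = ECS service name
cap_redeploy() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --force-new-deployment &&
  aws ecs wait services-stable \
    --cluster "$1" \
    --services "$2"
}
```

For example, `cap_redeploy cap-production-cluster cap-production-svc`. Replacement tasks receive new private IPs, so update the agent definition Host in the CAP Console afterwards.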

ECS performs a rolling update: existing tasks continue serving traffic through the ALB while new tasks start and pass health checks.

Troubleshooting

Agent Shows Offline in Console

  • Confirm the agent definition Host is set to the task's current private IP — not the ALB DNS name

  • Get the current private IP: aws ecs describe-tasks --cluster <cluster> --tasks <task-arn> --query 'tasks[0].attachments[0].details[?name==`privateIPv4Address`].value' --output text

  • Verify the ECS task security group allows inbound traffic on port 20001 from your CAP Console's IP or security group

  • Confirm your CAP Console has network connectivity to the task's private IP (same VPC, VPC peering, or VPN)

  • View ECS task logs in CloudWatch for startup errors

Container Tasks Failing Health Checks

Common causes:

  • RDS security group not allowing inbound from the ECS task security group

  • Incorrect DB credentials (check Secrets Manager entry)

  • EFS mount point not accessible (check EFS security group)
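
To narrow down which of these causes applies, the recent service events and the stop reason of the last stopped task are often informative. A sketch, with illustrative helper names and the cluster and service names taken from the stack outputs:

```shell
# Show the five most recent service events (timestamp and message).
# $1 = ECS cluster name, $2 = ECS service name
cap_service_events() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].events[:5].[createdAt,message]' \
    --output text
}

# Show why the first listed stopped task stopped, if any is still listed.
cap_stopped_reason() {
  stopped_arn="$(aws ecs list-tasks --cluster "$1" --service-name "$2" \
    --desired-status STOPPED --query 'taskArns[0]' --output text)"
  # Nothing to report when no stopped task is listed.
  [ -n "$stopped_arn" ] && [ "$stopped_arn" != "None" ] || return 0
  aws ecs describe-tasks --cluster "$1" --tasks "$stopped_arn" \
    --query 'tasks[0].[stoppedReason,containers[0].reason]' \
    --output text
}
```

ECS lists stopped tasks only for a limited time, so run the second helper soon after a failure.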

S3 Snapshot Restore Fails on Startup

  • Confirm the S3_GOLD_MASTER environment variable in the task definition points to an existing S3 object

  • Verify the ECS task IAM role has s3:GetObject permission on the snapshot bucket

  • Check the container logs in CloudWatch for the specific error during restore
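
The first check can be sketched as follows. It assumes the CAP Agent is the first container in the task definition and that S3_GOLD_MASTER holds an s3:// URI; neither is confirmed by this guide. The helper names are illustrative.

```shell
# Print the S3_GOLD_MASTER value from the service's current task definition.
# $1 = ECS cluster name, $2 = ECS service name
cap_gold_master_path() {
  task_def="$(aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].taskDefinition' --output text)"
  # Assumption: the agent is container index 0 in the task definition.
  aws ecs describe-task-definition --task-definition "$task_def" \
    --query "taskDefinition.containerDefinitions[0].environment[?name=='S3_GOLD_MASTER'].value" \
    --output text
}

# List the object at that path; an empty listing suggests it is missing.
# $1 = S3 URI (assumption: the value is in s3://bucket/key form)
cap_gold_master_list() {
  aws s3 ls "$1"
}
```

For example, `cap_gold_master_list "$(cap_gold_master_path cap-production-cluster cap-production-svc)"`.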

Cannot Access the Agent Web URL

  • Confirm the ALB listener is configured on port 80 (or 443 if HTTPS is configured)

  • Check ALB target group health — all targets should show healthy

  • Verify the ECS task security group allows outbound traffic
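
Target health can also be checked from the CLI. The sketch below assumes you have looked up the target group ARN (for example with aws elbv2 describe-target-groups), since it is not among the documented stack outputs. The helper name is illustrative.

```shell
# Show each registered target with its health state and failure reason.
# $1 = ARN of the ALB target group that fronts the agent tasks
cap_target_health() {
  aws elbv2 describe-target-health \
    --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State,TargetHealth.Reason]' \
    --output table
}
```

The Reason column is populated only for targets that are not healthy.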
