Deploying on AWS (ECS Fargate)
Introduction
This guide covers deploying the Composable Agentic Platform (CAP) on Amazon Web Services using a fully managed container infrastructure. The CAP Agent runtime is packaged as a Docker container and deployed to AWS ECS Fargate, giving you a scalable, serverless hosting environment without managing EC2 instances.
Deployment is self-service via a provided AWS CloudFormation template that provisions the complete infrastructure stack in a single operation.
AWS Marketplace customers: If you have not yet subscribed, visit the TomorrowX listing on AWS Marketplace before proceeding. A valid subscription is required for the container image to start correctly.
Architecture Overview
The CloudFormation template provisions a production-ready full-stack environment in your AWS account:
┌──────────────────────────────────────────────────────────────────┐
│  Your AWS Account                                                │
│                                                                  │
│  Internet → Application Load Balancer (ALB)                      │
│             │                                                    │
│             └── Port 80/443 → CAP Agent (web/solution)           │
│                                                                  │
│  ECS Fargate (CAP Agent container, private IP per task)          │
│    │      ↑                                                      │
│    │      CAP Console → Task Private IP:20001 (direct)           │
│    │                                                             │
│    ├── RDS PostgreSQL (application data)                         │
│    ├── EFS (shared file storage / uploads)                       │
│    └── CloudWatch Logs (agent logs)                              │
└──────────────────────────────────────────────────────────────────┘
Infrastructure provisioned by the CloudFormation template:
| Component | AWS service | Purpose |
| --- | --- | --- |
| Container runtime | ECS Fargate | Runs the CAP Agent; no EC2 to manage |
| Load balancer | Application Load Balancer | Routes web traffic (port 80/443) to the agent solution endpoint |
| Database | RDS PostgreSQL | Persistent application data storage |
| Shared file storage | EFS | File uploads, shared across container restarts |
| Secrets | Secrets Manager | Database credentials (never in plaintext) |
| Logs | CloudWatch | Agent log streaming and retention |
| Container image | AWS Marketplace ECR | TomorrowX-managed, subscription-gated |
Prerequisites
Before deploying, you'll need:
An active AWS Marketplace subscription to a TomorrowX CAP platform tier. Verify at AWS Marketplace → Manage subscriptions.
An AWS account with permissions to create: CloudFormation stacks, ECS clusters, RDS instances, EFS file systems, ALBs, IAM roles, VPCs, and Secrets Manager entries.
AWS CLI installed and configured, or access to the AWS Management Console.
A nominated AWS region — the stack can be deployed to any region that supports ECS Fargate, RDS, and EFS.
Deploying this stack will create billable AWS resources. Review the CloudFormation template and AWS pricing for ECS Fargate, RDS, EFS, and ALB in your region before deploying.
Deployment
Step 1: Obtain the CloudFormation Template
The CloudFormation template (cap-fullstack.yaml) is provided by your TomorrowX delivery partner or available from the TomorrowX support portal at tomorrowx.dev.
Step 2: Deploy the Stack
Via AWS Console:
Open CloudFormation → Stacks → Create stack → With new resources
Upload cap-fullstack.yaml
Enter a Stack name (e.g. cap-production). The ECS service will be named <stack-name>-svc.
Complete the parameter fields:
| Parameter | Description | Example |
| --- | --- | --- |
| VpcId | Your VPC ID | vpc-xxxxxxxxxxxxxxxxx |
| SubnetIds | At least 2 subnets (different AZs) | subnet-xxx,subnet-yyy |
| WebAccessCidr | CIDR range for web access (ports 80/8080). Set to x.x.x.x/32 for one IP, 0.0.0.0/0 for all, or another CIDR range | 0.0.0.0/0 |
| ManagementCidr | CIDR range for Console management access (port 20001). Restrict to your Console server IP for production | 10.0.0.5/32 |
| DatabaseAccessCidr | CIDR range for direct PostgreSQL access (port 5432). Use 127.0.0.1/32 to disable external access | 127.0.0.1/32 |
| DBPassword | RDS database password | Choose a strong password |
| DBName | PostgreSQL database name | aispark |
| AgentName | CAP Agent identifier (must match Console definition) | Agent |
Acknowledge IAM resource creation and deploy (see AWS Resources Created below).
Wait for the stack status to reach CREATE_COMPLETE (typically 10–15 minutes).
Via AWS CLI:
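The command below is a minimal sketch of a CLI deployment, not necessarily the exact command TomorrowX ships with the template. It assumes cap-fullstack.yaml is in the current directory and uses `aws cloudformation deploy`, which creates the stack and waits for it to finish. The function name `deploy_cap_stack` is illustrative, and the CIDR, database name, and agent name values are the example values from the parameter table; replace them with your own.

```shell
# Sketch: create the CAP stack from cap-fullstack.yaml and wait for completion.
# Arguments: stack name, VPC ID, comma-separated subnet IDs, database password.
deploy_cap_stack() {
  stack_name="$1"    # e.g. cap-production
  vpc_id="$2"        # e.g. vpc-xxxxxxxxxxxxxxxxx
  subnet_ids="$3"    # at least two subnets in different AZs, comma-separated
  db_password="$4"   # passed on the command line here for brevity only

  # CAPABILITY_NAMED_IAM acknowledges that the template creates IAM roles and a user.
  aws cloudformation deploy \
    --stack-name "$stack_name" \
    --template-file cap-fullstack.yaml \
    --capabilities CAPABILITY_NAMED_IAM \
    --parameter-overrides \
      "VpcId=$vpc_id" \
      "SubnetIds=$subnet_ids" \
      "WebAccessCidr=0.0.0.0/0" \
      "ManagementCidr=10.0.0.5/32" \
      "DatabaseAccessCidr=127.0.0.1/32" \
      "DBPassword=$db_password" \
      "DBName=aispark" \
      "AgentName=Agent"
}

# Example invocation (placeholder IDs):
# deploy_cap_stack cap-production vpc-xxxxxxxxxxxxxxxxx subnet-xxx,subnet-yyy "$DB_PASSWORD"
```

Reading the password from an environment variable, as in the example invocation, keeps it out of your shell history.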
Step 3: Retrieve the Stack Outputs
Once the stack is deployed, retrieve the output values — these are the URLs and connection details you'll need:
| Output | Description | Example |
| --- | --- | --- |
| AgentWebURL | Public-facing solution URL (ALB) | http://cap-production-alb-xxxxx.eu-west-1.elb.amazonaws.com |
| AgentManagementEndpoint | ALB DNS name; not used for Console registration (see below) | cap-production-alb-xxxxx.eu-west-1.elb.amazonaws.com |
| RDSEndpoint | Database host and port | cap-db.xxxxx.eu-west-1.rds.amazonaws.com:5432 |
| RDSJdbcUrl | Full JDBC connection string | jdbc:postgresql://host:5432/aispark |
| ECSClusterName | ECS cluster name | cap-production-cluster |
| ECSServiceName | ECS service name | cap-production-svc |
| SnapshotBucketName | S3 bucket for configuration snapshots | cap-production-snapshots-xxxxxxxx |
| LogGroupName | CloudWatch log group | /ecs/cap-production |
| DeployUserCredentialsArn | Secrets Manager ARN for the Console deploy user credentials | arn:aws:secretsmanager:eu-west-1:123456789012:secret:cap-production/deploy-user-credentials-AbCdEf |
| DeployUserCredentialsCommand | CLI command to retrieve the deploy user access key and secret | (run in AWS CLI to get JSON with AccessKeyId and SecretAccessKey) |
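From the CLI, the outputs can be read with `aws cloudformation describe-stacks`. The helpers below are a sketch; the function names are illustrative. The third helper assumes the secret name follows the `<stack-name>/deploy-user-credentials` pattern described in this guide. If in doubt, prefer the exact command in the DeployUserCredentialsCommand output.

```shell
# Print every stack output as a key/value table.
cap_stack_outputs() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' \
    --output table
}

# Print the value of a single output, e.g. AgentWebURL or ECSClusterName.
cap_stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}

# Print the deploy user credentials JSON (assumed secret name pattern).
cap_deploy_user_credentials() {
  aws secretsmanager get-secret-value \
    --secret-id "$1/deploy-user-credentials" \
    --query SecretString \
    --output text
}

# Example:
# cap_stack_output cap-production AgentWebURL
```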
First-Time Setup
AWS Resources Created
The CloudFormation template creates the following IAM resources in your account:
| Resource | Type | Purpose |
| --- | --- | --- |
| TaskExecutionRole | IAM Role | Allows ECS Fargate to pull container images from ECR and read database credentials from AWS Secrets Manager |
| TaskRole | IAM Role | Grants running containers access to CloudWatch Logs, the stack's S3 snapshot bucket, ECS Exec for debugging, and AWS Marketplace metering |
| SnapshotScriptLambdaRole | IAM Role | Allows a one-time Lambda function to deploy a helper script to the S3 bucket during stack creation |
| DeployUser | IAM User | Scoped credentials for CAP Console's remote agent deployment feature (rolling updates). Policy is restricted to ECS task discovery (scoped to this stack's cluster) and ALB target group operations |
| DeployUserAccessKey | IAM Access Key | Long-term credentials for the deploy user, stored securely in AWS Secrets Manager (secret name: <stack-name>/deploy-user-credentials). Retrieve via the DeployUserCredentialsCommand stack output |
All IAM resources are scoped to the minimum permissions required for their function. The deploy user credentials are never exposed in task definition environment variables — they are stored exclusively in AWS Secrets Manager.
This guide assumes you already have a CAP Console running (for example, from the TomorrowX AMI on AWS Marketplace). The CloudFormation stack deploys a CAP Agent (PDA) runtime only — your existing Console is used to manage and deploy configurations to it.
Register the Agent in Your CAP Console
The CAP Console connects directly to each ECS task on port 20001 using the task's private IP address. The ALB is for web (solution) traffic only — routing management commands through the ALB would distribute them randomly across tasks, so only one task would receive each configuration push while the others stay stale.
Find the running task's private IP:
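One way to look up the IP from the CLI is sketched below. It takes the cluster and service names from the ECSClusterName and ECSServiceName stack outputs and returns the IP of the first running task only. The function name is illustrative.

```shell
# Print the private IPv4 address of the first running task of an ECS service.
# Arguments: cluster name, service name.
cap_task_private_ip() {
  cluster="$1"
  service="$2"

  # Look up the ARN of the first running task for the service.
  task_arn=$(aws ecs list-tasks \
    --cluster "$cluster" \
    --service-name "$service" \
    --desired-status RUNNING \
    --query 'taskArns[0]' \
    --output text)

  # With --output text, a null result is printed as "None".
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "No running task found for service $service" >&2
    return 1
  fi

  # Read the private IP from the task's network interface attachment.
  aws ecs describe-tasks \
    --cluster "$cluster" \
    --tasks "$task_arn" \
    --query 'tasks[0].attachments[0].details[?name==`privateIPv4Address`].value' \
    --output text
}

# Example:
# cap_task_private_ip cap-production-cluster cap-production-svc
```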
Then in your CAP Console:
Navigate to Administration → Agent Definitions → Add
Enter the Agent ID matching the AgentName parameter (e.g. Agent)
Set Host to the ECS task private IP retrieved above (e.g. 10.0.2.47)
Set Port to 20001
Save. The agent should immediately show as Online
Network connectivity required: Your CAP Console must be able to reach the task's private IP on port 20001. Ensure the ECS task security group allows inbound traffic on port 20001 from your Console's IP or security group. If your Console runs outside the VPC, connectivity via VPN or VPC peering is required.
Task replacement: If ECS replaces a task (deployment, restart, or scaling event), the new task gets a new private IP. Update the Console agent definition accordingly, or use AWS Cloud Map service discovery to assign stable per-task DNS names.
Verify the Agent Endpoint
The AgentWebURL stack output is the public-facing URL of the solution deployed to the agent — not a CAP Console login page. Once a configuration is deployed from your Console, this is where end users or downstream systems will reach it.
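A quick reachability check can be sketched with curl. This only confirms that the ALB answers on the URL. What the agent returns before a configuration is deployed is not specified in this guide, so treat a non-2xx status as a prompt to investigate rather than a definite failure. The function name is illustrative.

```shell
# Report the HTTP status returned by the agent URL.
# Returns 0 for any 2xx or 3xx response, 1 otherwise.
check_agent_url() {
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")
  echo "HTTP $status from $1"
  case "$status" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example (substitute your AgentWebURL stack output):
# check_agent_url http://cap-production-alb-xxxxx.eu-west-1.elb.amazonaws.com
```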
Install Extensions
Extensions provide the rule libraries available in The Editor. TomorrowX provides a base extension package:
Navigate to Administration → Extensions → Upload
Upload the RulesBaseFactory-EXTENSION.zip (and any other extensions provided with your subscription)
Extensions activate automatically; no container restart is required
Deploy Your First Agent Configuration
In The Editor, create a Repository and build a Ruleset
Navigate to Repositories → Deploy to Agent → Select your agent
The ruleset activates on the running agent within seconds
Test via the AgentWebURL in your browser
Configuration Persistence (Gold Master Pattern)
The CAP Agent container is stateless by design — configuration is not baked into the container image. Instead, a snapshot of the live agent configuration can be captured to S3 at any time. When the container starts (or restarts), it automatically restores from this snapshot.
How it works:
| Condition | Behavior at startup |
| --- | --- |
| S3_GOLD_MASTER env var set | Downloads configuration snapshot from S3, restores it, then starts Jetty |
| S3_GOLD_MASTER env var not set | Starts with a blank configuration (fresh deployment mode) |
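The decision described above can be pictured with the following sketch. It is purely illustrative: the container's real entrypoint is not shown in this guide, `restore_snapshot` and `start_jetty` are hypothetical placeholders, and the sketch assumes S3_GOLD_MASTER holds an s3:// URI pointing at a single object.

```shell
# Hypothetical placeholders standing in for the container's real restore and start steps.
restore_snapshot() { echo "Restoring snapshot from $1"; }
start_jetty()      { echo "Starting Jetty"; }

# Illustrative startup decision based on the S3_GOLD_MASTER environment variable.
start_agent() {
  if [ -n "${S3_GOLD_MASTER:-}" ]; then
    echo "Restoring configuration from $S3_GOLD_MASTER"
    # Assumed: the snapshot is one S3 object that is copied to a local file first.
    aws s3 cp "$S3_GOLD_MASTER" /tmp/gold-master-snapshot || return 1
    restore_snapshot /tmp/gold-master-snapshot || return 1
  else
    echo "S3_GOLD_MASTER not set: starting with a blank configuration"
  fi
  start_jetty
}
```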
This means:
Container replacements (deployments, scaling, restarts) automatically restore your configuration
Multiple container tasks share the same configuration from S3
Rolling back is as simple as restoring a previous snapshot
Snapshots are captured by your delivery partner using the TomorrowX snapshot tool. After a capture, use the update-stack command to point the ECS service at the new S3_GOLD_MASTER path.
For managed deployments, your TomorrowX delivery partner will handle Gold Master configuration and updates. Contact TomorrowX support for guidance.
Viewing Agent Logs
Logs are streamed to CloudWatch automatically. To view them:
AWS Console:
Open CloudWatch → Log groups
Select the log group from the LogGroupName stack output
Select the latest log stream
AWS CLI:
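A minimal sketch using `aws logs tail`, which is available in AWS CLI v2. The function name is illustrative; pass the value of the LogGroupName stack output.

```shell
# Stream recent agent logs from CloudWatch and keep following new events.
# Arguments: log group name, optional look-back window (default: 1h).
tail_agent_logs() {
  aws logs tail "$1" --since "${2:-1h}" --follow
}

# Example:
# tail_agent_logs /ecs/cap-production 30m
```

Press Ctrl+C to stop following.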
Logs are also accessible via the CAP Console UI — navigate to Agent → View Agent Logs to browse date-stamped log files.
Scaling
To run multiple agent tasks for high availability or load distribution, update the ECS service desired count:
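A sketch of the scaling command, assuming the cluster and service names from the ECSClusterName and ECSServiceName stack outputs. The function name is illustrative. Note that a later stack update may reset the count if the template manages it.

```shell
# Set the desired task count of the agent service.
# Arguments: cluster name, service name, desired count.
scale_agent_service() {
  # Reject anything that is not a non-negative integer.
  case "$3" in
    ''|*[!0-9]*) echo "desired count must be a non-negative integer" >&2; return 1 ;;
  esac

  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --desired-count "$3" \
    --query 'service.desiredCount' \
    --output text
}

# Example:
# scale_agent_service cap-production-cluster cap-production-svc 2
```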
All tasks share the same RDS database and EFS storage, and restore from the same S3 Gold Master snapshot on startup. The ALB distributes web traffic across healthy tasks automatically.
Management and scaling: Each ECS task has its own private IP. When running multiple tasks, each task must be registered as a separate agent definition in your CAP Console (with its own private IP and a unique Agent ID, e.g. Agent-1, Agent-2). Alternatively, use AWS Cloud Map to assign stable per-task DNS names that update automatically when tasks are replaced.
For session-stateful configurations, enable ALB sticky sessions (duration-based, 1 hour recommended) so that a client is consistently routed to the same container task.
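Sticky sessions can be enabled on the ALB target group from the CLI, as sketched below. The target group ARN is not among the stack outputs listed in this guide, so this assumes you look it up yourself (for example with `aws elbv2 describe-target-groups`). The function name is illustrative, and 3600 seconds corresponds to the 1 hour recommendation.

```shell
# Enable duration-based (load balancer cookie) stickiness for 1 hour.
# Argument: target group ARN.
enable_sticky_sessions() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes \
      Key=stickiness.enabled,Value=true \
      Key=stickiness.type,Value=lb_cookie \
      Key=stickiness.lb_cookie.duration_seconds,Value=3600
}
```

If the stack is updated later, CloudFormation may revert attributes changed outside the template.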
Updating the Container Image
When a new CAP version is released via AWS Marketplace, update the ECS service to pull the new image:
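One common way to do this is to force a new deployment of the service, sketched below. This assumes the task definition references an image tag that resolves to the new version. If the image is pinned to a specific version, a stack update or a new task definition revision is needed instead. The function name is illustrative.

```shell
# Start a new deployment so that fresh tasks pull the image again,
# then block until the service reports a steady state.
# Arguments: cluster name, service name.
redeploy_agent_service() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --force-new-deployment \
    --query 'service.deployments[0].rolloutState' \
    --output text &&
  aws ecs wait services-stable --cluster "$1" --services "$2"
}

# Example:
# redeploy_agent_service cap-production-cluster cap-production-svc
```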
ECS performs a rolling update: existing tasks continue serving traffic through the ALB while new tasks start and pass health checks.
Troubleshooting
Agent Shows Offline in Console
Confirm the agent definition Host is set to the task's current private IP — not the ALB DNS name
Get the current private IP:
aws ecs describe-tasks --cluster <cluster> --tasks <task-arn> --query 'tasks[0].attachments[0].details[?name==`privateIPv4Address`].value' --output text
Verify the ECS task security group allows inbound traffic on port 20001 from your CAP Console's IP or security group
Confirm your CAP Console has network connectivity to the task's private IP (same VPC, VPC peering, or VPN)
View ECS task logs in CloudWatch for startup errors
Container Tasks Failing Health Checks
Common causes:
RDS security group not allowing inbound from the ECS task security group
Incorrect DB credentials (check Secrets Manager entry)
EFS mount point not accessible (check EFS security group)
S3 Snapshot Restore Fails on Startup
Confirm the S3_GOLD_MASTER environment variable in the task definition points to an existing S3 object
Verify the ECS task IAM role has s3:GetObject permission on the snapshot bucket
Check the container logs in CloudWatch for the specific error during restore
Cannot Access the Agent Web URL
Confirm the ALB listener is configured on port 80 (or 443 if HTTPS is configured)
Check ALB target group health: all targets should show healthy
Verify the ECS task security group allows outbound traffic
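Target health can also be checked from the CLI. The sketch below assumes you already have the target group ARN, which is not among the stack outputs listed in this guide. The function name is illustrative.

```shell
# List each registered target with its health state and, if unhealthy, the reason.
# Argument: target group ARN.
target_health() {
  aws elbv2 describe-target-health \
    --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State,TargetHealth.Reason]' \
    --output table
}
```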