Architecture

Financial ledger and double-entry accounting at scale

A double-entry financial ledger that sustains 200,000 micro-transactions per second without losing a cent: an idempotency gate, a journal with a deferred SUM = 0 check, write-sharded hot accounts, a CQRS balance cache, and a WORM audit trail, built entirely on AWS-managed services.

01

Command entry layer

flowchart LR
    Client([Merchant or client]) --> APIGW[API Gateway]
    APIGW --> ALB[ALB]
    ALB --> ECS[ECS Fargate\nledger service]

Payment commands (charge, refund, transfer) enter through API Gateway and an ALB into the ECS Fargate ledger service. The entry layer authenticates the caller and validates the command shape. It moves no money; its only job is to hand the command to the idempotency gate.

02

Idempotency gate

flowchart LR
    Client([Merchant or client]) --> APIGW[API Gateway]
    APIGW --> ALB[ALB]
    ALB --> ECS[ECS Fargate\nledger service]
    ECS -->|conditional PutItem\nattribute_not_exists| Idem[(DynamoDB\nidempotency keys\n30-day TTL)]
    Idem -->|key exists| Replay[Return stored result]

Before any money moves, the service issues an atomic conditional PutItem (attribute_not_exists) against a DynamoDB idempotency table. The key is the SHA-256 hash of client_id, request_id, amount, currency, and a timestamp bucket, and items expire after a 30-day TTL. The first writer wins; concurrent retries fail the conditional check and replay the stored result, so a retried command is never charged twice.
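The gate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the service's actual code: the `|` separator, the one-day `BUCKET_SECONDS` width, and the `IdempotencyTable` class (an in-memory stand-in for the DynamoDB table, with no TTL) are all hypothetical. Under this key scheme, a retry that lands in a later timestamp bucket hashes to a different key, so the bucket width matters; the source does not state it.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

BUCKET_SECONDS = 86_400  # assumption: the source does not give the bucket width


def idempotency_key(client_id: str, request_id: str, amount_cents: int,
                    currency: str, epoch_seconds: int) -> str:
    """SHA-256 over the five fields the text lists (separator is an assumption)."""
    bucket = epoch_seconds // BUCKET_SECONDS
    material = "|".join(
        [client_id, request_id, str(amount_cents), currency, str(bucket)]
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


class IdempotencyTable:
    """In-memory stand-in for the DynamoDB table (single process, no TTL)."""

    def __init__(self) -> None:
        self._items: Dict[str, Optional[str]] = {}

    def claim(self, key: str) -> bool:
        """Mimics PutItem with attribute_not_exists: True only for the first writer."""
        if key in self._items:
            return False
        self._items[key] = None  # claimed, result not stored yet
        return True

    def store_result(self, key: str, result: str) -> None:
        self._items[key] = result

    def stored_result(self, key: str) -> Optional[str]:
        return self._items.get(key)


def handle(table: IdempotencyTable, key: str,
           execute: Callable[[], str]) -> Tuple[Optional[str], bool]:
    """Run the command once; later calls with the same key replay the result.

    Returns (result, replayed). The result is None if the first writer
    has claimed the key but not finished yet.
    """
    if table.claim(key):
        result = execute()
        table.store_result(key, result)
        return result, False
    return table.stored_result(key), True
```

Calling `handle` twice with the same key runs `execute` only once; the second call returns the stored result with `replayed` set to `True`.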

03

Immutable double-entry journal

flowchart TD
    Client([Merchant or client]) --> APIGW[API Gateway]
    APIGW --> ALB[ALB]
    ALB --> ECS[ECS Fargate\nledger service]
    ECS -->|conditional PutItem| Idem[(DynamoDB\nidempotency keys)]
    ECS --> Proxy[RDS Proxy\nsession-pinned]
    Proxy -->|debit and credit legs| Aurora[(Aurora PostgreSQL\nMulti-AZ I/O-Optimized\ndeferred SUM equals 0)]
    Idem -->|key exists| Replay[Return stored result]

The winning command writes its journal entries, one debit leg and one credit leg at minimum, into Aurora PostgreSQL (Multi-AZ, I/O-Optimized) through RDS Proxy. The proxy is session-pinned so that the multi-statement transaction behind the deferred constraint stays on one backend connection. A DEFERRABLE constraint checks at COMMIT that the signed amount_cents values of each transaction sum to zero. Entries are append-only and stored as integer cents, so money is never created or destroyed.
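The invariant itself is small enough to model in memory. The sketch below is illustrative only: `Journal`, `Entry`, and `UnbalancedTransaction` are hypothetical names, the sign convention for debits and credits is an assumption, and the check runs in Python rather than in the database. It mirrors the deferred behaviour by validating the legs as a group at commit time, not one by one.

```python
from dataclasses import dataclass
from typing import List, Tuple


class UnbalancedTransaction(Exception):
    """Raised when the legs of a transaction do not sum to zero."""


@dataclass(frozen=True)
class Entry:
    txn_id: str
    account_id: str
    amount_cents: int  # signed integer cents; which sign means debit is an assumption


class Journal:
    """Append-only journal that only accepts balanced transactions."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def commit(self, txn_id: str, legs: List[Tuple[str, int]]) -> None:
        """Stage every leg, then check SUM(amount_cents) == 0 once, at commit."""
        staged = [Entry(txn_id, account, amount) for account, amount in legs]
        total = sum(entry.amount_cents for entry in staged)
        if total != 0:
            # Nothing is appended: the whole transaction is rejected.
            raise UnbalancedTransaction(f"{txn_id} sums to {total} cents")
        self._entries.extend(staged)

    def entries(self) -> Tuple[Entry, ...]:
        """Read-only view; there is deliberately no update or delete method."""
        return tuple(self._entries)
```

Because the check is applied to the whole group of legs, an individual leg may be non-zero while the transaction as a whole must balance, which is the property a per-row check could not express.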

04

Write-sharded hot accounts

flowchart TD
    Client([Merchant or client]) --> APIGW[API Gateway]
    APIGW --> ALB[ALB]
    ALB --> ECS[ECS Fargate\nledger service]
    ECS -->|conditional PutItem| Idem[(DynamoDB\nidempotency keys)]
    ECS --> Shard{Hot account}
    Shard -->|yes| Shards[(Aurora journal\n256 virtual shards\naccount_id NNN)]
    Shard -->|no| Aurora[(Aurora journal\nsingle account row)]
    Idem -->|key exists| Replay[Return stored result]

A viral account would funnel all 200k writes/s onto a single account row, which would throttle. With a ceiling of roughly 1,000 writes/s per key, 200k writes/s needs at least 200 shards (200,000 / 1,000), so hot accounts are split into 256 virtual shards keyed account_id#NNN. That leaves headroom, so no overflow tier is needed. Writes round-robin across the shards. A control table flags which accounts are hot and pre-shards them, so sharding is surgical, not global.
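The shard arithmetic and the round-robin key choice can be sketched as follows. This is a hedged illustration: `ShardRouter` and its in-memory hot set stand in for the control table, the zero-padded three-digit `NNN` suffix is an assumed format, and rounding 200 up to the next power of two is only one plausible reading of why 256 was chosen.

```python
import itertools
from typing import Dict, Iterator, Set


def shards_needed(peak_writes_per_s: int, per_key_ceiling: int) -> int:
    """Smallest shard count that keeps every key under the ceiling (integer ceil)."""
    return -(-peak_writes_per_s // per_key_ceiling)


def next_power_of_two(n: int) -> int:
    """Round n up to a power of two (assumption about how 256 was picked)."""
    return 1 << (n - 1).bit_length()


class ShardRouter:
    """Picks the write key for an account; only flagged accounts are sharded."""

    def __init__(self, hot_accounts: Set[str], shard_count: int = 256) -> None:
        self._hot = set(hot_accounts)  # stand-in for the control table
        self._shard_count = shard_count
        self._counters: Dict[str, Iterator[int]] = {}

    def write_key(self, account_id: str) -> str:
        if account_id not in self._hot:
            return account_id  # cold accounts keep a single row
        if account_id not in self._counters:
            self._counters[account_id] = itertools.cycle(range(self._shard_count))
        shard = next(self._counters[account_id])  # round-robin over the shards
        return f"{account_id}#{shard:03d}"


minimum = shards_needed(200_000, 1_000)
print(minimum, next_power_of_two(minimum))
```

A per-process counter gives round-robin only within one service task; with many Fargate tasks writing concurrently, the spread across shards is approximately even, which is enough to stay under the per-key ceiling.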

05

CQRS balance read path

flowchart TD
    Client([Merchant or client]) --> APIGW[API Gateway]
    APIGW --> ALB[ALB]
    ALB --> ECS[ECS Fargate\nledger service]
    ECS -->|conditional PutItem| Idem[(DynamoDB\nidempotency keys)]
    ECS --> Aurora[(Aurora journal\n256 shards for hot accounts)]
    Aurora -->|CDC via DMS| Kinesis[(Kinesis On-Demand\nby account_id)]
    Kinesis --> Projector[CQRS projector Lambda]
    Projector --> Cache[(ElastiCache Valkey\nmaterialised balances)]
    Projector --> Snap[(DynamoDB\nanalytics snapshots)]
    Cache -->|balance read| ECS
    Aurora -.->|cache miss fallback| ECS

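Summing millions of entries on every balance read costs O(entries) per read, which is unusable at this volume. Instead, DMS change data capture streams journal entries from Aurora into a Kinesis On-Demand stream, partitioned by logical account_id so that each account's changes stay in LSN order. The stream feeds a CQRS projector Lambda, which materialises one balance per logical account in ElastiCache (Valkey) via atomic INCRBY, summing across all 256 shards of a hot account, and writes analytics snapshots to DynamoDB. Reads hit the cache; a miss falls back to a bounded Aurora aggregate.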
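A minimal model of the projector and the read path is shown below. Everything here is an assumption made for illustration: `BalanceCache` is an in-memory stand-in for the cache whose `incrby` imitates the atomic increment, records are simplified to `(write_key, amount_cents)` pairs, and `fallback` represents the bounded Aurora aggregate. The sketch does not deduplicate redelivered records, which a real projector would need to handle since stream consumers can see a record more than once.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def logical_account(write_key: str) -> str:
    """Strip the '#NNN' shard suffix so every shard folds into one balance."""
    return write_key.split("#", 1)[0]


class BalanceCache:
    """In-memory stand-in for the balance cache."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = {}

    def incrby(self, key: str, delta: int) -> int:
        """Imitates an atomic increment and returns the new value."""
        self._balances[key] = self._balances.get(key, 0) + delta
        return self._balances[key]

    def get(self, key: str) -> Optional[int]:
        return self._balances.get(key)


def project(cache: BalanceCache, records: Iterable[Tuple[str, int]]) -> None:
    """Apply each journal entry to its logical account's materialised balance."""
    for write_key, amount_cents in records:
        cache.incrby(logical_account(write_key), amount_cents)


def read_balance(cache: BalanceCache, account_id: str,
                 fallback: Callable[[str], int]) -> int:
    """Serve from the cache; on a miss, use the (hypothetical) journal aggregate."""
    cached = cache.get(account_id)
    if cached is not None:
        return cached
    return fallback(account_id)
```

Because each entry is applied as a delta, a read costs one key lookup regardless of how many entries the account has accumulated.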

06

Full system - sweeper, reconciliation and WORM audit

flowchart TD
    Client([Merchant or client]) --> APIGW[API Gateway]
    subgraph Edge[Edge and compute]
        APIGW --> ALB[ALB]
        ALB --> ECS[ECS Fargate\nledger service]
    end
    subgraph Write[Write side - source of truth]
        ECS -->|conditional PutItem| Idem[(DynamoDB\nidempotency keys\nPENDING then COMPLETE)]
        ECS --> Proxy[RDS Proxy]
        Proxy --> Aurora[(Aurora journal\nMulti-AZ I/O-Optimized\n256 shards 90-day window)]
        Sweep[EventBridge sweeper\nLambda every 5 min] --> Idem
        Sweep --> Aurora
    end
    subgraph Read[Read side - CQRS]
        Aurora -->|CDC via DMS| Kinesis[(Kinesis On-Demand)]
        Kinesis --> Projector[CQRS projector Lambda]
        Projector --> Cache[(ElastiCache Valkey\nbalances)]
        Projector --> Snap[(DynamoDB\nanalytics snapshots)]
        Cache --> ECS
    end
    subgraph Saga[Cross-account]
        ECS --> SFN[Step Functions Express\nsaga compensating entries]
        SFN --> Proxy
    end
    subgraph Audit[Audit and reconcile]
        Aurora -->|DMS Parquet| S3[(S3 Object Lock\nWORM audit and archive)]
        S3 --> Athena[Athena compliance query]
        Recon[Step Functions plus Athena\nnightly re-sum] --> S3
        Recon --> Cache
        Recon --> Drift[(S3 drift report\nalert over 0.01 dollar)]
        Macie[Macie scan] --> S3
    end
    CloudTrail[CloudTrail control plane] --> Audit

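The complete single-region, Multi-AZ picture adds four pieces. An EventBridge-scheduled sweeper Lambda resolves stranded PENDING idempotency claims against Aurora every 5 minutes. Cross-account transfers go through a Step Functions Express saga that posts compensating entries on failure. A nightly Step Functions workflow uses Athena to re-sum the Parquet journal of record and compare it with the balance cache, alerting on drift over $0.01, with a 26-hour dead-man's switch in case the job itself stops running. DMS exports entries to S3 as Parquet under Object Lock (WORM), where Athena queries them. CloudTrail, KMS and Macie supply the control-plane logging, encryption and data scanning behind SOC 2, ISO 27001, GDPR, NIST and PCI-DSS compliance.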
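The nightly comparison and the dead-man's switch reduce to two small checks, sketched here under stated assumptions: the journal is simplified to `(account_id, amount_cents)` pairs instead of Parquet rows read through Athena, "over $0.01" is read as strictly more than one cent, and the switch is assumed to measure time since the last successful run. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

DRIFT_THRESHOLD_CENTS = 1          # $0.01, taken from the text
DEAD_MAN_SECONDS = 26 * 60 * 60    # 26 hours, taken from the text


def resum(journal: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Recompute every balance from the journal of record."""
    totals: Dict[str, int] = {}
    for account_id, amount_cents in journal:
        totals[account_id] = totals.get(account_id, 0) + amount_cents
    return totals


def find_drift(journal: Iterable[Tuple[str, int]],
               cached: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """Return (account, journal_balance, cached_balance) where drift exceeds 1 cent."""
    truth = resum(journal)
    report = []
    for account_id in sorted(set(truth) | set(cached)):
        expected = truth.get(account_id, 0)
        actual = cached.get(account_id, 0)
        if abs(expected - actual) > DRIFT_THRESHOLD_CENTS:
            report.append((account_id, expected, actual))
    return report


def reconciliation_overdue(last_success_epoch: int, now_epoch: int) -> bool:
    """Dead-man's switch: true once more than 26 hours pass without a success."""
    return now_epoch - last_success_epoch > DEAD_MAN_SECONDS
```

The 26-hour window gives a nightly job two hours of slack before the absence of a result is itself treated as an alert.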