Scale-to-Zero Architecture in Azure for SaaS Products

Table of contents

Introduction
- What you’ll learn
- Prerequisites
What we’re optimising for
- Defining “scale to zero” (pragmatically)
Core principles
Reference architecture
Workload mapping
Achieving scale-to-zero (practical configuration)
Data and storage patterns
Security baseline
Payment and wallet model
- Subscription lifecycle
- Wallet as immutable ledger
Deployment and environments
Ops and residency
Cost control checklist
Honest trade-offs
Architecture diagrams
Next steps
Summary

Introduction

If you’re building a SaaS product as an individual contributor, you’ve almost certainly asked the same question I did: how do I keep capital investment as low as possible? The capital is your own money, so every bit counts. At the same time, if you have experience of the software delivery life-cycle, you know it’s generally a good idea to have something that still behaves like a production-grade system: a solution that won’t fall over the moment traffic arrives.

This blog post, part of the “Playbooks” series, walks through a scale-to-zero architecture on Azure: one that can idle at near-zero cost while maintaining a clean path to “real SaaS” with authentication, payments, async jobs, observability, and sensible security defaults.

I’ll use a reference architecture I’ve called “SaaS-Light” throughout, but the design considerations apply to most early-stage SaaS products where cost control is critical.

What you’ll learn

How to structure compute, data, and background processing for genuine scale-to-zero
Practical configuration knobs in Azure Container Apps and Azure Functions
Data tier patterns that minimise idle chatter
Security defaults that don’t require a dedicated security team
Honest trade-offs you’ll need to accept

Prerequisites

Before diving in, you should be comfortable with:

Azure fundamentals (resource groups, subscriptions, managed identity)
Containerised applications (Docker basics)
At least one backend framework (ASP.NET Core, Node.js, etc.)
Basic understanding of queues and async processing

What we’re optimising for

Let’s be explicit about the goals; these shape every decision downstream.

Primary goal: Near-zero idle cost while keeping a clean path to production-grade SaaS (auth, payments, async jobs, observability, security).

Secondary goals:

Predictable operations
Minimal moving parts
Regional data residency (more on this in the Ops section)
Developer ergonomics

Defining “scale to zero” (pragmatically)

Let’s be honest about what “scale to zero” actually means in practice:

Layer	Can it hit zero?	Reality check
Compute	Yes	Container Apps and Functions can scale to 0 replicas
Background processing	Yes	Event-driven/queued, runs only when needed
Data tier	Mostly	Serverless SQL can pause, but constant pokes keep it awake
Edge/CDN	No	DNS, WAF, and caching always incur a baseline cost
Monitoring	No	Application Insights has minimum ingestion costs
Storage	No	Blob storage charges for existence, not just access

The goal isn’t literally zero; it’s near-zero with a clear understanding of what’s irreducible.

Core principles

These five principles form the “rules of the game” for this architecture:

1. Separate request/response from long work

Your web/API layer should handle fast work only. Anything that takes more than a couple of seconds becomes a job: queue it, process it asynchronously, and update its status as it progresses.

❌ User clicks "Refresh Prices" → API blocks for 30 seconds → Returns result
✅ User clicks "Refresh Prices" → API enqueues job → Returns immediately → Worker processes → UI polls for status

2. Event-driven everything

Prefer HTTP triggers + queue triggers over always-on schedulers or polling loops. If nothing’s happening, nothing should be running.

3. Cold-start tolerant UX

Cold starts are real. Design for them:

Keep container images small
Cache aggressively at the edge (I like Cloudflare’s free plan for this; more on that later)
Show loading states gracefully
Pre-warm critical paths if needed

4. Cost controls are architecture, not finance

Don’t treat cost management as a finance team problem. Bake it into the architecture:

minReplicas: 0 is a design decision
Budgets and alerts are infrastructure
“Kill switches” for runaway workloads are first-class features

5. Security is default, not a later patch

Managed identity, Key Vault, least privilege, and defensive data boundaries aren’t “nice to haves” for later. They’re cheaper to build in from day one, so think about them up front. If you’re using agentic AI workflows, include security reviews in your process.

Reference architecture

Here’s the high-level view of what we’re building:

Edge layer

Cloudflare (or equivalent) handles:

DNS management
WAF and DDoS protection
Edge caching and cache rules
Rate limiting (optional but recommended)

Why Cloudflare? It’s cost-effective, has a generous free tier, and the Workers/Rules ecosystem is mature. Azure Front Door is an alternative if you prefer staying in-ecosystem.

App tier (compute that scales to zero)

Azure Container Apps (ACA) hosts your application containers:

App	Purpose	Scale rule
`web`	Customer UI + API (e.g., Blazor Server, ASP.NET Core)	HTTP concurrency, min replicas = 0
`admin`	Internal/admin portal	HTTP concurrency, min replicas = 0

Container Apps is the sweet spot here: it’s simpler than Kubernetes, supports scale-to-zero natively, and handles ingress/TLS automatically.

In the table above, both the public and admin apps scale to zero. You might choose to keep the public app warm (min replicas = 1) to give end users a better experience with less cold-start impact. The admin app is a great candidate for scale-to-zero since it’s used infrequently, or at least only by a small number of users.

Async tier (also scales to zero)

Pick one (or mix based on workload):

Azure Functions (Consumption or Flex Consumption plan):

Best for: queue triggers, timer triggers, webhook handlers
Cold start: typically 1-3 seconds
Cost: pay per execution

Azure Container Apps Jobs:

Best for: run-to-completion workloads (batch imports, data migrations)
Cold start: container pull time (keep images small!)
Cost: pay for execution duration

Integration and messaging

Azure Storage Queues for most scenarios:

Dead simple
Extremely cheap
Good enough for 90% of async patterns

Azure Service Bus when you need:

Message sessions (extracting ordered sets of messages)
Advanced dead-letter handling
Duplicate detection and transactions (delivery is still at-least-once, so keep handlers idempotent)
Complex routing/topics

Start with Storage Queues. Graduate to Service Bus when you hit its limitations. Please bear in mind that Azure Queue Storage is a lightweight queue and doesn’t provide a FIFO guarantee (ordering is best-effort). If you need guaranteed ordered processing and other broker features, Azure Service Bus can provide FIFO via Sessions (Standard/Premium), but it typically comes at a higher cost than Storage Queues.

Data layer

Azure SQL Database (Serverless tier):

Auto-pause after configurable idle period
Auto-resume on first connection
Pay for vCores only when active

Perfect for: relational domain data (users, collections, subscriptions, pricing metadata).

Azure Blob Storage:

Images and binary assets
Serve via CDN for performance
Lifecycle policies for cost control

What about Redis?

Be honest with yourself: Redis (Azure Cache for Redis) is rarely “scale-to-zero”. The cheapest tier still costs money when idle.

Instead:

Use app-level caching (in-memory, with short TTL)
Use edge caching (Cloudflare cache rules)
Add Redis only when you’ve proven you need distributed cache
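To make the app-level caching option concrete, here is a minimal sketch of an in-memory cache with a short TTL. It is written in Python purely for brevity (the same shape works with `IMemoryCache` in ASP.NET Core), and the `TTLCache` class, its `get_or_load` method, and the injectable `clock` parameter are all hypothetical names for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny per-process cache; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the backing store entirely
        value = loader()  # expired or missing: load once and remember it
        self._entries[key] = (now + self._ttl, value)
        return value
```

Bear in mind that each replica holds its own copy, and the cache is empty again after a scale-to-zero restart. That is usually acceptable for short-lived, read-mostly data, and it is the point at which a distributed cache starts to earn its cost.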

Config and secrets

Azure Key Vault:

Store third-party secrets (payment provider keys, email API keys)
Access via managed identity (no connection strings in config)

Azure App Configuration:

Feature flags
Runtime configuration
Environment-specific settings

Observability

Application Insights:

Logs, traces, metrics in one place
Distributed tracing across services
Alerting on anomalies

Cost control tips:

Enable sampling (don’t log 100% of requests)
Set retention periods appropriately
Focus on high-signal metrics, not vanity dashboards

Workload mapping

These are concrete examples that I ran into, so let’s map real SaaS flows to this architecture, using my first SaaS project as the reference:

Authentication flow

User → Cloudflare → ACA (web) → Azure AD B2C / Entra External ID
                                         ↓
                              Token validation at app edge

Options:

Azure AD B2C or Entra External ID for customer identity
Alternative providers (Auth0, Clerk) if you prefer managed auth

The key: token validation happens at your app edge, not deep in your business logic.

Email notifications

App event (sign-up, password reset)
         ↓
    Storage Queue
         ↓
    Azure Function
         ↓
    Email provider API (Resend, Postmark, SendGrid)

Why queue it? Email providers can be slow or rate-limited. Your API shouldn’t block on email delivery.

Read-heavy endpoints (search, collection views)

User → Cloudflare (cache hit?) → ACA (web) → SQL
              ↑                        ↓
         Cache response           Query + cache miss

Cache public/catalog-like data aggressively at the edge. Use Cache-Control headers and Cloudflare cache rules.

Price refresh (the async pattern in detail)

This is the canonical example of “separate fast from slow”:

1. User clicks "Refresh Prices"
2. API validates request, enqueues job, returns job ID immediately
3. Worker picks up job from queue
4. Worker calls external pricing API
5. Worker writes pricing rows to SQL
6. Worker updates job status to "complete"
7. UI polls job status, shows progress, displays results

Key design choice: Price refresh is never synchronous. The user gets immediate feedback, and the actual work happens in the background.
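The seven steps above can be sketched in a few lines. This is an illustrative, in-process version only: a `queue.Queue` stands in for the Storage Queue, a dictionary stands in for a job status table in SQL, and the function names (`enqueue_price_refresh`, `run_worker_once`, `get_job_status`) are hypothetical.

```python
import queue
import uuid
from typing import Callable, Dict

job_queue = queue.Queue()         # stand-in for an Azure Storage Queue
job_status: Dict[str, dict] = {}  # stand-in for a job status table in SQL


def enqueue_price_refresh(collection_id: str) -> str:
    """API side (steps 1-2): record the job, enqueue it, return straight away."""
    job_id = str(uuid.uuid4())
    job_status[job_id] = {"state": "queued", "result": None}
    job_queue.put({"job_id": job_id, "collection_id": collection_id})
    return job_id


def run_worker_once(fetch_prices: Callable[[str], dict]) -> bool:
    """Worker side (steps 3-6): process one message; return False when idle."""
    try:
        message = job_queue.get_nowait()
    except queue.Empty:
        return False
    job = job_status[message["job_id"]]
    job["state"] = "running"
    try:
        # The slow external call happens here, never in the API request
        job["result"] = fetch_prices(message["collection_id"])
        job["state"] = "complete"
    except Exception as exc:
        job["state"] = "failed"
        job["result"] = str(exc)
    return True


def get_job_status(job_id: str) -> dict:
    """What the UI polls (step 7)."""
    return job_status[job_id]
```

Recording a "failed" state matters as much as "complete": the UI needs something to show when the external pricing API is down, and the job record is the only place it can find out.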

Achieving scale-to-zero (practical configuration)

Here’s where theory meets YAML. Let’s look at the actual knobs (don’t giggle at the phrasing). I should also point out that YAML is not my strong suit.

Azure Container Apps configuration

# Excerpt from ACA deployment
properties:
  configuration:
    ingress:
      external: true
      targetPort: 8080
  template:
    scale:
      minReplicas: 0      # ← This is the magic
      maxReplicas: 10
      rules:
        - name: http-scaling
          http:
            metadata:
              concurrentRequests: "50"

Additional optimisations:

Keep images small: Use multi-stage builds, distroless bases
Health probes: Configure liveness/readiness so ACA knows when you’re ready
Fast startup: Lazy-load what you can, defer non-critical initialisation

Azure Functions (Consumption plan)

{
  "bindings": [
    {
      "name": "queueItem",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "price-refresh-jobs",
      "connection": "AzureWebJobsStorage"
    }
  ]
}

The function only runs when there’s work in the queue. No queue messages = no execution = no cost.

Azure SQL Serverless

-- Configure auto-pause (via Azure Portal or ARM/Bicep)
-- Typical settings:
-- Auto-pause delay: 60 minutes
-- Min vCores: 0.5
-- Max vCores: 4

Design your app to minimise SQL chatter:

Batch reads where possible
Use projections instead of loading full entities
Consider read replicas or materialised views for heavy read patterns

Edge caching strategy

Configure Cloudflare (or your edge provider) to cache appropriately:

Endpoint pattern	Cache strategy
`/api/catalog/*`	Cache 1 hour, stale-while-revalidate
`/api/user/*`	No cache (private data)
`/static/*`	Cache indefinitely (hashed filenames)
`/api/health`	No cache

Use stale-while-revalidate for data that’s okay to be slightly stale. Users get fast responses while the cache refreshes in the background.
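On the origin side, the table translates into Cache-Control headers set by the app. The sketch below shows one way to map path prefixes to header values. The `cache_control_for` helper is hypothetical, mapping "no cache" to `no-store` is my assumption, and so is the 300-second stale-while-revalidate window; the one-hour catalog lifetime comes from the table.

```python
from typing import List, Tuple

# (path prefix, Cache-Control value), checked in order
CACHE_RULES: List[Tuple[str, str]] = [
    ("/api/health", "no-store"),
    ("/api/user/", "private, no-store"),
    # 1 hour fresh, then served stale while refreshing (window is an assumption)
    ("/api/catalog/", "public, max-age=3600, stale-while-revalidate=300"),
    # Hashed filenames never change, so cache for a year and mark immutable
    ("/static/", "public, max-age=31536000, immutable"),
]
DEFAULT_CACHE_CONTROL = "no-store"  # assumption: unknown paths are never cached


def cache_control_for(path: str) -> str:
    """Return the Cache-Control header value for a request path."""
    for prefix, value in CACHE_RULES:
        if path.startswith(prefix):
            return value
    return DEFAULT_CACHE_CONTROL
```

Defaulting to `no-store` is the cautious choice: a new endpoint has to opt in to caching, so private data cannot end up in a shared cache by accident.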

Data and storage patterns

SQL schema approach

Even if v1 is effectively single-tenant, design for multi-tenancy from day one:

-- Every significant table includes TenantId
CREATE TABLE Collections (
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    TenantId UNIQUEIDENTIFIER NOT NULL,  -- Ready for multi-tenant
    Name NVARCHAR(200) NOT NULL,
    CreatedAt DATETIME2 NOT NULL,
    -- ...
    INDEX IX_Collections_TenantId (TenantId)
);

This saves painful migrations later.
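A TenantId column only helps if every query actually filters on it. One way to enforce that is to make the tenant a required argument of every data-access function. The sketch below uses Python's built-in SQLite as a stand-in for Azure SQL, with a cut-down version of the Collections table; the function names are hypothetical.

```python
import sqlite3
import uuid
from typing import List, Tuple


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE Collections ("
        " Id TEXT PRIMARY KEY, TenantId TEXT NOT NULL, Name TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX IX_Collections_TenantId ON Collections (TenantId)")


def add_collection(conn: sqlite3.Connection, tenant_id: str, name: str) -> str:
    collection_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO Collections (Id, TenantId, Name) VALUES (?, ?, ?)",
        (collection_id, tenant_id, name),
    )
    return collection_id


def list_collections(conn: sqlite3.Connection, tenant_id: str) -> List[Tuple[str, str]]:
    """tenant_id is a required argument, so a caller cannot forget the filter."""
    rows = conn.execute(
        "SELECT Id, Name FROM Collections WHERE TenantId = ? ORDER BY Name",
        (tenant_id,),
    )
    return rows.fetchall()
```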

Pricing history (append-only pattern)

For data that grows over time (price history, audit logs), use append-only tables:

-- Append-only history
CREATE TABLE PriceHistory (
    Id BIGINT IDENTITY PRIMARY KEY,
    CardId UNIQUEIDENTIFIER NOT NULL,
    Price DECIMAL(10,2) NOT NULL,
    Currency CHAR(3) NOT NULL,
    RecordedAt DATETIME2 NOT NULL,
    Source NVARCHAR(50) NOT NULL,
    INDEX IX_PriceHistory_CardId_RecordedAt (CardId, RecordedAt DESC)
);

-- Materialised current price (updated by worker)
CREATE TABLE CurrentPrices (
    CardId UNIQUEIDENTIFIER PRIMARY KEY,
    Price DECIMAL(10,2) NOT NULL,
    Currency CHAR(3) NOT NULL,
    LastUpdated DATETIME2 NOT NULL
);

Query current prices from CurrentPrices (fast, single row per card). Query history from PriceHistory only when needed.
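The worker write that keeps both tables in step might look like the following sketch. It uses SQLite as a stand-in for Azure SQL (so the column types differ from the DDL above), stores prices as text to sidestep float rounding in the demo, and `record_price` is a hypothetical name. The important part is that both writes share one transaction.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE PriceHistory (
    Id INTEGER PRIMARY KEY AUTOINCREMENT,
    CardId TEXT NOT NULL, Price TEXT NOT NULL, Currency TEXT NOT NULL,
    RecordedAt TEXT NOT NULL, Source TEXT NOT NULL
);
CREATE TABLE CurrentPrices (
    CardId TEXT PRIMARY KEY, Price TEXT NOT NULL,
    Currency TEXT NOT NULL, LastUpdated TEXT NOT NULL
);
"""


def record_price(conn: sqlite3.Connection, card_id: str, price: str,
                 currency: str, source: str) -> None:
    """Append to history and refresh the current price atomically."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: both writes commit together or not at all
        conn.execute(
            "INSERT INTO PriceHistory (CardId, Price, Currency, RecordedAt, Source)"
            " VALUES (?, ?, ?, ?, ?)",
            (card_id, price, currency, now, source),
        )
        # Replace the single current-price row for this card
        conn.execute(
            "INSERT OR REPLACE INTO CurrentPrices (CardId, Price, Currency, LastUpdated)"
            " VALUES (?, ?, ?, ?)",
            (card_id, price, currency, now),
        )
```

On Azure SQL the second statement would typically be an UPDATE followed by an INSERT when no row was affected, or a MERGE, inside the same transaction.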

Blob storage for images

Use deterministic, hierarchical paths. The example below uses TCG card images (why not?):

cards/{setCode}/{cardNumber}/{variant}.jpg

Examples:
cards/ONE/042/normal.jpg
cards/ONE/042/foil.jpg
cards/ONE/042/extended-art.jpg

For public assets, use a public container with CDN. For private assets, generate SAS tokens with short expiry.
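Deterministic paths are easiest to keep deterministic when a single function builds them. A minimal sketch follows; `card_blob_path` is a hypothetical helper, and the rule that segments are alphanumeric or hyphenated is my assumption based on the examples above.

```python
import re

# Assumption: path segments contain only letters, digits, and hyphens
_SEGMENT = re.compile(r"[A-Za-z0-9-]+")


def card_blob_path(set_code: str, card_number: int, variant: str = "normal") -> str:
    """Build cards/{setCode}/{cardNumber}/{variant}.jpg from its parts."""
    if not _SEGMENT.fullmatch(set_code) or not _SEGMENT.fullmatch(variant):
        raise ValueError("set code and variant must be alphanumeric or hyphenated")
    # Zero-pad to three digits so blobs list in card order
    return f"cards/{set_code.upper()}/{card_number:03d}/{variant.lower()}.jpg"
```

Rejecting unexpected characters up front also stops a stray slash or `..` in user-supplied input from writing outside the intended prefix.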

Security baseline

This is the minimum viable security posture: enough to be taken seriously without a dedicated security team.

Managed identity everywhere

ACA (web) ──managed identity──→ Azure SQL
          ──managed identity──→ Key Vault
          ──managed identity──→ Blob Storage
          ──managed identity──→ Storage Queue

Azure Functions ──managed identity──→ Same resources

No connection strings in environment variables. No secrets in config files. Managed identity handles authentication automatically.

Key Vault for third-party secrets only

Store only what you can’t avoid:

Payment provider API keys
Email service API keys
Third-party integration credentials

Access via managed identity:

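// In startup (Program.cs). Requires the Azure.Identity and
// Azure.Extensions.AspNetCore.Configuration.Secrets packages.
using Azure.Identity;

var builder = WebApplication.CreateBuilder(args);

// DefaultAzureCredential uses the managed identity when running in Azure
builder.Configuration.AddAzureKeyVault(
    new Uri("https://your-vault.vault.azure.net/"),
    new DefaultAzureCredential());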

Network security (pragmatic approach)

Start simple, add complexity when justified:

Phase 1 (MVP): Public endpoints + Cloudflare WAF + rate limiting
Phase 2 (when needed): Private endpoints for SQL, VNet integration
Phase 3 (enterprise): Full network isolation, Azure Firewall

Don’t over-engineer network security before you have traffic worth protecting; this is typically a waste of time and money in early stages. You may not have even validated that your SaaS product has a market yet.

Admin portal protection

I like building admin portals as separate apps. I almost always want to automate some admin tasks, depending on the particular flavour of SaaS product, so I like to keep them isolated from public-facing apps.

The admin portal (admin app) needs extra protection:

Separate ACA app with separate ingress
Stronger auth requirements (MFA, specific group membership)
IP allowlisting via Cloudflare Access or Azure networking
Consider Cloudflare Zero Trust for remote admin access

Payment and wallet model

For SaaS with subscriptions and usage-based billing, consider the patterns below.

Subscription lifecycle

Payment provider webhook
         ↓
    Webhook endpoint (validate signature)
         ↓
    Storage Queue
         ↓
    Worker (idempotent processing)
         ↓
    Update subscription state in SQL

Idempotency is critical. Payment webhooks can be delivered multiple times. Use the event ID to deduplicate. For those not familiar with the term, idempotent processing means that processing the same event multiple times has the same effect as processing it once.
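Both halves of that flow, signature validation at the endpoint and deduplication in the worker, fit in a short sketch. The signature scheme here (a hex HMAC-SHA256 over the raw body) is a hypothetical stand-in, so check your payment provider's documentation for its real scheme. The in-memory set of event IDs stands in for a uniquely keyed table in SQL.

```python
import hashlib
import hmac
import json
from typing import Callable, Set

processed_event_ids: Set[str] = set()  # stand-in for a unique-keyed SQL table


def signature_is_valid(payload: bytes, signature_hex: str, secret: bytes) -> bool:
    """Hypothetical scheme: hex HMAC-SHA256 of the raw body. Real providers differ."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def handle_event(payload: bytes, apply_change: Callable[[dict], None]) -> bool:
    """Worker side: apply each event once. Returns False for duplicates."""
    event = json.loads(payload)
    event_id = event["id"]
    if event_id in processed_event_ids:
        return False  # already handled: redelivery is a no-op
    apply_change(event)
    processed_event_ids.add(event_id)  # mark done only after the change succeeded
    return True
```

In a real worker, the event ID insert and the subscription update belong in the same SQL transaction, so a crash between the two cannot leave an event half-applied.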

Wallet as immutable ledger

Don’t store just a balance field—store a ledger:

CREATE TABLE WalletTransactions (
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    WalletId UNIQUEIDENTIFIER NOT NULL,
    Amount DECIMAL(10,2) NOT NULL,      -- Positive = credit, negative = debit
    TransactionType NVARCHAR(50) NOT NULL,
    ReferenceId NVARCHAR(200) NULL,     -- External reference (payment ID, etc.)
    CreatedAt DATETIME2 NOT NULL,
    Description NVARCHAR(500) NULL
);

-- Balance is calculated: SUM(Amount) WHERE WalletId = @WalletId
-- Or maintain a materialised balance updated transactionally

This gives you auditability and makes reconciliation possible.
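Here is a small sketch of the ledger idea, again with SQLite standing in for Azure SQL. One deliberate difference from the DDL above: amounts are stored as integer minor units (pence or cents) in a hypothetical `AmountMinor` column, which keeps the demo free of decimal handling. The function names are illustrative.

```python
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Optional


def create_ledger(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE WalletTransactions ("
        " Id TEXT PRIMARY KEY, WalletId TEXT NOT NULL,"
        " AmountMinor INTEGER NOT NULL,"  # positive = credit, negative = debit
        " TransactionType TEXT NOT NULL, ReferenceId TEXT, CreatedAt TEXT NOT NULL)"
    )


def post_transaction(conn: sqlite3.Connection, wallet_id: str, amount_minor: int,
                     transaction_type: str, reference_id: Optional[str] = None) -> str:
    """Append a ledger row; existing rows are never updated or deleted."""
    transaction_id = str(uuid.uuid4())
    with conn:
        conn.execute(
            "INSERT INTO WalletTransactions VALUES (?, ?, ?, ?, ?, ?)",
            (transaction_id, wallet_id, amount_minor, transaction_type,
             reference_id, datetime.now(timezone.utc).isoformat()),
        )
    return transaction_id


def balance(conn: sqlite3.Connection, wallet_id: str) -> int:
    """The balance is derived from the ledger, never stored on its own."""
    row = conn.execute(
        "SELECT COALESCE(SUM(AmountMinor), 0) FROM WalletTransactions WHERE WalletId = ?",
        (wallet_id,),
    ).fetchone()
    return row[0]
```

A refund or correction is just another row with the opposite sign, which is what keeps the history auditable.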

Deployment and environments

Subscription strategy

For early-stage SaaS, keep it simple:

Azure Tenant (your org)
├── Subscription: saas-dev
│   ├── rg-saas-app-dev-uks
│   ├── rg-saas-data-dev-uks
│   └── rg-saas-ops-dev-uks
│
└── Subscription: saas-prod
    ├── rg-saas-app-prod-uks
    ├── rg-saas-data-prod-uks
    └── rg-saas-ops-prod-uks

Separate subscriptions per environment:

Clear billing separation
Blast radius containment
Different RBAC policies per environment

Resource group organisation

Group by workload, not by “dev vs prod”:

Resource Group	Contains
`rg-{product}-app-{env}-{region}`	ACA apps, Functions
`rg-{product}-data-{env}-{region}`	SQL, Storage accounts
`rg-{product}-ops-{env}-{region}`	Key Vault, App Config, monitoring

CI/CD with GitHub Actions

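# Simplified deployment flow
name: Deploy to Production

on:
  push:
    branches: [main]

permissions:
  id-token: write   # Needed for OIDC login to Azure
  contents: read

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Sign in first; the az commands below need an authenticated session.
      # The AZURE_* secret names are placeholders for your own federated credential.
      - name: Azure login
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Build container image
        run: docker build -t ${{ secrets.ACR_LOGIN_SERVER }}/web:${{ github.sha }} .

      - name: Push to ACR
        run: |
          az acr login --name ${{ secrets.ACR_NAME }}
          docker push ${{ secrets.ACR_LOGIN_SERVER }}/web:${{ github.sha }}

      - name: Deploy to ACA
        run: |
          az containerapp update \
            --name web \
            --resource-group rg-saas-app-prod-uks \
            --image ${{ secrets.ACR_LOGIN_SERVER }}/web:${{ github.sha }}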

Use ACA revisions for safer rollouts—deploy a new revision, validate, then shift traffic.

Infrastructure as Code

Use Bicep or Terraform to make cost controls repeatable:

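// Example: ACA with scale-to-zero
param location string = resourceGroup().location
param environmentId string  // Resource ID of an existing Container Apps environment
param image string          // Full image reference, e.g. from your ACR

resource webApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: 'web'
  location: location
  properties: {
    environmentId: environmentId
    configuration: {
      ingress: {
        external: true
        targetPort: 8080
      }
    }
    template: {
      containers: [
        {
          name: 'web'
          image: image
          resources: {
            cpu: json('0.25')  // Matches the starting size in the cost checklist
            memory: '0.5Gi'
          }
        }
      ]
      scale: {
        minReplicas: 0  // Scale to zero!
        maxReplicas: 10
        rules: [
          {
            name: 'http-scaling'
            http: {
              metadata: {
                concurrentRequests: '50'
              }
            }
          }
        ]
      }
    }
  }
}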

Ops and residency

Regional considerations

The reference architecture assumes deployment to a single region initially. For UK-based SaaS with UK data residency requirements:

Primary region: UK South (London)
Paired region (DR): UK West (Cardiff), if needed later.

Azure regions to consider for other markets:

Market	Primary region	Notes
UK	UK South	Good service availability
EU	West Europe (Netherlands) or France Central	GDPR-aligned
US	East US or West US 2	Broadest service availability
APAC	Australia East or Southeast Asia	Depends on customer location

Data residency checklist

If data residency matters to your customers:

SQL Database in target region
Blob Storage in target region (use LRS or ZRS to keep every copy in-region; geo-redundant options replicate to the paired region)
Backups configured to stay in-region
Logs and telemetry (App Insights workspace) in-region
Document your residency posture for customer due diligence

Operational runbook basics

Even at MVP stage, document:

How to check if things are healthy (App Insights dashboard, key metrics)
How to restart a stuck service (ACA revision restart)
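How to replay failed queue messages (Storage Queues have no built-in dead-letter queue; Azure Functions moves repeatedly failing messages to a `<queue-name>-poison` queue)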
How to restore from backup (SQL point-in-time restore, blob versioning)
Who to contact for each external dependency (payment provider support, email service)

Cost control checklist

Here’s the concrete checklist to keep costs minimal:

Compute

ACA apps: minReplicas: 0
ACA apps: right-sized CPU/memory (start with 0.25 CPU, 0.5 Gi)
Container images: multi-stage builds, small base images
Functions: Consumption or Flex Consumption plan (not Premium unless needed)

Background processing

Queue-triggered, not polling
Timer triggers only for truly periodic work
Batch operations where possible

Data

SQL: Serverless tier with auto-pause
SQL: Query optimisation (fewer roundtrips, projections)
Blob: Lifecycle policies for old data
Blob: Appropriate redundancy (LRS is cheapest)

Observability

App Insights: Sampling enabled (start with 20%)
App Insights: Retention set appropriately (30-90 days)
Alerts: High-signal only, avoid alert fatigue

Governance

Budget alerts configured per subscription
Resources tagged for cost attribution
Monthly cost review in calendar
“Kill switch” process documented for runaway costs

Honest trade-offs

Let’s be upfront about what you’re accepting with this architecture:

Cold starts are real

When your app has been idle, the first request will be slower:

ACA: 2-10 seconds depending on image size and startup time
Functions (Consumption): 1-5 seconds
SQL (Serverless): 5-60 seconds for first query after pause

Mitigations:

Aggressive edge caching for common requests
Loading states in UI
Keep-alive pings for critical paths (but this defeats scale-to-zero)
Premium plans if cold start is truly unacceptable (costs more)
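Whatever mitigation you pick, the app still has to survive the SQL resume window itself: the first connection attempt after a pause can fail or time out while the database wakes up. A minimal retry-with-backoff sketch follows. The `with_resume_retry` name and its defaults are hypothetical, and the exception types to retry on depend on your database driver.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_resume_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    first_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Run operation, retrying with exponential backoff on transient errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(delay)
            delay *= 2  # waits of 2s, 4s, 8s, 16s with the defaults
    raise AssertionError("unreachable")
```

With the defaults this waits up to 30 seconds in total, so pair it with a loading state in the UI and tune the numbers to the resume times you actually observe.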

SQL isn’t truly zero if you’re constantly poking it

Serverless SQL pauses after a configurable idle period (default: 1 hour). If your app makes a health check query every minute, it never pauses.

Design around this:

Health checks shouldn’t hit SQL
Batch background jobs to run together
Accept that active development means SQL stays warm

Some costs are irreducible

No matter how clever you are:

DNS: ~£0.50/month per zone
Cloudflare: Free tier exists, but WAF features cost
Storage: Existence costs money (pennies, but not zero)
App Insights: Ingestion has a cost floor
SSL certificates: Free via Let’s Encrypt/Cloudflare, but management overhead

For a minimal SaaS in idle state, expect £10-25/month baseline, not literally zero.

Complexity tax rises fast

Every additional service adds:

Another thing to monitor
Another thing to secure
Another thing to understand when debugging
Another thing to pay for

Start simple:

Storage Queues before Service Bus
Single region before multi-region
App-level caching before Redis
Manual processes before automation

Add sophistication only when you’ve proven you need it.

Architecture diagrams

To visualise the key flows in this architecture:

1. Request path (synchronous)

[Diagram: request path (synchronous)]

2. Async job path (background processing)

[Diagram: async job path (background processing)]

3. Webhook path (external events)

[Diagram: webhook path (external events)]

Next steps

If you’re ready to implement this architecture:

Start with the app tier: Get a basic ACA deployment with minReplicas: 0 working
Add the data layer: Provision SQL Serverless and Blob Storage
Implement one async flow: Pick something simple (email notification) to prove the queue pattern
Add observability: Configure App Insights with sensible sampling
Iterate: Add complexity only as you prove you need it

The goal isn’t perfection, and this is where a lot of people stumble when they first start thinking about building a SaaS product, either on the side or as a full-time solo endeavour. The goal is a production-grade foundation that costs next to nothing while you’re finding product-market fit.

Summary

Scale-to-zero on Azure is achievable for SaaS products, but it requires intentional architecture:

Separate fast from slow—queue long operations
Event-driven by default—no idle polling
Edge caching—reduce cold start impact
Serverless data—SQL that can pause
Security from day one—managed identity everywhere
Honest about trade-offs—cold starts, baseline costs, complexity tax

The patterns in this playbook have been battle-tested on real SaaS products. They won’t make your cloud bill literally zero, but they’ll keep it minimal while you focus on what matters: building something customers want to pay for.
