
Scale-to-Zero Architecture in Azure for SaaS Products

Published:  at  09:00 AM
·
19 min read
· By Joseph Tomkinson
Deep Dives
Human + AI

Scale to Zero Azure SaaS Architecture


Introduction

If you’re building a SaaS product as an individual contributor, you’ll almost certainly have had the same thought I did: how do I keep capital investment as low as possible? The capital is your own money, so every bit counts. At the same time, if you have experience of the software delivery life-cycle, you know it’s generally a good idea to build something that still behaves production-grade; a solution that won’t fall over the moment traffic arrives.

This blog post, part of the “Playbooks” series, walks through a scale-to-zero architecture on Azure: one that can idle at near-zero cost while maintaining a clean path to “real SaaS” with authentication, payments, async jobs, observability, and sensible security defaults.

I’ll use a reference architecture I’ve called “SaaS-Light” throughout, but the design considerations apply to most early-stage SaaS products where cost control is critical.

What you’ll learn

Prerequisites

Before diving in, you should be comfortable with:


What we’re optimising for

Let’s be explicit about the goals; these shape every decision downstream.

Primary goal: Near-zero idle cost while keeping a clean path to production-grade SaaS (auth, payments, async jobs, observability, security).

Secondary goals:

Defining “scale to zero” (pragmatically)

Let’s be honest about what “scale to zero” actually means in practice:

| Layer | Can it hit zero? | Reality check |
| --- | --- | --- |
| Compute | Yes | Container Apps and Functions can scale to 0 replicas |
| Background processing | Yes | Event-driven/queued, runs only when needed |
| Data tier | Mostly | Serverless SQL can pause, but constant pokes keep it awake |
| Edge/CDN | No | DNS, WAF, and caching always incur a baseline cost |
| Monitoring | No | Application Insights has minimum ingestion costs |
| Storage | No | Blob storage charges for existence, not just access |

The goal isn’t literally zero; it’s near-zero with a clear understanding of what’s irreducible.


Core principles

These five principles form the “rules of the game” for this architecture:

1. Separate request/response from long work

Your web/API layer should handle fast work only. Anything that takes more than a couple of seconds becomes a job; I’d typically say queue it, process it asynchronously, and update the status.

❌ User clicks "Refresh Prices" → API blocks for 30 seconds → Returns result
✅ User clicks "Refresh Prices" → API enqueues job → Returns immediately → Worker processes → UI polls for status

2. Event-driven everything

Prefer HTTP triggers + queue triggers over always-on schedulers or polling loops. If nothing’s happening, nothing should be running.

3. Cold-start tolerant UX

Cold starts are real. Design for them:

4. Cost controls are architecture, not finance

Don’t treat cost management as a finance team problem. Bake it into the architecture:

5. Security is default, not a later patch

Managed identity, Key Vault, least privilege, defensive data boundaries: these aren’t “nice to haves” for later; they’re cheaper to build in from day one. So make sure you think about them, and if you’re utilising agentic AI workflows, include security reviews in your process.


Reference architecture

Here’s the high-level view of what we’re building:

Edge layer

Cloudflare (or equivalent) handles:

Why Cloudflare? It’s cost-effective, has a generous free tier, and the Workers/Rules ecosystem is mature. Azure Front Door is an alternative if you prefer staying in-ecosystem.

App tier (compute that scales to zero)

Azure Container Apps (ACA) hosts your application containers:

| App | Purpose | Scale rule |
| --- | --- | --- |
| web | Customer UI + API (e.g., Blazor Server, ASP.NET Core) | HTTP concurrency, min replicas = 0 |
| admin | Internal/admin portal | HTTP concurrency, min replicas = 0 |

Container Apps is the sweet spot here: it’s simpler than Kubernetes, supports scale-to-zero natively, and handles ingress/TLS automatically.

In the table above I’ve set both the public and admin apps to scale to zero. You might choose to keep the public app warm to give end users a better experience (less cold-start impact). The admin app is a great candidate for scale-to-zero since it’s used infrequently (or at least only by a small number of users).

Async tier (also scales to zero)

Pick one (or mix based on workload):

Azure Functions (Consumption or Flex plan):

Azure Container Apps Jobs:

Integration and messaging

Azure Storage Queues for most scenarios:

Azure Service Bus when you need:

Start with Storage Queues. Graduate to Service Bus when you hit its limitations. Please bear in mind that Azure Queue Storage is a lightweight queue and doesn’t provide a FIFO guarantee (ordering is best-effort). If you need guaranteed ordered processing and other broker features, Azure Service Bus can provide FIFO via Sessions (Standard/Premium), but it typically comes at a higher cost than Storage Queues.

Data layer

Azure SQL Database (Serverless tier):

Perfect for: relational domain data (users, collections, subscriptions, pricing metadata).

Azure Blob Storage:

What about Redis?

Be honest with yourself: Redis (Azure Cache for Redis) is rarely “scale-to-zero”. The cheapest tier still costs money when idle.

Instead:

  1. Use app-level caching (in-memory, with short TTL)
  2. Use edge caching (Cloudflare cache rules)
  3. Add Redis only when you’ve proven you need distributed cache
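
For the first option, here’s a minimal sketch using ASP.NET Core’s built-in IMemoryCache with a short TTL; CatalogService and the cache key are illustrative names rather than anything from the real codebase:

// In Program.cs (Microsoft.Extensions.Caching.Memory, included with ASP.NET Core)
builder.Services.AddMemoryCache();

app.MapGet("/api/catalog/sets", async (IMemoryCache cache, CatalogService catalog) =>
{
    var sets = await cache.GetOrCreateAsync("catalog:sets", async entry =>
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5); // short TTL
        return await catalog.GetSetsAsync();
    });
    return Results.Ok(sets);
});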

Config and secrets

Azure Key Vault:

Azure App Configuration:

Observability

Application Insights:

Cost control tips:


Workload mapping

These are some concrete examples I ran into, so let’s map real SaaS flows to this architecture using my first SaaS project as the reference:

Authentication flow

User → Cloudflare → ACA (web) → Azure AD B2C / Entra External ID

                              Token validation at app edge

Options:

The key: token validation happens at your app edge, not deep in your business logic.
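
As a minimal sketch, this is what “validation at the app edge” looks like with the standard JwtBearer middleware; the authority, audience, and GetCollections handler are placeholders rather than values from the real app:

// In Program.cs (Microsoft.AspNetCore.Authentication.JwtBearer)
builder.Services
    .AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
    .AddJwtBearer(options =>
    {
        options.Authority = "https://your-tenant.ciamlogin.com/your-tenant-id/v2.0"; // placeholder
        options.Audience = "api://your-api-client-id";                               // placeholder
    });
builder.Services.AddAuthorization();

app.UseAuthentication();
app.UseAuthorization();

// Business endpoints only declare the requirement; they never parse tokens themselves
app.MapGet("/api/collections", GetCollections).RequireAuthorization();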

Email notifications

App event (sign-up, password reset)
        ↓
    Storage Queue
        ↓
    Azure Function
        ↓
    Email provider API (Resend, Postmark, SendGrid)

Why queue it? Email providers can be slow or rate-limited. Your API shouldn’t block on email delivery.
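
As a sketch of the enqueue side, using the Azure.Storage.Queues SDK with managed identity (the storage account, queue name, and message shape are illustrative):

using Azure.Identity;
using Azure.Storage.Queues;
using System.Text.Json;

// Base64 encoding keeps the message compatible with the Functions queue trigger
var queue = new QueueClient(
    new Uri("https://yourstorageaccount.queue.core.windows.net/email-notifications"),
    new DefaultAzureCredential(),
    new QueueClientOptions { MessageEncoding = QueueMessageEncoding.Base64 });

await queue.CreateIfNotExistsAsync();

var payload = JsonSerializer.Serialize(new
{
    Template = "password-reset",
    To = "user@example.com",          // placeholder
    CorrelationId = Guid.NewGuid()
});

await queue.SendMessageAsync(payload);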

Read-heavy endpoints (search, collection views)

User → Cloudflare (cache hit?) → ACA (web) → SQL
              ↑                        ↓
         Cache response           Query + cache miss

Cache public/catalog-like data aggressively at the edge. Use Cache-Control headers and Cloudflare cache rules.

Price refresh (the async pattern in detail)

This is the canonical example of “separate fast from slow”:

1. User clicks "Refresh Prices"
2. API validates request, enqueues job, returns job ID immediately
3. Worker picks up job from queue
4. Worker calls external pricing API
5. Worker writes pricing rows to SQL
6. Worker updates job status to "complete"
7. UI polls job status, shows progress, displays results

Key design choice: Price refresh is never synchronous. The user gets immediate feedback, and the actual work happens in the background.
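
A minimal sketch of the API side of steps 1, 2, and 7 (RefreshRequest, JobStore, and the route names are hypothetical; the worker side mirrors the queue-trigger example later in this post):

// Step 2: validate, enqueue, return a job ID immediately
app.MapPost("/api/prices/refresh", async (RefreshRequest request, QueueClient queue, JobStore jobs) =>
{
    var jobId = Guid.NewGuid();
    await jobs.CreateAsync(jobId, status: "queued");
    await queue.SendMessageAsync(JsonSerializer.Serialize(new { jobId, request.CollectionId }));
    return Results.Accepted($"/api/jobs/{jobId}", new { jobId });   // 202 + a URL to poll
});

// Step 7: the UI polls this endpoint until the worker marks the job complete
app.MapGet("/api/jobs/{jobId:guid}", async (Guid jobId, JobStore jobs) =>
    await jobs.GetAsync(jobId) is { } job ? Results.Ok(job) : Results.NotFound());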


Achieving scale-to-zero (practical configuration)

Here’s where theory meets YAML. Let’s look at the actual knobs (don’t giggle at the phrasing). I should also point out that YAML is not my strong suit.

Azure Container Apps configuration

# Excerpt from ACA deployment
properties:
  configuration:
    ingress:
      external: true
      targetPort: 8080
  template:
    scale:
      minReplicas: 0      # ← This is the magic
      maxReplicas: 10
      rules:
        - name: http-scaling
          http:
            metadata:
              concurrentRequests: "50"

Additional optimisations:

Azure Functions (Consumption plan)

{
  "bindings": [
    {
      "name": "queueItem",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "price-refresh-jobs",
      "connection": "AzureWebJobsStorage"
    }
  ]
}

The function only runs when there’s work in the queue. No queue messages = no execution = no cost.
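
The equivalent trigger in C# with the isolated worker model (Microsoft.Azure.Functions.Worker); PriceRefreshJob and the processing steps are illustrative:

using System.Text.Json;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class PriceRefreshFunction
{
    private readonly ILogger<PriceRefreshFunction> _logger;

    public PriceRefreshFunction(ILogger<PriceRefreshFunction> logger) => _logger = logger;

    [Function("ProcessPriceRefresh")]
    public async Task Run(
        [QueueTrigger("price-refresh-jobs", Connection = "AzureWebJobsStorage")] string message)
    {
        var job = JsonSerializer.Deserialize<PriceRefreshJob>(message);
        _logger.LogInformation("Processing price refresh job {JobId}", job!.JobId);

        // Call the external pricing API, write rows to SQL, mark the job complete
    }
}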

Azure SQL Serverless

-- Configure auto-pause (via Azure Portal or ARM/Bicep)
-- Typical settings:
-- Auto-pause delay: 60 minutes
-- Min vCores: 0.5
-- Max vCores: 4

Design your app to minimise SQL chatter:

Edge caching strategy

Configure Cloudflare (or your edge provider) to cache appropriately:

| Endpoint pattern | Cache strategy |
| --- | --- |
| /api/catalog/* | Cache 1 hour, stale-while-revalidate |
| /api/user/* | No cache (private data) |
| /static/* | Cache indefinitely (hashed filenames) |
| /api/health | No cache |

Use stale-while-revalidate for data that’s okay to be slightly stale. Users get fast responses while the cache refreshes in the background.
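
On the app side, that just means emitting headers the edge can honour. A minimal sketch (the route, TTLs, and CatalogService are illustrative):

app.MapGet("/api/catalog/{setCode}", async (string setCode, HttpContext context, CatalogService catalog) =>
{
    // Cache at the edge for an hour; serve stale for up to 10 minutes while revalidating
    context.Response.Headers.CacheControl = "public, max-age=3600, stale-while-revalidate=600";
    return Results.Ok(await catalog.GetSetAsync(setCode));
});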


Data and storage patterns

SQL schema approach

Even if v1 is effectively single-tenant, design for multi-tenancy from day one:

-- Every significant table includes TenantId
CREATE TABLE Collections (
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    TenantId UNIQUEIDENTIFIER NOT NULL,  -- Ready for multi-tenant
    Name NVARCHAR(200) NOT NULL,
    CreatedAt DATETIME2 NOT NULL,
    -- ...
    INDEX IX_Collections_TenantId (TenantId)
);

This saves painful migrations later.
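
On the application side, you can enforce the same boundary with an EF Core global query filter, so tenant scoping isn’t left to individual queries. A sketch, where ITenantProvider is a hypothetical service that resolves the current tenant:

using Microsoft.EntityFrameworkCore;

public class AppDbContext : DbContext
{
    private readonly ITenantProvider _tenant;

    public AppDbContext(DbContextOptions<AppDbContext> options, ITenantProvider tenant)
        : base(options) => _tenant = tenant;

    public DbSet<Collection> Collections => Set<Collection>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Every query against Collections is automatically scoped to the current tenant
        modelBuilder.Entity<Collection>()
            .HasQueryFilter(c => c.TenantId == _tenant.TenantId);
    }
}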

Pricing history (append-only pattern)

For data that grows over time (price history, audit logs), use append-only tables:

-- Append-only history
CREATE TABLE PriceHistory (
    Id BIGINT IDENTITY PRIMARY KEY,
    CardId UNIQUEIDENTIFIER NOT NULL,
    Price DECIMAL(10,2) NOT NULL,
    Currency CHAR(3) NOT NULL,
    RecordedAt DATETIME2 NOT NULL,
    Source NVARCHAR(50) NOT NULL,
    INDEX IX_PriceHistory_CardId_RecordedAt (CardId, RecordedAt DESC)
);

-- Materialised current price (updated by worker)
CREATE TABLE CurrentPrices (
    CardId UNIQUEIDENTIFIER PRIMARY KEY,
    Price DECIMAL(10,2) NOT NULL,
    Currency CHAR(3) NOT NULL,
    LastUpdated DATETIME2 NOT NULL
);

Query current prices from CurrentPrices (fast, single row per card). Query history from PriceHistory only when needed.

Blob storage for images

Use deterministic, hierarchical paths; the example below uses TCG card images (why not?):

cards/{setCode}/{cardNumber}/{variant}.jpg

Examples:
cards/ONE/042/normal.jpg
cards/ONE/042/foil.jpg
cards/ONE/042/extended-art.jpg

For public assets, use a public container with CDN. For private assets, generate SAS tokens with short expiry.
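
For the private case, a sketch of issuing a short-lived, read-only SAS; containerClient is an existing BlobContainerClient, and this assumes it was constructed with a credential that can sign SAS tokens (such as an account key; with managed identity you’d use a user delegation key instead):

using Azure.Storage.Sas;

var blobClient = containerClient.GetBlobClient("cards/ONE/042/normal.jpg");

if (blobClient.CanGenerateSasUri)
{
    Uri sasUri = blobClient.GenerateSasUri(
        BlobSasPermissions.Read,
        DateTimeOffset.UtcNow.AddMinutes(15));   // short expiry

    // Return sasUri to the caller instead of the raw blob URL
}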


Security baseline

This is the minimum viable security posture: enough to be taken seriously without a dedicated security team.

Managed identity everywhere

ACA (web) ──managed identity──→ Azure SQL
          ──managed identity──→ Key Vault
          ──managed identity──→ Blob Storage
          ──managed identity──→ Storage Queue

Azure Functions ──managed identity──→ Same resources

No connection strings in environment variables. No secrets in config files. Managed identity handles authentication automatically.
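
In code, that mostly means handing every client a DefaultAzureCredential and keeping passwords out of connection strings. A sketch (the storage account, server, and database names are placeholders):

using Azure.Identity;
using Azure.Storage.Blobs;
using Azure.Storage.Queues;

var credential = new DefaultAzureCredential();   // managed identity in Azure, dev credentials locally

builder.Services.AddSingleton(new BlobServiceClient(
    new Uri("https://yourstorageaccount.blob.core.windows.net"), credential));

builder.Services.AddSingleton(new QueueClient(
    new Uri("https://yourstorageaccount.queue.core.windows.net/price-refresh-jobs"), credential));

// Azure SQL via Microsoft.Data.SqlClient: no password, same credential chain
var sqlConnectionString =
    "Server=tcp:your-server.database.windows.net,1433;Database=your-db;" +
    "Authentication=Active Directory Default;Encrypt=True;";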

Key Vault for third-party secrets only

Store only what you can’t avoid:

Access via managed identity:

// In startup (Program.cs); requires the Azure.Extensions.AspNetCore.Configuration.Secrets
// and Azure.Identity packages. DefaultAzureCredential resolves to the managed identity in Azure.
builder.Configuration.AddAzureKeyVault(
    new Uri("https://your-vault.vault.azure.net/"),
    new DefaultAzureCredential()
);

Network security (pragmatic approach)

Start simple, add complexity when justified:

  1. Phase 1 (MVP): Public endpoints + Cloudflare WAF + rate limiting
  2. Phase 2 (when needed): Private endpoints for SQL, VNet integration
  3. Phase 3 (enterprise): Full network isolation, Azure Firewall

Don’t over-engineer network security before you have traffic worth protecting; this is typically a waste of time and money in early stages. You may not have even validated that your SaaS product has a market yet.

Admin portal protection

I like building admin portals as separate apps. I almost always want to automate some admin tasks, depending on the particular flavour of SaaS product, and I like to keep those isolated from the public-facing apps.

The admin portal (admin app) needs extra protection:


Payment and wallet model

For SaaS with subscriptions and usage-based billing, consider the patterns below.

Subscription lifecycle

Payment provider webhook
        ↓
    Webhook endpoint (validate signature)
        ↓
    Storage Queue
        ↓
    Worker (idempotent processing)
        ↓
    Update subscription state in SQL

Idempotency is critical. Payment webhooks can be delivered multiple times. Use the event ID to deduplicate. For those not familiar with the term, idempotent processing means that processing the same event multiple times has the same effect as processing it once.
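
A sketch of the deduplication step inside the worker, using a Dapper-style ExecuteAsync over an IDbConnection (_db); the ProcessedWebhookEvents table and ApplySubscriptionChangeAsync helper are hypothetical:

public async Task HandleWebhookEventAsync(string eventId, string payload)
{
    // Insert the event ID only if we haven't seen it before; duplicates insert zero rows
    var inserted = await _db.ExecuteAsync(
        @"INSERT INTO ProcessedWebhookEvents (EventId, ReceivedAt)
          SELECT @EventId, SYSUTCDATETIME()
          WHERE NOT EXISTS (SELECT 1 FROM ProcessedWebhookEvents WHERE EventId = @EventId);",
        new { EventId = eventId });

    if (inserted == 0)
    {
        return; // duplicate delivery, already handled
    }

    await ApplySubscriptionChangeAsync(payload); // update subscription state in SQL
}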

Wallet as immutable ledger

Don’t store just a balance field—store a ledger:

CREATE TABLE WalletTransactions (
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    WalletId UNIQUEIDENTIFIER NOT NULL,
    Amount DECIMAL(10,2) NOT NULL,      -- Positive = credit, negative = debit
    TransactionType NVARCHAR(50) NOT NULL,
    ReferenceId NVARCHAR(200) NULL,     -- External reference (payment ID, etc.)
    CreatedAt DATETIME2 NOT NULL,
    Description NVARCHAR(500) NULL
);

-- Balance is calculated: SUM(Amount) WHERE WalletId = @WalletId
-- Or maintain a materialised balance updated transactionally

This gives you auditability and makes reconciliation possible.


Deployment and environments

Subscription strategy

For early-stage SaaS, keep it simple:

Azure Tenant (your org)
├── Subscription: saas-dev
│   ├── rg-saas-app-dev-uks
│   ├── rg-saas-data-dev-uks
│   └── rg-saas-ops-dev-uks

└── Subscription: saas-prod
    ├── rg-saas-app-prod-uks
    ├── rg-saas-data-prod-uks
    └── rg-saas-ops-prod-uks

Separate subscriptions per environment:

Resource group organisation

Group by workload, not by “dev vs prod”:

| Resource group | Contains |
| --- | --- |
| rg-{product}-app-{env}-{region} | ACA apps, Functions |
| rg-{product}-data-{env}-{region} | SQL, Storage accounts |
| rg-{product}-ops-{env}-{region} | Key Vault, App Config, monitoring |

CI/CD with GitHub Actions

# Simplified deployment flow
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Build container image
        run: docker build -t ${{ secrets.ACR_LOGIN_SERVER }}/web:${{ github.sha }} .
      
      - name: Push to ACR
        run: |
          az acr login --name ${{ secrets.ACR_NAME }}
          docker push ${{ secrets.ACR_LOGIN_SERVER }}/web:${{ github.sha }}
      
      - name: Deploy to ACA
        run: |
          az containerapp update \
            --name web \
            --resource-group rg-deckfolio-app-prod-uks \
            --image ${{ secrets.ACR_LOGIN_SERVER }}/web:${{ github.sha }}

Use ACA revisions for safer rollouts—deploy a new revision, validate, then shift traffic.

Infrastructure as Code

Use Bicep or Terraform to make cost controls repeatable:

// Example: ACA with scale-to-zero
resource webApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: 'web'
  location: location
  properties: {
    configuration: {
      ingress: {
        external: true
        targetPort: 8080
      }
    }
    template: {
      scale: {
        minReplicas: 0  // Scale to zero!
        maxReplicas: 10
        rules: [
          {
            name: 'http-scaling'
            http: {
              metadata: {
                concurrentRequests: '50'
              }
            }
          }
        ]
      }
    }
  }
}

Ops and residency

Regional considerations

The reference architecture assumes deployment to a single region initially. For UK-based SaaS with UK data residency requirements:

Azure regions to consider for other markets:

| Market | Primary region | Notes |
| --- | --- | --- |
| UK | UK South | Good service availability |
| EU | West Europe (Netherlands) or France Central | GDPR-aligned |
| US | East US or West US 2 | Broadest service availability |
| APAC | Australia East or Southeast Asia | Depends on customer location |

Data residency checklist

If data residency matters to your customers:

Operational runbook basics

Even at MVP stage, document:

  1. How to check if things are healthy (App Insights dashboard, key metrics)
  2. How to restart a stuck service (ACA revision restart)
  3. How to replay failed queue messages (dead-letter queue handling)
  4. How to restore from backup (SQL point-in-time restore, blob versioning)
  5. Who to contact for each external dependency (payment provider support, email service)

Cost control checklist

Here’s the concrete checklist to keep costs minimal:

Compute

Background processing

Data

Observability

Governance


Honest trade-offs

Let’s be upfront about what you’re accepting with this architecture:

Cold starts are real

When your app has been idle, the first request will be slower:

Mitigations:

SQL isn’t truly zero if you’re constantly poking it

Serverless SQL pauses after a configurable idle period (default: 1 hour). If your app makes a health check query every minute, it never pauses.

Design around this:
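
For example, keep routine liveness probes away from the database so platform health checks don’t reset the auto-pause timer. A minimal sketch using ASP.NET Core’s built-in health checks (add a SQL-backed check only to a separate, rarely-polled readiness endpoint if you need one):

// In Program.cs
builder.Services.AddHealthChecks();     // no database check registered on the liveness path

app.MapHealthChecks("/api/health");     // probes hit this without waking Serverless SQL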

Some costs are irreducible

No matter how clever you are:

For a minimal SaaS in idle state, expect £10-25/month baseline, not literally zero.

Complexity tax rises fast

Every additional service adds:

Start simple:

Add sophistication only when you’ve proven you need it.


Architecture diagrams

To visualise the key flows in this architecture:

1. Request path (synchronous)

2. Async job path (background processing)

3. Webhook path (external events)


Next steps

If you’re ready to implement this architecture:

  1. Start with the app tier: Get a basic ACA deployment with minReplicas: 0 working
  2. Add the data layer: Provision SQL Serverless and Blob Storage
  3. Implement one async flow: Pick something simple (email notification) to prove the queue pattern
  4. Add observability: Configure App Insights with sensible sampling
  5. Iterate: Add complexity only as you prove you need it

The goal isn’t perfection; this is where a lot of people stumble when they first start thinking about building a SaaS product, whether on the side or as a full-time solo endeavour. The goal here is a production-grade foundation that costs next to nothing while you’re finding product-market fit.


Summary

Scale-to-zero on Azure is achievable for SaaS products, but it requires intentional architecture:

The patterns in this playbook have been battle-tested on real SaaS products. They won’t make your cloud bill literally zero, but they’ll keep it minimal while you focus on what matters: building something customers want to pay for.

