Hosting a Single Page Application in AWS

Last year I started using Single SPA as a framework for a microfrontend-based single page application.

Part of this was figuring out how to even host a single page application in AWS. I wanted to do this without having to run servers, relying instead on AWS services only. This is the solution I arrived at.

I’ll write some follow-ups on the rest of the deploy process for a Single SPA application as well, but this was a good chunk of work, so it made sense to have it stand alone. This bit will also work for any single page application.

The example infrastructure is here as Terraform code.
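
All of the Terraform below assumes a fairly standard AWS provider setup. One detail worth calling out: certificates attached to CloudFront must be provisioned in us-east-1, so this sketch assumes the provider points there (the version pin is my own assumption; adjust to taste).

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  # ACM certificates used by CloudFront must live in us-east-1
  region = "us-east-1"
}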

Deploying any Single Page Application to AWS

The core problem here is that we need a place to host files and a way to serve them.

S3 is the move for hosting files in AWS. While S3 can do static website hosting itself, it cannot serve custom HTTPS certificates, which is a deal breaker.

So to actually serve files, the choice is to put CloudFront in front of S3 as a content delivery network (CDN). CloudFront provides lower latency to users and cheaper data transfer costs than serving directly out of S3.

S3 Bucket Configuration

The way CloudFront talks with S3 is via an Origin Access Control or an Origin Access Identity. A single page app doesn’t really need the extra Origin Access Control features, so we’ll stick with an Access Identity here. The Access Identity lets CloudFront identify itself, and an S3 bucket policy then allows the CloudFront distribution to access files. Confusing? That’s IAM! This article on service policies vs IAM entity permissions might help.

Here’s an example bucket configuration:

resource "aws_s3_bucket" "spa" {
  bucket = "example-spa-app"
  acl    = "private"
}

resource "aws_cloudfront_origin_access_identity" "spa" {
  comment = "example spa identity"
}

data "aws_iam_policy_document" "spa" {
  # allow the origin access identity to get objects
  statement {
    actions = ["s3:GetObject"]
    resources = [
      "${aws_s3_bucket.spa.arn}/*",
    ]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.spa.iam_arn]
    }
  }

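  # also allow listing the bucket; without s3:ListBucket, requests for
  # missing objects come back as 403 instead of 404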
  statement {
    actions = ["s3:ListBucket"]
    resources = [
      aws_s3_bucket.spa.arn,
    ]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.spa.iam_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "spa" {
  bucket = aws_s3_bucket.spa.bucket
  policy = data.aws_iam_policy_document.spa.json
}

HTTPS Certificates in AWS

AWS offers Certificate Manager (ACM) to provision and maintain TLS certificates that can be attached to AWS resources — like a CloudFront distribution!

The ideal way to do this is to have your Single SPA app’s DNS hosted in AWS Route53 as well. This could be a top level domain, or a subdomain delegated to Route53 DNS. If that’s in place, ACM can provide DNS records to validate the certificate. Otherwise the validation happens via email and requires manual intervention yearly to keep the certificate valid. DNS is a much nicer solution.

This example uses app.example.com as the domain for the certificate and the Route53 zone:

resource "aws_route53_zone" "spa" {
  name = "app.example.com" # CHANGEME
}

resource "aws_acm_certificate" "spa" {
  domain_name       = aws_route53_zone.spa.name # app.example.com or whatever
  validation_method = "DNS"
  subject_alternative_names = [
    "*.${aws_route53_zone.spa.name}",
  ]

  # in case we need to recreate, make a new one first
  # then remove the old
  lifecycle {
    create_before_destroy = true
  }
}

# dynamically set up the validation records so acm
# can verify we own the domain
resource "aws_route53_record" "spa-certificate" {
  for_each = {
    for dvo in aws_acm_certificate.spa.domain_validation_options : dvo.domain_name => {
      domain = dvo.domain_name
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  type            = each.value.type
  ttl             = 600
  zone_id         = aws_route53_zone.spa.zone_id
  records         = [each.value.record]

  lifecycle {
    create_before_destroy = true
  }
}

# this will "wait" for the cert to become valid on creation
# so we don't spin up resources with pending certs
resource "aws_acm_certificate_validation" "spa" {
  certificate_arn         = aws_acm_certificate.spa.arn
  validation_record_fqdns = [for record in aws_route53_record.spa-certificate : record.fqdn]

  lifecycle {
    create_before_destroy = true
  }
}

CloudFront Configuration

With the S3 bucket and TLS certificate in place, we can provision a CloudFront distribution to serve our files.

With a single page application, we want to support navigating around the application via history.pushState and the like, as well as landing on any URL and being served the application. By default, the second piece is a problem: if a file is not found on S3, CloudFront will 404. This is avoidable with some custom error responses that rewrite the response.

CloudFront also lets you define cache behaviors via a cache policy, with minimum and maximum time-to-live values, which headers to forward, and so on.

locals {
  spa_min_ttl     = 300
  spa_default_ttl = 3600
  spa_max_ttl     = 86400
}

resource "aws_cloudfront_cache_policy" "spa" {
  name        = "spa-example"
  default_ttl = local.spa_default_ttl
  min_ttl     = local.spa_min_ttl
  max_ttl     = local.spa_max_ttl
  parameters_in_cache_key_and_forwarded_to_origin {
    # headers_config is required even when no headers are part of the cache key
    headers_config {
      header_behavior = "none"
    }
    query_strings_config {
      query_string_behavior = "none"
    }
    cookies_config {
      cookie_behavior = "none"
    }
  }
}

The distribution itself can be configured to use the custom domain name and to redirect HTTP to HTTPS.

resource "aws_cloudfront_distribution" "spa" {
  enabled             = true
  is_ipv6_enabled     = true
  comment             = "${aws_route53_zone.spa.name} distribution"
  default_root_object = "index.html" # point this at your app's rendered index.html!

  origin {
    domain_name = aws_s3_bucket.spa.bucket_regional_domain_name # regional endpoint avoids S3 redirect issues
    origin_id   = "main"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.spa.cloudfront_access_identity_path
    }
  }

  aliases = [
    aws_route53_zone.spa.name,
  ]

  default_cache_behavior {
    target_origin_id       = "main"
    allowed_methods        = ["GET", "HEAD", "OPTIONS"]
    cached_methods         = ["GET", "HEAD"]
    compress               = true
    viewer_protocol_policy = "redirect-to-https" # redirect to HTTPS :tada:
    cache_policy_id        = aws_cloudfront_cache_policy.spa.id
  }

  viewer_certificate {
    # here's that certificate validation resource :tada:
    acm_certificate_arn      = aws_acm_certificate_validation.spa.certificate_arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021" # modern baseline; plain TLSv1 is deprecated
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  # these two custom_error_response blocks make the spa work: if we hit a URL
  # that does not exist in the bucket, we _must_ serve the index file so
  # routing can kick in and such.
  custom_error_response {
    error_code         = 403
    response_code      = 200
    response_page_path = "/index.html"
  }

  custom_error_response {
    error_code         = 404
    response_code      = 200
    response_page_path = "/index.html"
  }
}

The most important bits above are the custom_error_response paths. When a user lands on a non-asset URL, what would previously have been a 404 response is rewritten to a 200 OK response serving index.html, which loads the application and kicks off routing. Same story for the 403 Forbidden responses CloudFront may get when trying to access an S3 object it does not have permission to access.

The very last thing here is to connect the CloudFront distribution with DNS. This is done via an alias record.

resource "aws_route53_record" "spa" {
  zone_id = aws_route53_zone.spa.zone_id
  name    = aws_route53_zone.spa.name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.spa.domain_name
    zone_id                = aws_cloudfront_distribution.spa.hosted_zone_id
    evaluate_target_health = false
  }
}
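
It can also be handy to expose the distribution ID and bucket name as Terraform outputs for use in the deploy scripts later on (the output names here are made up for this example):

output "spa_distribution_id" {
  value = aws_cloudfront_distribution.spa.id
}

output "spa_bucket" {
  value = aws_s3_bucket.spa.bucket
}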

Controlling Cache Behavior

CloudFront has default caching behavior based on the cache policy described above, but it will also respect cache headers from S3. These can be controlled when you PutObject into S3.

For example, to set the cache control behavior of an object, pass the Cache-Control header in the PutObject request or via the CLI:

aws s3 cp \
    --cache-control 'max-age=1800,must-revalidate' \
    dist/index.html \
    s3://example-spa-app/index.html
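
If index.html is managed with Terraform rather than the CLI, the aws_s3_object resource exposes the same header (a sketch, assuming the built file lives at dist/index.html):

resource "aws_s3_object" "index" {
  bucket        = aws_s3_bucket.spa.bucket
  key           = "index.html"
  source        = "dist/index.html"
  content_type  = "text/html"
  cache_control = "max-age=1800,must-revalidate"

  # re-upload whenever the built file changes
  etag = filemd5("dist/index.html")
}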

Deployment Considerations

CloudFront is a CDN: it caches copies of objects at the edge (close to the user) according to the default cache policy or the cache headers from the upstream origin.

That means deploying the same filename time after time is gonna cause issues: users will see old versions of the application until the object becomes stale, since CloudFront has it cached at the edge.

The best way to work around this is to deploy new files each time: put the version in the file path, or include a content hash in the built filenames, and reference those.

In cases where the filename cannot change, like index.html for example, that requires an invalidation during deployment. That, combined with the cache control described above, works fairly well for files like that.

Here’s an example deploy script:

npm run build

# sync versioned asset files to s3; since the filenames change every deploy
# they can be cached for a long time
aws s3 sync \
    --cache-control 'max-age=31536000,immutable' \
    dist/assets/ s3://example-spa-app/assets/

# copy the index.html with cache control, this would point to the versioned assets
aws s3 cp \
    --cache-control 'max-age=1800,must-revalidate' \
    dist/index.html \
    s3://example-spa-app/index.html

# create a cloudfront invalidation for index.html
CLOUDFRONT_DISTRIBUTION="changeme: distribution ID from the infra above"
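# with the terraform outputs sketched earlier, this could be:
# CLOUDFRONT_DISTRIBUTION="$(terraform output -raw spa_distribution_id)"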
aws cloudfront create-invalidation \
    --distribution-id "$CLOUDFRONT_DISTRIBUTION" \
    --paths "/index.html"

CORS

Should a single page app need a cross origin resource sharing (CORS) configuration, that needs to be done via the S3 bucket. S3 handles the actual CORS configuration, and then the CloudFront cache policy can forward the relevant headers to the origin as part of its fetching of resources from upstream.

resource "aws_s3_bucket_cors_configuration" "spa" {
  bucket = aws_s3_bucket.spa.bucket

  cors_rule {
    allowed_headers = ["*"]
    allowed_methods = ["GET", "HEAD"]
    allowed_origins = [
      "https://changeme.example.com"
    ]
    expose_headers = [
      "Content-Length",
      "Content-Type",
      "Connection",
      "Date",
      "ETag",
      "x-amz-request-id",
      "x-amz-version-id",
      "x-amz-id-2",
    ]
    max_age_seconds = 3600
  }
}

resource "aws_cloudfront_cache_policy" "spa-with-cors" {
  default_ttl = local.spa_default_ttl
  min_ttl     = local.spa_min_ttl
  max_ttl     = local.spa_max_ttl
  parameters_in_cache_key_and_forwarded_to_origin {
    headers_config {
      header_behavior = "whitelist"
      headers {
        items = [
          "origin",
          "access-control-request-headers",
          "access-control-request-method"
        ]
      }
    }
    query_strings_config {
      query_string_behavior = "none"
    }
    cookies_config {
      cookie_behavior = "none"
    }
  }
}
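
To actually use this policy, point the distribution’s default_cache_behavior at it; caching OPTIONS responses also helps with preflight requests. A sketch of the changed behavior block:

  default_cache_behavior {
    target_origin_id       = "main"
    allowed_methods        = ["GET", "HEAD", "OPTIONS"]
    cached_methods         = ["GET", "HEAD", "OPTIONS"]
    compress               = true
    viewer_protocol_policy = "redirect-to-https"
    cache_policy_id        = aws_cloudfront_cache_policy.spa-with-cors.id
  }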