What exactly is “CloudFront”?

CloudFront is a “Content Delivery Network”

AWS has its own dedicated ‘Content Delivery Network’ (CDN), which delivers data, videos, apps and other content globally with high speed and low latency. It uses AWS’s existing global infrastructure to accelerate the loading of content. Both static and dynamic content can be delivered.

CDNs work by keeping a globally replicated cache, so content can be served from a location close to its destination. In Amazon’s case that CDN is CloudFront, a high-speed alternative to delivering content from the origin over the open Internet.

Content is cached at different CloudFront locations (edge locations) and delivered directly to the users requesting it. If the requested data is not cached, CloudFront fetches it from its original location instead and delivers it to the user.

***CloudFront does not automatically cache all objects. The distribution’s configuration is deployed first; from then on, an object is cached only after it has been requested, and any object that is not currently cached is retrieved from the origin servers/source.***

CloudFront Concepts

  • CloudFront Origin — this is the original location of the content in question being delivered; this can be many things, e.g. an S3 bucket, an ELB, anything available over the Internet
  • CloudFront Distribution — this is the ‘configuration’ set that is deployed out to the CDN for content delivery
  • Edge Location — locations along AWS’s CDN that can cache and deliver content
  • Regional Edge Location — similar to an Edge Location, but larger and able to hold more cached content (AWS documentation calls these Regional Edge Caches)

When using CloudFront, a CloudFront Distribution is usually created with a CloudFront Origin assigned. The Distribution is given a default domain name, but a custom one can be defined. Once this is done, the Distribution is deployed out to all chosen Edge Locations.

When a user requests data held in the Origin, the request first goes to the local Edge Location, which returns the content if it’s cached. If not, the request is passed to a Regional Edge Location, which returns the content if it’s cached there; if not, the content needs to be fetched from the Origin (Origin Fetch).

Once an Origin Fetch is conducted, the content is passed to the Regional Edge Location and cached there, then passed to the Edge Location and cached there as well. Finally, the content is returned to the user; the next time a client requests the same content, it can be served from the Edge Location instead of the Origin.

Sometimes content isn’t cached at the Edge Location local to the user but is cached at the Regional Edge Location. In that case the Regional Edge Location serves it, and the local Edge Location caches it on the way through.
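The lookup order described above can be sketched as a toy model. This is only an illustration of the flow, not how CloudFront is implemented: the `TieredCache` class, its attribute names and the sample path are all made up for the example, and plain dictionaries stand in for the Origin and the two cache tiers.

```python
class TieredCache:
    """Toy model of Edge Location -> Regional Edge Location -> Origin lookup."""

    def __init__(self, origin):
        self.origin = origin    # dict standing in for the Origin (e.g. an S3 bucket)
        self.edge = {}          # cache at the user's local Edge Location
        self.regional = {}      # cache at the Regional Edge Location

    def get(self, key):
        """Return (content, source), where source names the tier that served it."""
        if key in self.edge:
            return self.edge[key], "edge"
        if key in self.regional:
            content = self.regional[key]
            self.edge[key] = content      # the local Edge Location caches it too
            return content, "regional"
        content = self.origin[key]        # Origin Fetch
        self.regional[key] = content      # cached at the Regional Edge Location
        self.edge[key] = content          # then cached at the Edge Location
        return content, "origin"


cdn = TieredCache(origin={"/index.html": "<h1>hello</h1>"})
first = cdn.get("/index.html")    # nothing cached yet, so this is an Origin Fetch
second = cdn.get("/index.html")   # now served from the Edge Location
print(first[1], second[1])
```

The first request falls through both cache tiers to the Origin; the second is answered by the Edge Location without touching the Origin at all.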

CloudFront Architecture Example:

  1. The viewer requests the website at www.example.com.
  2. If the requested object is cached, CloudFront returns the object from its cache to the viewer.
  3. If the object is not in CloudFront’s cache, CloudFront requests the object from the origin (an S3 bucket).
  4. S3 returns the object to CloudFront, which triggers the Lambda@Edge origin response event.
  5. The object, including the security headers added by the Lambda@Edge function, is added to CloudFront’s cache.
  6. (Not shown) The object is returned to the viewer. Subsequent requests for the object that come to the same CloudFront edge location are served from the CloudFront cache.
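Steps 4 and 5 could look roughly like the handler below. It is a minimal sketch, assuming the function is attached to the origin response event and written in Python; the particular headers and their values in `SECURITY_HEADERS` are illustrative choices, not something the architecture above prescribes.

```python
# Illustrative security headers; the exact set and values are assumptions.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
}


def lambda_handler(event, context):
    """Add security headers to the response S3 returned to CloudFront."""
    response = event["Records"][0]["cf"]["response"]
    headers = response["headers"]
    for name, value in SECURITY_HEADERS.items():
        # CloudFront keys the headers dict by lowercase header name and
        # stores each header as a list of {"key", "value"} entries.
        headers[name.lower()] = [{"key": name, "value": value}]
    return response


# Local usage example with a cut-down origin response event.
sample_event = {
    "Records": [
        {"cf": {"response": {"status": "200", "headers": {}}}}
    ]
}
result = lambda_handler(sample_event, None)
print(sorted(result["headers"]))
```

Because the function runs on the origin response, the headers are stored with the object in CloudFront’s cache, so cached responses carry them too.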

Integration

CloudFront is able to integrate with ACM (AWS Certificate Manager).

When sites use HTTPS, traffic is encrypted with SSL/TLS. This requires a security certificate issued by a trusted certificate authority, which is used to encrypt traffic, prove the site’s identity to the client and enable HTTPS for content access.

ACM is used to create and manage these certificates. A distribution can use the default CloudFront certificate, which covers the ‘xxx.cloudfront.net’ domain name. If a custom domain name is needed, a custom certificate for that domain needs to be created.

Either public or private certificates can be created with ACM, for externally or internally facing sites. Together with CloudFront, an ACM certificate can be used to ‘add HTTPS’ to certain types of content, like an S3 bucket.
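As a rough sketch of the two certificate options, the helper below builds the `ViewerCertificate` part of a distribution configuration. The helper name `viewer_certificate` and the ARN are hypothetical; the field names follow the CloudFront distribution config, and the `sni-only` and `TLSv1.2_2021` values are just one reasonable choice.

```python
def viewer_certificate(acm_certificate_arn=None):
    """Build the ViewerCertificate section of a distribution config.

    With no ARN, the default xxx.cloudfront.net certificate is used;
    with an ARN, the custom ACM certificate is used instead.
    """
    if acm_certificate_arn is None:
        return {"CloudFrontDefaultCertificate": True}
    return {
        "ACMCertificateArn": acm_certificate_arn,
        "SSLSupportMethod": "sni-only",            # assumed choice
        "MinimumProtocolVersion": "TLSv1.2_2021",  # assumed choice
    }


# Placeholder ARN, not a real certificate.
example_arn = "arn:aws:acm:us-east-1:111122223333:certificate/example-id"
default_cert = viewer_certificate()
custom_cert = viewer_certificate(example_arn)
print(default_cert)
print(custom_cert["ACMCertificateArn"])
```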

CloudFront is primarily used for accelerating download operations; however, uploading content is also supported using the following HTTP request methods:

  • POST — submit an entity to a specified resource
  • PUT — replaces the target resource with the request payload
  • DELETE — deletes the specified resource
  • OPTIONS — describes the communication options for the target resource
  • PATCH — do a partial modification of a target resource

When uploading to CloudFront, content is written to the Origin. Uploading does not write to the Edge Location’s cache; the request is simply passed along to the Origin location.
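The difference between download and upload handling can be sketched as a small routing function. This is a simplified illustration with made-up names (`route_request` and the two sets); it assumes only GET and HEAD responses are cached, although CloudFront can optionally be configured to cache OPTIONS responses as well.

```python
CACHED_METHODS = {"GET", "HEAD"}  # assumption: OPTIONS caching is not enabled
PASS_THROUGH_METHODS = {"POST", "PUT", "DELETE", "OPTIONS", "PATCH"}


def route_request(method):
    """Say how a request with the given HTTP method would be handled."""
    method = method.upper()
    if method in CACHED_METHODS:
        return "serve from cache, or fetch from origin and cache"
    if method in PASS_THROUGH_METHODS:
        return "forward to origin, do not cache"
    raise ValueError(f"unsupported method: {method}")


for m in ("GET", "PUT", "DELETE"):
    print(m, "->", route_request(m))
```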

Origin Access Identities (OAIs)

CloudFront can be configured to be private, which will block unauthorized CloudFront requests. However, this can be worked around if the resource is still publicly accessible via its normal DNS name instead of the CloudFront CDN.

This can be prevented by using an Origin Access Identity (OAI) to ensure that any request for Origin content must go through CloudFront and be pre-authorized.

An OAI is created and associated with a CloudFront Distribution; when the Distribution is deployed to an Edge Location, the OAI is inherited too, so all relevant Edge Locations act under that identity. The S3 bucket policy then needs to allow only the OAI, leaving every other request to fall under the implicit deny, so that the content can only be accessed via CloudFront.
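A bucket policy along these lines might be generated as follows. This is a minimal sketch: the helper name, the bucket name and the OAI ID are placeholders, and the policy grants read-only access (`s3:GetObject`) to the OAI while relying on the implicit deny for everything else.

```python
import json


def oai_bucket_policy(bucket_name, oai_id):
    """Return an S3 bucket policy that lets only the given OAI read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIReadOnly",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/"
                           f"CloudFront Origin Access Identity {oai_id}"
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Placeholder bucket name and OAI ID, for illustration only.
policy = oai_bucket_policy("example-bucket", "E2EXAMPLE1OAI")
print(json.dumps(policy, indent=2))
```

The bucket’s public access settings also need to stay locked down, otherwise direct requests to the bucket would still succeed regardless of this policy.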

OAIs — secure an S3 bucket against direct access via DNS, restricting it to CloudFront access only

DevOps Engineer writing about breaking into the industry