Solving Edge Delivery and AEM performance in China [VIDEO]

by Anna Szemiot
10 min read

At this year’s adaptTo() conference in Berlin, StreamX’s Kamil Chociej and Arbory Digital’s Tad Reeves broke down why even the most modern sites - built on Edge Delivery Services - still struggle in China, and what it takes to fix that.

Their session traced the issue from the network layer up: how China’s Great Firewall (GFW) actually behaves, why global CDNs and observability tools don’t work there, and how StreamX’s event-driven mesh can make global AEM sites load fast behind the Great Firewall - without building a separate setup just for China.

“Solving Edge Delivery and AEM Performance in China: Comparing Approaches & Results” was presented October 2 at adaptTo() Berlin. Watch the recording below or scroll down to read the solution walkthrough.

Video

Solving Edge Delivery and AEM Performance in China: Comparing Approaches & Results

Watch ↗

Why global sites fail in China

Three core blockers explain why global delivery breaks down in Mainland China:

The Great Firewall of China

The GFW is a highly sophisticated filter applied to all incoming traffic. It uses multiple techniques - DNS poisoning, IP blocking, traffic throttling, TCP resets, and packet forging - to delay or alter content. Its behavior is nearly impossible to predict; connectivity can change from day to day, or even from one building to the next.

No global CDN

Virtually all the CDNs that Edge Delivery and modern AEM rely on are absent in China. Fastly, Cloudflare, CloudFront, Azure Front Door, and Akamai (which recently exited the China market) do not have local Points of Presence. Even hosting in Hong Kong doesn't count, as it sits outside the Mainland China network.

No visibility

Most global monitoring and analytics tools - Datadog, New Relic, Dynatrace, Google Tag Manager, Adobe Analytics - either don’t run at all or need local rewrites. Teams have no way to know what their Chinese users actually see.

The experiment

To quantify the problem, Tad and Kamil deployed an in-China RUM solution (Alicloud ARMS) on an existing Edge Delivery Services site that had been translated but not otherwise optimized for users in China.

The results were pretty dramatic:

  • The average Largest Contentful Paint (LCP) for the site in the US was around 1.5 seconds.

  • In China, during peak business hours, the average LCP spiked to 1.8 minutes (152 seconds), with the page often loading broken or failing entirely.

This confirmed the initial hypothesis: any data that has to cross the Great Firewall will be unreliable, no matter what. Therefore, in-China clients should ideally never have to access data from outside the firewall.

For Adobe teams, that’s especially painful. AEM as a Cloud Service isn’t available in China. AEM Managed Services technically is, but only through a fully separate site, codebase, and license.

Common (partial) fixes

A few strategies are in common use, especially in the AEM world. Each solves the problem partially but also introduces new, fundamental problems related to the very nature of the Great Firewall and content reliability.

Duplicated infrastructure - the "6.5 way"

While possible, provisioning infrastructure locally is expensive and not an option for those using AEM as a Cloud Service.

Static Site Generator (SSG)

This involves taking a "picture" of the site and sticking it in China. While fast, it eliminates any dynamic features.

Removing features

Some companies choose to "hamstring the entire site" by removing features (like recommended products) that rely on services hosted outside of China. This is effective, but the site ends up being very basic and far outside the brand's global standards.

In-China CDN

A CDN with Points of Presence inside China serves cached content locally, but as a traditional pull cache it still has to fetch every cache miss and every update from an origin outside the firewall, so those requests remain subject to the GFW.

New approach: invert the cache model

The architecture Tad and Kamil proposed at adaptTo() flips the traditional cache model: instead of an in-China cache pulling content across the firewall on demand, content and code are pushed ahead of time to a flexible, lightweight system hosted inside China.

This can work with Edge Delivery Services, but also with AEM 6.5, AEM as a Cloud Service, Adobe Commerce/Magento, and other Product Information Management, headless CMS, and e-commerce systems.

The solution uses a globally distributed mesh of Kubernetes clusters that consists of:

  1. Pilot cluster (Control Plane): Used to manage the project and deploy services.

  2. Processing cluster (Data Plane): Located in the EU (near the source content, for a reliable connection). It receives and processes data ingested from the content sources.

  3. Edge clusters (Data Plane): Responsible for serving content near the end users.
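The three cluster roles above can be pictured as a small data model. This is only an illustrative sketch: the cluster names, region labels, and the `edge_for` helper are hypothetical and are not taken from StreamX's actual configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PILOT = "control-plane"
    PROCESSING = "data-plane/processing"
    EDGE = "data-plane/edge"


@dataclass(frozen=True)
class Cluster:
    name: str    # hypothetical cluster name
    role: Role
    region: str  # hypothetical region label


# Hypothetical mesh: one pilot, one processing cluster in the EU, edges near users.
# The pilot's region is an assumption; the talk only places the processing cluster.
MESH = [
    Cluster("pilot", Role.PILOT, "eu"),
    Cluster("processing-eu", Role.PROCESSING, "eu"),
    Cluster("edge-eu", Role.EDGE, "eu"),
    Cluster("edge-cn", Role.EDGE, "cn"),
]


def edge_for(user_region: str, mesh=MESH) -> Cluster:
    """Pick the edge cluster in the user's region, falling back to the first edge."""
    edges = [c for c in mesh if c.role is Role.EDGE]
    for cluster in edges:
        if cluster.region == user_region:
            return cluster
    return edges[0]
```

With this model, a visitor in Mainland China would be served by `edge_for("cn")`, which never leaves the firewall, while authors and source systems only ever talk to the processing cluster.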

Content Flow and Processing

Content ingestion (for an Edge Delivery site) is triggered when a content author publishes a page. A GitHub Action fetches the content (HTML, JS, CSS) from the Helix endpoint and ingests it into the Processing cluster.
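As a rough sketch of the publish-triggered step, the function below fetches a page and its code files and hands each one to the Processing cluster. Everything named here is an assumption for illustration: `SOURCE_ORIGIN` is a placeholder host, and `fetch` and `ingest` are injected callables standing in for the real HTTP client and ingestion API, which the talk does not spell out.

```python
from typing import Callable, Iterable, List

# Hypothetical source origin; the real Helix endpoint depends on the project.
SOURCE_ORIGIN = "https://source.example.com"


def ingest_on_publish(
    page_path: str,
    code_paths: Iterable[str],
    fetch: Callable[[str], bytes],
    ingest: Callable[[str, bytes], None],
) -> List[str]:
    """Fetch the published page plus its JS/CSS and pass each to the Processing cluster.

    `fetch` and `ingest` are injected so the sketch makes no claim about real APIs.
    Returns the paths that were ingested, in order.
    """
    ingested = []
    for path in [page_path, *code_paths]:
        body = fetch(SOURCE_ORIGIN + path)  # pull from the source system
        ingest(path, body)                  # push into the Processing cluster
        ingested.append(path)
    return ingested
```

In a CI job, `fetch` could wrap an HTTP GET and `ingest` a call to whatever ingestion endpoint the project exposes; keeping them as parameters makes the flow easy to test without a network.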

The Processing cluster performs a critical function: processing the data. This makes sure that when a page is published, all referenced files, including images and static assets, are identified, fetched, and pushed to the China edge cluster. This prevents reliance on external sources in the US or EU - like Fastly.
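One way to identify the files a page references is to parse its HTML and collect the URLs of images, scripts, and stylesheets. The sketch below uses only Python's standard library; the `ReferenceCollector` class and `find_references` function are hypothetical names, and the actual processing service may work quite differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReferenceCollector(HTMLParser):
    """Collect URLs of images, scripts, and stylesheets referenced by a page."""

    # (tag, attribute) pairs that point at files the page needs to render.
    _SOURCES = {("img", "src"), ("script", "src"), ("link", "href"), ("source", "srcset")}

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.references = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Only stylesheet links are assets; skip canonical, alternate, and similar.
        if tag == "link" and "stylesheet" not in (attributes.get("rel") or ""):
            return
        for name, value in attrs:
            if not value or (tag, name) not in self._SOURCES:
                continue
            # Simplification: srcset candidates are split on commas ("a.png 1x, b.png 2x").
            candidates = value.split(",") if name == "srcset" else [value]
            for candidate in candidates:
                parts = candidate.split()
                if not parts:
                    continue
                url = urljoin(self.base_url, parts[0])
                if url not in self.references:
                    self.references.append(url)


def find_references(html: str, base_url: str) -> list:
    """Return absolute URLs of referenced assets, in document order, without duplicates."""
    collector = ReferenceCollector(base_url)
    collector.feed(html)
    return collector.references
```

Every URL returned this way would then be fetched from the source and pushed to the edge, so that the page served inside China has no remaining references that resolve outside the firewall.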

Since Fastly image transform services are inaccessible in China, all required image renditions (even those with transform parameters) are fetched from the source system and pushed individually to the Chinese edge cluster.
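Because no transform service is available at request time, each rendition has to exist as its own stored object. A minimal sketch of that idea follows, assuming renditions are addressed by path plus query string; `rendition_key` and `collect_renditions` are hypothetical helpers, and sorting the query parameters is a design assumption (the edge would need to normalize incoming requests the same way).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def rendition_key(url: str) -> str:
    """Build a storage key that keeps each transformed rendition distinct.

    Query parameters are sorted so that '?width=750&format=webp' and
    '?format=webp&width=750' map to the same stored object.
    """
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return parts.path + ("?" + query if query else "")


def collect_renditions(urls, fetch):
    """Fetch every rendition from the source system once, keyed for the edge store."""
    store = {}
    for url in urls:
        key = rendition_key(url)
        if key not in store:
            store[key] = fetch(url)  # `fetch` is an injected stand-in for an HTTP GET
    return store
```

The trade-off is storage for reliability: the edge holds several copies of the same image at different sizes, but no request from a Chinese visitor ever triggers a transform outside the firewall.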

While the GFW still filters the connection between the Processing cluster (EU) and the Edge cluster (China), a robust messaging system ensures reliable delivery, although content updates might see a minimal delay. 
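The talk does not detail the messaging system, but the general pattern of reliable delivery over an unreliable link can be sketched as an outbox that is drained with retries and backoff. This is a generic illustration, not StreamX's implementation; `deliver_all`, its parameters, and the `send` callable are all hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


def deliver_all(
    outbox: Deque[Tuple[str, bytes]],
    send: Callable[[str, bytes], bool],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Drain the outbox, retrying each message until `send` acknowledges it.

    A message is removed only after acknowledgement (at-least-once delivery).
    Returns the number of messages delivered; raises if one exhausts its attempts.
    """
    delivered = 0
    while outbox:
        key, body = outbox[0]
        for attempt in range(max_attempts):
            if send(key, body):
                outbox.popleft()
                delivered += 1
                break
            # Exponential backoff after each failed attempt: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** attempt)
        else:
            raise RuntimeError(f"gave up on {key!r} after {max_attempts} attempts")
    return delivered
```

Retries of this kind are what turn firewall interference into a publishing delay: a throttled or reset connection costs a few extra seconds before an update lands, but nothing is lost and visitors are never waiting on it.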

By moving the data source inside China, the firewall’s impact is shifted from crippling user performance to merely delaying content updates, so that the end-user experience remains fast.

Results

Served from the China edge cluster, the page loaded in less than one second, a monumental improvement over the 1-to-2-minute load times seen with the standard Edge Delivery Services endpoint.

TL;DR

By moving the source of truth closer to the user and turning the cache model upside down, StreamX neutralizes the Great Firewall’s impact where it hurts most: the end user's data load path.

The firewall is still there, but the latency it creates now affects only content updates, not the user experience.
