Build complete sitemap for multi-source AEM website

by Marta Cukierman
10 min read

StreamX use case - AEM sitemaps

Adobe Experience Manager (AEM) provides built-in support for generating sitemaps, making the task straightforward as long as the platform has full control over the website's structure. But when you start dealing with environments that span multiple sources, markets, and projects, things get complicated fast.

This is where StreamX steps in: the event-streaming service mesh has been designed to redefine the way we solve site integration complexities. With sitemaps, StreamX zeroes in on streamlining integration and boosting search engine indexing accuracy. The goal is to make handling complex digital landscapes as straightforward as managing a single-source setup in AEM.

Below, we’ll walk you through the approach StreamX offers to simplify sitemap management in diverse and complex digital ecosystems.

The importance of organic traffic

Organic traffic is crucial for a website's consistent visibility, credibility and success. It’s more reliable and less expensive than paid ads, offering a steady visitor inflow - especially if the site ranks well in search results. Higher organic rankings often translate to increased user trust and higher conversion rates, since they reach people actively searching for related content.

Similarly, sitemaps are crucial for effective Search Engine Optimization (SEO):

  • They make it easier for search engines like Google and Bing to discover and index a site's content, improving its search visibility.

  • They serve as essential tools, particularly for new or updated pages, ensuring search engines can efficiently locate and index content, thus speeding up its visibility online.

  • And most importantly: in sprawling or intricate sites, sitemaps help ensure that all pages, especially the less obvious ones, can be discovered by search engines. This leads to more comprehensive indexing and better site coverage.

Now that we’ve established why anyone would want a complete and relevant sitemap at all - let’s look at the possible challenges with it.

Sitemap management in multi-source AEM environments

In a multi-source setup, Adobe Experience Manager (AEM) needs to work with a variety of external systems, such as CRM, ERP, PIM, and different content repositories. 

This integration makes it hard to maintain uniform sitemaps, since each connected system can have its own structure, metadata, and update schedule. As the number of integrated sources grows, the complexity of the website's structure grows with it. Building a sitemap that is logically structured and free of broken links (to help search engines navigate your site) gets increasingly complicated.

Beyond that, as the website expands and more sources are added, the sitemap can grow in size and complexity, potentially causing performance issues during its generation.
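One common way to keep a growing sitemap manageable is to split it into several files referenced from a sitemap index, since the sitemaps.org protocol caps a single sitemap file at 50,000 URLs. The snippet below is a minimal sketch of that idea, not a description of how AEM or StreamX does it. The function names and the sitemap-N.xml file naming are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# Per-file URL limit defined by the sitemaps.org protocol.
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_sitemap_index(base_url, chunk_count):
    """Render a sitemap index that points at sitemap-1.xml .. sitemap-N.xml.

    The file naming scheme is hypothetical; use whatever your site serves.
    """
    entries = "".join(
        f"  <sitemap><loc>{escape(base_url)}/sitemap-{n}.xml</loc></sitemap>\n"
        for n in range(1, chunk_count + 1)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )
```

With this layout, a change in one part of the site only requires rewriting the affected chunk rather than one very large file.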

Diagram describing how sitemap is generated on DXP

Issues during sitemap generation

Though many opt for Adobe Experience Manager (AEM) as the central point for pulling all this data together to reflect the site structure, it's not the best fit for handling scattered data efficiently. This architectural choice often leads to issues with sitemap generation, updates, and maintenance. This is especially true for large web businesses, such as stores, airlines, and booking services, that rely on data generated by external services, internal applications, and other sources.

Example: Sitemaps for booking and airline sites

For businesses that sell travel tickets and hotel stays online, sitemaps need to include not only the standard content pages (like Terms of Use, Privacy Policy, and Destination Information) but also pages unique to the industry, such as promotions, extra services, and travel and accommodation details.

More often than not, all this information must be available in multiple languages to cater to every market the provider serves. This results in thousands of pages, with the potential for even more.

In the operational and technical context, managing thousands of pages involves a large team and several projects, all developing code independently and contributing their own pages that need to be integrated into the sitemap, sometimes on a different CMS than the main site. This becomes problematic when the goal is to create a universal sitemap for pages under the brand's domain.

Diagram with simplified example of sitemap responsibilities with AEM

None of the systems in this setup is designed to collect and manage all the data. They are created for specific tasks. The CMS is there to create content and compose digital experiences and often ends up being the go-to solution for integration as the system that “holds the glass”.

CMSs are not equipped to automatically manage such a dynamic and dispersed data source. They lack mechanisms for automatically detecting and adding newly generated pages to the sitemap; moreover, embedding comprehensive sitemap generation logic within AEM can introduce multiple problems.

The primary issue is that this approach tends to complicate the system, making it hard to manage. This complexity can result in increased development time, higher maintenance costs, and a greater potential for errors. 

Moreover, as the project grows and evolves, this method could lead to scalability issues and make it challenging to adapt to changes in site structure or to integrate new content sources. 

The StreamX way

So, how do we solve the sitemap challenge without inviting the hard-to-predict problems that come with complexity? 

In a StreamX world, once you’ve connected your Adobe Experience Manager, your PIM, your e-commerce platform, and all the other backend systems to StreamX, sitemap generation is handled by a separate service. This service regenerates your sitemap when a change affecting the site structure or content is detected in any of the connected data sources.

StreamX sitemap generation process

This way you can keep your AEM implementation cleaner and more focused on content management. Sitemaps are consistently complete and relevant. 
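To make the idea of a dedicated, change-driven sitemap service more concrete, here is a minimal sketch. It does not use or describe the StreamX API: the ChangeEvent shape, the SitemapService class, and the "publish" / "unpublish" action names are all assumptions made for illustration. The service keeps one in-memory map of URLs and re-renders the sitemap whenever any source reports a change.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical event shape; not taken from StreamX."""
    source: str   # e.g. "aem", "pim", "ecommerce"
    action: str   # "publish" or "unpublish"
    url: str
    lastmod: str  # W3C date string, e.g. "2024-05-01"


class SitemapService:
    """Keeps the current set of URLs and regenerates the sitemap on change."""

    def __init__(self):
        self._entries = {}  # url -> lastmod

    def handle(self, event):
        """Apply one change event and return the regenerated sitemap XML."""
        if event.action == "publish":
            self._entries[event.url] = event.lastmod
        elif event.action == "unpublish":
            # Unpublished pages are dropped, so no dead links remain.
            self._entries.pop(event.url, None)
        else:
            raise ValueError(f"unknown action: {event.action}")
        return self.render()

    def render(self):
        """Render all known URLs, sorted for stable output."""
        rows = "".join(
            f"  <url><loc>{escape(url)}</loc>"
            f"<lastmod>{escape(lastmod)}</lastmod></url>\n"
            for url, lastmod in sorted(self._entries.items())
        )
        return (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{rows}"
            "</urlset>\n"
        )
```

Because every source feeds the same handle method, AEM itself needs no sitemap logic in this sketch. It only has to report what it published or unpublished.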

StreamX comes with data pipelines that are triggered by a publication or update from connected data sources. Everything from content management systems (CMS) and e-commerce platforms to product information management (PIM) systems and other backend systems can easily become a StreamX data source.

When the source system reports a change as soon as it happens, there is no need to request the current status from every source system, which reduces infrastructure load and speeds up sitemap generation. If you need to introduce more sophisticated logic into your sitemap generation process, you do it in one place, the destination service.

There is yet another bonus to this approach: adding a new data source or system does not require significant changes to your existing sitemap code.
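One way to picture why a new source need not touch existing sitemap code is a small registry of per-source normalizers. The sketch below is an illustration under stated assumptions, not StreamX code: the register decorator, the payload field names ("path", "modified", "sku", "updated_at", "visible"), and the example.com URLs are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# A normalizer turns a source-specific payload into (url, lastmod),
# or returns None when the item should not appear in the sitemap.
Normalizer = Callable[[dict], Optional[Tuple[str, str]]]

NORMALIZERS: Dict[str, Normalizer] = {}


def register(source: str):
    """Decorator that registers a normalizer under a source name."""
    def decorator(fn: Normalizer) -> Normalizer:
        NORMALIZERS[source] = fn
        return fn
    return decorator


@register("aem")
def from_aem(payload: dict) -> Optional[Tuple[str, str]]:
    # Hypothetical payload fields for a published page.
    return ("https://example.com" + payload["path"] + ".html", payload["modified"])


@register("pim")
def from_pim(payload: dict) -> Optional[Tuple[str, str]]:
    # Hypothetical payload fields; hidden products are skipped.
    if not payload.get("visible", True):
        return None
    return (f"https://example.com/products/{payload['sku']}", payload["updated_at"])


def to_sitemap_entry(source: str, payload: dict) -> Optional[Tuple[str, str]]:
    """Shared sitemap code; it stays unchanged when a source is added."""
    return NORMALIZERS[source](payload)
```

Adding, say, a blog platform then means writing one more function decorated with register("blog"), while to_sitemap_entry and the rendering code stay as they are.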


Simplified example of sitemap responsibilities with StreamX

  • Improved quality and completeness of sitemaps: Continuous listening to changes ensures the sitemap is complete and does not include broken links, improving visibility in search engines.

  • Rapid changes and scalability: StreamX enables easy addition of new content sources to the sitemap.

  • Clean AEM implementation that is easier to develop and maintain.

Optimize your AEM platform with StreamX

Manage complex sitemaps and improve your site's search visibility.
Schedule your StreamX demo to see how it's achieved.