This version is still in development and is not considered stable yet. For the latest stable version, please use StreamX Guides 1.1.0!

Build data aggregation with StreamX

Data aggregation collects and combines data from several sources into new entities that offer a unified view or summary. When the parts of a composite entity are spread across multiple systems, managing the data flows can become complex. StreamX addresses this by acting as a central assembly point and delivering pre-computed data to the web server.

In this tutorial, we will run a simple data aggregation example with StreamX.

This tutorial covers the following topics:

  • Data aggregation from multiple sources

  • Page generation by using the StreamX Rendering Engine, including:

    • Managing page templates

    • Managing template data

Prerequisites

To complete this guide, you will need:

Before you start, verify that port 8081 is free: no other StreamX instance, or any other application that uses this port, should be running.
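One way to check the port before starting the mesh is a short script like the sketch below. It is not part of the StreamX tooling; the function name `port_is_free` is made up for this illustration, and the check assumes the service would listen on the local loopback address.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # which means some process is already listening on the port.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    if port_is_free(8081):
        print("Port 8081 is free.")
    else:
        print("Port 8081 is already in use; stop the other process first.")
```

The script only attempts a connection, so it does not reserve the port or interfere with a running service.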

Step 1: Get the sources

Clone the Git repository containing source files for the example:

git clone https://github.com/streamx-dev/streamx-docs-resources.git

Step 2: Run the StreamX Mesh

Our example StreamX Mesh is configured to aggregate data from multiple independent sources and merge it into a unified entity. The system combines:

  • Product information from PIM

  • Pricing data from the internal system

  • Customer reviews from FMS

The computed data is then fed into the StreamX Rendering Engine, which generates the final target pages.
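Conceptually, the merge described above can be pictured as the following sketch. It is only an illustration of the idea, not the mesh's actual implementation: the `aggregate` function, the field names, and the sample values are all hypothetical, and the real payloads live in the JSON files of the cloned repository.

```python
from typing import Any, Optional


def aggregate(
    product: dict[str, Any],
    price: Optional[dict[str, Any]],
    reviews: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Merge the parts that share one product ID into a single entity."""
    merged = dict(product)  # copy, so the input dict is not mutated
    merged["price"] = price  # None when no price has been published
    # Sort by review key so the result does not depend on arrival order.
    merged["reviews"] = [reviews[key] for key in sorted(reviews)]
    return merged


# Hypothetical sample data, invented for this sketch.
product = {"id": "1", "name": "Example product"}
price = {"value": 19.99, "currency": "USD"}
reviews = {
    "firstReviewHash": {"author": "A", "text": "Works well."},
    "secondReviewHash": {"author": "B", "text": "Good value."},
}

print(aggregate(product, price, reviews))  # complete entity
print(aggregate(product, None, {}))        # product only: no price, no reviews
```

The product is treated as the required part, while the price is optional and the reviews are a collection. This mirrors the order in which the tutorial publishes the data.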

  1. Open the terminal and go to build-data-aggregation-tutorial inside the cloned project directory.

  2. Run the StreamX Mesh by using the following command:

    streamx run
  3. Wait for the following output:

    -------------------------------------------------------------------
    STREAMX IS READY!
    -------------------------------------------------------------------
    ...
    -------------------------------------------------------------------
    Network ID:
    ...
    Mesh configuration file: ./mesh.yaml
    -------------------------------------------------------------------

Step 3: Publish template and data

Publish template

  1. Publish the site/template.html file to the renderers channel with the following command:

    streamx publish -s 'template.bytes=file://site/template.html' renderers template.html

    Where:

    • -s indicates that an external plain text file is the source for the published content.

    • renderers is the channel you are publishing the template to.

    • template.html is the publish key.

Publish rendering context

The StreamX Rendering Engine requires additional context:

  • Data that triggers page generation

  • The type of generated output

  • Names of generated results

Once this context is defined, you can proceed with publishing the required data.

  1. Run the following command to provide the necessary page generation details:

    streamx publish rendering-contexts pages-rendering-context rendering-contexts/pages-rendering-context.json

Publish product data

  1. Publish the data/product.json file to the data channel with the following command:

    streamx publish -s 'content.bytes=file://data/product.json' data product:1

    Here, the number 1 after the colon is the entity ID, which is used to consolidate related entities from several channels.

  2. Open your web browser and go to http://localhost:8081/generated/1.html.

  3. Verify that the page is accessible, but has no price and no reviews.
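The publish keys in this tutorial (product:1, price:1, and later review:1:firstReviewHash) appear to follow a `<kind>:<id>[:<item>]` pattern. The sketch below shows how such a key could be split into its parts. The pattern itself is an assumption read from the example keys, and `ParsedKey` and `parse_key` are hypothetical names, not part of StreamX.

```python
from typing import NamedTuple, Optional


class ParsedKey(NamedTuple):
    kind: str            # e.g. "product", "price", "review"
    entity_id: str       # the ID shared by all parts of one product
    item: Optional[str]  # extra segment that tells multivalued items apart


def parse_key(key: str) -> ParsedKey:
    """Split '<kind>:<id>[:<item>]' into its parts."""
    parts = key.split(":", 2)
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected '<kind>:<id>[:<item>]', got {key!r}")
    item = parts[2] if len(parts) == 3 else None
    return ParsedKey(parts[0], parts[1], item)


print(parse_key("product:1"))
print(parse_key("review:1:firstReviewHash"))
```

Keys that share the same `entity_id` are the ones that end up on the same generated page, which is why the price and the reviews are published with the ID 1.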

Step 4: Update optional data

  1. Publish the data/price.json file to the data channel with the following command:

    streamx publish -s 'content.bytes=file://data/price.json' data price:1
  2. Open http://localhost:8081/generated/1.html.

  3. Verify that the page contains the price.

  4. Now unpublish the price data stored under the price:1 key with the following command:

    streamx unpublish data price:1
  5. Visit http://localhost:8081/generated/1.html.

  6. Confirm that the page generated from the product:1 data is still published, but no longer shows a price.

Step 5: Update multivalued data

Publish reviews

  1. Publish the data/review_1.json and data/review_2.json files to the data channel with the following commands:

    streamx publish -s 'content.bytes=file://data/review_1.json' data review:1:firstReviewHash
    streamx publish -s 'content.bytes=file://data/review_2.json' data review:1:secondReviewHash
  2. Refresh http://localhost:8081/generated/1.html.

  3. Verify that the page now contains two reviews.

Unpublish part of the data

  1. Unpublish the review stored under the review:1:firstReviewHash key by using the following command:

    streamx unpublish data review:1:firstReviewHash
  2. Visit http://localhost:8081/generated/1.html.

  3. Confirm that the review published under review:1:firstReviewHash has disappeared, but the second review is still visible.
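The behavior observed in Steps 3 to 5 can be modeled with a small in-memory store, shown below as a rough sketch. `DataChannel` is a toy stand-in written for this illustration and does not use or reproduce any StreamX API. The rule that a missing product means no page is an assumption, and the payloads are invented.

```python
from typing import Optional


class DataChannel:
    """Toy in-memory stand-in for the data channel; not StreamX code."""

    def __init__(self) -> None:
        self._items: dict[str, dict] = {}

    def publish(self, key: str, payload: dict) -> None:
        self._items[key] = payload  # publishing the same key again overwrites

    def unpublish(self, key: str) -> None:
        self._items.pop(key, None)  # unknown keys are ignored

    def view(self, entity_id: str) -> Optional[dict]:
        """Assemble the data a page for entity_id would be built from."""
        product = self._items.get(f"product:{entity_id}")
        if product is None:
            return None  # assumption: no product means no page
        prefix = f"review:{entity_id}:"
        return {
            "product": product,
            # Optional part: None when the price is not published.
            "price": self._items.get(f"price:{entity_id}"),
            # Multivalued part: every key under the review prefix.
            "reviews": [
                self._items[key]
                for key in sorted(self._items)
                if key.startswith(prefix)
            ],
        }


channel = DataChannel()
channel.publish("product:1", {"name": "Example product"})
channel.publish("price:1", {"value": 19.99})
channel.publish("review:1:firstReviewHash", {"text": "Works well."})
channel.publish("review:1:secondReviewHash", {"text": "Good value."})

channel.unpublish("price:1")
channel.unpublish("review:1:firstReviewHash")
print(channel.view("1"))
```

After the two unpublish calls, the view still contains the product and the second review, while the price is empty. This matches what the generated page shows at the end of Step 5.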

Summary

Congratulations! You have learned how to create pages from data aggregated from multiple external sources by using the StreamX Rendering Engine.