SCpbMD Catalog Sync Explained

I’ve been meaning to post this diagram for a while. I’ve used this to explain the Sitecore Commerce catalog data sync operation to at least three different clients since I drafted it this summer. And although I created it for the Microsoft Dynamics 365 version of the connector, it’s similar for Dynamics AX and other PIM systems as well.

SCpbMD_DataSync_2017-11-27.png

In the D365 box, you have the UI application, which admins can use to publish catalog data once it has been validated. There are a lot of elements that must be configured and working correctly to have a valid catalog, but a few of the key pieces are:

  • An online navigation hierarchy for your online channel
  • An assortment for your online channel containing released products
  • A catalog associated with the online channel
  • Products assigned to nodes on the online navigation hierarchy
  • Product attributes defined and attached to nodes on your online navigation hierarchy

Once the catalog is published, it’s ready to go as far as the “headquarters” database is concerned. But in Dynamics AX / D365, it also has to be distributed — sent to the online channel database using distribution jobs.

Once the catalog is in the online channel database, the D365 Retail Server can read from it. External applications can read catalog data from the online channel database using the Retail Server APIs, a curious mix of web services that aren’t quite WCF and aren’t quite REST.

This is where the Sitecore part of the picture comes into play. Sitecore provides a sample console application that uses Sitecore’s Data Exchange Framework to fetch data from the D365 Retail Server. The app transforms that data into an XML file that can then be imported into Sitecore’s Commerce Server. (Sitecore 9 also uses this catalog.xml file format, though the old Commerce Server components are no longer used.)

The import places the product and category definitions and data into the product catalog database. This product catalog database acts as an “edge cache” that keeps just the products the site will use close to the infrastructure of the website itself. It provides some redundancy in case communication problems occur between Sitecore and D365.

The last step in the process is the catalog data provider. Sitecore XP uses a data provider to access the product catalog database, creating virtual Sitecore items that appear within the Sitecore UI. Product and category data are not stored as “real Sitecore items” in the sense that they live in the standard Sitecore master or web databases.

Watch the arrows!

Note the color and direction of the arrows in the diagram above. The orange arrows are the ones controlled by the Sitecore console app and Data Exchange Framework. The arrows in blue are either part of standard D365 functionality or belong to extensions to the Sitecore platform. The orange arrows could have been labeled “Extract, Transform, and Load” because those are exactly the operations performed by the catalog sync. (If I ever redraw the diagram, I might update the labels to say just that!)
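To make the Extract, Transform, and Load idea a bit more concrete, here’s a minimal sketch in Python. To be clear, this is not the connector’s code: the function names, the record fields, and the XML element names are all made up for illustration, and the output is not the real catalog.xml schema. It only shows the shape of the three steps.

```python
import xml.etree.ElementTree as ET


def extract(fetch_products):
    """Extract: pull raw product records from the source.

    fetch_products is a hypothetical callable standing in for a Retail Server call.
    """
    return list(fetch_products())


def transform(records, catalog_name):
    """Transform: turn the raw records into one XML document.

    The element and attribute names here are invented for this sketch.
    """
    root = ET.Element("Catalog", name=catalog_name)
    for rec in records:
        product = ET.SubElement(root, "Product", id=str(rec["id"]))
        ET.SubElement(product, "Name").text = rec["name"]
        ET.SubElement(product, "Category").text = rec["category"]
    return ET.tostring(root, encoding="unicode")


def load(xml_text, import_catalog):
    """Load: hand the XML to whatever imports it into the product catalog database.

    import_catalog is a hypothetical callable standing in for the import step.
    """
    return import_catalog(xml_text)


def sync_catalog(fetch_products, import_catalog, catalog_name="Demo"):
    """Run the three steps in order, the way the orange arrows do."""
    records = extract(fetch_products)
    xml_text = transform(records, catalog_name)
    return load(xml_text, import_catalog)
```

Passing the fetch and import steps in as plain callables keeps the sketch independent of any particular API, and it mirrors the point that the sync only owns the orange arrows, not the systems on either end.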

The direction of the arrows is important, too. Catalog information must be sent from AX HQ to the channel database by those batch distribution jobs. If those jobs aren’t running, then no updates occur in the channel DB, and no updates will be returned by the Retail Server API.

Once the data is available at the Retail Server, it’s up to the Sitecore catalog sync process to fetch the latest data from the Retail Server. This can be run manually or as a scheduled job, but note that there are no notifications here: D365 doesn’t push the data to Sitecore; Sitecore pulls the data when it needs to.
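If the push versus pull distinction feels abstract, this tiny Python sketch might help. The class names are stand-ins I invented, not anything from D365 or Sitecore. The point is simply that the source never calls the target, so the target stays stale until it decides to ask.

```python
class Source:
    """Stand-in for the Retail Server: it only answers when asked."""

    def __init__(self):
        self.products = []

    def get_products(self):
        # Return a copy so the caller holds a snapshot, not a live view.
        return list(self.products)


class Target:
    """Stand-in for the Sitecore side: it holds whatever the last pull returned."""

    def __init__(self, source):
        self.source = source
        self.products = []

    def pull(self):
        # The target initiates the transfer; nothing happens until this runs.
        self.products = self.source.get_products()
```

A new product added to the source shows up on the target only after the next pull() call, whether that call is made by hand or by a scheduled job.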

A sequence of batches

This is definitely not a real-time process. As you might imagine from the number of batches, pushes, and pulls shown in the diagram, it can take a significant amount of time to move an update (like adding a new product to the catalog or setting up a new product attribute for a category) from D365 HQ into Sitecore XP. If every batch job involved executed on a 15-minute timer, it could take 45-60 minutes for that product to appear on the site. The interval could be longer depending on the size of the catalog and the number of changes made.
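Here’s the back-of-the-envelope math behind that estimate as a short Python sketch. It assumes three or four chained jobs, each on its own timer, and it ignores how long each job takes to run. Those stage counts are my assumption for illustration; count the scheduled hops in your own setup. In the worst case a change just misses every run, so it waits close to a full interval at each stage.

```python
def worst_case_delay(intervals_minutes):
    """Upper bound on waiting time when a change just misses every scheduled run.

    Each stage can add up to one full interval of waiting; job run time is ignored.
    """
    return sum(intervals_minutes)


print(worst_case_delay([15, 15, 15]))      # 45
print(worst_case_delay([15, 15, 15, 15]))  # 60
```

Shortening any one timer only trims that stage’s share of the total, which is why the end-to-end delay tends to be dominated by the slowest schedule in the chain.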

There’s more than one way to do it

Although Sitecore provides the catalog sync code as part of its commerce connectors, it’s really just an example or starter kit for us to use. In practice, you’ll need to modify the logic used to generate the catalog.xml file to import into Sitecore. You may also need to move the data sync process to other servers for scalability or performance reasons. Or you could replace Sitecore’s Data Exchange Framework with another ETL framework or a business process orchestration suite like BizTalk.

The connector is just a starting point for implementation, and hopefully the diagram and my explanation of it make a good starting point for discussion with your team or client about how the process of syncing catalogs might work.

Sitecore Commerce Catalogs at Scale

Last week, I gave a presentation to the DC Sitecore User Group on Sitecore Commerce Catalogs. It was a small crowd due to some thunderstorms in the area, and I had a tough act to follow. Phil Wickland, Sitecore MVP and author of several books on Sitecore, gave a talk on Personalization for Impact, which is worth seeing.

My talk was about how and why Sitecore imports catalog data from a PIM, using the Sitecore Commerce and Microsoft D365 integration as an example.

Here’s a link to the video:

The audio is a bit hard to hear at times, but I’ve posted my slides to SlideShare here: