Serializing a #Sitecore Computed Index Field (When You Need to Pass an Immense Amount of Data)

Recently I worked on a Data Exchange Framework (DEF) process that involved extracting many records. Each record was essentially a row that would eventually be inserted into an external table. The DEF process was slow at creating the records, so I created a computed index field to build them at index time. The DEF process then only needs to read the field and insert the records. The catch is that most computed index fields return a small value, and in this case the field could hold several rows of data. The solution was simple though: serialize a list of objects into the field, then deserialize it and let the DEF processor take it from there.

In my example the computed index field is built from an Image class that contains a list of ImageInfo objects. You can fill in and create the index however you want. See below.

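using System.Collections.Generic;
using Sitecore.ContentSearch;

// Search result model: each property maps to a field in the index.
public class Image
{
    [IndexField("imageinformation")]
    public List<ImageInfo> ImageInformation { get; set; }

    // Holds the serialized list produced by the computed field.
    [IndexField("images")]
    public string Imagelist { get; set; }

    [IndexField("_templatename")]
    public string TemplateName { get; set; }
}

// One row of data that ends up in the external table.
public class ImageInfo
{
    public string EntityId { get; set; }
    public string caption1 { get; set; }
    public string caption2 { get; set; }
    public string freeformcaptions { get; set; }
    public string url { get; set; }
}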

As stated above, Image contains a list of ImageInfo objects. I used Newtonsoft's Json.NET to serialize it.

return JsonConvert.SerializeObject(image.ImageInformation); 
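To put that return statement in context, here is a minimal sketch of how the whole computed field class might look. It assumes the standard IComputedIndexField interface from Sitecore.ContentSearch and the ImageInfo class shown earlier. The BuildImageInfo helper and the "Images" multilist field name are hypothetical and only there for illustration; they are not taken from the original module.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.ComputedFields;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Sitecore.Resources.Media;

public class ImagesComputedField : IComputedIndexField
{
    // Both properties are populated by Sitecore from the index configuration.
    public string FieldName { get; set; }
    public string ReturnType { get; set; }

    public object ComputeFieldValue(IIndexable indexable)
    {
        // SitecoreIndexableItem converts implicitly to Item.
        Item item = indexable as SitecoreIndexableItem;
        if (item == null)
        {
            return null; // not a Sitecore item, nothing to index
        }

        List<ImageInfo> images = BuildImageInfo(item);

        // Returning null leaves the field out of the index document.
        return images.Count == 0 ? null : JsonConvert.SerializeObject(images);
    }

    // Hypothetical helper: collects one ImageInfo per media item selected
    // in an assumed multilist field called "Images".
    private static List<ImageInfo> BuildImageInfo(Item item)
    {
        var images = new List<ImageInfo>();

        MultilistField field = item.Fields["Images"]; // assumed field name
        if (field == null)
        {
            return images;
        }

        foreach (Item mediaItem in field.GetItems())
        {
            images.Add(new ImageInfo
            {
                EntityId = item.ID.Guid.ToString(),
                url = MediaManager.GetMediaUrl(mediaItem)
            });
        }

        return images;
    }
}
```

The point of the design is that all of the expensive work happens once at index time, so the DEF process only has to read a single string per item.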

This is an example of what the serialized data looks like:

"[{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Streetscape3.ashx\"},{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Streetscape.ashx\"},{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Streetscape1.ashx\"},{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Exterior.ashx\"},{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Streetscape3.ashx\"},{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Streetscape4.ashx\"},{\"EntityId\":\"4aa575d5-af18-466f-8e56-a06494ca6d16\",\"caption1\":\"NoCaption\",\"caption2\":null,\"freeformcaptions\":null,\"url\":\"https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Streetscape5.ashx\"}]"

This is the computed index field as registered in the index configuration.

<fields hint="raw:AddComputedIndexField">
  <!-- Custom Computed Fields -->
  <!-- Default SC Fields -->
  <field fieldName="images">Acme.Feature.Website.ComputedFields.CustomReporting.ImagesComputedField, Acme.Your.Website</field>
</fields>

In the process method of the DEF processor you can then deserialize the field and do your processing.

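// image.Imagelist holds the JSON string stored in the "images" computed field.
var imageinfo = JsonConvert.DeserializeObject<List<ImagesComputedField.ImageInfo>>(image.Imagelist);
foreach (var imagedet in imageinfo)
{
    // Each deserialized object becomes one row in the external table.
    UpdateDataTable(new Guid(imagedet.EntityId), imagedet.caption1,
        imagedet.caption2, imagedet.freeformcaptions, imagedet.url);
}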
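One thing to watch for: JsonConvert.DeserializeObject throws when it is handed a null string, and an item without images may have no value in the index at all. Below is a small self-contained sketch of a guarded round trip. ImageListParser is a hypothetical helper name, and the ImageInfo class is repeated only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class ImageInfo
{
    public string EntityId { get; set; }
    public string caption1 { get; set; }
    public string caption2 { get; set; }
    public string freeformcaptions { get; set; }
    public string url { get; set; }
}

// Hypothetical helper: never returns null and never throws on bad input.
public static class ImageListParser
{
    public static List<ImageInfo> Parse(string json)
    {
        if (string.IsNullOrWhiteSpace(json))
        {
            return new List<ImageInfo>(); // field missing or empty in the index
        }

        try
        {
            // The literal JSON value "null" deserializes to a null list.
            return JsonConvert.DeserializeObject<List<ImageInfo>>(json)
                   ?? new List<ImageInfo>();
        }
        catch (JsonException)
        {
            // Malformed or truncated JSON: treat as "no rows".
            return new List<ImageInfo>();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var original = new List<ImageInfo>
        {
            new ImageInfo
            {
                EntityId = "4aa575d5-af18-466f-8e56-a06494ca6d16",
                caption1 = "NoCaption",
                url = "https://acme.com/-/media/Images/Acme/Products/location/area51/subspecies/images/Exterior.ashx"
            }
        };

        string json = JsonConvert.SerializeObject(original);

        Console.WriteLine(ImageListParser.Parse(json).Count); // 1
        Console.WriteLine(ImageListParser.Parse(null).Count); // 0
    }
}
```

Whether silently skipping a bad value is right depends on your process. If you would rather know about it, log inside the catch block instead of swallowing the exception.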

So that is it. Not too bad, and it will solve a lot of issues where you need to move a lot of data. The DEF process went from more than 15 minutes to less than a minute once this index field was in place.

#Sitecore Data Exchange Framework Scheduling Tasks Options

One of the little-known things that comes with the DEF is a set of new commands that can be used to schedule DEF pipeline batches.

The new command options under System/Tasks/Commands/Data Exchange are the following:

  • Run All Pipeline Batches Command (Used to run multiple batches.)
  • Run Selected Pipeline Batches Command (Used to run one batch process.)

Run All Pipeline Batches Command

This command is used for running multiple batch processes. You will notice in the Pipeline Batches Root I just selected the Pipeline Batches parent folder. This should run all the batch processes underneath it.

Run Selected Pipeline Batches Command

With this command you can select one Batch Process to run.

Scheduling

Once you have your commands set up, scheduling is easy. Just create a scheduled task, select the command, and fill in the required fields as you would for any other command.

That’s it. Easy to schedule.

#Sitecore’s Data Exchange Framework Reddit Style Part 5 the Finale

In part 1 you saw what Sitecore items were created for the data exchange process. In part 2 you saw how the backend was connected and the classes defined in the Converters section. In part 3 you saw how the item models are used. In part 4 you saw how the pipeline step processor brings everything together. In this part I want to talk about how the Pipeline Batch that runs each step is called. This is one of the coolest parts, as you will be able to see the final result. You can find the source code here for reference. Still waiting for my module to get approved, but hopefully that will be soon.

Below you will find a Pipeline Batch that was created to run all the necessary steps to process and import the data into Sitecore.

When you first click on the Pipeline Batch you will notice a group of buttons on the ribbon. If something is not set up correctly these buttons will be disabled. To run the Pipeline Batch just click on the Run Pipeline Batch button.

While it is running the button will be greyed out and you will see the Stop Pipeline Batch button enabled. After it has finished running, the Run Pipeline Batch button will be enabled again.

Below are the steps that get run under Pipelines. You can also see above that the Reddit Pipeline is selected.

As you saw before the Reddit Pipeline Step hooks into the Converter and Processor. That will then call the next step below it (Iterate feed from Reddit Feeds and Run Pipeline).

The Iterate feed from Reddit Feeds and Run Pipeline step, as you see below, calls a pipeline (Reddit Feed to Sitecore Item Sync Pipeline) and stores the data that the processor output in the Pipeline Context Source.

On the Resolve Sitecore Item Pipeline Step I defaulted all the fields except for the following. You will see the Template for New Item is the Reddit Feed Data item. The Item Name Value Accessor uses one of the fields from the Value Accessor Set to name the item. For Endpoint From I just selected the Sitecore one. For the Identifier section you will see that the Identifier Value Accessor is a unique value that you can select from the Value Accessor Set. The Identifier Object Location is simply the Pipeline Context Source.

For the Apply Mapping from Reddit feed to Sitecore Item the fields are defaulted except for the Mapping Set. This is set to the Reddit Feed to Sitecore Item. You can see what it looks like below. This will map the data to the Sitecore fields.

For the Update Sitecore Item Pipeline Step I just chose the Sitecore Database Endpoint. This is the final step that updates the new item.

After everything runs you get the final result below:

So that is Sitecore’s Data Exchange Framework in a nutshell. I have some more stuff I will be showing in different blogs, but this concludes the Reddit feed import example. I hope you have found this useful. I will be updating these blogs and cleaning them up as time permits. Please let me know if you have any questions.

#Sitecore’s Data Exchange Framework Reddit Style Part 4

In part 1 you saw what Sitecore items were created for the data exchange process. In part 2 you saw how the backend was connected and the classes defined in the Converters section. In part 3 you saw how the item models are used. In this part I want to talk about the pipeline step processor, the piece that brings it all together. You can find the source code here for reference.

This is a snapshot of each folder that was created and the corresponding class file. You can see there is a Processors\PipelineSteps folder that contains the process class.

RedditItemsProcessor.cs

In Sitecore under Pipelines\Reddit Pipeline\Reddit Pipeline Step you will see under Processor Type the RedditItemsProcessor defined.

Below is the code for the processor. You will see that the Required EndpointPlugin is defined for RedditSettings. Endpoint, PipelineStep and PipelineContext should all have values at this point. As you will notice, pipelineContext.PipelineBatchContext.Logger.Info is used for any custom logging. This will be output in the batch window when the processor runs. See example below.

If you look at the method RedditFeed, what it does is create a new Reddit class using RedditSharp. It then uses the blogpath setting from the endpoint to retrieve the blog feed. The redditfeedresults variable gets set to a collection of Reddit blogs. From there that collection is passed to the DEF, which will use everything we set up to map and process each of the records. It will use the field names we defined earlier to match the Reddit feed fields to the fields defined for the new Reddit Sitecore item.

Output in the batch window:

Hopefully now you are getting the big picture of how all this stuff fits together. In part 5 I will get into running the DEF process and the different options you have. Let me know if you have any questions.

#Sitecore’s Data Exchange Framework Reddit Style Part 3

In part 1 you saw what Sitecore items were created for the data exchange process. In part 2 you saw how the backend was connected and the classes defined in the Converters section. In this part I want to talk briefly about the item models that are defined for the Data Exchange Framework. You can find the source code here for reference.

This is a snapshot of each folder that was created and the corresponding class file. You can see there is a Models folder.

RedditFeedFieldValueValueAccessorItemModel.cs

As mentioned in the previous blog, an ItemModel is used to define the field name for the Value Accessor fields. Using this approach is good practice: if you need to rename the field you only change it in one place, and it makes the accessor code cleaner.

The name corresponds to the field name of the template used for Value Accessor fields.
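As a rough idea of what such an item model can look like, here is a minimal sketch. I am assuming that ItemModel comes from the Sitecore.Services.Core.Model namespace and that the template field is called FieldName; the constant name and its value are illustrative, so check both against your DEF version and your own template.

```csharp
using Sitecore.Services.Core.Model;

// Holds the names of the template fields in one place so the converter
// does not have to repeat string literals.
public class RedditFeedFieldValueValueAccessorItemModel : ItemModel
{
    // Assumed name of the field on the Value Accessor template that stores
    // which Reddit feed field this accessor reads.
    public const string FieldName = "FieldName";
}
```

If the field on the template is ever renamed, only the constant's value changes and every place that reads it through the item model keeps working.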

RedditReadStepItemModel.cs

As shown in the last blog, the RedditEndPointConverter class uses the RedditReadStepItemModel to refer to the field that holds the selection of the required endpoint. As with the previous item model, this is good practice if you need to rename the field, and it makes the code cleaner.

Sitecore ItemModel class

The Sitecore ItemModel class that these item models inherit from is below.

So this blog was brief, but hopefully it gives a little glimpse of the value of having the item models defined. In part 4 I will get into the pipeline step processor, the most important piece that brings the functionality together. Let me know if you have any questions.

#Sitecore’s Data Exchange Framework Reddit Style Part 2

In part 1 you saw what items were created for the data exchange process. Now I want to show you some of the backend so you can get a better picture of how they are connected. In this blog I will go over the Converters section. You can find the source code here for reference.

This is a snapshot of each folder that was created and the corresponding class file.

RedditFeedFieldValueAccessorConverter.cs

Let’s start with the Value Accessor Converter. Below is the code.


The TemplateId ({68BD9AAD-635F-40F3-9ACD-711662C59EEC}) being set refers to the following template. This template inherits from the Value Accessor template installed by the DEF package.

How this corresponds to Sitecore is simple. A Value Accessor Set item is created and the items underneath it use this template. The RedditFeedFieldValueAccessorConverter class is set up to get the value of that field using the combination of the template and the field name defined in the RedditFeedFieldValueValueAccessorItemModel. You will see how this ties in to other parts of the DEF process as we go along.


In the code you will notice a ValueReader and a ValueWriter are defined. The ValueReader is used to read in the feed and match the fields for each Reddit blog item. That uses the RedditFeedValueReader class as you will see in another part of the blog. The ValueWriter defines a new PropertyValueWriter class. That class is part of the DEF and used to convert the field values.

RedditFeedValueReader.cs

This class will process the Reddit Post item and match field name with value. Title, AuthorName, SelfText etc… are field names in the Reddit feed. I believe you can name these differently, but I prefer to keep the same name as it is in the Reddit Post item.
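The matching itself boils down to a lookup from field name to property. Here is a framework-free sketch of just that logic, so it can be run on its own. RedditPostData is a hypothetical stand-in for the RedditSharp post type and RedditFieldReader is an illustrative name. In the real class this logic sits inside the DEF value reader, whose exact method signatures depend on the DEF version you have installed.

```csharp
using System;

// Hypothetical stand-in for the RedditSharp post type.
public class RedditPostData
{
    public string Title { get; set; }
    public string AuthorName { get; set; }
    public string SelfText { get; set; }
}

public static class RedditFieldReader
{
    // Returns true and sets value when fieldName is a known feed field.
    public static bool TryRead(object source, string fieldName, out object value)
    {
        value = null;

        var post = source as RedditPostData;
        if (post == null || string.IsNullOrEmpty(fieldName))
        {
            return false; // wrong source type or no field configured
        }

        switch (fieldName)
        {
            case "Title":
                value = post.Title;
                return true;
            case "AuthorName":
                value = post.AuthorName;
                return true;
            case "SelfText":
                value = post.SelfText;
                return true;
            default:
                return false; // unknown field name
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var post = new RedditPostData { Title = "Hello DEF", AuthorName = "someone" };

        object value;
        bool found = RedditFieldReader.TryRead(post, "Title", out value);

        Console.WriteLine(found + ": " + value); // True: Hello DEF
    }
}
```

Keeping the field names identical to the property names on the post makes this switch easy to read and easy to extend.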

RedditEndPoint.cs

The Reddit End Point will have the information for the Blog Path in Reddit. The TemplateId defined points to the RedditEndPoint template. Settings are returned and added to the plugin. The plugin is required to be implemented by the pipeline converter and processor.


This is the plugin model code:


These are the values of the Reddit Endpoint. You can see that the Converter Type points to the RedditEndPoint class.

More about this later, but as you can see in the Reddit Pipeline Step the Reddit Endpoint is selected. Now you can see how this all ties together.

RedditEndPointConverter.cs

The converter is one of the most crucial parts of the DEF. This is what ties together the pipeline step of the DEF. The TemplateId is the template id of the Reddit Pipeline step. As you can see the endpoint settings are passed in to the pipeline step.


Looking in Sitecore you can see where the Converter Type points to the Endpoint Converter, and also the Processor Type, which will get called, receive the endpoint information, and do the Reddit feed processing. More about that in another blog though.

Hopefully you are seeing how some of this stuff is tied together. In part 3 of the blog I will talk a little about the models, which will be a short blog. In part 4 I will get into the processor. Let me know if you have any questions.

#Sitecore’s Data Exchange Framework Reddit Style Part 1

Now that I have been using #Sitecore’s Data Exchange Framework for a while, it was time to share my knowledge with the rest of the world. I will break this blog post up into various parts. Hopefully it will give you a good overview of the DEF and how powerful it can be. I have used it for a few different things, but for this example I decided to create something somewhat unique. I will try to keep each blog as short as possible, but there is a lot to cover. I will eventually update each blog with a link to the finished Sitecore module and source code.

The first thing you want to do is download the Data Exchange Framework. You can find it here. I installed both the Data Exchange Framework and the Sitecore Provider for Data Exchange Framework.

After that, simply right-click on the Data Exchange item under System and add a new tenant. Below are some of the settings I created and used. This should give you a rough outline of what you should create when doing your DEF feed.

I will go into more detail, but after I was done adding all the stuff for my Reddit feed this is what it looked like:

Value Accessor Sets

  • Used to map feed values from the source
  • Used to map source values to Sitecore

Each Accessor set has a field. I made these fields the same names I was getting from the feed. It made things a little easier to remember. For the list of fields check out the RedditSharp API here.

For the Sitecore field you will use the fields of a template you create. In this case it will be a Reddit item template.

The field value is going to point to the item template of the Reddit feed item. I kept the field naming consistent with the feed.

Value Mapping Sets

To bring those together we will use a value mapping set. This will make the connection from the feed field to the Sitecore field.

Example setting:

Endpoints

In this example I created a template that contains the settings needed for the field. The template inherits from the endpoint base template. For the Sitecore endpoint I created one and used the default values.

Feed Endpoint example:

The converter type points to a created method in the code behind. The method is used to retrieve the blog path.

Sitecore Endpoint example (defaulted values):

After the batch process runs all of the pipeline steps the following will be created in Sitecore. I will go into how this is created in other parts of the blog, but this will give you a picture of what happens. Keep in mind the DEF can be used to do more than just create Sitecore items. Lots of things like external SQL tables, CRMs, MongoDB etc… could be updated.

RedditList

In the next blog post I will talk about how things get tied together. It should make it clearer why things are set up this way in this blog post. You can find it at Sitecore’s Data Exchange Framework Reddit Style Part 2.