Can You Handle This (DataProvider Gotcha)

Just a quick aside from my main pet project 🙂

There’s a load of stuff about DataProviders in Sitecore. I love them but it can be hard to make them efficient and it’s a minefield – you can really break Sitecore when you’re that close to the metal.

So here’s one I found on a project recently. If you want to know more about DataProviders generally – this blog post is a good start:

http://www.techphoria414.com/blog/2011/january/black-art-of-sitecore-data-providers

I’d also recommend digging out some of the open source DataProviders and having a poke about.

I’ve written two sets of DataProviders so far (both for integrating with DAM systems) and they were all working fine until one particular client had an issue. Some investigation by their talented developers identified this bit of code as the culprit:

private bool CanProcessItem(ID id)
{
   if (IDTable.GetKeys(_idTablePrefix, id).Length == 0)
   {
       return false;
   }

   return true;
}

This particular approach is taken from the YouTubeDataProvider – the reason it was causing an issue on this particular Sitecore solution was partly down to the volume of fields on some of the items. Here’s how Sitecore treats the DataProvider process: what’s that? You want an Item? Cool, I’ll get it from this DataProvider, and this DataProvider and so on…

It’s something to remember. Whenever Sitecore gets an Item, the corresponding method is run on EVERY DataProvider. It’s also worth remembering that Sitecore has been built in a modular fashion from the ground up, so every field is an item too… That’s why we need CanProcessItem in the first place: so we can filter out unnecessary calls.

The other issue is that the IDTable.GetKeys method calls SQL Server directly to get the values out of the IDTable table, so there’s no caching and every single call to CanProcessItem hits the database.

The first thing we’ve tried (which is currently being tested) is to check for a templateId first so we can reduce the number of hits to the IDTable (only certain templates will need to use the DataProvider):

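private bool CanProcessItem(ID id, ID templateId)
{
    // Cheap check first: if we know the template and it isn't ours, bail out
    // without touching the database.
    if (templateId != ID.Null && templateId != _templateId)
    {
        return false;
    }

    // Expensive check: this hits the IDTable in SQL Server on every call.
    if (IDTable.GetKeys(_idTablePrefix, id).Length == 0)
    {
        return false;
    }

    return true;
}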

Now, that’s all well and good for every method in the DataProvider except:

public override ItemDefinition GetItemDefinition(ID itemId, CallContext context)

We don’t have an ItemDefinition at this point (because that’s the point of the method *duh*) so the only way to get the templateId is to hit the database in some way (we can’t get the item because that will in turn launch the DataProvider code and you’ll be stuck in an infinite loop).

That’s why I say we’re currently “testing” this solution. If that doesn’t speed the processing up enough for this scenario – we’re going to have to look at other options:

  1. Caching / pre-caching IDs.
  2. Storing the IDs somewhere else?
  3. Some other idea I’ve not thought of yet.
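
For what it’s worth, here’s a rough sketch of what option 1 might look like. It’s not tested against a real Sitecore solution and it’s not what we’ve actually implemented. ProviderIdCache, CanProcess and Invalidate are names I’ve made up for illustration, and it uses plain Guids and a delegate so it runs on its own; in a real DataProvider the delegate would presumably wrap the IDTable.GetKeys check from above.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper (not part of Sitecore): remembers the answer to
// "does this ID belong to my provider?" so repeat calls skip the database.
public sealed class ProviderIdCache
{
    private readonly ConcurrentDictionary<Guid, bool> _known =
        new ConcurrentDictionary<Guid, bool>();
    private readonly Func<Guid, bool> _lookup;

    // 'lookup' stands in for the expensive check. Assumption: in the real
    // provider it would wrap something like the IDTable.GetKeys call.
    public ProviderIdCache(Func<Guid, bool> lookup)
    {
        if (lookup == null) throw new ArgumentNullException("lookup");
        _lookup = lookup;
    }

    // Returns the cached answer if there is one; otherwise runs the lookup
    // and stores the result. Under heavy contention the lookup may run more
    // than once for the same ID, which is harmless for a read-only check.
    public bool CanProcess(Guid id)
    {
        return _known.GetOrAdd(id, _lookup);
    }

    // Call this whenever the provider adds or removes an entry for the ID,
    // otherwise the cache will keep serving a stale answer.
    public void Invalidate(Guid id)
    {
        bool ignored;
        _known.TryRemove(id, out ignored);
    }
}

public static class Demo
{
    public static void Main()
    {
        int databaseHits = 0;
        Guid mine = Guid.NewGuid();

        // Fake lookup that counts how often the "database" is hit.
        var cache = new ProviderIdCache(id => { databaseHits++; return id == mine; });

        Console.WriteLine(cache.CanProcess(mine)); // True
        Console.WriteLine(cache.CanProcess(mine)); // True (served from the cache)
        Console.WriteLine(databaseHits);           // 1
    }
}
```

The catch is that this caches the “no” answers too, and since Sitecore asks about pretty much every item, the dictionary could get big. You’d probably want a size limit or expiry, plus careful invalidation whenever the provider creates or deletes IDTable entries.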

Anyway – wanted to get that out while it was on my mind. If there’s an appetite for it – I might write more about DataProviders at some point. Let me know!
