Monday, February 6, 2012

Sitecore Fetch Squad

Automated crawler fetching websites and blogs from Sitecore content

Archive for the ‘Sitecore’ Category

Previously I published an article explaining how to fix a glitch in the Sitecore index that keeps old data after an item is updated. In this article I update the code, since some of the issues were fixed in Sitecore 6.2 (Update-4) and 6.3.

Starting with 6.2 (Update-4) and 6.3, the _template field in the Lucene index stores the ID in ShortID format. It is now even easier to customize the DatabaseCrawler because the following methods were made virtual: AddItem, IndexVersion, DeleteItem and DeleteVersion. The fields _hasIncludes, _hasExcludes and _templateFilter were also made protected, which shrinks the fix even more.

Here is the code that fixes the outlined problem for these releases:

// Usings the snippet relies on (not shown in the original listing).
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.Search;

namespace Lucene.Search.Crawlers
{
    public class DatabaseCrawler : Sitecore.Search.Crawlers.DatabaseCrawler
    {
        // Builds the criteria used to find index entries on update/delete.
        // All ID terms are lower-cased ShortIDs.
        protected override void AddMatchCriteria(Net.Search.BooleanQuery query)
        {
            query.Add(new TermQuery(new Term(BuiltinFields.Database, RootItem.Database.Name)), BooleanClause.Occur.MUST);
            query.Add(new TermQuery(new Term(BuiltinFields.Path, ShortID.Encode(RootItem.ID).ToLowerInvariant())), BooleanClause.Occur.MUST);
            if (this._hasIncludes || this._hasExcludes)
            {
                foreach (KeyValuePair<string, bool> pair in this._templateFilter)
                {
                    query.Add(new TermQuery(new Term(BuiltinFields.Template, ShortID.Encode(pair.Key).ToLowerInvariant())), pair.Value ? BooleanClause.Occur.SHOULD : BooleanClause.Occur.MUST_NOT);
                }
            }
        }

        protected Item RootItem
        {
            get
            {
                return Sitecore.Data.Managers.ItemManager.GetItem(Root, Sitecore.Globalization.Language.Invariant,
                                                                  Sitecore.Data.Version.Latest,
                                                                  Sitecore.Data.Database.GetDatabase(Database),
                                                                  Sitecore.SecurityModel.SecurityCheck.Disable);
            }
        }

        // Same as the base query, but with the item ID term lower-cased.
        protected override Query GetVersionQuery(ID id, string language, string version)
        {
            Assert.ArgumentNotNull(id, "id");
            Assert.ArgumentNotNullOrEmpty(language, "language");
            Assert.ArgumentNotNullOrEmpty(version, "version");
            BooleanQuery query = new BooleanQuery();
            query.Add(new TermQuery(new Term(BuiltinFields.ID, GetItemID(id, language, version).ToLowerInvariant())), BooleanClause.Occur.MUST);
            this.AddMatchCriteria(query);
            return query;
        }
    }
}

Basically, you need to make sure of two things: the result of the GetItemID method is converted to lower case, and the term for the _path field in the AddMatchCriteria method is built from a lower-case value.
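To make the "lower-case ShortID" idea concrete, here is a minimal sketch that runs without Sitecore. It assumes a ShortID is simply the item GUID without braces and hyphens; the class and method names are made up for illustration and are not part of the Sitecore API.

```csharp
using System;

public static class ShortIdSketch
{
    // Assumption: a ShortID is the GUID without braces or hyphens.
    // Guid.ToString("N") produces exactly that shape; the extra
    // ToLowerInvariant() mirrors what the crawler code above does.
    public static string ToLowerShortId(string guidText)
    {
        return Guid.Parse(guidText).ToString("N").ToLowerInvariant();
    }
}
```

For example, ShortIdSketch.ToLowerShortId("{788EF1BE-B71E-4D59-9276-50519BD4F641}") returns "788ef1beb71e4d59927650519bd4f641", which is the form of value the _path term is matched against.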

Sitecore Gadgets

Sitecore Lucene index does not remove old data

Posted by admin On October - 30 - 2011

It looks like interest in the Sitecore implementation of the Lucene index has risen since the Dream Core event, and developers have run into an issue with old data being kept in the index repository. In this article I want to show you how to work around it.
First of all, let’s see why it happens. I ran into this issue when I started playing with the new implementation of the Lucene index in Sitecore 6. When I printed the search results I saw duplicates of my data. I started debugging my code and found that Lucene somehow recognizes raw GUIDs, which breaks the search criteria Sitecore uses to find items during the update/delete procedure.
To solve this I had to create an additional field in the Lucene index (_shorttemplateid) and store the item’s template ID there in ShortID format (ShortID.Encode(item.TemplateID)). I then overrode the AddMatchCriteria method and the properties it depends on, so that the short template GUID is used in the matching criteria. Below is the code example.

// Usings the snippet relies on (not shown in the original listing).
using System;
using System.Collections;
using System.Collections.Generic;
using System.Xml;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.Search;

namespace LuceneExamples
{
   public class DatabaseCrawler : Sitecore.Search.Crawlers.DatabaseCrawler
   {
      #region Fields

      private bool _hasIncludes;
      private bool _hasExcludes;
      private Dictionary<string, bool> _templateFilter;
      private ArrayList _customFields;

      #endregion Fields

      #region ctor

      public DatabaseCrawler()
      {
         _templateFilter = new Dictionary<string, bool>();
         _customFields = new ArrayList();
      }

      #endregion ctor

      #region Base class methods

      // Should be overridden to add date fields in "yyyyMMddHHmmss" format. Otherwise it's not possible to create range queries for date values.
      // Also adds _shorttemplateid field which has a template id in ShortID format.
      protected override void AddAllFields(Document document, Item item, bool versionSpecific)
      {
         Assert.ArgumentNotNull(document, "document");
         Assert.ArgumentNotNull(item, "item");
         Sitecore.Collections.FieldCollection fields = item.Fields;
         fields.ReadAll();
         foreach (Sitecore.Data.Fields.Field field in fields)
         {
            if (!string.IsNullOrEmpty(field.Key) && (field.Shared != versionSpecific))
            {
               bool tokenize = base.IsTextField(field);
               if (IndexAllFields)
               {
                  if (field.TypeKey == "date" || field.TypeKey == "datetime")
                  {
                     IndexDateFields(document, field.Key, field.Value);
                  }
                  else
                  {
                     document.Add(CreateField(field.Key, field.Value, tokenize, 1f));
                  }
               }
               if (tokenize)
               {
                  document.Add(CreateField(BuiltinFields.Content, field.Value, true, 1f));
               }
            }
         }
         AddShortTemplateId(document, item);
         AddCustomFields(document, item);
      }

      /// <summary>
      /// Loops through the collection of custom fields and adds them to fields collection of each indexed item.
      /// </summary>
      /// <param name="document">Lucene document</param>
      /// <param name="item">Sitecore data item</param>
      private void AddCustomFields(Document document, Item item)
      {
         foreach (CustomField field in _customFields)
         {
            document.Add(CreateField(field.LuceneFieldName, field.GetFieldValue(item), field.StorageType, field.IndexType, Boost));
         }
      }

      /// <summary>
      /// Creates a Lucene field.
      /// </summary>
      /// <param name="fieldKey">Field name</param>
      /// <param name="fieldValue">Field value</param>
      /// <param name="storeType">Storage option</param>
      /// <param name="indexType">Index type</param>
      /// <param name="boost">Boosting parameter</param>
      /// <returns></returns>
      private Fieldable CreateField(string fieldKey, string fieldValue, Field.Store storeType, Field.Index indexType, float boost)
      {
         Field field = new Field(fieldKey, fieldValue, storeType, indexType);
         field.SetBoost(boost);
         return field;
      }

      /// <summary>
      /// Parses a configuration entry for a custom field and adds it to a collection of custom fields.
      /// </summary>
      /// <param name="node">Configuration entry</param>
      public void AddCustomField(XmlNode node)
      {
         CustomField field = CustomField.ParseConfigNode(node);
         if (field == null)
         {
            throw new InvalidOperationException("Could not parse custom field entry: " + node.OuterXml);
         }
         _customFields.Add(field);
      }

      // Method should use _shorttemplateid to allow one to create combined/boolean search queries with a template id reference.
      // Also used to create the matching criteria for update/delete actions.
      protected override void AddMatchCriteria(BooleanQuery query)
      {
         query.Add(new TermQuery(new Term(BuiltinFields.Database, Database)), BooleanClause.Occur.MUST);
         query.Add(new TermQuery(new Term(BuiltinFields.Path, Sitecore.Data.ShortID.Encode(Root).ToLowerInvariant())), BooleanClause.Occur.MUST);
         if (HasIncludes || HasExcludes)
         {
            foreach (KeyValuePair<string, bool> pair in TemplateFilter)
            {
               query.Add(new TermQuery(new Term(Constants.ShortTemplate, Sitecore.Data.ShortID.Encode(pair.Key).ToLowerInvariant())), pair.Value ? BooleanClause.Occur.SHOULD : BooleanClause.Occur.MUST_NOT);
            }
         }
      }

      // Method should be overridden because _hasIncludes and _hasExcludes variables were introduced.
      protected override bool IsMatch(Item item)
      {
         bool flag;
         Assert.ArgumentNotNull(item, "item");
         if (!RootItem.Axes.IsAncestorOf(item))
         {
            return false;
         }
         if (!HasIncludes && !HasExcludes)
         {
            return true;
         }
         if (!TemplateFilter.TryGetValue(item.TemplateID.ToString(), out flag))
         {
            return !HasIncludes;
         }
         return flag;
      }

      // Method required to override AddMatchCriteria one.
      new public void IncludeTemplate(string templateId)
      {
         Assert.ArgumentNotNullOrEmpty(templateId, "templateId");
         _hasIncludes = true;
         _templateFilter[templateId] = true;
      }

      // Method required to override AddMatchCriteria one.
      new public void ExcludeTemplate(string templateId)
      {
         Assert.ArgumentNotNullOrEmpty(templateId, "templateId");
         _hasExcludes = true;
         _templateFilter[templateId] = false;
      }

      #endregion Base class methods

      /// <summary>
      /// Converts Sitecore date and datetime fields to the recognizable format for Lucene API.
      /// </summary>
      /// <param name="doc">Lucene document object</param>
      /// <param name="fieldKey">Field name</param>
      /// <param name="fieldValue">Field value</param>
      private void IndexDateFields(Document doc, string fieldKey, string fieldValue)
      {
         DateTime dateTime = Sitecore.DateUtil.IsoDateToDateTime(fieldValue);
         string luceneDate = "";
         if (dateTime != DateTime.MinValue)
         {
            luceneDate = dateTime.ToString(Constants.DateTimeFormat);
         }
         doc.Add(CreateField(fieldKey, luceneDate, false, 1f));
      }

      /// <summary>
      /// Adds template id in ShortID format
      /// </summary>
      /// <param name="doc">Lucene document object</param>
      /// <param name="item">Sitecore item</param>
      private void AddShortTemplateId(Document doc, Item item)
      {
         doc.Add(CreateField(Constants.ShortTemplate, Sitecore.Data.ShortID.Encode(item.TemplateID).ToLowerInvariant(), false, 1f));
      }

      #region Properties

      protected bool HasIncludes
      {
         get
         {
            return _hasIncludes;
         }
         set
         {
            _hasIncludes = value;
         }
      }

      protected bool HasExcludes
      {
         get
         {
            return _hasExcludes;
         }
         set
         {
            _hasExcludes = value;
         }
      }

      protected Dictionary<string, bool> TemplateFilter
      {
         get
         {
            return _templateFilter;
         }
      }

      protected Item RootItem
      {
         get
         {
            return Sitecore.Data.Managers.ItemManager.GetItem(Root, Sitecore.Globalization.Language.Invariant,
                                                              Sitecore.Data.Version.Latest,
                                                              Sitecore.Data.Database.GetDatabase(Database),
                                                              Sitecore.SecurityModel.SecurityCheck.Disable);
         }
      }

      #endregion Properties

   }
}

This should solve the issue and also store Sitecore date and datetime fields in a format that Lucene can use in range queries. It also makes it possible to build combined and boolean search queries that reference the template ID.
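The reason the "yyyyMMddHHmmss" format works for range queries is that the strings sort in the same order as the dates they represent. Here is a small standalone sketch of that property. The class and the IsInRange helper are hypothetical and only illustrate the idea; they are not part of Sitecore or Lucene.

```csharp
using System;
using System.Globalization;

public static class LuceneDateSketch
{
    // Same format string as Constants.DateTimeFormat in the crawler.
    public const string DateTimeFormat = "yyyyMMddHHmmss";

    // Turns a date into the fixed-width string that gets indexed.
    public static string ToIndexValue(DateTime value)
    {
        return value.ToString(DateTimeFormat, CultureInfo.InvariantCulture);
    }

    // Hypothetical helper: an inclusive range check done purely on strings,
    // which is what a term range comparison boils down to.
    public static bool IsInRange(string indexed, DateTime from, DateTime to)
    {
        return string.CompareOrdinal(indexed, ToIndexValue(from)) >= 0
            && string.CompareOrdinal(indexed, ToIndexValue(to)) <= 0;
    }
}
```

For instance, 30 October 2011 09:05:07 is indexed as "20111030090507", and a plain string comparison places it between "20110101000000" and "20111231000000".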

Update. Code for the Constants class:

namespace LuceneExamples
{
   public class Constants
   {
      // special field for template id in ShortID format
      public const string ShortTemplate = "_shorttemplateid";

      // searchable date-time format. All datetime field
      public const string DateTimeFormat = "yyyyMMddHHmmss";

      // Path to lucene setting items: /sitecore/system/Settings/Lucene
      public const string LuceneSettingsPath = "{89783047-026C-45B5-AB5B-338E4A22446C}";
   }
}

Hope it saves someone a minute or two.

Sitecore Gadgets

If you have read my previous posts about the Lucene search index, you already know how to configure it and how it works in a Sitecore application.
In this part we will look at the API to see what can be achieved with the search index.

To search an existing index we first need to get an index instance. I’m not going to show the code you would write against the old search index. If you’re interested in it, check out the article Lucene Search Engine.

The additional API layer for the new search index resides in the Sitecore.Search namespace. To get a search index object, use the members of the SearchManager class.

To get an index by name, use this line of code:

    Index indx = SearchManager.GetIndex(index_name);

If you want to use the default system index, you can simply use the SystemIndex property of the SearchManager class. To look something up in the index through the Sitecore API, you need to create a search context.
The easiest way is to call the CreateSearchContext method of the index object we got previously. It’s also possible to create a search context with one of the constructors of the IndexSearchContext class, passing the search index instance as a parameter.

To search for information, we need to create a query and run it in the search context. The Sitecore API has a few classes that you can use to build search queries. Let’s take a look at each of them:
FullTextQuery – this type of query searches the index by the “_content” field. All information from text fields (such as “Single-Line Text”, “Rich Text”, “Multi-Line Text”, “text”, “rich text”, “html”, “memo”) is stored there. Data in this field is indexed and tokenized, which means that search operations on it are very efficient.
FieldQuery – this type of query allows you to search any field that was added to the index. By default the database crawler adds all item fields to the index.
CombinedQuery – this type of query was designed to let you create complex queries with additional conditions, for instance to find items which have a specific word in the title and belong to some category. When you add search queries to this type of query, you need to supply a QueryOccurance parameter. It’s an enum type with the following members:
– Must – it’s a logical AND operator in a Lucene boolean query.
– MustNot – it’s a logical NOT operator in a Lucene boolean query.
– Should – it’s a logical OR operator in a Lucene boolean query.
You can read more about these operators in the Query Parser Syntax article.
All of these query types are derived from the QueryBase class.
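
To get a feel for how Must, MustNot and Should combine, here is a toy model that evaluates clauses against a single document represented as a dictionary of field values. It is only a sketch of the usual Lucene boolean-query rules as I understand them (every Must clause has to hit, no MustNot clause may hit, and when there are no Must clauses at least one Should clause has to hit). The Occur, Clause and BooleanClauseSketch types are invented for this example and are not Sitecore or Lucene classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Occur { Must, MustNot, Should }

// Hypothetical stand-in for "field equals value" with an occurrence flag.
public sealed class Clause
{
    public Occur Occur { get; private set; }
    public string Field { get; private set; }
    public string Value { get; private set; }

    public Clause(Occur occur, string field, string value)
    {
        Occur = occur;
        Field = field;
        Value = value;
    }

    public bool IsHit(IDictionary<string, string> doc)
    {
        string actual;
        return doc.TryGetValue(Field, out actual) && actual == Value;
    }
}

public static class BooleanClauseSketch
{
    public static bool Matches(IDictionary<string, string> doc, params Clause[] clauses)
    {
        // Any hit on a MustNot clause rejects the document.
        if (clauses.Any(c => c.Occur == Occur.MustNot && c.IsHit(doc))) return false;

        // Every Must clause has to hit.
        var must = clauses.Where(c => c.Occur == Occur.Must).ToList();
        if (must.Any(c => !c.IsHit(doc))) return false;
        if (must.Count > 0) return true;

        // Without Must clauses, at least one Should clause has to hit.
        return clauses.Any(c => c.Occur == Occur.Should && c.IsHit(doc));
    }
}
```

With a document { category = "slr", rating = "5" }, two Must clauses on category and rating match, while adding a MustNot clause on rating = "5" rejects it. A query made only of MustNot clauses matches nothing in this model.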

There is one thing left before we jump from theory to code samples. To run the queries you have defined, you need to use one of the Search methods of the IndexSearchContext object.
Now let’s create a couple of samples to see the real code behind the theory.

Sample 1: Searching text fields

    // Next samples will skip lines with getting the index instance and creating the search context.
    // Get an index object
    Index indx = SearchManager.GetIndex("my_index");
    // Create a search context
    using (IndexSearchContext searchContext = indx.CreateSearchContext())
    {
        // In following examples I will be using QueryBase class to create search queries.
        FullTextQuery ftQuery = new FullTextQuery("welcome");
        SearchHits results = searchContext.Search(ftQuery);
    }

Sample 2: Searching item fields

Let’s say we want to find items classified by some category. There is a trick to searching by GUIDs, so let’s say our category is just a string name.

    …..
    // FieldQuery ctor accepts two parameters. First is a field name. The other one is a value we’re looking for.
    QueryBase query = new FieldQuery("category", "slr");
    SearchHits results = searchContext.Search(query);
    …..

Sample 3: Searching by multiple conditions

It turned out that the category parameter is not enough to get the required results. Your client is screaming that there are too many items and business users cannot find the ones they are looking for (is there anything they can find?).
Obviously you have some additional fields that can help to narrow the results. Let’s say there is a rating field with values from 1 to 5.
That’s where CombinedQuery comes into play.

    …..
    // CombinedQuery object has Add method that should be used to add search queries to it. That’s why we cannot use base class variable here.
    CombinedQuery query = new CombinedQuery();
    QueryBase catQuery = new FieldQuery("category", "slr");
    QueryBase ratQuery = new FieldQuery("rating", "5");
    query.Add(catQuery, QueryOccurance.Must);
    query.Add(ratQuery, QueryOccurance.Must);
    var hits = searchContext.Search(query);
    …..

All results are returned as a SearchHits object. You can use one of the following methods of the SearchHits object to get the results as Sitecore items:
– FetchResults(int, int) – returns search results as a SearchResultCollection. The first parameter is the position of the first result you want to fetch. The second one is the number of results you want to fetch. By calling this method as shown below, you can get all results at once:
    var results = hits.FetchResults(0, hits.Length);
– Slice(int) – returns all results as an IEnumerable collection.
– Slice(int, int) – this method has a signature similar to FetchResults but returns the results as an IEnumerable collection.
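
The start/count pair maps naturally onto paging. The sketch below shows the arithmetic on a plain list, with LINQ's Skip and Take standing in for FetchResults(start, count). PagingSketch and its methods are hypothetical names used only for illustration, and the page index is assumed to be zero-based.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Position of the first result on a zero-based page.
    public static int PageStart(int pageIndex, int pageSize)
    {
        return pageIndex * pageSize;
    }

    // Stand-in for a (start, count) fetch: skip "start" results, take up to "count".
    public static List<T> FetchPage<T>(IList<T> hits, int start, int count)
    {
        return hits.Skip(start).Take(count).ToList();
    }
}
```

With 25 hits and a page size of 10, page 2 starts at position 20 and contains the remaining 5 results.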

Here are a couple of examples of how you can transform a SearchHits object into Sitecore items.

Sample 4: using FetchResults

    …..
    SearchResultCollection results = hits.FetchResults(0, hits.Length);
    IEnumerable<Item> searchItems = from hit in results
                                    select hit.GetObject<Item>();
    …..

Sample 5: using Slice

    …..
    IList<Item> searchItems = new List<Item>();
    foreach (var hit in hits.Slice(0))
    {
        ItemUri itemUri = new ItemUri(hit.Url);
        if (itemUri != null)
        {
            Item item = ItemManager.GetItem(itemUri.ItemID, itemUri.Language, itemUri.Version, Factory.GetDatabase(itemUri.DatabaseName));
            if (item != null)
            {
                searchItems.Add(item);
            }
        }
    }
    …..

It’s worth mentioning that some overloads of the Search method of the IndexSearchContext class accept a Lucene.Net.Search.Query as the search query parameter. This becomes very useful when you need to create a complex query that cannot be built with the Sitecore query types.

Searching GUIDs.

The new search index has lots of useful built-in fields that help to build strict queries.
Besides the standard fields, it has the following fields that contain GUIDs in ShortID format:
– _links – contains all references to the current item
– _path – contains ShortIDs for every parent item in the path relative to the current item
– _template – contains the GUID of the item’s template.
NOTE: this field is supposed to hold a ShortID value instead of a GUID. It should not be used in combined queries prior to the Sitecore 6.2 releases.
If you decide to add custom fields to your search index and they should hold GUID values, you need to store them as lower-case ShortIDs. Otherwise the search will not find any results. This happens because Lucene recognizes GUIDs and applies special parsing to them. It works fine if the search query looks into only one field. If it’s a combined/complex query, it fails to find anything even when the query is correct.
So remember: if you need to filter search results by template, you will have to customize the DatabaseCrawler to add another field (e.g. _shorttemplateid) that stores the item’s template ID in ShortID format.

Sample 6: Find all item references

    …..
    QueryBase query = new FieldQuery("_links", ShortID.Encode(item.ID).ToLowerInvariant());
    SearchHits results = searchContext.Search(query);
    …..

Sample 7: Find all items based on specified template

    …..
    // Prior to Sitecore 6.2 release, you will need to add and use _shorttemplateid field
    QueryBase query = new FieldQuery("_template", ShortID.Encode(item.ID).ToLowerInvariant());
    SearchHits results = searchContext.Search(query);
    …..

Sample 8: Find items that are descendants of a specified one

    …..
    QueryBase query = new FieldQuery("_path", ShortID.Encode(item.ID).ToLowerInvariant());
    SearchHits results = searchContext.Search(query);
    …..

Sample 9: Find items that are under a given parent and belong to a specific template

    …..
    CombinedQuery query = new CombinedQuery();
    query.Add(new FieldQuery("_shorttemplateid", ShortID.Encode(templateId).ToLowerInvariant()), QueryOccurance.Must);
    query.Add(new FieldQuery("_path", ShortID.Encode(parent.ID).ToLowerInvariant()), QueryOccurance.Must);

    …..

That’s all I wanted to say about the Lucene search index in Sitecore 6. I hope it helps Lucene beginners to better understand the concept and get up to speed with what the Lucene search index can do.
Enjoy!

Sitecore Gadgets

Teach User Manager how to search by email

Posted by admin On October - 30 - 2011
A couple of days ago an interesting issue popped up in my Inbox. Our customer wanted to search users by email address in the User Manager application. Quite a legitimate request, especially if you have an integration with AD, CRM or some other external security system with a large number of users. I tried a couple of configuration changes with no luck. Then I checked our support portal knowledge base and saw several requests like that. I was surprised to find that you cannot search users by email.
After spending several hours investigating how the search functionality in User Manager works and doing some “shaman dancing” around it, I ended up with a solution that overrides the standard Sitecore.Security.Accounts.UserProvider and changes its “GetUsersByName” method. As it turned out, the search functionality of User Manager calls the GetUsersByName method when one searches for a user.
This is how it looks:
// Usings the snippet relies on (not shown in the original listing).
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Web.Security;
using Sitecore.Collections;
using Sitecore.Configuration;
using Sitecore.Diagnostics;
using Sitecore.Security.Accounts;

namespace Sitecore.SharedSource.Security.Accounts
{
    public class UserProvider : Sitecore.Security.Accounts.UserProvider
    {
        protected override IEnumerable<User> GetUsersByName(int pageIndex, int pageSize, string userNameToMatch)
        {
            return FindUsers(pageIndex, pageSize, userNameToMatch);
        }

        // Searches by email when the search term looks like an email address, otherwise by user name.
        protected IEnumerable<User> FindUsers(int pageIndex, int pageSize, string userNameToMatch)
        {
            Assert.ArgumentNotNull(userNameToMatch, "userNameToMatch");
            IEnumerable<User> users = null;
            // Strip the wildcard from both ends of the search term before testing it.
            string userWithNoWildcard = StringUtil.RemovePostfix(Settings.Authentication.VirtualMembershipWildcard, StringUtil.RemovePrefix(Settings.Authentication.VirtualMembershipWildcard, userNameToMatch));

            if (Regex.IsMatch(userWithNoWildcard, @"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$"))
            {
                users = findByEmail(pageIndex, pageSize, userWithNoWildcard);
            }
            else
            {
                users = findUsername(pageIndex, pageSize, userNameToMatch);
            }
            return users;
        }

        protected IEnumerable<User> findUsername(int pageIndex, int pageSize, string username)
        {
            int total;
            return new Enumerable<User>(delegate
            {
                return Membership.FindUsersByName(username, pageIndex, pageSize, out total).GetEnumerator();
            },
            delegate(object user)
            {
                return User.FromName(((MembershipUser)user).UserName, false);
            });
        }

        protected IEnumerable<User> findByEmail(int pageIndex, int pageSize, string userToMatch)
        {
            int total;
            return new Enumerable<User>(delegate
            {
                return Membership.FindUsersByEmail(userToMatch, pageIndex, pageSize, out total).GetEnumerator();
            },
            delegate(object user)
            {
                return User.FromName(((MembershipUser)user).UserName, false);
            });
        }
    }
}
That’s it! Simple enough, isn’t it?
Thanks to Alex Shyba who helped me with some useful code!
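If you want to try the email-versus-username decision in isolation, here is a minimal sketch that runs without Sitecore. It reuses the regular expression from the provider. The wildcard is passed in as a parameter (the "%" used in the example is an assumption, in Sitecore it comes from Settings.Authentication.VirtualMembershipWildcard), and UserSearchSketch with its methods is a hypothetical helper, not part of the Sitecore API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UserSearchSketch
{
    // Same pattern as in the FindUsers method.
    private static readonly Regex EmailPattern = new Regex(
        @"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$");

    // Removes one leading and one trailing wildcard, if present.
    public static string StripWildcard(string term, string wildcard)
    {
        if (wildcard.Length == 0) return term;
        if (term.StartsWith(wildcard, StringComparison.Ordinal))
        {
            term = term.Substring(wildcard.Length);
        }
        if (term.EndsWith(wildcard, StringComparison.Ordinal))
        {
            term = term.Substring(0, term.Length - wildcard.Length);
        }
        return term;
    }

    // True when the search term, without wildcards, looks like an email address.
    public static bool LooksLikeEmail(string term, string wildcard)
    {
        return EmailPattern.IsMatch(StripWildcard(term, wildcard));
    }
}
```

Here LooksLikeEmail("%john.doe@example.com%", "%") is true, so the search would go to the email lookup, while LooksLikeEmail("%admin%", "%") is false and falls back to the user name lookup.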
After compiling the code and moving a dll to the /bin folder, don’t forget to change a reference in web.config file to point it to a new UserProvider class.

    <userManager defaultProvider="default" enabled="true">
      <providers>
        <clear />
        <add name="default">
          <x:attribute name="type">Sitecore.SharedSource.Security.Accounts.UserProvider, YOUR_DLL_HERE</x:attribute>
        </add>
      </providers>
    </userManager>

For those who consider this so simple that it’s not worth creating a VS project, I’ve built a ready-to-go Sitecore package with a dll and an include file inside. Just install it and give it a try ;) .
This is a Sitecore package.
Enjoy!

Sitecore Gadgets

Language filtered Multilist field

Posted by admin On October - 30 - 2011

Recently I happened to help one of our clients create a custom Multilist field that offers an item for selection only if it has at least one version in the current content language. From a content author’s point of view it makes a lot of sense. Why give them an option that is not useful?

It’s been a while since I created a custom field that has to take the content language selection into account. The main challenge for me was to retrieve that content language. I remembered that there is a property that the Content Editor (CE) sets at run-time for every field, but I could not recall its name. After several minutes of browsing my code storage of samples for all Sitecore versions, I finally found it. So, here are the properties that you need to define in your custom field if you are planning to use them afterwards:

- ItemLanguage – represents the content language selected in the CE.

- ItemID – contains the item ID the field belongs to.

- ItemVersion – contains selected item version.

- ReadOnly – indicates if field is a readonly. If it is, then it will be grayed out and not editable.

- Source – represents the source value from field definition on a data template.

- FieldID – contains the field ID.

In my example I needed only ItemLanguage property. When I put things together the code looked like this:

using System.Linq; // needed for Where/ToArray
using Sitecore.Shell.Applications.ContentEditor;
using Sitecore.Data.Items;

namespace Sitecore.Shell.Applications.ContentEditor.CustomExtensions
{
    public class LanguageFilteredMultilist : Sitecore.Shell.Applications.ContentEditor.MultilistEx
    {
        #region Overrides

        protected override Item[] GetItems(Item current)
        {
            Item[] items = base.GetItems(current);

            // Keep only the items that have at least one version in the selected content language.
            var filteredItems = items.Where(item =>
                    {
                        string lang = ItemLanguage;
                        if (!string.IsNullOrEmpty(lang))
                        {
                            var versions = Sitecore.Data.Managers.ItemManager.GetVersions(item, Sitecore.Data.Managers.LanguageManager.GetLanguage(lang, item.Database));
                            return versions.Count > 0;
                        }
                        return false;
                    }
                );

            return filteredItems.ToArray<Item>();
        }

        #endregion Overrides

        // Content Editor sets this property. It has a content language from the Content Editor.
        public string ItemLanguage
        {
            get;
            set;
        }
    }
}

Simple enough, isn’t it? Don’t forget to add your custom field to the /App_Config/FieldTypes.config file if it’s supposed to contain references to other items. In my case it should be added, since it’s a multilist. If you forget to do it, the LinkDatabase won’t update references for your field.

That’s all for now.

Sitecore Gadgets

Adding custom fields to the index

Posted by admin On October - 30 - 2011

In this post I want to show how to bring back a feature of the “old” Lucene index implementation that is missing in the new one. This article provides an example of how to customize the Lucene search configuration so that it’s possible to add custom fields to the index.

First off, let’s create a configuration that would allow us to add additional fields to the indexed data.

<index id="News" type="Sitecore.Search.Index, Sitecore.Kernel">
  <param desc="name">$(id)</param>
  <param desc="folder">_news</param>
  <Analyzer ref="search/analyzer" />
  <locations hint="list:AddCrawler">
    <examples-news type="LuceneExamples.DatabaseCrawler,LuceneExamples">
      <Database>web</Database>
      <Root>/sitecore/content</Root>
      <IndexAllFields>true</IndexAllFields>
      <include hint="list:IncludeTemplate">
        <news>{788EF1BE-B71E-4D59-9276-50519BD4F641}</news>
        <tag>{4DD970FB-2695-4E50-96F3-A766F7D6CAF1}</tag>
      </include>
      <fields hint="raw:AddCustomField">
        <field luceneName="author" storageType="no" indexType="tokenized">__updated by</field>
        <field luceneName="changed" storageType="yes" indexType="untokenized">__updated</field>
      </fields>
    </examples-news>
  </locations>
</index>

There is a new configuration section in this example. It’s <fields> section that introduces two fields “author” and “changed”. These fields will be added to a fields collection of each indexed item. Basically, there is AddCustomField method that gets called for every <field> configuration entry to identify a custom field that is going to be added to the fields collection.

Description of configuration attributes:

  • luceneName  is a field name that appears in lucene index.
  • storageType  is a storage type for lucene field. It can have the following values:
    • no
    • yes
    • compress
  • indexType  is an index type for lucene field. It can have the following values:
    • no
    • tokenized
    • untokenized
    • nonorms

Refer to the Lucene documentation to find out what each of these options means: store and index.

Now all you need to do is loop through the collection of custom fields in the overridden AddAllFields method and add them to the indexed data.

I created a custom class called CustomField that helps to manage custom field entries. Below is the code of this class; the additional methods for the extended DatabaseCrawler were already published in this blog post, so I’m not going to duplicate them here.

Here is the code for the CustomField class.

using System.Xml;
using Lucene.Net.Documents;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Xml;

namespace LuceneExamples
{
   public class CustomField
   {
      public CustomField()
      {
         FieldID = ID.Null;
         FieldName = "";
         LuceneFieldName = "";
      }

      public ID FieldID { get; private set; }

      public string FieldName { get; private set; }

      public Field.Store StorageType { get; set; }

      public Field.Index IndexType { get; set; }

      public string LuceneFieldName { get; private set; }

      // Builds a CustomField from a <field> configuration entry.
      // Returns null if the entry has no Lucene name or no Sitecore field.
      public static CustomField ParseConfigNode(XmlNode node)
      {
         CustomField field = new CustomField();
         string fieldName = XmlUtil.GetValue(node);

         // The node value is either a field ID or a field name.
         if (ID.IsID(fieldName))
         {
            field.FieldID = ID.Parse(fieldName);
         }
         else
         {
            field.FieldName = fieldName;
         }

         field.LuceneFieldName = XmlUtil.GetAttribute("luceneName", node);
         field.StorageType = GetStorageType(node);
         field.IndexType = GetIndexType(node);

         if (!IsValidField(field))
         {
            return null;
         }

         return field;
      }

      // Reads the field value from the item, by ID if one was configured, otherwise by name.
      public string GetFieldValue(Item item)
      {
         if (!ID.IsNullOrEmpty(FieldID))
         {
            return item[FieldID];
         }
         if (!string.IsNullOrEmpty(FieldName))
         {
            return item[FieldName];
         }
         return string.Empty;
      }

      private static bool IsValidField(CustomField field)
      {
         return (!string.IsNullOrEmpty(field.FieldName) || !ID.IsNullOrEmpty(field.FieldID))
                && !string.IsNullOrEmpty(field.LuceneFieldName);
      }

      // Maps the indexType attribute to a Lucene index option; defaults to TOKENIZED.
      private static Field.Index GetIndexType(XmlNode node)
      {
         string indexType = XmlUtil.GetAttribute("indexType", node);
         if (!string.IsNullOrEmpty(indexType))
         {
            switch (indexType.ToLowerInvariant())
            {
               case "no":
                  return Field.Index.NO;
               case "tokenized":
                  return Field.Index.TOKENIZED;
               case "untokenized":
                  return Field.Index.UN_TOKENIZED;
               case "nonorms":
                  return Field.Index.NO_NORMS;
            }
         }
         return Field.Index.TOKENIZED;
      }

      // Maps the storageType attribute to a Lucene storage option; defaults to NO.
      private static Field.Store GetStorageType(XmlNode node)
      {
         string storage = XmlUtil.GetAttribute("storageType", node);
         if (!string.IsNullOrEmpty(storage))
         {
            switch (storage.ToLowerInvariant())
            {
               case "no":
                  return Field.Store.NO;
               case "yes":
                  return Field.Store.YES;
               case "compress":
                  return Field.Store.COMPRESS;
            }
         }
         return Field.Store.NO;
      }
   }
}
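Since ParseConfigNode accepts either a field name or a field ID as the node value, a configuration entry could also reference the Sitecore field by ID. The fragment below is only a sketch: the GUID is a placeholder rather than a real field ID, and the "summary" Lucene name is made up for illustration. It also leaves out indexType to show the fallback to tokenized indexing.

```xml
<fields hint="raw:AddCustomField">
  <!-- Placeholder GUID: replace it with the ID of a real field in your solution -->
  <field luceneName="summary" storageType="compress">{11111111-1111-1111-1111-111111111111}</field>
</fields>
```

With the code above, an entry without luceneName, or without a field name or ID, makes ParseConfigNode return null.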

And here is the code for the additional DatabaseCrawler methods. They need using directives for System, System.Collections.Generic, System.Xml, Lucene.Net.Documents and Sitecore.Data.Items.

// The collection of parsed custom fields. Its declaration is not shown in the
// original post, so this line is an assumption about how it is defined.
private readonly List<CustomField> _customFields = new List<CustomField>();

/// <summary>
/// Loops through the collection of custom fields and adds them to the fields collection of each indexed item.
/// </summary>
/// <param name="document">Lucene document</param>
/// <param name="item">Sitecore data item</param>
private void AddCustomFields(Document document, Item item)
{
   foreach (CustomField field in _customFields)
   {
      document.Add(CreateField(field.LuceneFieldName, field.GetFieldValue(item), field.StorageType, field.IndexType, Boost));
   }
}

/// <summary>
/// Creates a Lucene field.
/// </summary>
/// <param name="fieldKey">Field name</param>
/// <param name="fieldValue">Field value</param>
/// <param name="storeType">Storage option</param>
/// <param name="indexType">Index type</param>
/// <param name="boost">Boosting parameter</param>
/// <returns>The Lucene field with the boost applied</returns>
private Fieldable CreateField(string fieldKey, string fieldValue, Field.Store storeType, Field.Index indexType, float boost)
{
   Field field = new Field(fieldKey, fieldValue, storeType, indexType);
   field.SetBoost(boost);
   return field;
}

/// <summary>
/// Parses a configuration entry for a custom field and adds it to the collection of custom fields.
/// </summary>
/// <param name="node">Configuration entry</param>
public void AddCustomField(XmlNode node)
{
   CustomField field = CustomField.ParseConfigNode(node);
   if (field == null)
   {
      throw new InvalidOperationException("Could not parse custom field entry: " + node.OuterXml);
   }
   _customFields.Add(field);
}

The last thing left to do is to call the AddCustomFields method from AddAllFields.

protected override void AddAllFields(Document document, Item item, bool versionSpecific)
{
    // ………………………………………
    AddCustomFields(document, item);
}

You can take it even further and add support for a field interpreter for each field configuration entry.

Hope you’ll find it useful.

Sitecore Gadgets

Restrict access to advanced media upload options

Posted by admin On October - 30 - 2011
Sometimes you want to give editors access to Advanced Media Upload. When you do, they get access to all the advanced options, including ones you might not want to share with them.
This package allows you to configure access to advanced media upload options by tightening security on Sitecore items.
After installing the package, you can configure access to the advanced options on the preset items under the /sitecore/System/Settings/Media/Presets path.

Download the package.

Sitecore Gadgets

How often have you stressed over the question “What’s the most efficient way to retrieve Sitecore items?”
Here are the possible ways to do it:
  • Sitecore API
  • Sitecore Query
  • SQL custom stored procedures
  • Lucene index
Let’s take a brief look at each option.
Sitecore API – the most popular way to get an item. All you need is an item ID, and a simple call to the GetItem() method of a database instance gets you the latest version of the item in the current language. If you want the item in a specific language, not a problem: GetItem(id, language). Want to specify a version? Not a problem either: GetItem(id, language, version). There are many ways to get an item through the Sitecore API. My point is that it’s the easiest way to do it.
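The three GetItem calls described above can be sketched as follows. This is a minimal illustration and not code from a real solution: the item ID is a placeholder, and the "web" database, the "da" language and version 2 are arbitrary choices.

```csharp
using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Globalization;

public static class GetItemExamples
{
    // Placeholder ID used only for illustration; use the ID of a real item.
    private static readonly ID SampleItemId = ID.Parse("{11111111-1111-1111-1111-111111111111}");

    public static void Run()
    {
        Database database = Factory.GetDatabase("web");

        // Latest version in the current language.
        Item latest = database.GetItem(SampleItemId);

        // Latest version in a specific language.
        Language danish = Language.Parse("da");
        Item inDanish = database.GetItem(SampleItemId, danish);

        // A specific version in a specific language.
        Item secondVersion = database.GetItem(SampleItemId, danish, Sitecore.Data.Version.Parse("2"));

        // GetItem can return null, for example when the item does not exist,
        // so check the result before using it.
        if (latest != null)
        {
            Sitecore.Diagnostics.Log.Info("Found item " + latest.Paths.FullPath, typeof(GetItemExamples));
        }
    }
}
```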
Sitecore Query – an easy way to get a bunch of items filtered by some criteria. Build a query string, which looks a lot like an XPath query, and run it on a database instance. Read more about Sitecore Query.
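As a rough sketch, a Sitecore Query that filters by template and by a field value could look like the code below. The template ID is the news template from the index configuration earlier on this page, while the Category field and the "web" database are assumptions made for the example.

```csharp
using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.Items;

public static class SitecoreQueryExample
{
    // Returns all items under /sitecore/content that are based on the news template
    // and whose (hypothetical) Category field equals the given value.
    public static Item[] FindNewsInCategory(string category)
    {
        Database database = Factory.GetDatabase("web");

        // The category value is inserted as-is; real code should validate or escape it.
        string query = string.Format(
            "/sitecore/content//*[@@templateid='{0}' and @Category='{1}']",
            "{788EF1BE-B71E-4D59-9276-50519BD4F641}",
            category);

        Item[] items = database.SelectItems(query);

        // Guard against a null result so callers can always enumerate the array.
        return items ?? new Item[0];
    }
}
```

The descendant axis (//) makes the query visit everything below /sitecore/content, so it gets more expensive as the content tree grows.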
SQL custom stored procedures – we all know how to create a SQL stored procedure. The connection strings already exist in a Sitecore solution. All you need is a SQL management studio to create a stored procedure for the database. The question is: why would you do it if the API already exists? It makes sense only if you run a complex query with lots of filtering conditions, so that SQL returns only the items you’re searching for.
Lucene index – why would I use a separate data store or index to get an item from the Sitecore database? It does not make sense. Agreed, for a single item it does not, but for a collection of items it makes perfect sense. Why? Because performance is much better when you are working with a large amount of data. Moreover, it’s a search index, which means that the data is organized to be searched.
Let’s talk about the performance characteristics of each of these options and compare them.
Sitecore API – uses the Sitecore data provider to pull information from the database. If the item we’re looking for is cached, the call to the SQL database is avoided and the item is retrieved from one of the Sitecore caches. Using Sitecore caches gives you a huge performance benefit. That’s why we always worry about cache configuration in a Sitecore solution. What if you need to get a number of items by some criteria, let’s say by template ID and by belonging to some category (from here on I’m going to use this condition for all item retrieval options)? You run a foreach or for loop, or any other logic that goes through the content tree and checks the TemplateID property and the category field of every single item. If an item resides in the cache, it will be quick enough, but the code still has to request every single item from the content tree.
Sitecore Query – the picture with Sitecore Query is very similar to the Sitecore API, but it gets slower as the number of items grows. Kim has a good article that explains Sitecore Query performance.
NOTE: I’m not talking about the fast Sitecore Query introduced in Sitecore 6.
SQL stored procedures – this seems to be a good approach if you’re searching for items throughout the content tree, because a single stored procedure executes the query by the specified criteria. However, you will have to consider caching the retrieved items if you have a very deep content tree and items of a specific type are not gathered under one branch. My point is that it can cost you more than the benefits you get. I would go with this approach only if I could not do it with Lucene.
Lucene index – again, it’s a search index, so it fits search scenarios perfectly. When you search for items by specified criteria, it neither requests item instances nor runs queries over database tables. It goes through the fields that you have in your index and compares their values to your search conditions. The only catch is that if the search query filters data by custom fields, those fields must be added to your index. Otherwise you will get zero results. It’s very easy to do, though. I will describe it in the next part of the article.
Conclusion: Let’s answer the questions in the title of this part.
Why use a Lucene index?
The approach is one of the most efficient ones (maybe even the best). Search is quick enough that in most cases you don’t even need to implement custom caching for the retrieved data.
When to use a Lucene index?
When you need to run search queries with specific criteria on a large amount of data.
Don’t forget the rule – the more complex a search query gets, the slower it works ;) .
In the next part we will look into the Lucene search index introduced in Sitecore 6.
Information about the “old” Lucene index (introduced in Sitecore 5) can be found here.

Sitecore Gadgets

Here is the second part of the Lucene search index overview for Sitecore 6. In this part we’ll take a look at the configuration settings and talk about how it works.
Sitecore 5 has a Lucene engine as well. Let’s step one Sitecore version back and see how Lucene works there. In the web.config file there is a section, /sitecore/indexes, that contains the Lucene index configuration. Once an index is configured, it should be added to the /sitecore/databases/database/indexes section.
The web database does not have a search index by default. Even if you add one to the aforementioned section, it won’t work. Why? Because the index relies on HistoryEngine functionality, and by default the web database does not have it. It’s easy to add, though. Just add the HistoryEngine configuration section to the database.
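For orientation, this is roughly what the HistoryEngine section looks like once it has been copied into the web database definition. The element names and values are reproduced from memory of a default Sitecore 6 web.config, so treat the fragment as an assumption and copy the actual section from the master database in your own web.config.

```xml
<database id="web" singleInstance="true" type="Sitecore.Data.Database, Sitecore.Kernel">
  <!-- existing web database settings stay unchanged -->
  <Engines.HistoryEngine.Storage>
    <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
      <param connectionStringName="$(id)" />
      <EntryLifeTime>30.00:00:00</EntryLifeTime>
    </obj>
  </Engines.HistoryEngine.Storage>
  <Engines.HistoryEngine.SaveDotNetCallStack>false</Engines.HistoryEngine.SaveDotNetCallStack>
</database>
```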
You can find more configuration details in this article on SDN.
This index has the same configuration in Sitecore 6.
In addition, Sitecore 6 has new Lucene index functionality, which is more reliable and has a Sitecore-level API on top of the Lucene one. In some cases you will still have to use the Lucene API, for instance to create range queries.
The configuration settings for the new search index are located under the /sitecore/search section.
The analyzer section defines the Lucene analyzer that is used to analyze and index content.
The categories section is used to categorize search results. It’s used by the content tree search introduced in Sitecore 6. The search box is located right above the content tree in the Content Editor.
The configuration section holds the index definitions with their settings. An index definition should be created under the /sitecore/search/configuration/indexes node.
The first two parameters describe the index name and the name of the folder where the index should be stored:
<param desc="name">$(id)</param>
<param desc="folder">my_index_folderName</param>
The next setting is the analyzer that should be used for the index:

<Analyzer ref="search/analyzer" />

The Lucene StandardAnalyzer covers most scenarios, but it’s possible to use any other analyzer if needed.
The following setting defines the locations for the index:

<locations hint="list:AddCrawler">

It’s possible to have multiple locations for one index. Moreover, it’s even possible to have content from different databases in the same index. Every child of the locations node has its own configuration for a particular part of the content. The name of a location node is not predefined, so you’re welcome to name it the way you want. For example:
<locations hint="list:AddCrawler">
  <sdn-site type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
  </sdn-site>
</locations>
Every location has a database section. It defines the database to be indexed for the location.
Then comes the root section. The database crawler will index the content beneath this path.
The next sibling node is the include section. Here you can list the templates whose items should be included in the index or excluded from it.
Example:
<include hint="list:IncludeTemplate">
  <sampleItem>{76036F5E-CBCE-46D1-AF0A-4143F9B557AA}</sampleItem>
</include>
<include hint="list:ExcludeTemplate">
  <layout>{3A45A723-64EE-4919-9D41-02FD40FD1466}</layout>
</include>
It does not make sense to use both of these settings for one location. Use only one of them.
The next location setting is the tags section. Here you can tag indexed content and use the tags during search.
The last setting is the boost section. Here you can boost the indexed content relative to content that belongs to other locations.
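Putting the pieces above together, a complete index definition might look like the sketch below. The index id "my_index", the "sdn" tag and the boost value are made up for illustration, and the Database, Root, Tags and Boost element names are assumed to match the default index definitions that ship in web.config.

```xml
<index id="my_index" type="Sitecore.Search.Index, Sitecore.Kernel">
  <param desc="name">$(id)</param>
  <param desc="folder">my_index_folderName</param>
  <Analyzer ref="search/analyzer" />
  <locations hint="list:AddCrawler">
    <sdn-site type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
      <Database>web</Database>
      <Root>/sitecore/content</Root>
      <include hint="list:IncludeTemplate">
        <sampleItem>{76036F5E-CBCE-46D1-AF0A-4143F9B557AA}</sampleItem>
      </include>
      <Tags>sdn</Tags>
      <Boost>2.0</Boost>
    </sdn-site>
  </locations>
</index>
```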
Last but not least, this search index uses the same HistoryEngine mechanism as the old one. So don’t forget to copy the HistoryEngine configuration section from the master database to any database that you want to add search index facilities to.
How does it all work?
When an action is performed on an item, the database crawler updates the item’s entries in the search index, so that the information in the index stays in sync with the database. How does that happen if “item:saved”, “item:deleted”, “item:renamed”, “item:copied” and “item:moved” do not have event handlers that trigger a search index update? Thanks to the HistoryEngine, which has been mentioned several times already.
It is the HistoryEngine that tracks any changes made to an item and fires the appropriate event handler to process them.
The IndexingManager is responsible for all operations on the search index. It subscribes to the AddEntry event of the HistoryEngine, and as soon as an entry is added to the History table, it triggers a job that updates the search index(es).
In the web.config file there are a few settings that belong to the indexing functionality.
  • Indexing.UpdateInterval – sets the interval at which the IndexingManager checks its queue for pending actions. The default value is 5 minutes.
    What does it mean? If for whatever reason a pending job was not executed, the IndexingManager will re-run it if it still finds it in the pending state after 5 minutes.
  • Indexing.UpdateJobThrottle – sets the minimum time to wait between individual index update jobs. The default value is 1 second.
    When some operation is performed on an item, you can see this entry in the Sitecore log file:
    INFO Starting update of index for the database 'databaseName' ( pending).
    This setting sets the interval between jobs like this one, so that they do not consume all the CPU time when you’re making massive changes to items.
  • Indexing.ServerSpecificProperties – indicates whether server-specific keys should be used for property values (such as ‘last updated’). It’s off by default.
    This setting is designed for content delivery environments in web farms. As the web database is shared, there could be a situation where one server has updated its search indexes and changed the History table in the database. The other servers won’t update their indexes because the HistoryEngine wouldn’t indicate that there was a change. This setting prevents situations like this.
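Expressed as web.config entries with the defaults described above, the three settings would look roughly like this. The time values assume the usual hh:mm:ss format, so check them against your own web.config.

```xml
<settings>
  <!-- How often the IndexingManager checks its queue for pending actions (5 minutes) -->
  <setting name="Indexing.UpdateInterval" value="00:05:00" />
  <!-- Minimum time to wait between individual index update jobs (1 second) -->
  <setting name="Indexing.UpdateJobThrottle" value="00:00:01" />
  <!-- Off by default; meant for content delivery servers that share a web database -->
  <setting name="Indexing.ServerSpecificProperties" value="false" />
</settings>
```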
Well… that’s it for now. In the next part we will take a look at the Sitecore Lucene API and create some search queries with it.
Enjoy!

Sitecore Gadgets

Upcoming Sitecore Users’ Virtual Group

Posted by admin On October - 30 - 2011

Exciting news! I was invited to present at the upcoming Sitecore Users’ Virtual Group, which will be held next week on Wednesday (Oct 19th) at Noon Pacific, 3:00 PM Eastern, 8:00 PM UK. Thanks guys!
The topic is fairly random: “Latest cool prototypes from Sitecore US lab”. Since I am not in product development but rather an “on-the-field” kind of Sitecorian, don’t expect anything about the new MVC or Sitecore 7. Do expect demos of the following components, produced for the customer needs I witness during my consulting engagements with partners and customers:
- Updates on the following two hottest modules. Both contain some major enhancements based on your feedback (thank you)!

Sitecore.Search extension aka AdvancedDatabaseCrawler
Partial Language Fallback

- Other prototypes and super experimental stuff like ContentSilo, NameValueEx field, WorkflowBundle and more (if we have time)!
If this sounds interesting, see this event page for more details and click here to register. Space is limited.
I hope it will be informative and fun! See you soon.

Consulting and Supporting Sitecore Developer Community