Friday, July 30, 2010

Sitecore Fetch Squad

Automated crawler fetching websites and blogs from Sitecore content

(Version 1.1.1 - October 19th, 2008)

In this series, I’m going to compare the functionality of SharePoint and Sitecore from a developer perspective in each area and both candidates can collect "points" from 0 (has nothing to contribute) to 10 (great integration and functionality).

This series will cover the following areas:
- Introduction (Part 1)
- Installation and General Architecture (Part 2)
- Configuration (Part 3)
- Customization and Designer Experience (Part 4)
- Integration with Developer Tools (Part 5)
- Feature Deployment (Part 6)
- API and Data Model (Part 7)
- Integration with other systems (Part 8)
- Conclusion (Part 9)

 

Customization and Designer Experience

The term "customization" is used in SharePoint to describe the work of creating solutions without actually writing code or using the API. The work can be done using either features of the SharePoint Browser Interface or the Microsoft Office SharePoint Designer application. The result of this work is stored inside the content database of a SharePoint site collection; it is not deployed in the form of compiled DLLs and additional resources.
In Sitecore, a similar kind of work is performed by a Sitecore XML Developer (sometimes referred to as Sitecore HTML Developer or SCD1, in contrast to the Sitecore API Developer, sometimes also called Sitecore .NET Developer or SCD2). Sitecore XML Developers create conceptual artifacts (Layouts, Templates, Renderings, Items) using the Sitecore Client (a superior Desktop-like web-based UI) and external XML Development Tools of their choice (e.g. Notepad, UltraEdit, Dreamweaver, Altova, VisualStudio, Expression Web, etc.).
This article refers to this content-centric development process in both products as "customization" - real "development" using the API by writing code is covered in a later part of this series.

SharePoint provides a lot of customization functionality out of the box. As stated earlier, both WSS 3.0 and MOSS 2007 come with a couple of ready-to-run features for end users. Most of these features are based on Lists, one of the main building blocks of SharePoint. Nearly everything related to content in SharePoint comes in the form of some kind of list or as an extended version of this concept, be it calendars, document libraries, task lists or even web pages. Users can create lists with the SharePoint Browser Interface, define fields, attach basic workflows, enable versioning and use other options. Developers can go a bit further and create so-called "site columns" and "content types", which are basically global definitions of fields and (partial) list types. To display the contents of lists, there are several options. I will not cover all of them here, but I’m going to mention some of the most common approaches.
But before I do that, there is another piece of functionality I want to mention briefly. It’s called the Web Part Framework and should sound familiar to every ASP.NET 2.0 developer. In fact, Web Parts are essentially the same in SharePoint as they are in ASP.NET (actually, the Web Parts concept was introduced with SharePoint 2.0 and later adopted by ASP.NET 2.0). The main difference is that Web Parts in SharePoint are placed in Web Part Zones on Site Pages, which are located in the SharePoint content database, not in the file system.
The first and simplest option to store and display structured data in SharePoint is to create a list, add some content and create additional views using the SharePoint Browser Interface. Views can be created using pre-defined templates and by specifying filters and other options. They can afterwards be placed on customized Site Pages and page layouts using the Web Parts Framework. SharePoint provides a couple of customizable Web Parts to display data from lists. These include the Content Query Web Part, the Data View Web Part and the XML Web Part.
A second and more advanced option is to use the customization settings of the Content Query Web Part (CQWP). In this approach, developers use the existing XSLT capabilities of the CQWP to render the content of a list based on an XSLT stylesheet. This can be done by creating a customized XSLT stylesheet and uploading it to the Style Library of a SharePoint site, which in turn makes it available in the CQWP settings dialog.
In addition to this structured approach (data is stored separately in lists, presentation is configured using web parts), the Web Part Framework does provide some Web Parts for creating unstructured, discrete content, for example to place styled HTML text and images on a web page. From my perspective as Software and Information Architect, I have to strongly advise everyone against adopting this approach. It not only intermixes content and presentation, but it also creates volatile content, which in turn will lead to an unmanageable mud of information islands, detached from the overall information architecture.
Unfortunately, these content Web Parts are not the only way to create "sloppy" pages in SharePoint. Using SharePoint Designer, you can also create simple Content Pages (at least they are based on existing Master Pages, which define the overall layout of several pages in a Site) that can contain Text, Images and other HTML elements. Sounds familiar? Of course, it’s only "unmanaged" HTML, hopefully at least based on central styles and encapsulated by a MasterPage. This approach violates the same principle as the way mentioned before and I strongly discourage everyone from using it.
A much better way is to use the Web Content Management capabilities of SharePoint through the Publishing Site functionality of MOSS 2007. To utilize its features, a developer has to create site columns and content types, which define a basic content structure for the information that is going to be processed within a SharePoint Site. Having done that, he can create list items based on those content types and display them using SharePoint Page Layouts. Page Layouts are basically ASPX pages with controls. "Pages" (which are essentially list items based on Content Types in this case) use them to display their field values by "filling" the controls in those Page Layouts. SharePoint automatically combines them with one of a few pre-defined MasterPages, managed as part of the Publishing Site to ensure a common look and feel across pages.
This is by far the most structured approach to creating web pages in SharePoint, but unfortunately there are a couple of drawbacks.
Every list in SharePoint can also be accessed using RSS and other defined protocols, for example using Microsoft Access or web services.
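The Publishing Site model described above (content types define the fields, list items carry the values, Page Layouts display them) can be illustrated with a small sketch. This is plain Python, not SharePoint API code; the names ContentType, PageItem and render_page and the {Column} placeholder syntax are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ContentType:
    """Hypothetical stand-in for a content type: a named set of site columns."""
    name: str
    site_columns: list[str]


@dataclass
class PageItem:
    """A 'page' is just a list item whose fields follow a content type."""
    content_type: ContentType
    values: dict[str, str]


def render_page(layout: str, page: PageItem) -> str:
    """Fill each {Column} placeholder in the layout with the item's field value."""
    html = layout
    for column in page.content_type.site_columns:
        html = html.replace("{" + column + "}", page.values.get(column, ""))
    return html


article = ContentType("Article", ["Title", "Body"])
page = PageItem(article, {"Title": "Hello", "Body": "First post"})
layout = "<h1>{Title}</h1><div>{Body}</div>"
print(render_page(layout, page))
```

Note that the layout can only see the fields of the one item it is bound to, which mirrors the single-list-item limitation of this approach.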

SharePoint comes with a couple of out-of-the-box Features. Most of the configuration necessary to use those Features is performed automatically when a user creates a new Site based on an appropriate Site Definition (not to be confused with a Site Template… *urks*). The Site Definition activates certain Features right after it has been provisioned in the SharePoint Content Database. These Features not only create additional functionality, they also change the behavior of the whole SharePoint Site, add pre-defined content and structures, extend menus and perform several other tasks.
This behavior is essentially a good idea, except for the fact that in some cases there are no "clean" Site Definitions available for certain kinds of Sites. That means you often get some sample content, pre-defined lists and a couple of unnecessary customizations with every Site.
Unfortunately, one bad example of this situation is the "Publishing Portal" Site Definition (often just referred to as "Publishing Site"). It automatically activates all WCM (Web Content Management) Features in a SharePoint Site, but it also creates a couple of useless and even annoying things inside your newly created project workspace. Well, there are two options for dealing with that. The first option is to delete all that stuff by hand, hoping that you do not destroy any of the necessary functionality of a Publishing Portal (e.g. the TasksList and the WorkflowHistoryList, which are necessary for certain automatic workflows activated by the Publishing Site). The other option is to create a minimal Publishing Site Definition yourself - and that is not exactly an easy task.
This problem is so common that some SharePoint authors devote complete chapters to the creation of Site Definitions, just to have a "clean" starting point for their sample projects (e.g. "Professional SharePoint 2007 Web Content Management Development" by Andrew Connell, Chapter 5: "Minimal Publishing Site Definition", WROX Press, 2008).
Another main drawback in customizing SharePoint is the integration with existing SCCM infrastructure and release management procedures - basically because it is not possible. Using either the SharePoint Browser Interface or the SharePoint Designer application, changes are generally performed directly in the SharePoint content database, which makes it difficult (if not impossible) to use it with source code management tools and in staged testing environments. Of course it is possible to create deployable SharePoint Features as part of "Solution Packages" (more about that later) which can contain all those things (lists, list templates, site columns, content types, page layouts, XSLT Stylesheets, etc.) in VisualStudio, but this is really a pretty heavy development task just to create some structured content that can be managed using Subversion or TFS. Nevertheless, that approach makes it possible to use source code management tools, although it is much more complicated and it lacks the nice usability that is provided by the SharePoint Browser Interface and the SharePoint Designer.

Advantages

  1. SharePoint delivers a lot of relatively easy-to-use out-of-the-box features. Some of them can be customized using the SharePoint Browser Interface, others using the SharePoint Designer.
  2. Using pre-defined Site Definitions, SharePoint users can easily create new SharePoint Sites based on a list of pre-defined Features and containing several customizations.
  3. Structured pages in SharePoint can be customized using either the Web Part Framework or by defining Page Layouts, Content Types and creating Site Pages based on list items.
  4. Some Web Parts provide additional styling and customization options using XSLT and settings.
  5. Developers can use most of their knowledge about MasterPages and WebForms to create a common look and feel, although they need to be aware of certain limitations.

Drawbacks

  1. From a developer’s perspective, customizing SharePoint is not a consistent experience. There are several levels of customization with certain limitations, as well as different approaches and tools.
  2. Using a combination of the SharePoint Browser Interface, SharePoint Designer and some external tools, customizing SharePoint basically means re-arranging pre-packed snippets of functionality. Sometimes these pre-defined features even get in the way of the developer, for example if he wants to start with a fresh and clean project.
  3. The development task itself is a mix of using features of a web-based browser interface with some editing controls (SharePoint Browser Interface), a desktop application (SharePoint Designer), writing some XSLT (maybe using Expression Web), defining some CSS classes and creating HTML pages with placeholders. There is little guidance about where to start and no continuous workflow for creating content structures and presentation elements.
  4. SharePoint has some features that support the creation of unstructured content, without separation of content and presentation, bypassing the overall information architecture and decreasing manageability and maintainability. Unfortunately, unless these features are disabled (which is not that easy), users tend to use them a lot, primarily because they are easier to find and to understand than most structured approaches in SharePoint.
  5. From the perspective of this article (which is focusing on "customization" and not on "development"), SharePoint is limited to creating page-driven content, which means that most data cannot easily be re-used in the same Site or across Sites. This often leads to inconsistencies caused by duplicate information. The introduction of Site Columns and Content Types is an important step forward from SharePoint 2.0 and demonstrates Microsoft’s efforts to integrate functionality of MCMS 2002 into MOSS 2007. But the fact that site pages can only (or at least without additional development effort) be created based on the fields of a single list item is still a huge drawback. Although lists can have relationships to other lists, using those relationships during customization (again, we’re not talking about development here) is not trivial.
  6. SharePoint Designer and the SharePoint Browser Interface always perform their changes directly in the SharePoint Content Database. Besides some other drawbacks, there is currently no support for SCCM tools. At the moment it is also not possible to use SharePoint Designer with files in the file system without creating some strange side effects that can even kill your complete SharePoint installation (hint: you should never open and save an application-level MasterPage in the "12 hive" with SharePoint Designer).

In contrast to SharePoint, Sitecore does not aim to provide a handful of pre-defined "content snippets" and a relatively fixed site design for immediate user activities right after the installation has finished. From an architectural point of view, there are several good reasons for that.
First of all, building an information architecture and a decent taxonomy is not something you should do lightly. Even if you have a pretty small site, you should definitely give some thought to what you want to present to your users and how you want to do that - at least if you want to create something that is built to last.
Secondly, how can anyone else know how your users and customers are going to work? Of course there are several "common features" on a website or in an intranet project, but the question is: Do you want pre-defined functionality, limited to a specific solution based on a pre-defined perspective of a common problem -or- do you want a flexible eco-system with everything you need to rapidly build everything you want?
Thirdly, I don’t know any developer who really likes to re-arrange stuff someone outside of his own team created. Human brains do not all work the same way, and following the line of thought of someone you don’t even know is not only complicated, error-prone and inefficient - it is also not much fun.
What Sitecore gives you as a developer or as information architect is intentionally not a bunch of tools and features to create a bunch of inflexible web pages based on "pre-packed" pieces of code. Instead it provides you with a full-featured framework for building a web based structure of your information, your processes and your organizational reality, enriched by an API that makes it a fully-fledged web application framework.
Having said that, there are some folks out there who might say: But that’s exactly what I want, I need some things right out of the box. Does that mean Sitecore is not for you? Absolutely not… In fact, you CAN use existing functionality, modules, information structures and even content, built by experienced Sitecore partners and Sitecore itself. Go to SDN, download a predefined site package (packages are deployment units in Sitecore, like an MSI package for Windows) and install it into your fresh Sitecore system: Tadaa. If you want to, you can get a complete website with design, structure, additional functionality and sample content. And it’s your choice which demo site, starter kit, intranet package or whatever else to use as the basis for your site.
The question is: As you become a more and more experienced Sitecore developer, do you really want to start your projects by deleting and re-structuring demo content? I wouldn’t…
What you can do instead is to build your own "starter kits" and "toolboxes" right in your own Sitecore and use the package mechanism to move it to your next project. And that is really powerful. So powerful that we will talk about this in-depth in a later part of the series.

Sitecore’s architecture is one of the best-designed software architectures in terms of Content Management Systems I’ve ever seen. The main concept of the architecture is a universal piece of structured information, named Item. This concept is simple, but very powerful. In fact, everything in Sitecore is based on Items - even Sitecore itself. Sitecore Items live inside a hierarchical structure named the Content Tree. In contrast to SharePoint, where you can take a look at your Site Collection in a hierarchical view called the "SiteManager" (you can find it at different locations under the name "Site Content and Structure"), Sitecore’s Content Tree is not just an aggregated view of database content, re-arranged to get a comprehensive view; it is really the way the whole system is structured. Browsing Sitecore’s Content Tree using the Content Editor (Sitecore’s "Windows Explorer") as administrator gives you the ability to see and access every piece of information within the application.
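To make the Item and Content Tree idea a bit more tangible, here is a minimal sketch in plain Python. It is not Sitecore API code: the Item class, its add and descendants methods and the sample item names are assumptions made for illustration. The only point is that one node type, arranged in one tree, is enough to hold both content and system data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical model of an item: a name, some fields and child items."""
    name: str
    fields: dict[str, str] = field(default_factory=dict)
    children: list[Item] = field(default_factory=list)
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Item | None = field(default=None, repr=False, compare=False)

    def add(self, child: Item) -> Item:
        """Attach a child item and return it, so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child

    @property
    def path(self) -> str:
        """Build the path by walking up to the root of the tree."""
        parts = []
        node: Item | None = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def descendants(self):
        """Yield every item below this one, depth first."""
        for child in self.children:
            yield child
            yield from child.descendants()


root = Item("sitecore")
content = root.add(Item("content"))
home = content.add(Item("Home", {"Title": "Welcome"}))
news = home.add(Item("News"))
templates = root.add(Item("templates"))

print(news.path)
print([item.name for item in root.descendants()])
```

Content ("Home", "News") and system data ("templates") sit in the same tree and are reached the same way, which is the property the paragraph above describes.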
Sitecore’s XML development features (which we call "customization" in this article series) are based on a so-called "content-driven" approach to content management. "Content-driven" means you are going to structure your information first and then decide how and where to display it to your users.
To define the structure of the data used in your site, a developer defines a couple of blueprints for the data, similar to, but more powerful than, Site Columns and Content Types in SharePoint. To be honest, these blueprints have a very misleading name. In Sitecore terms they are called Templates. A Template is the definition of a certain type of structured data within Sitecore, like a class in OOP or a table definition in SQL. It contains definitions for various fields based on certain types, similar to properties in object-oriented languages like C# or columns in a SQL Server table. This concept has something in common with content types, lists and list items in SharePoint, although there are several differences. I will not go into detail here, but here are some important distinctions:

  1. Sitecore Templates can be chained and they allow "multiple inheritance".
  2. Sitecore Templates allow additional options per field.
  3. Sitecore Items are hierarchically structured and can have complex relationships to other Items or complete structures of Items.
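The comparison of Templates with classes, and the "multiple inheritance" mentioned in the list above, can be sketched roughly as follows. This is a Python model, not Sitecore code; the Template class, the all_fields method and the sample template and field names are hypothetical, and the rule that a template's own fields override inherited ones is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical model of a template: typed fields plus base templates."""
    name: str
    own_fields: dict[str, str]  # field name -> field type
    base_templates: list["Template"] = field(default_factory=list)

    def all_fields(self) -> dict[str, str]:
        """Collect fields from all base templates, then add the own fields."""
        merged: dict[str, str] = {}
        for base in self.base_templates:
            merged.update(base.all_fields())
        merged.update(self.own_fields)  # assumption: own fields win on clashes
        return merged


page = Template("Page", {"Title": "Single-Line Text", "Body": "Rich Text"})
seo = Template("SeoData", {"MetaDescription": "Multi-Line Text"})
# "Article" inherits from two unrelated templates at once.
article = Template("Article", {"Author": "Single-Line Text"}, [page, seo])

print(sorted(article.all_fields()))
```

An item created from "Article" would carry all four fields, although "Article" itself defines only one of them.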

After you have designed the structure of your data and created some content, the next thing to do is to present that content to your users. In Sitecore, this procedure is called "rendering". There are several options for rendering content in Sitecore, as it comes out of the box with a bunch of technologies for that purpose. The one we are going to talk about is named "XSL Rendering" and it is by far the most common way. The rendering framework of Sitecore can be extended, and at netzkern we are currently discussing a couple of ideas for how to do that (e.g. providing support for Ruby on Rails-style templates, for XQuery, HAML and MVC Views).
I’m not going to explain exactly how to develop with XSLT in Sitecore, because that topic is covered in-depth in several tutorials on SDN and (of course) in every Sitecore training. What I’m going to do is to give you an impression of how customization works with XSLT in Sitecore (remember, it is not the only option for customization) and why it is so powerful.
In an earlier article I described the way Sitecore handles requests, and we are going to extend that explanation a bit further here. After Sitecore has dispatched the request to ASP.NET, the .NET framework opens the requested ASPX and (if necessary) compiles it together with its code-behind file (actually, in Web Application Projects, there is also a second partial code-behind file that is merged with the ASPX markup file). The result is an HTTP Handler component, which receives and handles the request by executing a combination of framework code and custom code hooked into the processing pipeline using events (PageLoad, PreRender, OnClick, and so on).
This ASPX file usually defines the overall layout of a Sitecore page, and of course you can use MasterPages and all other ASP.NET WebForms features. In Sitecore terms, this file is called a "Layout" and it is also managed by Sitecore. The Layout can contain a couple of Sitecore controls. Again, in contrast to SharePoint, which handles controls on customized site pages a little differently than pure ASP.NET, Sitecore controls are just powerful .NET components, completely in harmony with the ASP.NET WebForms framework.
Depending on the architecture of your site, you can choose whether a Sitecore Rendering is placed directly on the Layout or if you want to use the Sitecore Presentation framework to load it dynamically into a placeholder. I will not go further into detail here, but please don’t hesitate to consult your local Sitecore partner, the forum at SDN or to drop me a line.
What I want to mention is a feature this article will refer to as "layout composition". In Sitecore, you can dynamically combine different Layouts, based on the page or type of pages being delivered. You can also create independent parts of Layouts, called Sublayouts. Sublayouts are basically ASCX controls, derived from a special base class. The rendering engine of Sitecore uses placeholders to dynamically arrange Sublayouts, Renderings and Controls on a Sitecore Layout and merges them together, even with MasterPages. That way, you can dynamically assemble your pages, which dramatically reduces the number of Layouts necessary and thereby the amount of work needed to create and maintain a consistent look and feel. For example, to build 5 slightly different designs, accessible not only using a Desktop browser but also a mobile device, you need to build 10 Page Layouts in SharePoint. You achieve the same result in Sitecore with two Layouts and two Sublayouts.
This feature alone increases your productivity significantly and it is one of the main reasons, why Sitecore was appointed "cool vendor of the year" by Gartner.
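The layout composition described above can be illustrated with a small Python sketch. It does not use the Sitecore presentation API; the LAYOUTS and SUBLAYOUTS dictionaries, the [main] placeholder syntax and the compose function are invented for this example, and the numbers it prints belong to this toy setup, not to the SharePoint comparison made earlier.

```python
from itertools import permutations

# Two device-specific outer layouts, each exposing one placeholder.
LAYOUTS = {
    "desktop": "<html><body class='desktop'>[main]</body></html>",
    "mobile": "<html><body class='mobile'>[main]</body></html>",
}

# Two reusable page fragments that can be dropped into a placeholder.
SUBLAYOUTS = {
    "article": "<article>{Title}</article>",
    "teaser": "<aside>{Title}</aside>",
}


def compose(layout_name, sublayout_names, fields):
    """Insert the chosen sublayouts into the layout's placeholder, then bind fields."""
    inner = "".join(SUBLAYOUTS[name] for name in sublayout_names)
    page = LAYOUTS[layout_name].replace("[main]", inner)
    return page.format(**fields)


# Every ordering of one or two sublayouts, on every layout.
arrangements = [p for r in (1, 2) for p in permutations(SUBLAYOUTS, r)]
variants = [(layout, arr) for layout in LAYOUTS for arr in arrangements]

print(len(variants))
print(compose("mobile", ("teaser", "article"), {"Title": "News"}))
```

Four small artifacts yield eight distinct page variants here, because the combination happens at request time instead of being baked into one file per variant.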
At some point, Sitecore calls your XSL Rendering with the structure of the Sitecore Content Tree as parameter. You can now use the whole power of XSLT and additional Sitecore Extension Functions to transform your content into HTML or - if you want to - into some other kind of XML or non-XML structure. XSLT is in its capacity as functional, transformational language a perfect fit for the Sitecore Content Tree, the Items Concept, the way humans think and (most importantly) nearly all information architectures and websites are structured.
If you have not worked with functional programming languages or transformation DSLs before, it might feel a little strange at first, but the basics of XSLT can be picked up quickly. If you are not comfortable working with XSLT at all (although you should at least try), no problem! As I mentioned earlier, there are several other rendering technologies already available and more to come (the ASP.NET MVC View template language in particular is, in my opinion, a very promising candidate for the future).
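For readers who have never seen a transformation-style rendering, the idea can be approximated in a few lines of Python. To be clear, this is not XSLT and not how Sitecore invokes a rendering; the XML shape (item elements with name and title attributes) and the render_menu function are assumptions made for this sketch. It only shows the recursive "match a node, emit markup, recurse into children" pattern that fits a tree of items so well.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML view of a small content tree (not Sitecore's real format).
CONTENT = """
<item name="Home" title="Welcome">
  <item name="News" title="Latest news">
    <item name="Launch" title="Product launch"/>
  </item>
  <item name="About" title="About us"/>
</item>
""".strip()


def render_menu(item: ET.Element) -> str:
    """Turn an item and all its descendants into a nested HTML list."""
    children = item.findall("item")
    nested = ""
    if children:
        nested = "<ul>" + "".join(render_menu(c) for c in children) + "</ul>"
    return f"<li>{escape(item.get('title', ''))}{nested}</li>"


root = ET.fromstring(CONTENT)
print("<ul>" + render_menu(root) + "</ul>")
```

The function never needs to know how deep the tree is or which pages exist; a new item in the content shows up in the output without touching the rendering.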

As you can already see from the way this part of the article has been written, creating Sitecore-based sites and applications is pretty straightforward and your course of action is based on a defined process with additional guidelines. Pages in Sitecore are essentially the combination of layouts as dynamic containers for information coming from the Content Tree. Page Layouts and MasterPages provide a common look and feel as well as advanced functionality, if necessary (we will cover that in a later part of this series). But let’s summarize what we have learned.

Advantages

  1. Sitecore provides a simple, well-integrated and very powerful Software Architecture with a universal concept named Sitecore Items.
  2. Developers can use the Sitecore Installer to install pre-defined Sitecore packages to use as starting points for their development -or- to create their own toolbox.
  3. Sitecore provides strong separation of content and presentation. In fact, creating presentation artifacts is also separated into designer responsibilities (MasterPages and ASPX Pages) and developer tasks (Templates and Renderings), so each group can usually work without influencing the other group’s work.
  4. The Sitecore Item concept is much more flexible and advanced than the SharePoint list concept. It is additionally easier to understand and to use, if explained properly (e.g. by reading the documentation or attending a SCD1 or a SHD training).
  5. Developers are given clear guidance on how to create Sitecore sites.
  6. As Sitecore Renderings live in the file system, a developer can select his tool of choice according to the nature of the task at hand and the type of rendering being created.
  7. Sitecore’s Content Tree usually reflects your individual information architecture.
  8. Sitecore is completely "content-driven" which makes it possible to re-use content across pages and sites without any additional effort.

Drawbacks

  1. Sitecore does not provide the same amount of out-of-the-box functionality as SharePoint, therefore it is not a plug’n’play system. It is up to the developer to install a complete pre-designed site, partial site templates or additional features using the Sitecore Installer within the Sitecore Client web interface.
  2. Sitecore is more developer-centric than SharePoint. You have to learn about the Items concept and you have to learn at least one rendering engine, preferably XSLT.
  3. To provide all the table-centric features of SharePoint lists with all filters and options, a Sitecore developer has to create a certain amount of customizations, especially templates and renderings. This might be a good place for future improvement, especially for intranet scenarios (and yes, we’re currently working on a concept).

This article covers a lot of features in both products and should provide a pretty complete overview of the customization a developer usually performs in both platforms. Unfortunately for Microsoft, SharePoint lacks the transparency and clarity of Sitecore and makes it difficult for developers to use all the pre-defined SharePoint features productively. Sitecore also surpasses SharePoint in terms of flexibility and it is better integrated with the ASP.NET framework. Because of that, working with Sitecore feels much more natural for a developer. SharePoint provides more out-of-the-box features, but developers often stumble upon the complexity of these pre-packed components. Sitecore, on the other hand, can be easily extended with additional functionality and even complete sites.

These facts lead to the following verdict from a developer perspective:

Sitecore: +8 points = 24 points total
SharePoint: +5 points = 17 points total

Looking forward to seeing you soon in part 5.

Julius Ganns . netzkern
