Adding Solr Search to Mura CMS
Posted on Jun 15, 2010
The best thing about Mura CMS is that much of what you need to build a typical content-managed site is already built in. However, the second best thing about Mura CMS is that anything that isn't built in can generally be added through its extensibility without breaking your upgrade path. In the case of my application, one of the requirements was that we use ColdFusion 9's new Solr-based search engine to index all of the site content. By default, Mura uses standard query-based searching, in part, I suspect, because it's designed to run on the other CFML engines should you choose to use one. I am running ColdFusion 9, so I want Solr! Here's how I did it.
Tapping Into the Mura Event
As a Mura request processes, it "announces" a number of events that you can easily write event handlers to respond to (see this post on the Mura site for details). In order to build the Solr plugin, I needed to tap into several of these. To better understand this, let's break down the different aspects of adding Solr support to Mura:
- When the application loads, we need to check if the collection exists. If not, we need to create it;
- When you submit a search, the application should return results from our Solr plugin rather than the default search query;
- When you add a new content item or update an existing one, we should refresh that item in the Solr index;
- If you remove a content item, we should remove that item from the index.
Each of these requirements maps to a separate event - 1) onApplicationLoad; 2) onSiteSearchRender; 3) onContentSave; and 4) onContentDelete. Let's examine the various methods of my event handler component.
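Before going through each method, here is a rough skeleton of how the four events might sit together in one CFC. It is only an outline: the collection name "muraContent" is a placeholder, and the method bodies are filled in over the rest of this post.

```cfml
<cfcomponent output="false">

	<!--- Name of the Solr collection; "muraContent" is a placeholder value --->
	<cfset variables.collectionName = "muraContent" />

	<!--- 1) Application load: create and populate the collection if it is missing --->
	<cffunction name="onApplicationLoad" access="public" output="false" returntype="void">
	</cffunction>

	<!--- 2) Search: run cfsearch and hand the results to the display object --->
	<cffunction name="onSiteSearchRender" access="public" output="false" returntype="void">
		<cfargument name="event" type="any" required="true" />
	</cffunction>

	<!--- 3) Save: add or refresh the saved item in the index --->
	<cffunction name="onContentSave" access="public" output="false" returntype="void">
		<cfargument name="event" type="any" required="true" />
	</cffunction>

	<!--- 4) Delete: remove the item from the index --->
	<cffunction name="onContentDelete" access="public" output="false" returntype="void">
		<cfargument name="event" type="any" required="true" />
	</cffunction>

</cfcomponent>
```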
Building the Collection
To begin with, inside the cfcomponent tag of my event handler CFC, I have defined a simple string variable, variables.collectionName, containing the name of the Solr collection that will hold my searchable content. After that, the first method of my component is triggered when my Mura application loads (onApplicationLoad), and it simply checks whether a collection with that name exists.
<cffunction name="onApplicationLoad" access="public" output="false" returntype="void">
<cfif not collectionExists(variables.collectionName)>
<cfset indexAllContent() />
</cfif>
</cffunction>
In order to do this, I created a simple method to see if the collection exists.
<cffunction name="collectionExists" access="private" output="false" returntype="boolean">
<cfargument name="collectionName" type="string" required="true" />
<cfset var getCollections = "" />
<cfset var returnVal = false />
<cfcollection action="list" name="getCollections" />
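<!--- listFindNoCase matches whole list items; listContainsNoCase would also match substrings of other collection names --->
<cfif listFindNoCase(valueList(getCollections.Name),arguments.collectionName)>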
<cfset returnVal = true />
</cfif>
<cfreturn returnVal />
</cffunction>
If the collection doesn't exist, I call a method that I have called indexAllContent that will handle adding all of the content already in my Mura content database (and, if necessary, any file-based content) to a new collection. I am leaving out some of the specifics of this method as they are uniquely relevant to the application I am building (for instance we don't want all content indexed, just content within specific categories). I leave it up to you to determine how you might want to filter the content you wish to index. Note that in my case I created the collection with the categories attribute set to true and assigned the different sections to categories as required.
<cffunction name="indexAllContent" access="private" output="false" returntype="void">
<cfset var getContent = "" />
<cfset var dsn = application.serviceFactory.getBean("configBean").getDatasource() />
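<!--- engine="solr" is required; without it ColdFusion 9 creates a Verity collection --->
<cfcollection action="create" collection="#variables.collectionName#" engine="solr" categories="true" path="#expandPath('/../collections')#" />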
<!--- now populate the collection with content --->
<cfquery name="getContent" datasource="#dsn#">
SELECT tcontent_ID
,ContentID
,Title
,Body
FROM tcontent
WHERE active = 1
</cfquery>
<cfindex action="update" collection="#variables.collectionName#" key="contentID" type="custom" query="getContent" title="title" body="body" />
</cffunction>
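One thing to watch on a multi-site Mura install, as a reader points out in the comments: the query above pulls active content from every site. Below is a minimal sketch of a site-scoped variant. The method name indexSiteContent and its siteID argument are hypothetical, it assumes the collection has already been created, and it relies on tcontent having a SiteID column, so check that against your own schema.

```cfml
<cffunction name="indexSiteContent" access="private" output="false" returntype="void">
	<!--- siteID is a hypothetical argument: the Mura site whose content should be indexed --->
	<cfargument name="siteID" type="string" required="true" />
	<cfset var getContent = "" />
	<cfset var dsn = application.serviceFactory.getBean("configBean").getDatasource() />
	<!--- same query as indexAllContent, restricted to a single site --->
	<cfquery name="getContent" datasource="#dsn#">
		SELECT ContentID
		,Title
		,Body
		FROM tcontent
		WHERE active = 1
		AND SiteID = <cfqueryparam value="#arguments.siteID#" cfsqltype="cf_sql_varchar" />
	</cfquery>
	<cfindex action="update" collection="#variables.collectionName#" key="contentID" type="custom" query="getContent" title="title" body="body" />
</cffunction>
```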
Displaying the Results
Now that we've indexed the content, we need to be able to display those results rather than the default search results in Mura. Mura provides an event, onSiteSearchRender, expressly for this purpose. Still, this is the part of the event handler that caused me the most trouble, in part because it behaved differently in some ways than I expected and in part because there isn't always a ton of documentation about the Mura component API.
The easy part of the function below is the cfsearch which should be self-explanatory. After that you'll see some method calls on the contentIterator object in Mura. The buildQueryFromList() method actually populates the iterator with content objects based upon a list of content IDs, which is why I simply provide a valueList of the key field from my index, which just contains the contentID. The setNextN() method just tells the iterator how many records will be displayed per page. Lastly, the setStartRow() method does exactly what you think it would.
You will notice that I actually set the result into a request variable. I would have expected a more direct way, say via a return value, to replace the iterator that would typically display on the page with this one; however, that didn't seem to be the case. Perhaps that will change in the future.
<cffunction name="onSiteSearchRender" access="public" output="false" returntype="void">
<cfargument name="event" type="any" required="true" />
<cfset var category = event.getValue('category') />
<cfset var getResults = "" />
<cfset var iterator = application.serviceFactory.getBean("contentIterator") />
<!--- no type attribute: Solr collections reject type="internet" with an "unknown handler" error --->
<cfsearch collection="#variables.collectionName#" name="getResults" criteria="#event.getValue('keywords')#" category="#category#" />
<cfset iterator.buildQueryFromList(valueList(getResults.key),event.getValue("siteID")) />
<cfset iterator.setNextN(10) />
<cfset iterator.setStartRow(event.getValue('startRow')) />
<cfset request.iterator = iterator />
</cffunction>
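As the update in the comments notes, Solr reads a colon in the criteria as a field or category separator, so a keyword such as "word:" can throw an undefined-field error. A minimal sketch of one way to guard against that is to replace colons with spaces before searching. The helper name cleanKeywords is hypothetical, and this gives up field-style searches in exchange for not erroring.

```cfml
<cffunction name="cleanKeywords" access="private" output="false" returntype="string">
	<cfargument name="keywords" type="string" required="true" />
	<!--- swap colons for spaces so Solr does not treat "word:" as a field name --->
	<cfreturn trim(replace(arguments.keywords, ":", " ", "all")) />
</cffunction>
```

To use it, you would pass cleanKeywords(event.getValue('keywords')) as the criteria of the cfsearch in onSiteSearchRender instead of the raw keywords value.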
Updating and Deleting Content in the Index
Now that we have an index in Solr and can return results to the user, we need to make sure that as we add or update content within the Mura admin, those changes are reflected in our index. To do this, we simply need to tie into the onContentSave and onContentDelete methods provided in a Mura event handler. Both of these methods are simple, though in my application onContentSave actually loops through the currentBean's "crumbdata" (an array of the hierarchy of categories within the site tree that this content belongs to) to determine whether the content belongs to a category I plan on indexing (and adds the appropriate category to the cfindex call). However, since that is specific to my requirements, I left it out.
<cffunction name="onContentSave" access="public" output="false" returntype="void">
<cfargument name="event" type="any" required="true" />
<cfset var currentBean = event.getValue('contentBean') />
<cfindex action="update" collection="#variables.collectionName#" type="custom" key="#currentBean.getValue('contentID')#" title="#currentBean.getValue('title')#" body="#currentBean.getValue('body')#" />
</cffunction>
Deleting the content from the index is even easier. This is true even if you are only concerned with specific content categories, as I am: if the content doesn't exist in the index, the delete simply has nothing to remove.
<cffunction name="onContentDelete" access="public" output="false" returntype="void">
<cfargument name="event" type="any" required="true" />
<cfindex action="delete" collection="#variables.collectionName#" key="#arguments.event.getValue('contentID')#" type="custom" />
</cffunction>
What's Next?
As you would with any event handler, you now simply need to add this handler component (I named mine solrSearch.cfc) into your site's specific eventHandler component. Open up /[siteid]/includes/eventHandler.cfc and add this code into the onApplicationLoad() method (note that I manually call the onApplicationLoad() method of my solrSearch component, as that is not automatically called by Mura):
<cfset var solrSearch = createObject("component","solrSearch") />
<cfset application.pluginManager.addEventHandler(solrSearch,event.getValue("siteID")) />
<cfset solrSearch.onApplicationLoad() />
The final step is simply to customize the search to meet your needs and to use the results you are providing. For the most part, you can copy the code that's in the /[siteID]/includes/display_objects/dsp_search_results.cfm and place it in /[siteID]/includes/display_objects/custom/dsp_search_results.cfm. However, much of the code that comes before the display needs to be slightly modified. Here's the code that replaced lines 44-82 from the default display object. You can always do as I did and cfdump the iterator and go through trial and error to get this right.
<cfset iterator = request.iterator />
<cfset totalRecords = iterator.recordcount() />
<cfset currentRow = iterator.currentRow() />
<cfset recordsPerPage=10 />
<cfset numberOfPages= iterator.pageCount() />
<cfset currentPageNumber=iterator.getPageIndex() />
<cfset next = request.startrow + recordsPerPage />
<cfset previous = request.startrow - recordsPerPage />
<cfset through = min(totalRecords, next - 1) />
I know I don't give all the answers here, largely because how your search functionality is built depends on your specific requirements. However, I hope that the code I provided can get you started. Feel free to send me any questions.
Comments
Cool stuff Brian! As an aside, one recent feature that I've really enjoyed using lately is the Mura scope, which gives you handy shortcuts into things like Mura's bean factory (powered by ColdSpring), the content bean and configbean. So instead of application.serviceFactory.getBean(), you can simply use $.getBean(); $.content() instead of event.getValue("contentbean"); and $.siteConfig("datasource") instead of application.serviceFactory.getBean("configbean").getDataSource().
Posted By Tony Garcia / Posted on 06/15/2010 at 10:35 PM
Actually, for the datasource, you would use $.globalConfig("datasource")
Posted By Tony Garcia / Posted on 06/15/2010 at 10:37 PM
As an update to this, a former coworker noted some errors in the code as originally posted. Here are his notes:
- When creating a Solr collection with the cfcollection tag, you must specify engine="solr" or else it creates a Verity collection;
- Solr doesn't use type="" in the cfsearch tag. Using type="" with a Solr collection will create an error: Error executing query : unknown_handler_internet_basic
- In Solr search, colons are used to search on a category. Adding a colon to the end of a word that is not a category will produce an error: Error executing query : undefined_field_ (the word that had the colon after it)
http://help.adobe.com/en_US/ColdFusion/9.0/Developing/WSCCDC2C74-DE46-4ea1-B42C-DDF4F623B704.html
Posted By Brian Rinaldi / Posted on 09/19/2010 at 7:39 PM
Great post and reference. We at BMChosting are moving to Mura CMS since we are a ColdFusion-centric shop.
Posted By Greg / Posted on 10/25/2010 at 8:00 AM
Another free, open-source-based option for search is SearchBlox. It is built on top of Lucene, the same API that Solr uses.
Posted By tss / Posted on 12/31/2010 at 3:48 AM
Good stuff Brian...
Just one thing: your indexAllContent() method should specify the siteid of your current site, or else if you have a multi-site installation of Mura, you'll get the content from all sites indexed.
With a bit of extra effort, it would also be possible to index any attached files (such as PDFs and Word documents). I might try to put all this together into a Mura plugin...
Posted By Seb Duggan / Posted on 04/04/2011 at 10:56 AM