Integrating Zend_Lucene with Yii

December 5, 2009

I’m not a big fan of using the Zend Framework as my Web development tool, but one of the framework’s nicest features is that you can use only the parts of it you need. I am, however, a big fan of the Yii framework, and one of its many pluses is that you can easily integrate other frameworks and tools into it. Like, for example, the Zend Framework. Yii does not have its own search engine functionality, and Apache’s Lucene is arguably the gold standard (although clearly not the only choice), so tapping into Zend’s Lucene module (Zend_Search_Lucene, a PHP implementation of Lucene) for a Yii-driven site makes a lot of sense. In this post, I’ll walk you through the steps for integrating Zend_Lucene into Yii. This post does assume familiarity with PHP, MVC, and Yii.

To start, let’s create a spot in the Yii application for the Zend Framework. Create a new directory called vendors within the Yii protected folder. This isn’t required, but as the Zend Framework is a different beast than all the Yii code, I think it’s best to separate it out. Within vendors, create a directory called Zend (or ZendFramework, if you’d rather).

Next, download the Zend Framework. You’ll want to download the latest full package, even though you’ll only use a bit of it. After the download has completed, expand the ZIP or TAR.GZ file (whichever format you chose to download the framework in). The result will be a folder named ZendFramework-x.y.z (where x.y.z represents the full version number). Within that folder, go into library/Zend and copy Exception.php to protected/vendors/Zend. This is the file that the Zend Framework uses to report problems, so you’ll want to include it while developing and debugging Zend_Lucene with Yii. Also copy the Search folder to protected/vendors/Zend. You’ll end up with a structure like this:

protected/
    vendors/
        Zend/
            Exception.php
            Search/
                Lucene.php
                Lucene/

In terms of the MVC architecture, the Zend Framework provides the Model to be used by this search process, but the Controller and View will still be done using Yii. First, let’s write a new Controller for searching:

class SearchController extends CController
{
    // Location of the Lucene index files (read as $this->_indexFile below).
    private $_indexFile = '../runtime/search';
    public function actionIndex() {}
    public function actionCreate() {}
    public function actionSearch() {}
    public function actionUpdate() {}
}

As with all Yii Controllers, this one extends the base CController class. Within this Controller, four methods are defined, corresponding to the actions that’ll be taken in the search process. The index action is the default and is for accessing the search page without performing an actual search (e.g., clicking on a link to go to the search page). The create action will be used to generate the search database: the series of files that Lucene needs to perform its searches. The search action is for handling submission of the search form (i.e., it does the actual searching). Finally, the update action is for updating the Lucene database files when necessary (like when the site content changes). The class also has one private variable that stores the server location of the Lucene database files. I chose to put them in a search folder found within runtime (protected/runtime/search). Having this as a class member is useful because multiple methods need the information; I made it private because nothing outside of the class needs it (nor should anything outside access it). As a naming convention, some like to use underscores at the front of private class variables.
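One thing to watch with a relative path like the one above: PHP resolves it against the current working directory, which for a Web request is usually the directory of the entry script rather than the Controller’s own directory. Here’s a minimal sketch of building an absolute path instead, assuming the Controller lives in protected/controllers (the function name searchIndexPath is my own invention, not part of Yii or Zend):

```php
<?php
// Hypothetical helper (not part of Yii or Zend): builds an absolute path to
// the index directory from the directory that holds the Controller file.
// Assumes the standard layout: protected/controllers and protected/runtime.
function searchIndexPath($controllersDir)
{
    // dirname() steps up from protected/controllers to protected.
    return dirname($controllersDir)
        . DIRECTORY_SEPARATOR . 'runtime'
        . DIRECTORY_SEPARATOR . 'search';
}
```

In the Controller you might call searchIndexPath(dirname(__FILE__)) and assign the result to the private variable before the index is first used. Yii’s own Yii::app()->runtimePath should point at the same protected/runtime folder, if you’d rather ask the framework.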

Within three of the methods (not actionIndex()), the Controller will use Zend_Lucene. In order to do so, this script needs access to the Zend files, so import the contents of the vendors directory at the top of this script, just before the class definition begins:

Yii::import('application.vendors.*');

Then, include the Lucene.php file, found within the Zend Framework Search folder:

require_once('Zend/Search/Lucene.php');

Now this Yii Controller can create objects of type Zend_Search_Lucene, which is defined in that file. The actions will use that object type to perform the searches. To start, the index action just renders the index View:

public function actionIndex()
{
    $this->render('index');
}

Presumably the index View file just shows the search form. The search form, by the way, should have an action attribute of http://www.example.com/index.php/search/search, so that it calls the search action of the search Controller, and it should use the GET method, because the search action looks for the terms in the URL. The form should contain a text input with the name terms.
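As a rough illustration of that form, here’s a sketch in plain PHP. The function name searchFormHtml is hypothetical, and in a real Yii View you’d more likely write the markup directly or use Yii’s form helpers; the point is only the GET method and the input named terms:

```php
<?php
// Hypothetical helper: returns the markup for a minimal search form.
// The form submits via GET so the terms arrive in the URL.
function searchFormHtml($actionUrl)
{
    // Escape the URL before placing it in an HTML attribute.
    $action = htmlspecialchars($actionUrl, ENT_QUOTES, 'UTF-8');
    return '<form action="' . $action . '" method="get">'
        . '<input type="text" name="terms" />'
        . '<input type="submit" value="Search" />'
        . '</form>';
}
```

Calling echo searchFormHtml('http://www.example.com/index.php/search/search'); in the index View would print the form.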

The update action would be used by an administrator to update the search database. Perhaps it’d be called automatically after some content is generated or once per hour or day. It would destroy the existing search database and then invoke the actionCreate() method. Lucene can’t modify an indexed document in place: to change one, you delete it from the index and add it again. For a small site, destroying and recreating the whole database is the simplest way to handle that. It doesn’t much matter what View this action renders; that depends upon what you want the admin to see. Maybe the View would just show a message indicating that the database has been updated.
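The “destroy” half of the update action might be sketched like this. The helper name clearSearchIndex is hypothetical, and the code assumes the search folder holds only Lucene’s own files, with no subdirectories:

```php
<?php
// Hypothetical helper: deletes every file in the index directory so that
// the create step can rebuild the search database from scratch.
// Returns the number of files removed.
function clearSearchIndex($dir)
{
    if (!is_dir($dir)) {
        return 0; // Nothing to clear yet.
    }
    $removed = 0;
    foreach (scandir($dir) as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        // is_file() skips the "." and ".." entries (and any subdirectory).
        if (is_file($path) && unlink($path)) {
            $removed++;
        }
    }
    return $removed;
}
```

The update action could call this with the index location and then invoke actionCreate(), as described above.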

The create action is an important one, and is where real knowledge of Lucene comes into play. The shell of it would look like so:

public function actionCreate() {
    $index = new Zend_Search_Lucene($this->_indexFile, true);
    // Add documents to the database.
    $index->commit();
    $this->render('create');
}

First, a Zend_Search_Lucene object is created (again, this is where Yii is making use of a class defined outside of Yii; it’s a sweet thing). The first argument provided when creating the object is the location of the database files. This is represented by the Controller variable, accessible in $this->_indexFile. The second argument indicates that a fresh database should be created. Next up, you add content to the database. This is complicated and well beyond the scope of what I’m writing here. I’ll try to discuss this, in brief, in a separate post, but I’d recommend you read as much as you can online first. In a very minimalistic way, you could add a single HTML page to the search database by doing this:

$url = 'http://www.example.com/index.php/page/show/id/1';
$doc = Zend_Search_Lucene_Document_Html::loadHTMLFile($url);
$index->addDocument($doc);

Finally, the database has to be saved, by invoking the commit() method. And then some View is rendered. As this action would likely only be called by an administrator or cron, it doesn’t matter much what the View contains.

Lastly, there’s the search action. This action would check for search terms, run the search against the database, then send the results on to a View:

public function actionSearch() {
    if (isset($_GET['terms'])) {
        $index = new Zend_Search_Lucene($this->_indexFile);
        $results = $index->find($_GET['terms']);
        $this->render('search', array('results' => $results));
    } else {
        $this->render('index');
    }
}

First the method checks for the presence of search terms in the URL. Then it creates a Zend_Search_Lucene object, which is necessary for both creating and using the search database. This time only the location of the search database is passed when creating the object. The object’s find() method is invoked for performing the search (it can be that simple!). Then the search View is rendered, passing it the results. If no search terms were passed to this page, the index View is rendered instead. As for the search results View, a basic version to get you started might look like this:

<h2>Search Results for "<?php echo CHtml::encode($_GET['terms']); ?>"</h2>
<?php if ($results): ?>
    <?php foreach($results as $result): ?>
        <p><?php echo CHtml::encode($result->title); ?></p>
    <?php endforeach; ?>
<?php else: ?>
    <p class="error">No results matched your search terms.</p>
<?php endif; ?>

That’s largely the logic and structure of a search results View. It displays the provided search terms and checks for results. If there were some, each result title is printed. In a real application, you’d likely link the title to a URL or whatever but I don’t want to get too messy here. If you do print_r($result), you’ll see a bunch of information there that you can use.
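To give a sense of the linked version, here’s a sketch that turns each result into a link with a relevance percentage. It makes two assumptions: each result has a url value, which is only available if you stored a url field when adding the document to the database (the minimal HTML example earlier does not do that), and the function name searchResultsHtml is my own. The score property is what Zend_Lucene reports for each hit.

```php
<?php
// Hypothetical helper: builds the markup for a list of search results.
// Assumes each result exposes title, url (a field you stored yourself),
// and score (a relevance value between 0 and 1).
function searchResultsHtml(array $results)
{
    $html = '';
    foreach ($results as $result) {
        // Escape everything that ends up in the page.
        $title = htmlspecialchars($result->title, ENT_QUOTES, 'UTF-8');
        $url = htmlspecialchars($result->url, ENT_QUOTES, 'UTF-8');
        $html .= '<p><a href="' . $url . '">' . $title . '</a> ('
            . round($result->score * 100) . '%)</p>';
    }
    return $html;
}
```

In the search View, echo searchResultsHtml($results); could stand in for the foreach loop.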

So those are the steps you need to take to get started using Zend_Lucene within your Yii application. These steps provide functionality; mastering Lucene is how you make this more professional. I’ll try to write more about defining a Lucene search database in subsequent posts towards that end. If you have any comments, questions, or requests, let me know.

Thanks,

Larry