Zend Search Lucene searching whole product description

Below is the code for the search engine on my website. Right now it only searches what is referred to as ProductName and ProductNumber. I don’t know what needs to be changed to make it search the whole ProductDescription.
Here is the Search.php file:

protected $_index;
protected $_indexed = array();
/**
 * 
 * @var Zend_Http_Client
 */
protected $_httpClient;

public function __construct()
{
    try {
        $indexDir = realpath($_SERVER['DOCUMENT_ROOT'] . '/../tmp/search');
        $this->_index = Zend_Search_Lucene::open($indexDir);
    } catch (Zend_Search_Lucene_Exception $e) {
        $this->_index = Zend_Search_Lucene::create($indexDir);
    }

    $this->_httpClient = new Zend_Http_Client();
    $this->_httpClient->setConfig(array('timeout' => 10));

    Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive());
}

public function indexUrl($url)
{
    if (is_array($url)) {
        foreach ($url as $uri) {
            $this->_indexUrl($uri);
        }
    } else if (is_string($url)) {
        $this->_indexUrl($url);
    }
}

public function indexWholePage()
{
    $pageUrl = $this->_getHostName();

    $this->_indexUrl($pageUrl . '/');
}

protected function _indexUrl($url)
{
    if (in_array($url, $this->_indexed))
        return;

    $log = Zend_Registry::get('Zend_Log');
    $log->log($url, Zend_Log::NOTICE);

    $this->_httpClient->setUri($url);
    $response = $this->_httpClient->request();

    $this->_indexed[] = $url;

    if ($response->isSuccessful()) {
        $body = $response->getBody();

        $doc = Zend_Search_Lucene_Document_Html::loadHTML($body, true);

        foreach ($doc->getLinks() as $link) {
            if ($this->_isValidPageLink($link) && !in_array($this->_getHostName() . $link, $this->_indexed)) {
                $this->_indexUrl($this->_getHostName() . $link);
            }
        }

        $t = new Zend_Search_Lucene_Index_Term($url, 'url');
        $q = new Zend_Search_Lucene_Search_Query_Term($t);
        $hits = $this->_index->find($q);

        foreach ($hits as $hit) {
            if ($hit->md5 == md5($body)) {
                return;
            } else {
                $this->_index->delete($hit->id);
            }
        }

        $doc->addField(Zend_Search_Lucene_Field::Keyword('url', $url));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('md5', md5($body)));

        $this->_index->addDocument($doc);

        $log = Zend_Registry::get('Zend_Log');
        $log->log('done', Zend_Log::NOTICE);
    }
}

public function search($query)
{
    return $this->_index->find($query);
}

public function deleteIndex()
{
    
}

protected function _getHostName()
{
    $host = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '';
    $proto = (isset($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== "off") ? 'https' : 'http';
    $port = isset($_SERVER['SERVER_PORT']) ? $_SERVER['SERVER_PORT'] : 80;
    $uri = $proto . '://' . $host;

    if ((('http' == $proto) && (80 != $port)) || (('https' == $proto) && (443 != $port))) {
        $uri .= ':' . $port;
    }

    return $uri;
}

protected function _isValidPageLink($url)
{
    $hostName = $this->_getHostName();

    if (substr($url, 0, strlen($hostName)) == $hostName ||
            substr($url, 0, 1) == '/' || substr($url, 0, 1) == '?') {
        if (@preg_match('#^(.+)\.(jpg|gif|png|pdf|doc|xls)$#i', $url)) {
            return false;
        }
        return true;
    }

    return false;
}

And here is the PHP form and template that generate the search results. The Lucene implementations I found while searching were completely different from what is here. This is my first time working with Zend Framework.

 <form method="get" action="/search.html" class="searchForm" enctype="application/x-www-form-urlencoded" id="searchForm">
  <fieldset>
    <input type="text" id="search_text" name="q" value="<?php echo $this->escape($this->query) ?>"><br>
     <input type="submit" value="search" id="search" name="search"> 
  </fieldset>
</form>

<h1>Search results</h1>

<?php if(empty($this->searchString)): ?>
          <p><strong>Please write text of minimal length of <?php echo $this->minimumLength ?></strong></p>
<?php else: ?>

<?php if(count($this->products)){ ?>

<?php foreach ($this->products as $product): ?>
<?php $link = '/'.$this->permalink($product->product_name).','.$product->product_id.','.$product->category_id.',p.html'; ?>
<div class="productlist clearfix">
  <a href="<?= $link; ?>" class="clearfix">
<div class="txt">
  <h2><?= $product->product_name ?><?php if(strlen($product->product_number) > 2){ echo '<small> [ '.$product->product_number.' ]</small>'; } ?></h2>
  <p><?= stripslashes($product->product_intro2) ?></p>
</div>
<div class="pic">
   <?php if($product->has_media): ?>
     <?php echo $this->thumb($product->media_src, 110, 110) ?>
   <?php endif; ?>
   <p style="text-align: center;">More</p>
</div>
</a>
</div>
<hr/>

<?php endforeach; ?>

<?php }else{ ?>
<p>0 product was found</p>
<?php } ?>

<div style="clear: both;">
<?php echo $this->products; ?>
</div>



<?php endif ?>

Just a quick note: our Lucene component has been deprecated since we released ZF 2.0, and has not been maintained in many (> 5) years. As a result, it’s very likely you may not get much help on the topic.

So it’s like I suspected.
My company’s website was developed in 2013 and runs on ZF 1.11.11.
I’m also a bit afraid to upgrade it to the newest ZF (it may break the website).

Ooof… 1.11.11 is very old, and the latest ZF1 release, 1.12.20, issued 8 Sep 2016, marked the end-of-life for version 1. It no longer receives security fixes, nor any additional patches. You will need to upgrade or migrate at some point.

One thing you may want to consider is moving your Zend_Search_Lucene functionality to something like Solr or ElasticSearch, both of which are based on Lucene, but work on the latest Lucene versions, and offer a ton of convenience and additional features. This would be a good first step in a migration, allowing you to tackle other features later.

Yeah, it’s pretty old, and I’m leaning towards upgrading it.
But after I made a copy of the website and database and put it on another subdomain, I got this error: Application Error: cache_dir must be a directory.
I set chmod 777 on the entire tmp folder and its subfolders. Of course, I also set up the new database connection.
After some research I added echo $result to File.php (under backend) and got something like this: /tmp…/tmp/cache/dbMetadata/
I don’t know what else I should change.

Looks like a weird file path, as if the configuration of the path is wrong.

In File.php I have something like this:

protected $_options = array(
    'cache_dir' => '/../tmp/',
    'file_locking' => true,
    'read_control' => true,
    'read_control_type' => 'crc32',
    'hashed_directory_level' => 0,
    'hashed_directory_umask' => 0700,
    'file_name_prefix' => 'zend_cache',
    'cache_file_umask' => 0600,
    'metadatas_array_max_size' => 100
);

When I change cache_dir to NULL, the result is the same.
The error appears for every path that is wrong.
And as I said above, this file structure was simply copied from the previous location so that I could modify it.
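
A note on the `'cache_dir' => '/../tmp/'` value above: because it starts with a slash, PHP resolves it from the filesystem root, not from the application directory. The sketch below is one way to check how PHP actually sees a candidate directory before handing it to the cache backend. `inspectCacheDir()` is a made-up helper, and `$appPath` is only a stand-in for `APPLICATION_PATH`, so treat it as a diagnostic aid under those assumptions rather than a fix.

```php
<?php
/**
 * Report how PHP sees a candidate cache directory.
 * inspectCacheDir() is a hypothetical helper, not part of Zend Framework.
 */
function inspectCacheDir($path)
{
    // realpath() returns false when the path does not exist.
    $resolved = realpath($path);

    return array(
        'given'    => $path,
        'resolved' => $resolved,
        'is_dir'   => $resolved !== false && is_dir($resolved),
        'writable' => $resolved !== false && is_writable($resolved),
    );
}

// Assumption: $appPath stands in for APPLICATION_PATH of the real project.
$appPath = __DIR__;

$candidates = array(
    '/../tmp/',                          // absolute: resolved from the filesystem root
    $appPath . '/../tmp/cache/system/',  // anchored to the application directory
);

foreach ($candidates as $candidate) {
    var_dump(inspectCacheDir($candidate));
}
```

If `resolved` comes back as `false` or `is_dir` is `false` for the path the application really uses, the directory is missing at that location on the new subdomain, and changing permissions alone will not help.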

What if you just try to use the application without the cache and postpone the fix? Since you will want to upgrade anyway, you might not need the cache to get there. I remember using the file cache quite a lot, even in ZF 1.11.11 and it always worked, but I also remember that getting the file path right took me some time.

After I commented out this method in Bootstrap.php:

protected function _initCache()
{
    $frontendName = 'Core';
    $backendName = 'File';

    $frontend = array('automatic_serialization' => true);
    $backend  = array('cache_dir' => APPLICATION_PATH . '/../tmp/cache/system/');
    $cache = Zend_Cache::factory($frontendName, $backendName,
    $frontend, $backend);
     
    Zend_Registry::set('Zend_Cache', $cache);
    return $cache;
}

I got this error:
Application Error: Resource matching “cache” not found

OK, so that does not help. But your initial error message comes from the database metadata cache, which can be disabled, depending on how it is set up. If it is configurable, it can probably be disabled or simply not enabled. What is done in _initCache() should not trigger the error with the path “/tmp…/tmp/cache/dbMetadata/”, because _initCache() sets a different path.
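
If the metadata cache is attached to Zend_Db_Table at runtime, switching it off might look like the sketch below. `Zend_Db_Table_Abstract::setDefaultMetadataCache()` is a ZF1 method, and passing `null` clears the default cache. The wrapper function name is made up, and the `class_exists()` guard is only there so the snippet also runs outside the application. Whether this helps depends on where the cache is built: if the exception is thrown while the cache object itself is being constructed (for example from a configuration entry), that entry would have to be removed instead.

```php
<?php
/**
 * Hypothetical helper: clear the default Zend_Db_Table metadata cache.
 * Returns true when the call was made, false when ZF1 is not loaded.
 */
function disableDbMetadataCache()
{
    // Guard so the snippet also runs where Zend Framework 1 is not available.
    if (!class_exists('Zend_Db_Table_Abstract')) {
        return false;
    }

    // Passing null removes the default metadata cache; table classes that
    // were given their own cache explicitly are not affected by this call.
    Zend_Db_Table_Abstract::setDefaultMetadataCache(null);

    return true;
}

var_dump(disableDbMetadataCache());
```

Inside the application this would have to be called early, for example from a bootstrap method that runs before any table class is used.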