function index() {
    $this->load->library('zend', 'Zend/Feed');
    $this->load->library('zend', 'Zend/Search/Lucene');
    $this->load->library('zend');
    $this->zend->load('Zend/Feed');
    $this->zend->load('Zend/Search/Lucene');

    //Create index.
    $index = new Zend_Search_Lucene('C:\\xampp\\xampp\\htdocs\\controle_frota\\lucene\\feeds_index', true);
    $feeds = array('http://oglobo.globo.com/rss.xml?limite=50');

    //grab each feed.
    foreach ($feeds as $feed) {
        $channel = Zend_Feed::import($feed);
        echo $channel->title() . '<br />';

        //index each item.
        foreach ($channel->items as $item) {
            if ($item->link() && $item->title() && $item->description()) {
                //create an index doc.
                $doc = new Zend_Search_Lucene_Document();
                $doc->addField(Zend_Search_Lucene_Field::Keyword('link', $this->sanitize($item->link())));
                $doc->addField(Zend_Search_Lucene_Field::Text('title', $this->sanitize($item->title())));
                $doc->addField(Zend_Search_Lucene_Field::Unstored('contents', $this->sanitize($item->description())));
                echo "\tAdding: " . $item->title() . '<br />';
                $index->addDocument($doc);
            }
        }
    }

    $index->commit();
    echo $index->count() . ' Documents indexed.<br />';
}
public function add($content, $section, $mtime) {
    foreach ($this->split_headings($content) as $headers) {
        $doc = new Zend_Search_Lucene_Document();
        $link = "index.php?page=" . preg_replace('/\\/|\\\\/', '.', $section);
        $link = str_replace('.page', '', $link) . '#' . $headers['section'];

        //unsearchable text
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('link', $link));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('mtime', $mtime));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('title', $headers['title']));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('text', $headers['content']));

        //searchable text
        $doc->addField(Zend_Search_Lucene_Field::Keyword('page', strtolower($headers['title'])));
        $body = strtolower($this->sanitize($headers['content'])) . ' ' . strtolower($headers['title']);
        $doc->addField(Zend_Search_Lucene_Field::Unstored('contents', $body));
        $this->_index->addDocument($doc);
    }
}
function create_index() {
    echo "Building search index...\n";
    $files = $this->get_files($this->_api);
    $count = 0;
    foreach ($files as $file) {
        $content = $this->get_details($file, $this->_api);
        $doc = new Zend_Search_Lucene_Document();
        $title = $content['class'];
        echo " Adding " . $title . "\n";

        //unsearchable text
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('link', $content['link']));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('title', $title));
        //$doc->addField(Zend_Search_Lucene_Field::UnIndexed('text', $content['content']));

        //searchable
        $body = strtolower($this->sanitize($content['content'])) . ' ' . strtolower($title);
        $doc->addField(Zend_Search_Lucene_Field::Keyword('page', strtolower(str_replace('.', ' ', $title))));
        $doc->addField(Zend_Search_Lucene_Field::Unstored('contents', $body));
        $this->_index->addDocument($doc);
        $count++;
    }
    $this->_index->commit();
    echo "\n {$count} files indexed.\n";
}
public function indexAction() {
    // action body
    $this->_helper->viewRenderer->setNoRender(true);
    $jobs = new Application_Model_DbTable_JobPortal();
    $index = Zend_Search_Lucene::create('C:\\indexed');
    $maxBufferedDocs = 100;
    $index->setMaxBufferedDocs($maxBufferedDocs);
    $users = $jobs->fetchAll();
    Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num());

    foreach ($users as $user) {
        $doc = new Zend_Search_Lucene_Document();
        // Each field below maps to a database column of the same name.
        $doc->addField(Zend_Search_Lucene_Field::Keyword('pri', $user->id));
        $doc->addField(Zend_Search_Lucene_Field::Text('title', $user->title));
        $doc->addField(Zend_Search_Lucene_Field::Text('shortd', $user->shortd));
        $doc->addField(Zend_Search_Lucene_Field::Unstored('longd', $user->longd));
        $doc->addField(Zend_Search_Lucene_Field::Text('exp', $user->exp));
        $doc->addField(Zend_Search_Lucene_Field::Text('location', $user->location));
        $index->addDocument($doc);
    }

    $index->commit();
    $this->_helper->redirector('search');
}
public function actionFulltext() {
    // get dirty
    $this->template->num_docs = fulltext::index()->numDocs();
    $this->template->dirty = fulltext::dirty();
    $this->template->num_dirty = count($this->template->dirty);

    // index
    $index = fulltext::index();
    $this->template->update_now = array_slice($this->template->dirty, 0, 50);
    if (!empty($this->template->update_now)) {
        adminlog::log(__('Attempt to update fulltext'));
    }

    foreach (mapper::products()->findByIds($this->template->update_now) as $product) {
        // delete old
        foreach ($index->termDocs(new Zend_Search_Lucene_Index_Term($product->getId(), 'id')) as $id) {
            $index->delete($id);
        }

        // add
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::Keyword('id', $product->getId()));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('name', $product->getName()));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('nice_name', $product->getNiceName()));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('code', $product->getCode()));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('meta_keywords', $product->getMetaKeywords()));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('meta_description', $product->getMetaDescription()));

        $description = '';
        if (strlen($product->getDescription()) < 1) {
            if (strlen($product->getMetaDescription()) < 1) {
                $description = $product->getName();
            } else {
                $description = $product->getMetaDescription();
            }
        } else {
            $description = $product->getDescription();
        }

        if ($manufacturer = mapper::products()->findManufacturerOf($product->getId())) {
            $doc->addField(Zend_Search_Lucene_Field::UnStored('manufacturer', $manufacturer->getName()));
            $description .= ' ' . $manufacturer->getName();
            $description .= ' ' . $manufacturer->getDescription();
        }

        if ($category = mapper::products()->findCategoryOf($product->getId())) {
            $doc->addField(Zend_Search_Lucene_Field::UnStored('category', $category->getName()));
            $description .= ' ' . $category->getName();
            $description .= ' ' . $category->getDescription();
        }

        $description .= ' ' . $product->getName();
        $doc->addField(Zend_Search_Lucene_Field::UnStored('description', $description));
        $index->addDocument($doc);
    }

    // undirty updated
    foreach ($this->template->update_now as $id) {
        fulltext::dirty($id, FALSE);
    }

    // log
    adminlog::log(__('Successfully updated %d fulltext items, %d remains'), count($this->template->update_now), $this->template->num_dirty - count($this->template->update_now));

    // refresh
    $s = 5;
    Environment::getHttpResponse()->setHeader('Refresh', $s . '; ' . (string) Environment::getHttpRequest()->getOriginalUri());
    $this->template->next_update = $s;
}
        $cindex->addDocument($doc);
        $changed = true;
    }
    mysql_free_result($thread_result);
    if ($changed) {
        $cindex->commit();
    }

    //Do the upload:
    $term = new Zend_Search_Lucene_Index_Term($fileid, 'fileid');
    $docIds = $uindex->termDocs($term);
    if (count($docIds) == 0) {
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::Keyword('fileid', $fileid));
        $doc->addField(Zend_Search_Lucene_Field::Unstored('filename', $filename));
        $doc->addField(Zend_Search_Lucene_Field::Unstored('uploader', $uploader));
        $uindex->addDocument($doc);
        $uindex->commit();
    }
    echo "\n";
    flush();
}
echo "Done.\noptimizing indices... ";
flush();
$cindex->optimize();
$uindex->optimize();
echo "Done (" . number_format($timer->reset(), 5) . "s)\n\nUpload index size is " . $uindex->count() . ", " . $uindex->numDocs() . " documents.\n\nComment index size is " . $cindex->count() . ", " . $cindex->numDocs() . " documents.\n\n";
// Declared non-static because the body reads $this->pageDocument,
// which is not available in a static method.
public function saveSearchIndex() {
    $index = new Zend_Search_Lucene(sfConfig::get('sf_lib_dir') . '/modules/search/tmp/lucene.user.index', true);
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Keyword('id', $this->pageDocument->getId()));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('pageid', $this->pageDocument->getId()));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('title', $this->pageDocument->getNavigationTitle()));

    $blobData = $this->pageDocument->getContent();
    // $blockContents = $blobData->__toString();
    $blockContents = $blobData;
    $doc->addField(Zend_Search_Lucene_Field::Unstored('contents', $blockContents . ' ' . $this->pageDocument->getNavigationTitle()));
    $index->addDocument($doc);
    $index->commit();

    $hits = $index->find(strtolower('maquette'));
    foreach ($hits as $hit) {
        // echo $hit->score.'<br/>';
        // echo $hit->id;
        // echo $hit->contents.'<br/>';
        echo $hit->pageid;
    }
}
public static function addInformationObjectIndex(QubitInformationObject $informationObject, $language, $options = array()) {
    // Only ROOT node should have no parent, don't index
    if (null === $informationObject->parent) {
        return;
    }

    $doc = new Zend_Search_Lucene_Document();

    // Reference elements
    $doc->addField(Zend_Search_Lucene_Field::Keyword('id', $informationObject->id));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('slug', $informationObject->slug));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('culture', $language));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('className', 'QubitInformationObject'));
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('parentId', $informationObject->parentId));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('parent', $informationObject->parent->slug));
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('collectionRootId', $informationObject->getCollectionRoot()->id));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('collectionRootSlug', $informationObject->getCollectionRoot()->slug));
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('collectionRootTitle', $informationObject->getCollectionRoot()->getTitle()));

    // Digital object information
    if ($digitalObject = $informationObject->getDigitalObject()) {
        $doc->addField(Zend_Search_Lucene_Field::Keyword('hasDigitalObject', 'true'));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('do_mediaTypeId', $digitalObject->mediaTypeId));
        if (null !== $digitalObject->thumbnail) {
            $doc->addField(Zend_Search_Lucene_Field::UnIndexed('do_thumbnail_FullPath', $digitalObject->thumbnail->getFullPath()));
        }
        // $doc->addField(Zend_Search_Lucene_Field::Unstored('mediatype', $digitalObject->getMediaType()->getName(array('culture' => $language))));
        // $doc->addField(Zend_Search_Lucene_Field::Unstored('filename', $digitalObject->getName()));
        // $doc->addField(Zend_Search_Lucene_Field::Unstored('mimetype', $digitalObject->mimeType));
    } else {
        $doc->addField(Zend_Search_Lucene_Field::Keyword('hasDigitalObject', 'false'));
    }

    // Title
    // include an i18n fallback for proper search result display in case the title field was not translated
    if (0 < strlen($informationObject->getTitle(array('culture' => $language)))) {
        $titleField = Zend_Search_Lucene_Field::Text('title', $informationObject->getTitle(array('culture' => $language)));
    } else {
        $titleField = Zend_Search_Lucene_Field::Text('title', $informationObject->getTitle(array('sourceCulture' => true)));
    }
    // Boost the hit relevance for the title field
    $titleField->boost = 10;
    $doc->addField($titleField);

    // Publication status
    $doc->addField(Zend_Search_Lucene_Field::Text('publicationStatusId', $informationObject->getPublicationStatus()->status->id));
    $doc->addField(Zend_Search_Lucene_Field::Text('scopeAndContent', $informationObject->getScopeAndContent(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Text('referenceCode', $informationObject->referenceCode));

    // Store dates as serialized array
    $dates = array();
    foreach ($informationObject->getDates() as $date) {
        // Reset per date so an 'actor' from a previous date does not carry over
        $save_date = array();
        $save_date['id'] = $date->id;
        $save_date['rendered'] = Qubit::renderDateStartEnd($date->getDate(array('cultureFallback' => true)), $date->startDate, $date->endDate);
        $save_date['type'] = $date->getType(array('cultureFallback' => true))->__toString();
        if (isset($date->actor)) {
            $save_date['actor'] = $date->actor->__toString();
        }
        $dates[] = $save_date;
    }
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('dates', serialize($dates)));

    // CREATOR
    $creatorField = Zend_Search_Lucene_Field::Unstored('creator', $informationObject->getCreatorsNameString(array('culture' => $language)));
    // Boost the hit relevance for the creator field
    $creatorField->boost = 8;
    $doc->addField($creatorField);
    $doc->addField(Zend_Search_Lucene_Field::Unstored('creatorhistory', $informationObject->getCreatorsHistoryString(array('culture' => $language))));

    // Level of Description
    if (null !== $informationObject->getLevelOfDescription()) {
        $doc->addField(Zend_Search_Lucene_Field::Text('levelOfDescription', $informationObject->getLevelOfDescription()->getName(array('culture' => $language))));
    } else {
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('levelOfDescription', null));
    }

    // Repository
    $repository = $informationObject->getRepository(array('inherit' => true));
    if (null !== $repository) {
        $doc->addField(Zend_Search_Lucene_Field::Keyword('repositoryId', $repository->id));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('repositorySlug', $repository->slug));
        $doc->addField(Zend_Search_Lucene_Field::Text('repositoryName', $repository->getAuthorizedFormOfName(array('culture' => $language))));
    } else {
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('repositoryId', null));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('repositorySlug', null));
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('repositoryName', null));
    }

    // Identifier
    $identifierField = Zend_Search_Lucene_Field::Text('identifier', $informationObject->getIdentifier());
    $identifierField->boost = 5;
    $doc->addField($identifierField);

    // I18n fields
    $doc->addField(Zend_Search_Lucene_Field::Unstored('alternatetitle', $informationObject->getAlternateTitle(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('edition', $informationObject->getEdition(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('extentandmedium', $informationObject->getExtentAndMedium(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('archivalhistory', $informationObject->getArchivalHistory(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('acquisition', $informationObject->getAcquisition(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('appraisal', $informationObject->getAppraisal(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('accruals', $informationObject->getAccruals(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('arrangement', $informationObject->getArrangement(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('accessconditions', $informationObject->getAccessConditions(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('reproductionconditions', $informationObject->getReproductionConditions(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('physicalcharacteristics', $informationObject->getPhysicalCharacteristics(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('findingaids', $informationObject->getFindingAids(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('locationoforiginals', $informationObject->getLocationOfOriginals(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('locationofcopies', $informationObject->getLocationOfCopies(array('culture' => $language))));
    $doc->addField(Zend_Search_Lucene_Field::Unstored('relatedunitsofdescription', $informationObject->getRelatedUnitsOfDescription(array('culture' => $language))));

    // Subjects
    $subjectField = Zend_Search_Lucene_Field::Unstored('subject', $informationObject->getAccessPointsString(QubitTaxonomy::SUBJECT_ID, array('culture' => $language)));
    // Boost the hit relevance for the subject field
    $subjectField->boost = 5;
    $doc->addField($subjectField);

    // Place
    $placeField = Zend_Search_Lucene_Field::Unstored('place', $informationObject->getAccessPointsString(QubitTaxonomy::PLACE_ID, array('culture' => $language)));
    // Boost the hit relevance for the place field
    $placeField->boost = 3;
    $doc->addField($placeField);

    // Names
    $nameField = Zend_Search_Lucene_Field::Unstored('name', $informationObject->getNameAccessPointsString(array('culture' => $language)));
    // Boost the hit relevance for the name field
    $nameField->boost = 3;
    $doc->addField($nameField);

    $cultureInfo = sfCultureInfo::getInstance($language);
    $languages = $cultureInfo->getLanguages();
    $scripts = $cultureInfo->getScripts();

    // Languages
    if (0 < count($properties = $informationObject->getProperties($name = 'language'))) {
        $languageCodes = unserialize($properties->offsetGet(0)->getValue(array('sourceCulture' => true)));
        if (0 < count($languageCodes)) {
            $languageString = '';
            foreach ($languageCodes as $languageCode) {
                $languageString .= $languages[$languageCode] . ' ';
            }
            $doc->addField(Zend_Search_Lucene_Field::Unstored('language', rtrim($languageString)));
        }
    }

    // Scripts
    if (0 < count($properties = $informationObject->getProperties($name = 'script'))) {
        $scriptCodes = unserialize($properties->offsetGet(0)->getValue(array('sourceCulture' => true)));
        if (0 < count($scriptCodes)) {
            $scriptString = '';
            foreach ($scriptCodes as $scriptCode) {
                $scriptString .= $scripts[$scriptCode] . ' ';
            }
            $doc->addField(Zend_Search_Lucene_Field::Unstored('script', rtrim($scriptString)));
        }
    }

    // Notes
    if (0 < count($notes = $informationObject->getNotes())) {
        $noteString = '';
        foreach ($notes as $note) {
            $noteString .= $note->getContent(array('culture' => $language)) . ' ';
        }
        $doc->addField(Zend_Search_Lucene_Field::Unstored('notes', $noteString));
    }

    // Exclude control area fields for now, maybe add a separate index for administrative data?
    // (institution_responsible_identifier, rules, sources, revision_history)

    // To come:
    // Add all dynamic metadata fields to index

    self::getInstance()->getEngine()->getIndex()->addDocument($doc);
}