// Include the PHPCrawl main class (adjust the path to your PHPCrawl installation)
include("libs/PHPCrawler.class.php");

$crawler = new MyCrawler();
$crawler->setURL("www.php.net");

// Only receive the content of documents with content-type "text/html"
$crawler->addContentTypeReceiveRule("#text/html#");

// Ignore links to images
$crawler->addURLFilterRule("#\\.(jpg|jpeg|gif|png)\$# i");

// Set the page-limit to 50 for testing
$crawler->setPageLimit(50);

// Important for resumable scripts/processes!
$crawler->enableResumption();

// At the first start of the script, retrieve the crawler-ID and store it
// (in a temporary file in this example).
// On a restart, read the stored crawler-ID back and resume the crawler.
if (!file_exists("/tmp/mycrawlerid_for_php.net.tmp")) {
    $crawler_ID = $crawler->getCrawlerId();
    file_put_contents("/tmp/mycrawlerid_for_php.net.tmp", $crawler_ID);
} else {
    $crawler_ID = file_get_contents("/tmp/mycrawlerid_for_php.net.tmp");
    $crawler->resume($crawler_ID);
}

// Start crawling with 5 parallel processes
$crawler->goMultiProcessed(5);

// Delete the stored crawler-ID after the process has finished completely and successfully
unlink("/tmp/mycrawlerid_for_php.net.tmp");

// Print a short summary of the crawling process
$report = $crawler->getProcessReport();

if (PHP_SAPI == "cli") {
    $lb = "\n";
} else {
    $lb = "<br />";
}

echo "Summary:" . $lb;
echo "Links followed: " . $report->links_followed . $lb;
echo "Documents received: " . $report->files_received . $lb;
echo "Bytes received: " . $report->bytes_received . " bytes" . $lb;
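The example assumes a MyCrawler class that extends PHPCrawler and overrides its handleDocumentInfo() method, which PHPCrawl calls once for every document it receives. A minimal sketch follows (the echoed fields are just an illustration; any per-document processing can go in the method body), assuming PHPCrawl 0.8x, where the method receives a PHPCrawlerDocumentInfo object:

// Extend the PHPCrawler class and override handleDocumentInfo().
// This method gets called once for every document the crawler finds.
class MyCrawler extends PHPCrawler
{
    function handleDocumentInfo(PHPCrawlerDocumentInfo $PageInfo)
    {
        // Example only: print the URL and HTTP status code of each received document
        echo "Page requested: " . $PageInfo->url .
             " (" . $PageInfo->http_status_code . ")" . PHP_EOL;
    }
}

The class definition has to appear before the new MyCrawler() call above. Also note that goMultiProcessed() is only available when PHP runs as a CLI script on a Unix-like system with the PCNTL, SysV-semaphore and POSIX extensions available; in other environments, use go() for single-process crawling instead.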