PHP Zrashwani\NewsScrapper Selector::isXPath Examples

Programming language: PHP

Namespace/Package: Zrashwani\NewsScrapper

Class/Type: Selector

Method/Function: isXPath

Examples at hotexamples.com: 2

PHP Zrashwani\NewsScrapper Selector::isXPath - 2 examples found. These are the top-rated real-world examples of Zrashwani\NewsScrapper\Selector::isXPath extracted from open source projects. You can rate the examples to help us improve their quality.

Frequently used methods

isXPath(2)

isCSS(1)

Example #1

File: Client.php Project: zrashwani/news-scrapper

/**
 * scrap one source of news
 * @param string $baseUrl url to scrap list of news from
 * @param string $linkSelector css selector for news links in page
 * @param int|NULL $limit limit of news article to scrap,
 * if not set it will scrap all matching the selector
 * @return array array of article items scrapped
 */
public function scrapLinkGroup($baseUrl, $linkSelector, $limit = null)
{
    $crawler = $this->scrapClient->request('GET', $baseUrl);
    $scrap_result = array();
    $theAdapter = new Adapters\DefaultAdapter();
    $theAdapter->currentUrl = $baseUrl;

    $isXpath = Selector::isXPath($linkSelector);
    $method = $isXpath === false ? 'filter' : 'filterXPath';

    $crawler->{$method}($linkSelector)->each(
        function (Crawler $link_node) use (&$scrap_result, $theAdapter, &$limit) {
            if (!is_null($limit) && count($scrap_result) >= $limit) {
                return;
            }
            $link = $theAdapter->normalizeLink($link_node->attr('href'), true); //remove hash before scrapping
            $article_info = $this->getLinkData($link);
            $this->setAdapter(''); //reset default adapter after scrapping one link
            $scrap_result[$link] = $article_info;
        }
    );

    return $scrap_result;
}
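The example above uses the result of Selector::isXPath to choose between the crawler's filter and filterXPath methods, but the body of isXPath itself is not shown on this page. The following is only a minimal sketch of that dispatch pattern. The functions looksLikeXPath and crawlerMethodFor are hypothetical names invented for illustration, and the prefix-based heuristic is an assumption, not the library's actual implementation.

```php
<?php
// Hypothetical heuristic (NOT the library's Selector::isXPath):
// treat a selector as XPath when it starts with "/", "./", "../" or "(".
function looksLikeXPath(string $selector): bool
{
    return preg_match('#^\s*(?:\.{0,2}/|\()#', $selector) === 1;
}

// Hypothetical helper: pick the crawler method name for a selector,
// mirroring the ternary used in scrapLinkGroup().
function crawlerMethodFor(string $selector): string
{
    return looksLikeXPath($selector) ? 'filterXPath' : 'filter';
}

// Usage: print the method that would be called for a few selectors.
foreach (['//a[@href]', './div/a', 'div.article a', '.news a'] as $selector) {
    echo $selector, ' => ', crawlerMethodFor($selector), PHP_EOL;
}
```

A prefix check like this is deliberately crude: an XPath expression that begins with an axis or element name (for example descendant::a) would be classified as CSS, so a real implementation would likely need a stricter test.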

Example #2

File: AbstractAdapter.php Project: zrashwani/news-scrapper

/**
 * extract image source by selector
 * @param Crawler $crawler
 * @param string $selector
 * @return string|NULL
 */
protected function getSrcByImgSelector(Crawler $crawler, $selector)
{
    $ret = null;
    $imgExtractClosure = function (Crawler $node) use (&$ret) {
        $ret = $node->attr('src');
    };

    if (Selector::isXPath($selector)) {
        $crawler->filterXPath($selector)->each($imgExtractClosure);
    } else {
        $crawler->filter($selector)->each($imgExtractClosure);
    }

    if (empty($ret) === false) {
        return $this->normalizeLink($ret);
    } else {
        return null;
    }
}
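One detail of the example above is worth illustrating: the closure captures $ret by reference and overwrites it for every matched node, so when several nodes match, the last one's src is what remains. The sketch below reproduces only that capture pattern with a plain array standing in for the matched nodes. The function lastSrc and the sample paths are hypothetical and not part of the project.

```php
<?php
// Hypothetical stand-in for getSrcByImgSelector(): $sources plays the role
// of the src attributes of the nodes matched by the selector.
function lastSrc(array $sources): ?string
{
    $ret = null;

    // Same shape as $imgExtractClosure: $ret is captured by reference
    // and overwritten on every call.
    $extract = function (string $src) use (&$ret): void {
        $ret = $src;
    };

    foreach ($sources as $src) {
        $extract($src);
    }

    // Mirrors the empty() check in the original: empty values become null.
    return empty($ret) ? null : $ret;
}

// Usage: with two matches, the second one is kept.
var_dump(lastSrc(['/img/first.png', '/img/second.png']));
var_dump(lastSrc([]));
```

If the first match were wanted instead, the closure would need to skip assignment once $ret is set; the original code does not do this.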