buse974/SimplePageCrawler

 
 


ZF2 SimplePageCrawler module

Version 0.3.0 Created by Vincent Blanchon

Introduction

SimplePageCrawler is a web page crawler. It retrieves the following information from a page:

  • Title
  • Meta (description, Open Graph, etc.)
  • H1, H2, etc.
  • List of the images
  • List of the links
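To make the list above concrete, here is a minimal sketch of how this kind of information can be pulled out of an HTML string with PHP's built-in DOM extension. It is not the module's own implementation: the function name `extractPageInfo` and the shape of the returned array are assumptions made for illustration only.

```php
<?php
// Hypothetical helper, NOT part of SimplePageCrawler: collects the same
// kinds of information from an HTML string using PHP's DOM extension.
function extractPageInfo(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep parser warnings internal.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $info = ['title' => null, 'meta' => [], 'headings' => [], 'images' => [], 'links' => []];

    // Title: text of the first <title> element, if any.
    $titleNode = $xpath->query('//title')->item(0);
    if ($titleNode !== null) {
        $info['title'] = trim($titleNode->textContent);
    }

    // Meta: standard tags use "name", Open Graph tags use "property".
    foreach ($xpath->query('//meta[@content]') as $meta) {
        $key = $meta->getAttribute('name') ?: $meta->getAttribute('property');
        if ($key !== '') {
            $info['meta'][$key] = $meta->getAttribute('content');
        }
    }

    // Headings: grouped by tag name (h1, h2, ...), in document order.
    foreach ($xpath->query('//h1|//h2|//h3|//h4|//h5|//h6') as $heading) {
        $info['headings'][$heading->nodeName][] = trim($heading->textContent);
    }

    // Images and links: raw attribute values, not resolved against a base URL.
    foreach ($xpath->query('//img[@src]') as $img) {
        $info['images'][] = $img->getAttribute('src');
    }
    foreach ($xpath->query('//a[@href]') as $a) {
        $info['links'][] = $a->getAttribute('href');
    }

    return $info;
}
```

The sketch works on a string that has already been downloaded; fetching the page and resolving relative URLs are left out on purpose.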

Usage

Get page information:

// In a ZF2 controller, fetch the crawler service from the service locator.
$crawler = $this->getServiceLocator()->get('SimplePageCrawler');
$page = $crawler->get('http://www.nytimes.com');

echo sprintf('The title is "%s"', $page->getTitle());
echo sprintf('The description is "%s"', $page->getMeta('description'));

You can also use the action helper:

$page = $this->simplePageCrawler('http://www.nytimes.com');

echo sprintf('The title is "%s"', $page->getTitle());
echo sprintf('The description is "%s"', $page->getMeta('description'));

Advanced usage

You can get Open Graph metadata:

$page = $this->simplePageCrawler('http://www.nytimes.com');
$metas = $page->getMeta()->getOpenGraph();
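The return type of `getOpenGraph()` is not documented here. As a rough illustration of what Open Graph extraction involves, the sketch below collects every `<meta property="og:...">` tag from an HTML string into a key/value array. The function name `extractOpenGraph` and the choice to strip the `og:` prefix are assumptions for this example, not the module's behavior.

```php
<?php
// Hypothetical helper, NOT the module's getOpenGraph(): returns every
// <meta property="og:..."> tag as a key => content map, "og:" prefix removed.
function extractOpenGraph(string $html): array
{
    $doc = new DOMDocument();
    // Suppress warnings caused by invalid markup, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $openGraph = [];

    // Open Graph tags are identified by a "property" attribute starting with "og:".
    foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
        $key = substr($meta->getAttribute('property'), 3); // drop "og:"
        $openGraph[$key] = $meta->getAttribute('content');
    }

    return $openGraph;
}
```

If a property appears more than once (for example several `og:image` tags), this sketch keeps only the last value; a fuller version would collect repeated properties into a list.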
