A simple PHP DOM scraper based on DOMDocument class

rafasashi/php-dom-scraper

php-dom-scraper

A simple PHP DOM scraper based on the DOMDocument class and preg_match() functions

Supported formats

  • html
  • css

USAGE

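// Fetch the raw markup; $url is the address of the page to scrape
$html_contents = file_get_contents($url);

// Parse the markup; the second argument names the content type ('html' here)
$dom_contents = parse_dom_contents($html_contents, 'html');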
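
Note that file_get_contents() returns false when the fetch fails, so it may be worth checking the result before handing it to the parser. The following is a minimal sketch under that assumption; fetch_contents() is a hypothetical helper introduced here for illustration and is not part of this repository.

```php
<?php
// Hypothetical helper for illustration; not part of php-dom-scraper.
function fetch_contents(string $url, int $timeout = 10): ?string
{
    // Apply a timeout to HTTP(S) requests; other stream wrappers ignore it.
    $context = stream_context_create(['http' => ['timeout' => $timeout]]);

    // Suppress the warning and inspect the return value instead.
    $contents = @file_get_contents($url, false, $context);

    // Normalize the failure case to null so callers can test for it.
    return $contents === false ? null : $contents;
}
```

With such a helper, the call to parse_dom_contents($html_contents, 'html') would only run when fetch_contents($url) returns something other than null.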

OUTPUT

HTML

$dom_contents['html:head'] = '';
$dom_contents['html:links'] = '';
$dom_contents['html:scripts'] = '';
$dom_contents['html:styles'] = '';
$dom_contents['html:body'] = '';
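
To give a rough idea of how keys like these could be filled with DOMDocument, here is an illustrative sketch. It is not the repository's parse_dom_contents() implementation: the function name sketch_html_parts() is hypothetical, and it assumes each key holds the serialized markup of the matching elements, which the README does not confirm.

```php
<?php
// Illustrative sketch only; not the repository's parse_dom_contents().
function sketch_html_parts(string $html): array
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid, so collect libxml errors quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize every element with the given tag name into one string.
    $collect = function (string $tag) use ($doc): string {
        $out = '';
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $out .= $doc->saveHTML($node);
        }
        return $out;
    };

    // Assumed mapping of output keys to tag names.
    return [
        'html:head'    => $collect('head'),
        'html:links'   => $collect('link'),
        'html:scripts' => $collect('script'),
        'html:styles'  => $collect('style'),
        'html:body'    => $collect('body'),
    ];
}
```

The libxml error handling is there because DOMDocument::loadHTML() emits warnings on malformed markup, which is common when scraping.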

CSS

$dom_contents['css'][$selector] = $value;
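
A selector-to-value map of this shape could be built with the preg_* functions roughly as follows. This is a sketch under assumptions, not the repository's parser: sketch_css_rules() is a hypothetical name, and it assumes $value is the raw declaration text between the braces.

```php
<?php
// Illustrative sketch only; not the repository's CSS parser.
function sketch_css_rules(string $css): array
{
    // Drop /* ... */ comments so they are not mistaken for selectors.
    $css = preg_replace('~/\*.*?\*/~s', '', $css);

    $rules = [];

    // Capture flat "selector { declarations }" pairs.
    // At-rule wrappers such as @media are not preserved by this pattern.
    if (preg_match_all('~([^{}]+)\{([^{}]*)\}~', $css, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $rules[trim($match[1])] = trim($match[2]);
        }
    }

    return $rules;
}
```

A later rule with the same selector overwrites the earlier one in this sketch; merging declarations would need extra handling.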

TODO

  • html -> (a,meta)
  • css -> @(media|import|local)
  • xml
  • rss
  • atom
