PageScraper

This project aims to robustly extract the main article content from any content-heavy page.

It generates a very minimal reading page with only one highly compressed repeating background image, no fonts, no JS, and one shared CSS file. The mobile view is activated automatically based on screen width (using a CSS media query). All of this comes in at less than 30KB uncompressed.

Basic usage:

pagescrape.php?targetUrl=http://www.somenewssitehere.com/somearticle
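If the target address itself contains a query string, it generally needs to be percent-encoded before it is passed as `targetUrl`, otherwise its `&` and `?` characters may be read as part of the outer request. The following is a minimal sketch of building such a request URL in Python. The host `http://localhost/` and the helper name `build_scrape_url` are assumptions for illustration, not part of the project.

```python
from urllib.parse import urlencode


def build_scrape_url(base, target_url):
    """Return `base` with `target_url` percent-encoded as the targetUrl parameter."""
    return base + "?" + urlencode({"targetUrl": target_url})


# Hypothetical install location; replace with wherever pagescrape.php is served.
request_url = build_scrape_url(
    "http://localhost/pagescrape.php",
    "http://www.somenewssitehere.com/somearticle?id=1&page=2",
)
print(request_url)
```

The resulting URL can then be opened in a browser or fetched with any HTTP client.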

Common Issues and Fixes

NGINX Error: "upstream sent too big header while reading response header from upstream"

This means the response header that the FastCGI backend (PHP) sent back while processing the requested page was larger than your FastCGI buffer size allows.

To fix this, increase fastcgi_buffer_size in your nginx config.
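As a rough sketch, the change might look like the fragment below. The `location` block and the sizes shown are assumptions for illustration only; the right values depend on your setup. `fastcgi_buffers` is raised alongside `fastcgi_buffer_size` because nginx checks the two against each other and may reject a config where only the first one is increased.

```nginx
# Illustrative values only; tune them for your own setup.
location ~ \.php$ {
    # ... your existing fastcgi_pass / include fastcgi_params lines ...

    # Buffer for the first part of the FastCGI response (the headers).
    fastcgi_buffer_size 32k;

    # Number and size of the buffers used for the rest of the response.
    fastcgi_buffers 8 32k;
}
```

After editing, check the config with `nginx -t` and then reload nginx so the new sizes take effect.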

Nixes/PageScraper

Folders and files

Latest commit

History

Repository files navigation

PageScraper

Common Issues and Fixes

NGINX Error: "upstream sent too big header while reading response header from upstream"

About

Topics

Resources

Stars

Watchers

Forks

Languages