msankhala/parsehub-php

ParsehubPhp

A wrapper class for the ParseHub REST API. Use it to communicate with ParseHub from PHP. The class uses phphttpclient to make HTTP requests and monolog to log the operations it performs. See below for the log path and API URL options.

Installation

You can download or clone this repo, or install it via Composer:

composer require msankhala/parsehub-php

Features

  • Uses the phphttpclient class to make HTTP requests.
  • Supports basic logging using monolog.
  • Uses PSR-0 autoloading.

Usage

Create a Parsehub object to communicate with ParseHub, passing your api_key to the class constructor. You can optionally pass api_url and log_path (the log file path) as the second and third arguments.

  • api_url default value: https://www.parsehub.com/api/v2
  • log_path default value: <repo-root>/log/parsehub.log

Autoload the Parsehub class:

require_once __DIR__ . '/vendor/autoload.php';

use Parsehub\Parsehub;

In your controller you can use the Parsehub class to get the list of all your ParseHub projects and the run objects for a project, and save them in your database. When you fetch a project's information you also get that project's run_list, which you can store in your database as well.

Get the list of ParseHub projects:

$api_key = '<your-api-key>';
$parsehub = new Parsehub($api_key);
$projectList = $parsehub->getProjectList();
echo $projectList;

or

$api_key = '<your-api-key>';
$api_url = 'https://www.parsehub.com/api/v2';
$log_path = 'path/to/parsehub.log';
$parsehub = new Parsehub($api_key, $api_url, $log_path);
$projectList = $parsehub->getProjectList();
echo $projectList;
// Get project_token and run_token from your database.
$project_token = '<project token from db>';
$run_token = '<run token from db>';
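
The run_list mentioned above can be pulled out of a project list response before storing it. The following is a minimal sketch, not part of this library: it assumes getProjectList() returns a JSON string shaped like the ParseHub API v2 project list (a `projects` array whose items carry `token` and `run_list`). The helper name `collectRunTokens` and the sample payload are hypothetical.

```php
<?php
// Hypothetical helper: maps each project token to the run tokens in its run_list.
// Assumes the response is a JSON string with a top-level "projects" array.
function collectRunTokens($json)
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['projects']) || !is_array($decoded['projects'])) {
        return array();
    }

    $tokensByProject = array();
    foreach ($decoded['projects'] as $project) {
        if (!isset($project['token'])) {
            continue; // Skip entries without a project token.
        }
        $tokensByProject[$project['token']] = array();
        $runs = isset($project['run_list']) ? $project['run_list'] : array();
        foreach ($runs as $run) {
            if (isset($run['run_token'])) {
                $tokensByProject[$project['token']][] = $run['run_token'];
            }
        }
    }
    return $tokensByProject;
}

// Made-up sample payload standing in for a real response.
$sampleJson = '{"projects":[{"token":"tProj1","title":"Demo",'
    . '"run_list":[{"run_token":"tRunA"},{"run_token":"tRunB"}]}]}';

print_r(collectRunTokens($sampleJson));
```

If the wrapper returns an object rather than a JSON string, the same loop applies after converting it to an array.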

Get a particular ParseHub project by passing its project_token:

$parsehub = new Parsehub($api_key);
$project = $parsehub->getProject($project_token);
echo $project;

Get the data from the last ready run of a project:

$parsehub = new Parsehub($api_key);
$data = $parsehub->getLastReadyRunData($project_token);
print $data;

Get the data for a particular run by passing its run_token:

$parsehub = new Parsehub($api_key);
$data = $parsehub->getRunData($run_token);
print $data;

Get a particular run by passing its run_token:

$parsehub = new Parsehub($api_key);
$run = $parsehub->getRun($run_token);
print $run;
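
Before requesting a run's data it can help to check whether that data is ready. The sketch below is an illustration only: it assumes getRun() returns a JSON string containing the `data_ready` flag that the ParseHub API v2 run object exposes. The helper name `isRunDataReady` and the sample payloads are hypothetical.

```php
<?php
// Hypothetical helper: returns true when the run JSON reports its data as ready.
// Assumes a "data_ready" field that is 1 (ready) or 0 (not ready).
function isRunDataReady($runJson)
{
    $run = json_decode($runJson, true);
    if (!is_array($run)) {
        return false; // Not valid JSON, so treat it as not ready.
    }
    return !empty($run['data_ready']);
}

// Made-up sample payloads standing in for real getRun() responses.
$pending = '{"run_token":"tRunA","status":"running","data_ready":0}';
$done = '{"run_token":"tRunA","status":"complete","data_ready":1}';

var_dump(isRunDataReady($pending)); // bool(false)
var_dump(isRunDataReady($done));    // bool(true)
```

A caller could run this check on the result of getRun($run_token) and call getRunData($run_token) only once it returns true.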

Run a ParseHub project:

$parsehub = new Parsehub($api_key);
$options = array(
    // Omit the start_url option if you don't want to override the starting
    // URL configured on ParseHub.
    'start_url' => '<starting url at which crawling starts>',
    // Comma-separated list of keywords to pass into `start_value_override`.
    'keywords' => 'iphone case, iphone copy',
    // Set the send_email option. Omit it to keep the default value.
    'send_email' => 1,
);
$run_obj = $parsehub->runProject($project_token, $options);
echo $run_obj;
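
Because start_url and send_email are optional, the options array can be built so that unset values are left out entirely. The following is a minimal sketch under that reading of the comments above; the helper name `buildRunOptions` and its parameters are hypothetical and not part of this library.

```php
<?php
// Hypothetical helper: builds the $options array for runProject(),
// including only the keys the caller actually supplied.
function buildRunOptions(array $keywords, $startUrl = null, $sendEmail = null)
{
    $options = array();
    if (count($keywords) > 0) {
        // The README passes keywords as one comma-separated string.
        $options['keywords'] = implode(', ', $keywords);
    }
    if ($startUrl !== null) {
        // Only override the starting URL configured on ParseHub when asked to.
        $options['start_url'] = $startUrl;
    }
    if ($sendEmail !== null) {
        $options['send_email'] = $sendEmail ? 1 : 0;
    }
    return $options;
}

$options = buildRunOptions(array('iphone case', 'iphone copy'));
print_r($options);
```

The resulting array can then be passed as the second argument to runProject($project_token, $options).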

Cancel a ParseHub project run:

$parsehub = new Parsehub($api_key);
$cancel = $parsehub->cancelProjectRun($run_token);
print $cancel;

Delete a ParseHub project run. This deletes both the run and its data, so use this method with care: once the data for a run has been deleted, it cannot be recovered.

$parsehub = new Parsehub($api_key);
$delete = $parsehub->deleteProjectRun($run_token);
print $delete;

The operations performed are recorded in your log file.

About

Wrapper classes for the ParseHub REST API. Uses Httpful for REST.
