68publishers/crawler-client-php

Crawler Client PHP

PHP Client for 68publishers/crawler

Installation

$ composer require 68publishers/crawler-client-php

Client initialization

Create a client instance by calling the static method create().

use SixtyEightPublishers\CrawlerClient\CrawlerClient;

$client = CrawlerClient::create('<full url to your crawler instance>');

The client uses the Guzzle library to communicate with the Crawler API. To pass custom options to the Guzzle configuration, use the optional second parameter.

use SixtyEightPublishers\CrawlerClient\CrawlerClient;

$client = CrawlerClient::create('<full url to your crawler instance>', [
    'timeout' => 0,
]);
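In Guzzle, a timeout of 0 means "wait indefinitely", so the example above never gives up on a slow response. As a hedged variation, the fragment below sets finite limits instead. It assumes the options array is forwarded to the Guzzle client configuration unchanged; the values 30 and 5 are arbitrary and only illustrate the shape.

```php
use SixtyEightPublishers\CrawlerClient\CrawlerClient;

$client = CrawlerClient::create('<full url to your crawler instance>', [
    'timeout' => 30,        // Guzzle option: total seconds allowed for a request
    'connect_timeout' => 5, // Guzzle option: seconds allowed to establish the connection
]);
```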

Requests to the Crawler API must always be authenticated, so we must provide credentials.

use SixtyEightPublishers\CrawlerClient\CrawlerClient;
use SixtyEightPublishers\CrawlerClient\Authentication\Credentials;

$client = CrawlerClient::create('<full url to your crawler instance>');

$client = $client->withAuthentication(new Credentials('<username>', '<password>'));

Note that the client is immutable: calling any of the with* methods always returns a new instance and leaves the original unchanged. This is all the client needs to work properly. Other options are described on the Advanced options page.
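The practical consequence of immutability is that the return value of a with* call must be kept, otherwise the change is lost. The sketch below shows the pattern with a small stand-in class. ImmutableClient, withUsername() and getUsername() are hypothetical names made up for this illustration; they are not part of the library.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in class, NOT the library's CrawlerClient.
// It only illustrates how a "with*" method behaves on an immutable object.
final class ImmutableClient
{
    public function __construct(
        private string $baseUrl,
        private ?string $username = null,
    ) {
    }

    public function withUsername(string $username): self
    {
        // Build a modified copy; $this is left untouched.
        return new self($this->baseUrl, $username);
    }

    public function getUsername(): ?string
    {
        return $this->username;
    }
}

$client = new ImmutableClient('https://crawler.example.test');

// Result discarded: $client itself still has no username.
$client->withUsername('alice');

// Result kept: only the new instance carries the username.
$authenticated = $client->withUsername('alice');

var_dump($client->getUsername());        // NULL
var_dump($authenticated->getUsername()); // string(5) "alice"
```

This is why the example above reassigns the variable: $client = $client->withAuthentication(...).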

Nette Framework integration

For integration with the Nette Framework please follow this link.

Working with scenarios

Scenarios are handled by ScenariosController.

use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenariosController;

$controller = $client->getController(ScenariosController::class);

List scenarios

/**
 * @param int $page
 * @param int $limit
 * @param array<string, string|array<string>> $filter
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenarioListingResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */
$response = $controller->listScenarios(1, 10);

$filteredResponse = $controller->listScenarios(1, 10, [
    'name' => 'Test',
    'status' => 'failed',
]);

Get scenario

/**
 * @param string $scenarioId
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenarioResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->getScenario('<id>');
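The @throws tags above suggest that an unknown ID surfaces as a NotFoundException. A minimal sketch of handling that case follows. The helper findScenario() is a hypothetical name introduced here, not a library function, and the controller is typed as a plain object only to keep the sketch self-contained.

```php
<?php

declare(strict_types=1);

use SixtyEightPublishers\CrawlerClient\Exception\NotFoundException;

/**
 * Returns the scenario response, or null when no scenario has the given ID.
 *
 * Hypothetical helper for illustration; it is not part of the client library.
 * BadRequestException is deliberately not caught and propagates to the caller.
 */
function findScenario(object $controller, string $scenarioId): ?object
{
    try {
        return $controller->getScenario($scenarioId);
    } catch (NotFoundException $e) {
        // Unknown ID: treat it as "no result" instead of an error.
        return null;
    }
}
```

Whether a missing scenario should map to null or stay an exception depends on the caller; this sketch picks null only to show the catch clause.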

Run scenario

/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody $requestBody
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenarioResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */

The scenario config can be passed either as a plain array or as prepared value objects. Both options are valid.

Using a plain array:

use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;

$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->runScenario($requestBody);

Using value objects:

use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->runScenario($requestBody);

Validate scenario

/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody $requestBody
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValidateScenarioResponse
 */

The scenario config can be passed either as a plain array or as prepared value objects. Both options are valid.

Using a plain array:

use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;

$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->validateScenario($requestBody);

Using value objects:

use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->validateScenario($requestBody);

Abort scenario

/**
 * @param string $scenarioId
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Common\NoContentResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->abortScenario('<id>');

Working with scenario schedulers

Scenario schedulers are handled by ScenarioSchedulersController.

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulersController;

$controller = $client->getController(ScenarioSchedulersController::class);

List scenario schedulers

/**
 * @param int $page
 * @param int $limit
 * @param array<string, string|array<string>> $filter
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerListingResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */
$response = $controller->listScenarioSchedulers(1, 10);

$filteredResponse = $controller->listScenarioSchedulers(1, 10, [
    'name' => 'Test',
    'userId' => '<id>',
]);

Get scenario scheduler

/**
 * @param string $scenarioSchedulerId
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->getScenarioScheduler('<id>');
$etag = $response->getEtag(); # the ETag is required for a later update

Create scenario scheduler

/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody $requestBody
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */

The scenario config can be passed either as a plain array or as prepared value objects. Both options are valid.

Using a plain array:

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;

$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->createScenarioScheduler($requestBody);
$etag = $response->getEtag(); # the ETag is required for a later update

Using value objects:

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->createScenarioScheduler($requestBody);
$etag = $response->getEtag(); # the ETag is required for a later update

Update scenario scheduler

/**
 * @param string $scenarioSchedulerId
 * @param string $etag
 * @param \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody $requestBody
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\PreconditionFailedException
 */

The scenario config can be passed either as a plain array or as prepared value objects. Both options are valid.

Using a plain array:

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;

$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->updateScenarioScheduler('<id>', '<etag>', $requestBody);
$etag = $response->getEtag(); # the ETag is required for the next update

Using value objects:

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->updateScenarioScheduler('<id>', '<etag>', $requestBody);
$etag = $response->getEtag(); # the ETag is required for the next update
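Because an update needs the current ETag and can fail with PreconditionFailedException, a read-then-write loop is one way to combine getScenarioScheduler() and updateScenarioScheduler(). The sketch below is a hedged illustration: updateSchedulerWithFreshEtag(), its retry limit, and the retry policy are assumptions introduced here, not library features, and it assumes the exception signals that the supplied ETag no longer matches the stored scheduler.

```php
<?php

declare(strict_types=1);

use SixtyEightPublishers\CrawlerClient\Exception\PreconditionFailedException;

/**
 * Updates a scheduler, re-reading the ETag when the update is rejected.
 *
 * Hypothetical helper: the name, the retry count and the retry policy are
 * assumptions for illustration, not part of the client library.
 */
function updateSchedulerWithFreshEtag(
    object $controller,
    string $schedulerId,
    object $requestBody,
    int $maxAttempts = 3,
): object {
    for ($attempt = 1; ; $attempt++) {
        // Read the current ETag right before writing.
        $etag = $controller->getScenarioScheduler($schedulerId)->getEtag();

        try {
            return $controller->updateScenarioScheduler($schedulerId, $etag, $requestBody);
        } catch (PreconditionFailedException $e) {
            // The scheduler changed between the read and the write.
            if ($attempt >= $maxAttempts) {
                throw $e;
            }
        }
    }
}
```

Retrying with a fresh ETag overwrites whatever the concurrent change was, so this approach only fits cases where the last writer is allowed to win. Otherwise, let the exception propagate and merge the changes by hand.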

Validate scenario scheduler

/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody $requestBody
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ValidateScenarioSchedulerResponse
 */

The scenario config can be passed either as a plain array or as prepared value objects. Both options are valid.

Using a plain array:

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;

$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->validateScenarioScheduler($requestBody);

Using value objects:

use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->validateScenarioScheduler($requestBody);

Activate/deactivate scenario scheduler

/**
 * @param string $scenarioSchedulerId
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
# to activate the scenario scheduler:
$response = $controller->activateScenarioScheduler('<id>');

# to deactivate the scenario scheduler:
$response = $controller->deactivateScenarioScheduler('<id>');

Delete scenario scheduler

/**
 * @param string $scenarioSchedulerId
 * 
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Common\NoContentResponse
 * 
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->deleteScenarioScheduler('<id>');

License

The package is distributed under the MIT License. See LICENSE for more information.