Jobs API

The Jobs API is used to start new jobs for a given agent_id, get job results, download results in CSV format, and so on.

Start a job

This API starts a new asynchronous job for the agent_id given in the request body.

Endpoint:

Method: POST
URL: https://api.agenty.com/v1/jobs/{{AGENT_TYPE}}/async

Headers:

Key Value Description
Content-Type application/json

Query params:

Key Value Description
apikey {{API_KEY}} Your api key

Body:

{"agent_id":"{{AGENT_ID}}"}

Responses:

Status: OK | Code: 200

{
    "status_code": 200,
    "message": "A new scraping job 12999 submitted successfully",
    "job_id": 12999
}
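
The request above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names are hypothetical, and the value passed for {{AGENT_TYPE}} (for example "scraping", which appears as a job type in the sample responses) is an assumption.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.agenty.com/v1"


def build_start_job_request(agent_type, agent_id, api_key):
    """Build (but do not send) the POST request that starts an async job."""
    query = urllib.parse.urlencode({"apikey": api_key})
    path_type = urllib.parse.quote(agent_type, safe="")
    url = f"{BASE_URL}/jobs/{path_type}/async?{query}"
    body = json.dumps({"agent_id": agent_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_job(agent_type, agent_id, api_key, timeout=30):
    """Send the request and return the job_id field of the JSON response."""
    request = build_start_job_request(agent_type, agent_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["job_id"]
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without touching the network.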

Stop a running job

This API sends a stop request to the Agenty workers running the given job id in the background.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs/{{JOB_ID}}/stop

Headers:

Key Value Description
Content-Type application/json

Query params:

Key Value Description
apikey {{API_KEY}} Your api key

Responses:

Status: OK | Code: 200

{
    "status_code": 200,
    "message": "Stop request submitted successfully for job id {{JOB_ID}}"
}
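
A stop call might look like the following sketch. The helper names are hypothetical, and treating the job id as an integer is an assumption based on the sample responses.

```python
import json
import urllib.parse
import urllib.request


def stop_job_url(job_id, api_key):
    """Return the URL of the stop endpoint for one job."""
    query = urllib.parse.urlencode({"apikey": api_key})
    # int() is an assumption: job ids are numeric in the sample responses.
    return f"https://api.agenty.com/v1/jobs/{int(job_id)}/stop?{query}"


def stop_job(job_id, api_key, timeout=30):
    """Send the stop request and return the decoded JSON response."""
    url = stop_job_url(job_id, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```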

Get job status by job id

Gets the job status and other properties associated with the job, e.g. pages_credit, pages_processed.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs/{{JOB_ID}}

Query params:

Key Value Description
apikey {{API_KEY}} Your api key

Responses:

Status: OK | Code: 200

{
    "job_id": 12996,
    "agent_id": "dg4ely7e9r",
    "type": "scraping",
    "status": "completed",
    "pages_total": 1,
    "pages_processed": 1,
    "pages_successed": 1,
    "pages_failed": 0,
    "pages_credit": 1,
    "created_at": "2019-03-08T02:24:07",
    "started_at": "2019-03-08T02:24:12",
    "completed_at": null,
    "stopped_at": null,
    "is_scheduled": false,
    "error": null
}
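
Because jobs run asynchronously, a client will typically poll this endpoint until the job finishes. The sketch below shows one way to do that. The `fetch_status` callable is a hypothetical function that returns the decoded JSON for a job id. Only the "completed" status appears in the sample above; "stopped" and "failed" are assumed names for other final states and may need adjusting.

```python
import time

# "completed" comes from the sample response; "stopped" and "failed" are
# assumptions about other terminal states and may differ in practice.
TERMINAL_STATUSES = {"completed", "stopped", "failed"}


def wait_for_job(fetch_status, job_id, interval=5.0, max_attempts=60,
                 sleep=time.sleep):
    """Poll fetch_status(job_id) until the job reaches a terminal status.

    fetch_status is any callable returning the job as a dict.
    Raises TimeoutError if the job is still running after max_attempts checks.
    """
    for attempt in range(max_attempts):
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next check
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} checks")
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any HTTP library and easy to exercise without network access.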

Get job result by job id

This API fetches the job result for the given job id.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs/{{JOB_ID}}/result

Query params:

Key Value Description
apikey {{API_KEY}} Your api key
offset 0 Number of rows to skip, used to show the next page. Must be an integer. Use this to paginate when there are more than 2500 rows
limit 2500 Maximum number of rows to return per page. Must be an integer between 1 and 2500
collection 1 The collection number you want to fetch. Default is 1
modified 1 Whether to fetch the modified result when a post-processing script is used. Default is 1, which fetches the modified version when available and the default result otherwise. Use 0 to force Agenty to fetch the default result only

Responses:

Status: OK | Code: 200

{
    "total": 20,
    "limit": 2500,
    "offset": 0,
    "returned": 20,
    "result": [
        {
            "name": "A Light in the ...",
            "price": "£51.77",
            "image": "http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg",
            "details_page_url": "http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"
        },
        {
            "name": "Tipping the Velvet",
            "price": "£53.74",
            "image": "http://books.toscrape.com/media/cache/26/0c/260c6ae16bce31c8f8c95daddd9f4a1c.jpg",
            "details_page_url": "http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html"
        },
		....
    ]
}
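
The offset and limit parameters can be combined into a simple pagination loop. The sketch below is one possible approach, not an official client: all function names are hypothetical, and it assumes the "total" field reports the full row count across pages, as the sample response suggests.

```python
import json
import urllib.parse
import urllib.request

MAX_LIMIT = 2500  # upper bound for "limit" stated in the parameter table


def result_url(job_id, api_key, offset=0, limit=MAX_LIMIT, collection=1, modified=1):
    """Return the URL for one page of job results."""
    query = urllib.parse.urlencode({
        "apikey": api_key,
        "offset": offset,
        "limit": limit,
        "collection": collection,
        "modified": modified,
    })
    return f"https://api.agenty.com/v1/jobs/{job_id}/result?{query}"


def fetch_result_page(job_id, api_key, offset, limit):
    """Download and decode one page of results."""
    url = result_url(job_id, api_key, offset, limit)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_result_rows(fetch_page, limit=MAX_LIMIT):
    """Yield every row, calling fetch_page(offset, limit) once per page."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        rows = page.get("result", [])
        yield from rows
        offset += len(rows)
        # Stop on an empty page or once all reported rows have been seen.
        if not rows or offset >= page.get("total", 0):
            break
```

A caller could then write `for row in iter_result_rows(lambda o, l: fetch_result_page(job_id, api_key, o, l)): ...` to walk through all rows of a job.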

Get job logs by job id

This API fetches the job logs for the given job id.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs/{{JOB_ID}}/logs

Query params:

Key Value Description
apikey {{API_KEY}} Your api key
offset 0 Number of rows to skip, used to show the next page. Must be an integer. Use this to paginate when there are more than 2500 rows
limit 2500 Maximum number of rows to return per page. Must be an integer between 1 and 2500

Responses:

Status: OK | Code: 200

{
    "total": 23,
    "limit": 2500,
    "offset": 0,
    "returned": 23,
    "result": [
        {
            "date": "2019-03-08 07:47:16.8152",
            "batch": "0",
            "level": "INFO",
            "message": "Starting job id: 12995"
        },
        {
            "date": "2019-03-08 07:47:17.5437",
            "batch": "1",
            "level": "TRACE",
            "message": "Running 1 of 1"
        },
        {
            "date": "2019-03-08 07:47:18.4069",
            "batch": "1",
            "level": "TRACE",
            "message": "http://books.toscrape.com/"
        },
        {
            "date": "2019-03-08 07:47:19.3961",
            "batch": "1",
            "level": "TRACE",
            "message": "StatusCode: 200"
        },
        {
            "date": "2019-03-08 07:47:20.2936",
            "batch": "1",
            "level": "TRACE",
            "message": "h3 a extracted 20 match(s) for field NAME of type CSS"
        },
		.......
    ]
}
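
Once a logs response has been decoded, its entries can be filtered by level and their timestamps parsed. The following sketch assumes the date format shown in the sample above. The function name is hypothetical, and because the time zone of the timestamps is not stated, the parsed values are naive datetimes.

```python
from datetime import datetime

LOG_DATE_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # matches "2019-03-08 07:47:16.8152"


def parse_log_entries(payload, level=None):
    """Return log entries with "date" parsed, optionally filtered by level."""
    entries = []
    for entry in payload.get("result", []):
        if level is not None and entry["level"] != level:
            continue
        parsed = dict(entry)  # copy so the original payload is not modified
        parsed["date"] = datetime.strptime(entry["date"], LOG_DATE_FORMAT)
        entries.append(parsed)
    return entries
```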

Export job result by job id

This API creates a link for downloading the job result or logs in CSV format.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs/{{JOB_ID}}/export

Query params:

Key Value Description
apikey {{API_KEY}} Your api key
type result The type of file to export. Must be result or logs
collection 1 The collection number you want to export. Must be 1 or greater. Default is 1
modified 1 Whether to export the modified result when a post-processing script is used. Default is 1, which exports the modified version when available. Use 0 to download the default result
filename output Custom name for the downloaded file. Default is export.csv

Responses:

Status: OK | Code: 200

{
    "downloadlink": "https://server1.agenty.com/Job_12995/output1.csv?signature=sdlfjasoywerxvjsaldfkjpwqeroiiu9123e7"
}
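
Exporting is a two-step process: request the download link, then fetch the file it points to. A minimal sketch, with hypothetical helper names and using only the parameters listed in the table above:

```python
import json
import urllib.parse
import urllib.request


def export_url(job_id, api_key, export_type="result", collection=1,
               modified=1, filename=None):
    """Return the URL that asks for a CSV download link."""
    if export_type not in ("result", "logs"):
        raise ValueError("export_type must be 'result' or 'logs'")
    params = {
        "apikey": api_key,
        "type": export_type,
        "collection": collection,
        "modified": modified,
    }
    if filename:
        params["filename"] = filename
    query = urllib.parse.urlencode(params)
    return f"https://api.agenty.com/v1/jobs/{job_id}/export?{query}"


def download_export(job_id, api_key, destination, **options):
    """Request the download link, then save the CSV file to destination."""
    with urllib.request.urlopen(export_url(job_id, api_key, **options),
                                timeout=30) as response:
        link = json.load(response)["downloadlink"]
    urllib.request.urlretrieve(link, destination)
    return destination
```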

Get all jobs

Gets all jobs for all agents under an account.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs

Query params:

Key Value Description
apikey {{API_KEY}} Your api key

Responses:

Status: OK | Code: 200

{
    "total": 5,
    "limit": 1000,
    "offset": 0,
    "returned": 5,
    "result": [
        {
            "job_id": 12997,
            "agent_id": "rzpger0e2e",
            "type": "ocr",
            "status": "completed",
            "pages_total": 2,
            "pages_processed": 2,
            "pages_successed": 2,
            "pages_failed": 0,
            "pages_credit": 2,
            "created_at": "2019-03-08T02:28:56",
            "started_at": "2019-03-08T02:28:58",
            "completed_at": "2019-03-08T02:28:56",
            "stopped_at": null,
            "is_scheduled": false,
            "error": null
        },
        {
            "job_id": 12996,
            "agent_id": "dg4ely7e9r",
            "type": "scraping",
            "status": "completed",
            "pages_total": 1,
            "pages_processed": 1,
            "pages_successed": 1,
            "pages_failed": 0,
            "pages_credit": 1,
            "created_at": "2019-03-08T02:24:07",
            "started_at": "2019-03-08T02:24:12",
            "completed_at": "2019-03-08T02:24:00",
            "stopped_at": null,
            "is_scheduled": false,
            "error": null
        },
		.....
    ]
}

Get jobs by agent id

Gets all historical jobs for the given agent id.

Endpoint:

Method: GET
URL: https://api.agenty.com/v1/jobs

Query params:

Key Value Description
agent_id {{AGENT_ID}} Your agent id
apikey {{API_KEY}} Your api key

Responses:

Status: OK | Code: 200

{
    "total": 3,
    "limit": 1000,
    "offset": 0,
    "returned": 3,
    "result": [
        {
            "job_id": 12995,
            "agent_id": "7229mv10op",
            "type": "scraping",
            "status": "completed",
            "pages_total": 1,
            "pages_processed": 1,
            "pages_successed": 1,
            "pages_failed": 0,
            "pages_credit": 1,
            "created_at": "2019-03-08T02:17:29",
            "started_at": "2019-03-08T02:17:32",
            "completed_at": "2019-03-08T02:17:22",
            "stopped_at": null,
            "is_scheduled": false,
            "error": null
        },
        {
            "job_id": 12994,
            "agent_id": "7229mv10op",
            "type": "scraping",
            "status": "completed",
            "pages_total": 1,
            "pages_processed": 1,
            "pages_successed": 1,
            "pages_failed": 0,
            "pages_credit": 1,
            "created_at": "2019-03-08T02:16:37",
            "started_at": "2019-03-08T02:16:40",
            "completed_at": "2019-03-08T02:16:27",
            "stopped_at": null,
            "is_scheduled": false,
            "error": null
        },
        .....
    ]
}
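
Listing all jobs and listing jobs for one agent use the same URL and differ only in the agent_id query parameter. The sketch below builds either URL and totals pages_credit per agent from a decoded response. The function names are hypothetical.

```python
import urllib.parse


def jobs_url(api_key, agent_id=None):
    """Return the jobs listing URL, optionally restricted to one agent."""
    params = {}
    if agent_id is not None:
        params["agent_id"] = agent_id
    params["apikey"] = api_key
    return "https://api.agenty.com/v1/jobs?" + urllib.parse.urlencode(params)


def summarize_jobs(payload):
    """Count jobs and sum pages_credit per agent_id in a listing response."""
    summary = {}
    for job in payload.get("result", []):
        entry = summary.setdefault(job["agent_id"], {"jobs": 0, "pages_credit": 0})
        entry["jobs"] += 1
        entry["pages_credit"] += job["pages_credit"]
    return summary
```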