Amazon S3 Integration

Agenty’s S3 integration lets you upload your agent’s result CSV file to an S3 bucket in the AWS region you select.

Amazon S3 (Amazon Simple Storage Service) is a scalable, high-speed, web-based cloud storage service from AWS, designed for online backup and archiving of data sets of any size at low cost and with high durability.

This tutorial explains how to use the Agenty S3 plugin to transfer your agent’s result CSV file to your S3 bucket on AWS, either for backup or to move your Agenty data into the cloud infrastructure where your other projects or servers are running.

Create an S3 Bucket

  1. Sign in to your AWS console account and find the S3 service.

AWS Dashboard

  2. Click S3 to open the S3 dashboard.

S3 Dashboard

  3. Click Create bucket.

  4. Give your bucket a name and select the region where you want your data to be physically stored. Then click Next through the remaining wizard steps and confirm. Your bucket will be ready in a few minutes.

Create S3 Bucket
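
The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 AWS SDK for Python is installed and that AWS credentials allowed to create buckets are already configured on your machine. The helper names create_bucket_kwargs and create_bucket are hypothetical, and the bucket name agenty-s3-data is simply the one used in the policy example in this tutorial.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the keyword arguments for boto3's S3 create_bucket call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be sent as a
    # LocationConstraint; every other region needs it spelled out.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region):
    """Create the bucket in the given region (assumes boto3 is installed)."""
    import boto3  # imported lazily so the helper above works without the SDK

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(bucket_name, region))


# Example call (needs real AWS credentials, so it is left commented out):
# create_bucket("agenty-s3-data", "us-east-1")
```

Bucket names are globally unique across AWS, so the call fails if the name is already taken by another account.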

IAM Access

Once the bucket has been created, we’ll use the IAM (Identity and Access Management) feature of AWS to create credentials that give Agenty limited access, so that it can connect and upload data to this bucket only.

IAM feature on AWS

Note: Make sure programmatic access is selected as the access type, because Agenty connects programmatically using the access key ID and secret access key.

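Create a new policy with the permissions below, or use the AWS managed AmazonS3FullAccess policy. In the policy below, replace agenty-s3-data with the name of your own bucket: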

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AgentyAccessInBucket",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketCORS",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:DeleteObject",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::agenty-s3-data/*",
                "arn:aws:s3:::agenty-s3-data"                
            ]
        }
    ]
}
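
Because the bucket name appears in both Resource entries, it is easy to update one and forget the other. The sketch below generates the same policy document for any bucket name using only the Python standard library. The function name build_agenty_policy and the bucket name my-results-bucket are hypothetical, chosen for illustration.

```python
import json


def build_agenty_policy(bucket_name):
    """Return the bucket-scoped policy from this tutorial as a dict."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AgentyAccessInBucket",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketCORS",
                    "s3:AbortMultipartUpload",
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:DeleteObject",
                    "s3:GetObjectVersion",
                ],
                # Object-level actions apply to "<bucket>/*",
                # bucket-level actions (such as ListBucket) to the bucket itself.
                "Resource": [f"{bucket_arn}/*", bucket_arn],
            }
        ],
    }


# Print JSON that can be pasted into the IAM policy editor.
print(json.dumps(build_agenty_policy("my-results-bucket"), indent=4))
```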

Finish the add-user wizard. The final screen shows the access key ID and secret access key. The Agenty plugin uses this key pair to authenticate and to transfer your agent results to your S3 bucket automatically.

Add user in IAM

Configure S3 Plugin

Configure S3 plugin in Agenty

  • The S3 plugin form will open. In step 1, select the agent you want to attach this plugin to.

Select Agenty agent in plugin

  • Then, in step 2, enter the details required to configure the S3 plugin: access key ID, secret access key, region, bucket name, etc.

Enter s3 credentials

  • Click on the Save button to attach this plugin.

Dynamic File Names

Agenty can automatically give your result file a dynamic name when uploading it to the S3 bucket. You may use these 8 dynamic variables in the S3 Path parameter to generate the file name at run time:

  • {{agent_id}}
  • {{job_id}}
  • {{MMddyyyy}}
  • {{yyyyMMdd}}
  • {{yyyy-MM-dd}}
  • {{yyyy}}
  • {{MM}}
  • {{dd}}

You can use a single dynamic variable or a combination of several to build a file name of your choice. Dynamic variables such as agent_id, job_id, and the date help distinguish the files uploaded by different runs. For example, if a data scraping job was started on 2nd of June 2019, the following dynamic names will be converted as shown:

  • Agenty/{{MMddyyyy}}/result.csv will be converted into Agenty/06022019/result.csv
  • Agenty/job_{{job_id}}_output.csv will be converted into Agenty/job_40942_output.csv
  • {{yyyy}}/{{MM}}/{{dd}}.csv will be converted into 2019/06/02.csv

Note: Dynamic variable names must be enclosed in double curly braces. For example, {{name_of_variable}}
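
To preview what a given S3 Path will produce before saving the plugin, the substitution can be approximated locally. This is only a sketch of the behavior described above, not Agenty’s actual implementation: the function name render_s3_path and the agent ID value are hypothetical, and it assumes that unknown or single-brace tokens are left unchanged.

```python
import re
from datetime import date


def render_s3_path(template, agent_id, job_id, run_date):
    """Expand the eight documented {{variables}} in an S3 Path template."""
    values = {
        "agent_id": str(agent_id),
        "job_id": str(job_id),
        "MMddyyyy": run_date.strftime("%m%d%Y"),
        "yyyyMMdd": run_date.strftime("%Y%m%d"),
        "yyyy-MM-dd": run_date.strftime("%Y-%m-%d"),
        "yyyy": run_date.strftime("%Y"),
        "MM": run_date.strftime("%m"),
        "dd": run_date.strftime("%d"),
    }
    # Replace only known names wrapped in double curly braces;
    # anything else (for example a single-brace {MM}) stays as written.
    return re.sub(
        r"\{\{([^{}]+)\}\}",
        lambda match: values.get(match.group(1), match.group(0)),
        template,
    )


run_date = date(2019, 6, 2)
print(render_s3_path("Agenty/{{MMddyyyy}}/result.csv", "my_agent", 40942, run_date))
print(render_s3_path("Agenty/job_{{job_id}}_output.csv", "my_agent", 40942, run_date))
print(render_s3_path("{{yyyy}}/{{MM}}/{{dd}}.csv", "my_agent", 40942, run_date))
```

For the job started on 2nd of June 2019, the three calls print Agenty/06022019/result.csv, Agenty/job_40942_output.csv, and 2019/06/02.csv.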

Try it

We have now attached the S3 plugin to a web scraping agent, which will scrape data from a website and upload the CSV result file to our S3 bucket in the US East (N. Virginia) region. Start the scraper by clicking the Start button or by using the API.

Scrape data from website

Once the job has completed, check your S3 bucket. You’ll find that Agenty has uploaded your agent’s job result file to your S3 bucket, using the settings you selected in the S3 plugin configuration:

CSV file uploaded to S3
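
The same check can be done from a script instead of the console. The sketch below lists the object keys under a prefix; it assumes boto3 is installed and that your own AWS credentials can list the bucket. The function name list_result_keys is hypothetical, and the bucket name, region, and Agenty/ prefix are the example values used in this tutorial.

```python
def list_result_keys(s3_client, bucket, prefix=""):
    """Return every object key in the bucket that starts with the prefix."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent from the response when nothing matches.
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        if not response.get("IsTruncated"):
            return keys
        # Results are paged; continue from where the last page stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]


# Example call (needs real AWS credentials, so it is left commented out):
# import boto3
# s3 = boto3.client("s3", region_name="us-east-1")
# print(list_result_keys(s3, "agenty-s3-data", "Agenty/"))
```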