Agenty scraping agents are an easy-to-use, powerful suite of website scraping tools. With an Agenty scraping agent, you can:
- Create and host scraping agents online
- Enter URLs manually, or use the advanced list option to create and manage a list of thousands of URLs for batch crawling
- Start, schedule and paginate your scraping agent to extract data automatically
- Scrape data anonymously using our managed, distributed servers with thousands of proxies in the cloud
- Crawl password-protected websites easily
- Get email alerts when your extraction jobs complete, or configure a webhook to post the scraped data to your server automatically
- Use the REST API to start agents, schedule agents, get the data, change URLs and more
- And many more...
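To make the bulk URL list idea concrete, here is a minimal Python sketch of one way to prepare such a list before pasting it into an agent. It is not part of Agenty: the function name `build_url_list`, the example.com address and the `page` parameter are all assumptions chosen for illustration.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_url_list(base_url, param, values):
    """Return one URL per value, with `param=value` added to the query string."""
    scheme, netloc, path, query, fragment = urlsplit(base_url)
    urls = []
    seen = set()
    for value in values:
        extra = urlencode({param: value})
        full_query = f"{query}&{extra}" if query else extra
        url = urlunsplit((scheme, netloc, path, full_query, fragment))
        if url not in seen:  # skip duplicates so a page is not crawled twice
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical listing pages 1..1000 of an example store (assumed URL pattern).
urls = build_url_list("https://example.com/products?sort=name", "page", range(1, 1001))
print(len(urls))
print(urls[0])
print(urls[-1])
```

Generating the list from a pattern and removing duplicates up front keeps a batch crawl from requesting the same page twice.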
Creating an agent is quite easy. Add the Agenty Chrome extension, go to the web page you want to scrape, and launch the extension. It displays a panel on the right side, as in the screenshot below.
Once the extension panel is up and visible, click the New button to add a field and give the field a name; I named my first field ProductName. Then click the (asterisk) button to enable the point-and-click feature, which generates CSS selectors automatically when you click the HTML element you want to scrape. For example, I want to scrape the product names in this field, so I clicked on a product name, and the extension generated the selector and highlighted the other products in the list that match the same selector.
Sometimes other items are selected as well because they share the same CSS class or selector. You can click the yellow highlighted items to deselect them, or write the selector manually, which you can learn from here.
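The over-matching described above can be sketched in Python with only the standard library. This is an illustration of the general idea, not how the extension works internally: the sample HTML, the class names and the helper `select_text` are assumptions, and the tiny matcher only handles well-formed markup without unclosed void tags such as a bare `<br>`.

```python
from html.parser import HTMLParser

# Assumed sample page: the sidebar reuses the same "name" class as the product list.
HTML = """
<div class="sidebar"><span class="name">Featured: Red Mug</span></div>
<ul class="products">
  <li><span class="name">Red Mug</span></li>
  <li><span class="name">Blue Plate</span></li>
</ul>
"""


class ClassTextCollector(HTMLParser):
    """Collect text of elements with `target_class`, optionally only inside
    an ancestor element that has `ancestor_class`."""

    def __init__(self, target_class, ancestor_class=None):
        super().__init__()
        self.target_class = target_class
        self.ancestor_class = ancestor_class
        self.stack = []        # class lists of the currently open elements
        self.capturing = None  # stack depth of the element being captured
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        in_scope = self.ancestor_class is None or any(
            self.ancestor_class in open_classes for open_classes in self.stack
        )
        self.stack.append(classes)
        if self.capturing is None and in_scope and self.target_class in classes:
            self.capturing = len(self.stack)
            self.results.append("")

    def handle_endtag(self, tag):
        if self.capturing == len(self.stack):
            self.capturing = None  # the captured element just closed
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.capturing is not None:
            self.results[-1] += data


def select_text(html, target_class, ancestor_class=None):
    parser = ClassTextCollector(target_class, ancestor_class)
    parser.feed(html)
    return [text.strip() for text in parser.results]


broad = select_text(HTML, "name")                # comparable to `.name`
narrow = select_text(HTML, "name", "products")   # comparable to `.products .name`
print(broad)
print(narrow)
```

The broad selector also picks up the sidebar item, while scoping it to the product list returns only the two products. Deselecting highlighted items or writing a selector by hand serves the same purpose.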
The extension highlights the matching results and also shows a result preview. Once you are satisfied with the result and the number of records matches your expectation, click the Accept button to save that field in your agent.
Now follow the same process to add as many fields as you want for text, attribute or HTML items, to scrape anything from HTML pages. If you want to extract a link, image or any other attribute from an HTML tag, use the ATTR option from the Extract drop-down. It displays a new text box where you can enter the name of the attribute to extract instead of simple TEXT or HTML.
For example:
- Image scraping: For images, I want to extract the "src" value. After generating my selector, I selected the ATTR option and entered "src" in the corresponding text box, to tell the extractor that I need the value of src in the output instead of the entire HTML.
- Link scraping: To scrape URL links, write your selector, select the ATTR option, and enter "href" in the corresponding text box, to tell the extractor that you need the value of href in the output instead of the entire HTML or text.
- The ATTR (attribute) option is a powerful extractor feature and can be used to extract any attribute from an HTML tag.
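The difference between extracting an attribute and extracting text or HTML can be sketched in Python using the standard `html.parser` module. This is only an illustration of the concept under assumed sample markup. The class `AttrExtractor` and the function `extract_attr` are hypothetical names, and matching is by tag name rather than by a full CSS selector.

```python
from html.parser import HTMLParser


class AttrExtractor(HTMLParser):
    """Collect the value of one attribute from every element with a given tag."""

    def __init__(self, tag, attr):
        super().__init__()
        self.tag = tag
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            value = dict(attrs).get(self.attr)
            if value is not None:  # ignore elements that lack the attribute
                self.values.append(value)


def extract_attr(html, tag, attr):
    parser = AttrExtractor(tag, attr)
    parser.feed(html)
    return parser.values


# Assumed sample markup: two product links, each wrapping an image.
HTML = """
<a href="/products/red-mug"><img src="/img/red-mug.jpg" alt="Red Mug"></a>
<a href="/products/blue-plate"><img src="/img/blue-plate.jpg" alt="Blue Plate"></a>
"""

print(extract_attr(HTML, "img", "src"))   # image scraping: the src values
print(extract_attr(HTML, "a", "href"))    # link scraping: the href values
```

The same call with `"alt"` instead of `"src"` would return the image descriptions, which mirrors how entering a different attribute name in the ATTR text box changes the output.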
Once you are done setting up all the fields, click the Done button and the dialog box below will appear. Enter the API Id and Admin API Key in the text boxes under Send to Cloud Hosted App and click the Save button; the agent will be created in your online account. (If you don't have an API Id and key, you can get them by logging in and going to your account page in the hosted app online.)
The API Id and Key are stored in your Chrome local storage the first time you enter them, so they are remembered in future. If you want to change them later, just paste the new values; otherwise the stored values continue to be used.
Once the agent is created, you can click the link in the success message, which takes you to the agent page. There you can start, schedule and further configure the agent to manage and automate your data collection using the hosted scraping app online.