Agenty scraping agents are an easy-to-use and powerful suite of web scraping tools. With an Agenty scraping agent, you can:
- Create and host scraping agents online
- Enter URLs manually, or use the advanced list option to build a list of thousands of URLs that is easy to manage for batch crawling
- Start, schedule, and paginate your scraping agent to extract data automatically
- Scrape data anonymously using our managed, distributed servers with thousands of proxies in the cloud
- Crawl password-protected websites easily
- Get email alerts when your extraction jobs complete, or configure a webhook to post the scraped data to your server automatically
- Use the REST API to start agents, schedule agents, get the data, change URLs, and more
- And many more...
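As a rough illustration of the webhook option in the list above, the sketch below shows a minimal receiver built with Python's standard library. The payload shape (a JSON array of row objects), the port, and the names parse_records, WebhookHandler and run are assumptions made for this example; they are not taken from the Agenty documentation, so check the actual webhook format in your account before relying on it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_records(body: bytes) -> list:
    """Decode a webhook body, ASSUMED here to be a JSON array of row objects."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array of records")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint that accepts POSTed scrape results."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            records = parse_records(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON and bad encoding
            self.send_response(400)
            self.end_headers()
            return
        print(f"received {len(records)} records")
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()


def run(port: int = 8000) -> None:
    """Serve until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling run() starts the listener. Keeping the parsing in its own function makes it easy to adapt once you know the real payload format.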
Install Chrome App
Creating a scraping agent is easy: use the Chrome extension to set up the selectors and fields you want to scrape. Install the Chrome extension, go to the web page you want to scrape, and launch the extension. It displays a panel on the right side, as in the screenshot below.
Once the extension panel is visible, click the "New" button to add a field and give it a name; I named my first field "ProductName". Then click the "(asterisk)" button to enable the point-and-click feature, which generates a CSS selector automatically when you click the HTML element you want to scrape. For example, I want to scrape product names in this field, so I clicked a product name, and the extension generated the selector and highlighted the other products in the list that match the same selector.
Sometimes other items are selected as well, because they share the same CSS class or selector. You can click the yellow highlighted items to deselect them, or you can write your selector manually by learning from here.
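To see why a selector based only on a class can pick up unwanted items, here is a small sketch using Python's standard html.parser. It is not how the extension works internally. The sample HTML, the class names, and the helper select_text are made up for this example. It compares matching a class anywhere on the page with matching the same class only inside a given container, which is roughly what a more specific selector such as ".products .title" does.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the text of elements with target_class, optionally only
    when they sit inside an element carrying ancestor_class.
    Simplified: assumes well-nested HTML without void tags such as <img>."""

    def __init__(self, target_class, ancestor_class=None):
        super().__init__()
        self.target_class = target_class
        self.ancestor_class = ancestor_class
        self.stack = []            # class lists of the currently open elements
        self.capture_depth = None  # stack depth of the element being captured
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        inside = self.ancestor_class is None or any(
            self.ancestor_class in open_classes for open_classes in self.stack
        )
        self.stack.append(classes)
        if self.capture_depth is None and inside and self.target_class in classes:
            self.capture_depth = len(self.stack)
            self.results.append("")

    def handle_endtag(self, tag):
        if self.capture_depth == len(self.stack):
            self.capture_depth = None
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.capture_depth is not None:
            self.results[-1] += data


def select_text(html, target_class, ancestor_class=None):
    parser = ClassTextCollector(target_class, ancestor_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


SAMPLE = """
<div class="products">
  <h2 class="title">Red Mug</h2>
  <h2 class="title">Blue Mug</h2>
</div>
<div class="sidebar">
  <h2 class="title">Newsletter</h2>
</div>
"""

print(select_text(SAMPLE, "title"))              # ['Red Mug', 'Blue Mug', 'Newsletter']
print(select_text(SAMPLE, "title", "products"))  # ['Red Mug', 'Blue Mug']
```

The first call also returns the sidebar heading because it shares the class, which is the situation where you would deselect the extra item or tighten the selector.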
The extension highlights the matching results and also shows a result preview. Once you are satisfied with the result and the number of records matches your expectation, click the "Accept" button to save the field in your agent.
Now follow the same process to add as many fields as you want for text, attribute, or HTML items, to scrape anything from HTML pages. If you want to extract a link, image, or any other attribute from an HTML tag, use the "ATTR" option from the "Extract" drop-down, which displays a new text box where you can enter the name of the attribute to extract instead of plain TEXT or HTML.
For example:
- Image scraping: For images I want to extract the "src" value, so after generating my selector I selected the ATTR option and entered "src" in the corresponding text box. This tells the extractor that I need the value of src in the output instead of the entire HTML of the image.
- Link scraping: To scrape URL links, write your selector, select the ATTR option, and enter "href" in the corresponding text box. This tells the extractor that you need the value of href in the output instead of the entire HTML or text.
- The ATTR (attribute) option is a powerful extractor feature and can be used to extract any attribute from an HTML tag.
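The idea behind the ATTR option can be sketched in a few lines of Python using only the standard library. This illustrates attribute extraction in general, not the extension's implementation. The sample HTML and the helper extract_attr are hypothetical, and the sketch matches on tag name only rather than a full CSS selector.

```python
from html.parser import HTMLParser


class AttrExtractor(HTMLParser):
    """Collect the value of one attribute from every occurrence of one tag."""

    def __init__(self, tag, attr):
        super().__init__()
        self.tag = tag
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            value = dict(attrs).get(self.attr)
            if value is not None:
                self.values.append(value)


def extract_attr(html, tag, attr):
    parser = AttrExtractor(tag, attr)
    parser.feed(html)
    parser.close()
    return parser.values


SAMPLE = """
<a href="/products/red-mug"><img src="/img/red-mug.jpg" alt="Red Mug"></a>
<a href="/products/blue-mug"><img src="/img/blue-mug.jpg" alt="Blue Mug"></a>
"""

print(extract_attr(SAMPLE, "img", "src"))  # ['/img/red-mug.jpg', '/img/blue-mug.jpg']
print(extract_attr(SAMPLE, "a", "href"))   # ['/products/red-mug', '/products/blue-mug']
```

Changing the attribute name (for example to "alt") returns a different value from the same elements, which mirrors entering a different name in the ATTR text box.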
Once you have set up all the fields, click the "Done" button and the dialog box below will appear. Enter the API Id and Admin API Key in the text boxes under "Send to Cloud Hosted App" and click the "Save" button; the agent will be created in your online account. (If you don't have an API Id and Key, you can get them by logging in and going to your account page in the hosted app online.)
The API Id and Key are stored in your Chrome local storage the first time you enter them, so they are remembered for future use. If you want to change them later, just paste the new values; otherwise the stored values will continue to be used.
Once the agent is created, you can click the link in the success message. It takes you to the agent page, where you can start, schedule, and further configure the agent to manage and automate your data collection using the hosted scraping app online.
All scraping agent (*.scraping) files work on both the desktop app and the hosted app. If you also use the desktop app, you can upload or download the agent file from your cloud app account.