MongoDB Integration

Agenty’s MongoDB workflow imports your agent results into a MongoDB collection, hosted in the cloud or on-premise, once you provide the server address and credentials.

MongoDB is a document-oriented NoSQL database developed by MongoDB Inc. It can ingest a high volume of data without a predefined table schema, and it is flexible and scalable to deploy. MongoDB stores each object as a JSON-like document, encoded in a binary format called BSON, and uses the same document model for querying and indexing.

Because MongoDB is schema-less, we don’t need to create a table or predefine columns and their data types, as we would in SQL. So we can take any agent in our Agenty account (web scraping, OCR, text extractor, etc.), with any number of columns, and import its output into a MongoDB collection automatically using this workflow.

MongoDB Connection String

The first step is to get the MongoDB connection string, which the Agenty workflow uses to connect to the database and transfer data.

If MongoDB is installed on your own server or on-premise, build the connection string from the IP address and credentials in the following format:

mongodb://[username:password@]hostname[:port][/[database][?options]]

See detailed connection reference here - https://docs.mongodb.com/v3.2/reference/connection-string/
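For readers who also want to connect from their own code, the format above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function name `build_mongodb_uri`, the host address, the user name and the password are all hypothetical placeholders, not values from Agenty or MongoDB. It percent-encodes the credentials, which matters when you connect directly rather than through Agenty.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(host, username=None, password=None, port=None,
                      database=None, options=None):
    """Assemble mongodb://[username:password@]hostname[:port][/[database][?options]]."""
    uri = "mongodb://"
    if username is not None and password is not None:
        # Percent-encode credentials so characters such as '@', ':' or '/'
        # cannot be mistaken for URI delimiters.
        uri += f"{quote_plus(username)}:{quote_plus(password)}@"
    uri += host
    if port is not None:
        uri += f":{port}"
    if database or options:
        # The '/' separator is required before the options, even without a database.
        uri += "/" + (database or "")
    if options:
        uri += "?" + "&".join(f"{key}={value}" for key, value in options.items())
    return uri


example_uri = build_mongodb_uri(
    host="203.0.113.10",      # placeholder address (documentation range)
    username="agenty_user",   # hypothetical user name
    password="p@ss:word/1",   # hypothetical password with special characters
    port=27017,
    database="agenty",
)
print(example_uri)
# mongodb://agenty_user:p%40ss%3Aword%2F1@203.0.113.10:27017/agenty
```

Keeping each optional part behind its own check mirrors the square brackets in the format string, so the same helper covers both a bare host and a fully qualified URI.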

In this article, I will use MongoDB Atlas, which is the same product offered as a managed service by MongoDB Inc., so there is nothing to install on our own server. We can just sign up and choose one of the available cloud providers:

  1. Amazon Web Services (AWS)
  2. Google Cloud
  3. Microsoft Azure

Then launch a cluster and get started in a few minutes, without worrying about setup, installation, backups, security and so on.

Create a Cluster

  • Select the configuration option, region, provider and other settings. Because we are only testing the integration, I am going to skip these options and continue with the Free option.
  • I named the cluster agenty-test to make it easy to recognize and left everything else at the default.
  • Confirm the settings and build your cluster.
  • MongoDB will deploy your changes, and the cluster will be ready to use in a few minutes.

Create a Database

  • Click on Collections

  • Then, click on Add my own data

  • Give your database a name. I used agenty as the database name and books as the collection name
  • Click on the Create button to create the database and an empty collection for books
  • Once the database and collection have been created, the agenty database and its empty books collection appear in the list.

Get connection string

  • Go to the Clusters page
  • Click on the Connect button

  • The connection dialog box will appear. Click on Connect your application to get a connection string

  • Copy the connection string; it will be used in the Agenty MongoDB workflow in the next step.

Whitelist Agenty IP

Agenty’s IP address must be allowed so that inbound traffic from the Agenty server can reach your database. Add the Agenty IP address under network access (or in your firewall, if you run MongoDB on-premise):

  • Go to the Network Access page under Security
  • Click on Add IP Address
  • Enter the Agenty IP (US region): 34.238.118.11, or your dedicated server IP if you are running on a dedicated server or in another region. Contact support if you are not sure which IP address to use.

Configure MongoDB Workflow

  • Click on + Create a workflow, or use the create option in the top right corner

  • The MongoDB workflow configuration page will open

  • Select the agent whose data you want to transfer automatically every time a job completes. As an example, I am using the MongoDB example scraping agent, which scrapes books and their prices from an eCommerce website.

  • Enter the connection string, and the database name and collection name we created earlier

Note: If you are using MongoDB Atlas, your password is the same as your MongoDB website password. Also, there is no need to URL-encode the password, even if it contains special characters, as the MongoDB documentation would otherwise require: Agenty encodes it automatically.

  • Save this workflow configuration to attach it to the selected agent.

Try it out

We attached this workflow to the web scraping agent named MongoDB example, so:

  • Go to this scraping agent’s page in your Agenty account
  • Start the scraping agent job


  • Check the agent logs. The plugin runs on job completion, so you’ll find the plugin logs at the end:
2019-06-04 11:52:04.6813 INFO Starting plugin agent for batch number : 2  
2019-06-04 11:52:05.6655 TRACE Mongodb plugin started with timeout: 15 minutes  
2019-06-04 11:52:11.3073 TRACE 20 documents sent to MongoDB successfully  
2019-06-04 11:52:11.3073 TRACE Plugin task completed successfully. Duration: 00:00:05.6330886  
2019-06-04 11:52:11.3073 TRACE Pages credit: 5

Remember: Workflows consume pages credit based on the total seconds of execution. For example, this plugin task completed in about 5 seconds, so it used 5 pages credit.
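The arithmetic can be checked against the log itself. The sketch below parses the Duration field from the sample log line and estimates the credit; the function names are hypothetical, and the rounding rule (one credit per whole second, rounded down) is an assumption inferred from this single sample, where 5.63 seconds was billed as 5 credits. It is not a documented Agenty billing formula.

```python
import math
import re

# Sample line taken from the plugin log above.
LOG_LINE = ("2019-06-04 11:52:11.3073 TRACE Plugin task completed successfully. "
            "Duration: 00:00:05.6330886")


def duration_seconds(log_line):
    """Extract the hh:mm:ss.fffffff duration from a log line and return seconds."""
    match = re.search(r"Duration: (\d+):(\d+):(\d+(?:\.\d+)?)", log_line)
    if match is None:
        raise ValueError("no Duration field found")
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def estimated_pages_credit(log_line):
    # Assumption: one credit per whole second, rounded down. This matches the
    # sample log (5.63 s -> 5 credits) but is not a confirmed billing rule.
    return math.floor(duration_seconds(log_line))


print(estimated_pages_credit(LOG_LINE))  # 5
```

The duration is parsed by hand because its seven fractional digits exceed what `datetime.strptime` accepts for `%f`.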

  • Check your MongoDB database collection.
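Besides browsing the collection in the Atlas interface, the result can be checked from code. This is a minimal sketch that assumes the `pymongo` driver is installed and that the database and collection are named agenty and books as created earlier; the helper names `summarize_collection` and `check_books` are hypothetical, and the connection string is whatever you copied from your own cluster.

```python
def summarize_collection(collection):
    """Return the document count and one sample document from a collection."""
    return {
        "count": collection.count_documents({}),  # {} matches every document
        "sample": collection.find_one(),          # None if the collection is empty
    }


def check_books(connection_string):
    # Imported here so the helper above can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(connection_string)
    try:
        # Database and collection names follow the ones created earlier.
        return summarize_collection(client["agenty"]["books"])
    finally:
        client.close()
```

Calling `check_books` with your connection string should report a count of 20 for the run logged above, provided the collection was empty before the job ran. The machine running this code must also be allowed under Network Access, just like the Agenty IP.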
