What is OCR Agent?

Agenty’s OCR agent is a SaaS-based Optical Character Recognition (OCR) tool that extracts text in batches from image-based documents such as scanned images and PDFs.

Its high-quality recognition engine analyzes the light and dark patterns in an image to identify each alphabetical letter, numeric digit, or symbol. When a character is recognized, it is converted into an ASCII code. Some OCR systems also use circuit boards and computer chips designed expressly for OCR to speed up the recognition process.
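To make the idea of matching light and dark patterns concrete, here is a toy sketch. It is not Agenty's engine: the 3x5 glyph templates and the `recognize` helper are assumptions made up for illustration, and real OCR engines use far richer models. The sketch picks the template with the fewest differing pixels and then converts the recognized character to its ASCII code.

```python
# Toy illustration only; the templates and helper names are hypothetical.
# Each glyph is a 3x5 grid: '#' is a dark pixel, '.' is a light pixel.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def count_mismatches(glyph, template):
    """Count the pixels where the scanned glyph differs from the template."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character whose pattern is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: count_mismatches(glyph, TEMPLATES[ch]))


# A slightly noisy "7": one dark pixel is missing from the top row.
scanned = ["##.", "..#", ".#.", ".#.", ".#."]
char = recognize(scanned)
code = ord(char)  # ASCII code of the recognized character
print(char, code)  # prints: 7 55
```

Choosing the closest template rather than requiring an exact match is what lets the sketch tolerate a small amount of scan noise.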

The agent can therefore be used for the electronic conversion of images of typed, handwritten, or printed text into machine-encoded text, whether the source is a scanned document, a photo of a document, or text superimposed on an image.

But what is the OCR agent, and what is it used for? How can you extract text from thousands of images in bulk using the OCR engines integrated with the OCR Agent? This article covers server-based OCR technology with real use cases. Stick around and let's take a dive into the real-life uses of OCR solutions.

You may often have wished that your important business documents, whether on paper or in digital formats such as PDF, JPEG, or PNG, could be turned into a SQL/NoSQL database so that you could easily search, filter, and query them to find the relevant documents when needed. The OCR agent is an automation technology that converts image files such as PDF, TIFF, JPEG, and PNG into structured, searchable text that is easy to import into databases like SQL, MySQL, MongoDB, Oracle, DynamoDB, etc.
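As a minimal sketch of what "searchable text in a database" can look like, the example below loads a few OCR results into SQLite using Python's standard `sqlite3` module. The file names, the extracted text, and the `documents` table are hypothetical sample data, not output from the OCR agent; any of the databases named above could play the same role.

```python
import sqlite3

# Hypothetical OCR results: (source file, extracted text).
ocr_results = [
    ("invoice-001.png", "Invoice 001 Total due: 250.00 USD"),
    ("statement-jan.pdf", "Bank statement January closing balance 1,420.10"),
    ("card-smith.jpeg", "Jane Smith Sales Manager"),
]

conn = sqlite3.connect(":memory:")  # in-memory database, for the demo only
conn.execute("CREATE TABLE documents (source TEXT PRIMARY KEY, content TEXT)")
conn.executemany("INSERT INTO documents VALUES (?, ?)", ocr_results)
conn.commit()


def search(term):
    """Return the source files whose extracted text contains the term."""
    rows = conn.execute(
        "SELECT source FROM documents WHERE content LIKE ? ORDER BY source",
        (f"%{term}%",),
    )
    return [source for (source,) in rows]


print(search("invoice"))  # prints: ['invoice-001.png']
```

SQLite's `LIKE` is case-insensitive for ASCII text by default, which is why the lowercase term matches "Invoice".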

Create OCR Agent

  1. Log in to cloud.agenty.com and click on the Agents menu.
  2. Click the New Agent button in the top-right corner of the page to open the Agenty product page.
  3. Click the Get it button to copy the OCR Agent into your account.

OCR with Image URLs

  1. Go to the Input tab.
  2. Select Manual URLs as the input type.
  3. Enter the URLs of the images you want to extract text from.
  4. Click the Save button to save the input settings.
  5. Click the Start button to run the agent on the server.
  6. Finally, go to the Result tab to view and download the extracted text for each image and PDF.
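When the URL list is long, it may be worth cleaning it up before pasting it into the Manual URLs input. The sketch below is one possible approach, not part of Agenty: the `SUPPORTED` extension set is an assumption based on the formats named in this article, and `clean_urls` is a hypothetical helper that removes duplicates and anything that does not look like an image or PDF link.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed extension list, based on the formats mentioned in this article.
SUPPORTED = {".pdf", ".tif", ".tiff", ".jpg", ".jpeg", ".png"}


def clean_urls(urls):
    """Keep unique http(s) URLs whose path ends in a supported extension."""
    seen, kept = set(), []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        # Look at the path only, so query strings such as ?v=2 are ignored.
        suffix = PurePosixPath(parts.path).suffix.lower()
        if parts.scheme in ("http", "https") and suffix in SUPPORTED and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept


candidates = [
    "https://example.com/scans/receipt-1.PNG",
    "https://example.com/scans/receipt-1.PNG",    # duplicate
    "https://example.com/docs/contract.pdf?v=2",
    "https://example.com/about.html",             # not an image or PDF
    "ftp://example.com/scan.tiff",                # unsupported scheme
]
print("\n".join(clean_urls(candidates)))  # one URL per line, ready to paste
```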

OCR with File Uploads

If you have image or PDF files on your local hard drive, you can use the bucket feature to upload them to a bucket and then run the OCR agent on that bucket for batch text extraction.

  1. Go to the Buckets tab in the left panel.
  2. Click the New Bucket button to create a new bucket.
  3. Give your new bucket a name.
  4. Select North Virginia, USA (us-east-1) as the Region.
  5. Click the Add button.
  6. Click the Add Document button.
  7. Click Add Files to build the list of files.
  8. Click Start Upload to upload all the files.
  9. After the upload finishes, go back to the Agents tab in the left menu.
  10. Go to the Input tab.
  11. Select Select A Bucket as the Input Type.
  12. Select your newly created bucket from the Select A Bucket drop-down and start the OCR job.
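Before uploading a large folder, a quick local inventory can show what will go into the bucket. The sketch below is a local helper only and does not talk to Agenty; the `SUPPORTED` extension set is an assumption based on the formats named in this article, and `collect_documents` and `summarize` are hypothetical names.

```python
from collections import Counter
from pathlib import Path

# Assumed extension list, based on the formats mentioned in this article.
SUPPORTED = {".pdf", ".tif", ".tiff", ".jpg", ".jpeg", ".png"}


def collect_documents(folder):
    """Return supported files under folder (searched recursively), sorted by path."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


def summarize(files):
    """Count files per lowercase extension, e.g. {'.pdf': 3, '.png': 12}."""
    return dict(Counter(p.suffix.lower() for p in files))


if __name__ == "__main__":
    files = collect_documents(".")  # hypothetical choice: scan the current folder
    print(f"{len(files)} files ready to upload: {summarize(files)}")
```

Files with other extensions are skipped rather than reported as errors, so the script can be pointed at a mixed folder.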

Use Cases

The most common use case for the OCR agent is digitizing scanned paper documents into machine-readable text documents.

Many retail and eCommerce businesses face, as part of their daily routine, the uphill task of converting details of their physical products into digital form: scanning them and then decoding the scans into searchable text documents. At scale this process can be massively time-consuming, but the OCR agent's automation saves a lot of time and effort.

The OCR agent is also widely used as a source of information in other fields such as medicine, e-discovery, education, management, retail/eCommerce, legal, and banking and finance. There it processes an array of documents like invoices, bank statements, pictures, business cards, and print-outs, and delivers high value to businesses by digitizing and preserving their document holdings as searchable text so they can make the most of them.