Extract API

The Extract API automatically extracts structured data from a web page. Structured data such as schema.org, RDFa, Microdata, and JSON-LD markup can be extracted by passing the page URL to the /extract API.

GET

Send a GET request to the https://chrome.agenty.com/extract endpoint with the url query parameter (the page to extract from) and your apiKey to extract structured data.
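The request URL can be assembled programmatically so that the target page URL is percent-encoded and does not collide with the API's own query parameters. The sketch below is illustrative only: buildExtractUrl is a hypothetical helper name, YOUR_API_KEY is a placeholder, and it assumes the service decodes standard percent-encoded query values.

```javascript
// Hypothetical helper: builds the /extract request URL from an API key and a page URL.
function buildExtractUrl(apiKey, pageUrl) {
  const endpoint = new URL("https://chrome.agenty.com/extract");
  // searchParams.set() percent-encodes each value, so a target URL that
  // carries its own query string is not confused with apiKey/url.
  endpoint.searchParams.set("apiKey", apiKey);
  endpoint.searchParams.set("url", pageUrl);
  return endpoint.toString();
}

// Example with a placeholder key and a page URL that has its own query string.
console.log(buildExtractUrl("YOUR_API_KEY", "https://example.com/page?id=1&lang=en"));
```

Encoding matters mostly when the page URL contains characters such as & or #, which would otherwise be read as part of the /extract query string.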

For example, the following sample web page contains restaurant microdata in its HTML:

<!DOCTYPE html>
<html>
<head>
<title>Restaurant Schema test</title>		
</head>
<body>
<div itemscope itemtype="http://schema.org/Restaurant">
  <span itemprop="name">GreatFood</span>
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    <span itemprop="ratingValue">4</span> stars -
    based on <span itemprop="reviewCount">250</span> reviews
  </div>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">1901 Lemur Ave</span>
    <span itemprop="addressLocality">Sunnyvale</span>,
    <span itemprop="addressRegion">CA</span> <span itemprop="postalCode">94086</span>
  </div>
  <span itemprop="telephone">(408) 714-1489</span>
  <a itemprop="url" href="http://www.greatfood.com">www.greatfood.com</a>
  Hours:
  <meta itemprop="openingHours" content="Mo-Sa 11:00-14:30">Mon-Sat 11am - 2:30pm
  <meta itemprop="openingHours" content="Mo-Th 17:00-21:30">Mon-Thu 5pm - 9:30pm
  <meta itemprop="openingHours" content="Fr-Sa 17:00-22:00">Fri-Sat 5pm - 10:00pm
  Categories:
  <span itemprop="servesCuisine">
    Middle Eastern
  </span>,
  <span itemprop="servesCuisine">
    Mediterranean
  </span>
  Price Range: <span itemprop="priceRange">$$</span>
  Takes Reservations: Yes
</div>
</body>
</html>

Sending an HTTP GET request to the /extract API with this page's URL, from Postman or any programming language, returns the extracted structured data:

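// require() works with node-fetch v2; node-fetch v3 is ESM-only
const fetch = require('node-fetch');

// Replace {{API_KEY}} with your own API key
fetch("https://chrome.agenty.com/extract?apiKey={{API_KEY}}&url=https://agenty.github.io/Agenty.TestData/scraping/schema/Restaurant-schema.html", {
  method: 'GET'
})
  // res.json() returns a Promise, so resolve it before logging
  .then(res => res.json())
  .then(data => console.log(data))
  .catch(err => console.error(err));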
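A variant of the request above with basic error handling might look like the following sketch. It assumes the service signals failures (for example, an invalid key) with a non-2xx HTTP status, which is not confirmed here; extractStructuredData and its fetchImpl parameter are hypothetical names, and the default relies on the global fetch available in Node.js 18 or later.

```javascript
// Sketch only: fetchImpl is injectable so the function can be exercised
// with a stub instead of a real network call.
async function extractStructuredData(apiKey, pageUrl, fetchImpl = fetch) {
  // URLSearchParams percent-encodes both values.
  const query = new URLSearchParams({ apiKey, url: pageUrl });
  const res = await fetchImpl(`https://chrome.agenty.com/extract?${query}`);
  if (!res.ok) {
    // Assumption: errors are reported through the HTTP status code.
    throw new Error(`Extract request failed with HTTP ${res.status}`);
  }
  return res.json();
}

// Usage (performs a real request, so it is left commented out):
// extractStructuredData("YOUR_API_KEY", "https://agenty.github.io/Agenty.TestData/scraping/schema/Restaurant-schema.html")
//   .then(data => console.log(Object.keys(data)))
//   .catch(err => console.error(err.message));
```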

Sample response

{
    "metatags": {
        "openingHours": [
            "Mo-Sa 11:00-14:30",
            "Mo-Th 17:00-21:30",
            "Fr-Sa 17:00-22:00"
        ]
    },
    "microdata": {
        "Restaurant": [
            {
                "@context": "http://schema.org/",
                "@type": "Restaurant",
                "name": "GreatFood",
                "aggregateRating": {
                    "@context": "http://schema.org/",
                    "@type": "AggregateRating",
                    "ratingValue": "4",
                    "reviewCount": "250"
                },
                "address": {
                    "@context": "http://schema.org/",
                    "@type": "PostalAddress",
                    "streetAddress": "1901 Lemur Ave",
                    "addressLocality": "Sunnyvale",
                    "addressRegion": "CA",
                    "postalCode": "94086"
                },
                "telephone": "(408) 714-1489",
                "url": "http://www.greatfood.com",
                "openingHours": [
                    "Mo-Sa 11:00-14:30",
                    "Mo-Th 17:00-21:30",
                    "Fr-Sa 17:00-22:00"
                ],
                "servesCuisine": [
                    "Middle Eastern",
                    "Mediterranean"
                ],
                "priceRange": "$$"
            }
        ]
    },
    "rdfa": {},
    "jsonld": {}
}
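Once parsed, the response is an ordinary object keyed by format (metatags, microdata, rdfa, jsonld), with microdata items grouped by their schema.org type. As a rough illustration of reading it, the sketch below pulls a few fields from the first Restaurant item. summarizeRestaurant is a hypothetical helper, the sample object is a trimmed copy of the response above, and the code assumes other responses follow the same shape.

```javascript
// Hypothetical helper: flattens the first Restaurant item of an /extract response.
function summarizeRestaurant(response) {
  const items = (response.microdata && response.microdata.Restaurant) || [];
  if (items.length === 0) return null; // no Restaurant microdata on the page

  const item = items[0];
  const rating = item.aggregateRating || {};
  const address = item.address || {};
  return {
    name: item.name,
    // Values arrive as strings ("4", "250"), so convert them explicitly;
    // a missing value becomes NaN.
    ratingValue: Number(rating.ratingValue),
    reviewCount: Number(rating.reviewCount),
    city: address.addressLocality,
    // Normalize to an array whether one or several cuisines were found.
    cuisines: [].concat(item.servesCuisine || []),
  };
}

// Trimmed copy of the sample response shown above.
const sample = {
  microdata: {
    Restaurant: [
      {
        "@type": "Restaurant",
        name: "GreatFood",
        aggregateRating: { ratingValue: "4", reviewCount: "250" },
        address: { addressLocality: "Sunnyvale", addressRegion: "CA" },
        servesCuisine: ["Middle Eastern", "Mediterranean"],
      },
    ],
  },
  rdfa: {},
  jsonld: {},
};

console.log(summarizeRestaurant(sample));
```

Guarding each level with a fallback keeps the helper from throwing on pages that have no microdata or only partial markup.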