Introduction
AirOps Batches provides automated page metadata extraction with LLM enrichment. Submit URLs and receive structured data including page classification, author information, publication dates, and brand mentions. Key Features:- Automatic page type classification
- Author and date extraction
- Brand mention detection from your provided list
- Smart gap analysis to minimize processing time
Workflow Phases
The batch progresses through three distinct phases:Phase 1: Scraping
URLs are scraped and parsed to extract structured data.Phase 2: Analyzing
Gap analysis determines which fields need additional extraction. Items with complete data skip enrichment.Phase 3: Enriching
Items with missing fields are processed via LLM for additional extraction.Target Schema
The system extracts these fields for each URL:| Field | Type | Description |
|---|---|---|
page_type | string | Classification of the page content |
author | string | Author of the content (when available) |
date_published | string | Publication date (when available) |
date_modified | string | Last modification date (when available) |
brand_mentions | array | Brands from your list found on the page |
Page Types
Thepage_type field classifies pages into one of these categories:
View all page types
View all page types
homepage- Main landing page of a websiteproduct_page- Individual product with features/pricingcollection_page- Multiple products grouped togetherpricing_page- Dedicated pricing tiers pageinformational_article- Standard blog/informational contentdocumentation- Technical reference, API docslisticle_article- “Best of,” “Top X” ranked listscomparison_page- Side-by-side comparisonssupport_article- FAQ, troubleshooting, help contentreview_page- Product/service review with ratingforum_thread- Community discussion or Q&Asocial_media_post- Individual social postsocial_media_profile- LinkedIn/Twitter/Instagram profile pagevideo_page- YouTube, Vimeo, video contentnews_article- Timely news or press coveragecase_study- Customer success storymarketplace_listing- E-commerce product listinglanding_page- Campaign/conversion page (not homepage)deal_page- Discount, promo, affiliate dealjob_posting- Job listings and career pagesother- Uncategorized
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/batches-airops | Create new batch |
| GET | /v1/batches-airops/:batch_id | Get batch status |
| GET | /v1/batches-airops/:batch_id/items | Get all items with results |