Overview
Olostep’s Batches endpoint lets you submit a batch of up to 10,000 URLs and get back the content in 5–7 minutes. You can run up to 10 batches at a time to extract content from 100,000 URLs in one go. If you need more scale, please reach out to us.
This is useful if you already have the URLs you want to process, for example to aggregate data for analysis, build a specialized search tool, or monitor multiple websites for changes. In this guide, we’ll walk through how to start a batch with a list of URLs and retrieve the content in markdown format.
Gist with Full Code
Here’s all the code in one gist that you can copy and paste to try out batch scraping with Olostep: https://gist.github.com/olostep/e903f2e4fc28f8093b834b4df68b8031
The gist shows how to start a batch with 5 Google search queries, check its status, and retrieve the content for each item.
Prerequisites
Before getting started, ensure you have the following:
- A valid Olostep API key. You can get one by signing up at Olostep.
- Python installed on your system.
- The requests and hashlib libraries (hashlib is part of the Python standard library; install requests with pip install requests if needed).
Step 1: Create a Batch from Local URLs
If you already have a list of URLs you want to process, you can define them directly in your script. Otherwise, you can read them from a file or database.
Step 2: Monitor Batch Status
Once the batch is started, you can monitor its status using the batch_id that is returned when you start the batch.
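Starting a batch and then polling it, as described in Steps 1 and 2, might look like the sketch below. It is not the code from the gist. The base URL, the /batches paths, the "items", "custom_id", "id" and "status" field names, and the "completed" status value are all assumptions to check against the Olostep API reference. The helper names are hypothetical. The session argument is expected to behave like a requests.Session.

```python
import hashlib
import time

# Assumed base URL; confirm it against the Olostep API reference.
API_BASE = "https://api.olostep.com/v1"


def build_batch_payload(urls):
    """Pair each URL with a custom_id (here the MD5 hex digest of the URL)."""
    return {
        "items": [
            {"custom_id": hashlib.md5(url.encode("utf-8")).hexdigest(), "url": url}
            for url in urls
        ]
    }


def start_batch(session, api_key, urls):
    """Submit the URLs and return the batch_id (assumed to be the "id" field)."""
    response = session.post(
        f"{API_BASE}/batches",  # assumed path
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_batch_payload(urls),
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["id"]


def wait_for_batch(session, api_key, batch_id, poll_seconds=10, max_polls=60):
    """Poll until the status reads "completed" (assumed value); False on timeout."""
    for _ in range(max_polls):
        response = session.get(
            f"{API_BASE}/batches/{batch_id}",  # assumed path
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        response.raise_for_status()
        if response.json().get("status") == "completed":
            return True
        time.sleep(poll_seconds)
    return False
```

To try it, create a session with requests.Session(), call start_batch(session, api_key, urls), and pass the returned batch_id to wait_for_batch. Hashing the URL is only one way to get a stable custom_id; any unique string per item should serve the same purpose.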
Step 3: Retrieve Completed Items
Once the batch is marked complete, fetch the processed items. Each item includes a retrieve_id, which you can use to fetch the scraped content.
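A minimal sketch of this step follows. The /batches/{batch_id}/items path and the "items" and "custom_id" field names are assumptions; only retrieve_id is named in this guide. The helper names are hypothetical, and the sketch ignores any pagination the endpoint may apply to large batches. The session argument is expected to behave like a requests.Session.

```python
# Assumed base URL; confirm it against the Olostep API reference.
API_BASE = "https://api.olostep.com/v1"


def fetch_batch_items(session, api_key, batch_id):
    """Return the list of processed items for a completed batch."""
    response = session.get(
        f"{API_BASE}/batches/{batch_id}/items",  # assumed path
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("items", [])  # "items" is an assumed field name


def extract_retrieve_ids(items):
    """Map custom_id -> retrieve_id, skipping items that have no retrieve_id."""
    return {
        item["custom_id"]: item["retrieve_id"]
        for item in items
        if item.get("retrieve_id")
    }
```

Keying the result by custom_id makes it straightforward to match each retrieve_id back to the URL it was created for.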
Step 4: Retrieve the Content
Use the retrieve_id to get the extracted content in markdown, HTML, or JSON. Here is an example to retrieve the content in markdown format:
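A minimal sketch of such a call is shown here. The /retrieve path, the "retrieve_id" and "formats" query parameters, and the "markdown_content" response field are assumptions to verify against the Olostep API reference. The function name is hypothetical, and the session argument is expected to behave like a requests.Session.

```python
# Assumed base URL; confirm it against the Olostep API reference.
API_BASE = "https://api.olostep.com/v1"


def retrieve_markdown(session, api_key, retrieve_id):
    """Fetch the markdown for one item; returns None if the field is absent."""
    response = session.get(
        f"{API_BASE}/retrieve",  # assumed path
        headers={"Authorization": f"Bearer {api_key}"},
        # Assumed parameter names for selecting the item and the output format.
        params={"retrieve_id": retrieve_id, "formats": "markdown"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("markdown_content")  # assumed field name
```

Calling retrieve_markdown once per retrieve_id collects the markdown for the whole batch, for example into a dictionary or one file per item.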
Hosted Content
We also host the content for 7 days, so you can retrieve it multiple times without re-scraping, for example through a hosted URL for the markdown content.
Example Use Cases
1. Build Search Engines
Use Olostep to extract content from industry-specific websites (legal, medical, AI) and build a searchable database.
2. Website Monitoring
Monitor product availability, price changes, or news updates on multiple websites by scheduling daily batch scrapes.
3. Social Media Monitoring
Scrape mentions of your brand or keywords across forums or content sources and extract structured data.
4. Aggregators
Build a job board, news aggregator, or real estate listing platform by pulling data from dozens of sources.
Conclusion
With batch scraping, you can extract content from up to 100,000 URLs quickly and efficiently. Whether you’re building search tools, aggregators, or monitoring systems, Olostep Batches simplify the job.
Want to extract only structured data? Use Parsers to get just the fields you need.
Need help? Reach out to info@olostep.com for support, or to have us write custom scripts for your use case.