Overview
Olostep’s scrape endpoint lets you extract content from any website. Markdown output is useful when you want to feed a page to an LLM without all the HTML. In this guide we will see how to extract markdown from a website such as https://www.nea.com/team.
Prerequisites
Before getting started, ensure you have the following:
- A valid Olostep API key. You can get one by signing up at Olostep.
- Python installed on your system
- The requests and json libraries (json ships with Python; requests is a third-party package that you can install using pip install requests if needed)
Extracting Text from a Website
The following Python script demonstrates how to extract text and markdown content from a website using Olostep’s API.
Example Response
A successful response will look something like this:
Explanation
- url_to_scrape: specifies the website URL to extract content from.
- formats: defines the output formats (text in this case).
- Authorization: contains your API key to authenticate the request.
- The response is formatted as JSON and printed for readability.
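As a minimal sketch of the request these parameters describe, the script below builds the headers and JSON body and posts them with requests. The endpoint URL, the Bearer scheme in the Authorization header, the format names, and the OLOSTEP_API_KEY environment variable are assumptions made for illustration, not details confirmed by this guide; check them against Olostep's API reference before relying on them.

```python
import json
import os

# Assumed endpoint; confirm the exact URL in Olostep's API reference.
API_URL = "https://api.olostep.com/v1/scrapes"


def build_scrape_request(api_key, url, formats=("markdown", "text")):
    """Return the (headers, payload) pair for a scrape request."""
    headers = {
        # Assumes the API key is sent as a Bearer token.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "url_to_scrape": url,      # page to extract content from
        "formats": list(formats),  # assumed format names
    }
    return headers, payload


def scrape(api_key, url, formats=("markdown", "text")):
    """POST the scrape request and return the decoded JSON response."""
    # Imported here so build_scrape_request works without the dependency.
    import requests  # third-party: pip install requests

    headers, payload = build_scrape_request(api_key, url, formats)
    response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()


if __name__ == "__main__":
    # OLOSTEP_API_KEY is a hypothetical environment variable name.
    result = scrape(os.environ["OLOSTEP_API_KEY"], "https://www.nea.com/team")
    # Pretty-print the JSON response for readability.
    print(json.dumps(result, indent=2))
```

Reading the key from the environment keeps it out of the script, and separating request construction from the network call makes the headers and payload easy to inspect before anything is sent.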