The Olostep /v1/searches endpoint lets you search the web with a natural language query and get back a deduplicated list of relevant links with titles and descriptions.
- Send a query in plain English
- Get back structured links from across the web
- Optionally scrape every returned URL in one round-trip and embed markdown_content / html_content directly into the response
- Filter by domain, control the result count, and bound the scraping wallclock
The endpoint searches the web semantically for your query and returns matching results. For API details, see the Search Endpoint API Reference.
Pass scrape_options to scrape every returned URL in parallel and embed the rendered content directly on each link. This saves a round-trip per result vs. calling /v1/searches and /v1/scrapes separately.
```json
{
  "query": "What's going on with OpenAI's Sora shutting down?",
  "limit": 10,
  "scrape_options": {
    "formats": ["markdown"],
    "remove_css_selectors": "default",
    "timeout": 25
  }
}
```
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `formats` | string[] | `["markdown"]` | Output formats to attach to each link. For /v1/searches, only `"html"` and `"markdown"` are supported. Pass `["html", "markdown"]` to receive both. |
| `remove_css_selectors` | string | `"default"` | Forwarded to /v1/scrapes. `"default"` strips nav/footer/script/style/svg/dialog noise. Use `"none"` to disable, or pass a JSON-stringified array of selectors to remove. |
| `timeout` | integer | `25` | Wallclock budget in seconds for the entire scrape phase. Must be between 1 and 60. After this elapses, the search returns immediately — content fields will be null for any links that hadn't finished. |
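For example, a custom selector list is passed as a JSON-stringified array. A sketch of such a request (the query and selectors below are illustrative, not from the API docs):

```json
{
  "query": "latest Rust release notes",
  "scrape_options": {
    "formats": ["markdown"],
    "remove_css_selectors": "[\".cookie-banner\", \"#newsletter-signup\"]"
  }
}
```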
- All links are scraped in parallel. The timeout bounds the whole batch, not each individual link.
- Per-link scrape failures (network errors, individual page timeouts) leave that link's markdown_content / html_content as null while other links return normally.
- If the global timeout elapses before all scrapes finish, the search responds immediately with the links it has — already-completed scrapes keep their content; in-flight ones come back with null content.
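Because any individual link can come back with null content, callers should separate completed scrapes from missing ones before processing. A minimal sketch (the `split_links` helper is hypothetical, not part of the SDK):

```python
def split_links(links):
    """Partition search links into fully-scraped ones and URLs whose
    content came back null (failed, empty, or cut off by the timeout)."""
    done = [link for link in links if link.get("markdown_content") is not None]
    missing = [link["url"] for link in links if link.get("markdown_content") is None]
    return done, missing
```

The `missing` URLs can then be retried individually or simply skipped, depending on how complete the content needs to be.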
For reddit.com/.../comments/... URLs, the request is automatically routed through the @olostep/reddit-post parser, and the structured JSON is rendered into clean markdown plus basic HTML.
If the combined inline content exceeds 9MB, content fields are nulled, result.size_exceeded is set to true, and you can fetch the full payload from result.json_hosted_url.
```python
from olostep import Olostep

client = Olostep(api_key="YOUR_REAL_KEY")

search = client.searches.create(
    query="What's going on with OpenAI's Sora shutting down?",
    limit=5,
    scrape_options={"formats": ["markdown"], "timeout": 25},
)

for link in search.links:
    print(link["url"], "—", len(link.get("markdown_content") or ""), "chars")
```
You will receive a search object in response. The search object contains an id, your original query, credits_consumed, and a result with a list of links.
```json
{
  "id": "search_9bi0sbj9xa",
  "object": "search",
  "created": 1760327323,
  "metadata": {},
  "query": "What's going on with OpenAI's Sora shutting down?",
  "credits_consumed": 10,
  "result": {
    "json_content": "...",
    "json_hosted_url": "https://olostep-storage.s3.us-east-1.amazonaws.com/search_9bi0sbj9xa.json",
    "size_exceeded": false,
    "credits_consumed": 10,
    "links": [
      {
        "url": "https://www.bbc.com/news/articles/c3w3e467ewqo",
        "title": "OpenAI to shut down Sora video platform",
        "description": "OpenAI says it will discontinue its Sora app...",
        "markdown_content": "# OpenAI to shut down Sora video platform\n\nOpenAI says it will discontinue..."
      },
      {
        "url": "https://www.reddit.com/r/OutOfTheLoop/comments/1s2u847/whats_going_on_with_openais_sora_shutting_down/",
        "title": "What's going on with OpenAI's Sora shutting down?",
        "description": "Reddit thread discussing the shutdown.",
        "markdown_content": "# What's going on with OpenAI's Sora shutting down?\n\n*r/OutOfTheLoop · u/rm-minus-r · 1mo ago*\n\n..."
      }
    ]
  }
}
```
Each link in result.links contains:
| Field | Type | Description |
| --- | --- | --- |
| `url` | string | The URL of the search result. |
| `title` | string | The title of the result page. |
| `description` | string | A short snippet describing the result. |
| `markdown_content` | string | Markdown content of the page. Only present when `scrape_options.formats` includes `"markdown"`. `null` if the scrape failed, was empty, or hit the global timeout. |
| `html_content` | string | HTML content of the page. Only present when `scrape_options.formats` includes `"html"`. `null` on failure/timeout. |
The full result is also available as a hosted JSON file at result.json_hosted_url — useful when result.size_exceeded is true.
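A client can branch on result.size_exceeded to decide whether the inline links are usable or the hosted file should be fetched instead. A minimal sketch (the `load_full_result` helper is hypothetical; it assumes the hosted JSON mirrors the `result` object shown in the response example):

```python
import json
import urllib.request

def load_full_result(result):
    """Return the search's links, falling back to the hosted JSON
    when the inline content was nulled out for exceeding 9MB."""
    if not result.get("size_exceeded"):
        return result["links"]
    # Inline content was dropped; fetch the full payload instead.
    with urllib.request.urlopen(result["json_hosted_url"]) as resp:
        return json.load(resp)["links"]
```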
GET /v1/searches/{search_id} returns whatever was persisted at search time, including any scraped content. It’s a pure idempotent read — no re-scraping, no re-billing. Older searches without scrape_options simply have no per-link content fields.
```python
from olostep import Olostep

client = Olostep(api_key="YOUR_REAL_KEY")

search = client.searches.get(search_id="search_9bi0sbj9xa")
print(search.id, len(search.links))
```
Each search costs 5 credits for the search itself. When scrape_options is provided, each scraped page is billed at the standard /v1/scrapes rate (typically 1 credit per page; some parsers cost more). The total is returned in credits_consumed. Examples:
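As a rough illustration, assuming the typical 1-credit-per-page scrape rate (actual costs vary by parser), the expected total can be estimated as:

```python
def estimated_credits(num_scraped_pages, search_cost=5, per_page=1):
    # Illustrative only: 5 credits for the search plus the standard
    # per-page scrape rate; some parsers bill more than 1 credit.
    return search_cost + per_page * num_scraped_pages

estimated_credits(5)  # 5 + 5 = 10 credits for a 5-link scraped search
```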