PyPI Package: olostep | Requirements: Python 3.11+
Installation
Install the package from PyPI with `pip install olostep`.
Authentication
Get your API key from the Olostep Dashboard.
Quick Start
The SDK provides two client options depending on your use case:
Sync Client (`Olostep`)
Best for: Scripts and simple use cases where you prefer blocking operations.
The sync client provides a simpler, blocking interface that’s easier to get started with if you’re new to async/await.
Async Client (`AsyncOlostep`)
Best for: Production applications and handling many concurrent requests.
The async client provides non-blocking operations and is the recommended choice for production applications that need high throughput.
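To make the blocking versus non-blocking distinction concrete, the sketch below compares the two styles using only the standard library. `fake_scrape` is a hypothetical stand-in that sleeps instead of calling the Olostep API, so nothing here reflects the SDK's actual method signatures.

```python
import asyncio
import time


async def fake_scrape(url: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a network call; not an Olostep SDK method."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"scraped:{url}"


async def scrape_sequentially(urls: list[str]) -> list[str]:
    # Each call finishes before the next one starts, like a blocking client.
    return [await fake_scrape(url) for url in urls]


async def scrape_concurrently(urls: list[str]) -> list[str]:
    # All calls wait on the network at the same time; order is preserved.
    return list(await asyncio.gather(*(fake_scrape(url) for url in urls)))


URLS = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]


def timed(coro_factory) -> tuple[list[str], float]:
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_factory(URLS))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, seq_time = timed(scrape_sequentially)
    _, con_time = timed(scrape_concurrently)
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

With many URLs, the concurrent version's total time approaches that of the slowest single call, which is the throughput benefit the async client is aimed at.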
Sync Client (Olostep)
The sync client (Olostep) provides a blocking interface that’s perfect for scripts and simple use cases.
Basic Web Scraping
Batch Processing
Smart Web Crawling
Site Mapping
AI-Powered Answers
Async Client (AsyncOlostep)
The async client (AsyncOlostep) is the recommended client for high-performance applications, backend services, and workloads that handle many concurrent requests.
Basic Web Scraping
Batch Processing
Smart Web Crawling
Site Mapping
AI-Powered Answers
SDK Reference
Method Structure
Both SDK clients provide the same clean, Pythonic interface organized into logical namespaces:
| Namespace | Purpose | Key Methods |
|---|---|---|
| `scrapes` | Single URL extraction | `create()`, `get()` |
| `batches` | Multi-URL processing | `create()`, `info()`, `items()` |
| `crawls` | Website traversal | `create()`, `info()`, `pages()` |
| `maps` | Link extraction | `create()`, `urls()` |
| `answers` | AI-powered extraction | `create()`, `get()` |
| `retrieve` | Content retrieval | `get()` |
Error Handling
Catch all SDK errors using the base exception class.
Automatic Retries
The SDK automatically retries on transient errors (network issues, temporary server problems) based on the `RetryStrategy` configuration. You can customize the retry behavior by passing a `RetryStrategy` instance when creating the client.
Advanced Features
Smart Input Coercion
The SDK intelligently handles various input formats for maximum convenience.
Advanced Scraping Options
Batch Processing with Custom IDs
Intelligent Crawling
Site Mapping with Filters
Answers Retrieval
Content Retrieval
Logging
Enable logging to debug issues. Available log levels: INFO (recommended), DEBUG (verbose), WARNING, ERROR.
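A minimal sketch of turning on logging with Python's standard `logging` module. The logger name `"olostep"` is an assumption based on the package name; check the SDK for the name it actually uses.

```python
import logging

# Send log records to stderr with timestamps; INFO is the recommended level.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumption: the SDK logs under a logger named after the package.
sdk_logger = logging.getLogger("olostep")

# Switch only the SDK logger to DEBUG when verbose output is needed.
sdk_logger.setLevel(logging.DEBUG)
```

Raising the level on a single named logger keeps other libraries at INFO, so the verbose output stays limited to the SDK.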
Retry Strategy Configuration
The `RetryStrategy` class controls how the Olostep SDK handles transient API errors through automatic retries with exponential backoff and jitter. This helps ensure reliable operation in production environments where temporary network issues, rate limits, and server overload can cause intermittent failures.
Default Behavior
By default, the SDK uses the following retry configuration:
- Max retries: 5 attempts
- Initial delay: 2 seconds
- Backoff: Exponential (2^attempt)
- Jitter: 10-90% of delay (randomized)
With these defaults, the wait before each attempt is roughly:
- Attempt 1: Immediate
- Attempt 2: ~2.2-3.8s delay
- Attempt 3: ~4.4-7.6s delay
- Attempt 4: ~8.8-15.2s delay
- Attempt 5: ~17.6-30.4s delay
Custom Configuration
When Retries Happen
The SDK automatically retries on:
- Temporary server issues (`OlostepServerError_TemporaryIssue`)
- Timeout responses (`OlostepServerError_NoResultInResponse`)
Transport vs Caller Retries
The SDK has two retry layers:
- Transport layer: Handles network-level connection failures (DNS, timeouts, etc.)
- Caller layer: Handles API-level transient errors (controlled by `RetryStrategy`)
Calculating Max Duration
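As a rough guide, the worst-case time spent waiting between attempts can be estimated from the backoff parameters. The sketch below is plain arithmetic, not an SDK call. It assumes the additive jitter model described under Understanding Jitter and that the first attempt is immediate. The function and parameter names are illustrative.

```python
def worst_case_backoff_seconds(
    attempts: int = 5,
    initial_delay: float = 2.0,
    jitter_max: float = 0.9,
) -> float:
    """Upper bound on total time spent sleeping between attempts.

    Excludes the time each request itself takes.
    """
    total = 0.0
    # The first attempt is immediate, so there are attempts - 1 delays.
    for retry_index in range(attempts - 1):
        base = initial_delay * (2 ** retry_index)  # exponential backoff
        total += base * (1 + jitter_max)           # base plus maximum jitter
    return total


if __name__ == "__main__":
    # Defaults: (2 + 4 + 8 + 16) * 1.9
    print(round(worst_case_backoff_seconds(), 1))  # 57.0
```

Request time is not included, so the real upper bound also depends on the configured request timeout for each attempt.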
Configuration Examples
Here are some examples of how to configure the retry strategy for different use cases.
Conservative Strategy
Aggressive Strategy
No Retries (Fail Fast)
High-Throughput Strategy
Understanding Jitter
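Jitter adds randomization to prevent “thundering herd” problems when many clients retry simultaneously. The jitter is a random value between `jitter_min` and `jitter_max` times the base delay, added on top of the base delay. For example, with `initial_delay=2.0`, `jitter_min=0.1`, `jitter_max=0.9`: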
- Attempt 0: base=2.0s, jitter=0.2-1.8s, final=2.2-3.8s
- Attempt 1: base=4.0s, jitter=0.4-3.6s, final=4.4-7.6s
- Attempt 2: base=8.0s, jitter=0.8-7.2s, final=8.8-15.2s
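The figures above can be reproduced with a few lines of arithmetic. This is a sketch of the calculation as described in this section, not the SDK's own implementation, and the function names are illustrative.

```python
import random


def backoff_bounds(
    attempt: int,
    initial_delay: float = 2.0,
    jitter_min: float = 0.1,
    jitter_max: float = 0.9,
) -> tuple[float, float, float]:
    """Return (base, shortest final delay, longest final delay) in seconds."""
    base = initial_delay * (2 ** attempt)
    return base, base + base * jitter_min, base + base * jitter_max


def sample_delay(
    attempt: int,
    initial_delay: float = 2.0,
    jitter_min: float = 0.1,
    jitter_max: float = 0.9,
) -> float:
    """Draw one randomized delay: base plus jitter within the configured range."""
    base = initial_delay * (2 ** attempt)
    jitter = random.uniform(base * jitter_min, base * jitter_max)
    return base + jitter


if __name__ == "__main__":
    for attempt in range(3):
        base, low, high = backoff_bounds(attempt)
        print(f"Attempt {attempt}: base={base:.1f}s, final={low:.1f}-{high:.1f}s")
```

Because each client draws its own random jitter, retries that start at the same moment spread out over the interval instead of arriving together.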
Best Practices
For Production Applications
For Development/Testing
For Batch Operations
Monitoring and Debugging
The SDK logs retry information at the DEBUG level.
Error Handling
When all retries are exhausted, the original error is raised.
Performance Considerations
- Memory: Each retry attempt uses additional memory for request/response objects
- Time: Total operation time can be significantly longer with retries enabled
- API Limits: Retries count against your API usage limits
- Network: More network traffic due to retry attempts
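The retry behavior described in this section (exponential backoff with jitter, then re-raising the original error once attempts run out) can be sketched as a generic loop. `TransientError` and `call_with_retries` are hypothetical names for illustration. The SDK performs this internally based on `RetryStrategy`, and its actual implementation may differ.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable API error."""


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 5,
    initial_delay: float = 2.0,
    jitter_min: float = 0.1,
    jitter_max: float = 0.9,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                # Retries exhausted: let the original error propagate.
                raise
            base = initial_delay * (2 ** attempt)
            jitter = random.uniform(base * jitter_min, base * jitter_max)
            sleep(base + jitter)
    raise ValueError("max_attempts must be at least 1")
```

Every failed attempt costs an extra request plus a sleep, which is where the time, network, and API usage overhead listed above comes from.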
Detailed Error Handling
Exception Hierarchy
The Olostep SDK provides a comprehensive exception hierarchy for different failure scenarios. All exceptions inherit from `Olostep_BaseError`.
There are three main error types that directly inherit from Olostep_BaseError:
- `Olostep_APIConnectionError` - Network-level connection failures
- `OlostepServerError_BaseError` - Errors attributed to the API server, including HTTP 4xx and 5xx responses
- `OlostepClientError_BaseError` - Errors raised by the client SDK
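The top of the hierarchy can be pictured with the sketch below. These classes are local stand-ins that mirror the names listed above so the example runs without the SDK. In real code they would be imported from the package, and their actual definitions may carry extra attributes.

```python
class Olostep_BaseError(Exception):
    """Stand-in for the SDK's root exception."""


class Olostep_APIConnectionError(Olostep_BaseError):
    """Stand-in: network-level connection failures."""


class OlostepServerError_BaseError(Olostep_BaseError):
    """Stand-in: errors attributed to the API server."""


class OlostepClientError_BaseError(Olostep_BaseError):
    """Stand-in: errors raised by the client SDK."""


def caught_by_base(exc: Exception) -> bool:
    """Show that a single except clause on the root class catches every branch."""
    try:
        raise exc
    except Olostep_BaseError:
        return True
    except Exception:
        return False
```

Here `caught_by_base(Olostep_APIConnectionError())` returns `True`, while an unrelated exception such as `ValueError()` returns `False`.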
Why Connection Errors Are Separate
Olostep_APIConnectionError is separate from server errors because it represents network-level failures that occur before the API can process the request. These are transport layer issues (DNS or HTTP failures, timeouts, connection refused, etc.) rather than API-level errors. HTTP status codes (4xx, 5xx) are considered API responses and are categorized as server errors, even though they indicate problems.
Recommended Error Handling
For most use cases, catch the base error and print the error name. The error name (e.g. `OlostepServerError_AuthFailed`) is descriptive enough to understand the issue.
Granular Error Handling
If you need more specific error handling, catch the specific error types directly. Avoid using `OlostepServerError_BaseError` or `OlostepClientError_BaseError`: these base classes only indicate who raised the error (server vs client), not who’s responsible for fixing it. This is an implementation detail that doesn’t help with error handling logic.
Instead, catch specific error types that indicate the actual problem:
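A minimal sketch of that pattern, using local stand-in classes so it runs without the SDK. The class names come from this page, but the assumption that `OlostepServerError_AuthFailed` derives from `OlostepServerError_BaseError` is inferred from its name, and `run_with_handling` is a hypothetical helper.

```python
from typing import Callable


# Local stand-ins mirroring names from this page; import the real ones from the SDK.
class Olostep_BaseError(Exception): ...
class Olostep_APIConnectionError(Olostep_BaseError): ...
class OlostepServerError_BaseError(Olostep_BaseError): ...
class OlostepServerError_AuthFailed(OlostepServerError_BaseError): ...  # assumed parent


def run_with_handling(operation: Callable[[], str]) -> str:
    """Run an operation and map specific failures to an actionable message."""
    try:
        return operation()
    except OlostepServerError_AuthFailed:
        # Specific types first: each one names the actual problem.
        return "authentication failed: check OLOSTEP_API_KEY"
    except Olostep_APIConnectionError:
        return "connection failed: check network connectivity"
    except Olostep_BaseError as exc:
        # Fallback: the class name is usually descriptive enough to report.
        return f"unhandled SDK error: {type(exc).__name__}"
```

The specific clauses must come before the base-class fallback, because Python uses the first matching `except` clause.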
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `OLOSTEP_API_KEY` | Your API key | Required |
| `OLOSTEP_BASE_API_URL` | API base URL | `https://api.olostep.com/v1` |
| `OLOSTEP_API_TIMEOUT` | Request timeout (seconds) | 150 |
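The table can be read as the following resolution logic. This is only an illustration of the variables and defaults listed above. `load_olostep_config` is a hypothetical helper, not part of the SDK, which presumably reads these variables itself.

```python
import os
from collections.abc import Mapping


def load_olostep_config(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve settings from environment variables, applying the documented defaults."""
    api_key = env.get("OLOSTEP_API_KEY")
    if not api_key:
        # The API key has no default and must be provided.
        raise RuntimeError("OLOSTEP_API_KEY is required")
    return {
        "api_key": api_key,
        "base_url": env.get("OLOSTEP_BASE_API_URL", "https://api.olostep.com/v1"),
        "timeout": float(env.get("OLOSTEP_API_TIMEOUT", "150")),  # seconds
    }
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise in tests without touching the real environment.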