Mastra Integration - Olostep Docs

The Olostep Mastra integration brings web data extraction to Mastra.ai agents. Olostep is a web search, scraping, and crawling API for searching, extracting, and structuring web data. With the integration, agents can autonomously search, scrape, analyze, and structure data from websites.

Features

The integration provides four APIs for automated web data extraction:

Scrape Website

Extract content from any single URL in multiple formats (Markdown, HTML, JSON, text)

Batch Scrape URLs

Process up to 100,000 URLs in parallel. Perfect for large-scale data extraction

Create Crawl

Autonomously discover and scrape entire websites by following links

Create Map

Extract all URLs from a website for site structure analysis and content discovery

Installation

npm install @olostep/mastra-tools

Setup

1. Install the Package

npm install @olostep/mastra-tools @mastra/core

2. Import and Register Integration

In your Mastra configuration file:

import { Mastra } from '@mastra/core';
import { createOlostepIntegration } from '@olostep/mastra-tools';

// Create the Olostep integration
const olostep = createOlostepIntegration();

// Register APIs (this makes them available to agents)
olostep.registerApis();

// Add to your Mastra config
export const mastra = new Mastra({
  config: {
    integrations: [olostep],
    // ... other config
  },
});

3. Configure API Key

Set your Olostep API key as an environment variable:

export OLOSTEP_API_KEY=your-api-key-here

Or in your .env file:

OLOSTEP_API_KEY=your-api-key-here

Get your API key from the Olostep Dashboard.
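A minimal sketch of a fail-fast check for the key, assuming you would rather get a clear error at startup than an “Invalid API key” response later. The `requireApiKey` helper is hypothetical and not part of `@olostep/mastra-tools`; it takes the environment as a plain object so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper (not part of @olostep/mastra-tools): read and validate
// the Olostep API key from an environment-like object.
type Env = { [name: string]: string | undefined };

function requireApiKey(env: Env): string {
  const raw = env["OLOSTEP_API_KEY"];
  // Trim to guard against stray spaces copied along with the key.
  const key = raw === undefined ? "" : raw.trim();
  if (key.length === 0) {
    throw new Error("OLOSTEP_API_KEY is not set");
  }
  return key;
}
```

In a Node.js process this would typically be called once at startup as `requireApiKey(process.env)`.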

Available APIs

The integration exposes 4 APIs that your Mastra agents can use:

scrapeWebsite

Extract content from a single URL. Supports multiple formats and JavaScript rendering.

Use Cases:

Monitor specific pages for changes
Extract product information from e-commerce sites
Gather data from news articles or blog posts
Pull content for content aggregation

Schema Parameters:

apiKey (string, required) - Your Olostep API key
url_to_scrape (string, required) - Website URL to scrape (must include http:// or https://)
formats (array, default: ['markdown']) - Output formats: ['html', 'markdown', 'json', 'text']
country (string) - Country code for location-specific content (e.g., “US”, “GB”, “CA”)
wait_before_scraping (number) - Wait time in milliseconds for JavaScript rendering (0-10000)
parser (string) - Optional parser ID for specialized extraction (e.g., “@olostep/amazon-product”)

Response:

id - Scrape ID
url_to_scrape - Scraped URL
result.markdown_content - Markdown content
result.html_content - HTML content
result.json_content - JSON content
result.text_content - Text content
result.screenshot_hosted_url - Screenshot URL (if available)
result.markdown_hosted_url - Hosted markdown URL
object - Object type (“scrape”)
created - Unix timestamp

Example Usage:

// In your agent or workflow
const result = await mastra.callApi({
  integrationName: 'olostep',
  api: 'scrapeWebsite',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      url_to_scrape: 'https://example.com',
      formats: ['markdown'],
      country: 'US',
    }
  }
});
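The parameter constraints listed above (URL scheme, the 0-10000 ms wait range, the `['markdown']` default) can be checked locally before a request is sent. The sketch below is an illustration only: `buildScrapePayload` and the `ScrapeParams` type are hypothetical names, not exports of the integration.

```typescript
// Hypothetical helper: validate scrapeWebsite parameters before calling the API.
// The limits mirror the parameter descriptions above.
type ScrapeFormat = "html" | "markdown" | "json" | "text";

interface ScrapeParams {
  apiKey: string;
  url_to_scrape: string;
  formats?: ScrapeFormat[];
  country?: string;
  wait_before_scraping?: number;
  parser?: string;
}

function buildScrapePayload(params: ScrapeParams): { data: ScrapeParams } {
  if (!/^https?:\/\//i.test(params.url_to_scrape)) {
    throw new Error("url_to_scrape must include http:// or https://");
  }
  const wait = params.wait_before_scraping;
  if (wait !== undefined && (wait < 0 || wait > 10000)) {
    throw new Error("wait_before_scraping must be between 0 and 10000 ms");
  }
  // Apply the documented default format when none is given.
  const formats: ScrapeFormat[] =
    params.formats && params.formats.length > 0 ? params.formats : ["markdown"];
  return { data: { ...params, formats } };
}
```

The returned object has the same `{ data: ... }` shape used as `payload` in the example above.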

batchScrape

Process multiple URLs in parallel (up to 100,000 at once). Perfect for large-scale data extraction.

Use Cases:

Scrape entire product catalogs
Extract data from multiple search results
Process lists of URLs from spreadsheets
Bulk content extraction

Schema Parameters:

apiKey (string, required) - Your Olostep API key
batch_array (array, required) - Array of objects with url and optional custom_id fields. Example: [{"url":"https://example.com","custom_id":"site1"}]
formats (array, default: ['markdown']) - Output formats for all URLs
country (string) - Country code for location-specific scraping
wait_before_scraping (number) - Wait time in milliseconds for JavaScript rendering
parser (string) - Optional parser ID for specialized extraction

Response:

batch_id - Batch ID (use this to retrieve results later)
status - Processing status
object - Object type (“batch”)

Example Usage:

const result = await mastra.callApi({
  integrationName: 'olostep',
  api: 'batchScrape',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      batch_array: [
        { url: 'https://example.com', custom_id: 'site1' },
        { url: 'https://test.com', custom_id: 'site2' },
      ],
      formats: ['markdown'],
    }
  }
});
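When the URL list comes from a spreadsheet or another API, it usually needs deduplicating, labelling, and splitting to stay within the 100,000-URL limit. A minimal sketch, assuming a generated `custom_id` of the form `url-<index>` is acceptable; the `toBatchArrays` helper and that ID scheme are hypothetical.

```typescript
// Hypothetical helper: turn a plain URL list into one or more batch_array values.
interface BatchItem {
  url: string;
  custom_id: string;
}

const MAX_BATCH_SIZE = 100000; // documented upper bound per batch

function toBatchArrays(urls: string[], maxSize: number = MAX_BATCH_SIZE): BatchItem[][] {
  if (maxSize < 1) {
    throw new Error("maxSize must be at least 1");
  }
  // Drop duplicates while keeping first-seen order.
  const seen: { [url: string]: boolean } = {};
  const unique: string[] = [];
  for (const url of urls) {
    if (!Object.prototype.hasOwnProperty.call(seen, url)) {
      seen[url] = true;
      unique.push(url);
    }
  }
  // Split into chunks and attach a custom_id so results can be matched later.
  const batches: BatchItem[][] = [];
  for (let start = 0; start < unique.length; start += maxSize) {
    const chunk = unique.slice(start, start + maxSize);
    batches.push(chunk.map((url, i) => ({ url, custom_id: `url-${start + i}` })));
  }
  return batches;
}
```

Each inner array can be passed as `batch_array` in a separate `batchScrape` call.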

createCrawl

Autonomously discover and scrape entire websites by following links. Perfect for documentation sites, blogs, and content repositories.

Use Cases:

Crawl and archive entire documentation sites
Extract all blog posts from a website
Build knowledge bases from web content
Monitor website structure changes

Schema Parameters:

apiKey (string, required) - Your Olostep API key
start_url (string, required) - Starting URL for the crawl (must include http:// or https://)
max_pages (number, default: 10) - Maximum number of pages to crawl
follow_links (boolean, default: true) - Whether to follow links found on pages
formats (array, default: ['markdown']) - Format for scraped content
country (string) - Optional country code for location-specific crawling
parser (string) - Optional parser ID for specialized content extraction

Response:

id - Crawl ID (use this to retrieve results later)
object - Object type (“crawl”)
status - Crawl status
created - Unix timestamp

Example Usage:

const result = await mastra.callApi({
  integrationName: 'olostep',
  api: 'createCrawl',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      start_url: 'https://docs.example.com',
      max_pages: 50,
      follow_links: true,
      formats: ['markdown'],
    }
  }
});
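Because crawls are billed per page, it can help to make the documented defaults explicit in code rather than relying on them implicitly. The sketch below fills in those defaults and rejects an invalid page limit; `resolveCrawlOptions` is a hypothetical helper, and the integer check on `max_pages` is an assumption rather than a documented rule.

```typescript
// Hypothetical helper: apply the documented createCrawl defaults explicitly.
interface CrawlOptions {
  start_url: string;
  max_pages?: number;
  follow_links?: boolean;
  formats?: string[];
}

interface ResolvedCrawlOptions {
  start_url: string;
  max_pages: number;
  follow_links: boolean;
  formats: string[];
}

function resolveCrawlOptions(options: CrawlOptions): ResolvedCrawlOptions {
  if (!/^https?:\/\//i.test(options.start_url)) {
    throw new Error("start_url must include http:// or https://");
  }
  const maxPages = options.max_pages === undefined ? 10 : options.max_pages;
  // Assumption: max_pages should be a positive whole number.
  if (maxPages < 1 || Math.floor(maxPages) !== maxPages) {
    throw new Error("max_pages must be a positive integer");
  }
  return {
    start_url: options.start_url,
    max_pages: maxPages,
    follow_links: options.follow_links === undefined ? true : options.follow_links,
    formats: options.formats && options.formats.length > 0 ? options.formats : ["markdown"],
  };
}
```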

createMap

Extract all URLs from a website for content discovery and site structure analysis.

Use Cases:

Build sitemaps and site structure diagrams
Discover all pages before batch scraping
Find broken or missing pages
SEO audits and analysis

Schema Parameters:

apiKey (string, required) - Your Olostep API key
url (string, required) - Website URL to extract links from (must include http:// or https://)
search_query (string) - Optional search query to filter URLs (e.g., “blog”)
top_n (number) - Limit the number of URLs returned
include_urls (array) - Glob patterns to include specific paths (e.g., ["/blog/**"])
exclude_urls (array) - Glob patterns to exclude specific paths (e.g., ["/admin/**"])

Response:

id - Map ID
object - Object type (“map”)
url - Website URL
total_urls - Total URLs found
urls - Array of discovered URLs

Example Usage:

const result = await mastra.callApi({
  integrationName: 'olostep',
  api: 'createMap',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      url: 'https://example.com',
      search_query: 'blog',
      top_n: 100,
      include_urls: ['/blog/**'],
    }
  }
});
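The `urls` array in the response can also be narrowed on the client before it is handed to another API. A minimal sketch, assuming `urls` is an array of absolute URL strings as the response description suggests; `selectPaths` is a hypothetical helper and uses simple prefix matching, not the glob semantics of `include_urls`.

```typescript
// Hypothetical helper: pick unique URLs under a path prefix from a map response.
interface MapResponse {
  urls: string[];
}

function selectPaths(map: MapResponse, pathPrefix: string, limit: number): string[] {
  const selected: string[] = [];
  for (const candidate of map.urls) {
    if (selected.length >= limit) {
      break;
    }
    // Capture the path portion of an absolute http(s) URL; skip anything else.
    const match = /^https?:\/\/[^\/?#]+([^?#]*)/i.exec(candidate);
    if (match === null) {
      continue;
    }
    const path = match[1] === "" ? "/" : match[1];
    if (path.indexOf(pathPrefix) === 0 && selected.indexOf(candidate) === -1) {
      selected.push(candidate);
    }
  }
  return selected;
}
```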

Using with Agents

Basic Agent Example

Create an agent that can scrape websites:

import { Agent } from '@mastra/core';
import { createOlostepIntegration } from '@olostep/mastra-tools';

const olostep = createOlostepIntegration();
olostep.registerApis();

const agent = new Agent({
  name: 'web-researcher',
  instructions: `
    You are a web research assistant. When users ask you to get information from a website,
    use the Olostep scrapeWebsite API to extract the content, then summarize it for them.
  `,
  model: 'openai/gpt-4',
});

// The agent can now use Olostep APIs through Mastra's API system

Agent Workflow Example

Build a research workflow that discovers and scrapes content:

// 1. Map a website to discover URLs
const mapResult = await mastra.callApi({
  integrationName: 'olostep',
  api: 'createMap',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      url: 'https://example.com',
      include_urls: ['/blog/**'],
    }
  }
});

// 2. Batch scrape discovered URLs
const batchResult = await mastra.callApi({
  integrationName: 'olostep',
  api: 'batchScrape',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      batch_array: mapResult.urls.slice(0, 10).map(url => ({ url })),
      formats: ['markdown'],
    }
  }
});

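// 3. Process results with your agent
// Note: per the batchScrape response above, batchResult contains batch_id, status, and
// object, not the scraped content. Retrieve the content for batchResult.batch_id once
// the batch completes; scrapedMarkdown below is a placeholder for that retrieved text.
const summary = await agent.generate({
  messages: [{
    role: 'user',
    content: `Summarize this content: ${scrapedMarkdown}`
  }]
});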

Popular Use Cases

Research Agent

Build an agent that autonomously researches topics:

Multi-Source Research

Workflow:

User asks: “Research AI trends”
Agent uses createMap to discover relevant pages
Agent uses batchScrape to extract content
Agent analyzes and summarizes findings
Returns structured research report

Competitor Monitoring

Workflow:

Schedule daily monitoring
Use scrapeWebsite to check competitor pages
Compare with previous data
Alert on significant changes
Generate weekly reports
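The "compare with previous data" and "alert on significant changes" steps can be sketched as a line-level comparison of two Markdown snapshots. This is one simple heuristic among many, not a method prescribed by Olostep; the function names and the 10% default threshold are assumptions.

```typescript
// Hypothetical helpers: estimate how much a page changed between two scrapes.
function changedLineRatio(previous: string, current: string): number {
  // Compare trimmed, non-empty lines so whitespace noise is ignored.
  const normalize = (text: string): string[] =>
    text.split("\n").map(line => line.trim()).filter(line => line.length > 0);
  const before = normalize(previous);
  const after = normalize(current);
  if (before.length === 0 && after.length === 0) {
    return 0;
  }
  const removed = before.filter(line => after.indexOf(line) === -1).length;
  const added = after.filter(line => before.indexOf(line) === -1).length;
  return (removed + added) / (before.length + after.length);
}

function isSignificantChange(previous: string, current: string, threshold: number = 0.1): boolean {
  return changedLineRatio(previous, current) >= threshold;
}
```

A daily job could store the last `markdown_content` per page and raise an alert when `isSignificantChange` returns true.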

Content Aggregation

Workflow:

Use createCrawl to discover all blog posts
Use batchScrape to extract content
Process with AI to extract key topics
Store in knowledge base
Generate content calendar

E-commerce Intelligence

Monitor products and prices:

Agent Workflow:
Scrape product pages (scrapeWebsite)
Extract structured data (with parser)
Track price changes
Generate alerts
Update database
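The "track price changes" and "generate alerts" steps might look like the sketch below. It assumes each scrape has already been reduced to a product URL and a numeric price; the `PriceSnapshot` shape is hypothetical, since the exact fields returned by a parser are not listed here.

```typescript
// Hypothetical types and helper: compare two rounds of price snapshots.
interface PriceSnapshot {
  productUrl: string;
  price: number;
}

interface PriceAlert {
  productUrl: string;
  oldPrice: number;
  newPrice: number;
  changePercent: number;
}

function detectPriceChanges(
  previous: PriceSnapshot[],
  current: PriceSnapshot[],
  minPercent: number
): PriceAlert[] {
  const oldByUrl: { [url: string]: number | undefined } = {};
  for (const snap of previous) {
    oldByUrl[snap.productUrl] = snap.price;
  }
  const alerts: PriceAlert[] = [];
  for (const snap of current) {
    const oldPrice = oldByUrl[snap.productUrl];
    // Skip products without a usable baseline price.
    if (oldPrice === undefined || oldPrice === 0) {
      continue;
    }
    const changePercent = ((snap.price - oldPrice) / oldPrice) * 100;
    if (Math.abs(changePercent) >= minPercent) {
      alerts.push({ productUrl: snap.productUrl, oldPrice, newPrice: snap.price, changePercent });
    }
  }
  return alerts;
}
```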

SEO Analysis

Analyze website structure and content:

Agent Workflow:
Map website structure (createMap)
Crawl important sections (createCrawl)
Analyze content quality
Identify SEO opportunities
Generate recommendations

Specialized Parsers

Olostep provides pre-built parsers for popular websites. Use them with the parser parameter:

Amazon Product

@olostep/amazon-product
Extract: title, price, rating, reviews, images, variants

LinkedIn Profile

@olostep/linkedin-profile
Extract: name, title, company, location, experience

LinkedIn Company

@olostep/linkedin-company
Extract: company info, employee count, industry, description

Google Search

@olostep/google-search
Extract: search results, titles, snippets, URLs

Google Maps

@olostep/google-maps
Extract: business info, reviews, ratings, location

Instagram Profile

@olostep/instagram-profile
Extract: profile info, followers, posts, bio

Using Parsers

Add the parser ID to the parser parameter:

const result = await mastra.callApi({
  integrationName: 'olostep',
  api: 'scrapeWebsite',
  payload: {
    data: {
      apiKey: process.env.OLOSTEP_API_KEY,
      url_to_scrape: 'https://www.amazon.com/dp/PRODUCT_ID',
      formats: ['json'],
      parser: '@olostep/amazon-product',
    }
  }
});

The parser automatically extracts structured data specific to that website type.
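When an agent handles arbitrary URLs, the parser ID can be chosen from the URL itself. The sketch below uses the parser IDs listed above; the host and path rules are assumptions about typical URL shapes, and `pickParser` is a hypothetical helper, so verify each rule against the pages you actually scrape.

```typescript
// Hypothetical helper: choose a parser ID from a URL. The rules are assumptions.
interface ParserRule {
  host: string;
  pathPrefix: string;
  parser: string;
}

const PARSER_RULES: ParserRule[] = [
  { host: "amazon.com", pathPrefix: "/dp/", parser: "@olostep/amazon-product" },
  { host: "linkedin.com", pathPrefix: "/in/", parser: "@olostep/linkedin-profile" },
  { host: "linkedin.com", pathPrefix: "/company/", parser: "@olostep/linkedin-company" },
  { host: "google.com", pathPrefix: "/search", parser: "@olostep/google-search" },
  { host: "google.com", pathPrefix: "/maps", parser: "@olostep/google-maps" },
  { host: "instagram.com", pathPrefix: "/", parser: "@olostep/instagram-profile" },
];

function pickParser(url: string): string | undefined {
  // Capture host and path of an absolute http(s) URL.
  const match = /^https?:\/\/([^\/?#:]+)(?::\d+)?([^?#]*)/i.exec(url);
  if (match === null) {
    return undefined;
  }
  const host = match[1].toLowerCase().replace(/^www\./, "");
  const path = match[2] === "" ? "/" : match[2];
  for (const rule of PARSER_RULES) {
    const hostMatches =
      host === rule.host || host.slice(-(rule.host.length + 1)) === "." + rule.host;
    if (hostMatches && path.indexOf(rule.pathPrefix) === 0) {
      return rule.parser;
    }
  }
  return undefined; // no specialized parser; scrape without the parser parameter
}
```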

Best Practices

Use Batch Processing for Multiple URLs

When scraping more than 3-5 URLs, use batchScrape instead of multiple scrapeWebsite calls. Batch processing is:

Much faster (parallel processing)
More cost-effective
Easier to manage
Better for rate limits

Set Appropriate Wait Times

For JavaScript-heavy sites, use the wait_before_scraping parameter:

Simple sites: 0-1000ms
Dynamic sites: 2000-3000ms
Heavy JavaScript: 5000-8000ms

Test with different values to find the optimal wait time.
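One way to test different values is to step through the suggested range for the kind of site being scraped. A minimal sketch built on the ranges above; the `SiteKind` labels and the `waitCandidates` helper are hypothetical.

```typescript
// Hypothetical helper: evenly spaced wait times to try for a given kind of site.
type SiteKind = "simple" | "dynamic" | "heavy";

const WAIT_RANGES_MS: { [K in SiteKind]: [number, number] } = {
  simple: [0, 1000],
  dynamic: [2000, 3000],
  heavy: [5000, 8000],
};

function waitCandidates(kind: SiteKind, steps: number = 3): number[] {
  const [min, max] = WAIT_RANGES_MS[kind];
  if (steps < 2) {
    return [min];
  }
  const values: number[] = [];
  for (let i = 0; i < steps; i++) {
    values.push(Math.round(min + ((max - min) * i) / (steps - 1)));
  }
  return values;
}
```

Trying the values in ascending order and stopping at the first one that returns complete content keeps scrapes as fast as the site allows.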

Use Specialized Parsers

For popular websites (Amazon, LinkedIn, Google), use pre-built parsers:

Get structured data automatically
More reliable extraction
No need for custom parsing
Maintained by Olostep

Handle Async Operations

Batch, Crawl, and Map operations are asynchronous:

Store the returned ID (batch_id for batches, id for crawls and maps)
Poll for completion or use webhooks
Set up separate workflows for retrieval
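Polling for completion can be sketched generically. The status lookup is passed in as a function because this page does not show the retrieval call, and the status strings are hypothetical; `pollUntilDone` itself is an illustration, not part of the integration.

```typescript
// Hypothetical helper: poll an injected status check until it reports completion.
async function pollUntilDone(
  checkStatus: () => Promise<string>,
  isDone: (status: string) => boolean,
  maxAttempts: number,
  delayMs: number
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await checkStatus();
    if (isDone(status)) {
      return status;
    }
    // Wait between attempts, but not after the last one.
    if (attempt < maxAttempts) {
      await new Promise<void>(resolve => setTimeout(() => resolve(), delayMs));
    }
  }
  throw new Error(`Job not finished after ${maxAttempts} attempts`);
}
```

Here `checkStatus` would wrap whichever call you use to look up a batch or crawl by its stored ID.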

Error Handling

Always wrap API calls in try-catch blocks:

try {
  const result = await mastra.callApi({
    integrationName: 'olostep',
    api: 'scrapeWebsite',
    payload: { data: { /* API-specific parameters */ } }
  });
} catch (error) {
  // Handle authentication, rate limit, or network errors
  const message = error instanceof Error ? error.message : String(error);
  console.error('Scraping failed:', message);
}
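Inside the catch block it often helps to branch on the kind of failure. The sketch below matches on the error messages quoted in the Troubleshooting section; whether real errors carry exactly these strings is an assumption, and `classifyError` is a hypothetical helper.

```typescript
// Hypothetical helper: map an error to a coarse category by its message text.
type ErrorKind = "auth" | "rate_limit" | "not_registered" | "unknown";

function classifyError(error: unknown): ErrorKind {
  const message = error instanceof Error ? error.message : String(error);
  const text = message.toLowerCase();
  if (text.indexOf("invalid api key") !== -1) {
    return "auth";
  }
  if (text.indexOf("rate limit") !== -1) {
    return "rate_limit";
  }
  if (text.indexOf("api not found") !== -1 || text.indexOf("integration not registered") !== -1) {
    return "not_registered";
  }
  return "unknown";
}
```

A caller might retry later on `rate_limit`, stop immediately on `auth`, and log everything else.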

Rate Limiting

Be mindful of rate limits:

Space out requests with delays
Use batch processing when possible
Monitor usage in Olostep dashboard
Upgrade plan if needed
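One common way to space out retries after a rate-limit error is exponential backoff with a cap. This is a general technique rather than something Olostep prescribes, and the base and cap values are yours to choose; `backoffDelays` is a hypothetical helper.

```typescript
// Hypothetical helper: capped exponential backoff delays, in milliseconds.
function backoffDelays(attempts: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    // Double the delay on each attempt, never exceeding maxMs.
    delays.push(Math.min(maxMs, baseMs * Math.pow(2, i)));
  }
  return delays;
}
```

For example, `backoffDelays(5, 500, 4000)` yields waits of 500, 1000, 2000, 4000, and 4000 ms.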

Complete Example

Here’s a complete example of building a research agent:

import { Mastra, Agent } from '@mastra/core';
import { createOlostepIntegration } from '@olostep/mastra-tools';

// Create and register Olostep integration
const olostep = createOlostepIntegration();
olostep.registerApis();

// Initialize Mastra
export const mastra = new Mastra({
  config: {
    integrations: [olostep],
    // ... other config
  },
});

// Create research agent
const researchAgent = new Agent({
  name: 'research-assistant',
  instructions: `
    You are a research assistant that can search, extract, and structure web data.
    When users ask you to research a topic:
    1. Use Olostep's createMap to discover relevant pages
    2. Use batchScrape to extract content from multiple sources
    3. Analyze and summarize the findings
    4. Present structured research reports
  `,
  model: 'openai/gpt-4',
});

// Use the agent
async function researchTopic(topic: string) {
  // Step 1: Discover relevant pages
  const mapResult = await mastra.callApi({
    integrationName: 'olostep',
    api: 'createMap',
    payload: {
      data: {
        apiKey: process.env.OLOSTEP_API_KEY!,
        url: `https://example.com/search?q=${encodeURIComponent(topic)}`,
        top_n: 20,
      }
    }
  });

  // Step 2: Scrape discovered pages
  const batchResult = await mastra.callApi({
    integrationName: 'olostep',
    api: 'batchScrape',
    payload: {
      data: {
        apiKey: process.env.OLOSTEP_API_KEY!,
        batch_array: mapResult.urls.slice(0, 10).map(url => ({ url })),
        formats: ['markdown'],
      }
    }
  });

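  // Step 3: Analyze with agent
  // Note: batchScrape returns only batch_id and status. Retrieve the scraped content
  // for batchResult.batch_id once the batch completes and include it in the prompt below.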
  const summary = await researchAgent.generate({
    messages: [{
      role: 'user',
      content: `Based on this research data, provide a comprehensive summary of ${topic}`
    }]
  });

  return summary;
}
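For the analysis step to be useful, the scraped text has to reach the model. A minimal sketch of assembling that prompt, assuming the batch content has already been retrieved into `{ url, markdown }` pairs; the `ScrapedDoc` shape, the `buildSummaryPrompt` helper, and the per-document character limit are all hypothetical.

```typescript
// Hypothetical helper: build a summary prompt from retrieved Markdown documents.
interface ScrapedDoc {
  url: string;
  markdown: string;
}

function buildSummaryPrompt(topic: string, docs: ScrapedDoc[], maxCharsPerDoc: number): string {
  const sections = docs.map((doc, i) => {
    // Truncate long pages so the prompt stays within a predictable size.
    const body =
      doc.markdown.length > maxCharsPerDoc
        ? doc.markdown.slice(0, maxCharsPerDoc) + "\n[truncated]"
        : doc.markdown;
    return `Source ${i + 1}: ${doc.url}\n${body}`;
  });
  return (
    `Based on this research data, provide a comprehensive summary of ${topic}.\n\n` +
    sections.join("\n\n")
  );
}
```

The returned string can be used as the `content` of the user message passed to `researchAgent.generate`.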

Troubleshooting

Authentication Failed

Error: “Invalid API key”

Solutions:

Check API key from dashboard
Ensure API key is set in environment variable
Verify API key is active
Check for extra spaces in API key

API Not Found

Error: “API not found” or “Integration not registered”

Solutions:

Ensure registerApis() is called after creating integration
Verify integration is added to Mastra config
Check integration name is ‘olostep’
Restart Mastra server after changes

Scrape Returns Empty Content

Error: Content fields are empty

Solutions:

Increase wait_before_scraping time
Check if website requires login
Try different format (HTML vs Markdown)
Verify URL is accessible
Check if site blocks automated access

Rate Limit Exceeded

Error: “Rate limit exceeded”

Solutions:

Space out requests with delays
Use batch processing instead of individual scrapes
Upgrade your Olostep plan
Check rate limit in dashboard

TypeScript Errors

Error: Module not found or type errors

Solutions:

Ensure @mastra/core is installed
Check TypeScript version compatibility
Verify all dependencies are installed
Rebuild: npm run build

Pricing

Olostep charges based on API usage, independent of Mastra:

Scrapes: Pay per scrape
Batches: Pay per URL in batch
Crawls: Pay per page crawled
Maps: Pay per map operation

Check current pricing at olostep.com/pricing.

Support

Need help with the Mastra integration?

Documentation

Browse complete API docs

Support Email

Email: info@olostep.com

Mastra Docs

Learn about Mastra framework

Status Page

Check API status

Related Resources

Scrapes API

Learn about the Scrapes endpoint

Batches API

Learn about the Batches endpoint

Crawls API

Learn about the Crawls endpoint

Maps API

Learn about the Maps endpoint

Zapier Integration

Automate with Zapier workflows

LangChain Integration

Build AI agents with LangChain

Get Started

Ready to build AI agents with web scraping capabilities?

Install Package

Install @olostep/mastra-tools from npm

Build intelligent AI agents that can search, extract, and structure web data with Olostep and Mastra!
