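Google Search Scraper with Olostep
This guide shows how to scrape Google search results with the Olostep API and parse them into structured JSON data. This is especially useful for automating research tasks, gathering competitive intelligence, and building applications that need search data.
How it works
The JavaScript example below shows how to find the LinkedIn profile URL of a specific person (Patrick Collison) using a Google search and Olostep's Google search parser, @olostep/google-search.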
async function scrapeGoogleSearch(apiKey, query = "site:linkedin.com Patrick Collison") {
    const endpoint = "https://api.olostep.com/v1/scrapes";
    const payload = {
        "formats": ["json"],
        "parser": {"id": "@olostep/google-search"},
        // Pass the query unencoded; encodeURIComponent encodes it exactly once here
        "url_to_scrape": `https://www.google.com/search?q=${encodeURIComponent(query)}&gl=us&hl=en`,
        "wait_before_scraping": 0,
    };
    const headers = {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json"
    };
    try {
        const response = await fetch(endpoint, {
            method: "POST",
            headers: headers,
            body: JSON.stringify(payload)
        });
        const data = await response.json();
        console.log(JSON.stringify(data, null, 4));
        return data;
    } catch (error) {
        console.error("Error:", error);
        throw error;
    }
}
// Replace <API_KEY> with your actual Olostep API key
scrapeGoogleSearch("<API_KEY>");
Response format
When you send a request to the Olostep API with the Google search parser, it returns a JSON response like the following.
{
    "id": "scrape_f2xghz17kt",
    "object": "scrape",
    "created": 1742679301,
    "metadata": {},
    "retrieve_id": "f2xghz17kt",
    "url_to_scrape": "https://www.google.com/search?q=site%253Alinkedin.com%2BPatrick%2BCollison&gl=us&hl=en",
    "result": {
        "html_content": null,
        "markdown_content": null,
        "text_content": null,
        "json_content": "{\"searchParameters\":{\"type\":\"search\",\"engine\":\"google\",\"q\":\"site:linkedin.com Patrick Collison\"},\"knowledgeGraph\":{\"description\":\"Experience. Stripe Graphic · Stripe. -. Education. Massachusetts Institute of Technology Graphic · Massachusetts Institute of Technology. 2006 - 2010 ...\"},\"organic\":[{\"title\":\"Patrick Collison - Stripe\",\"link\":\"https://www.linkedin.com/in/patrickcollison\",\"position\":1,\"snippet\":\"Experience. Stripe Graphic · Stripe. -. Education. Massachusetts Institute of Technology Graphic · Massachusetts Institute of Technology. 2006 - 2010 ...\",\"meta\":\"10.8K+ followers\"},{\"title\":\"The Stripe Story: How Patrick Collison Revolutionized ...\",\"link\":\"https://www.linkedin.com/pulse/stripe-story-how-patrick-collison-revolutionized-online-anshuman-jha-jzzic\",\"position\":2,\"snippet\":\"The Early Years: A Genius in the Making. Patrick Collison wasn't just bright—he was a supernova. By age 10, he'd devoured university-level math ...\"},{\"title\":\"In 2005, Patrick Collison was a 16-year-old winning ...\",\"link\":\"https://www.linkedin.com/posts/itselanagold_in-2005-patrick-collison-was-a-16-year-old-activity-7308533537576497154-w5vC\",\"position\":3,\"snippet\":\"In 2005, Patrick Collison was a 16-year-old winning Ireland's Young Scientist of the Year competition. By 2008, he and his younger brother ...\"},{\"title\":\"Patrick Collison on the importance of waiting a really long ...\",\"link\":\"https://www.linkedin.com/posts/the-startup-archive_patrick-collison-on-the-importance-of-waiting-activity-7286001819145707520-1mdI\",\"position\":4,\"snippet\":\"Patrick argues you should also view every person you hire as bringing along another 50 people just like them if your company is successful.\"},{\"title\":\"Tim Ferriss' Post - Patrick Collison — CEO of Stripe (#353)\",\"link\":\"https://www.linkedin.com/posts/timferriss_patrick-collison-ceo-of-stripe-353-activity-7271892372358148096--dsK\",\"position\":5,\"snippet\":\"Author of 5 #1 NYT/WSJ bestsellers, early-stage investor, host of The Tim Ferriss Show podcast (1B+ downloads), and collector of the strange.\"},{\"title\":\"Patrick Collison wanted a guide to Stripe's culture that ...\",\"link\":\"https://www.linkedin.com/posts/first-round-capital_patrick-collison-wanted-a-guide-to-stripes-activity-7304833456948097024-Tt6h\",\"position\":6,\"snippet\":\"Patrick Collison wanted a guide to Stripe's culture that convinced 50% of candidates not to join. And Eeke de Milliano was tasked with ...\"},{\"title\":\"The Collison brothers (John & Patrick) explain why Stripe is ...\",\"link\":\"https://www.linkedin.com/posts/marcelvanoost_the-collison-brothers-john-patrick-explain-activity-7301586346349850624-L-4U\",\"position\":7,\"snippet\":\"The Collison brothers (John & Patrick) explain why Stripe is better off staying Private: \\\" This is our life's work. We're not going anywhere ...\"},{\"title\":\"Stripe CEO Patrick Collison on Crafting a Culture ...\",\"link\":\"https://www.linkedin.com/posts/jennifer-chatman-8086a918_stripe-ceo-patrick-collison-on-crafting-a-activity-7231753022849085440-0RE5\",\"position\":8,\"snippet\":\"When Patrick Collison and his brother John Collison founded digital payment company Stripe in 2010, he didn't come in with “any kind of ...\"},{\"title\":\"Patrick Collison on the importance of beauty and ...\",\"link\":\"https://www.linkedin.com/posts/the-startup-archive_patrick-collison-on-the-importance-of-beauty-activity-7247935993817751552-Qt6h\",\"position\":9,\"snippet\":\"Patrick Collison on the importance of beauty and craftsmanship when building products “If Stripe is a monstrously successful business, ...\"}]}",
        "llm_extract": null,
        "screenshot_hosted_url": null,
        "html_hosted_url": null,
        "markdown_hosted_url": null,
        "json_hosted_url": "https://olostep-storage.s3.us-east-1.amazonaws.com/json_f2xghz17kt.json",
        "text_hosted_url": null,
        "links_on_page": [],
        "page_metadata": {
            "status_code": 200,
            "title": ""
        }
    }
}
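The response includes:
- Basic request information: id, object, created (timestamp), url_to_scrape
- result object: URLs for accessing the data in different formats
- json_content: contains the structured search results
    - searchParameters: information about the search query
    - knowledgeGraph: detailed information about the search subject (when available)
    - organic: the list of search results, each with a title, link, position, and snippet
    - peopleAlsoAsk: related questions that users frequently search for
    - relatedSearches: suggested related search queries
json_content is the main part of the response and holds the structured search results. You can read the JSON content directly from the response or use the hosted URL provided in the response.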
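In the sample response above, json_content arrives as a JSON-encoded string rather than a nested object, so it has to be parsed before its fields can be read. The following is a minimal sketch under that assumption; parseSearchResults is a hypothetical helper name, not part of the Olostep API.

```javascript
// Hypothetical helper: turns a scrape response into a plain object.
// Assumes the response shape shown in this guide (result.json_content).
function parseSearchResults(scrapeResponse) {
    const raw = scrapeResponse?.result?.json_content;
    if (raw == null) {
        return null; // no JSON content in this response
    }
    // In the sample response json_content is a string, so parse it;
    // if it is already an object, return it unchanged.
    return typeof raw === "string" ? JSON.parse(raw) : raw;
}

// Example with a trimmed-down copy of the sample response
const sampleResponse = {
    result: {
        json_content: "{\"searchParameters\":{\"q\":\"site:linkedin.com Patrick Collison\"},\"organic\":[{\"title\":\"Patrick Collison - Stripe\",\"link\":\"https://www.linkedin.com/in/patrickcollison\",\"position\":1}]}"
    }
};
const parsedResults = parseSearchResults(sampleResponse);
console.log(parsedResults.organic[0].link);
```

Returning null when json_content is missing lets the caller decide whether to fall back to the hosted URL or treat the scrape as failed.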
Structured response: json_content
{
    "searchParameters": {
        "type": "search",
        "engine": "google",
        "q": "site:linkedin.com Patrick Collison"
    },
    "knowledgeGraph": {
        "description": "Experience. Stripe Graphic · Stripe. -. Education. Massachusetts Institute of Technology Graphic · Massachusetts Institute of Technology. 2006 - 2010 ..."
    },
    "organic": [
        {
            "title": "Patrick Collison - Stripe",
            "link": "https://www.linkedin.com/in/patrickcollison",
            "position": 1,
            "snippet": "Experience. Stripe Graphic · Stripe. -. Education. Massachusetts Institute of Technology Graphic · Massachusetts Institute of Technology. 2006 - 2010 ...",
            "meta": "10.8K+ followers"
        },
        {
            "title": "The Stripe Story: How Patrick Collison Revolutionized ...",
            "link": "https://www.linkedin.com/pulse/stripe-story-how-patrick-collison-revolutionized-online-anshuman-jha-jzzic",
            "position": 2,
            "snippet": "The Early Years: A Genius in the Making. Patrick Collison wasn't just bright—he was a supernova. By age 10, he'd devoured university-level math ..."
        },
        {
            "title": "In 2005, Patrick Collison was a 16-year-old winning ...",
            "link": "https://www.linkedin.com/posts/itselanagold_in-2005-patrick-collison-was-a-16-year-old-activity-7308533537576497154-w5vC",
            "position": 3,
            "snippet": "In 2005, Patrick Collison was a 16-year-old winning Ireland's Young Scientist of the Year competition. By 2008, he and his younger brother ..."
        },
        {
            "title": "The Collison brothers (John & Patrick) explain why Stripe is ...",
            "link": "https://www.linkedin.com/posts/marcelvanoost_the-collison-brothers-john-patrick-explain-activity-7301586346349850624-L-4U",
            "position": 4,
            "snippet": "The Collison brothers (John & Patrick) explain why Stripe is better off staying Private: \" This is our life's work. We're not going anywhere ..."
        },
        {
            "title": "Patrick Collison on the importance of waiting a really long ...",
            "link": "https://www.linkedin.com/posts/the-startup-archive_patrick-collison-on-the-importance-of-waiting-activity-7286001819145707520-1mdI",
            "position": 5,
            "snippet": "Patrick argues you should also view every person you hire as bringing along another 50 people just like them if your company is successful."
        },
        {
            "title": "Tim Ferriss' Post - Patrick Collison — CEO of Stripe (#353)",
            "link": "https://www.linkedin.com/posts/timferriss_patrick-collison-ceo-of-stripe-353-activity-7271892372358148096--dsK",
            "position": 6,
            "snippet": "Author of 5 #1 NYT/WSJ bestsellers, early-stage investor, host of The Tim Ferriss Show podcast (1B+ downloads), and collector of the strange."
        },
        {
            "title": "Patrick Collison on the importance of beauty and ...",
            "link": "https://www.linkedin.com/posts/the-startup-archive_patrick-collison-on-the-importance-of-beauty-activity-7247935993817751552-Qt6h",
            "position": 7,
            "snippet": "Patrick Collison on the importance of beauty and craftsmanship when building products “If Stripe is a monstrously successful business, ..."
        },
        {
            "title": "Stripe founder Patrick Collison tells the story of almost ...",
            "link": "https://www.linkedin.com/posts/the-startup-archive_stripe-founder-patrick-collison-tells-the-activity-7235977194211000321-V-Cd",
            "position": 8,
            "snippet": "Stripe founder Patrick Collison tells the story of almost naming the company PayDemon Patrick and John Collison founded Stripe in 2010 when ..."
        },
        {
            "title": "Patrick Collison created $50 billion of value as a co- ...",
            "link": "https://www.linkedin.com/posts/tom-alder_patrick-collison-created-50-billion-of-value-activity-7239241304780513281-isxK",
            "position": 9,
            "snippet": "Patrick Collison created $50 billion of value as a co-founder of Stripe. He has also built the largest carbon removal program in the world."
        }
    ]
}
You can also access the structured JSON as a file through the json_hosted_url field.
Each format you list in the formats parameter is returned in the response, and Olostep provides a hosted URL for each of those formats.
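As a sketch of how the two access paths could be combined, the function below reads json_content when it is present and otherwise downloads the file at json_hosted_url. loadSearchResults and its fetchImpl parameter are hypothetical names introduced for illustration, and the code assumes the hosted file holds the same JSON as json_content.

```javascript
// Hypothetical helper: returns the structured results from a scrape response,
// preferring inline json_content and falling back to json_hosted_url.
// Assumption: the hosted file contains the same JSON as json_content.
// fetchImpl can be swapped out, for example to test without network access.
async function loadSearchResults(scrapeResponse, fetchImpl = fetch) {
    const result = scrapeResponse?.result ?? {};
    if (typeof result.json_content === "string") {
        return JSON.parse(result.json_content);
    }
    if (result.json_hosted_url) {
        const response = await fetchImpl(result.json_hosted_url);
        if (!response.ok) {
            throw new Error(`Could not download hosted JSON: HTTP ${response.status}`);
        }
        return response.json();
    }
    return null; // neither inline content nor a hosted URL is available
}
```

A typical call would be `const results = await loadSearchResults(await scrapeGoogleSearch(apiKey));`, after which `results.organic` holds the list of search results.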
Example use cases
1. Finding LinkedIn profiles
The example above uses the search query site:linkedin.com Patrick Collison to find Patrick Collison's LinkedIn profile. The same technique can be used to find anyone's professional profile.
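The organic list for this query mixes the profile page with posts and articles, so the profile URL still has to be picked out. The sketch below does that under the assumption that LinkedIn profile pages use a path starting with /in/, as the first result in the sample response does; findLinkedInProfiles is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the links from the organic results that look like
// LinkedIn profile pages. Assumption: profile URLs have a path starting with /in/.
function findLinkedInProfiles(organic) {
    return organic
        .map((entry) => entry.link)
        .filter((link) => {
            try {
                const url = new URL(link);
                const onLinkedIn = url.hostname === "linkedin.com" || url.hostname.endsWith(".linkedin.com");
                return onLinkedIn && url.pathname.startsWith("/in/");
            } catch {
                return false; // skip missing or malformed links
            }
        });
}

// Two entries taken from the sample response
const organicSample = [
    { title: "Patrick Collison - Stripe", link: "https://www.linkedin.com/in/patrickcollison", position: 1 },
    { title: "The Stripe Story: How Patrick Collison Revolutionized ...", link: "https://www.linkedin.com/pulse/stripe-story-how-patrick-collison-revolutionized-online-anshuman-jha-jzzic", position: 2 }
];
console.log(findLinkedInProfiles(organicSample));
// → [ 'https://www.linkedin.com/in/patrickcollison' ]
```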
2. Researching companies
Change the query to search for company information.
// Search for company information
scrapeGoogleSearch(apiKey, "Stripe company information revenue");
3. Tracking news articles
Find recent news on a specific topic:
// Search for recent news about AI
scrapeGoogleSearch(apiKey, "artificial intelligence news after:2023-01-01");
4. Competitive analysis
Monitor competitors' online presence:
// Search for competitor mentions
scrapeGoogleSearch(apiKey, "\"Company X\" vs \"Company Y\" comparison");
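Monitoring usually means running the same query repeatedly and looking at what changed. One possible approach, sketched below, is to compare the organic list of the latest run with that of the previous run by link. findNewResults is a hypothetical helper, and the entries in the example are placeholders rather than real search results.

```javascript
// Hypothetical helper: returns the organic results whose link did not appear
// in the previous run. Both arguments are arrays shaped like the organic list.
function findNewResults(previousOrganic, currentOrganic) {
    const seenLinks = new Set(previousOrganic.map((entry) => entry.link));
    return currentOrganic.filter((entry) => !seenLinks.has(entry.link));
}

// Placeholder data standing in for two runs of the same query
const previousRun = [
    { title: "Company X vs Company Y", link: "https://example.com/comparison", position: 1 }
];
const currentRun = [
    { title: "Company X vs Company Y", link: "https://example.com/comparison", position: 2 },
    { title: "Company X review", link: "https://example.com/review", position: 1 }
];
console.log(findNewResults(previousRun, currentRun).map((entry) => entry.link));
// → [ 'https://example.com/review' ]
```

Comparing by link rather than by position keeps a result from being reported as new just because its ranking moved.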
Important notes
- Search parameters: the gl=us and hl=en parameters set the geolocation to the United States and the language to English. Adjust them as needed.
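To make gl and hl adjustable, the search URL can be built by a small function instead of a fixed template string. The sketch below uses the standard URLSearchParams class; buildGoogleSearchUrl is a hypothetical helper, and the jp/ja values are only an example of a different country and language.

```javascript
// Hypothetical helper: builds the Google search URL passed as url_to_scrape.
// gl and hl default to the values used in this guide (us / en).
function buildGoogleSearchUrl(query, { gl = "us", hl = "en" } = {}) {
    // URLSearchParams encodes the query, so pass it unencoded
    const params = new URLSearchParams({ q: query, gl, hl });
    return `https://www.google.com/search?${params.toString()}`;
}

console.log(buildGoogleSearchUrl("site:linkedin.com Patrick Collison"));
// → https://www.google.com/search?q=site%3Alinkedin.com+Patrick+Collison&gl=us&hl=en
console.log(buildGoogleSearchUrl("Stripe", { gl: "jp", hl: "ja" }));
// → https://www.google.com/search?q=Stripe&gl=jp&hl=ja
```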
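Conclusion
Once you have the search result data, you can:
- Parse the specific data points you care about
- Store the results in a database
- Build a custom search interface
- Trigger actions based on the search results
- Integrate with other APIs and services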