---
title: Scraping Starbucks Store Information
author: kazu634
date: 2009-09-04
categories:
- Perl
- scraper
---
< div class = "section" >
< p >
I have been dabbling in scraping for a while, and now that I am fairly comfortable with it, I am going to tackle it in earnest.
< / p >
< p >
For now, I will just jot my notes down here.
< / p >
< h4 >
Getting the links
< / h4 >
< p >
I pull the links from the < a href = "http://www.starbucks.co.jp/search/index.html/" onclick = "__gaTracker('send', 'event', 'outbound-article', 'http://www.starbucks.co.jp/search/index.html/', 'スターバックス コーヒー | 店舗検索');" target = "_blank" > Starbucks Coffee | Store Search< / a > page. With the XPath expressions below, this should be easy!
< / p >
< pre class = "syntax-highlight" >
< span class = "synPreProc" > #!/usr/bin/perl< / span >
< span class = "synStatement" > use strict< / span > ;
< span class = "synStatement" > use warnings< / span > ;
< span class = "synStatement" > use < / span > Web::Scraper;
< span class = "synStatement" > use < / span > URI;
< span class = "synStatement" > use < / span > YAML; < span class = "synComment" > # assumed: the output below looks like a YAML dump< / span >

< span class = "synStatement" > my< / span > < span class = "synIdentifier" > $uri< / span > = URI->< span class = "synStatement" > new< / span > (< span class = "synConstant" > "http://www.starbucks.co.jp/search/index.html/"< / span > );

< span class = "synComment" > # prefs: hrefs of the rectangular image-map areas; citys: hrefs of the links in the SelectFromPlace cell< / span >
< span class = "synStatement" > my< / span > < span class = "synIdentifier" > $scraper< / span > = scraper {
    process < span class = "synConstant" > '//area[@shape="RECT"]'< / span > , < span class = "synConstant" > 'prefs[]'< / span > => < span class = "synConstant" > '@href'< / span > ;
    process < span class = "synConstant" > '//td[@class="SelectFromPlace"]//a'< / span > , < span class = "synConstant" > 'citys[]'< / span > => < span class = "synConstant" > '@href'< / span > ;
};

< span class = "synStatement" > my< / span > < span class = "synIdentifier" > $result< / span > = < span class = "synIdentifier" > $scraper< / span > ->scrape(< span class = "synIdentifier" > $uri< / span > );
< span class = "synStatement" > print< / span > Dump(< span class = "synIdentifier" > $result< / span > ); < span class = "synComment" > # assumed: the original listing omitted the print step< / span >
< / pre >
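< p >
The live page may no longer match these XPath expressions, so a small offline sketch can help show what they select. The HTML fixture below is hypothetical: its structure is an assumption inferred from the two expressions, not copied from the Starbucks site. Only < code >scraper< / code >, < code >process< / code > and < code >scrape< / code > come from Web::Scraper.
< / p >

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Web::Scraper;
use URI;

# Hypothetical fixture: an image map with rectangular areas plus a table cell
# of links. The real page layout is an assumption, not a copy of the site.
my $html = <<'HTML';
<html><body>
<map name="japan">
  <area shape="RECT" coords="0,0,10,10" href="/search/result_city2.php?SearchPerfecture=%E5%8C%97%E6%B5%B7%E9%81%93">
  <area shape="RECT" coords="10,0,20,10" href="/search/tokyo.php">
  <area shape="CIRCLE" coords="5,5,3" href="/search/ignored.php">
</map>
<table><tr>
  <td class="SelectFromPlace"><a href="/search/tokyo.php">Tokyo</a></td>
  <td class="Other"><a href="/search/elsewhere.php">Elsewhere</a></td>
</tr></table>
</body></html>
HTML

# Same two rules as in the script above.
my $scraper = scraper {
    process '//area[@shape="RECT"]',             'prefs[]' => '@href';
    process '//td[@class="SelectFromPlace"]//a', 'citys[]' => '@href';
};

# scrape() also accepts an HTML string; the second argument is the base URI
# used to turn the href values into absolute URI objects.
my $base   = URI->new('http://www.starbucks.co.jp/search/index.html/');
my $result = $scraper->scrape($html, $base);

print "prefs: $_\n" for @{ $result->{prefs} };
print "citys: $_\n" for @{ $result->{citys} };
```

< p >
The CIRCLE area and the link outside the < code >SelectFromPlace< / code > cell are skipped because neither XPath predicate matches them.
< / p >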
< h4 >
Output
< / h4 >
< pre class = "syntax-highlight" >
kazu634@srv634% perl 20090904225211_starbucks.pl
---
citys:
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/tokyo.php
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/kanagawa.php
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/osaka.php
prefs:
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%8C%97%E6%B5%B7%E9%81%93
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%9D%92%E6%A3%AE%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B2%A9%E6%89%8B%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%AE%AE%E5%9F%8E%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%A7%8B%E7%94%B0%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B1%B1%E5%BD%A2%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%A6%8F%E5%B3%B6%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E8%8C%A8%E5%9F%8E%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E6%A0%83%E6%9C%A8%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%BE%A4%E9%A6%AC%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%9F%BC%E7%8E%89%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%8D%83%E8%91%89%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/tokyo.php
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/kanagawa.php
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B1%B1%E6%A2%A8%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%95%B7%E9%87%8E%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E6%96%B0%E6%BD%9F%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%AF%8C%E5%B1%B1%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%9F%B3%E5%B7%9D%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%A6%8F%E4%BA%95%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E6%BB%8B%E8%B3%80%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E4%BA%AC%E9%83%BD%E5%BA%9C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/osaka.php
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%85%B5%E5%BA%AB%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%92%8C%E6%AD%8C%E5%B1%B1%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%A5%88%E8%89%AF%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%B3%A5%E5%8F%96%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B3%B6%E6%A0%B9%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B2%A1%E5%B1%B1%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%BA%83%E5%B3%B6%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B1%B1%E5%8F%A3%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%A6%8F%E5%B2%A1%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E4%BD%90%E8%B3%80%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%95%B7%E5%B4%8E%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E7%86%8A%E6%9C%AC%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%A4%A7%E5%88%86%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%AE%AE%E5%B4%8E%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%B9%BF%E5%85%90%E5%B3%B6%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E6%B2%96%E7%B8%84%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%BE%B3%E5%B3%B6%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%A6%99%E5%B7%9D%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E6%84%9B%E5%AA%9B%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%AB%98%E7%9F%A5%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E9%9D%99%E5%B2%A1%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E6%84%9B%E7%9F%A5%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%B2%90%E9%98%9C%E7%9C%8C
- !!perl/scalar:URI::http http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E4%B8%89%E9%87%8D%E7%9C%8C
< / pre >
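< p >
Most of the entries under < code >prefs< / code > carry the prefecture name percent-encoded in the < code >SearchPerfecture< / code > parameter (the spelling is the site's own). As a rough sketch, the name can be recovered with URI's < code >query_form< / code > and Encode. The helper name < code >prefecture_name< / code > is hypothetical, and the three URLs are copied from the output above.
< / p >

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;
use Encode qw(decode);

binmode STDOUT, ':encoding(UTF-8)';

# Three URLs copied from the output above.
my @prefs = (
    'http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E5%8C%97%E6%B5%B7%E9%81%93',
    'http://www.starbucks.co.jp/search/tokyo.php',
    'http://www.starbucks.co.jp/search/result_city2.php?SearchPerfecture=%E4%BA%AC%E9%83%BD%E5%BA%9C',
);

# Hypothetical helper: return the decoded prefecture name, or undef when the
# URL has no SearchPerfecture parameter (as with tokyo.php).
sub prefecture_name {
    my ($url) = @_;
    my %query = URI->new($url)->query_form;    # values come back as unescaped bytes
    my $bytes = $query{SearchPerfecture};
    return defined $bytes ? decode('UTF-8', $bytes) : undef;
}

for my $url (@prefs) {
    my $name = prefecture_name($url);
    printf "%s => %s\n", $url, defined $name ? $name : '(no parameter)';
}
```

< p >
Here the first URL decodes to 北海道 and the third to 京都府, while < code >tokyo.php< / code > has no parameter to decode and would need separate handling.
< / p >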
< h4 >
Recent entries related to Starbucks
< / h4 >
< ul >
< li >
< a href = "http://d.hatena.ne.jp/sirocco634/20081214/1229221635" onclick = "__gaTracker('send', 'event', 'outbound-article', 'http://d.hatena.ne.jp/sirocco634/20081214/1229221635', ' この考えに同意 – 武蔵の日記');" target = "_blank" > I agree with this idea – 武蔵の日記< / a >
< / li >
< li >
< a href = "http://d.hatena.ne.jp/sirocco634/20080423/1208960605" onclick = "__gaTracker('send', 'event', 'outbound-article', 'http://d.hatena.ne.jp/sirocco634/20080423/1208960605', ' 戸塚modi – 武蔵の日記');" target = "_blank" > Totsuka modi – 武蔵の日記< / a >
< / li >
< / ul >
< / div >