- Fetching a URL page with urlfetch on Google App Engine (GAE)
- Fetching a URL page with urllib2, a package in the Python 2 standard library
def get_url(word):
    # Try the GAE urlfetch service first; fall back to urllib2 when it is unavailable.
    try:
        return get_url_gae(word)
    except Exception:
        return get_url_local(word)

def get_url_gae(word):
    from google.appengine.api import urlfetch
    url = 'http://abc.com/%s' % (word)
    # Changing User-Agent
    # https://developers.google.com/appengine/docs/python/urlfetch/#Request_Headers
    # http://stackoverflow.com/questions/2743521/how-to-change-user-agent-on-google-app-engine-urlfetch-service
    # headers must be a dict (key: value), not a set
    result = urlfetch.fetch(url=url, headers={"User-Agent": "jjjj/0.1 (2001-07-14)"})
    if result.status_code == 200:
        return result.content.decode('utf-8')
    else:
        return ''

def get_url_local(word):
    import urllib2
    url = 'http://abc.com/%s' % (word)
    req = urllib2.Request(url)
    req.add_header("User-agent", "jjjj/0.1 (2001-07-14)")
    fp = urllib2.urlopen(req)
    return fp.read().decode('utf-8')
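The urllib2 module only exists in Python 2. For readers on Python 3, the local fetch could be sketched roughly as follows using urllib.request. This is a minimal sketch, not part of the original post: the helper names `build_request` and `fetch`, the `timeout` parameter, and the URL-quoting of `word` are assumptions added for illustration, and `http://abc.com/` is the same placeholder host used in the post.

```python
import urllib.parse
import urllib.request

# Same User-Agent string as in the post.
USER_AGENT = "jjjj/0.1 (2001-07-14)"


def build_request(word, base_url="http://abc.com/%s"):
    """Build a Request carrying a custom User-Agent. No network access happens here."""
    # Quoting the word is an assumption; it keeps spaces and non-ASCII text valid in the URL.
    url = base_url % urllib.parse.quote(word)
    # headers is a dict mapping header names to values.
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(word, timeout=10):
    """Fetch the page and decode it as UTF-8 (hypothetical helper, performs a real request)."""
    req = build_request(word)
    with urllib.request.urlopen(req, timeout=timeout) as fp:
        return fp.read().decode("utf-8")
```

Splitting request construction from the actual fetch makes it possible to check the URL and headers without touching the network.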
WRITTEN BY
- manager@
Data Analysis, Text/Knowledge Mining, Python, Cloud Computing, Platform