Timeout error in Python geopy geocoder
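I am a relatively new Python user, and I am trying to write a function that uses the geopy module to return the latitude and longitude for a city and country. I was getting errors because some of my cities are misspelled, which I have managed to catch. The trouble now is that I am running into a timeout error. I have read the question "Geopy: catch timeout error" and adjusted my timeout parameter accordingly. However, the script now runs for varying lengths of time before I get the timeout error. I tried running it on a faster network, which helped to some extent. The problem is that I need to do this for 100k rows, and the most rows it has iterated through before timing out is 20k. Any help or advice on how to resolve this would be much appreciated.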
import os
from geopy.geocoders import Nominatim
os.getcwd() #check current working directory
os.chdir("C:UsersPhilipDocumentsHDSDA1ProjectGlobal Terrorism Database")
#import file as a csv
import csv
gtd=open("gtd_original.csv","r")
csv_f=csv.reader(gtd)
outf=open("r_ready.csv","wb")
writer=csv.writer(outf,dialect='excel')
for row in csv_f:
    if row[13] in ("","NA") or row[14] in ("","NA"):
        lookup = row[12] + "," + row[8] # creates a city,country
        geolocator = Nominatim()
        location = geolocator.geocode(lookup, timeout = None) #looks up the city/country on maps
        try:
            location.latitude
        except:
            lookup = row[8]
            location = geolocator.geocode(lookup)
        row[13] = location.latitude
        row[14] = location.longitude
    writer.writerow(row)
gtd.close()
outf.close()
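Answer
I suspect you are exceeding the usage policy of the Nominatim service (http://wiki.openstreetmap.org/wiki/Nominatim_usage_policy). Try putting a 1-second sleep between requests and caching the results, since there are probably many duplicates.
Sleep part: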
from time import sleep
### your code
row[14] = location.longitude
sleep(1) # after last line in if
Caching:
coords = {}
if (row[8], row[12]) in coords: # a tuple key is hashable; a list is not
    row[13], row[14] = coords[(row[8], row[12])]
else:
    # geolocate here and store the result in coords
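The caching idea can be sketched on its own, assuming the geocoder is passed in as a plain callable. The names `make_cached_geocoder` and `fake_geocode` are hypothetical and exist only for illustration; the stand-in geocoder makes no network requests.

```python
def make_cached_geocoder(geocode):
    """Wrap a geocode callable so each (country, city) pair is looked up once."""
    cache = {}

    def lookup(country, city):
        key = (country, city)  # tuples are hashable, so they can be dict keys
        if key not in cache:
            cache[key] = geocode(city + "," + country)
        return cache[key]

    return lookup


# Stand-in for a real geocoder (hypothetical), used only to count requests.
calls = []

def fake_geocode(query):
    calls.append(query)
    return (0.0, 0.0)

cached = make_cached_geocoder(fake_geocode)
first = cached("Ireland", "Dublin")
second = cached("Ireland", "Dublin")  # served from the cache, no second request
print(len(calls))  # prints 1
```

With a real geocoder, the callable passed in would be something like the `geolocator.geocode` method used in the question.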
Update
Performance: 1 request/s -> 3,600 requests/hour -> 36K requests per 10 hours
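To put that rate in context for the 100k rows mentioned in the question, here is a rough estimate of the running time. The function `hours_needed` and the 50% duplicate fraction are assumptions chosen for illustration, not figures from the dataset.

```python
def hours_needed(rows_to_look_up, duplicate_fraction=0.0, seconds_per_request=1.0):
    """Estimate total hours when only unique lookups cost a request."""
    unique_lookups = rows_to_look_up * (1.0 - duplicate_fraction)
    return unique_lookups * seconds_per_request / 3600.0

no_cache = round(hours_needed(100000), 1)
half_cached = round(hours_needed(100000, duplicate_fraction=0.5), 1)
print(no_cache)     # 27.8
print(half_cached)  # 13.9
```

Only rows with missing coordinates trigger a lookup, so the real figure depends on how many rows need one and how many of those repeat.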
import os
from time import sleep
from geopy.geocoders import Nominatim
os.getcwd() #check current working directory
os.chdir("C:UsersPhilipDocumentsHDSDA1ProjectGlobal Terrorism Database")
#import file as a csv
import csv
gtd=open("gtd_original.csv","r")
csv_f=csv.reader(gtd)
outf=open("r_ready.csv","wb")
writer=csv.writer(outf,dialect='excel')
coords = {}
for row in csv_f:
    if row[13] in ("","NA") or row[14] in ("","NA"):
        lookup = row[12] + "," + row[8] # creates a city,country
        if (row[8], row[12]) in coords: ## test if result is already cached
            row[13] , row[14] = coords[ (row[8], row[12]) ]
        else:
            geolocator = Nominatim()
            location = geolocator.geocode(lookup, timeout = None) #looks up the city/country on maps
            try:
                location.latitude
            except:
                lookup = row[8]
                location = geolocator.geocode(lookup)
            row[13] = location.latitude
            row[14] = location.longitude
            coords[ (row[8], row[12]) ] = (row[13] , row[14]) # cache the new coords
            sleep(1) # sleep for 1 sec (required by Nominatim usage policy)
    writer.writerow(row)
gtd.close()
outf.close()
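The script above is written for Python 2. The following is a minimal sketch of the same loop in Python 3 style, assuming the geocoder is passed in as a callable that returns an object with `.latitude` and `.longitude`, or `None` when nothing is found. The names `fill_coordinates`, `Location` and `fake_geocode` are hypothetical; the column positions are the ones used in the question.

```python
import time
from collections import namedtuple

COUNTRY, CITY, LAT, LON = 8, 12, 13, 14  # column positions from the question


def fill_coordinates(rows, geocode, delay=1.0):
    """Yield rows with missing latitude/longitude filled in where possible."""
    cache = {}
    for row in rows:
        if row[LAT] in ("", "NA") or row[LON] in ("", "NA"):
            key = (row[COUNTRY], row[CITY])
            if key not in cache:
                location = geocode(row[CITY] + "," + row[COUNTRY])
                time.sleep(delay)  # pause after every real request
                if location is None:
                    # City not found: fall back to the country alone.
                    location = geocode(row[COUNTRY])
                    time.sleep(delay)
                cache[key] = location
            location = cache[key]
            if location is not None:  # leave the row unchanged if nothing matched
                row[LAT], row[LON] = location.latitude, location.longitude
        yield row


# Offline demonstration with a stand-in geocoder (no network access).
Location = namedtuple("Location", "latitude longitude")

def fake_geocode(query):
    return Location(53.35, -6.26) if query == "Dublin,Ireland" else None

row = [""] * 15
row[COUNTRY], row[CITY], row[LAT], row[LON] = "Ireland", "Dublin", "NA", "NA"
filled = list(fill_coordinates([row], fake_geocode, delay=0))
print(filled[0][LAT], filled[0][LON])
```

With geopy, the callable could be `geolocator.geocode`, and the rows could come from `csv.reader` over a file opened with `newline=""`. Recent geopy releases also expect a custom `user_agent` argument when constructing `Nominatim`, so check the version you have installed.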
Another answer
You can catch the GeocoderTimedOut exception.
Here is an example function that may help:
from geopy.geocoders import Nominatim
from geopy.exc import GeocoderTimedOut
def do_geocode(address):
    geolocator = Nominatim() # renamed so it does not shadow the geopy module
    try:
        return geolocator.geocode(address)
    except GeocoderTimedOut:
        return do_geocode(address) # retry on timeout
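It is very simple: if a timeout occurs, the function calls itself again and retries, with no limit on the number of attempts. Hope this helps.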
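Because the recursive version never gives up, a persistent outage would eventually hit Python's recursion limit. One alternative is a bounded retry with growing pauses, sketched below. The function `geocode_with_retry` and its parameters are hypothetical, and the geocoder and the exception type are passed in so the sketch runs without geopy installed.

```python
import time


def geocode_with_retry(geocode, query, retry_on, max_attempts=3, base_delay=1.0):
    """Call geocode(query), retrying on the given exception type(s)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return geocode(query)
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Wait longer after each failure: base_delay, then 2x, then 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))


# Offline demonstration: a stand-in geocoder that times out twice, then succeeds.
attempts = []

def flaky(query):
    attempts.append(query)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"

result = geocode_with_retry(flaky, "Dublin,Ireland", retry_on=TimeoutError, base_delay=0)
print(result, len(attempts))  # prints: ok 3
```

With geopy, `retry_on` would be the `GeocoderTimedOut` class imported in the answer above, and `geocode` would be the geolocator's `geocode` method.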