How to get data from an API in R?
Posted: 2021-08-13 17:30:34
Question:
I am new to APIs. I came across a small piece of Python code that retrieves data, and I would like to replicate it in R.
Python code:
import requests
import json
from datetime import date
import time
import smtplib, ssl
# API URL
url = 'http://cdn-api.co-vin.in/api/v2/admin/location/states'
# Request headers must be passed as a dict
headers = {'accept': 'application/json', 'Accept-Language': 'hi_IN', 'User-Agent': 'Mozilla/4.0'}
result = requests.get(url, headers=headers)
# Print the response body (state IDs and names)
print(result.content.decode())
Result:
{"states":[{"state_id":1,"state_name":"Andaman and Nicobar Islands"},{"state_id":2,"state_name":"Andhra Pradesh"},{"state_id":3,"state_name":"Arunachal Pradesh"},{"state_id":4,"state_name":"Assam"},{"state_id":5,"state_name":"Bihar"},{"state_id":6,"state_name":"Chandigarh"},{"state_id":7,"state_name":"Chhattisgarh"},{"state_id":8,"state_name":"Dadra and Nagar Haveli"},{"state_id":37,"state_name":"Daman and Diu"},{"state_id":9,"state_name":"Delhi"},{"state_id":10,"state_name":"Goa"},{"state_id":11,"state_name":"Gujarat"},{"state_id":12,"state_name":"Haryana"},{"state_id":13,"state_name":"Himachal Pradesh"},{"state_id":14,"state_name":"Jammu and ... Bengal"}],"ttl":24}
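Because the response body is JSON, it can be decoded into a lookup table instead of being printed as raw text. The following is a minimal sketch, assuming a payload shaped like the result above. `sample_body` is a hand-copied excerpt of three states rather than a live response, and `states_by_id` is a hypothetical helper name.

```python
import json

# Excerpt shaped like the API result above (three states copied from it).
sample_body = (
    '{"states":['
    '{"state_id":4,"state_name":"Assam"},'
    '{"state_id":5,"state_name":"Bihar"},'
    '{"state_id":6,"state_name":"Chandigarh"}'
    '],"ttl":24}'
)


def states_by_id(body: str) -> dict:
    """Decode a JSON response body and map state_id to state_name."""
    payload = json.loads(body)
    return {state["state_id"]: state["state_name"] for state in payload["states"]}


state_names = states_by_id(sample_body)
print(state_names)
```

With a live response, `result.content.decode()` would take the place of `sample_body`.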
API information:
URL: 'http://cdn-api.co-vin.in/api/v2/admin/location/states'
Source: https://apisetu.gov.in/public/marketplace/api/cowin#/Metadata%20APIs/states
Source: https://github.com/cowinapi/developer.cowin/issues/339
(Note: the CoWIN API can only be accessed from within India, so I expect many of you will not be able to call it. Suggested code changes would still help.)
R
I have searched on Google and tried the snippets below, but none of them has worked so far:
library(tidyverse)
library(rjson)
library(jsonlite)
library(RCurl)
library(httr)
states_url = 'http://cdn-api.co-vin.in/api/v2/admin/location/states'
headers = c('accept' = 'application/json',
'Accept-Language' = 'hi_IN',
'User-Agent' = 'Mozilla/4.0')
url(states_url, headers = headers)
GET(states_url)$content
GET(states_url)$headers
Update
I tried the following. It gives no error, but I do not know what to do next:
states_url = 'http://cdn-api.co-vin.in/api/v2/admin/location/states'
headers = c('accept' = 'application/json',
'Accept-Language' = 'hi_IN',
'User-Agent' = 'Mozilla/4.0')
url(states_url, headers = headers)
A connection with
description "http://cdn-api.co-vin.in/api/v2/admin/location/states"
class       "url-wininet"
mode        "r"
text        "text"
opened      "closed"
can read    "yes"
can write   "no"
GET(states_url, header = headers)$content
[1] 3c 21 44 4f 43 54 59 50 45 20 48 54 4d 4c 20 50 55 42 4c
[20] 49 43 20 22 2d 2f 2f 57 33 43 2f 2f 44 54 44 20 48 54 4d
[39] 4c 20 34 2e 30 31 20 54 72 61 6e 73 69 74 69 6f 6e 61 6c
[58] 2f 2f 45 4e 22 20 22 68 74 74 70 3a 2f 2f 77 77 77 2e 77
[77] 33 2e 6f 72 67 2f 54 52 2f 68 74 6d 6c 34 2f 6c 6f 6f 73
[96] 65 2e 64 74 64 22 3e 0a 3c 48 54 4d 4c 3e 3c 48 45 41 44
[115] 3e 3c 4d 45 54 41 20 48 54 54 50 2d 45 51 55 49 56 3d 22
[134] 43 6f 6e 74 65 6e 74 2d 54 79 70 65 22 20 43 4f 4e 54 45
[153] 4e 54 3d 22 74 65 78 74 2f 68 74 6d 6c 3b 20 63 68 61 7
str(GET(states_url))
List of 10
$ url : chr "http://cdn-api.co-vin.in/api/v2/admin/location/states"
$ status_code: int 403
$ headers :List of 9
..$ server : chr "CloudFront"
..$ date : chr "Tue, 25 May 2021 12:39:02 GMT"
..$ content-type : chr "text/html"
..$ content-length: chr "919"
..$ connection : chr "keep-alive"
..$ x-cache : chr "Error from cloudfront"
..$ via : chr "1.1 85ad220378d99bdabeb6c46016f1cf16.cloudfront.net (CloudFront)"
..$ x-amz-cf-pop : chr "BOM51-C1"
..$ x-amz-cf-id : chr "eeJq5ZtSJHZkLGoJZBTUL2xL5PcU2gjesnY7Qmg_kMnxZxZ1JUHPWA=="
..- attr(*, "class")= chr [1:2] "insensitive" "list"
$ all_headers:List of 1
..$ :List of 3
.. ..$ status : int 403
.. ..$ version: chr "HTTP/1.1"
.. ..$ headers:List of 9
.. .. ..$ server : chr "CloudFront"
.. .. ..$ date : chr "Tue, 25 May 2021 12:39:02 GMT"
.. .. ..$ content-type : chr "text/html"
.. .. ..$ content-length: chr "919"
.. .. ..$ connection : chr "keep-alive"
.. .. ..$ x-cache : chr "Error from cloudfront"
.. .. ..$ via : chr "1.1 85ad220378d99bdabeb6c46016f1cf16.cloudfront.net (CloudFront)"
.. .. ..$ x-amz-cf-pop : chr "BOM51-C1"
.. .. ..$ x-amz-cf-id : chr "eeJq5ZtSJHZkLGoJZBTUL2xL5PcU2gjesnY7Qmg_kMnxZxZ1JUHPWA=="
.. .. ..- attr(*, "class")= chr [1:2] "insensitive" "list"
$ cookies :'data.frame': 0 obs. of 7 variables:
..$ domain : logi(0)
..$ flag : logi(0)
..$ path : logi(0)
..$ secure : logi(0)
..$ expiration: 'POSIXct' num(0)
..$ name : logi(0)
..$ value : logi(0)
$ content : raw [1:919] 3c 21 44 4f ...
$ date : POSIXct[1:1], format: "2021-05-25 12:39:02"
$ times : Named num [1:6] 0 0.242 0.283 0.284 0.321 ...
..- attr(*, "names")= chr [1:6] "redirect" "namelookup" "connect" "pretransfer" ...
$ request :List of 7
..$ method : chr "GET"
..$ url : chr "http://cdn-api.co-vin.in/api/v2/admin/location/states"
..$ headers : Named chr "application/json, text/xml, application/xml, */*"
.. ..- attr(*, "names")= chr "Accept"
..$ fields : NULL
..$ options :List of 2
.. ..$ useragent: chr "libcurl/7.64.1 r-curl/4.3.1 httr/1.4.2"
.. ..$ httpget : logi TRUE
..$ auth_token: NULL
..$ output : list()
.. ..- attr(*, "class")= chr [1:2] "write_memory" "write_function"
..- attr(*, "class")= chr "request"
$ handle :Class 'curl_handle' <externalptr>
- attr(*, "class")= chr "response"
http_status(GET(states_url))
$category
[1] "Client error"
$reason
[1] "Forbidden"
$message
[1] "Client error: (403) Forbidden"
stringi::stri_enc_detect(GET(states_url, header = headers)$content)
[[1]]
Encoding Language Confidence
1 ISO-8859-1 en 0.54
2 ISO-8859-2 ro 0.26
3 UTF-8 0.15
4 UTF-16BE 0.10
5 UTF-16LE 0.10
6 Shift_JIS ja 0.10
7 GB18030 zh 0.10
8 EUC-JP ja 0.10
9 EUC-KR ko 0.10
10 Big5 zh 0.10
11 ISO-8859-9 tr 0.06
12 IBM424_rtl he 0.02
13 IBM424_ltr he 0.01
content(GET(states_url, header = headers), encoding = "UTF-8")
html_document
<html>
[1] <head>\n<meta http-equiv="Content-Type" content="text/htm ...
[2] <body>\n<h1>403 ERROR</h1>\n<h2>The request could not be ...
content(GET(states_url, header = headers), encoding = "ISO-8859-1")
html_document
<html>
[1] <head>\n<meta http-equiv="Content-Type" content="text/htm ...
[2] <body>\n<h1>403 ERROR</h1>\n<h2>The request could not be ...
Comments:
Try rawToChar on your raw output (GET(..)$content).
Doing that, I get [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">\n<HTML><HEAD><META HTTP-EQUIV=\"Content-Type\" CONTENT=\"text/html; charset=iso-8859-1\">\n<TITLE>ERROR: The request could not be satisfied</TITLE>\n</HEAD><BODY>\n<H1>403 ERROR</H1>\n<H2>The request could not be satisfied.</H2>\n<HR noshade size=\"1px\">\nRequest blocked.\nWe can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.\n< ......
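@r2evans I had already seen the 403 error in the status code, and I am not sure why I get it in R when the same request works fine in Python.
See my other comment about the suggested need for an OTP. The code in your other question is missing that component, so I can only imagine that the Python code works because you somehow included it earlier (or because Python's requests found or cached it from somewhere else).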
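You call url(states_url, headers = headers), which includes the headers, but that does nothing for the subsequent GET(.), where you do not use the headers. Your call to url(.) is completely useless, other than reassuring you that something is there. Nothing in the url(.) call persists or is used anywhere else. Remove your call to url(.) and then run GET(states_url, add_headers(headers)).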
Answer 1:
You define headers, but you never include them in your call to GET. Use them there:
GET(states_url, add_headers(headers))
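One way to check that headers really travel with a request is to build the request object without sending it and inspect what is attached. The sketch below does this in Python with the standard library's `urllib.request`. It is an illustration of the idea only: it is not part of the original answer and it does not use httr. Because nothing is sent, it also runs outside India.

```python
from urllib.request import Request

states_url = "http://cdn-api.co-vin.in/api/v2/admin/location/states"
headers = {
    "accept": "application/json",
    "Accept-Language": "hi_IN",
    "User-Agent": "Mozilla/4.0",
}

# Build the request object only; nothing is sent over the network here.
request = Request(states_url, headers=headers)

# urllib stores header names via str.capitalize(), e.g. "User-agent".
attached = dict(request.header_items())
print(attached)
```

The str() output in the question allows the same kind of check on the R side: its request headers list only Accept and its useragent is the libcurl default, which is consistent with the custom headers never having been sent.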