R: Parsing the CapFriendly NHL Salary Website


options(stringsAsFactors = FALSE)

## load the packages
library(tidyverse)
library(rvest)

## settings
BASE_URL = "https://www.capfriendly.com/browse/active/2015&p=%s"
TABLE_NAME = "brwt"
TABLE_XPATH = '//*[@id="brwt"]'
PAGES = 1:29  ## there are 29 pages via the browser

## dataset to store our results
salaries = data.frame()

## loop and parse the data
for (i in PAGES) { 
  ## build the URL and get the page
  URL = sprintf(BASE_URL, i)
  resp = URL %>% read_html() %>% html_nodes(xpath=TABLE_XPATH)
  stats = resp[[1]] %>% html_table()
  ## cast every column to character so pages whose columns were parsed as
  ## different types can still be bound together
  stats2 = mutate(stats, across(everything(), as.character))
  salaries = bind_rows(salaries, stats2)
  ## cleanup
  rm(URL, resp, stats, stats2)
  ## status
  cat("finished ", i, "\n")
} #endfor


## save out the dataset
write_csv(salaries, "~/Downloads/salaries1415.csv", na="")
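
The cast to character inside the loop matters because bind_rows() refuses to stack columns of incompatible types. The sketch below illustrates this with two made-up data frames; page_a, page_b, and the PLAYER and SALARY columns are hypothetical stand-ins for scraped pages, not CapFriendly's actual column names.

```r
library(dplyr)

## two made-up "pages": the same column parsed as different types
page_a = data.frame(PLAYER = "Player A", SALARY = 1000000)
page_b = data.frame(PLAYER = "Player B", SALARY = "$1,000,000")

## binding directly fails, because SALARY is numeric in one frame and
## character in the other; the error is caught here and turned into NULL
direct = tryCatch(bind_rows(page_a, page_b), error = function(e) NULL)

## casting every column to character first lets the rows stack
toy = bind_rows(
  mutate(page_a, across(everything(), as.character)),
  mutate(page_b, across(everything(), as.character))
)
```

The price is that numeric columns end up as strings in the CSV and have to be converted back after loading.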
