Learning R
Posted by gonghaiyu
R Packages
Checking available R packages
.libPaths()
Running the code above prints the library locations in which R looks for packages. The result depends on the local setup of the machine.
Getting the list of installed packages
library()
Called without arguments, library() lists all installed packages. The list depends on the local setup of the machine.
search()
search() lists the packages and environments currently attached to the session. The result again depends on the local setup.
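The three calls above can be combined into a quick check of the local environment. This is a minimal sketch; the variable names (lib_paths, installed, attached) are illustrative and not part of the original text, and the printed values differ from machine to machine.

```r
# Library locations that R searches for packages (note the leading dot).
lib_paths <- .libPaths()
print(lib_paths)

# Names of all installed packages, taken from the row names of the
# matrix returned by installed.packages().
installed <- rownames(installed.packages())
print(head(installed))

# Packages and environments currently attached to the session.
attached <- search()
print(attached)
```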
Installing a new package
Install directly from CRAN
install.packages("Package Name")
For example, the XML package is installed as follows:
install.packages("XML")
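Install a package manually
First download the package archive to the local machine. Once the download has finished, use the following command:
install.packages(file_name_with_path, repos = NULL, type = "source")
For example, to install from a downloaded xml2 archive (a .zip archive is a Windows binary package, so the type here is "win.binary" rather than "source"):
install.packages("C:\\Users\\ajeet\\OneDrive\\Desktop\\graphics\\xml2_1.2.2.zip", repos = NULL, type = "win.binary")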
Loading a package into the session
library("package Name", lib.loc = "path to library")
Command to load the XML package:
library("XML")
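Installing and loading are often combined so that a package is installed only when it is missing. The sketch below shows one common way to do this; load_or_install is a hypothetical helper name, and "stats" is used only because it ships with R, so the example runs without network access.

```r
# Hypothetical helper: install a package only if it is missing, then attach it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs a configured CRAN mirror and network access
  }
  library(pkg, character.only = TRUE)
  invisible(pkg %in% loadedNamespaces())
}

# "stats" is part of every R installation, so nothing is downloaded here.
ok <- load_or_install("stats")
print(ok)
```

The character.only = TRUE argument is needed because library() otherwise treats its first argument as a literal package name rather than a variable.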
List of R packages
Some of the most widely used packages are as follows:
1) tidyr
The name tidyr comes from the word tidy, meaning clean and orderly, and the package is used to make data "tidy". It works well with dplyr and is an evolution of the reshape2 package.
2) ggplot2
R allows us to create graphics declaratively through the ggplot2 package. It is known for elegant, high-quality graphs, which sets it apart from other visualization packages.
3) ggraph
ggraph is an extension of ggplot2. It removes ggplot2's dependence on tabular data.
4) dplyr
dplyr is used for data wrangling and data analysis. It provides a set of functions for working with data frames in R.
5) tidyquant
tidyquant is a financial package for quantitative financial analysis. It extends the tidyverse with tools for importing, analyzing, and visualizing financial data.
6) dygraphs
The dygraphs package provides an R interface to the dygraphs JavaScript charting library. It is mainly used for plotting time-series data.
7) leaflet
For interactive maps, R provides the leaflet package, an interface to the open-source Leaflet JavaScript library. Popular websites such as the New York Times, GitHub, and Flickr use Leaflet. The package makes it easy to use the library from R.
8) ggmap
ggmap is used for spatial visualization. It is a mapping package with tools for geolocation and routing.
9) glue
R provides the glue package for building strings during data wrangling. It evaluates R expressions embedded in a string and inserts the results.
10) shiny
The shiny package lets us develop interactive, good-looking web apps in R. It can be extended with HTML widgets, CSS, and JavaScript.
11) plotly
The plotly package provides interactive, high-quality graphs for the web. It builds on the JavaScript library plotly.js.
12) tidytext
The tidytext package provides text-mining functions for word processing and analysis that work together with ggplot2, dplyr, and other tidy tools.
13) stringr
The stringr package provides simple, consistent wrappers around the stringi package, which implements common string operations.
14) reshape2
This package supports flexible reorganization and aggregation of data with the melt() and dcast() functions.
15) dichromat
The dichromat package is used to remove red-green or green-blue contrasts from colors, which simulates the effect of color blindness.
16) digest
The digest package is used to create cryptographic hash digests of R objects.
17) MASS
The MASS package provides a large number of statistical functions, together with the datasets that accompany the book "Modern Applied Statistics with S".
18) caret
The caret package is used for classification and regression tasks. caretEnsemble is a companion package for combining different caret models.
19) e1071
The e1071 library provides functions that are useful in data analysis, such as Naive Bayes, Fourier transforms, SVMs, clustering, and other miscellaneous functions.
20) sentimentr
The sentimentr package provides functions for sentiment analysis. It calculates text polarity at the sentence level and can aggregate the results by rows or grouping variables.
Data Structures in R Programming
R has many data structures, which include:
Vectors
“A vector is a collection of elements, most commonly of mode character, integer, logical, or numeric.” A vector can be one of the following two types:
1、Atomic vector
2、Lists
List
A list is a special type of vector in which each element can be a different type.
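The difference between an atomic vector and a list can be seen in a few lines. This is an illustrative sketch; the variable names and values are made up for the example.

```r
# Atomic vector: every element has the same type.
v <- c(1.5, 2, 3.25)
print(typeof(v))        # "double"

# Mixing types in c() coerces everything to the most general type.
mixed <- c(1, "a", TRUE)
print(typeof(mixed))    # "character"

# List: each element keeps its own type.
l <- list(id = 1L, name = "R", flags = c(TRUE, FALSE))
print(sapply(l, typeof))
```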
Arrays
An array holds elements of a single data type arranged in one or more dimensions.
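As a small illustration, the sketch below builds a three-dimensional array with the base array() function; the name arr and the chosen dimensions are arbitrary.

```r
# A 2 x 3 x 2 array: two 2 x 3 layers, filled column by column.
arr <- array(1:12, dim = c(2, 3, 2))
print(dim(arr))

# Element in row 1, column 2 of the second layer.
print(arr[1, 2, 2])   # 9
```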
Matrices
Syntax
The basic syntax for creating a matrix is as follows:
matrix(data, nrow, ncol, byrow, dimnames)
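In base R the arguments of matrix() are named data, nrow, ncol, byrow, and dimnames. The following sketch uses all five; the row and column labels are invented for the example.

```r
m <- matrix(
  data = 1:6,
  nrow = 2,
  ncol = 3,
  byrow = TRUE,                                    # fill row by row
  dimnames = list(c("r1", "r2"), c("a", "b", "c")) # row names, column names
)
print(m)

# Elements can be addressed by name once dimnames are set.
print(m["r2", "b"])   # 5
```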
Data Frames
A data frame is a two-dimensional, array-like structure: a table in which each column holds the values of one variable and each row holds one value from each column.
A data frame has the following characteristics:
1、Column names must be non-empty.
2、Row names must be unique.
3、A data frame can store numeric, factor, or character data.
4、Every column must contain the same number of data items.
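The characteristics above can be checked on a small example. This is a sketch with made-up data; the column names name, dept, and salary are chosen only for illustration.

```r
emp <- data.frame(
  name   = c("Ana", "Ben", "Cy"),
  dept   = c("IT", "HR", "IT"),
  salary = c(700, 550, 620),
  stringsAsFactors = FALSE   # keep text columns as character
)

str(emp)            # one line per column with its type
print(dim(emp))     # rows, then columns
print(emp$salary)   # a single column is an ordinary vector
```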
Factors
Factors are data objects used to categorize data and store it as levels. They can store both strings and integers. Factors are most useful for columns that have a limited number of unique values, and they are widely used in data analysis for statistical modeling.
Factors are created with the factor() function, which takes a vector as its input.
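A short sketch of factor() in use follows. The sizes vector and the explicit level order are assumptions made for the example.

```r
sizes <- c("small", "large", "medium", "small", "large")

# Passing levels explicitly fixes their order; otherwise they are sorted.
f <- factor(sizes, levels = c("small", "medium", "large"))

print(levels(f))
print(as.integer(f))   # underlying integer codes: 1 3 2 1 3
print(table(f))        # count of values per level
```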
Keywords in R Programming
R Statements
R Switch Statement
The basic syntax of the switch statement is as follows:
switch(expression, case1, case2, case3, ...)
ax <- 1
bx <- 2
# ax + bx evaluates to 3, so switch() returns the third alternative
y <- switch(
  ax + bx,
  "Hello, Shubham",
  "Hello Arpita",
  "Hello Vaishali",
  "Hello Nishka"
)
print(y)
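switch() also accepts a character expression, in which case the alternatives are matched by name and a final unnamed argument serves as the default. The sketch below illustrates this; describe_dept and the department labels are hypothetical.

```r
# Hypothetical helper that maps a department code to a label.
describe_dept <- function(code) {
  switch(code,
    "IT" = "Information Technology",
    "HR" = "Human Resources",
    "Unknown department"   # unnamed last argument acts as the default
  )
}

print(describe_dept("IT"))
print(describe_dept("XX"))
```

With a numeric expression there is no default: an index outside the range of alternatives makes switch() return NULL invisibly.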
R next Statement
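The next statement skips the rest of the current iteration and moves on to the next one; it is the equivalent of continue in other languages. Its syntax is:
next
a <- 1
repeat {
  if (a == 10)
    break
  if (a == 5) {
    # Increment before skipping; otherwise a stays at 5 and the loop never ends
    a <- a + 1
    next
  }
  print(a)
  a <- a + 1
}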
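In a for loop the counter advances automatically, so next can be used without any manual increment. A minimal sketch, with printed as an illustrative variable name:

```r
printed <- c()
for (i in 1:6) {
  if (i %% 2 == 0) {
    next            # skip even numbers and go to the next iteration
  }
  printed <- c(printed, i)
}
print(printed)      # 1 3 5
```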
R For Loop
In R, a for loop is defined as follows:
It starts with the keyword for, as in C or C++.
Instead of declaring and initializing a loop counter, we name a loop variable, followed by the keyword in, which is then followed by the vector, list, or matrix to iterate over.
In the loop body, use the loop variable rather than an indexed element.
The syntax of the for loop in R is:
for (value in vector) {
  statements
}
# Creating a matrix
mat <- matrix(data = seq(10, 21, by = 1), nrow = 6, ncol = 2)
# Looping with r and c to iterate over the rows and columns of the matrix
for (r in 1:nrow(mat)) {
  for (c in 1:ncol(mat)) {
    print(paste("mat[", r, ",", c, "]=", mat[r, c]))
  }
}
print(mat)
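For a plain vector, the loop variable can take the values directly, and seq_along() gives the positions when an index or name is also needed. The following is a sketch with made-up scores.

```r
scores <- c(math = 90, art = 75, music = 82)

# Iterate over the values directly.
total <- 0
for (s in scores) {
  total <- total + s
}
print(total)   # 247

# Iterate over positions when the name of each element is needed too.
for (i in seq_along(scores)) {
  print(paste(names(scores)[i], "=", scores[i]))
}
```

seq_along() is generally safer than 1:length(x), because it produces an empty sequence when the vector is empty.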
R Functions
Functions are used to avoid repeating the same task and to reduce complexity. To make code easier to understand and maintain, we break it into smaller logical parts using functions. A function:
Is written to carry out a specified task.
May or may not have arguments.
Contains a body in which the code is written.
May or may not return one or more output values.
An R function is created with the keyword function. The syntax of an R function is as follows:
func_name <- function(arg_1, arg_2, ...) {
  # Function body
}
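As an illustration of the syntax, here is a small function with a default argument. The name rect_area and its parameters are invented for this sketch.

```r
# height defaults to width, so a single argument describes a square.
rect_area <- function(width, height = width) {
  area <- width * height
  area   # the value of the last expression is returned
}

print(rect_area(3))                       # 9
print(rect_area(3, 4))                    # 12
print(rect_area(height = 2, width = 5))   # 10, arguments matched by name
```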
Components of Functions
A function has four components: the function name, the arguments, the function body, and the return value.
R Built-in Functions
1、String functions
substr(x, start=n1,stop=n2)
grep(pattern, x , ignore.case=FALSE, fixed=FALSE)
sub(pattern, replacement, x, ignore.case =FALSE, fixed=FALSE)
paste(..., sep = "")
strsplit(x, split)
tolower(x)
toupper(x)
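The string functions listed above can be tried on a single sample string. This is a sketch; the string x and the replacement text are arbitrary.

```r
x <- "Learning R Programming"

print(substr(x, start = 1, stop = 8))                        # "Learning"
print(grep("R", c("R", "Python", "r"), ignore.case = TRUE))  # positions of matching elements
print(sub("Programming", "Basics", x))                       # replaces the first match only
print(paste("data", "frame", sep = "."))                     # "data.frame"
print(strsplit(x, split = " ")[[1]])                         # vector of words
print(tolower(x))
print(toupper(x))
```

Note that strsplit() returns a list with one element per input string, which is why [[1]] is used to get the words of x.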
Data Reshaping in R
1、Transposing a matrix
t(matrix or data frame)
a <- matrix(c(4:12), nrow = 3, byrow = TRUE)
a
print("Matrix after transpose")
b <- t(a)
b
2、Joining rows and columns in a data frame
cbind(vector1, vector2, ..., vectorN)
rbind(dataframe1, dataframe2, ..., dataframeN)
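A brief sketch of both functions follows; the city, state, and department values are made up for the example.

```r
# cbind() joins vectors side by side as columns.
city  <- c("Pune", "Delhi")
state <- c("MH", "DL")
addr  <- cbind(city, state)   # a character matrix with two columns
print(addr)

# rbind() stacks data frames; their column names must match.
df1 <- data.frame(id = 1:2, dept = c("IT", "HR"))
df2 <- data.frame(id = 3:4, dept = c("Ops", "IT"))
all_rows <- rbind(df1, df2)
print(all_rows)
```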
3、Merging data frames
library(MASS)
# Merge the two Pima data sets on matching blood pressure and BMI values
merging_pima <- merge(x = Pima.te, y = Pima.tr,
  by.x = c("bp", "bmi"),
  by.y = c("bp", "bmi")
)
print(merging_pima)
nrow(merging_pima)
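How merge() chooses rows is easier to see on two tiny tables. The sketch below uses made-up data frames named left and right and contrasts the default inner join with all = TRUE.

```r
left  <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
right <- data.frame(id = c(2, 3, 4), salary = c(550, 620, 480))

# Default: keep only ids present in both data frames.
inner <- merge(left, right, by = "id")
print(inner)

# all = TRUE: keep every id and fill the gaps with NA.
outer <- merge(left, right, by = "id", all = TRUE)
print(outer)
```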
Data Interfaces
R CSV File
Getting and setting the working directory
# Getting and printing the current working directory.
print(getwd())
# Setting the current working directory.
setwd("C:/Users/ajeet")
# Getting and printing the current working directory again.
print(getwd())
Creating a CSV File
A CSV file is a text file in which commas separate the values in each row.
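A sample file can be produced from R itself. In the sketch below, the columns salary, start_date, and dept are the ones the examples in this section refer to; the id and name columns and all values are invented, and the file is written to a temporary directory rather than the working directory.

```r
record <- data.frame(
  id         = 1:4,
  name       = c("Ana", "Ben", "Cy", "Dee"),
  salary     = c(700, 550, 620, 810),
  start_date = c("2013-05-01", "2014-01-01", "2015-03-15", "2012-11-30"),
  dept       = c("IT", "HR", "IT", "Finance")
)

# Write to a temporary location; row.names = FALSE omits the row-number column.
csv_path <- file.path(tempdir(), "record.csv")
write.csv(record, csv_path, row.names = FALSE)

# Show the raw text: one header line followed by one line per row.
print(readLines(csv_path))
```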
Reading a CSV file
data <- read.csv("record.csv")
print(data)
Analyzing the CSV File
When we read data from a .csv file with the read.csv() function, the result is a data frame by default. Before analyzing the data, we confirm this with the is.data.frame() function, and then check the number of columns and rows with the ncol() and nrow() functions.
csv_data <- read.csv("record.csv")
print(is.data.frame(csv_data))
print(ncol(csv_data))
print(nrow(csv_data))
Because the data is read as a data frame, we can apply all the data frame functions discussed in the earlier sections.
1、Getting the maximum salary
# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the maximum salary from the data frame.
max_sal <- max(csv_data$salary)
print(max_sal)
2、Getting the details of the person who has the maximum salary
# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the maximum salary from the data frame.
max_sal <- max(csv_data$salary)
print(max_sal)
# Getting the details of the person who has the maximum salary
details <- subset(csv_data, salary == max(salary))
print(details)
3、Getting the details of all the people who work in the IT department
# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the details of all the people who work in the IT department
details <- subset(csv_data, dept == "IT")
print(details)
4、Getting the details of the people who work in the IT department and whose salary is greater than 600
# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the details of the people in the IT department with a salary above 600
details <- subset(csv_data, dept == "IT" & salary > 600)
print(details)
5、Getting the details of the people who joined on or after 2014
# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the details of the people who joined on or after 2014-01-01 (>= includes that day)
details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
print(details)
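The same filters can also be written with logical indexing instead of subset(). The sketch below works on an in-memory data frame so it runs without record.csv; the names and values in emp are invented for the example.

```r
emp <- data.frame(
  name       = c("Ana", "Ben", "Cy", "Dee"),
  salary     = c(700, 550, 620, 810),
  start_date = c("2013-05-01", "2014-01-01", "2015-03-15", "2012-11-30"),
  dept       = c("IT", "HR", "IT", "Finance"),
  stringsAsFactors = FALSE
)

# Row(s) with the highest salary.
top_paid <- emp[emp$salary == max(emp$salary), ]

# IT staff earning more than 600.
it_high <- emp[emp$dept == "IT" & emp$salary > 600, ]

# Joined on or after 2014-01-01; the text dates are converted before comparing.
since_2014 <- emp[as.Date(emp$start_date) >= as.Date("2014-01-01"), ]

print(top_paid)
print(it_high)
print(since_2014)
```

Inside subset() the column names can be used bare, whereas plain indexing needs the emp$ prefix; the results are the same for data without missing values.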
Writing into a CSV file
csv_data <- read.csv("record.csv")
# Getting the details of the people who joined on or after 2014-01-01
details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
# Writing the filtered data into a new file.
write.csv(details, "output.csv")
new_details <- read.csv("output.csv")
print(new_details)
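By default write.csv() also writes the row names as an unnamed first column, which read.csv() then reads back as a column called X. The sketch below shows the effect and how row.names = FALSE avoids it; the details data frame and the temporary file path are made up for the example.

```r
details <- data.frame(name = c("Ben", "Cy"), salary = c(550, 620))
out_path <- file.path(tempdir(), "output.csv")

# Default behaviour: row names are written as an extra column.
write.csv(details, out_path)
with_rownames <- read.csv(out_path)
print(names(with_rownames))   # "X" "name" "salary"

# row.names = FALSE writes only the data columns.
write.csv(details, out_path, row.names = FALSE)
clean <- read.csv(out_path)
print(names(clean))           # "name" "salary"
```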
R Excel file
Source: https://www.javatpoint.com/list-of-r-packages