R connection to Redshift using AWS driver doesn't work but does work with Postgre driver

【Posted】: 2015-11-16 21:56:56 【Question】:

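Following the example AWS provides at https://blogs.aws.amazon.com/bigdata/post/Tx1G8828SPGX3PK/Connecting-R-with-Amazon-Redshift, I am trying to establish a connection to my Redshift database. However, I get an error when I try to connect with the driver they recommend. When I use the PostgreSQL driver instead, the connection to the Redshift database succeeds.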

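AWS says their driver is "optimized for performance and memory management", so I would rather use it. Could someone look over my code below and let me know if they see anything wrong? I suspect I am not setting up the URL correctly, but I am not sure what I should use instead. Thanks in advance for your help.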

#' This code attempts to establish a connection to a Redshift database. It
#' first tries the Redshift JDBC driver suggested by AWS, which does not
#' work, and then the PostgreSQL driver, which does.

## Clear up space and set working directory

#Clear Variables
rm(list=ls(all=TRUE))
gc()

## Libraries for analysis

library(RJDBC)
library(RPostgreSQL)

#Create DBI driver for working with redshift driver directly

# download Amazon Redshift JDBC driver
download.file('http://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC41-1.1.9.1009.jar',
              'RedshiftJDBC41-1.1.9.1009.jar')

# connect to Amazon Redshift using specific driver        
driver_redshift <- JDBC("com.amazon.redshift.jdbc41.Driver",
               "RedshiftJDBC41-1.1.9.1009.jar", identifier.quote="`")

## Using the PostgreSQL connection, which works

# PostgreSQL driver
driver_postgre <- dbDriver("PostgreSQL")

#establish connection
conn_postgre <- dbConnect(driver_postgre, host="nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com",
                 port="5439",dbname="dev", 
                 user="xxxx", password="xxxx")

#list the tables available
tables = dbListTables(conn_postgre)

## Use URL option to establish connection like the example on AWS website

# url <- "<JDBCURL>:<PORT>/<DBNAME>?user=<USER>&password=<PW>
# url <- "jdbc:redshift://demo.ckffhmu2rolb.eu-west-1.redshift.amazonaws.com
# :5439/demo?user=XXX&password=XXX" # uses the example from the AWS instructions

#url using my redshift database
url <- "jdbc:redshift://nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com
:5439/dev?user=xxxx&password=xxxx"

#attempt connect but gives an error
conn_redshift <- dbConnect(driver_redshift, url)

#gives the following error:
# Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  : 
#                   java.sql.SQLException: Error message not found: CONN_GENERAL_ERR. Can't find bundle for base name com.amazon.redshift.core.messages, locale en

## Similar to the PostgreSQL example that works, but fails with the Redshift-specific driver

#gives an error saying url is missing, but I am not sure which url to use?
conn <- dbConnect(driver_redshift, host="nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com",
                  port="5439",dbname="dev", 
                  user="xxxx", password="xxxx")

# gives the following error: 
#Error in .jcall("java/sql/DriverManager", "Ljava/sql/Connection;", "getConnection",  : 
# argument "url" is missing, with no default

【Comments】:

Did it work after you gave the full path? 【Answer 1】:

I did it this way, and it works for me:

drv <- JDBC("com.amazon.redshift.jdbc41.Driver","PathTO/RedshiftJDBC41-1.1.2.0002.jar")
conn <- dbConnect(drv,"jdbc:redshift://......redshift.amazonaws.com:5439/dev",User,PWD)

The difference I see in your post is that you did not give the full path to the Redshift jar when creating driver_redshift.

Hope it works.
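As a rough sketch of the same idea, the two helpers below resolve the jar to an absolute path and build the JDBC URL as one unbroken string. Both `build_redshift_url` and `resolve_jar` are hypothetical names made up for this illustration and are not part of RJDBC. One thing worth checking: in R, a string literal that spans two lines keeps the line break, so if the URL in the question is really split before `:5439` as shown, it contains a newline. Whether that or the relative jar path is the actual cause of the error is only a guess.

```r
# Hypothetical helper: build a single-line Redshift JDBC URL.
build_redshift_url <- function(host, port = 5439, dbname) {
  url <- sprintf("jdbc:redshift://%s:%d/%s", host, as.integer(port), dbname)
  # Fail early if whitespace or a line break slipped into the URL.
  stopifnot(!grepl("[[:space:]]", url))
  url
}

# Hypothetical helper: turn the jar location into an absolute path,
# raising an error if the file does not exist.
resolve_jar <- function(jar) {
  normalizePath(jar, mustWork = TRUE)
}

# Host and database name are taken from the question.
url <- build_redshift_url("nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com",
                          5439, "dev")

# The connection itself is left commented out because it needs RJDBC, Java,
# the downloaded driver jar and real credentials:
# library(RJDBC)
# drv  <- JDBC("com.amazon.redshift.jdbc41.Driver",
#              resolve_jar("RedshiftJDBC41-1.1.9.1009.jar"))
# conn <- dbConnect(drv, url, "xxxx", "xxxx")
```

Passing the user and password as separate arguments to dbConnect, as in the answer above, also keeps the credentials out of the URL string.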
