Running AWS Glue jobs in a Docker container outputs "com.amazonaws.SdkClientException: Failed to connect to service endpoint:"
【Posted】: 2020-10-14 05:19:57
【Question】: I am developing AWS Glue jobs locally (using PySpark) with Docker. I have a Python file (song_data.py) that contains an AWS Glue job using the GlueContext class. When I run the following command in the container terminal to execute the Glue job script:
gluesparksubmit glue_etl_scripts/song_data.py --JOB-NAME test
I get the following error:
20/06/24 02:12:54 WARN EC2MetadataUtils: Unable to retrieve the requested metadata (/latest/dynamic/instance-identity/document). Failed to connect to service endpoint:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.util.EC2MetadataUtils.getItems(EC2MetadataUtils.java:402)
at com.amazonaws.util.EC2MetadataUtils.getData(EC2MetadataUtils.java:371)
at com.amazonaws.util.EC2MetadataUtils.getData(EC2MetadataUtils.java:367)
at com.amazonaws.util.EC2MetadataUtils.getEC2InstanceRegion(EC2MetadataUtils.java:282)
at com.amazonaws.regions.InstanceMetadataRegionProvider.tryDetectRegion(InstanceMetadataRegionProvider.java:59)
at com.amazonaws.regions.InstanceMetadataRegionProvider.getRegion(InstanceMetadataRegionProvider.java:50)
at com.amazonaws.regions.AwsRegionProviderChain.getRegion(AwsRegionProviderChain.java:46)
at com.amazonaws.services.glue.util.EndpointConfig$.getConfig(EndpointConfig.scala:42)
at com.amazonaws.services.glue.util.AWSConnectionUtils$.<init>(AWSConnectionUtils.scala:36)
at com.amazonaws.services.glue.util.AWSConnectionUtils$.<clinit>(AWSConnectionUtils.scala)
at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:152)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1205)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:52)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:80)
... 25 more
An error occurred while calling o28.getCatalogSource.
: java.lang.ExceptionInInitializerError
at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:152)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.SdkClientException: Unable to load region information from any provider in the chain
at com.amazonaws.regions.AwsRegionProviderChain.getRegion(AwsRegionProviderChain.java:59)
at com.amazonaws.services.glue.util.EndpointConfig$.getConfig(EndpointConfig.scala:42)
at com.amazonaws.services.glue.util.AWSConnectionUtils$.<init>(AWSConnectionUtils.scala:36)
at com.amazonaws.services.glue.util.AWSConnectionUtils$.<clinit>(AWSConnectionUtils.scala)
... 12 more
The error is raised when the glueContext.create_dynamic_frame.from_catalog() method is called in the Glue job file (song_data.py):
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark import SQLContext
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from configparser import ConfigParser
config = ConfigParser()
config.read_file(open('/usr/local/src/config/aws.cfg'))
sc = SparkContext.getOrCreate()
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoop_conf.set("fs.s3a.access.key", config.get('AWS', 'KEY'))
hadoop_conf.set("fs.s3a.secret.key", config.get('AWS', 'SECRET'))
hadoop_conf.set("fs.s3a.endpoint", "s3.us-west-2.amazonaws.com")
sql = SQLContext(sc)
glueContext = GlueContext(sql)
try:
    song_df = glueContext.create_dynamic_frame.from_catalog(
        database='sparkify',
        table_name='song_data')
    print('Count: ', song_df.count())
    print('Schema: ')
    song_df.printSchema()
except Exception as e:
    print(e)
I have tried:
Changing the Hadoop configuration from fs.s3a to fs.s3, with the different access key and secret key properties:
hadoop_conf.set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
hadoop_conf.set("fs.s3.awsAccessKeyId", config.get('AWS', 'KEY'))
hadoop_conf.set("fs.s3.awsSecretAccessKey", config.get('AWS', 'SECRET'))
hadoop_conf.set("fs.s3.endpoint", "s3.us-west-2.amazonaws.com")
Using GlueContext's create_dynamic_frame_from_catalog() method instead of create_dynamic_frame.from_catalog():
song_df = glueContext.create_dynamic_frame_from_catalog(
    database='sparkify',
    table_name='song_data')
Removing the Hadoop endpoint configuration:
# hadoop_conf.set("fs.s3a.endpoint", "s3.us-west-2.amazonaws.com")
Updated attempt
I changed song_data.py to:
from pyspark import SparkConf  # import needed by this snippet
from pyspark.sql import SparkSession  # import needed by this snippet

conf = (
    SparkConf()
    .set('spark.hadoop.fs.s3a.access.key', config.get('AWS', 'KEY'))
    .set('spark.hadoop.fs.s3a.secret.key', config.get('AWS', 'SECRET'))
    .set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
)
sc = SparkContext(conf=conf)
spark = SparkSession(sc)
glueContext = GlueContext(spark)

# Attempt 1: read the JSON files directly from S3, bypassing the catalog
try:
    print('Attempt 1:')
    song_df = glueContext.create_dynamic_frame.from_options(
        connection_type='s3',
        connection_options={"paths": ["s3a://sparkify-dend-analytics"]},
        format='json')
    print('Count: ', song_df.count())
    print('Schema: ')
    song_df.printSchema()
except Exception as e:
    print(e)

# Attempt 2: read through the Glue catalog with from_catalog()
try:
    print('Attempt 2:')
    song_df = glueContext.create_dynamic_frame.from_catalog(
        database='sparkify',
        table_name='song_data')
    print('Count: ', song_df.count())
    print('Schema: ')
    song_df.printSchema()
except Exception as e:
    print(e)

# Attempt 3: read through the Glue catalog with create_dynamic_frame_from_catalog()
try:
    print('Attempt 3:')
    song_df = glueContext.create_dynamic_frame_from_catalog(
        database='sparkify',
        table_name='song_data')
    print('Count: ', song_df.count())
    print('Schema: ')
    song_df.printSchema()
except Exception as e:
    print(e)
Error output
Attempt 1:
An error occurred while calling o37.getDynamicFrame.
: org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on
sparkify-dend-analytics: com.amazonaws.AmazonClientException: No AWS
Credentials provided by DefaultAWSCredentialsProviderChain :
com.amazonaws.SdkClientException: Unable to load AWS credentials from any
provider in the chain: [EnvironmentVariableCredentialsProvider: Unable to
load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or
AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)),
SystemPropertiesCredentialsProvider: Unable to load AWS credentials from
Java system properties (aws.accessKeyId and aws.secretKey),
WebIdentityTokenCredentialsProvider: You must specify a value for roleArn
and roleSessionName, com.amazonaws.auth.profile.ProfileCredentialsProvider@xxxxxxxx:
profile file cannot be null, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@xxxxxxxx: Failed
to connect to service endpoint: ]
Attempt 2:
EC2MetadataUtils: Unable to retrieve the requested metadata (/latest/dynamic/instance-identity/document). Failed to connect to service endpoint:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
......
Caused by: java.net.ConnectException: Connection refused (Connection refused)
......
Caused by: com.amazonaws.SdkClientException: Unable to load region information from any provider in the chain
Attempt 3:
An error occurred while calling o32.getCatalogSource.
: java.lang.NoClassDefFoundError: Could not initialize class com.amazonaws.services.glue.util.AWSConnectionUtils$
【Question comments】:
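【Answer 1】: A Glue job run locally in a Docker container cannot access the Glue catalog.
Read the data directly from S3 instead of reading it from the catalog, using:
from_options(connection_type, connection_options={}, format=None, format_options={}, transformation_ctx="")
See the AWS Glue documentation for this method.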
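As a minimal sketch of that suggestion, the arguments for from_options can be built in one place and passed to an existing GlueContext. The helper names s3_source_kwargs and read_from_s3 are hypothetical, the bucket path is the one from the question, and the sketch assumes that valid credentials and a region are already available to the job.

```python
def s3_source_kwargs(paths, data_format="json"):
    """Build keyword arguments for create_dynamic_frame.from_options (hypothetical helper)."""
    return {
        "connection_type": "s3",
        # connection_options must be a dict; "paths" lists the S3 locations to read
        "connection_options": {"paths": list(paths)},
        "format": data_format,
    }


def read_from_s3(glue_context, paths, data_format="json"):
    """Read a DynamicFrame straight from S3 without going through the Glue catalog."""
    return glue_context.create_dynamic_frame.from_options(
        **s3_source_kwargs(paths, data_format)
    )


if __name__ == "__main__":
    # Bucket name taken from the question; replace with your own location.
    print(s3_source_kwargs(["s3a://sparkify-dend-analytics"]))
```

Inside the job script this would be called as read_from_s3(glueContext, ["s3a://sparkify-dend-analytics"]), where glueContext is the GlueContext the script already creates.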
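Update: You are getting a region error, which is common when running Glue locally.
Try running the command below to set your region. It is only used to initialize the library, and the job still runs locally: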
export AWS_REGION=us-east-1
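A small preflight check can catch a missing region or missing credentials before the job is submitted. The sketch below is an illustration only: the helper name missing_aws_settings is hypothetical, AWS_REGION is the variable named above, and the credential variable names are the ones listed in the EnvironmentVariableCredentialsProvider message in the question. It assumes you supply credentials through environment variables; other providers in the chain are not checked.

```python
import os

# Each setting is satisfied if any one of its accepted variables is non-empty.
REQUIRED = {
    "region": ("AWS_REGION",),
    "access key": ("AWS_ACCESS_KEY_ID", "AWS_ACCESS_KEY"),
    "secret key": ("AWS_SECRET_KEY", "AWS_SECRET_ACCESS_KEY"),
}


def missing_aws_settings(env=None):
    """Return the names of settings for which no accepted variable is set."""
    env = os.environ if env is None else env
    return [
        name
        for name, candidates in REQUIRED.items()
        if not any(env.get(var) for var in candidates)
    ]


if __name__ == "__main__":
    missing = missing_aws_settings()
    if missing:
        print("Not set in this shell:", ", ".join(missing))
    else:
        print("Region and credential variables are set.")
```

Run it in the same container shell from which you launch gluesparksubmit, so that it sees the same environment as the job.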
【Comments】:
Thanks for the suggestion. Running export AWS_REGION=us-west-2
fixed the Caused by: com.amazonaws.SdkClientException: Unable to load region information from any provider in the chain
error.