Invalid ssh key running mrjob script on emr
【Title】: Invalid ssh key running mrjob script on emr 【Posted】: 2014-06-03 18:24:28 【Question】: I am reading the guide on how to get mrjob working on EMR. I followed all the steps, but when I run the example script I get this error:
matthew@WinterMute:~/work/projects/mrjob_examples$ python word_count.py -r emr moby.txt
using configs in /etc/mrjob.conf
using existing scratch bucket mrjob-4db6342a70e021ad
using s3://mrjob-4db6342a70e021ad/tmp/ as our scratch dir on S3
creating tmp directory /tmp/word_count.matthew.20140603.181541.006786
writing master bootstrap script to /tmp/word_count.matthew.20140603.181541.006786/b.py
Copying non-input files into s3://mrjob-4db6342a70e021ad/tmp/word_count.matthew.20140603.181541.006786/files/
Waiting 5.0s for S3 eventual consistency
Creating Elastic MapReduce job flow
Job flow created with ID: j-3DCN7LULSRILW
Created new job flow j-3DCN7LULSRILW
Job on job flow j-3DCN7LULSRILW failed with status FAILED: The given SSH key name was invalid
Logs are in s3://mrjob-4db6342a70e021ad/tmp/logs/j-3DCN7LULSRILW/
Scanning S3 logs for probable cause of failure
Waiting 5.0s for S3 eventual consistency
Terminating job flow: j-3DCN7LULSRILW
Traceback (most recent call last):
  File "word_count.py", line 16, in <module>
    MRWordFrequencyCount.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 494, in run
    mr_job.execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 512, in execute
    super(MRJob, self).execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 147, in execute
    self.run_job()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 208, in run_job
    runner.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 809, in _run
    self._wait_for_job_to_complete()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 1599, in _wait_for_job_to_complete
    raise Exception(msg)
Exception: Job on job flow j-3DCN7LULSRILW failed with status FAILED: The given SSH key name was invalid
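【Comments】:
I currently have the same problem, but it is intermittent: it worked once, and all my other attempts have failed.
【Answer 1】: Your job seems to start fine, but then mrjob fails to ssh into the master node to monitor its status. Without seeing your config file, mainly the ec2_key_pair_file and ec2_key_pair options, it is hard to tell exactly what is set up incorrectly. Make sure you follow the Configuring AWS credentials guide. You have to specify a valid key pair name (look in the EC2 management dashboard under the "Key Pairs" section) and the path to the corresponding .pem file.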
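The check described in this answer can be sketched in Python. This is only an illustration under stated assumptions: the function name `check_emr_key_options` is hypothetical, the options are assumed to be already loaded into a plain dict (for example from the `runners: emr:` section of mrjob.conf), and the key pair name `EMR` and the path `~/.ssh/EMR.pem` are made-up sample values. It only verifies that both options are set and that the .pem file exists locally; it cannot tell whether the key pair name is known to EC2.

```python
import os


def check_emr_key_options(emr_opts):
    """Return a list of problems found in the SSH key options.

    emr_opts is a plain dict holding the options from the emr runner
    section of mrjob.conf (hypothetical helper, not part of mrjob).
    """
    problems = []

    key_name = emr_opts.get("ec2_key_pair")
    key_file = emr_opts.get("ec2_key_pair_file")

    # The key pair name is the name shown under "Key Pairs" in the EC2 console.
    if not key_name:
        problems.append("ec2_key_pair is not set")

    # The key pair file is the local path to the matching .pem file.
    if not key_file:
        problems.append("ec2_key_pair_file is not set")
    elif not os.path.isfile(os.path.expanduser(key_file)):
        problems.append(
            "ec2_key_pair_file does not point to an existing file: %s" % key_file
        )

    return problems


if __name__ == "__main__":
    # Sample values only; replace with the options from your own config.
    sample_opts = {"ec2_key_pair": "EMR", "ec2_key_pair_file": "~/.ssh/EMR.pem"}
    for problem in check_emr_key_options(sample_opts):
        print(problem)
```

An empty result means only that the local settings look complete; the name must still match a key pair that exists in the AWS account and region the job runs in.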
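【Comments】:
Here is the conf file I have saved in /etc/mrjob.conf: pastebin.com/qGxbiJsd. I'm pretty sure I followed the guide exactly (by the way, I have deleted all of those security credentials).
【Answer 2】: I found this question while searching for the error myself.
I managed to solve it: SSH keys are region-specific, so you need to set the region in your mrjob.conf file to the region the SSH key belongs to: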
runners:
  emr:
    aws_access_key_id: HADOOPHADOOPBOBADOOP
    aws_region: us-west-1
    aws_secret_access_key: MEMIMOMADOOPBANANAFANAFOFADOOPHADOOP
See here: https://pythonhosted.org/mrjob/guides/configs-basics.html
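Because key pairs are tied to a region, it can help to find out which region actually holds the key pair before setting aws_region. The following is a rough sketch, not something from the mrjob guide: `regions_with_key` and `fetch_key_names` are hypothetical helper names, the key pair name `EMR` and the region lists are sample data, and `fetch_key_names` assumes the third-party boto3 package and working AWS credentials (the mrjob version in the question predates boto3).

```python
def regions_with_key(key_name, key_names_by_region):
    """Return the sorted regions whose key pair list contains key_name."""
    return sorted(
        region
        for region, names in key_names_by_region.items()
        if key_name in names
    )


def fetch_key_names(region):
    """List EC2 key pair names in one region.

    Assumes boto3 is installed and AWS credentials are configured;
    imported lazily so the rest of the sketch runs without it.
    """
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    return [pair["KeyName"] for pair in ec2.describe_key_pairs()["KeyPairs"]]


if __name__ == "__main__":
    # Sample data standing in for real lookups such as
    # {region: fetch_key_names(region) for region in ("us-east-1", "us-west-1")}
    sample = {"us-east-1": ["other-key"], "us-west-1": ["EMR"]}
    print(regions_with_key("EMR", sample))  # prints ['us-west-1']
```

If the key pair turns up in a different region than the one the job flow is created in, that mismatch would be consistent with the "The given SSH key name was invalid" error described above.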
【Comments】:
Those aren't ssh keys, those are aws credentials.