How to connect to Impala using impyla or to Hive using pyhive?
【Posted】: 2019-09-16 13:49:36
【Question】: I am trying to connect to Impala through impyla with the following code:
from impala.dbapi import connect
conn = connect(host='host_name.com', port=21050, user='usr', password='pass', use_ssl=True, auth_mechanism='LDAP')
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
cursor.fetchall()
According to the documentation, this library requires thrift_sasl version 0.2.1, but I cannot install it because pip reports this error:
Collecting thrift_sasl==0.2.1
Using cached https://files.pythonhosted.org/packages/80/36/16dfe92d32df63cc2b7b7be8d0e4a736617b7e52daaa7f83ae386a89d179/thrift_sasl-0.2.1.tar.gz
Collecting sasl>=0.2.1 (from thrift_sasl==0.2.1)
Using cached https://files.pythonhosted.org/packages/8e/2c/45dae93d666aea8492678499e0999269b4e55f1829b1e4de5b8204706ad9/sasl-0.2.1.tar.gz
Collecting thriftpy (from thrift_sasl==0.2.1)
Using cached https://files.pythonhosted.org/packages/f4/19/cca118cf7d2087310dbc8bd70dc7df0c1320f2652873a93d06d7ba356d4a/thriftpy-0.3.9.tar.gz
Requirement already satisfied: six in c:\users\psowa\appdata\local\programs\python\python36\lib\site-packages (from sasl>=0.2.1->thrift_sasl==0.2.1) (1.12.0)
Requirement already satisfied: ply<4.0,>=3.4 in c:\users\psowa\appdata\local\programs\python\python36\lib\site-packages (from thriftpy->thrift_sasl==0.2.1) (3.11)
Installing collected packages: sasl, thriftpy, thrift-sasl
Running setup.py install for sasl ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-rjwkcc76\install-record.txt' --single-version-externally-managed --compile
cwd: C:\Users\psowa\AppData\Local\Temp\pip-install-y42wej4x\sasl\
Complete output (27 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.6
creating build\lib.win-amd64-3.6\sasl
copying sasl\__init__.py -> build\lib.win-amd64-3.6\sasl
running egg_info
writing sasl.egg-info\PKG-INFO
writing dependency_links to sasl.egg-info\dependency_links.txt
writing requirements to sasl.egg-info\requires.txt
writing top-level names to sasl.egg-info\top_level.txt
reading manifest file 'sasl.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'sasl.egg-info\SOURCES.txt'
copying sasl\saslwrapper.cpp -> build\lib.win-amd64-3.6\sasl
copying sasl\saslwrapper.h -> build\lib.win-amd64-3.6\sasl
copying sasl\saslwrapper.pyx -> build\lib.win-amd64-3.6\sasl
running build_ext
building 'sasl.saslwrapper' extension
creating build\temp.win-amd64-3.6
creating build\temp.win-amd64-3.6\Release
creating build\temp.win-amd64-3.6\Release\sasl
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Isasl -Ic:\users\psowa\appdata\local\programs\python\python36\include -Ic:\users\psowa\appdata\local\programs\python\python36\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\cppwinrt" /EHsc /Tpsasl/saslwrapper.cpp /Fobuild\temp.win-amd64-3.6\Release\sasl/saslwrapper.obj
saslwrapper.cpp
c:\users\psowa\appdata\local\temp\pip-install-y42wej4x\sasl\sasl\saslwrapper.h(22): fatal error C1083: Cannot open include file: 'sasl/sasl.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-rjwkcc76\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.
When I install the latest version of thrift_sasl instead, Jupyter raises this error:
AttributeError: 'TSSLSocket' object has no attribute 'isOpen'
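The AttributeError appears only after thrift_sasl is swapped for a different release, which suggests a version mismatch between the installed packages rather than a bug in the connection code. A minimal sketch for recording which versions are actually installed follows. The helper names `installed_version` and `version_report` are hypothetical, and the package list is an assumption drawn from the names that appear in this question.

```python
# Packages mentioned in the question; adjust the list as needed (assumption).
PACKAGES = ("impyla", "thrift", "thrift_sasl", "thriftpy", "sasl", "PyHive")


def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        # Standard library on Python 3.8 and newer.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for older interpreters such as the Python 3.6 used here.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def version_report(names=PACKAGES):
    """Map each distribution name to its installed version (None if absent)."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, ver in version_report().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in the same Jupyter kernel that raises the error shows the exact combination of versions to compare against the requirements in the impyla documentation.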
I also tried connecting through pyhive with the following code:
from pyhive import hive
host_name = "host_name.com"
port = 10000
user = "usr"
password = "pass"
def hiveconnection(host_name, port, user, password):
    # Open an LDAP-authenticated HiveServer2 connection and list the databases.
    conn = hive.Connection(host=host_name, port=port, username=user, password=password, auth='LDAP')
    cur = conn.cursor()
    cur.execute('SHOW DATABASES')
    result = cur.fetchall()
    return result
output = hiveconnection(host_name, port, user, password)
print(output)
pyhive asks me to install sasl, but when I try, pip shows:
Collecting sasl
Using cached https://files.pythonhosted.org/packages/8e/2c/45dae93d666aea8492678499e0999269b4e55f1829b1e4de5b8204706ad9/sasl-0.2.1.tar.gz
Requirement already satisfied: six in c:\users\psowa\appdata\local\programs\python\python36\lib\site-packages (from sasl) (1.12.0)
Installing collected packages: sasl
Running setup.py install for sasl ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-sedxucba\install-record.txt' --single-version-externally-managed --compile
cwd: C:\Users\psowa\AppData\Local\Temp\pip-install-9rn_a9g0\sasl\
Complete output (27 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.6
creating build\lib.win-amd64-3.6\sasl
copying sasl\__init__.py -> build\lib.win-amd64-3.6\sasl
running egg_info
writing sasl.egg-info\PKG-INFO
writing dependency_links to sasl.egg-info\dependency_links.txt
writing requirements to sasl.egg-info\requires.txt
writing top-level names to sasl.egg-info\top_level.txt
reading manifest file 'sasl.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'sasl.egg-info\SOURCES.txt'
copying sasl\saslwrapper.cpp -> build\lib.win-amd64-3.6\sasl
copying sasl\saslwrapper.h -> build\lib.win-amd64-3.6\sasl
copying sasl\saslwrapper.pyx -> build\lib.win-amd64-3.6\sasl
running build_ext
building 'sasl.saslwrapper' extension
creating build\temp.win-amd64-3.6
creating build\temp.win-amd64-3.6\Release
creating build\temp.win-amd64-3.6\Release\sasl
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Isasl -Ic:\users\psowa\appdata\local\programs\python\python36\include -Ic:\users\psowa\appdata\local\programs\python\python36\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\cppwinrt" /EHsc /Tpsasl/saslwrapper.cpp /Fobuild\temp.win-amd64-3.6\Release\sasl/saslwrapper.obj
saslwrapper.cpp
c:\users\psowa\appdata\local\temp\pip-install-9rn_a9g0\sasl\sasl\saslwrapper.h(22): fatal error C1083: Cannot open include file: 'sasl/sasl.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-sedxucba\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.
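Both pip logs stop at the same compiler diagnostic: MSVC fatal error C1083, meaning the header sasl/sasl.h could not be opened, so the sasl C extension cannot be compiled from source on this machine. The sketch below shows one way to pull that root-cause line out of a long build log. `MSVC_FATAL` and `find_fatal_errors` are hypothetical names, and the pattern assumes the `path(line): fatal error Cnnnn: message` layout seen in the logs above.

```python
import re

# Matches MSVC diagnostics such as: path(22): fatal error C1083: message
MSVC_FATAL = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+)\): fatal error (?P<code>C\d+): (?P<message>.*)$"
)


def find_fatal_errors(log_text):
    """Return a (code, message) tuple for each MSVC 'fatal error' line."""
    hits = []
    for raw in log_text.splitlines():
        match = MSVC_FATAL.match(raw.strip())
        if match:
            hits.append((match.group("code"), match.group("message")))
    return hits


# Two lines copied from the pip output in the question.
sample = r"""saslwrapper.cpp
c:\users\psowa\appdata\local\temp\pip-install-9rn_a9g0\sasl\sasl\saslwrapper.h(22): fatal error C1083: Cannot open include file: 'sasl/sasl.h': No such file or directory"""

for code, message in find_fatal_errors(sample):
    print(code, message)
```

Everything after that line in the log (the cl.exe exit status and the repeated pip command) is a consequence of the missing header, not a separate failure.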
Any ideas?
【Answer 1】: Running the code under Python 2.7 fixed this for me. I think the problem is a compatibility issue.
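Whichever interpreter ends up working, both snippets in the question follow the Python DB-API pattern (connect, cursor, execute, fetchall), so the query logic can be written once and reused for impyla and pyhive. The following is a minimal sketch under that assumption. `list_databases` is a hypothetical helper, and it assumes the first column of each `SHOW DATABASES` row holds the database name.

```python
from contextlib import closing


def list_databases(connect, **connect_kwargs):
    """Open a connection via `connect`, run SHOW DATABASES, return the names.

    `connect` is any DB-API style factory, for example impala.dbapi.connect
    or pyhive.hive.Connection; the keyword arguments are passed through as-is.
    """
    with closing(connect(**connect_kwargs)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute("SHOW DATABASES")
            # Assumption: the database name is the first column of each row.
            return [row[0] for row in cur.fetchall()]


# Usage with the values from the question (requires the libraries to import):
#
#   from impala.dbapi import connect
#   list_databases(connect, host='host_name.com', port=21050, user='usr',
#                  password='pass', use_ssl=True, auth_mechanism='LDAP')
#
#   from pyhive import hive
#   list_databases(hive.Connection, host='host_name.com', port=10000,
#                  username='usr', password='pass', auth='LDAP')
```

Using `closing` ensures that the cursor and connection are closed even when the query raises, which the original snippets do not do.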