Accessing HBase with Phoenix / PQS

Posted by sun_xo

Apache Phoenix is an open-source SQL skin for Apache HBase that makes it easier to connect to your HBase data through standard JDBC and SQL interfaces. This article describes, step by step, how to build an environment for accessing HBase through Phoenix and the Phoenix Query Server (PQS).

1. Install hbase

1) download the hbase binary from https://hbase.apache.org/ (at the time of writing, the latest stable release is 2.4.16)
2) extract binary
        $ tar xvf hbase-2.4.16-bin.tar.gz -C ~/bigdata/
3) configure hbase-env.sh and hbase-site.xml
        $ cd ~/bigdata/hbase-2.4.16/conf    
        $ diff -u hbase-env.sh.orig hbase-env.sh

 # The java implementation to use.  Java 1.8+ required.
-# export JAVA_HOME=/usr/java/jdk1.8.0/
+export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_321.jdk/Contents/Home/

        $ diff -u hbase-site.xml.orig hbase-site.xml

--- hbase-site.xml.orig	2020-01-22 23:10:15.000000000 +0800
+++ hbase-site.xml	2023-03-29 09:00:32.000000000 +0800
@@ -51,4 +51,20 @@
     <name>hbase.unsafe.stream.capability.enforce</name>
     <value>false</value>
   </property>
+	<property>
+		<name>hbase.rootdir</name>
+		<value>file:/Users/sun_xo/bigdata/hbase-2.4.16/HBase/HFiles</value>
+	</property>
+	<property>
+		<name>hbase.zookeeper.property.dataDir</name>
+		<value>/Users/sun_xo/bigdata/hbase-2.4.16/zookeeper</value>
+	</property>
+	<property>
+		<name>phoenix.schema.isNamespaceMappingEnabled</name>
+		<value>true</value>
+	</property>
+	<property>
+		<name>phoenix.schema.mapSystemTablesToNamespace</name>
+		<value>true</value>
+	</property>
 </configuration>

4) create relevant dirs
        $ cd ~/bigdata/hbase-2.4.16/
        $ mkdir -p HBase/HFiles zookeeper
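
The two phoenix.schema.* properties added to hbase-site.xml in step 3 have to be present with the same values in whatever hbase-site.xml the Phoenix client later reads. As a rough sketch (not part of HBase or Phoenix), the following Java program parses a site file with the JDK's built-in XML parser and reports whether both properties are set to true. The class name SiteConfigCheck and its methods are made up for this illustration, and the sketch assumes that a missing property counts as "not enabled".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads <property><name>/<value> pairs from a Hadoop-style site file.
class SiteConfigCheck {
    // The two properties added to hbase-site.xml in this article.
    static final String[] NAMESPACE_KEYS = {
        "phoenix.schema.isNamespaceMappingEnabled",
        "phoenix.schema.mapSystemTablesToNamespace"
    };

    // Parses the XML text and returns name -> value in file order.
    static Map<String, String> parseProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = childText(property, "name");
                if (name != null) {
                    props.put(name, childText(property, "value"));
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable site file", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    // True only if both namespace properties are present and set to "true".
    static boolean namespaceMappingEnabled(Map<String, String> props) {
        for (String key : NAMESPACE_KEYS) {
            if (!"true".equalsIgnoreCase(props.get(key))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SiteConfigCheck <path to hbase-site.xml>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Map<String, String> props = parseProperties(xml);
        for (String key : NAMESPACE_KEYS) {
            System.out.println(key + " = " + props.get(key));
        }
        System.out.println("namespace mapping enabled: " + namespaceMappingEnabled(props));
    }
}
```

Running it against both the server-side file (~/bigdata/hbase-2.4.16/conf/hbase-site.xml) and any copy used by a client makes a mismatch easy to spot.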

2. Try hbase

1) start hbase in standalone mode
        $ cd ~/bigdata/hbase-2.4.16/
        $ bin/start-hbase.sh
        $ netstat -ltpn | grep `jps | grep HMaster | awk '{print $1}'`        # show the sockets the hbase master is listening on

tcp6       0      0 127.0.0.1:2181          :::*                    LISTEN      620817/java
tcp6       0      0 :::16010                :::*                    LISTEN      620817/java
tcp6       0      0 192.168.55.250:16020    :::*                    LISTEN      620817/java
tcp6       0      0 :::16030                :::*                    LISTEN      620817/java
tcp6       0      0 192.168.55.250:16000    :::*                    LISTEN      620817/java

Note: HBase has a Web UI at http://localhost:16010

2) create a test table by using the hbase shell
        $ bin/hbase shell

hbase:001:0> create 'test', 'cf'
Created table test
Took 0.6299 seconds                                                                                 
=> Hbase::Table - test
hbase:002:0> put 'test', 'row1', 'cf:a', 'value1'
Took 0.0078 seconds                                                                                 
hbase:003:0> put 'test', 'row2', 'cf:b', 'value2'
Took 0.0020 seconds                                                                                 
hbase:004:0> put 'test', 'row3', 'cf:c', 'value3'
Took 0.0055 seconds
hbase:005:0> scan 'test'
ROW                        COLUMN+CELL                                                              
 row1                      column=cf:a, timestamp=2023-03-31T14:45:27.074, value=value1             
 row2                      column=cf:b, timestamp=2023-03-31T14:45:29.172, value=value2             
 row3                      column=cf:c, timestamp=2023-03-31T14:53:26.496, value=value3             
3 row(s)
Took 0.0416 seconds
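
The netstat output in step 1 shows the ZooKeeper client port (2181) and the master Web UI port (16010) reachable on localhost. The same check can be made from Java, which is the vantage point a JDBC client will have later. This is only a sketch: the class name PortCheck and the one-second timeout are choices made for this example. Ports 16000 and 16020 are left out because in the output above they are bound to the host's LAN address rather than localhost.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: probes TCP ports by attempting a plain connection.
class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // connection refused, host unreachable, or timed out
        }
    }

    public static void main(String[] args) {
        // Ports taken from the netstat output: ZooKeeper client port and master Web UI.
        int[] ports = {2181, 16010};
        for (int port : ports) {
            String state = isListening("localhost", port, 1000) ? "open" : "closed";
            System.out.printf("localhost:%d %s%n", port, state);
        }
    }
}
```

A successful connection only shows that something accepts connections on the port; it does not verify that the process behind it is HBase.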

3. Install phoenix

1) download the phoenix binary from http://archive.apache.org/dist/phoenix/
2) extract binary
        $ tar xvf phoenix-hbase-2.4-5.1.1-bin.tar.gz -C ~/bigdata/
3) set environment for phoenix

export HBASE_HOME=~/bigdata/hbase-2.4.16
export PHOENIX_CLASS_PATH=~/bigdata/phoenix-hbase-2.4-5.1.1-bin

4) distribute the phoenix server jar to hbase
        $ cd ~/bigdata/phoenix-hbase-2.4-5.1.1-bin
        $ cp phoenix-server-hbase-2.4-5.1.1.jar ../hbase-2.4.16/lib/

Note: hbase needs to be restarted here so that it loads the phoenix server jar

4. Test phoenix

1) remove local hbase-site.xml and check hbase_conf_path of phoenix
        $ rm bin/hbase-site.xml
        $ bin/phoenix_utils.py

hbase_conf_path: /home/sunxo/bigdata/hbase-2.4.16/conf
...

Note: a wrong hbase config will cause sqlline to fail with the following error:
ERROR 726 (43M10):  Inconsistent namespace mapping properties. Ensure that config phoenix.schema.isNamespaceMappingEnabled is consistent on client and server

2) connect to hbase by using sqlline
        $ bin/sqlline.py                       # if there is no exception, connected

0: jdbc:phoenix:> !tables        # several system tables have been created in hbase
+-----------+-------------+------------+--------------+---------+-----------+----------------------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME |  TABLE_TYPE  | REMARKS | TYPE_NAME | SELF_REFERENCING_COL |
+-----------+-------------+------------+--------------+---------+-----------+----------------------+
|           | SYSTEM      | CATALOG    | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | CHILD_LINK | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | FUNCTION   | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | LOG        | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | MUTEX      | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | SEQUENCE   | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | STATS      | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | TASK       | SYSTEM TABLE |         |           |                      |
+-----------+-------------+------------+--------------+---------+-----------+----------------------+

3) create a view so we can manipulate the table in hbase

0: jdbc:phoenix:> create view "test" (
. . . . . . . .)>   "ROW" varchar primary key,
. . . . . . . .)>   "cf"."a" varchar,
. . . . . . . .)>   "cf"."b" varchar,
. . . . . . . .)>   "cf"."c" varchar
. . . . . . . .)> );
No rows affected (5.937 seconds)
0: jdbc:phoenix:>  select * from "test";
+------+--------+--------+--------+
| ROW  |   a    |   b    |   c    |
+------+--------+--------+--------+
| row1 | value1 |        |        |
| row2 |        | value2 |        |
| row3 |        |        | value3 |
+------+--------+--------+--------+
3 rows selected (0.069 seconds)
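
Every identifier in the view definition above is double-quoted because Phoenix folds unquoted identifiers to upper case, while the HBase table test and its column family cf are lower case. The sketch below generates the same kind of statement from a table name, a column family and a list of qualifiers, so the quoting is applied consistently. The class ViewDdl and its methods are invented for this example; it maps every column to varchar, as the view above does, and simply rejects names that contain a double quote rather than trying to escape them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper: builds a Phoenix "create view" statement over an existing HBase table.
class ViewDdl {
    // Wraps an identifier in double quotes so that its case is preserved.
    static String quote(String identifier) {
        if (identifier.contains("\"")) {
            throw new IllegalArgumentException("double quote in identifier: " + identifier);
        }
        return "\"" + identifier + "\"";
    }

    // rowKeyColumn is the SQL name given to the HBase row key; all columns are varchar.
    static String createView(String table, String rowKeyColumn, String family, List<String> qualifiers) {
        StringJoiner columns = new StringJoiner(", ");
        columns.add(quote(rowKeyColumn) + " varchar primary key");
        for (String qualifier : qualifiers) {
            columns.add(quote(family) + "." + quote(qualifier) + " varchar");
        }
        return "create view " + quote(table) + " (" + columns + ")";
    }

    public static void main(String[] args) {
        // Same table, family and qualifiers as the hbase shell session earlier.
        System.out.println(createView("test", "ROW", "cf", Arrays.asList("a", "b", "c")));
        // prints: create view "test" ("ROW" varchar primary key, "cf"."a" varchar, "cf"."b" varchar, "cf"."c" varchar)
    }
}
```

The printed statement can be pasted into sqlline (with a trailing semicolon) or passed to a JDBC Statement.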

5. Install phoenix query server (PQS)

1) download the phoenix query server binary from https://phoenix.apache.org/download.html
2) extract binary
        $ tar xvf phoenix-queryserver-6.0.0-bin.tar.gz -C ~/bigdata/

6. Test phoenix query server

1) check phoenix_class_path of PQS
        $ cd ~/bigdata/phoenix-queryserver-6.0.0
        $ bin/phoenix_utils.py

phoenix_class_path: /home/sunxo/bigdata/phoenix-hbase-2.4-5.1.1-bin
...

Note: if the phoenix_class_path is wrong, starting the query server will fail with the following error:
Error: Could not find or load main class org.apache.phoenix.queryserver.server.QueryServer

2) start PQS
        $ bin/queryserver.py

23/03/31 16:35:19 INFO server.Server: Started @622ms
23/03/31 16:35:19 INFO server.HttpServer: Service listening on port 8765

3) from another terminal, start sqlline-thin to connect to hbase through PQS
        $ bin/sqlline-thin.py

sqlline version 1.9.0
0: jdbc:phoenix:thin:url=http://localhost:876> !tables
+-----------+-------------+------------+--------------+---------+-----------+----------------------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME |  TABLE_TYPE  | REMARKS | TYPE_NAME | SELF_REFERENCING_COL |
+-----------+-------------+------------+--------------+---------+-----------+----------------------+
|           | SYSTEM      | CATALOG    | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | CHILD_LINK | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | FUNCTION   | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | LOG        | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | MUTEX      | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | SEQUENCE   | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | STATS      | SYSTEM TABLE |         |           |                      |
|           | SYSTEM      | TASK       | SYSTEM TABLE |         |           |                      |
|           |             | test       | VIEW         |         |           |                      |
+-----------+-------------+------------+--------------+---------+-----------+----------------------+

7. Test from a Java client using the PQS thin driver

$ cat Phoenix.java

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Properties;

public class Phoenix {
    private final Connection conn;

    public Phoenix(String hostName, int port) throws SQLException {
        String url;
        if (port == 2181) {
            // ZooKeeper port: connect directly with the thick Phoenix driver
            url = String.format("jdbc:phoenix:%s:%d", hostName, port);
        } else if (port == 8765) {
            // PQS port: connect over HTTP with the thin driver
            url = String.format("jdbc:phoenix:thin:url=http://%s:%d;serialization=PROTOBUF", hostName, port);
        } else {
            throw new RuntimeException("unsupported port: " + port);
        }
        // Same namespace mapping settings as in hbase-site.xml
        Properties properties = new Properties();
        properties.setProperty("phoenix.schema.isNamespaceMappingEnabled", "true");
        properties.setProperty("phoenix.schema.mapSystemTablesToNamespace", "true");
        this.conn = DriverManager.getConnection(url, properties);
    }

    public void close() throws SQLException {
        conn.close();
    }

    public void test() throws SQLException {
        // The view name is quoted because it is lower case
        try (PreparedStatement statement = conn.prepareStatement("select * from \"test\"");
             ResultSet rs = statement.executeQuery()) {
            while (rs.next()) {
                System.out.println(rs.getString("a"));
                System.out.println(rs.getString("b"));
                System.out.println(rs.getString("c"));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
//        Phoenix db = new Phoenix("localhost", 2181);
        Phoenix db = new Phoenix("localhost", 8765);
        db.test();
        db.close();
        System.out.println("closed");
    }
}

        $ javac Phoenix.java
        $ java -cp "phoenix-queryserver-client-6.0.0.jar:." Phoenix

value1
null
null
null
value2
null
null
null
value3
closed

Reference:
https://hbase.apache.org/
https://phoenix.apache.org/

Kerberos-related configuration for phoenix PQS

Connection URL for the thin client

jdbc:phoenix:thin:url=<scheme>://<server-hostname>:<port>;authentication=SPNEGO

Example

jdbc:phoenix:thin:url=<scheme>://<server-hostname>:<port>;authentication=SPNEGO;principal=my_user;keytab=/home/my_user/my_user.keytab
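
Assembling such a URL by hand is error-prone once several ;key=value options are involved. The following is a small sketch of a builder that produces the format shown above. The class ThinUrl and its methods are hypothetical and not part of the Phoenix client; the host localhost and port 8765 are borrowed from the unsecured setup earlier in this article, and the principal and keytab are the placeholder values from the example URL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: assembles a PQS thin-client JDBC URL from its parts.
class ThinUrl {
    // Appends each option as ;key=value in insertion order.
    static String build(String scheme, String host, int port, Map<String, String> options) {
        StringBuilder url = new StringBuilder("jdbc:phoenix:thin:url=")
                .append(scheme).append("://").append(host).append(':').append(port);
        for (Map.Entry<String, String> option : options.entrySet()) {
            url.append(';').append(option.getKey()).append('=').append(option.getValue());
        }
        return url.toString();
    }

    // URL for SPNEGO authentication with an explicit principal and keytab.
    static String spnego(String scheme, String host, int port, String principal, String keytab) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("authentication", "SPNEGO");
        options.put("principal", principal);
        options.put("keytab", keytab);
        return build(scheme, host, port, options);
    }

    public static void main(String[] args) {
        System.out.println(spnego("http", "localhost", 8765, "my_user", "/home/my_user/my_user.keytab"));
        // prints: jdbc:phoenix:thin:url=http://localhost:8765;authentication=SPNEGO;principal=my_user;keytab=/home/my_user/my_user.keytab
    }
}
```

The resulting string can be passed to DriverManager.getConnection in place of the unsecured thin URL used in the Phoenix class above.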

PQS settings on CDH
https://docs.cloudera.com/documentation/enterprise/6/6.2/topics/phoenix_configuring_pqs.html

Kerberos 401 errors: https://serverfault.com/questions/470323/kerberos-authentication-failing-with-401

https://github.com/apache/phoenix/tree/master/python/requests-kerberos
https://community.cloudera.com/t5/Support-Questions/Phoenix-Query-Server-Connection-URL-example/td-p/147474
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cdh_sg_hbase_authentication.html
