Apache Ranger and AWS EMR Automated Installation Series : Windows AD + Open-Source Ranger

Posted 2023-01-31 Laurence Geng

Table of Contents

1. Windows AD + Open-Source Ranger Solution Overview
- 1.1 Solution Architecture
- 1.2 Ranger in Detail
2. Installation & Integration
3. Verification
- 3.1 HDFS Access Control Verification
- 3.2 Hive Access Control Verification
4. Appendix

As the last article of this series, we introduce the last high-applicability scenario: “Windows AD + Open-Source Ranger”. The original address of this article is https://laurence.blog.csdn.net/article/details/128800790; please cite the source when reprinting.

1. Windows AD + Open-Source Ranger Solution Overview

1.1 Solution Architecture

In this solution, Windows AD acts as the authentication provider and stores all user account data, while Ranger acts as the authorization controller. Ranger syncs account data from Windows AD so that privileges can be granted to Windows AD user accounts. The EMR cluster needs a series of Ranger plugins installed; these plugins check with the Ranger server to verify whether the current user has permission to perform an action. The EMR cluster also syncs account data from Windows AD via SSSD, so that users can log in to the cluster nodes and submit jobs. End users can SSH into the nodes of the EMR cluster with their Windows AD accounts and, if Hue is available, log in to Hue with the same accounts.

1.2 Ranger in Detail

Let’s take a closer look at Ranger. Its architecture is as follows:

The installer will complete the following jobs:

① Install MySQL as the Policy DB for Ranger;
② Install Solr as the Audit Store for Ranger;
③ Install Ranger Admin;
④ Install Ranger UserSync;
⑤ Install the HDFS Ranger Plugin;
⑥ Install the Hive Ranger Plugin;

2. Installation & Integration

Generally, the installation & integration process can be divided into 3 stages: 1. Prerequisites -> 2. All-In-One Install -> 3. Create EMR Cluster. The following diagram illustrates the process in detail:

At stage 1, we need to do some preparatory work.

At stage 2, we start to install and integrate. There are 2 options at this stage: one is an all-in-one installation driven by a command-line based workflow, the other is a step-by-step installation. In most cases, the all-in-one installation is the best choice. However, your installation workflow may be interrupted by unforeseen errors; if you want to continue from the last failed step, please try the step-by-step installation. It is also the better choice when you want to retry a step with different argument values to find the right one.

At stage 3, we need to create an EMR cluster. If you already have one, skip this job. In most cases, Ranger needs to be installed on an existing cluster rather than a new one. For EMR-native Ranger this is impossible, because EMR-native Ranger plugins can only be installed when the cluster is created. Open-source Ranger does NOT have this problem: you are free to install it on an existing or a new EMR cluster.

There is a small overlap in the execution sequence between stages 2 and 3. At step 2.4, the installation pauses: the installer prompts users to create their own cluster and keeps monitoring the target cluster’s status. Once the cluster is ready, the installation resumes and performs the remaining actions.

As a design principle, the installer does NOT include any actions to create an EMR cluster; you should always create your cluster yourself. In practice an EMR cluster can have all kinds of unpredictable settings, e.g., application-specific (HDFS, YARN, etc.) configuration, step scripts, bootstrap scripts and so on, so it is inadvisable to couple Ranger’s installation with the EMR cluster’s creation.

Notes:

The installer treats the local host as the Ranger server and installs all Ranger components on it. For non-Ranger operations, e.g., installing EMR plugins, it initiates remote operations via SSH. So you can stay on the Ranger server to execute all command lines, with no need to switch among multiple hosts.
Although it is not required, we suggest you always use the FQDN as the host address. Both IP addresses and hostnames without a domain name are not recommended.

2.1 Prerequisites

2.1.1 VPC Constraints

To integrate with Windows AD, the EMR cluster nodes need to join the Windows domain (realm), which imposes a series of constraints on the VPC. Before installing, please ensure that the hostname of each EC2 instance is no more than 15 characters. This is a limitation of Windows AD; however, because AWS assigns DNS hostnames based on the IPv4 address (for example ip-10-0-14-0), the limitation propagates to the VPC. If the VPC’s CIDR constrains IPv4 addresses to no more than 9 digits (dots excluded), the assigned hostnames stay within 15 characters. Given this limitation, a recommended CIDR setting for the VPC is 10.0.0.0/16.
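The length constraint can be checked with a few lines of shell. The following is only a sketch: it assumes the default IP-based hostname form ip-a-b-c-d, and the helper names ec2_short_hostname and fits_ad_limit are made up for illustration.

```shell
# Sketch: check whether the default IP-based EC2 hostname (ip-a-b-c-d)
# stays within the 15-character limit. Both helper names are hypothetical.

# Build the short hostname from a private IPv4 address.
ec2_short_hostname() {
    printf 'ip-%s\n' "$(printf '%s' "$1" | tr . -)"
}

# Succeed only if the short hostname has at most 15 characters.
fits_ad_limit() {
    name=$(ec2_short_hostname "$1")
    [ "${#name}" -le 15 ]
}

for ip in 10.0.14.0 10.0.255.255 172.31.255.255; do
    if fits_ad_limit "$ip"; then
        echo "$ip -> $(ec2_short_hostname "$ip"): OK"
    else
        echo "$ip -> $(ec2_short_hostname "$ip"): too long"
    fi
done
```

Every address in 10.0.0.0/16 passes this check, while an address such as 172.31.255.255 yields a 17-character hostname and fails it.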

Although we can change the default hostname after the EC2 instances become available, the hostname is used when the computers join the Windows AD directory, which happens during EMR cluster creation, so modifying the hostname afterwards does NOT work. (Technically, a possible workaround is to put the hostname-changing actions into bootstrap scripts, but we did not try it. For changing the hostname, please refer to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-hostname.html.)

2.1.2 Create Windows AD Server

In this step, we create a Windows AD server with PowerShell scripts. First, create an EC2 instance with the Windows Server 2019 Base image (2016 is also tested and supported), then log in with the Administrator account, download the Windows AD installation script from https://github.com/bluishglc/ranger-emr-cli-installer/releases/download/v2.0/ad.ps1, and save it to the desktop.

Next, press “Win + R” to open a Run dialog, copy the following command line and replace the parameter values with your own settings:

Powershell.exe -NoExit -ExecutionPolicy Bypass -File %USERPROFILE%\Desktop\ad.ps1 -DomainName <replace-with-your-domain> -Password <replace-with-your-password> -TrustedRealm <replace-with-your-realm>

The ad.ps1 script has pre-defined default parameter values: the domain name is example.com, the password is Admin1234!, and the trusted realm is COMPUTE.INTERNAL. As a quick start, you can just right-click the ad.ps1 file and select Run with PowerShell to execute it. (Note that this right-click shortcut does NOT work in us-east-1, because the trusted realm for that region is EC2.INTERNAL; there you must set -TrustedRealm EC2.INTERNAL explicitly via the command line above.)

After the script has run, the computer will ask to restart; this is forced by Windows. Wait for the computer to restart, then log in again as Administrator so that the subsequent commands in the script continue to execute. Be sure to RE-LOGIN, otherwise part of the script will never run.

After logging in again, we can open “Active Directory Users and Computers” from Start Menu -> Windows Administrative Tools -> Active Directory Users and Computers, or enter dsa.msc in the “Run” dialog, to check the created AD. If everything goes well, we get the following AD directory:

Next, we need to check the DNS settings, because invalid DNS settings will result in installation failure. A common error when running the scripts is “Ranger Server can’t solve DNS of Cluster Nodes”; this problem is usually caused by an incorrect DNS forwarder setting. We can open “DNS Manager” from Start Menu -> Windows Administrative Tools -> DNS, or enter dnsmgmt.msc in the “Run” dialog, then open the “Forwarders” tab. Normally, there is a record whose IP address should be 10.0.0.2:

10.0.0.2 is the default DNS server address for a 10.0.0.0/16 network in a VPC. The VPC documentation at https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html says:

The Amazon DNS server does not reside within a specific subnet or Availability Zone in a VPC. It’s located at the address 169.254.169.253 (and the reserved IP address at the base of the VPC IPv4 network range, plus two) and fd00:ec2::253. For example, the Amazon DNS Server on a 10.0.0.0/16 network is located at 10.0.0.2. For VPCs with multiple IPv4 CIDR blocks, the DNS server IP address is located in the primary CIDR block.

The forwarder’s IP address usually comes from the “Domain name servers” entry of your VPC’s “DHCP Options Set”, whose default value is AmazonProvidedDNS. If you changed it, the forwarder’s IP will take your changed value when Windows AD is created. This typically happens when re-installing Windows AD in a VPC: if you did not restore “Domain name servers” to AmazonProvidedDNS before re-installing, the forwarder’s IP remains the address of the previous Windows AD server, which may NOT exist anymore. That is why the Ranger server or the cluster nodes cannot resolve DNS names. In that case, simply change the forwarder IP back to the default value, i.e., 10.0.0.2 in a 10.0.0.0/16 network.

The other DNS-related configuration is the IPv4 DNS setting. Its default setting is usually OK; it is attached here for reference (in the cn-north-1 region):
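For a VPC with a different CIDR, the expected forwarder address follows the “base of the range plus two” rule quoted above. A minimal sketch of that arithmetic is shown here; the function name vpc_dns_ip is hypothetical, and it assumes the argument is the VPC’s primary IPv4 CIDR written with its network base address.

```shell
# Sketch: compute the Amazon-provided DNS address (network base + 2) for a VPC CIDR.
# vpc_dns_ip is a hypothetical helper; the body runs in a subshell so IFS changes do not leak.
vpc_dns_ip() (
    base=${1%/*}          # drop the /prefix-length part
    IFS=.
    set -- $base          # split the dotted quad into $1..$4
    n=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 + 2 ))
    printf '%d.%d.%d.%d\n' \
        $(( (n >> 24) & 255 )) $(( (n >> 16) & 255 )) $(( (n >> 8) & 255 )) $(( n & 255 ))
)

vpc_dns_ip 10.0.0.0/16
```

For the recommended 10.0.0.0/16 network this prints 10.0.0.2, the value the forwarder record should show.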

2.1.3 Create DHCP Options Set and Attach To VPC

Joining a Windows domain (realm) requires that the nodes in the VPC can reach one another over the network and resolve each other’s domain names. It is therefore required to set the Windows AD server as the DNS server in the “DHCP Options Sets” of the VPC. The following commands complete this job (run them on a Linux host with the AWS CLI installed):

# run on a host which has installed aws cli
export REGION='<change-to-your-region>'
export VPC_ID='<change-to-your-vpc-id>'
export DNS_IP='<change-to-your-dns-ip>'

# derive the domain name from the region
if [ "$REGION" = "us-east-1" ]; then
    export DOMAIN_NAME="ec2.internal"
else
    export DOMAIN_NAME="$REGION.compute.internal"
fi

# create dhcp options and return id
dhcpOptionsId=$(aws ec2 create-dhcp-options \
    --region "$REGION" \
    --dhcp-configurations '{"Key":"domain-name","Values":["'"$DOMAIN_NAME"'"]}' '{"Key":"domain-name-servers","Values":["'"$DNS_IP"'"]}' \
    --tag-specifications "ResourceType=dhcp-options,Tags=[{Key=Name,Value=WIN_DNS}]" \
    --no-cli-pager \
    --query 'DhcpOptions.DhcpOptionsId' \
    --output text)

# attach the dhcp options to target vpc
aws ec2 associate-dhcp-options \
    --region "$REGION" \
    --dhcp-options-id "$dhcpOptionsId" \
    --vpc-id "$VPC_ID"

The following is a snapshot of the created DHCP options from the AWS web console:

The “Domain name”, cn-north-1.compute.internal, will be the “domain name” part of the long hostname (FQDN). For the us-east-1 region, specify ec2.internal; for other regions, specify <region>.compute.internal. Note that you must NOT set it to the domain name of Windows AD, i.e., example.com in our example. They are 2 different things, and mixing them up makes joining the realm fail. The “Domain name server”, 10.0.7.240, is the private IP of the Windows AD server. The following is a snapshot of the VPC with this DHCP options set attached:

2.1.4 Create EC2 Instances as Ranger Server

Next, we need to prepare an EC2 instance as the Ranger server. When creating the instance, please select the Amazon Linux 2 image and make sure that the instance and the cluster to be created can reach each other over the network.

As a best practice, it is recommended to add the Ranger server to the ElasticMapReduce-master security group: Ranger works so closely with the EMR cluster that it can be regarded as a master service that is not built into EMR. For Windows AD, we have to make sure its port 389 is reachable from the Ranger server and from all nodes of the EMR cluster to be created. To keep it simple, you can also add the Windows AD server to the ElasticMapReduce-master security group.
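One hedged way to confirm that port 389 is reachable from the Ranger server is a plain TCP connection test. The sketch below relies on bash’s /dev/tcp pseudo-device and the coreutils timeout command; port_open is a hypothetical helper name, and AD_HOST is expected to hold the FQDN of the Windows AD server.

```shell
# Sketch: succeed if a TCP connection to host:port can be opened within 3 seconds.
# port_open is a hypothetical helper; it needs bash (for /dev/tcp) and timeout.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Only probe when AD_HOST has been set to the FQDN of the Windows AD server.
if [ -n "${AD_HOST:-}" ]; then
    if port_open "$AD_HOST" 389; then
        echo "LDAP port 389 on $AD_HOST is reachable"
    else
        echo "LDAP port 389 on $AD_HOST is NOT reachable, check the security groups"
    fi
fi
```

A successful connection only shows that the network path and security groups are open; it does not validate any LDAP credentials.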

2.1.5 Download Installer

After the EC2 instances are ready, log in to the Ranger server via SSH and run the following commands to download the installer package:

sudo yum -y install git
git clone https://github.com/bluishglc/ranger-emr-cli-installer.git

2.1.6 Upload SSH Key File

As mentioned before, the installer runs on the local host (the Ranger server). To perform remote installation actions on the EMR cluster, an SSH private key is required, so we should upload it to the Ranger server and make a note of the file path; it will be the value of the variable SSH_KEY.

2.1.7 Export Environment-Specific Variables

During installation, the following environment-specific arguments are passed more than once. It is recommended to export them first, so that all command lines can refer to these variables instead of literals.

export REGION='TO_BE_REPLACED'
export ACCESS_KEY_ID='TO_BE_REPLACED'
export SECRET_ACCESS_KEY='TO_BE_REPLACED'
export SSH_KEY='TO_BE_REPLACED'
export AD_HOST='TO_BE_REPLACED'

The variables above have the following meanings:

REGION: the AWS region, e.g., cn-north-1, us-east-1 and so on.
ACCESS_KEY_ID: the AWS access key id of your IAM account. Be sure your account has enough privileges; admin permissions are preferable.
SECRET_ACCESS_KEY: the AWS secret access key of your IAM account.
SSH_KEY: the path on the local host of the SSH private key file you just uploaded.
AD_HOST: the FQDN of the Windows AD server.

Please carefully replace the values of the variables above according to your environment, and remember to use an FQDN as the hostname, i.e., for AD_HOST. The following is an example:

export REGION='cn-north-1'
export ACCESS_KEY_ID='<change-to-your-aws-access-key-id>'
export SECRET_ACCESS_KEY='<change-to-your-aws-secret-access-key>'
export SSH_KEY='/home/ec2-user/key.pem'
export AD_HOST='ip-10-0-14-0.cn-north-1.compute.internal'
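Before starting the installer, a short sanity check of these variables can catch placeholders that were never replaced. The following is a sketch only; check_env is a hypothetical helper, not part of the installer, and the rules it applies (non-empty values, an AD_HOST that looks like an FQDN, a readable key file) are assumptions drawn from the descriptions above.

```shell
# Sketch: report obvious problems with the exported variables.
# check_env is a hypothetical helper; its body runs in a subshell.
check_env() (
    rc=0
    for name in REGION ACCESS_KEY_ID SECRET_ACCESS_KEY SSH_KEY AD_HOST; do
        eval "value=\${$name:-}"
        case "$value" in
            ''|TO_BE_REPLACED|'<'*) echo "$name is not set to a real value"; rc=1 ;;
        esac
    done
    # An FQDN contains a dot and is not a bare IPv4 address.
    case "${AD_HOST:-}" in
        *.*) ;;
        *) echo "AD_HOST does not look like an FQDN"; rc=1 ;;
    esac
    case "${AD_HOST:-}" in
        *[!0-9.]*) ;;
        *) echo "AD_HOST should be a hostname, not an IP address"; rc=1 ;;
    esac
    # The SSH private key must exist and be readable.
    if [ -n "${SSH_KEY:-}" ] && [ ! -r "$SSH_KEY" ]; then
        echo "SSH_KEY file is not readable: $SSH_KEY"; rc=1
    fi
    exit "$rc"
)

if check_env; then
    echo "environment variables look complete"
else
    echo "please fix the variables listed above"
fi
```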

2.2 All-In-One Installation

2.2.1 Quick Start

Now, let’s start an all-in-one installation by executing this command line:

sudo sh ./ranger-emr-cli-installer/bin/setup.sh install \
    --region "$REGION" \
    --access-key-id "$ACCESS_KEY_ID" \
    --secret-access-key "$SECRET_ACCESS_KEY" \
    --ssh-key "$SSH_KEY" \
    --solution 'open-source' \
    --auth-provider 'ad' \
    --ad-host "$AD_HOST" \
    --ad-domain 'example.com' \
    --ad-domain-admin 'domain-admin' \
    --ad-domain-admin-password 'Admin1234!' \
    --ad-base-dn 'cn=users,dc=example,dc=com' \
    --ad-user-object-class 'person' \
    --ranger-plugins 'open-source-hdfs,open-source-hive'

For the specification of the parameters in the command line above, please refer to the appendix. We highlight 2 options here: --ad-domain-admin and --ad-domain-admin-password. They only appear in the “Windows AD + Open-Source Ranger” solution, where they are needed to complete the realm-joining operation.

If everything goes well, the command line executes steps 2.1 to 2.3 of the workflow diagram, which may take 10 minutes or more depending on your network bandwidth. It then pauses and prompts the user to enter an EMR cluster id. If the target cluster already exists, we can fill in its id immediately; if not, we should switch to the EMR web console to create it. Next, the command line asks the user to confirm whether Hue should integrate with LDAP. If yes, the installer will update the EMR configuration with Hue-specific settings once the cluster is ready (this action overwrites the existing EMR configuration).

After filling in the 2 items above, enter “y” to confirm all inputs and the installation process resumes. If the target EMR cluster is not ready yet, the command line keeps monitoring it until it enters the “WAITING” status. The following is a snapshot of the command line at this moment:

When the cluster is ready (its status is “WAITING”), the command line continues with steps 2.4 to 2.6 of the workflow and finally ends with an “ALL DONE!!” message.

2.2.2 Customization

Now that the all-in-one installation is done, let’s look at customization. Generally, this installer follows the principle of “Convention over Configuration”: most parameters are preset with default values. An equivalent version of the command line above with the full parameter list is as follows:

sudo sh ./ranger-emr-cli-installer/bin/setup.sh install \
    --region "$REGION" \
    --access-key-id "$ACCESS_KEY_ID" \
    --secret-access-key "$SECRET_ACCESS_KEY" \
    --ssh-key "$SSH_KEY" \
    --solution 'open-source' \
    --auth-provider 'ad' \
    --ad-host "$AD_HOST" \
    --ad-domain 'example.com' \
    --ad-domain-admin 'domain-admin' \
    --ad-domain-admin-password 'Admin1234!' \
    --ad-base-dn 'cn=users,dc=example,dc=com' \
    --ad-user-object-class 'person' \
    --ranger-plugins 'open-source-hdfs,open-source-hive' \
    --java-home '/usr/lib/jvm/java' \
    --skip-install-mysql 'false' \
    --skip-install-solr 'false' \
    --skip-configure-hue 'false' \
    --ranger-host "$(hostname -f)" \
    --ranger-version '2.1.0' \
    --mysql-host "$(hostname -f)" \
    --mysql-root-password 'Admin1234!' \
    --mysql-ranger-db-user-password 'Admin1234!' \
    --solr-host "$(hostname -f)" \
    --ranger-bind-dn 'cn=ranger,ou=services,dc=example,dc=com' \
    --ranger-bind-password 'Admin1234!' \
    --hue-bind-dn 'cn=hue,ou=services,dc=example,dc=com' \
    --hue-bind-password 'Admin1234!' \
    --restart-interval 30

The full-parameter version gives a complete view of all custom options. In the following scenarios, you may need to change some option values:

If you want to change the default organization name dc=example,dc=com or the default password Admin1234!, please run the full-parameter version and replace them with your own values.
If you need to integrate with external facilities, e.g., an existing MySQL or Solr, please add the corresponding --skip-xxx-xxx options and set them to true.
If you have other pre-defined bind DNs for Hue, Ranger and SSSD, please add the corresponding --xxx-bind-dn and --xxx-bind-password options to set them. Note that the bind DNs for hue, ranger and domain-admin are created automatically when installing Windows AD, but they are FIXED to the naming pattern cn=hue|ranger|domain-admin,ou=services,<your-base-dn> rather than the value given to the “--xxx-bind-dn” option. So if you assign another DN with the “--xxx-bind-dn” option, you MUST create this DN yourself in advance. The installer does NOT create the DN assigned by the “--xxx-bind-dn” option because a DN is actually a tree path: to create it, all nodes in the path must be created, and it is not cost-effective to implement such a small but complicated function.
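When the organization name changes, several option values have to change together. As a sketch, and assuming the DNs keep the same pattern as the defaults shown above, they can be derived from the AD domain name; domain_to_dn and the variable names are hypothetical.

```shell
# Sketch: derive DN-valued options from the AD domain name.
# domain_to_dn is a hypothetical helper: example.com -> dc=example,dc=com
domain_to_dn() {
    printf 'dc=%s\n' "$(printf '%s' "$1" | sed 's/\./,dc=/g')"
}

AD_DOMAIN='example.com'
ORG_DN=$(domain_to_dn "$AD_DOMAIN")

# Print the option values that follow the default naming pattern of this article.
echo "--ad-domain '$AD_DOMAIN'"
echo "--ad-base-dn 'cn=users,$ORG_DN'"
echo "--ranger-bind-dn 'cn=ranger,ou=services,$ORG_DN'"
echo "--hue-bind-dn 'cn=hue,ou=services,$ORG_DN'"
```

With the default domain this reproduces the values used in the full-parameter command line; for another domain, the printed values show what to substitute.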

2.3 Step-By-Step Installation

As an alternative, you can select the step-by-step installation instead of the all-in-one installation. We give the command line of each step below; for comments on each parameter, please refer to the appendix.

2.3.1 Init EC2

This step completes some fundamental jobs, e.g., installing the AWS CLI, the JDK and so on.

sudo sh ./ranger-emr-cli-installer/bin/setup.sh init-ec2 \
    --region "$REGION" \
    --access-key-id "$ACCESS_KEY_ID" \
    --secret-access-key "$SECRET_ACCESS_KEY"

2.3.2 Install Ranger

This step will install all server-side components of Ranger, including MySQL, Solr, Ranger Admin and Ranger UserSync.

sudo sh ./ranger-emr-cli-installer/bin/setup.sh install-ranger \
    --region "$REGION" \
    --solution 'open-source' \
    --auth-provider 'ad' \
    --ad-domain 'example.com' \
    --ad-host "$AD_HOST" \
    --ad-base-dn 'cn=users,dc=example,dc=com' \
    --ad-user-object-class 'person' \
    --ranger-bind-dn 'cn=ranger,ou=services,dc=example,dc=com' \
    --ranger-bind-password 'Admin1234!'

2.3.3 Create EMR Cluster

For the step-by-step installation, there is no interactive process for creating the EMR cluster, so feel free to create the cluster on the EMR web console. We have to wait until the cluster is completely ready (in “WAITING” status), then export the following environment-specific variable:

export EMR_CLUSTER_ID='TO_BE_REPLACED'

The following is an example:

export EMR_CLUSTER_ID='j-2S04VJZ5YQHZ4'
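Waiting for the “WAITING” status can be scripted instead of watching the console. The sketch below polls a state-reporting command; wait_for_waiting and emr_cluster_state are hypothetical helper names, and the AWS CLI call assumes REGION and EMR_CLUSTER_ID are exported and the CLI is configured.

```shell
# Sketch: poll a state-reporting command until it prints WAITING.
# $1: command that prints the current state; $2: seconds between polls; $3: maximum polls.
wait_for_waiting() {
    get_state=$1 interval=$2 tries=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        state=$("$get_state")
        echo "cluster state: $state"
        case "$state" in
            WAITING) return 0 ;;
            TERMINATING|TERMINATED|TERMINATED_WITH_ERRORS) return 1 ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# State getter backed by the AWS CLI; assumes REGION and EMR_CLUSTER_ID are exported.
emr_cluster_state() {
    aws emr describe-cluster --region "$REGION" --cluster-id "$EMR_CLUSTER_ID" \
        --query 'Cluster.Status.State' --output text
}

# Usage (polls every 30 seconds, at most 60 times):
#   wait_for_waiting emr_cluster_state 30 60
```

Passing the state getter as an argument keeps the polling loop independent of the AWS CLI, so it can be exercised with a stub before pointing it at a real cluster.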

2.3.4 Install Ranger Plugins

This step installs the HDFS and Hive plugins on the Ranger server side and the agent side (EMR nodes). This differs from the EMR-native Ranger solution: for EMR-native Ranger, EMR installs the agent side on each node automatically, whereas for open-source Ranger we have to do this job ourselves via this installer.

sudo sh ./ranger-emr-cli-installer/bin/setup.sh install-ranger-plugins \
    --region "$REGION" \
    --ssh-key "$SSH_KEY" \
    --solution 'open-source' \
    --auth-provider 'ad' \
    --ranger-plugins 'open-source-hdfs,open-source-hive' \
    --emr-cluster-id "$EMR_CLUSTER_ID"

2.3.5 Install SSSD

This step installs and configures SSSD on each node of the EMR cluster. We do not need to log in to each node; just stay on the local host and run the command line, and it will operate on the remote nodes via SSH.

sudo sh ./ranger-emr-cli-installer/bin/setup.sh install-sssd \
    --ssh-key "$SSH_KEY" \
    --solution 'open-source' \
    --auth-provider 'ad' \
    --ad-host "$AD_HOST" \
    --ad-domain 'example.com' \
    --ad-domain-admin 'domain-admin' \
    --ad-domain-admin-password 'Admin1234!' \
    --emr-cluster-id "$EMR_CLUSTER_ID"
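After this step, one hedged way to confirm that a node resolves Windows AD accounts is to query the name service switch on that node. The sketch below uses getent and id; user_resolves is a hypothetical helper, and example-user-1 is the sample account used in the verification section. Depending on the SSSD configuration, the account may need to be written in a domain-qualified form.

```shell
# Sketch: run on a node of the EMR cluster after SSSD is installed.
# user_resolves is a hypothetical helper: it succeeds if the account can be looked up.
user_resolves() {
    getent passwd "$1" >/dev/null 2>&1
}

for user in example-user-1; do
    if user_resolves "$user"; then
        id "$user"
    else
        echo "$user cannot be resolved on this node"
    fi
done
```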

2.3.6 Configure Hue

This step updates the Hue configuration of EMR. As highlighted in the all-in-one installation, it overwrites the existing EMR configuration, so if you have other customized EMR configuration, please skip this step. You can still manually merge the Hue configuration JSON file generated by the command line into your own JSON.

sudo sh ./ranger-emr-cli-installer/bin/setup.sh configure-hue \
    --region "$REGION" \
    --solution 'open-source' \
    --auth-provider 'ad' \
    --ad-host "$AD_HOST" \
    --ad-domain 'example.com' \
    --ad-base-dn 'cn=users,dc=example,dc=com' \
    --ad-user-object-class 'person' \
    --hue-bind-dn 'cn=hue,ou=services,dc=example,dc=com' \
    --hue-bind-password 'Admin1234!' \
    --emr-cluster-id "$EMR_CLUSTER_ID"

3. Verification

After installation & integration is completed, it is time to check whether Ranger works. The verification is divided into 2 parts, one for HDFS and one for Hive. First, let us log in to Windows AD via a client, e.g., LdapAdmin or Apache Directory Studio, and check all DNs. They should look as follows:

Next, open the Ranger web console at http://<YOUR-RANGER-HOST>:6080; the default admin account/password is admin/admin. After logging in, open the “Users/Groups/Roles” page first and check whether the example users on Windows AD have been synchronized to Ranger, as follows:

In addition, log in to the master node of the EMR cluster and export the cluster id, because subsequent command lines need this variable.

# run on master node of emr cluster
export EMR_CLUSTER_ID='TO_BE_REPLACED'

The following is an example:

# run on master node of emr cluster
export EMR_CLUSTER_ID='j-2S04VJZ5YQHZ4'

3.1 HDFS Access Control Verification

Usually, there is a set of pre-defined policies for the HDFS plugin after installation, as follows:

We have NOT configured any HDFS permissions for example-user-1, but if we log in to Hue with the account example-user-1, we will see that it can browse most directories and files on HDFS. This is because most directories and files have a+w permission. Please keep in mind that the HDFS r/w/x file mode attributes and Ranger-based permissions always take effect at the same time.

To verify that the HDFS plugin works, we use the “blacklist” mode for testing. First, let’s create a directory named /ranger-test on HDFS and set example-user-1 as its owner:

# run on master node of emr cluster
sudo -u hdfs hdfs dfs -mkdir /ranger-test
sudo -u hdfs hdfs dfs -chown example-user-1:example-group /ranger-test
sudo -u hdfs hdfs dfs -chmod 700 /ranger-test

Next, let’s add a deny policy which prevents example-user-1 from reading and writing /ranger-test:

Any policy change on the Ranger web console is synced to the agent side (EMR cluster nodes) within 30 seconds. We can run the following commands on the master node to check whether the local policy file has been updated:

# run on master node of emr cluster
for i in {1..10}; do
    printf "\n%100s\n\n"|tr ' ' '='
    sudo stat /etc/ranger/HDFS_$EMR_CLUSTER_ID/policycache/hdfs_HDFS_$EMR_CLUSTER_ID.json
    sleep 3
done
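Instead of reading the stat output by eye, the wait can be automated by comparing modification times. The following is a sketch under a few assumptions: it uses GNU stat (-c %Y), wait_for_update is a hypothetical helper, and it only detects a change that happens after it has been started.

```shell
# Sketch: wait until a file's modification time changes.
# $1: file path; $2: seconds between checks; $3: maximum number of checks.
# Set STAT='sudo stat' when the file is only visible to root.
wait_for_update() {
    file=$1 interval=$2 tries=$3
    before=$(${STAT:-stat} -c %Y "$file") || return 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        sleep "$interval"
        now=$(${STAT:-stat} -c %Y "$file") || return 2
        if [ "$now" != "$before" ]; then
            echo "policy file updated (mtime $now)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no change after $tries checks"
    return 1
}

# Usage on the master node, started right before saving the policy in Ranger:
#   STAT='sudo stat' wait_for_update \
#       "/etc/ranger/HDFS_$EMR_CLUSTER_ID/policycache/hdfs_HDFS_$EMR_CLUSTER_ID.json" 3 20
```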

Once the local policy file is up to date, the deny policy takes effect. Log in to Hue with the Windows AD account “example-user-1” created by the installer, open “File Browser”, click the root directory “/”, then click the “ranger-test” folder. We will get an error message, “Cannot access:/ranger-test”:

Even though the current user example-user-1 is the owner of this folder, it is still blocked by the Ranger HDFS plugin. This shows that HDFS access control is managed by Ranger.

Finally, remember to REMOVE the “ranger-test” policy so that example-user-1 has full privileges to access this folder, because the following Hive verification re-uses it.

3.2 Hive Access Control Verification

Usually, there is a set of pre-defined policies for the Hive plugin after installation. To eliminate interference and keep the verification simple, let’s REMOVE them first:

Any policy change on the Ranger web console is synced to the agent side (EMR cluster nodes) within 30 seconds. We can run the following commands on the master node to check whether the local policy file has been updated:

# run on master node of emr cluster
for i in {1..10}; do
    printf "\n%100s\n\n"|tr ' ' '='
    sudo stat /etc/ranger/HIVE_$EMR_CLUSTER_ID/policycache/hiveServer2_HIVE_$EMR_CLUSTER_ID.json
    sleep 3
done

Once the local policy file is up to date, the removal of all policies takes effect. Log in to Hue with the Windows AD account “example-user-1” created by the installer, open the Hive editor, and enter the following SQL to create a test table stored in the /ranger-test directory created earlier:

-- run in hue hive editor
create table ranger_test (
  id bigint
)
row format delimited
stored as textfile location '/ranger-test';

Then run it, and an error occurs:

It shows that example-user-1 is blocked by database-related permissions, which proves that the Hive plugin is working. Now go back to Ranger and add a Hive policy named “all - database, table, column” as follows:

It grants example-user-1 all privileges on all databases, tables and columns. Check the policy file again on the master node with the previous command line; once it is updated, go back to Hue and re-run the SQL. This time it succeeds, as follows:

To double check that example-user-1 has full read & write permissions on the table, we can run the following SQL:

insert into ranger_test(id) values(1);
insert into ranger_test(id) values(2);
insert into ranger_test(id) values(3);
select * from ranger_test;

The execution result is:

At this point, the Hive access control verification has passed.

4. Appendix

The following is the parameter specification:

Parameter	Comment
--region	the aws region.
--access-key-id	the aws access key id of your IAM account.
--secret-access-key	the aws secret access key of your IAM account.
--ssh-key	the ssh private key file path.
--solution	the solution name, accepted values ‘open-source’ or ‘emr-native’.
--auth-provider	the authentication provider, accepted values ‘ad’ or ‘openldap’.
--openldap-host	the FQDN of openldap host.
--openldap-base-dn	the base dn of openldap, for example: ‘dc=example,dc=com’, change it according to your env.
--openldap-root-cn	the cn of root account, for example: ‘admin’, change it according to your env.
--openldap-root-password	the password of root account, for example: ‘Admin1234!’, change it according to your env.
--ranger-bind-dn	the bind dn for ranger, for example: ‘cn=ranger,ou=services,dc=example,dc=com’, this should be an existing dn on Windows AD / OpenLDAP, change it according to your env.
--ranger-bind-password	the password of ranger bind dn, for example: ‘Admin1234!’, change it according to your env.
--openldap-user-dn-pattern	the dn pattern for ranger to search users on OpenLDAP, for example: ‘uid={0},ou=users,dc=example,dc=com’, change it according to your env.
--openldap-group-search-filter	the filter for ranger to search groups on OpenLDAP, for example: ‘(member=uid={0},ou=users,dc=example,dc=com)’, change it according to your env.
--openldap-user-object-class	the user object class for ranger to search users, for example: ‘inetOrgPerson’, change it according to your env.
--hue-bind-dn	the bind dn for hue, for example: ‘cn=hue,ou=services,dc=example,dc=com’, this should be an existing dn on Windows AD / OpenLDAP, change it according to your env.
--hue-bind-password	the password of hue bind dn, for example: ‘Admin1234!’, change it according to your env.
--example-users	the example users to be created on OpenLDAP & Kerberos so as to demo ranger’s feature, this parameter is optional, if omitted, no example users will be created.
--sssd-bind-dn	the bind dn for sssd, for example: ‘cn=sssd,ou=services,dc=example,dc=com’, this should be an existing dn on Windows AD / OpenLDAP, change it according to your env.
--sssd-bind-password	the password of sssd bind dn, for example: ‘Admin1234!’, change it according to your env.
--ranger-plugins	the ranger plugins to be installed, comma separated for multiple values. for example: ‘open-source-hdfs,open-source-hive’, change it according to your env.
--skip-configure-hue	skip to configure hue, accepted values ‘true’ or ‘false’, default value is ‘false’.
--skip-migrate-kerberos-db	skip to migrate kerberos database, accepted values ‘true’ or ‘false’, default value is ‘false’.