DataStax Enterprise on Docker: fails to start due to /hadoop/conf directory not being writable
Posted: 2017-03-26 20:10:35

Question: I followed DataStax's guide on best practices for using DSE with Docker, but I run into the following error even when using all of the default setup scripts and Dockerfiles that DataStax provides.

Error log
Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377) ~[dse-core-5.0.3.jar:5.0.3]
at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306) ~[dse-core-5.0.3.jar:5.0.3]
... 7 common frames omitted
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveConfiguration(CassandraJobConf.java:466) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveDseHadoopConfiguration(CassandraJobConf.java:345) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:300) ~[dse-hadoop-5.0.3.jar:5.0.3]
... 11 common frames omitted
Unable to start DSE server: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
com.datastax.bdp.plugin.PluginManager$PluginActivationException: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:327)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:259)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:169)
at com.datastax.bdp.plugin.PluginManager.preStart(PluginManager.java:77)
at com.datastax.bdp.server.DseDaemon.preStart(DseDaemon.java:490)
at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:462)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:583)
at com.datastax.bdp.DseModule.main(DseModule.java:91)
Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310)
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174)
at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20)
at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377)
at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306)
... 7 more
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf
The error seems straightforward, and I tried to resolve it by adding some extra chmod calls to the Dockerfile, but to no avail.
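Before patching the image, it can help to reproduce the check that is failing: the plugin gives up when the conf directory is not writable by the user DSE runs as. Below is a minimal sketch of that check; `check_writable` is a hypothetical helper, not part of DSE, and inside the container you would run it as the cassandra user (e.g. via `docker exec dse gosu cassandra ...`) against the real path from the stack trace.

```shell
#!/bin/sh
# Hypothetical diagnostic mirroring DSE's "Directory not writable" check.
check_writable() {
    dir=$1
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "NOT writable: $dir"
    fi
}

# Demonstrated on a throwaway directory; in the container you would pass
# /opt/dse/resources/hadoop/conf instead.
demo=$(mktemp -d)
check_writable "$demo"
chmod a-w "$demo"
check_writable "$demo"   # reports NOT writable when run as a non-root user
rmdir "$demo"
```

If the second call reports the directory as not writable for the cassandra user, the fix belongs in ownership/permissions (as the answers below suggest), not in DSE configuration.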
Dockerfile
# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.
# Loosely based on docker-cassandra by the fine folk at Spotify
# -- https://github.com/spotify/docker-cassandra/
# Loosely based on cassandra-docker by the one and only Al Tobey
# -- https://github.com/tobert/cassandra-docker/
# base yourself on any ubuntu 14.04 image containing JDK8
# official Docker Java images are distributed with OpenJDK
# Datastax certifies its product releases specifically
# on the Oracle/Sun JVM, so YMMV with OpenJDK
FROM nimmis/java:oracle-8-jdk
# Avoid ERROR: invoke-rc.d: policy-rc.d denied execution of start.
RUN echo "#!/bin/sh\nexit 0" > /usr/sbin/policy-rc.d
RUN export DEBIAN_FRONTEND=noninteractive && \
apt-get update && \
apt-get -y install adduser \
curl \
lsb-base \
procps \
zlib1g \
gzip \
python \
python-support \
sysstat \
ntp bash tree && \
rm -rf /var/lib/apt/lists/*
# grab gosu for easy step-down from root
RUN curl -o /bin/gosu -SkL "https://github.com/tianon/gosu/releases/download/1.4/gosu-$(dpkg --print-architecture)" \
&& chmod +x /bin/gosu
# DSE tarball can be downloaded into the folder where the Dockerfile is
# wget --user=$USER --password=$PASS http://downloads.datastax.com/enterprise/dse-5.0.0-bin.tar.gz
# you may want to replace dse-5.0.0-bin.tar.gz with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named dse-bin.tar.gz (that way the docker file itself remains version independent).
#
# DataStax Agent debian package can be downloaded from
# wget --user=$USER --password=$PASS http://debian.datastax.com/enterprise/pool/datastax-agent_6.0.0_all.deb
# you may want to replace the specific version with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named datastax-agent_all.deb (that way the docker file itself remains version
# independent).
ADD dse.tar.gz /opt
ADD datastax-agent_all.deb /tmp
ENV DSE_HOME /opt/dse
RUN ln -s /opt/dse* $DSE_HOME
# keep data here
VOLUME /data
# and logs here
VOLUME /logs
VOLUME /opt/dse
# create a dedicated user for running DSE node
RUN groupadd -g 1337 cassandra && \
useradd -u 1337 -g cassandra -s /bin/bash -d $DSE_HOME cassandra && \
chown -R cassandra:cassandra /opt/dse*
RUN chmod -R a+rw /opt/dse/
# install the agent
RUN dpkg -i /tmp/datastax-agent_all.deb
# starting node using custom entrypoint that configures paths, interfaces, etc.
COPY scripts/dse-entrypoint /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-entrypoint
ENTRYPOINT ["/usr/local/bin/dse-entrypoint"]
# Running any other DSE/C* command should be done on behalf of the dse user
# Perform that using a generic command launcher
COPY scripts/dse-cmd-launcher /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-cmd-launcher
# link dse commands to the launcher
RUN for cmd in cqlsh dsetool nodetool dse cassandra-stress; do \
ln -sf /usr/local/bin/dse-cmd-launcher /usr/local/bin/$cmd ; \
done
# the detailed list of ports
# http://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/sec/secConfFirePort.html
# Cassandra
EXPOSE 7000 9042 9160
# Solr
EXPOSE 8983 8984
# Spark
EXPOSE 4040 7080 7081 7077
# Hadoop
EXPOSE 8012 50030 50060 9290
# Hive/Shark
EXPOSE 10000
# Graph
The last place that might resolve this is the startup script that actually launches DSE when this container starts.

DSE startup script (invoked by the Docker container on start)
#!/bin/sh
# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.
# Bind the various services
# These should be updated on every container start
if [ -z "$IP" ]; then
    IP=$(hostname --ip-address)
fi
echo $IP > /data/ip.address
# create directories for holding the node's data, logs, etc.
create_dirs() {
    local base_dir=$1
    mkdir -p $base_dir/data/commitlog
    mkdir -p $base_dir/data/saved_caches
    mkdir -p $base_dir/data/hints
    mkdir -p $base_dir/logs
}
# tweak the cassandra config
tweak_cassandra_config() {
    env="$1/cassandra-env.sh"
    conf="$1/cassandra.yaml"
    base_data_dir="/data"
    # Set the cluster name
    if [ -z "$CLUSTER_NAME" ]; then
        printf " - No cluster name provided; skipping.\n"
    else
        printf " - Setting up the cluster name: $CLUSTER_NAME\n"
        regexp="s/Test Cluster/$CLUSTER_NAME/g"
        sed -i -- "$regexp" $conf
    fi
    # Set the commitlog directory, and various other directories.
    # These are done only once since the regexp matches will fail on
    # subsequent runs.
    printf " - Setting up directories\n"
    regexp="s|/var/lib/cassandra/|$base_data_dir/|g"
    sed -i -- "$regexp" $conf
    regexp="s/^listen_address:.*/listen_address: $IP/g"
    sed -i -- "$regexp" $conf
    regexp="s/rpc_address:.*/rpc_address: $IP/g"
    sed -i -- "$regexp" $conf
    # seeds
    if [ -z "$SEEDS" ]; then
        printf " - Using own IP address $IP as seed.\n"
        regexp="s/seeds:.*/seeds: \"$IP\"/g"
    else
        printf " - Using seeds: $SEEDS\n"
        regexp="s/seeds:.*/seeds: \"$IP,$SEEDS\"/g"
    fi
    sed -i -- "$regexp" $conf
    # JMX
    echo "JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=127.0.0.1\"" >> $env
}
tweak_dse_in_sh() {
    # point C* logs dir to the created volume
    sed -i -- "s|/var/log/cassandra|/logs|g" "$1/dse.in.sh"
}
tweak_spark_config() {
    sed -i -- "s|/var/lib/spark/|/data/spark/|g" "$1/spark-env.sh"
    sed -i -- "s|/var/log/spark/|/logs/spark/|g" "$1/spark-env.sh"
    mkdir -p /data/spark/worker
    mkdir -p /data/spark/rdd
    mkdir -p /logs/spark/worker
}
tweak_agent_config() {
    [ -d "/var/lib/datastax-agent" ] && cat > /var/lib/datastax-agent/conf/address.yaml <<EOF
stomp_interface: $STOMP_INTERFACE
use_ssl: 0
local_interface: $IP
hosts: ["$IP"]
cassandra_install_location: /opt/dse
cassandra_log_location: /logs
EOF
    chown cassandra:cassandra /var/lib/datastax-agent/conf/address.yaml
}
setup_node() {
    printf "* Setting up node...\n"
    printf " + Setting up node...\n"
    create_dirs
    tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
    tweak_dse_in_sh "$DSE_HOME/bin"
    tweak_spark_config "$DSE_HOME/resources/spark/conf"
    tweak_agent_config
    chown -R cassandra:cassandra /data /logs /conf
    # mark that we tweaked configs
    touch "$DSE_HOME/tweaked_configs"
    printf "Done.\n"
}
# if marker file doesn't exist, setup node
[ ! -f "$DSE_HOME/tweaked_configs" ] && setup_node
[ -f "/etc/init.d/datastax-agent" ] && /etc/init.d/datastax-agent start
exec gosu cassandra "$DSE_HOME/bin/dse" cassandra -f "$@"
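The sed-based config tweaks in the entrypoint can be sanity-checked outside the container. This sketch applies the same substitutions to a small stand-in fragment of cassandra.yaml (the fragment and the sample values are illustrative; the real file ships with DSE):

```shell
#!/bin/sh
# Stand-in fragment containing just the lines the entrypoint's seds target.
conf=$(mktemp)
cat > "$conf" <<'EOF'
cluster_name: 'Test Cluster'
commitlog_directory: /var/lib/cassandra/commitlog
listen_address: localhost
          - seeds: "127.0.0.1"
EOF

# Illustrative values; the entrypoint derives these from the environment.
IP=10.0.0.5
CLUSTER_NAME=My_Cluster

# The same substitutions the entrypoint applies
sed -i -- "s/Test Cluster/$CLUSTER_NAME/g" "$conf"
sed -i -- "s|/var/lib/cassandra/|/data/|g" "$conf"
sed -i -- "s/^listen_address:.*/listen_address: $IP/g" "$conf"
sed -i -- "s/seeds:.*/seeds: \"$IP\"/g" "$conf"

cat "$conf"
```

Running this shows why the directory substitution is one-shot: once `/var/lib/cassandra/` has been rewritten to `/data/`, the pattern no longer matches on subsequent runs, which is what the marker file guards against.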
Docker container command line

This is the command line I use to start a single DSE instance via Docker:
#!/bin/bash
# Used to start a single DSE node that has both Spark and Cassandra running on it
OPSC_CONTAINER=$1
if [ -z "$OPSC_CONTAINER" ]; then
echo "usage: start_docker_cluster.sh OPSCContainerName"
echo " OPSCContainerName mandatory name of the container running OpsCenter"
exit 1
fi
[ -z "$CLUSTER_NAME" ] && CLUSTER_NAME="Test_Cluster"
STOMP_INTERFACE=`docker exec $OPSC_CONTAINER hostname -I`
docker run -p 7080:7080 -p 4040:4040 -p 7077:7077 -p 9042:9042 --link $OPSC_CONTAINER -d -e CLUSTER_NAME="$CLUSTER_NAME" -e STOMP_INTERFACE="$STOMP_INTERFACE" --name dse dse -k -t
The -k -t flags indicate that both Spark and Hadoop should be launched for this container. I have removed the -t flag, and even without it this configuration error still occurs.
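One incidental thing worth checking in the launch script above: `hostname -I` prints a space-separated list of addresses with a trailing space, so `STOMP_INTERFACE` may contain whitespace or several addresses. A sketch of trimming it to the first address before passing it to `docker run` (assuming the first address is the one OpsCenter listens on):

```shell
#!/bin/sh
# Simulated output of `docker exec $OPSC_CONTAINER hostname -I`:
# space-separated addresses with a trailing space.
raw="172.17.0.2 10.0.0.7 "

# Keep only the first whitespace-separated field
STOMP_INTERFACE=$(echo "$raw" | awk '{print $1}')
echo "$STOMP_INTERFACE"   # prints 172.17.0.2
```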
What do I need to do to make the /opt/dse/resources/hadoop/conf directory writable so that DSE can start successfully?
Answer 1:

Adding 'chown -RHh cassandra:cassandra /opt/dse' to the entrypoint script solved my problem of /opt/dse/resources/hadoop/conf not being writable.

Re: the error 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker

Check your spark-env.sh and review your directory mappings. In my case I mounted two external volumes, /data and /logs, and both directories are owned by cassandra:cassandra.
# This is a base directory for Spark Worker work files.
if [ "x$SPARK_WORKER_DIR" = "x" ]; then
export SPARK_WORKER_DIR="/data/spark/worker"
fi
if [ "x$SPARK_LOCAL_DIRS" = "x" ]; then
export SPARK_LOCAL_DIRS="/data/spark/rdd"
fi
# This is a base directory for Spark Worker logs.
if [ "x$SPARK_WORKER_LOG_DIR" = "x" ]; then
export SPARK_WORKER_LOG_DIR="/logs/spark/worker"
fi
# This is a base directory for Spark Master logs.
if [ "x$SPARK_MASTER_LOG_DIR" = "x" ]; then
export SPARK_MASTER_LOG_DIR="/logs/spark/master"
fi
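The `[ "x$VAR" = "x" ]` pattern in spark-env.sh above is a portable way to test for an unset or empty variable; the leading `x` guards against values that older `test` implementations could mistake for operators. The same default-if-unset logic in isolation:

```shell
#!/bin/sh
set_worker_dir() {
    # Same guard spark-env.sh uses: fall back to a default when unset/empty
    if [ "x$SPARK_WORKER_DIR" = "x" ]; then
        SPARK_WORKER_DIR="/data/spark/worker"
    fi
    echo "$SPARK_WORKER_DIR"
}

unset SPARK_WORKER_DIR
set_worker_dir            # prints the default: /data/spark/worker

SPARK_WORKER_DIR=/mnt/ssd/worker
set_worker_dir            # prints the override: /mnt/ssd/worker
```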
This video shows a fully functional DSE Enterprise running on Docker: https://vimeo.com/181393134
Answer 2:

I added chown -RHh cassandra:cassandra /opt/dse to the setup_node() section of the DSE startup script (invoked by the Docker container on start), and that fixed the problem. See chown --help for more information on these options.

Note: I now get

ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker

later on, but at least my fix lets you get past the original issue.
setup_node() {
    printf "* Setting up node...\n"
    printf " + Setting up node...\n"
    create_dirs
    tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
    tweak_dse_in_sh "$DSE_HOME/bin"
    tweak_spark_config "$DSE_HOME/resources/spark/conf"
    tweak_agent_config
    tweak_dse_config "$DSE_HOME/resources/dse/conf"
    chown -R cassandra:cassandra /data /logs /conf
    chown -RHh cassandra:cassandra /opt/dse
    # mark that we tweaked configs
    touch "$DSE_HOME/tweaked_configs"
    printf "Done.\n"
}
Answer 3:

Max's answer ("I added chown -RHh cassandra:cassandra /opt/dse to the setup_node() section of the DSE startup script, invoked by the Docker container on start") worked for me, but instead of his follow-up error I got:

Unable to activate plugin com.datastax.bdp.plugin.DseFsPlugin
(...)
java.io.IOException: Failed to create work directory: /var/lib/dsefs

So I had to turn my setup_node() into this:
setup_node() {
    printf "* Setting up node...\n"
    printf " + Setting up node...\n"
    create_dirs
    tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
    tweak_dse_in_sh "$DSE_HOME/bin"
    tweak_spark_config "$DSE_HOME/resources/spark/conf"
    tweak_agent_config
    chown -R cassandra:cassandra /data /logs /conf
    mkdir -p /var/lib/dsefs
    chown -RHh cassandra:cassandra /opt/dse /var/lib/dsefs
    # mark that we tweaked configs
    touch "$DSE_HOME/tweaked_configs"
    printf "Done.\n"
}
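All of these fixes rely on the entrypoint's marker-file pattern (`$DSE_HOME/tweaked_configs`) so the one-time setup does not re-run on every container restart. The same idea in isolation, with `HOME_DIR` standing in for `$DSE_HOME` and a log file added purely to make the run count observable:

```shell
#!/bin/sh
# One-time setup guarded by a marker file, as the entrypoint does.
HOME_DIR=$(mktemp -d)   # stand-in for $DSE_HOME

setup_node() {
    mkdir -p "$HOME_DIR/data" "$HOME_DIR/logs"
    echo ran >> "$HOME_DIR/setup.log"
    # mark that we tweaked configs
    touch "$HOME_DIR/tweaked_configs"
}

# First "container start": marker absent, setup runs.
# (The entrypoint writes this as `[ ! -f ... ] && setup_node`.)
if [ ! -f "$HOME_DIR/tweaked_configs" ]; then setup_node; fi
# Second "container start": marker present, setup is skipped
if [ ! -f "$HOME_DIR/tweaked_configs" ]; then setup_node; fi
echo done
```

Since /opt/dse is declared as a VOLUME, the marker lives with the rest of the volume's data, so the guard holds across restarts of the same container.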