Mesos cluster: 2 Linux agents and 1 Windows agent
Posted by yangzhenping
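First, a screenshot of the result:
I. Mesos master and slave (the slave is simply the agent) set up on a single machine:
I ran into two problems along the way: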
1. mesos failed to connect to 5050
I was having the same issues and what fixed it for me was the zookeeper configuration. In my case I was using the EC2 public IP Address rather than the private one. Once I changed the /etc/mesos/zk file to zk://<private IP>:2181/mesos I was able to connect without the constant error messages. In other words, zookeeper was reporting to be running in one IP and mesos-master was trying to connect using a different IP.
https://stackoverflow.com/questions/40641674/mesos-failed-to-connect-error-to-ip5050/43293013
vi /etc/hosts
service zookeeper restart
service mesos-slave restart
service mesos-master restart
service marathon restart
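The fix quoted above comes down to making the address in /etc/mesos/zk match the address ZooKeeper is actually reachable on. As a minimal sketch (the helper name zk_host is hypothetical, and it only handles a single-host zk:// URL), the following extracts the host from a zk URL so it can be compared against the machine's private IP:

```shell
# Hypothetical helper: print the host part of a single-host zk:// URL.
zk_host() {
  rest="${1#zk://}"             # drop the scheme
  rest="${rest%%/*}"            # drop the znode path (/mesos)
  printf '%s\n' "${rest%%:*}"   # drop the port
}

# Example with the master address used later in this post:
zk_host "zk://10.226.210.177:2181/mesos"
```

On a real host it could be called as zk_host "$(cat /etc/mesos/zk)" and the result compared with the private address of the machine.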
2. Failed to perform recovery: Incompatible agent info detected.
rm -rf /var/log/mesos/*.*
rm -f /var/lib/mesos/meta/slaves/latest
cat /var/log/mesos/mesos-slave.ERROR
root@omi64ub16-dev1:~# ls /var/lib/mesos/meta/slaves/
f09b786a-3e72-44a0-99b5-3ff52bc7f816-S0 latest
root@omi64ub16-dev1:~# rm -rf /var/lib/mesos/meta/slaves/f09b786a-3e72-44a0-99b5-3ff52bc7f816-S0
root@omi64ub16-dev1:~# cat /var/log/mesos/mesos-slave.ERROR
Log file created at: 2017/11/06 22:46:05
Running on machine: omi64ub16-dev1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E1106 22:46:05.371798 64416 slave.cpp:6286] EXIT with status 1: Failed to perform recovery: Failed to find latest agent: No such file or directory
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/meta/slaves/latest
This ensures agent doesn't recover old live executors.
Step 2: Restart the agent.
root@omi64ub16-dev1:~# rm -f /var/lib/mesos/meta/slaves/latest
root@omi64ub16-dev1:~# rm -rf /var/log/mesos/*.*
root@omi64ub16-dev1:~# cat /var/log/mesos/mesos-slave.ERROR
cat: /var/log/mesos/mesos-slave.ERROR: No such file or directory
root@omi64ub16-dev1:~# cat /var/log/mesos/mesos-slave.ERROR
cat: /var/log/mesos/mesos-slave.ERROR: No such file or directory
root@omi64ub16-dev1:~# ls /var/lib/mesos/meta/slaves/
8dc571e3-cd46-49f5-a4a7-6d95097c9a9d-S0 latest
root@omi64ub16-dev1:~#
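The sequence that worked above can be summarised as: remove only the latest entry, leave the old agent directory alone, then restart the agent. Deleting the agent directory itself, as in the first attempt, left latest pointing at nothing and produced the "Failed to find latest agent" error. A rough sketch, assuming the work directory is /var/lib/mesos as in the listing above (the function name clear_latest_agent is made up for illustration):

```shell
# Hypothetical helper: remove the "latest" entry so the agent registers
# with a fresh ID instead of trying to recover the old one.
clear_latest_agent() {
  work_dir="${1:-/var/lib/mesos}"   # assumed default work directory
  latest="$work_dir/meta/slaves/latest"
  if [ -e "$latest" ] || [ -L "$latest" ]; then
    rm -f "$latest" && echo "removed $latest"
  else
    echo "no latest pointer at $latest"
  fi
}

# On the agent host (as root), followed by a restart:
#   clear_latest_agent /var/lib/mesos
#   service mesos-slave restart
```

The work directory is passed as an argument so the function can be tried against a scratch directory before touching the real one.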
II. Adding another Linux machine as an agent:
Setup Mesos Multi-node Cluster on Ubuntu:
https://techpolymath.com/2014/08/28/setup-mesos-multi-node-cluster-on-ubuntu/
echo 10.226.174.148 | sudo tee /etc/mesos-slave/ip
echo zk://10.226.210.177:2181/mesos | sudo tee /etc/mesos/zk
echo 10.226.174.148 | sudo tee /etc/mesos-slave/hostname
service mesos-slave restart
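The three files written above are all the second agent needs: its own address in ip and hostname, and the master's ZooKeeper address in zk. As a sketch only, the same steps can be wrapped in a function; the name write_agent_config and the optional root prefix are assumptions added here so the result can be inspected in a scratch directory first:

```shell
# Hypothetical helper: write the agent's ip, hostname and zk files.
# $1 = agent IP, $2 = master/ZooKeeper IP, $3 = optional root prefix
# (a scratch directory for a dry run; empty means the real /etc).
write_agent_config() {
  agent_ip="$1"; master_ip="$2"; root="${3:-}"
  mkdir -p "$root/etc/mesos-slave" "$root/etc/mesos"
  echo "$agent_ip" > "$root/etc/mesos-slave/ip"
  echo "$agent_ip" > "$root/etc/mesos-slave/hostname"
  echo "zk://$master_ip:2181/mesos" > "$root/etc/mesos/zk"
}

# Dry run into a scratch directory, using the addresses from this post:
scratch="$(mktemp -d)"
write_agent_config 10.226.174.148 10.226.210.177 "$scratch"
cat "$scratch/etc/mesos/zk"
```

Running it without the third argument writes to the real /etc and therefore needs root, after which the agent still has to be restarted as shown above.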
III. Adding a Windows machine as an agent (reference: http://mesos.apache.org/documentation/latest/windows/):
PowerShell commands:
cd c:\
git clone https://git-wip-us.apache.org/repos/asf/mesos.git
cd mesos
mkdir build
cd build
cmake .. -G "Visual Studio 15 2017 Win64" -T "host=x64" -DENABLE_LIBEVENT=1
cmake --build .
src\mesos-agent.exe --master=10.226.210.177:5050 --work_dir=C:\ --launcher_dir=C:\mesos\build\src --isolation=windows/cpu,filesystem/windows --hostname=10.226.157.145 --ip=10.226.157.145 --log_dir=C:\mesos-log --runtime_dir=C:\mesos-runtime
# --containerizers="docker, mesos"
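With the agents started, one way to confirm that they registered is to query the master's /slaves HTTP endpoint from any machine that can reach port 5050. The sketch below is an assumption-laden convenience, not part of the original setup: agent_hostnames is a hypothetical helper that pulls the "hostname" fields out of the JSON with grep and sed rather than a real JSON parser.

```shell
# Hypothetical helper: read the master's /slaves JSON on stdin and print
# one agent hostname per line (plain grep/sed, no JSON parser).
agent_hostnames() {
  grep -oE '"hostname"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -E 's/^"hostname"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Against the master from this post (needs network access to port 5050):
#   curl -s http://10.226.210.177:5050/slaves | agent_hostnames
```

If everything registered, the output should list the addresses of both Linux agents and the Windows agent.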
Some other reference documents:
http://mesos.apache.org/documentation/latest/building/
http://www.datio.com/architecture/mesos-architecture-roles-and-responsibilities/
https://mesosphere.github.io/marathon/docs/recipes.html
http://mesos.readthedocs.io/en/latest/
https://scalr-wiki.atlassian.net/wiki/spaces/docs/pages/26411010/Deploying+a+Mesos+cluster+using+Scalr