Mesos cluster: 2 Linux agents and 1 Windows agent

Posted by yangzhenping


First, a picture:



1. Set up the Mesos master and slave (which is simply the agent) on a single machine:

http://www.agiletrailblazers.com/blog/4-step-application-deployment-in-aws-using-apache-mesos-and-marathon


I ran into two problems along the way:

1. Mesos failed to connect to port 5050
Solution, quoted from Stack Overflow:
"I was having the same issues and what fixed it for me was the zookeeper configuration. In my case I was using the EC2 public IP Address rather than the private one. Once I changed the /etc/mesos/zk file to zk://<private IP>:2181/mesos I was able to connect without the constant error messages. In other words, zookeeper was reporting to be running in one IP and mesos-master was trying to connect using a different IP."
https://stackoverflow.com/questions/40641674/mesos-failed-to-connect-error-to-ip5050/43293013
vi /etc/hosts
service zookeeper restart
service mesos-slave restart
service mesos-master restart
service marathon restart
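
To spot this kind of mismatch, it can help to print the host that /etc/mesos/zk points at and compare it with the machine's own addresses. The helper below is only a sketch: `zk_host` is a hypothetical name (not a Mesos tool), and it assumes a single-host URL of the form zk://<ip>:2181/mesos.

```shell
#!/bin/sh
# zk_host FILE: print the host part of a single-host ZooKeeper URL
# such as zk://10.0.0.5:2181/mesos (hypothetical helper, not part of Mesos).
zk_host() {
    # Strip the zk:// scheme, then keep everything up to the first ':' or '/'.
    sed -E 's#^zk://([^:/]+).*#\1#' "$1"
}
```

For example, run `zk_host /etc/mesos/zk` and check that the result appears in the output of `hostname -I`. On EC2 that output lists the private address, which is the one the answer above says to use.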

2. Failed to perform recovery: Incompatible agent info detected.
rm -rf /var/log/mesos/*.*
rm -f /var/lib/mesos/meta/slaves/latest
cat /var/log/mesos/mesos-slave.ERROR
root@omi64ub16-dev1:~# ls /var/lib/mesos/meta/slaves/
f09b786a-3e72-44a0-99b5-3ff52bc7f816-S0  latest
root@omi64ub16-dev1:~# rm -rf /var/lib/mesos/meta/slaves/f09b786a-3e72-44a0-99b5-3ff52bc7f816-S0
root@omi64ub16-dev1:~# cat /var/log/mesos/mesos-slave.ERROR  
Log file created at: 2017/11/06 22:46:05
Running on machine: omi64ub16-dev1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E1106 22:46:05.371798 64416 slave.cpp:6286] EXIT with status 1: Failed to perform recovery: Failed to find latest agent: No such file or directory
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/meta/slaves/latest
        This ensures agent doesn't recover old live executors.
Step 2: Restart the agent.
root@omi64ub16-dev1:~# rm -f /var/lib/mesos/meta/slaves/latest
root@omi64ub16-dev1:~# rm -rf /var/log/mesos/*.*                                              
root@omi64ub16-dev1:~# cat /var/log/mesos/mesos-slave.ERROR        
cat: /var/log/mesos/mesos-slave.ERROR: No such file or directory
root@omi64ub16-dev1:~# cat /var/log/mesos/mesos-slave.ERROR
cat: /var/log/mesos/mesos-slave.ERROR: No such file or directory
root@omi64ub16-dev1:~# ls /var/lib/mesos/meta/slaves/                                             
8dc571e3-cd46-49f5-a4a7-6d95097c9a9d-S0  latest
root@omi64ub16-dev1:~#
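
The remedy printed in the error log can be wrapped in a small helper. This is a minimal sketch, not an official procedure: `reset_agent_latest` is a hypothetical name, and the default work directory /var/lib/mesos is taken from the transcript above.

```shell
#!/bin/sh
# reset_agent_latest [WORK_DIR]: remove the "latest" symlink under
# WORK_DIR/meta/slaves so the agent does not try to recover its old state.
# WORK_DIR defaults to /var/lib/mesos, the path used in the transcript.
reset_agent_latest() {
    work_dir="${1:-/var/lib/mesos}"
    # Only the symlink is removed; the old agent directory itself is kept.
    rm -f "$work_dir/meta/slaves/latest"
}
```

Run as root it would be used as `reset_agent_latest && service mesos-slave restart`. As the listing above shows, the agent then comes back under a new agent ID.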


2. Add another Linux machine as an agent:

Setup Mesos Multi-node Cluster on Ubuntu:
https://techpolymath.com/2014/08/28/setup-mesos-multi-node-cluster-on-ubuntu/

echo 10.226.174.148 | sudo tee /etc/mesos-slave/ip
echo zk://10.226.210.177:2181/mesos | sudo tee /etc/mesos/zk
echo 10.226.174.148 | sudo tee /etc/mesos-slave/hostname
service mesos-slave restart
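
The three files written above can also be produced by one function, which makes the steps easier to repeat on further agents. This is a sketch under assumptions: `configure_agent` and its optional third argument (a configuration root, /etc by default) are hypothetical, and the ZooKeeper port 2181 and the /mesos path are copied from the commands above.

```shell
#!/bin/sh
# configure_agent AGENT_IP MASTER_IP [ETC_ROOT]
# Write the agent's ip and hostname files and the ZooKeeper URL of the master.
# ETC_ROOT defaults to /etc; pass a scratch directory to try it out safely.
configure_agent() {
    agent_ip="$1"
    master_ip="$2"
    etc_root="${3:-/etc}"
    mkdir -p "$etc_root/mesos-slave" "$etc_root/mesos"
    echo "$agent_ip" > "$etc_root/mesos-slave/ip"
    echo "$agent_ip" > "$etc_root/mesos-slave/hostname"
    echo "zk://$master_ip:2181/mesos" > "$etc_root/mesos/zk"
}
```

With the addresses from this post, the call would be `configure_agent 10.226.174.148 10.226.210.177`, run as root and followed by `service mesos-slave restart`.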

3. Add a Windows machine as an agent (see http://mesos.apache.org/documentation/latest/windows/):

PowerShell commands:

cd c:\

git clone https://git-wip-us.apache.org/repos/asf/mesos.git

cd mesos

mkdir build

cd build

cmake .. -G "Visual Studio 15 2017 Win64" -T "host=x64" -DENABLE_LIBEVENT=1

cmake --build .

src\mesos-agent.exe --master=10.226.210.177:5050 --work_dir=C:\ --launcher_dir=C:\mesos\build\src --isolation=windows/cpu,filesystem/windows --hostname=10.226.157.145 --ip=10.226.157.145 --log_dir=C:\mesos-log --runtime_dir=C:\mesos-runtime

# --containerizers="docker, mesos"


Other references:

http://mesos.apache.org/documentation/latest/building/
http://www.datio.com/architecture/mesos-architecture-roles-and-responsibilities/
https://mesosphere.github.io/marathon/docs/recipes.html
http://mesos.readthedocs.io/en/latest/
https://scalr-wiki.atlassian.net/wiki/spaces/docs/pages/26411010/Deploying+a+Mesos+cluster+using+Scalr


