LSF 10.1 Community Edition Installation Guide

Posted by 王万林 Ben


(This article draws on several reference sources; thanks to the contributors.)

0 Synopsis

LSF Community Edition has the following limits:

Up to 10 computing nodes per cluster

Up to 2 CPU sockets per node

Up to 60 cores per node

Up to 2500 running or pending jobs per cluster
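To make the job limit concrete, here is a minimal sketch that counts running and pending jobs and reports the remaining headroom under the 2500-job limit. It assumes the default bjobs column layout, where STAT is the third column; the function names count_active_jobs and jobs_headroom are illustrative and not part of LSF.

```shell
#!/bin/sh
# Sketch: compare the number of RUN/PEND jobs with the Community Edition limit.
# Assumption: input is default `bjobs` output with STAT in the 3rd column.
CE_JOB_LIMIT=2500   # limit quoted in the synopsis

# count_active_jobs: reads bjobs output on stdin and prints the number of
# jobs whose STAT column is RUN or PEND (the header line is skipped).
count_active_jobs() {
    awk 'NR > 1 && ($3 == "RUN" || $3 == "PEND") { n++ } END { print n + 0 }'
}

# jobs_headroom: reads bjobs output on stdin and prints how many more
# jobs fit under CE_JOB_LIMIT.
jobs_headroom() {
    active=$(count_active_jobs)
    echo $((CE_JOB_LIMIT - active))
}
```

On a running cluster this could be fed with: bjobs -u all 2>/dev/null | jobs_headroom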

 

1 Environment Details

1.1 Master and Computing Node Details

Node Name       Role
lsf-master-01   master
lsf-master-02   master replica
host-001        computing
host-002        computing

 

1.2 Directories on NFS

Directory               Usage
/home/lsf/media         LSF installation media
/home/lsf/dist          LSF installation target (LSF_TOP)
/home/lsf/install_dir   Temporary working directory for the installation


 

2 Preparation

2.1 FreeIPA

FreeIPA is used for central user authentication and for host name resolution (DNS). A similar product, such as NIS, can be used instead.

2.1.1 Create lsfadmin Account.

2.1.2 Join all hosts to FreeIPA in order to resolve all hosts across the cluster.
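Before installing, it can help to confirm that every host name resolves and that the lsfadmin account is visible on each node. The following is a minimal sketch under the assumption that getent is available; the function names and the RESOLVER and USER_LOOKUP variables are illustrative helpers, not FreeIPA or LSF features.

```shell
#!/bin/sh
# Sketch: pre-flight check for name resolution and the admin account.
# RESOLVER and USER_LOOKUP are overridable (hypothetical knobs) so the
# check can be tried on a machine that is not enrolled yet.
RESOLVER=${RESOLVER:-"getent hosts"}
USER_LOOKUP=${USER_LOOKUP:-"getent passwd"}

# unresolved_hosts HOST...: prints every host name that does not resolve.
unresolved_hosts() {
    for h in "$@"; do
        $RESOLVER "$h" >/dev/null 2>&1 || echo "$h"
    done
}

# account_exists NAME: succeeds if the account can be looked up here.
account_exists() {
    $USER_LOOKUP "$1" >/dev/null 2>&1
}
```

For the cluster in this guide, a run might look like: unresolved_hosts lsf-master-01 lsf-master-02 host-001 host-002; account_exists lsfadmin || echo "lsfadmin missing"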

2.2 Download tarball

Log in to the IBM site at https://www-01.ibm.com/marketing/iwm/iwm/web/preLogin.do?source=swerpzsw-lsf-3 and download the tarball named lsfsce10.2.0.6-x86_64.tar.gz.

 

2.3 SSH key authentication

It is recommended to set up key-based, password-less SSH authentication between the hosts.
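One possible way to set this up is with ssh-keygen and ssh-copy-id. The sketch below only prints the commands for review instead of running them. The key type (ed25519), the key path, and the function name print_ssh_key_setup are assumptions made for illustration; older OpenSSH releases may need a different key type.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for key-based SSH login.
# Assumptions: an ed25519 key without passphrase, stored at the default path.

# print_ssh_key_setup USER HOST...: prints one ssh-keygen command followed
# by one ssh-copy-id command per host.
print_ssh_key_setup() {
    user=$1
    shift
    printf '%s\n' "ssh-keygen -t ed25519 -N '' -f \$HOME/.ssh/id_ed25519"
    for h in "$@"; do
        printf '%s\n' "ssh-copy-id -i \$HOME/.ssh/id_ed25519.pub $user@$h"
    done
}
```

Example: print_ssh_key_setup lsfadmin lsf-master-02 host-001 host-002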
 

3 Installing lsfce

3.1 untar

# cd /home/lsf/
# mkdir media && cd media
# cp /path/to/lsfsce10.2.0.6-x86_64.tar.gz ./
# tar -zxf lsfsce10.2.0.6-x86_64.tar.gz

3.1.1 create install tmp dir

# mkdir /home/lsf/install_dir
# cd /home/lsf/install_dir
# ln /home/lsf/media/lsfsce10.2.0.6-x86_64/lsf/lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z
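Since the installer later searches this directory for distribution tar files, a quick check that the link landed there can save a failed run. This is a minimal sketch; the file name pattern lsf10.1_*.tar.Z is an assumption based on the file name used above, and has_dist_tarball is an illustrative helper.

```shell
#!/bin/sh
# Sketch: confirm that a directory holds an LSF distribution tarball.
# Assumption: distribution files match lsf10.1_*.tar.Z.

# has_dist_tarball DIR: succeeds if DIR contains at least one matching file.
has_dist_tarball() {
    for f in "$1"/lsf10.1_*.tar.Z; do
        # An unmatched glob stays literal, so test for existence explicitly.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

Example: has_dist_tarball /home/lsf/install_dir || echo "no distribution tarball found"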

3.2 Prepare install.config file

# tar Zxf /home/lsf/media/lsfsce10.2.0.6-x86_64/lsf/lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z -C /home/lsf/install_dir
# cd /home/lsf/install_dir/lsf10.1_lsfinstall
# cp install.config install.config_bak
# cat >> install.config << EOF
LSF_TOP="/home/lsf/dist"
LSF_ADMINS="lsfadmin"
LSF_CLUSTER_NAME="my-lsf-cluster"
LSF_MASTER_LIST="lsf-master-01 lsf-master-02"
LSF_TARDIR="/home/lsf/install_dir"
LSF_ADD_SERVERS="lsf-master-01 lsf-master-02 host-001 host-002"
EOF
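Because the settings are appended to the template with a here-document, a quick check that each parameter ended up as an uncommented line can catch typos before running the installer. A minimal sketch follows; the key list simply mirrors the parameters set above, and missing_install_keys is an illustrative helper, not part of lsfinstall.

```shell
#!/bin/sh
# Sketch: list parameters that have no uncommented KEY= line in a config file.
# The key list mirrors the settings appended to install.config above.

# missing_install_keys FILE: prints each key that is absent or commented out.
missing_install_keys() {
    for key in LSF_TOP LSF_ADMINS LSF_CLUSTER_NAME LSF_MASTER_LIST LSF_TARDIR; do
        grep -Eq "^[[:space:]]*${key}=" "$1" || echo "$key"
    done
}
```

Example: missing_install_keys install.config prints nothing when all five parameters are set.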

3.3 install LSF

3.3.1 directory

# mkdir -p /home/lsf/dist

3.3.2 installing

# cd /home/lsf/install_dir/lsf10.1_lsfinstall
# ./lsfinstall -f install.config

Press Enter to continue viewing the license agreement, or enter "1" to accept the agreement, "2" to decline it, "3" to print it, "4" to read non-IBM terms, or "99" to go back to the previous screen.

Enter 1 to accept the agreement.

Searching LSF 10.1 distribution tar files in /usr/share/lsf_distrib Please wait ...

1. linux2.6-glibc2.3-x86_64

Press 1 or Enter to install this host type:

Enter 1.

 

4 Starting the cluster

4.1 Initialize the LSF environment

For csh, run on each node,

cat >> /etc/csh.cshrc << EOF
source /home/lsf/dist/conf/cshrc.lsf
EOF

For bash, run on each node,

cat >> /etc/profile << EOF
. /home/lsf/dist/conf/profile.lsf
EOF
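Running the append commands above more than once leaves duplicate lines in the shell startup files. A small guard that appends a line only when it is not already present avoids that. This is a sketch; append_once is an illustrative helper name.

```shell
#!/bin/sh
# Sketch: idempotent append, so repeated runs do not duplicate the line.

# append_once LINE FILE: appends LINE to FILE unless an identical line exists.
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}
```

Example: append_once '. /home/lsf/dist/conf/profile.lsf' /etc/profile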

4.2 Master-to-node connection settings

cat >> /home/lsf/dist/conf/lsf.conf << EOF
LSF_RSH=ssh
EOF

4.3 start cluster

On the master node,

# lsfstartup
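The daemons may take a moment to come up after lsfstartup returns, so a short retry loop around a status command can be handy in scripts. The following is a generic sketch; wait_until is an illustrative helper and the retry counts are arbitrary.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.

# wait_until TRIES DELAY CMD...: runs CMD up to TRIES times, sleeping DELAY
# seconds after each failure. Succeeds as soon as CMD succeeds.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" >/dev/null 2>&1 && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

Example: wait_until 30 2 lsid || echo "cluster did not respond"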

4.4 Enable automatic startup at boot

Run on each compute node,

# /home/lsf/dist/10.1/install/hostsetup --boot="y" --top="/home/lsf/dist"
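With key-based SSH in place, the same command can be pushed to every node from one host instead of logging in to each. The sketch below assumes root can SSH to the nodes; run_on_hosts and the REMOTE variable are illustrative, and REMOTE can be set to echo for a dry run.

```shell
#!/bin/sh
# Sketch: run one command on several hosts over SSH.
# Assumption: password-less SSH works for the invoking user.
REMOTE=${REMOTE:-ssh}   # set REMOTE=echo to print instead of executing

# run_on_hosts "COMMAND" HOST...: runs COMMAND on each host in turn and
# reports hosts where it failed on stderr.
run_on_hosts() {
    cmd=$1
    shift
    for h in "$@"; do
        $REMOTE "$h" "$cmd" || echo "failed on $h" >&2
    done
}
```

Example: run_on_hosts '/home/lsf/dist/10.1/install/hostsetup --boot="y" --top="/home/lsf/dist"' host-001 host-002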

4.5 check status of cluster

[root@lsf-master-01 ~]# lsid
IBM Spectrum LSF Community Edition 10.1.0.6, May 25 2018
Copyright IBM Corp. 1992, 2016. All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

My cluster name is my-lsf-cluster
My master name is lsf-master-01.icinfra.cn
[root@lsf-master-01 ~]# lshosts -w
HOST_NAME                       type       model  cpuf ncpus maxmem maxswp server RESOURCES
lsf-master-01.icinfra.cn      X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (mg)
lsf-master-02.icinfra.cn      X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (mg)
host-001.icinfra.cn           X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (linux)
host-002.icinfra.cn           X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (linux)
[root@lsf-master-01 ~]# lsload -w
HOST_NAME               status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
lsf-master-01.icinfra.cn     ok   0.0   3.1   2.2  18%   0.0   1     0   44G  7.8G  7.1G
lsf-master-02.icinfra.cn     ok   0.1   1.1   0.8   8%   0.0   1    27   45G  7.8G  7.2G
host-002.icinfra.cn         ok   0.3   1.3   1.0   4%   0.0   1     0   45G  7.8G  7.2G
host-001.icinfra.cn         ok   0.3   1.8   1.5   6%   0.0   1     2   45G  7.8G  7.2G
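For scripted health checks, the lsload output can be filtered down to the hosts that are not in the ok state. This is a minimal sketch that assumes the column layout shown above, with the status in the second column; hosts_not_ok is an illustrative helper.

```shell
#!/bin/sh
# Sketch: list hosts whose lsload status is anything other than "ok".
# Assumption: input is `lsload -w` output with status in the 2nd column.

# hosts_not_ok: reads lsload output on stdin, skips the header line, and
# prints the name of every host whose status is not exactly "ok".
hosts_not_ok() {
    awk 'NR > 1 && $2 != "ok" { print $1 }'
}
```

Example: lsload -w | hosts_not_ok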


 


 


5 Supported number of jobs = 2500

[Figure: supported job count of 2500]
