OpenNET: article - Installing a LAM-MPI cluster on FreeBSD (cluster freebsd process)

Keywords: cluster, freebsd, process

From: soup4you2
Newsgroups: http://bsdhound.com
Date: Sun, 21 Jan 2004 17:02:14 +0000 (UTC)
Subject: Installing a LAM-MPI cluster on FreeBSD

Original: http://www.bsdhound.com/newsread.php?newsid=205

   Installing a LAM-MPI Cluster on FreeBSD: How-To

   A cluster makes a collection of two or more computers run as a single
   supercomputer. Clusters can be used to increase reliability, to
   increase performance and available resources, or both. A Beowulf
   cluster is a group of usually identical PCs networked together on a
   TCP/IP LAN, with libraries and programs installed that allow
   processing to be shared among them.

   Before you get too excited, it's important to know that applications
   need to be written for MPI and compiled with mpicc in order to utilize
   cluster resources. You can consult the LAM website (http://lam-mpi.org/)
   for information and tutorials.

   Let's begin this quick and dirty how-to.

   The first thing to take care of is that each node in the cluster
   needs a DNS name. If you're not running a DNS server, using the
   /etc/hosts file will work just fine. I'm not going to get into the
   configuration of BIND; I'll save that for a later date.
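
   For the /etc/hosts route, here is a minimal sketch of what the entries
   could look like. The helper name hosts_lines is made up for
   illustration, and the addresses are the ones that appear in the
   lamboot output later in this article; substitute your own.

```shell
#!/bin/sh
# Print /etc/hosts-style lines for the cluster nodes.
# Input on stdin: one "IP FQDN" pair per line.
# Output: "IP<TAB>FQDN SHORTNAME", so both name forms resolve.
# hosts_lines is a hypothetical helper, not part of FreeBSD or LAM.
hosts_lines() {
    while read -r ip fqdn; do
        # Skip blank lines and comment lines.
        case "$ip" in ''|\#*) continue ;; esac
        printf '%s\t%s %s\n' "$ip" "$fqdn" "${fqdn%%.*}"
    done
}

# Example input using the two nodes from this article.
hosts_lines <<'EOF'
10.0.5.100 node1.yourdomain.com
10.0.5.105 node2.yourdomain.com
EOF
```

   Review the output and append it to /etc/hosts on every node, so that
   all machines agree on the names.
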
   Next, our server needs to be configured as an NFS server.

Server:
   ($:~)=> vi /etc/rc.conf
   nfs_server_flags="-u -t -n 4 -h 10.0.5.100" # Replace with your internal IP address.
   mountd_enable="YES"
   mountd_flags="-l -r"
   rpcbind_enable="YES"
   rpcbind_flags="-l -h 10.0.5.100" # Replace with your internal IP address.
   nfs_server_enable="YES"

   Then our client nodes need to be configured as NFS clients.

Client:
   ($:~)=> vi /etc/rc.conf
   nfs_client_enable="YES"

   Next, we need to export our /home directory.

Server:
   ($:~)=> vi /etc/exports
   /home -maproot=0:0 -network 10.0.5.0 -mask 255.255.255.0

   Now each client needs to mount it.

Client:
   ($:~)=> vi /etc/fstab
   10.0.5.100:/home          /home           nfs     rw              0      0
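
   Before relying on the share, it can help to confirm that the fstab
   entry parses the way you expect. The sketch below uses a hypothetical
   helper, nfs_entries, to list the NFS lines in fstab-format text. The
   sample ufs line is an assumption added for contrast.

```shell
#!/bin/sh
# List "device mountpoint" for every NFS entry in fstab-format input.
# nfs_entries is a hypothetical helper name used for illustration.
nfs_entries() {
    awk '$1 !~ /^#/ && $3 == "nfs" { print $1, $2 }'
}

# Sample input: the /home line from this article plus an assumed
# local root filesystem line.
nfs_entries <<'EOF'
# Device                Mountpoint      FStype  Options  Dump  Pass#
/dev/ad0s1a             /               ufs     rw       1     1
10.0.5.100:/home        /home           nfs     rw       0     0
EOF
# prints: 10.0.5.100:/home /home
```

   On a real client you would run nfs_entries < /etc/fstab. To check the
   live share, showmount -e 10.0.5.100 lists what the server exports,
   and df /home on a client shows whether /home is served from
   10.0.5.100.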

   Make sure your NFS share is working properly before continuing.

   Now we install the LAM-MPI clustering software. Do this on every
   computer in the cluster.

All:
   ($:~)=> cd /usr/ports/net/lam
   ($:~)=> make install clean

   Next, let's install some software to help us monitor the cluster.

All:
   ($:~)=> cd /usr/ports/sysutils/ganglia-monitor-core
   ($:~)=> make install clean

   On the server, we need the web interface for this. You should already
   have a web server set up, with PHP installed and configured with GD
   graphics library support.

Server:
   ($:~)=> cd /usr/ports/sysutils/ganglia-webfrontend
   ($:~)=> make install clean

   Now on to the configuration.

All:
   ($:~)=> cp /usr/local/etc/gmond.conf.sample /usr/local/etc/gmond.conf
   ($:~)=> vi /usr/local/etc/gmond.conf

   There are two important areas to change in this file. The rest I'll
   leave to your discretion.
   First, your cluster name:

All:  
   name  "ClusterName"

   Next, the interface we wish to use for the cluster:

All:  
   mcast_if  xl0

   Then create gmetad.conf from its sample file.

All:
   ($:~)=> cp /usr/local/etc/gmetad.conf.sample  /usr/local/etc/gmetad.conf

   Here we need to tell our monitors which hosts are available. Put an
   entry for every computer in the cluster.

All:  
   data_source "ClusterName" 10 node1.yourdomain.com:8649 node2.yourdomain.com:8649

   Make sure ClusterName matches the name in the gmond.conf configuration
   file. The 10 is the polling interval in seconds, followed by the
   computers in the cluster.
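
   With more than a couple of nodes, typing that line by hand gets
   error-prone. A rough sketch of generating it follows; the helper name
   data_source_line is hypothetical, and port 8649 is simply the port
   used in the example above.

```shell
#!/bin/sh
# Build a gmetad "data_source" line.
# Usage: data_source_line NAME INTERVAL HOST...
# data_source_line is a hypothetical helper; 8649 is the port shown
# in this article's example.
data_source_line() {
    name=$1
    interval=$2
    shift 2
    line="data_source \"$name\" $interval"
    for host in "$@"; do
        line="$line $host:8649"
    done
    printf '%s\n' "$line"
}

data_source_line "ClusterName" 10 node1.yourdomain.com node2.yourdomain.com
# prints: data_source "ClusterName" 10 node1.yourdomain.com:8649 node2.yourdomain.com:8649
```

   Paste the result into /usr/local/etc/gmetad.conf.
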
   Now that our monitoring software is configured, let's configure the
   cluster software.

All:  
   ($:~)=> vi /usr/local/etc/lam-bhost.def

   Configuration for this is easy. Just list the fully qualified domain
   name of each box.

All:  
   node1.yourdomain.com
   node2.yourdomain.com
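
   A name in this file that does not resolve is an easy mistake to make
   when /etc/hosts is used instead of DNS. As a sketch, the hypothetical
   helper below reports every host in a boot schema file that has no
   entry in a hosts file. It assumes one host per line in the first
   field and a non-empty hosts file, and it compares names
   case-insensitively.

```shell
#!/bin/sh
# Print each host named in a boot schema file (first field per line;
# comments and blank lines ignored) that has no entry in a hosts file.
# missing_hosts is a hypothetical helper. It assumes the hosts file is
# not empty, because the NR == FNR test identifies the first file read.
missing_hosts() {
    bhost=$1
    hosts=$2
    awk '{ sub(/#.*/, "") }
         NR == FNR { for (i = 2; i <= NF; i++) known[tolower($i)] = 1; next }
         NF && !(tolower($1) in known) { print $1 }' "$hosts" "$bhost"
}

# Demo with temporary copies, so nothing on the system is touched.
tmp=$(mktemp -d /tmp/lamcheck.XXXXXX)
printf '10.0.5.100 node1.yourdomain.com node1\n' > "$tmp/hosts"
printf 'node1.yourdomain.com\nnode2.yourdomain.com\n' > "$tmp/bhost"
missing_hosts "$tmp/bhost" "$tmp/hosts"   # prints: node2.yourdomain.com
rm -rf "$tmp"
```

   On a real node the call would be
   missing_hosts /usr/local/etc/lam-bhost.def /etc/hosts, and no output
   means every listed host has an entry.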

   Now let's fire this puppy up.

All:  
   ($:~)=> mv /usr/local/etc/rc.d/gmetad.sh.sample /usr/local/etc/rc.d/gmetad.sh
   ($:~)=> mv /usr/local/etc/rc.d/gmond.sh.sample /usr/local/etc/rc.d/gmond.sh
   ($:~)=> /usr/local/etc/rc.d/gmetad.sh start
   ($:~)=> /usr/local/etc/rc.d/gmond.sh start

   Now on the server, or whichever node you choose, we need to start the
   cluster. Run this as an unprivileged user.

Server:
   ($:~)=> lamboot -dv

   You should be presented with something like this:
   lamboot: boot schema file: /usr/local/etc/lam-bhost.def
   lamboot: opening hostfile /usr/local/etc/lam-bhost.def
   lamboot: found the following hosts:
   lamboot:   n0 node1.yourdomain.com
   lamboot:   n1 node2.yourdomain.com
   lamboot: resolved hosts:
   lamboot:   n0 node1.yourdomain.com --> 10.0.5.100
   lamboot:   n1 node2.yourdomain.com --> 10.0.5.105
   lamboot: found 2 host node(s)
   lamboot: origin node is 0 (node1.yourdomain.com)
   Executing hboot on n0 (node1.yourdomain.com - 1 CPU)...
   lamboot: attempting to execute "hboot -t -c lam-conf.lam -d -v -I " -H 10.0.5.100 -P 57552 -n 0 -o 0 ""
   hboot: process schema = "/usr/local/etc/lam-conf.lam"
   hboot: found /usr/local/bin/lamd
   hboot: performing tkill
   hboot: tkill
   hboot: booting...
   hboot: fork /usr/local/bin/lamd
   [1]  44660 lamd -H 10.0.5.100 -P 57552 -n 0 -o 0 -d
   hboot: attempting to execute
   Executing hboot on n1 (node2.yourdomain.com - 1 CPU)...
   lamboot: attempting to execute "/usr/bin/ssh node2.yourdomain.com -n echo $SHELL"
   lamboot: got remote shell /usr/local/bin/bash
   lamboot: attempting to execute "/usr/bin/ssh node2.yourdomain.com -n hboot -t -c lam-conf.lam -d -v -s -I "-H 10.0.5.100 -P 57552 -n 1 -o 0 ""
   hboot: process schema = "/usr/local/etc/lam-conf.lam"
   hboot: found /usr/local/bin/lamd
   hboot: performing tkill
   hboot: tkill
   hboot: booting...
   hboot: fork /usr/local/bin/lamd
   [1]  53214 lamd -H 10.0.5.100 -P 57552 -n 1 -o 0 -d
   topology done
   lamboot completed successfully

   Looks good. Now let's make sure all of our clients are attached.

Server:
   ($:~)=>  lamnodes
   n0      node1.yourdomain.com:1
   n1      node2.yourdomain.com:1

   Congratulations, you're clustered. The Ganglia web front end is
   installed in /usr/local/www/data-dist/ganglia; point your web server
   at that directory to view it in a browser.

   So how do I use this cluster? Some commands that I commonly use are:

Server:
   ($:~)=> tping N
     1 byte from 1 remote node and 1 local node: 0.002 secs
     1 byte from 1 remote node and 1 local node: 0.001 secs
     1 byte from 1 remote node and 1 local node: 0.001 secs

   The tping command is the same as ping, but it pings the nodes in the
   cluster. The N (uppercase) means all nodes in the cluster. If I just
   wanted to ping node2.yourdomain.com, I would use the lamnodes command
   to find the number associated with that node and then run tping n1
   (n1 being node2.yourdomain.com).
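
   That lookup can be scripted. The sketch below assumes lamnodes prints
   lines in the "n1 host:cpus" shape shown earlier; node_id is a
   hypothetical helper name.

```shell
#!/bin/sh
# Print the LAM node identifier (n0, n1, ...) for a given hostname,
# reading lamnodes-style output on stdin.
# node_id is a hypothetical helper; it assumes the "nX host:cpus"
# layout shown in this article.
node_id() {
    awk -v host="$1" '{ split($2, f, ":"); if (f[1] == host) print $1 }'
}

# Sample input copied from the lamnodes output above.
printf 'n0\tnode1.yourdomain.com:1\nn1\tnode2.yourdomain.com:1\n' |
    node_id node2.yourdomain.com
# prints: n1
```

   On a running cluster this becomes
   tping "$(lamnodes | node_id node2.yourdomain.com)".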

   Another benefit is that I can sit at one machine and tell the cluster
   to start applications on the other machines and return the output to
   the monitor I'm on. Let's try it, shall we:

Server:
   ($:~)=> lamexec N echo "hi"
   hi
   hi

   Since I used the uppercase N, meaning all nodes, it ran echo "hi" on
   both PCs and returned the results to the one machine. I would suggest
   reading up on lamexec for other information and tips on what you can
   do with it. So how can you be sure it's running these processes on
   both PCs? Watch this:

Server:
   ($:~)=> lamexec N hostname
   node1.yourdomain.com
   node2.yourdomain.com
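
   Once every node answers, an MPI program built with mpicc can be
   launched with mpirun. As a sketch, the hypothetical helper below adds
   up the CPU counts from lamnodes-style output, so the process count
   can follow the cluster size. It assumes the "nX host:cpus" layout
   shown earlier, and ./hello stands in for your own MPI program.

```shell
#!/bin/sh
# Sum the CPU counts in lamnodes-style output read from stdin.
# total_cpus is a hypothetical helper; it assumes each line looks
# like "n0 host:cpus".
total_cpus() {
    awk '{ split($2, f, ":"); n += f[2] } END { print n + 0 }'
}

# Sample input copied from the lamnodes output in this article.
printf 'n0\tnode1.yourdomain.com:1\nn1\tnode2.yourdomain.com:1\n' | total_cpus
# prints: 2

# On a running cluster (./hello is a placeholder program name):
#   mpicc hello.c -o hello
#   mpirun -np "$(lamnodes | total_cpus)" ./hello
```

   Because /home is shared over NFS, a binary built in your home
   directory is visible at the same path on every node.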

   Also read the lamd man page; it points to other useful programs for
   your cluster. Enjoy, and happy crunching.

   Some Links Of Interest:

   Computer Clusters Profiles on TechTV
    http://www.techtv.com/screensavers/answerstips/story/0,24330,2554333,00.html

   Offmyserver building a Beowulf cluster
    http://www.offmyserver.com/cgi-bin/store/cluster.html

   Brooks paper on building a FreeBSD cluster
    http://people.freebsd.org/~brooks/papers/bsdcon2003/

   LAM-MPI Parallel computing page
    http://lam-mpi.org/

   LAM-MPI Download Page (For Mac Binaries If Needed)
    http://www.lam-mpi.org/7.0/download.php

   FreeBSD Cluster forum
    http://lists.freebsd.org/mailman/listinfo/freebsd-cluster

   Deploying Mac OS X Clusters
    http://www.cmu.edu/computing/project/macosx/

   Playstation 2 Super Computer
    http://www.techtv.com/screensavers/supergeek/story/0,24330,3474732,00.html