Tuesday, 14 January 2014

Zookeeper Setup Guide - Standalone/Quorum

#Zookeeper Quick Setup Guide


##Zookeeper Download
Visit http://www.apache.org/dyn/closer.cgi/zookeeper/ and download the stable zookeeper tar file.
I've downloaded this one: http://www.motorlogy.com/apache/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz

## Zookeeper as a standalone server
  A standalone server doesn't do justice to zookeeper. But as a developer testing your application logic, you might want to develop against a standalone server first.
So, here are the steps needed to set up zookeeper in standalone mode:

  1. Extract the tar file into a folder, say /zookeeper
    1. mkdir -p /zookeeper && tar -xzvf zookeeper-X.X.X.tar.gz -C /zookeeper --strip-components=1
  2. cd /zookeeper
  3. Make the configuration file: zoo.cfg
    1. Start by copying the sample configuration file already provided inside the conf/ folder
      cp conf/zoo_sample.cfg   conf/zoo.cfg
  4. Edit the zoo.cfg file
    1. Pay attention to dataDir and clientPort. clientPort is the port on which you want the zookeeper server to accept clients, and dataDir is where zookeeper keeps its on-disk state (snapshots, transaction logs and, in quorum mode, the myid file).

      vim conf/zoo.cfg
      # The number of milliseconds of each tick
      tickTime=2000
      # The number of ticks that the initial
      # synchronization phase can take
      initLimit=10
      # The number of ticks that can pass between
      # sending a request and getting an acknowledgement
      syncLimit=5
      # the directory where the snapshot is stored.
      dataDir=./data
      # the port at which the clients will connect
      clientPort=2181
      # New in 3.3.0: the address (ipv4, ipv6 or hostname) to listen for client connections
      clientPortAddress=127.0.0.1
  5. Start your server
    1. bin/zkServer.sh start
    2. By default it uses conf/zoo.cfg under the zookeeper root folder. It'll pick the port (and clientPortAddress, if set) from that config file; with the config above the server listens on 127.0.0.1:2181.
  6. Starting the server with a different config file or at a different address
    1. bin/zkServer.sh start /path/to/config_file.cfg
    2. zkServer.sh does not take an ip:port argument. To change the listen address, set clientPortAddress and clientPort in the config file you pass.
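
Steps 3 to 5 can be condensed into a small script. Treat this as a rough sketch: `ZK_HOME` and the temporary working directory are placeholders I picked for the example, and the server is started only if `zkServer.sh` is actually found under `ZK_HOME`.

```shell
#!/bin/sh
# Sketch: write a standalone config and (optionally) start the server.
# ZK_HOME is an assumed install location; adjust it to your machine.
ZK_HOME="${ZK_HOME:-/zookeeper}"
WORK_DIR="$(mktemp -d)"          # throwaway directory for this demo
mkdir -p "$WORK_DIR/data"

# Use an absolute dataDir so the config works from any working directory.
cat > "$WORK_DIR/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$WORK_DIR/data
clientPort=2181
clientPortAddress=127.0.0.1
EOF

# Read a setting back, e.g. to confirm which port clients should use.
CLIENT_PORT="$(sed -n 's/^clientPort=//p' "$WORK_DIR/zoo.cfg")"
echo "config: $WORK_DIR/zoo.cfg, client port: $CLIENT_PORT"

# Start the server only when a ZooKeeper install is actually present.
if [ -x "$ZK_HOME/bin/zkServer.sh" ]; then
    "$ZK_HOME/bin/zkServer.sh" start "$WORK_DIR/zoo.cfg"
else
    echo "zkServer.sh not found under $ZK_HOME; skipping start"
fi
```

An absolute dataDir is used here on purpose: a relative one such as ./data is resolved against the directory you launch the server from.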

### Testing by connecting a client to our running zookeeper server
  1. cd into your zookeeper installation folder
  2. Launch the client
    1. bin/zkCli.sh
      It'll connect to the default host and port, localhost:2181
      OR
      bin/zkCli.sh -server 127.0.0.1:2181
      To connect to many servers (zookeeper quorum), list all host:port pairs separated by commas, with no spaces.
      Eg.
      bin/zkCli.sh -server 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
      The client will then connect to any one of the servers specified, but data will be synced across all of them.
  3. List the root path
    ls /
    [zookeeper]
  4. Try creating a new znode
    create /workers ""
    Here "" is the data assigned to /workers. For now we just fill it with an empty string.
  5. List the root path again
    ls /
    [workers, zookeeper]
  6. Your standalone installation and setup is verified and complete.
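
If you'd rather check the server from a script than from the interactive client, one option is ZooKeeper's `ruok` four-letter command, to which a healthy server replies `imok`. The sketch below is my own addition, not part of the steps above: `zk_ruok` is a made-up helper name, it assumes `nc` (netcat) is installed, and on newer releases (3.5.3 and later) the command has to be allowed via `4lw.commands.whitelist` first.

```shell
#!/bin/sh
# Sketch: non-interactive liveness probe using the "ruok" four-letter command.
# Assumes `nc` (netcat) is installed; zk_ruok is a hypothetical helper name.
zk_ruok() {
    host="$1"
    port="$2"
    command -v nc >/dev/null 2>&1 || { echo "nc not installed" >&2; return 2; }
    # A healthy server answers "imok"; anything else counts as failure.
    reply="$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)" || reply=""
    [ "$reply" = "imok" ]
}

if zk_ruok 127.0.0.1 2181; then
    echo "ZooKeeper is answering on 127.0.0.1:2181"
else
    echo "No imok reply from 127.0.0.1:2181"
fi
```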
##ZooKeeper with Quorums


The configuration we have used so far is for a standalone server. If the server is up, the service is up, but if the server fails, the whole service comes down with it. This doesn’t quite live up to the promise of a reliable coordination service. To truly get reliability, we need to run multiple servers.
Fortunately, we can run multiple servers even if we only have a single machine by setting up a more sophisticated configuration.

Common config file: 


           tickTime=2000
           initLimit=10
           syncLimit=5
           # Yes, you can give relative path.
           dataDir=./data
           clientPort=2181

           #For quorum communication and leader election two extra ports are used
           server.1=127.0.0.1:6666:6667
           server.2=127.0.0.1:7777:7778
           server.3=127.0.0.1:8888:8889

------


server.x=[hostname]:nnnnn[:nnnnn], etc
These entries list the servers making up the ZooKeeper ensemble. When a server starts up, it determines which server it is by looking for the file myid in the data directory. That file contains the server number, in ASCII, and it should match x in server.x on the left hand side of this setting.
The list of ZooKeeper servers used by the clients must match the list of ZooKeeper servers that each ZooKeeper server has.
There are two port numbers nnnnn. Followers use the first to connect to the leader, and the second is for leader election. The leader election port is only necessary if electionAlg is 1, 2, or 3 (default). If electionAlg is 0, then the second port is not necessary. If you want to test multiple servers on a single machine, then different ports can be used for each server.
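
To make the myid / server.x relationship concrete, here is a small sketch that does the same lookup by hand. It only illustrates the idea and is not how ZooKeeper implements it internally; the demo directory and file contents are placeholders, and it assumes `key=value` lines with no spaces around `=`.

```shell
#!/bin/sh
# Sketch: mimic how a server finds its own entry: read myid, then look up
# the matching server.<id> line. File names below are demo placeholders.
DEMO_DIR="$(mktemp -d)"
mkdir -p "$DEMO_DIR/data"
echo "2" > "$DEMO_DIR/data/myid"

cat > "$DEMO_DIR/zoo.cfg" <<EOF
dataDir=$DEMO_DIR/data
clientPort=2182
server.1=127.0.0.1:6666:6667
server.2=127.0.0.1:7777:7778
server.3=127.0.0.1:8888:8889
EOF

DATA_DIR="$(sed -n 's/^dataDir=//p' "$DEMO_DIR/zoo.cfg")"
MY_ID="$(tr -d '[:space:]' < "$DATA_DIR/myid")"
MY_ENTRY="$(sed -n "s/^server\.$MY_ID=//p" "$DEMO_DIR/zoo.cfg")"

# Split host:quorumPort:electionPort on ':'.
QUORUM_HOST="$(echo "$MY_ENTRY" | cut -d: -f1)"
QUORUM_PORT="$(echo "$MY_ENTRY" | cut -d: -f2)"
ELECTION_PORT="$(echo "$MY_ENTRY" | cut -d: -f3)"

echo "server.$MY_ID -> host=$QUORUM_HOST quorum=$QUORUM_PORT election=$ELECTION_PORT"
```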


Now, let's create the illusion of three servers by creating three different data directories and configuration files.
$ mkdir -p z{1..3}/data
This makes three folders named z1, z2, and z3, each with a data directory. The '-p' flag creates the parent folders as needed.
To identify each server, create an identity file "myid" whose only content is the server id.
NOTE: myid must be placed inside the data directory.

echo "1" >   z1/data/myid
echo "2" >   z2/data/myid
echo "3" >   z3/data/myid


Let's create one config file for each server.
z1/z1.cfg
             tickTime=2000
             initLimit=10
             syncLimit=5
             # Yes, you can give relative path.
             dataDir=./data
           #Make sure to change port for every new instance
            clientPort=2181
             server.1=127.0.0.1:6666:6667
             server.2=127.0.0.1:7777:7778
             server.3=127.0.0.1:8888:8889

z2/z2.cfg
             tickTime=2000
             initLimit=10
             syncLimit=5
             # Yes, you can give relative path.
             dataDir=./data
           #Make sure to change port for every new instance
            clientPort=2182
             server.1=127.0.0.1:6666:6667
             server.2=127.0.0.1:7777:7778
             server.3=127.0.0.1:8888:8889

z3/z3.cfg
             tickTime=2000
             initLimit=10
             syncLimit=5
             # Yes, you can give relative path.
             dataDir=./data
            #Make sure to change port for every new instance
            clientPort=2183
             server.1=127.0.0.1:6666:6667
             server.2=127.0.0.1:7777:7778
             server.3=127.0.0.1:8888:8889
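
Since the three config files differ only in clientPort, they can also be generated in a loop instead of being typed by hand. A minimal sketch, assuming a throwaway `BASE_DIR` (my placeholder) in place of your real working folder:

```shell
#!/bin/sh
# Sketch: generate z1..z3, each with data/myid and its own config file.
# BASE_DIR is a throwaway location chosen for the demo.
BASE_DIR="$(mktemp -d)"

for id in 1 2 3; do
    dir="$BASE_DIR/z$id"
    mkdir -p "$dir/data"
    echo "$id" > "$dir/data/myid"

    # Only clientPort differs between instances: 2181, 2182, 2183.
    cat > "$dir/z$id.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=./data
clientPort=$((2180 + id))
server.1=127.0.0.1:6666:6667
server.2=127.0.0.1:7777:7778
server.3=127.0.0.1:8888:8889
EOF
done

ls "$BASE_DIR"/z*/
```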
---


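Having created three zookeeper server configurations, we can simulate a zookeeper quorum by launching three server instances on the same machine, each on different ports.

Now, let's launch them. Each server is started from inside its own folder so that the relative dataDir=./data resolves to that folder's data directory.

PATH_TO_ZK=/path/to/zookeeper/installation

cd z1
$PATH_TO_ZK/bin/zkServer.sh start ./z1.cfg
cd ../z2
$PATH_TO_ZK/bin/zkServer.sh start ./z2.cfg
cd ../z3
$PATH_TO_ZK/bin/zkServer.sh start ./z3.cfg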

NOTE: Each server picks its own address and ports from the server.x line whose x matches the content of its myid file.
### Testing by connecting a client to our running zookeeper servers
    • Launch zkCli.sh
      $PATH_TO_ZK/bin/zkCli.sh  -server 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    • Follow the steps we did above. Changes will be synced across all servers
    • Congrats, you just launched a zookeeper ensemble
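
The -server value is easy to get wrong by hand (a stray space after a comma splits the argument). A tiny helper can build it from the client ports; `zk_connect_string` and `ZK_HOST` are names I made up for this sketch.

```shell
#!/bin/sh
# Sketch: build the -server argument from a list of client ports.
# zk_connect_string is a hypothetical helper; the host defaults to 127.0.0.1.
zk_connect_string() {
    host="${ZK_HOST:-127.0.0.1}"
    result=""
    for port in "$@"; do
        # Join with commas only; a space would split the argument in the shell.
        result="${result:+$result,}$host:$port"
    done
    echo "$result"
}

CONNECT="$(zk_connect_string 2181 2182 2183)"
echo "$CONNECT"
# Then, with a running ensemble:
#   $PATH_TO_ZK/bin/zkCli.sh -server "$CONNECT"
```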

    ###NOTE: See the logs for information on how clients connect by creating a session. Each client is assigned a session id. A client is disconnected if its session expires. A session can be ended by the client itself or expired by the server. When the server is unable to reach a client within the session timeout, it'll assume the client died and it'll expire its session.


    Troubleshooting

    • zookeeper.out: Take note of the zookeeper.out log file created when launching servers (zkServer.sh).
      It has enough information to debug most problems.
    • 2014-01-20 10:14:02,506 - FATAL [main:QuorumPeerMain@83] - Invalid config, exiting abnormally
      Your config file is invalid.
      • Please verify that the x in each server.x parameter matches that server's myid and that the bind address is correct, i.e. server.x=192.18.2.2:8888:8889
      • Make sure the data directory and log directory exist and are writable
        • dataDir=./this/path/should/exist
        • dataLogDir=/this/path/should/exist
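
The checks in this list can be scripted as a quick pre-flight before starting a server. This is only a sketch covering the causes listed here (other config mistakes can produce the same FATAL line). `zk_preflight` is a hypothetical helper, it assumes `key=value` lines with no spaces around `=`, and it should be run from the directory you start the server in so relative paths resolve the same way.

```shell
#!/bin/sh
# Sketch: pre-flight check for the config problems listed above.
# zk_preflight is a hypothetical helper, not part of ZooKeeper.
zk_check_dir() { [ -d "$1" ] && [ -w "$1" ]; }

zk_preflight() {
    cfg="$1"
    status=0
    [ -r "$cfg" ] || { echo "cannot read $cfg"; return 1; }

    data_dir="$(sed -n 's/^dataDir=//p' "$cfg")"
    log_dir="$(sed -n 's/^dataLogDir=//p' "$cfg")"   # optional setting

    if ! zk_check_dir "$data_dir"; then
        echo "dataDir '$data_dir' is missing or not writable"; status=1
    fi
    if [ -n "$log_dir" ] && ! zk_check_dir "$log_dir"; then
        echo "dataLogDir '$log_dir' is missing or not writable"; status=1
    fi

    # With server.N lines present, myid must exist and match one of them.
    if grep -q '^server\.' "$cfg"; then
        id=""
        if [ -f "$data_dir/myid" ]; then
            id="$(tr -d '[:space:]' < "$data_dir/myid")"
        fi
        if [ -z "$id" ] || ! grep -q "^server\.$id=" "$cfg"; then
            echo "myid '$id' has no matching server.N line"; status=1
        fi
    fi
    return "$status"
}

# Demo on a throwaway config: fails first (no myid), passes once it exists.
DEMO="$(mktemp -d)"
mkdir -p "$DEMO/data"
printf 'dataDir=%s\nclientPort=2181\nserver.1=127.0.0.1:6666:6667\n' \
    "$DEMO/data" > "$DEMO/zoo.cfg"
zk_preflight "$DEMO/zoo.cfg" || echo "fix the issues above first"
echo "1" > "$DEMO/data/myid"
zk_preflight "$DEMO/zoo.cfg" && echo "config looks startable"
```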

    References:

    • Configuration parameters: http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#ch_administration
