Sunday, 6 December 2015

Storm Single-Node Cluster Installation


What is Storm?

Storm is an open-source stream-processing framework originally developed by BackType and
subsequently at Twitter. While the underlying framework is written in Clojure, a Storm application --
called a Topology -- can be written in any programming language, with Java as the
predominant language of choice. Users are free to stitch together a directed graph of
execution, with Spouts (data sources) and Bolts (operators). Architecturally, it consists of a
central job and node management entity dubbed the Nimbus node, and a set of per-node
managers called Supervisors. The Nimbus node is in charge of work distribution, job
orchestration, communication, fault-tolerance, and state management (for which it relies on
ZooKeeper). The parallelism of a Topology can be controlled at 3 different levels:
number of workers (cluster-wide processes), executors (number of threads per worker), and
tasks (number of bolt/spout instances executed per thread). Intra-worker communication in Storm is
enabled by LMAX Disruptor, while ZeroMQ is employed for inter-worker
communication. Moreover, tuple distribution across tasks is decided by groupings, with
shuffle grouping, which does random distribution, being the default option.

stormpicture.png


Prerequisites to install before installing Storm.


JAVA INSTALLATION:
Java 6, 7, or 8 must be installed on the Linux (Ubuntu) machine. To set up Java, install openjdk-6, openjdk-7, or openjdk-8. The Oracle JDK is the official JDK; however, it is no longer provided by Oracle as a default installation for Ubuntu.
Command to install Java on an Ubuntu machine:
sudo apt-get install openjdk-7-jdk

To find the path where Java is installed, use the following command:
sudo update-alternatives --config java

javapathcommand.png

Update the path in the ~/.bashrc profile (export the path):
javapath1.png

javapath2.png
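
As a rough sketch of the export step, the helper below derives a JAVA_HOME value from the binary path that update-alternatives prints. The function name java_home_from_bin and the example path are assumptions for illustration; use the path reported on your own machine.

```shell
# Hypothetical helper: turn the java binary path into a JAVA_HOME directory.
java_home_from_bin() {
    bin_path="$1"
    home="${bin_path%/bin/java}"   # strip a trailing /bin/java
    home="${home%/jre}"            # strip a trailing /jre, if present
    printf '%s\n' "$home"
}

# Lines you might then add to ~/.bashrc (the path is an assumed example):
#   export JAVA_HOME="$(java_home_from_bin /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java)"
#   export PATH="$PATH:$JAVA_HOME/bin"
```

After editing ~/.bashrc, run "source ~/.bashrc" or open a new terminal so the variables take effect.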

 Install the Git, Libtool, Automake, uuid-dev, g++, and gcc-multilib packages using the following commands:
  1. sudo apt-get install git -y
  2. sudo apt-get install libtool -y
  3. sudo apt-get install automake -y
  4. sudo apt-get install uuid-dev -y
  5. sudo apt-get install g++ -y
  6. sudo apt-get install gcc-multilib -y

ZooKeeper Installation:
After all the prerequisite packages are installed, we install the ZooKeeper service.
step 1: Download the ZooKeeper tarball from any of the mirrors into a local directory on your Linux machine using the wget command. (The mirror may change.)
zookeeperdownload1.png
step 2: Untar the ZooKeeper tarball using: tar -xvf zookeeper-3.4.6.tar.gz
step 3: Change the ZooKeeper configuration properties. Enter the extracted zookeeper-3.4.6 folder, go into the conf folder, make a copy of the zoo_sample.cfg file, and name it zoo.cfg.
    Configure the properties of ZooKeeper
  1. cd /usr/local/zookeeper (I renamed the folder zookeeper-3.4.6 to zookeeper.)
  2. cd conf
  3. cp zoo_sample.cfg zoo.cfg
  4. vi zoo.cfg
Change the following properties in the configuration file:
tickTime
This is the basic time unit in milliseconds used by ZooKeeper. It is used for heartbeats, and the minimum session timeout is twice the tickTime.
 zootktym.png
dataDir
This is the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.
zoodatdir.png
Note: When running a multi-node cluster, the data path should be changed on the slave nodes.
clientPort
This is the port that listens for client connections.
zooclentport.png
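
The three properties above could be written out as in the sketch below. The values are assumptions, not taken from the screenshots: 2000 and 2181 match the defaults shipped in zoo_sample.cfg, and /var/lib/zookeeper is only an example data directory. The file is written to zoo.cfg.example (a hypothetical name) so that an existing zoo.cfg is not overwritten.

```shell
# Sketch: write a minimal standalone ZooKeeper config with assumed values.
ZOO_CFG="${ZOO_CFG:-./zoo.cfg.example}"

cat > "$ZOO_CFG" <<'EOF'
# basic time unit in milliseconds (heartbeats; min session timeout = 2 * tickTime)
tickTime=2000
# where snapshots (and, by default, transaction logs) are stored; assumed path
dataDir=/var/lib/zookeeper
# port that listens for client connections
clientPort=2181
EOF
```

Review the file and copy it to zoo.cfg once the data directory exists and is writable by the user that runs ZooKeeper.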

step 4: When all the required changes are made in zoo.cfg, start ZooKeeper. Change to the bin folder:
cd /usr/local/zookeeper/bin

step 5: In this path there are several shell scripts; use the zkServer.sh script to start ZooKeeper.
/usr/local/zookeeper/bin/zkServer.sh start
zoozkserverstart.png
zkServer.sh supports the following commands:
start, stop, status, start-foreground, restart, upgrade, print-cmd.

step 6: Update the path in the ~/.bashrc profile.
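
One possible way to do the ~/.bashrc update without adding the same line twice is sketched below. The helper name add_to_path is hypothetical, and /usr/local/zookeeper/bin is assumed from the folder used earlier in this post.

```shell
# Hypothetical helper: append a PATH export to a profile file only if it is missing.
add_to_path() {
    profile="$1"
    bin_dir="$2"
    line="export PATH=\"\$PATH:$bin_dir\""
    touch "$profile"
    # -x: match the whole line, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

# Example call (paths are assumptions):
#   add_to_path "$HOME/.bashrc" /usr/local/zookeeper/bin
```

The same helper can be reused later for the Storm bin directory.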

 Installing ZeroMQ and JZMQ:

  • Create a directory named storm and download into it the tarballs needed for setting up the Storm cluster.
  1. mkdir storm
  2. cd storm
  3. tar -xzf zeromq-2.1.7.tar.gz
  4. cd zeromq-2.1.7
 zeromq.png
  • From the zeromq-2.1.7 directory, run the following commands:
  1. ./configure
zeromq2.png
  2. make
  3. sudo make install
  • Now we have to add the Java bindings to ZeroMQ.
    Install jzmq (run these from the jzmq source directory)
  1. cd jzmq
  2. ./autogen.sh
  3. ./configure
  4. make
  5. sudo make install

Finally, installation of Storm:
Download the Storm tarball from one of the mirrors listed on the Storm downloads page.
Note: Before installing Storm, all the prerequisites should be installed without any errors.
Installation:
step 1: Download the Storm tarball. (The mirror may change.)

step 2: Untar the Storm tarball:
          tar -xvf apache-storm-0.9.0.tar.gz
step 3: Update the ~/.bashrc file with the Storm path.
       stormpath1.png

Setup configuration


step 4: After setting up the path for Storm, change to the conf directory of the Storm directory, which contains the storm.yaml config file. (I renamed apache-storm-0.9.0 to storm.)
cd /usr/local/storm/conf
stormconfig.png

step 5: Edit the Storm config file and add the required settings:
        sudo vi /usr/local/storm/conf/storm.yaml
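
The exact settings are not reproduced in this post, so the following is only a sketch of what a single-node storm.yaml might contain. All values are assumptions: ZooKeeper and Nimbus on 127.0.0.1, /var/storm as the local directory, two worker slots, and ui.port set to 8772 to match the UI address used later (the Storm default is 8080). The nimbus.host key applies to the 0.9.x line. The file is written to storm.yaml.example (a hypothetical name) so nothing is overwritten.

```shell
# Sketch: write an example single-node storm.yaml with assumed values.
STORM_YAML="${STORM_YAML:-./storm.yaml.example}"

cat > "$STORM_YAML" <<'EOF'
# ZooKeeper hosts used by the cluster (single local node assumed)
storm.zookeeper.servers:
    - "127.0.0.1"
# host where Nimbus runs
nimbus.host: "127.0.0.1"
# local working directory for Nimbus and Supervisor state (assumed path)
storm.local.dir: "/var/storm"
# one worker process can run per listed port
supervisor.slots.ports:
    - 6700
    - 6701
# port for the Storm UI
ui.port: 8772
EOF
```

Compare the result with your own requirements before copying it over /usr/local/storm/conf/storm.yaml.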


step 6: Start the Storm nimbus, supervisor, and UI:
              cd /usr/local/storm/bin/
              To start nimbus       : storm nimbus
              To start supervisor   : storm supervisor
              To start storm ui     : storm ui
NOTE: Before starting the Storm services, make sure that the ZooKeeper service is running.
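
Each of these commands keeps running in the terminal where it is started, so one option is to launch them in the background and keep their output in log files. The sketch below is one way to do that; the helper name start_daemon and the STORM_BIN and LOG_DIR variables are hypothetical, not part of Storm.

```shell
# Sketch: start a Storm daemon in the background and record its log and PID.
STORM_BIN="${STORM_BIN:-storm}"         # assumes the storm script is on PATH
LOG_DIR="${LOG_DIR:-./storm-logs}"      # assumed log location

start_daemon() {
    mkdir -p "$LOG_DIR"
    # nohup keeps the daemon alive after the terminal closes
    nohup "$STORM_BIN" "$1" > "$LOG_DIR/$1.log" 2>&1 &
    echo $! > "$LOG_DIR/$1.pid"
}

# Example calls, in this order, once ZooKeeper is running:
#   start_daemon nimbus
#   start_daemon supervisor
#   start_daemon ui
```

To stop a daemon later, the saved PID can be passed to kill, for example: kill "$(cat storm-logs/nimbus.pid)".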

step 7: Start the ZooKeeper service and the supervisor service on the supervisor nodes.
  • When all the Storm services are started, the Storm UI shows the connected supervisors and the Nimbus uptime.
  • Open a web browser and enter the Nimbus or UI address, e.g. 127.0.0.1:8772, which opens the Storm UI.


Monday, 14 September 2015

Installing Hortonworks Sandbox 2.3 – VirtualBox on Ubuntu


Getting Ready to install on Ubuntu using Oracle VirtualBox.

Prerequisites
To use the Hortonworks Sandbox on Ubuntu you must have the following resources available to you: 
● Hosts: A 64-bit machine with a chip that supports virtualization.
               A BIOS that has been set to enable virtualization support. 
● Host Operating Systems: Ubuntu 12.x, 14.x, or later 
● Supported Browsers: 
               Google Chrome – latest stable release.
● At least 4 GB of RAM 
   Note: If you wish to enable the optional Ambari or HBase projects, you will need 8 GB of physical RAM and will need to increase the RAM allocated to the virtual machine to at least 4 GB.
● Virtual Machine Environments: Oracle VirtualBox version 4.2 or later 
● The correct virtual appliance file for your environment. Download them from http://hortonworks.com/products/hdp/
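
The host requirements above can be checked from a terminal. The sketch below assumes a Linux host that exposes /proc/cpuinfo and /proc/meminfo; the function names are hypothetical. The vmx and svm CPU flags indicate Intel VT-x and AMD-V support respectively; they show CPU capability only, so the BIOS setting still has to be checked separately.

```shell
# Hypothetical helper: succeed if the CPU advertises hardware virtualization.
cpu_supports_virtualization() {
    grep -Eq '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: print total memory in whole GB (rounded down).
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

# Example use:
#   uname -m                        # x86_64 means a 64-bit machine
#   cpu_supports_virtualization && echo "virtualization flags found"
#   mem_total_gb
```

Because the kernel reserves some memory and the value is rounded down, a machine with 4 GB installed may report 3 here.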

Virtual Machine Overview

The Hortonworks Sandbox is delivered as a virtual appliance that is a bundled set of operating system, configuration settings, and applications that work together as a unit. The virtual appliance (indicated by an .ovf or .ova extension in the filename) runs in the context of a virtual machine (VM), a piece of software that appears to be an application to the underlying (host) operating system, but that looks like a bare machine, including CPU, storage, network adapters, and so forth, to the operating system and applications that run on it. To run the Hortonworks Sandbox you must install one of the supported virtual machine environments on your host machine, either Oracle VirtualBox or VMware Fusion (Mac) or Player (Windows/Linux). 

Here is a link for setting up VirtualBox on Ubuntu: bigdatadays.

Installing on UBUNTU OS using Oracle VirtualBox

1. Open the Oracle VM VirtualBox Manager by double-clicking it (VirtualBox Extension Pack).


2. The Oracle VM VirtualBox Manager window opens. (I already had one VM; don't worry about it.)
 
3. Import the Hortonworks Sandbox appliance file: File->Import Appliance.

4. Click the Open appliance button; the file browser opens. Make sure you select the correct appliance. In this case, the top file is the VirtualBox formatted file. Click the Open button.

5. The Import Virtual Appliance screen opens.

6. You return to the Import Virtual Appliance screen. Click Next.

7. The Appliance settings screen appears. Your host should have at least 4 GB of physical RAM installed. You may wish to allocate more RAM to the VM; 8 GB of RAM in the virtual appliance will improve performance. Click Import.

8. The appliance is imported.

9. Turn on the Hortonworks Sandbox. Select the appliance and click the green Start arrow. A console window opens and displays an information screen. Click OK to clear the info screen.

10. Hortonworks (CentOS) starts running on the VM, and you're done.

One last word from my side: "Magic is believing in yourself, if you can do that, you can make anything happen." Just try the magic within you.

Wednesday, 9 September 2015

VirtualBox_Setup on Linux

Simple things should be simple, complex things should be possible.


Sometimes simple things become complex and look impossible, and the same happened to me. The simple task of setting up VirtualBox on my PC turned into a real nightmare when I followed the documentation. I'm not saying that following the documentation and setting up VirtualBox is difficult, but it is tedious.

 Here are the simple and easy steps to install VirtualBox on your PC.

 Procedure:

The steps below describe how to install VirtualBox on Linux. The screenshots displayed are taken from a Linux system. After installation, the machine runs the latest VirtualBox software.

 1) Open a web browser and go to virtualbox.org.

 2) Navigate to Downloads and find the package for your machine.



 3) Go to VirtualBox 5.0.4 for Linux hosts.

 4) Select the Linux version (Ubuntu 14.x or Debian x): i386 (32-bit) or AMD64 (64-bit).

Download the deb file.
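
If you are unsure which of the two packages matches your machine, the output of uname -m can be mapped to the package architecture. This is only a sketch; the function name deb_arch_for is hypothetical.

```shell
# Hypothetical helper: map a machine type (as printed by uname -m)
# to the architecture label used in the .deb file names.
deb_arch_for() {
    case "$1" in
        x86_64)              echo amd64 ;;
        i386|i486|i586|i686) echo i386 ;;
        *)                   echo unknown ;;
    esac
}

# Example use:
#   deb_arch_for "$(uname -m)"
```

On Debian and Ubuntu, dpkg --print-architecture reports the same information directly.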

 5) Next, go back to the Downloads page and download the VirtualBox Extension Pack for all supported platforms.

 6) Install the VirtualBox deb file using the command: sudo dpkg -i file-path

 7) On successful installation, add your user to the vboxusers group with the command: sudo adduser username vboxusers

When you're done with it, congrats! You have now completed the installation of VirtualBox on your Linux system, and your VirtualBox is ready to use.