Setting up Galera notification command on Ubuntu

Galera Cluster:

Galera Cluster is a synchronous multi-master database cluster, based on synchronous replication and Oracle’s MySQL/InnoDB. When Galera Cluster is in use, you can direct reads and writes to any node, and you can lose any individual node without interruption in operations and without the need to handle complex failover procedures.

At a high level, Galera Cluster consists of a database server—that is, MySQL or MariaDB—that then uses the Galera Replication Plugin to manage replication.

Setting up Galera notification command:

While you can use the database client to check the status of your cluster, the individual nodes and the health of replication, you may find it counterproductive to log into the client on each node to run these checks. Galera Cluster provides a notification script and interface for customization, allowing you to automate the monitoring process for your cluster.

More details can be found here: http://galeracluster.com/documentation-webpages/notificationcmd.html

The default script logs to MySQL, which is not practical: when the whole cluster is down you won’t be able to connect to any node to get details about its current status.
Hence, it is a good idea to log cluster status to some other data store.

Here we will log all status updates to MongoDB. Let’s create the script,

vim galera-wsrep-notify.sh

#!/bin/sh -eu

MONGO_HOST="192.168.0.1"
MONGO_DB="galera_log_db"
MONGO_COL="galera_log_coll"
NODE_NAME="HOST_NAME"

configuration_change()
{
    # Upsert so a repeated configuration change does not fail on a duplicate _id.
    mongo --host $MONGO_HOST $MONGO_DB --eval 'db.'$MONGO_COL'.update({"_id" : "members"}, {"_id" : "members", "members" : "'$MEMBERS'"}, {upsert: true});'

    # Count the members in the comma-separated list.
    local idx=0
    for NODE in $(echo $MEMBERS | sed s/,/\ /g)
    do
        idx=$(( $idx + 1 ))
    done

    mongo --host $MONGO_HOST $MONGO_DB --eval 'db.'$MONGO_COL'.update({"_id" : "'$NODE_NAME'"}, {"_id" : "'$NODE_NAME'", "size" : "'$idx'", "status" : "'$STATUS'", "primary" : "'$PRIMARY'"}, {upsert: true});'
}

status_update()
{
    mongo --host $MONGO_HOST $MONGO_DB --eval 'db.'$MONGO_COL'.update({"_id":"'$NODE_NAME'"},{$set:{"status":"'$STATUS'"}});'
}

COM=status_update # not a configuration change by default

while [ $# -gt 0 ]
do
    case $1 in
    --status)
        STATUS=$2
        shift
        ;;
    --uuid)
        CLUSTER_UUID=$2
        shift
        ;;
    --primary)
        [ "$2" = "yes" ] && PRIMARY="1" || PRIMARY="0"
        COM=configuration_change
        shift
        ;;
    --index)
        INDEX=$2
        shift
        ;;
    --members)
        MEMBERS=$2
        shift
        ;;
    esac
    shift
done

# Undefined means the node is shutting down
if [ "$STATUS" != "Undefined" ]
then
    $COM
fi

exit 0

Update the MySQL node host name (NODE_NAME) and the MongoDB connection parameters to match your environment.
NOTE: You will need to set up the above script on each node in the cluster.
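For reference, Galera passes cluster membership through the --members option as a comma-separated list; per the notification-command documentation, each entry has the form node-uuid/node-name/incoming-address. A small, hypothetical Python helper (the function name is mine, not part of Galera) to parse that string when reading it back out of MongoDB might look like:

```python
def parse_members(members: str):
    """Parse a Galera --members string into a list of dicts.

    Each entry is assumed to have the form "node-uuid/node-name/incoming-address".
    """
    nodes = []
    for entry in members.split(","):
        if not entry:
            continue  # skip empty entries (e.g. an empty members string)
        uuid, name, address = entry.split("/")
        nodes.append({"uuid": uuid, "name": name, "address": address})
    return nodes


if __name__ == "__main__":
    sample = "aaa-111/node1/192.168.0.1:3306,bbb-222/node2/192.168.0.2:3306"
    for node in parse_members(sample):
        print(node["name"], node["address"])
```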

Now, let’s set the script as the notification command in the cluster settings,

sudo mv galera-wsrep-notify.sh /etc/mysql/conf.d
sudo chown mysql:mysql /etc/mysql/conf.d/galera-wsrep-notify.sh
sudo chmod 700 /etc/mysql/conf.d/galera-wsrep-notify.sh

sudo vim /etc/mysql/conf.d/cluster.cnf

Add wsrep option,
wsrep_notify_cmd="/etc/mysql/conf.d/galera-wsrep-notify.sh"

Now restart nodes one by one and you will start getting data into mongo collections.

Now you will need a monitoring script that checks this data for cluster status and alerts DevOps if anything is wrong (e.g., a split-brain where no node is part of the primary component).

$vim galera-cluster-check.py

#!/usr/bin/python3

import pprint

from pymongo import MongoClient

try:
    client = MongoClient('192.168.0.1', 27017)
    db = client.galera_log_db
    col = db.galera_log_coll
    for doc in col.find():
        pprint.pprint(doc)
except Exception as e:
    print(e)


$python3 galera-cluster-check.py

The script above just prints the documents. You can implement your own customized logic on top of it.
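As a sketch of what that customized logic could check (assuming the document shape written by the notification script above: one document per node carrying "status" and "primary" fields, plus one "members" document), a pure function over the fetched documents can flag unhealthy states such as a cluster where no node reports membership in the primary component:

```python
def cluster_alerts(docs):
    """Return a list of alert strings for the given node status documents.

    Assumes each per-node document has "status" and "primary" fields, as
    written by the notification script above; the "members" document has
    neither and is ignored.
    """
    nodes = [d for d in docs if "status" in d]
    alerts = []
    for d in nodes:
        if d["status"] != "Synced":
            alerts.append("node %s is %s" % (d["_id"], d["status"]))
    # If no node reports primary="1", the cluster has lost its primary
    # component (possible split-brain).
    if nodes and not any(d.get("primary") == "1" for d in nodes):
        alerts.append("no node is part of the primary component (possible split-brain)")
    return alerts
```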

Thanks.

Posted in MariaDb

Setting up Galera Arbitrator on Ubuntu

Galera Arbitrator:

The recommended deployment of Galera Cluster is that you use a minimum of three instances. Three nodes, three datacenters and so on.

In the event that the expense of adding resources, such as a third datacenter, is too costly, you can use Galera Arbitrator. Galera Arbitrator is a member of the cluster that participates in voting, but not in the actual replication.
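The reason a two-node cluster needs the arbitrator is quorum arithmetic: a cluster component stays primary only if it holds a strict majority of the votes, so a 1-1 network split between two data nodes leaves neither side with quorum. A tiny illustrative sketch of the majority rule (generic, not Galera’s actual implementation):

```python
def has_quorum(partition_votes: int, total_votes: int) -> bool:
    """A cluster partition keeps quorum only with a strict majority of votes."""
    return 2 * partition_votes > total_votes

# Two data nodes, no arbitrator: a split leaves each side with 1 of 2 votes.
print(has_quorum(1, 2))  # False -> both sides become non-primary
# Two data nodes plus garbd: one side keeps 2 of 3 votes.
print(has_quorum(2, 3))  # True -> that side stays primary
```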

More details can be found here:
http://galeracluster.com/documentation-webpages/arbitrator.html
https://www.percona.com/doc/percona-xtradb-cluster/5.6/howtos/garbd_howto.html

Setting up Galera Arbitrator :

Suppose you already have 2 nodes,
1) 192.168.0.1
2) 192.168.0.2

and you want to add arbitrator for cluster failover on new node 192.168.0.3. Then,

Create the config file,
vim /etc/default/garb.arbitrator.config

Add,
group = cluster_name
address = gcomm://192.168.0.1:4567,192.168.0.2:4567,192.168.0.3:4567
log = /var/logs/garbd.log

Create the log file,
sudo touch /var/logs/garbd.log
sudo chmod 777 /var/logs/garbd.log

Start garbd,
garbd -d -c /etc/default/garb.arbitrator.config

Now you can see the new cluster size using the MariaDB wsrep status variables, e.g. SHOW STATUS LIKE 'wsrep_cluster_size';

Thanks.

Posted in Architecture

How to set up Apache Thrift on Ubuntu

How to set up Apache Thrift on Ubuntu:

The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.
More details can be found here, Apache Thrift

Basic requirements:
C++:
$sudo apt-get install automake bison flex g++ git libboost1.58-all-dev libevent-dev libssl-dev libtool make pkg-config
Java:
Any java,
$sudo apt-get install ant
Python:
$sudo apt-get install python-all python-all-dev python-all-dbg
Php:
$sudo apt-get install php7.0-dev php7.0-cli phpunit

Download Apache Thrift :

$wget http://mirror.fibergrid.in/apache/thrift/0.10.0/thrift-0.10.0.tar.gz
$tar -xvf thrift-0.10.0.tar.gz

Build and Install the Apache Thrift compiler:

$cd thrift-0.10.0/
$./configure
$sudo make
$sudo make check
$sudo make install
$thrift -version

If you get the following error,
thrift: error while loading shared libraries: libthriftc.so.0: cannot open shared object file: No such file or directory
then,

$vim ~/.bashrc
export LD_LIBRARY_PATH=/usr/local/lib/:${LD_LIBRARY_PATH}
$source ~/.bashrc
$echo $LD_LIBRARY_PATH
$thrift -version

The build above also generates source code for each language from the tutorial .thrift files, as follows. You don’t need to generate it yourself for the examples,

$cd thrift-0.10.0/tutorial/java/
$../../compiler/java/thrift -r --gen java ../../tutorial/tutorial.thrift

Test examples:
Now the client-server source code is ready to run. You can run the examples using the following commands.

$cd thrift-0.10.0/tutorial/java/
$make tutorial
OR
$make tutorialserver
$make tutorialclient
OR
java -cp ../../lib/java/build/*:../../lib/java/build/lib/*:tutorial.jar JavaServer
java -cp ../../lib/java/build/*:../../lib/java/build/lib/*:tutorial.jar JavaClient simple
java -cp ../../lib/java/build/*:../../lib/java/build/lib/*:tutorial.jar JavaClient secure

Posted in Apache Thrift

Best Java tools for profiling

Best Java tools :

Recently, I was building a real-time, high-performance back-end Java application that computes 16K ratios per company for around 70K companies (~1.1bn total) by pulling data from different sources such as MongoDB, Cassandra, MySQL and Redis. It was a multi-threaded application that used a dynamic-programming approach, storing intermediate results to pass as input when calculating derived ratios on the next layer. With a small setup of average servers serving both the front-end application and back-end batch processing, it was very important to build a solution with low CPU/IO and memory utilization.
When I tested the application the first time, the response time for computing one company was 2 minutes with 20 MB memory usage, which was really not acceptable.
So I decided to profile the code to find duplicate computations (reducing CPU/IO to improve response time) and to remove memory leaks caused by strong references. I searched for the best tools for profiling Java applications and came across some amazing ones, which helped me reduce the response time to 4 seconds and memory usage to <2 MB per company (before GC).

[image: tools]

All these tools are as following,

1) VisualVM :

It is a free and very handy all-in-one Java troubleshooting visual tool, integrating command-line JDK tools with lightweight profiling capabilities. It is designed for both development and production use, monitoring applications locally and remotely. VisualVM comes with the default Oracle HotSpot JVM installation and gives you insights into CPU utilization, memory usage, threads, and CPU/memory sampling/profiling.
You can extend the basic functionality by adding plugins for MBeans, JConsole and garbage collection. It is possible to take thread/heap dumps and perform a manual GC on a running application. It is a superset of JConsole and requires jstatd for monitoring applications remotely.
More details can be found here, VisualVM.

[image: visualvm]

2) Java Mission Control :

Java Flight Recorder and Java Mission Control together create a complete tool chain to continuously collect low-level, detailed run-time information, enabling after-the-fact incident analysis. Java Flight Recorder is a profiling and event collection framework built into the Oracle JDK. It allows Java administrators and developers to gather detailed low-level information about how the Java Virtual Machine (JVM) and the Java application are behaving. Java Mission Control is an advanced set of tools that enables efficient and detailed analysis of the extensive data collected by Java Flight Recorder.
The tool chain enables developers and administrators to collect and analyze data from Java applications running locally or deployed in production environments.
More details can be found here, Java Mission Control.

[image: jvm]

The following tools are paid, but they give you very deep insights to pinpoint issues and save lots of time.

3) JProfiler:

JProfiler is a paid all-in-one profiler with an intuitive UI that helps you resolve performance bottlenecks, pin down memory leaks and understand threading issues.
More details can be found here, JProfiler.

[image: jprofiler]

4) YourKit:

YourKit is a paid all-in-one profiler that utilizes advanced Java profiling features and capabilities. It profiles any SE or EE application, server, technology and framework; on multiple platforms; locally and remotely; in development, testing and production; for teams and companies of any size.
More details can be found here, Yourkit.

[image: yourkit]

I hope this post saves you precious time in finding the best Java profiling tools.
Thank you.


How to set up Cassandra on Ubuntu 14.04

What is Cassandra?

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

Cassandra cluster data flow :

[image: request_response]

Cassandra architecture:

[image: cassandra_arch]

More information can be found here: Apache Cassandra, Datastax Cassandra Documentation and Users.

Cassandra cluster read/updates:

[image: read_write]

Install dependencies:

Java8 :

#sudo apt-get install software-properties-common python-software-properties
#sudo apt-add-repository ppa:webupd8team/java
#sudo apt-get update
#sudo apt-get install oracle-java8-installer

#java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

#python3 --version
Python 3.4.3

Install Cassandra:

The following steps will install Cassandra and other tools such as the cqlsh client and the nodetool administrative tool.


#echo "deb http://www.apache.org/dist/cassandra/debian 36x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
#curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
#sudo apt-get update
#sudo apt-key adv --keyserver pool.sks-keyservers.net --recv-key A278B781FE4B2BDA
#sudo apt-get install cassandra

#nodetool status
#cqlsh
cqlsh> show VERSION
[cqlsh 5.0.1 | Cassandra 3.6 | CQL spec 3.4.2 | Native protocol v4]
cqlsh> SHOW HOST
Connected to Test_Cluster at 127.0.0.1:9042.
cqlsh> SELECT cluster_name, listen_address FROM system.local;
cqlsh> tracing off
cqlsh> paging off
cqlsh> expand off

Setup Cassandra Multi-node Cluster:

Cassandra uses,
storage_port: 7000 for cluster communication (7001 if SSL is enabled),
native_transport_port: 9042 for native protocol clients,
JMX : 7199 for Java tools
rpc_port: 9160 for remote client communication.

Now let’s set up the cluster environment for 3 servers (192.168.0.1, 192.168.0.2, 192.168.0.3).
Do the following steps on all Cassandra nodes.

#sudo service cassandra stop

If you have SSDs, point the data dirs to the SSDs.

#sudo mkdir -m=777 /mnt/ssd/cassandra
#sudo vim /etc/cassandra/cassandra.yaml

Set the cluster name,

cluster_name: 'Stockopedia_Production_Cluster'

Point data dir to SSD,

data_file_directories:
- /mnt/ssd/cassandra/data
commitlog_directory: /mnt/ssd/cassandra/commitlog
saved_caches_directory: /mnt/ssd/cassandra/saved_caches
hints_directory: /mnt/ssd/cassandra/hints

Add node seeds,

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.0.1,192.168.0.2,192.168.0.3"

Update IPs/ports (here eth0 is 192.168.0.1),

listen_interface: eth0
start_rpc: true
rpc_interface: eth0
broadcast_rpc_address: 1.2.3.4

Update snitch for grouping machines into “datacenters” and “racks.”:

endpoint_snitch: GossipingPropertyFileSnitch

#vim /etc/cassandra/cassandra-rackdc.properties
dc=dc2
rack=rack2

#sudo service cassandra start

Test cassandra environment:


# nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.0.1 7.84 GiB 256 100.0% bde625de-8af6-477f-9a69-asdedec0fc62 rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.0.2 7.84 GiB 256 100.0% 162ae9dc-e572-4927-98a4-qwe371c9f071 rack2
Datacenter: dc3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.0.3 7.84 GiB 256 100.0% 162ae9dc-e572-4927-98a4-qwe371sdf071 rack3

Monitoring Cassandra servers using VisualVM and JConsole:

Remote process connection string : service:jmx:rmi://server-ip:7299/jndi/rmi://server-ip:7199/jmxrmi

vim /etc/cassandra/cassandra-env.sh

LOCAL_JMX=no
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=server-ip"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"

Cassandra Model:

[image: cassandra_model]

Import a CSV using the cqlsh client and upload it to a Cassandra table:

First, export the data from the MySQL database,

mysql>SELECT instrument,IFNULL(open,''),IFNULL(high,''),IFNULL(low,''),IFNULL(close,''),IFNULL(volume,''),IFNULL(date,''),created_at INTO OUTFILE '/tmp/price.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY ''
LINES TERMINATED BY '\n'
FROM price;
Query OK, 129865916 rows affected (7 min 5.59 sec)


#cqlsh
cqlsh> CREATE KEYSPACE price_db WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 1, 'dc2' : 1, 'dc3' : 1 };
cqlsh> USE price_db;

cqlsh> CREATE TABLE price_db.price_table (
    instrument varchar,
    open decimal,
    high decimal,
    low decimal,
    close decimal,
    volume decimal,
    date date,
    created_at timestamp,
    PRIMARY KEY (instrument, date)
) WITH CLUSTERING ORDER BY (date DESC) AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} ;

cqlsh> COPY price_db.price_table (instrument,open,high,low,close,volume,date,created_at) FROM '/tmp/price.csv';

cqlsh> CREATE INDEX close ON price_db.price_table (close);
cqlsh> DESCRIBE price_db.price_table;
cqlsh> SELECT * FROM price_db.price_table;
cqlsh> SELECT close FROM price_db.price_table WHERE instrument = 'XYZ';
cqlsh> DROP TABLE price_db.price_table;
cqlsh> TRUNCATE TABLE price_db.price_table;
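The table above is modeled for the query pattern: instrument is the partition key and date is a clustering column in descending order, so all prices for one instrument live in one partition and come back newest-first without client-side sorting. A small Python sketch of that ordering idea (illustrative only, with made-up sample rows):

```python
# Rows as (instrument, date, close); instrument plays the role of the
# partition key and date the clustering column, ordered DESC as in the
# CREATE TABLE statement above.
rows = [
    ("XYZ", "2017-01-01", 10.0),
    ("XYZ", "2017-01-03", 12.0),
    ("ABC", "2017-01-02", 7.0),
    ("XYZ", "2017-01-02", 11.0),
]

def partition(rows, instrument):
    """Mimic SELECT ... WHERE instrument = ? with CLUSTERING ORDER BY (date DESC)."""
    selected = [r for r in rows if r[0] == instrument]
    return sorted(selected, key=lambda r: r[1], reverse=True)

print(partition(rows, "XYZ")[0])  # the newest row for XYZ comes first
```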

Install Cassandra php-driver :

# php --version
PHP 5.5.9-1ubuntu4.20 (cli) (built: Oct 3 2016 13:00:37)

PHP data driver can be found here, http://datastax.github.io/php-driver/

sudo apt-get install g++ clang make cmake libssl-dev libgmp-dev libpcre3-dev git

#sudo wget http://downloads.datastax.com/cpp-driver/ubuntu/14.04/dependencies/libuv/v1.8.0/libuv_1.8.0-1_amd64.deb
#sudo wget http://downloads.datastax.com/cpp-driver/ubuntu/14.04/dependencies/libuv/v1.8.0/libuv-dev_1.8.0-1_amd64.deb
#sudo wget http://downloads.datastax.com/cpp-driver/ubuntu/14.04/cassandra/v2.5.0/cassandra-cpp-driver-dev_2.5.0-1_amd64.deb
#sudo wget http://downloads.datastax.com/cpp-driver/ubuntu/14.04/cassandra/v2.5.0/cassandra-cpp-driver_2.5.0-1_amd64.deb

#sudo dpkg -i libuv_1.8.0-1_amd64.deb
#sudo dpkg -i libuv-dev_1.8.0-1_amd64.deb
#sudo dpkg -i cassandra-cpp-driver_2.5.0-1_amd64.deb
#sudo dpkg -i cassandra-cpp-driver-dev_2.5.0-1_amd64.deb

#sudo pecl install cassandra

#sudo vim /etc/php5/mods-available/cassandra.ini
extension=cassandra.so
#php5enmod cassandra

Conclusion:
We found Cassandra to be 3x faster than relational MySQL for time-series data. You will need to design the data model according to your query patterns, using de-normalization, so that you can pull all the needed data with just one query.

Thank you.

Posted in Cassandra, Php

How to set up a virtual machine

How to set up a virtual machine?

This post covers how to set up the VirtualBox, VMware Workstation, VMware Player and Vagrant applications, and how to set up a guest operating system.

Here, I’ve selected Windows 10 Home as the host OS and Ubuntu 16.04 LTS Xenial as the guest OS.

Let’s first understand some basic terminologies regarding Virtualization.

Virtualization :
Virtualization is the creation of a virtual (rather than actual) version of something, such as an operating system, a server, a storage device or network resources.

[image: hosted virtualization]

Hypervisor :
A hypervisor or virtual machine monitor (VMM) is a piece of computer software, firmware or hardware that creates and runs virtual machines.

[image: virtualization]

A computer on which a hypervisor is running one or more virtual machines is defined as a host machine. Each virtual machine is called a guest machine. The hypervisor presents the guest operating systems with a virtual operating platform and manages the execution of the guest operating systems. Multiple instances of a variety of operating systems may share the virtualized hardware resources.

Type-1, native or bare-metal hypervisors
These hypervisors run directly on the host’s hardware to control the hardware and to manage guest operating systems. For this reason, they are sometimes called bare metal hypervisors.

Type-2 or hosted hypervisors
These hypervisors run on a conventional operating system just as other computer programs do.
A guest operating system runs as a process on the host. Type-2 hypervisors abstract guest operating systems
from the host operating system. VMware Workstation, VMware Player, VirtualBox and QEMU are examples of type-2 hypervisors.

[image: hypervisor types]

VirtualBox :

VirtualBox is a powerful x86 and AMD64/Intel64 virtualization product for enterprise as well as home use. It is freely available as Open Source Software under the terms of the GNU General Public License (GPL) version 2.

Presently, VirtualBox runs on Windows, Linux, Macintosh, and Solaris hosts and supports a large number of guest operating systems including but not limited to Windows (NT 4.0, 2000, XP, Server 2003, Vista, Windows 7, Windows 8, Windows 10), DOS/Windows 3.x, Linux (2.4, 2.6, 3.x and 4.x), Solaris and OpenSolaris, OS/2, and OpenBSD.

You can download it from here VirtualBox

There are two options to setup guest OS,
1) You need Ubuntu ISO image to set it up new guest OS machine. It can be downloaded from here ubuntu-16.04-desktop-amd64.iso.
2) You can directly download ready box from osboxs.

I’ve chosen to setup new machine using ISO image.

Installation steps are given on their website and are pretty simple. I chose the default settings when allocating disk size/space to the guest OS.
You need to make sure that your guest machine has sufficient space for both the applications and the data you will be keeping on it.

Follow these steps to setup virtual machine,

Virtualbox Setting Up a New Virtual Machine

You can also share folder between Host and guest machine using VirtualBox/SharedFolders and
HOWTO: Use Shared Folders.

If you want the share mounted automatically on each boot, put the mount command in /etc/rc.local (Debian-based distros), or whatever script runs at the end of the boot process. The Shared Folders service should mount it automatically, but that doesn’t always happen.
Using /etc/fstab has little effect, because that file is processed before the shared-folders module is loaded, so the mount will fail. Sometimes the share does get mounted because the Guest Additions check for it when they are loaded at boot, but this is very flaky, meaning it doesn’t work most of the time. You’re better off with the first option.
When you put the mount command in /etc/rc.local so it’s mounted at startup, you can’t use the short notation for your home folder: during startup everything runs as root, so ~ refers to root’s home folder (/root).

Guest Additions setup can be found here, Guest Additions.

Following are important steps for setting up shared folder,

sudo apt-get install build-essential
sudo apt-get install linux-headers-`uname -r`

sharename="shared_folder"
sudo mkdir /mnt/$sharename
sudo chmod 777 /mnt/$sharename
sudo mount -t vboxsf -o uid=1000,gid=1000 $sharename /mnt/$sharename
ln -s /mnt/$sharename $HOME/Desktop/$sharename

For me, it got mounted on /media/sf_share path for some reason.

If Google Chrome in the guest OS (Ubuntu) is not working properly, disable 3D acceleration in the guest OS settings in VirtualBox.

Resize Virtualbox Image :

1) Backup VDI image:

Take a backup of your image by copying the ‘.vdi’ file somewhere else. You may need it to restore your system in case of data loss.

path_to_folder\VirtualBox VMs\Xenial64\Xenial64.vdi

2) Extent VDI image size:

"C:\Program Files\Oracle\VirtualBox\VBoxManage.exe" modifyhd "path_to_folder\VirtualBox VMs\Xenial64\Xenial64.vdi" --resize 30000
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%

The above command extends the VDI size of your image, but you will still need to allocate that space to the guest OS partition manually.

3) Allocate space to partition:

You can resize your existing guest OS partition using GParted.

Download gparted from this link, (Choose correct 32bit/64bit)
gparted

Boot your virtual machine using above gparted ISO.

NOTE: If the GParted GUI is not working, or it hangs in between,
go to VM Settings -> System and check the “Enable EFI (special OSes only)” extended feature.

Now you should see gparted GUI.

You can see the /dev/sda1 and /dev/sda2 (linux-swap) partitions on screen. You can easily resize the /dev/sda2 (linux-swap) partition by selecting it, choosing “Resize/Move” and applying with the “Apply” button.

But if you wish to extend the /dev/sda1 partition, you will need to delete the linux-swap and /dev/sda2 partitions first. GParted will then allow you to resize /dev/sda1. Set it to whatever size you want, but keep some 3-4 GB free for linux-swap.

Now, re-add your Linux swap space. Select the unallocated space and right-click to “Create new Partition”, choosing “Create as: Extended partition”.

Press “+Add” and right-click the new “unallocated” space to create another partition. Choose “Create as: Logical Partition” and, underneath, “File System: linux-swap”.

Now press “+Add” and then “Apply” in the main window. Hopefully all changes are applied successfully.

You can now safely shut down this Live CD Virtual Machine.

You can now check disk space again.

#df -h

VMware Workstation / VMware Player :

VMware virtualizes computing, from the data center to the cloud to mobile devices, to help our customers be more agile, responsive, and profitable.

VMware Workstation and VMware Player transforms the way technical professionals develop, test, demonstrate and deploy software by running multiple x86-based operating systems simultaneously on the same PC.

You can download it from here Vmware

Most of the installation steps are identical for both of them. The few differences can be found on the compare page.

Follow these steps to setup virtual machine,

VMware Workstation Setting Up a New Virtual Machine

Here too you will need a shared folder to sync data between the host and guest machines.

Install vmware-tools as follows,

sudo apt-get install build-essential
sudo apt-get install linux-headers-`uname -r`
cp /cdrom/VM*.gz /tmp/
cd /tmp
tar xvzf VM*.gz
cd vmware-tools*
sudo ./vmware-install.pl

NOTE: if the installer complains that the following package needs to be removed,
open-vm-tools

then run,

sudo apt-get remove open-vm-tools

and try running the install command again.

Hit enter for all defaults.

Details are given here, Installing VMware tools on an Ubuntu guest

Once it is done, follow steps given as below to setup shared folder,

Create a New Shared Virtual Machine

Now, your shared folder should be ready on following path,

ls /mnt/hgfs/

Vagrant :

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

To achieve its magic, Vagrant stands on the shoulders of giants. Machines are provisioned on top of VirtualBox, VMware, AWS, or any other provider. Then, industry-standard provisioning tools such as shell scripts, Chef, or Puppet, can be used to automatically install and configure software on the machine.

You can get it here, Vagrant

To set it up on ubuntu follow this Setup vagrant on ubuntu

You can find ready boxes here, hashicorp.

Now you can just start your virtual machine as follows,

C:\HashiCorp\Vagrant\bin>vagrant init boxcutter/ubuntu1604
C:\HashiCorp\Vagrant\bin>vagrant up --provider virtualbox

If you are using Windows as the host OS and Ubuntu Server as the guest OS, then during setup you will get the following details,

Username : ubuntu
PrivateKey : Drive:\HashiCorp\Vagrant\bin\.vagrant\machines\default\virtualbox\private_key

As Windows doesn’t have an SSH client by default, you will need an SSH client like ‘putty’ to SSH into the guest Ubuntu server.

Putty can be downloaded from here, putty.

Also download PuTTYgen.

- PuTTY – Client for managing SSH sessions
- PuTTYgen – Tool for managing and creating SSH key pairs

Load the Vagrant private key in PuTTYgen, set a key passphrase (remember it, you will need it later) and click the ‘Save private key’ button to generate a PuTTY (.ppk) private key.
Point PuTTY at the .ppk file under the SSH-Auth section. You can also set the default auth username under the PuTTY Connection-Data section.

Now you should be able to ssh to guest Ubuntu server OS.

You can also setup shared folder on Vagrant. Steps are given here , Shared folder.

Now you can set up as many guest OSes as you like using the above applications.

Thanks.

Posted in Virtualization

Cross-Origin Resource Sharing

Cross-Origin Resource Sharing:

Cross-Origin Resource Sharing (CORS) is a specification that enables truly open access across domain boundaries.
If you serve public content, please consider using CORS to open it up for universal JavaScript/browser access.
Web browsers usually forbid cross-domain requests due to the same-origin security policy.
CORS is a technique that allows servers to serve resources to permitted origin domains by adding HTTP headers to responses, which web browsers respect.

Web Server Settings:

You can send the response header with PHP like,

header("Access-Control-Allow-Origin: *");
header("Access-Control-Allow-Headers: origin, x-requested-with, content-type");
header("Access-Control-Allow-Methods: GET, OPTIONS");

Or Set Header using Web Server. I am using Apache Web Server for this tutorial.

To add the CORS authorization to the header using Apache, simply add headers inside either
the <Directory>, <Location>, <Files> or <VirtualHost> sections of your server config
(usually located in a *.conf file, such as httpd.conf or apache.conf), or within a .htaccess file.

<IfModule mod_headers.c>
    Header add Access-Control-Allow-Origin "*"
    Header add Access-Control-Allow-Headers "origin, x-requested-with, content-type"
    Header add Access-Control-Allow-Methods "GET, OPTIONS"
</IfModule>

Adding "*" in the header makes your URL public: anyone can request data from it.

If you want to restrict it to one or more domains, proceed as follows.

1) Only one domain, "http://search.mysite.com" (note that an Origin value has no trailing slash):

Header add Access-Control-Allow-Origin "http://search.mysite.com"

2) Multiple sub-domains:

SetEnvIf Origin "^(.*\.mysite\.com)$" ORIGIN_SUB_DOMAIN=$1
Header set Access-Control-Allow-Origin "%{ORIGIN_SUB_DOMAIN}e" env=ORIGIN_SUB_DOMAIN

3) Multiple other domains:

SetEnvIf Origin "^http(s)?://(.+\.)?(mysite\.org|mysite\.com)$" ORIGIN_DOMAIN=$0
Header always set Access-Control-Allow-Origin %{ORIGIN_DOMAIN}e env=ORIGIN_DOMAIN
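The SetEnvIf patterns above are ordinary regular expressions matched against the request’s Origin header, so it is worth sanity-checking them before deploying. Here is the option-3 pattern re-implemented in Python for a quick test (the helper name is mine, for illustration only):

```python
import re

# Mirrors the Apache SetEnvIf pattern from option 3 above.
ORIGIN_RE = re.compile(r"^http(s)?://(.+\.)?(mysite\.org|mysite\.com)$")

def allowed_origin(origin: str):
    """Return the Origin value to echo back, or None if it is not allowed."""
    m = ORIGIN_RE.match(origin)
    return m.group(0) if m else None

print(allowed_origin("https://search.mysite.com"))  # allowed sub-domain
print(allowed_origin("https://evil.example.com"))   # rejected: None
```

Note that the pattern anchors on the whole host, so look-alike hosts such as "evilmysite.com" are rejected too.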

The file can be found at one of the following locations,
1) /etc/apache2/sites-available/mysite.conf
2) /etc/apache2/apache2.conf
3) /var/www/mywebsite/.htaccess

To ensure that your changes are correct, it is strongly recommended that you use
#apachectl -t

to check your configuration changes for errors. After this passes, you may need to reload Apache to make sure your changes are applied, by running the command

#sudo service apache2 reload
or
#apachectl -k graceful

Altering headers requires the use of mod_headers. It is enabled by default in Apache; however, you may want to make sure it’s enabled by running
#a2enmod headers

Note: you can also use add rather than set, but be aware that add can add the header multiple times, so it’s likely safer to use set.

Web Browser/Client Code:

Include following scripts,

<link href="//code.jquery.com/ui/1.11.4/themes/smoothness/jquery-ui.css" rel="stylesheet" />
<script type="text/javascript" src="//code.jquery.com/jquery-1.10.2.js">
</script><script type="text/javascript" src="//code.jquery.com/ui/1.11.4/jquery-ui.js"></script>

and add following code at the end of your page.


$("#autocomplete").autocomplete({
    autoFocus: true,
    minLength: 3,
    source: function (request, response) {
        $.ajax({
            url: "http://www.mysite.com/url",
            data: 'query=' + request.term,
            crossDomain: true,
            type: "GET",
            contentType: "application/json; charset=utf-8",
            async: false,
            success: function (data, status) {
                if (data.content.error != null) {
                    alert(data.content.error);
                } else {
                    var x;
                    var item = [];
                    for (x in data.content.data) {
                        item[x] = data.content.data[x].url;
                    }
                    response(item);
                }
            },
            error: function (err) {
                alert(err);
            }
        });
    },
    select: function (event, ui) {
        val = $("#autocomplete").val();
        window.open(ui.item.value);
        $("#autocomplete").val(val);
        return false;
    }
});

If you want to prevent Apache from executing your application code when the browser sends an HTTP OPTIONS preflight request, and instead return 200 OK, add the following rewrite rule in .htaccess,

<IfModule mod_rewrite.c>
    RewriteEngine On
    # Always return 200 OK for the OPTIONS method.
    RewriteCond %{REQUEST_METHOD} OPTIONS
    RewriteRule ^(.*)$ $1 [R=200,L]
</IfModule>

and use "always set" to overwrite the CORS headers in the vhost section,

<VirtualHost *:80>
ServerName search.mysite.com
DocumentRoot /var/www/mysite/
<Directory /var/www/mysite>
AllowOverride All
Options -Indexes
</Directory>
<IfModule mod_headers.c>
SetEnvIf Origin "^(.*\.mysite\.com)$" ORIGIN_SUB_DOMAIN=$1
Header always set Access-Control-Allow-Origin "%{ORIGIN_SUB_DOMAIN}e" env=ORIGIN_SUB_DOMAIN
Header always set Access-Control-Allow-Headers "origin, x-requested-with, content-type"
Header always set Access-Control-Allow-Methods "GET, OPTIONS"
</IfModule>
</VirtualHost>
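The SetEnvIf line above matches the Origin request header with an ordinary extended regex, so you can sanity-check a pattern against sample origins from the shell before deploying it. This is a small sketch using grep -E (same regex syntax) and the vhost's mysite.com domain as the example; the sample origins are hypothetical:

```shell
#!/bin/sh
# Sanity-check the Origin-matching pattern used by SetEnvIf above.
# Apache applies it to the Origin request header; grep -E uses the
# same extended-regex syntax.
PATTERN='^(.*\.mysite\.com)$'
for origin in "https://search.mysite.com" "https://evil.example.com"; do
    if echo "$origin" | grep -Eq "$PATTERN"; then
        echo "$origin => header set"
    else
        echo "$origin => header not set"
    fi
done
```

Note that `.*` also absorbs the scheme (https://), so the captured value is the full origin, which is exactly what Access-Control-Allow-Origin needs.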

More details can be found at,

http://enable-cors.org/index.html

http://www.html5rocks.com/en/tutorials/cors/

Thanks.

Posted in Client-side Scripting.

Killing child process in shell script

Killing child process in shell script:

Many times we need to kill child processes that are hung or blocked for some reason, e.g. an FTP connection issue.

There are two approaches:

1) Create a separate new parent for each child, which will monitor and kill the child process once the timeout is reached.

Create test.sh as follows,

#!/bin/bash
declare -a CMDs=("AAA" "BBB" "CCC" "DDD")
for CMD in ${CMDs[*]}; do
    (
        sleep 10 &
        PID=$!
        echo "Started $CMD => $PID"
        sleep 5
        echo "Killing $CMD => $PID"
        kill $PID
        echo "$CMD Completed."
    ) &
done
exit;

and, in another terminal, watch the processes named 'test' using the following command:

watch -n1 'ps x -o "%p %r %c" | grep "test" '
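For reference, the format specifiers in that ps expression are: %p prints the PID, %r the process-group ID, and %c the command name. You can try it directly:

```shell
# Show PID, process-group ID and command name for our processes
# (header line plus the first few entries).
ps x -o "%p %r %c" | head -3
```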

The above script will create 4 new child processes and their parents. Each child process will run for 10 sec, but once the 5 sec timeout is reached, the respective parent processes will kill those children.
So the children won't be able to complete their 10 sec execution.
Play around with those timings (swap 10 and 5) to see the other behaviour: in that case the child finishes its 5 sec execution before the 10 sec timeout is reached.
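If you only need a per-command timeout, the coreutils `timeout` utility (installed by default on Ubuntu) implements this first approach for you: it acts as the monitoring parent and kills the child when the limit expires. A minimal sketch:

```shell
#!/bin/sh
# `timeout` runs the command and sends SIGTERM when the limit expires.
# Here the 2 sec limit fires before the 10 sec sleep finishes.
timeout 2 sleep 10
echo "exit status: $?"
```

Exit status 124 is timeout's convention for "the time limit was hit"; any other status is the command's own.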

2) Let the current parent monitor and kill the child processes once the timeout is reached. This doesn't create a separate parent per child, and you can manage all child processes properly within the same parent.

Create test.sh as follows,

#!/bin/bash
declare -A CPIDs;
declare -a CMDs=("AAA" "BBB" "CCC" "DDD")
CMD_TIME=15;
for CMD in ${CMDs[*]}; do
    (echo "Started..$CMD"; sleep $CMD_TIME; echo "$CMD Done";) &
    CPIDs[$!]="$CMD";
    sleep 1;
done
GPID=$(ps -o pgid= $$ | tr -d ' ');
CNT_TIME_OUT=10;
CNT=0;
while (true); do
    declare -A TMP_CPIDs;
    for PID in "${!CPIDs[@]}"; do
        echo "Checking "${CPIDs[$PID]}"=>"$PID;
        if ps -p $PID > /dev/null ; then
            echo "-->"${CPIDs[$PID]}"=>"$PID" is running..";
            TMP_CPIDs[$PID]=${CPIDs[$PID]};
        else
            echo "-->"${CPIDs[$PID]}"=>"$PID" is completed.";
        fi
    done
    if [ ${#TMP_CPIDs[@]} == 0 ]; then
        echo "All commands completed.";
        break;
    else
        unset CPIDs;
        declare -A CPIDs;
        for PID in "${!TMP_CPIDs[@]}"; do
            CPIDs[$PID]=${TMP_CPIDs[$PID]};
        done
        unset TMP_CPIDs;
        if [ $CNT -gt $CNT_TIME_OUT ]; then
            echo "${CPIDs[@]} PIDs not responding. Timeout reached $CNT sec. Killing all children with GPID $GPID..";
            kill -- -$GPID;
        fi
    fi
    CNT=$((CNT+1));
    echo "waiting since $CNT secs..";
    sleep 1;
done
exit;

and, in another terminal, watch the processes named 'test' using the following command:

watch -n1 'ps x -o "%p %r %c" | grep "test" '

The above script will create 4 new child processes. We store the PIDs of all child processes and loop over them to check whether they have finished execution or are still running.
Each child process executes for CMD_TIME seconds, but if the CNT_TIME_OUT timeout is reached, all children are killed by the parent process.
You can switch the timings and play around with the script to see the behaviour.
One drawback of this approach is that it uses the group ID to kill the whole child tree. The parent process itself belongs to the same group, so it will also get killed.

You may need to assign a different group ID if you don't want the parent to be killed.
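One way to get children into a different process group, as suggested above, is `setsid` (util-linux): the child becomes a session leader, so its process-group ID equals its own PID, and a group kill then cannot reach the parent. A minimal sketch:

```shell
#!/bin/bash
# Sketch: start the child via setsid so it gets its own session and
# process group (PGID == its own PID). Killing that group then cannot
# touch this parent script.
setsid sleep 30 &
CPID=$!
sleep 0.5                 # give setsid a moment to start the child
kill -- "-$CPID"          # negative PID: signal the whole child group
wait "$CPID" 2>/dev/null
echo "child exit status: $?"   # 143 = 128 + SIGTERM
echo "parent survived"
```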

Following is one more example, which monitors a PHP script and kills it if the timeout is reached.

1) test.sh

#!/bin/bash
LOG='log.txt'
trap 'echo "LineNo.$LINENO" >> $LOG; exit 1;' ERR SIGINT SIGTERM;
CMD="php test.php;"
echo $CMD
eval $CMD &>> $LOG &
GPID=$(ps -o pgid= $$ | tr -d ' ');
CPID=$!
echo "PIDs: $GPID - $CPID "
CNT=0;
CNT_TIME_OUT=10;
while (true); do
    if ps -p $CPID > /dev/null ; then
        echo "$CPID is running..";
        if [ $CNT -gt $CNT_TIME_OUT ]; then
            echo "Timeout reached $CNT_TIME_OUT sec. Killing $GPID.. breaking..";
            kill -- -$GPID;
        fi
    else
        echo "$CPID is completed. Breaking..";
        break;
    fi
    CNT=$((CNT+1));
    echo "waiting since $CNT secs..";
    sleep 1;
done
exit;

2) test.php

<?php
$i = 0;
$l = 300;
while ($i < $l) {
    #throw new \Exception("Testing");
    $date = new DateTime();
    $date->add(DateInterval::createFromDateString('yesterday'));
    echo $date->format('Y-m-d H:i:s') . " => $i\n";
    sleep(1);
    $i++;
    echo "End => $i\n";
}
die;
?>

Thanks.

Posted in Shell Script.

Setup Solr using ZooKeeper ensemble on Ubuntu

(Diagram: SolrCloud cluster with a single collection on a ZooKeeper ensemble)

Setup Oracle Java:
Follow the quick steps given below to set up the latest Java version on your system:

java -version
tar -zxvf jdk-8u45-linux-x64.tar.gz
sudo mkdir -p /usr/lib/jvm/jdk1.8.0_45
sudo mv jdk1.8.0_45/* /usr/lib/jvm/jdk1.8.0_45/
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.8.0_45/bin/java" 1
sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.8.0_45/bin/javac" 1
sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/lib/jvm/jdk1.8.0_45/bin/javaws" 1
sudo update-alternatives --set java /usr/lib/jvm/jdk1.8.0_45/bin/java
sudo update-alternatives --set javac /usr/lib/jvm/jdk1.8.0_45/bin/javac
sudo update-alternatives --set javaws /usr/lib/jvm/jdk1.8.0_45/bin/javaws
java -version
vim ~/.bashrc
export JAVA_HOME="/usr/lib/jvm/jdk1.8.0_45"
export PATH="$PATH:$JAVA_HOME/bin"
source ~/.bashrc
echo $JAVA_HOME
echo $PATH

For more detail kindly visit how-to-install-oracle-jdk-7-on-ubuntu-12-04

Setup Zookeeper Ensemble: (https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble)

Consider we have 2 servers (192.168.0.101, 192.168.0.111) and we are setting up a 6-node ZooKeeper ensemble, i.e. three server instances per machine.
Follow the quick steps given below to set up the ZooKeeper ensemble on your system:

wget 'http://mirror.symnds.com/software/Apache/zookeeper/stable/zookeeper-3.4.6.tar.gz'
tar -xvf zookeeper-3.4.6.tar.gz
sudo mkdir -p /usr/lib/zookeeper-3.4.6
sudo mv zookeeper-3.4.6/* /usr/lib/zookeeper-3.4.6/
cd /usr/lib/zookeeper-3.4.6/
sudo cp conf/zoo_sample.cfg conf/zoo_1.cfg
sudo vim conf/zoo_1.cfg

Add the following configuration:

tickTime=2000
dataDir=/var/lib/zookeeper/1/
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.0.101:2888:3888
server.2=192.168.0.101:2889:3889
server.3=192.168.0.101:2890:3890
server.4=192.168.0.111:2888:3888
server.5=192.168.0.111:2889:3889
server.6=192.168.0.111:2890:3890

Repeat the above settings on all servers. Each instance gets its own config file (zoo_2.cfg, zoo_3.cfg) with its own clientPort and dataDir.

Make data directories: (repeat for each instance on all servers, using the matching instance number)

sudo mkdir -p /var/lib/zookeeper/1/
sudo sh -c 'echo "1" > /var/lib/zookeeper/1/myid'

Start servers:(Repeat for all servers)
sudo bin/zkServer.sh start zoo_1.cfg

Check Status:

bin/zkServer.sh status zoo_1.cfg
echo status | nc localhost 2181

Test Client:

bin/zkCli.sh -server 192.168.0.101:2181
[zk: 192.168.0.101:2181(CONNECTED) 1] ls /
[zk: 192.168.0.101:2181(CONNECTED) 2] get /configs/gettingstarted/solrconfig.xml
[zk: 192.168.0.101:2181(CONNECTED) 3] quit

Stop Servers:(Repeat for all servers)
sudo bin/zkServer.sh stop zoo_1.cfg

For more detail kindly visit Setup-zookeeper-ensemble-on-ubuntu

Setup Solr:

Consider we have 2 servers (192.168.0.101, 192.168.0.111) and we are setting up 2 Solr instances with 1 shard and a replication factor of 2.
wget 'http://apache.mirrors.hoobly.com/lucene/solr/5.2.0/solr-5.2.0.tgz'
SolrCloud Standalone Setup using Embedded Zookeeper (Testing environment): (http://lucene.apache.org/solr/quickstart.html)
Please follow above link for demos.

SolrCloud Mode Setup using External Zookeeper Ensemble (Testing environment): (https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference)
tar -zxvf solr-5.2.0.tgz
cd solr-5.2.0/

Create Node:

mkdir -p example/cloud/node1/solr
cp server/solr/solr.xml example/cloud/node1/solr
mkdir -p example/cloud/node2/solr
cp server/solr/solr.xml example/cloud/node2/solr
bin/solr start -cloud -s example/cloud/node1/solr -h 192.168.0.101 -p 8983 -z 192.168.0.101:2181,192.168.0.101:2182,192.168.0.101:2183,192.168.0.111:2184,192.168.0.111:2185,192.168.0.111:2186
bin/solr start -cloud -s example/cloud/node2/solr -h 192.168.0.111 -p 8983 -z 192.168.0.101:2181,192.168.0.101:2182,192.168.0.101:2183,192.168.0.111:2184,192.168.0.111:2185,192.168.0.111:2186

Create Collection:
bin/solr create -c gettingstarted -d basic_configs -rf 2

Status:

bin/solr status
bin/solr healthcheck -c gettingstarted

Delete Collection:
bin/solr delete -c gettingstarted

Restart:

bin/solr restart -cloud -s example/cloud/node1/solr -h 192.168.0.101 -p 8983 -z 192.168.0.101:2181,192.168.0.101:2182,192.168.0.101:2183,192.168.0.111:2184,192.168.0.111:2185,192.168.0.111:2186
bin/solr restart -cloud -s example/cloud/node2/solr -h 192.168.0.111 -p 8983 -z 192.168.0.101:2181,192.168.0.101:2182,192.168.0.101:2183,192.168.0.111:2184,192.168.0.111:2185,192.168.0.111:2186

Stop Node:
bin/solr stop -all;

Clean all Testing files:
rm -Rf example/cloud/

SolrCloud as a Service using External Zookeeper Ensemble (Production environment): (https://cwiki.apache.org/confluence/display/solr/Taking+Solr+to+Production)

Run Installation script:

tar xzf solr-5.2.0.tgz solr-5.2.0/bin/install_solr_service.sh --strip-components=2
sudo bash ./install_solr_service.sh -help
sudo bash ./install_solr_service.sh solr-5.2.0.tgz -i /opt -d /var/solr -u solr -s solr -p 8983
OR
sudo bash ./install_solr_service.sh solr-5.2.0.tgz
id: solr: no such user
Creating new user: solr
Adding system user `solr' (UID 109) ...
Adding new group `solr' (GID 116) ...
Adding new user `solr' (UID 109) with group `solr' ...
Creating home directory `/home/solr' ...
Extracting solr-5.2.0.tgz to /opt
Creating /etc/init.d/solr script ...
Adding system startup for /etc/init.d/solr ...
Waiting to see Solr listening on port 8983 [/]
Started Solr server on port 8983 (pid=1704). Happy searching!
Service solr installed.
sudo service solr status
sudo service solr stop

Setup SolrCloud:

To run Solr in SolrCloud mode, add the following setting to the environment-specific include file (/var/solr/solr.in.sh):
ZK_HOST="192.168.0.101:2181,192.168.0.101:2182,192.168.0.101:2183,192.168.0.111:2184,192.168.0.111:2185,192.168.0.111:2186"

If you’re using a ZooKeeper instance that is shared by other systems, it’s recommended to isolate the SolrCloud znode tree using ZooKeeper’s chroot support. For instance, to ensure all znodes created by SolrCloud are stored under /solr, you can put /solr on the end of your ZK_HOST connection string, such as:
ZK_HOST="192.168.0.101:2181,192.168.0.101:2182,192.168.0.101:2183,192.168.0.111:2184,192.168.0.111:2185,192.168.0.111:2186/solr"

If using a chroot for the first time, you need to bootstrap the Solr znode tree in ZooKeeper by using the zkcli.sh script, such as:
/opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost 192.168.0.101:2181 -cmd bootstrap -solrhome /var/solr/data

If the above script is not able to create the 'solr' chroot because there are 0 cores (as we have not created any core yet), then create it with the ZooKeeper client:

/usr/lib/zookeeper-3.4.6/bin/zkCli.sh -server 192.168.0.101:2181
[zk: 192.168.0.101:2181(CONNECTED) 2] create /solr solr

Note: Above fix is not specified anywhere in Solr Doc.
sudo service solr start

Upload a configuration directory: (https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities)
/opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost 192.168.0.101:2181/solr -cmd upconfig -confname data_driven_schema_configs -confdir /opt/solr/server/solr/configsets/data_driven_schema_configs/conf

To delete a wrong config, use the ZooKeeper client:

/usr/lib/zookeeper-3.4.6/bin/zkCli.sh -server 192.168.0.101:2181
[zk: 192.168.0.111:2184(CONNECTED) 8] rmr /configs

Create Collection:(https://cwiki.apache.org/confluence/display/solr/Collections+API)

http://host:port/solr/admin/collections?action=CREATE&name=gettingstarted&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=data_driven_schema_configs

http://host:port/solr/admin/collections?action=DELETE&name=gettingstarted

For more detail kindly visit Install-solr-on-ubuntu

Important download links:

Search Engine research:

https://www.elastic.co/products/elasticsearch

http://solr-vs-elasticsearch.com/

http://lucene.apache.org/solr/quickstart.html

https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide

http://wiki.apache.org/solr/FrontPage

https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

Requirement:

http://lucene.apache.org/solr/5_1_0/SYSTEM_REQUIREMENTS.html

Downloads:

http://mirror.reverse.net/pub/apache/lucene/solr/5.1.0/solr-5.1.0.tgz

http://apache.mirrors.hoobly.com/lucene/solr/5.2.0/solr-5.2.0.tgz

http://download.oracle.com/otn-pub/java/jdk/8u45-b14/jdk-8u45-linux-x64.tar.gz

http://archive.apache.org/dist/lucene/solr/

http://mirror.symnds.com/software/Apache/zookeeper/stable/zookeeper-3.4.6.tar.gz

Setup:

https://cwiki.apache.org/confluence/display/solr/Taking+Solr+to+Production

http://zookeeper.apache.org/doc/r3.4.6/zookeeperStarted.html

PHP Client:

http://wiki.apache.org/solr/SolPHP

https://pecl.php.net/package/solr

http://php.net/manual/en/book.solr.php

Symfony Bundle:

https://packagist.org/packages/solarium/solarium

https://packagist.org/packages/nelmio/solarium-bundle

https://packagist.org/packages/reprovinci/solr-php-client

https://packagist.org/packages/floriansemm/solr-bundle

https://packagist.org/packages/internations/solr-utils

https://packagist.org/packages/internations/solr-query-component


Thank you.

Posted in Java, Solr, Zookeeper.

Setup ZooKeeper Ensemble on Ubuntu

Setup ZooKeeper Ensemble on Ubuntu:

Download Apache ZooKeeper:

The first step in setting up Apache ZooKeeper is, of course, to download the software. It’s available from http://zookeeper.apache.org/releases.html.

#wget http://mirror.symnds.com/software/Apache/zookeeper/stable/zookeeper-3.4.6.tar.gz
#tar -xvf zookeeper-3.4.6.tar.gz

Configure the instance:
Let's create one in conf/zoo1.cfg:

#sudo mkdir -p /usr/lib/zookeeper-3.4.6
#sudo mv zookeeper-3.4.6/* /usr/lib/zookeeper-3.4.6/
#cd /usr/lib/zookeeper-3.4.6/
#cp conf/zoo_sample.cfg conf/zoo1.cfg
#vim conf/zoo1.cfg

Add the following settings:

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.0.1:2888:3888
server.2=192.168.0.2:2888:3888
server.3=192.168.0.3:2888:3888

The parameters are as follows:
tickTime: Part of what ZooKeeper does is determine which servers are up and running at any given time, and the minimum session timeout is defined as two "ticks". The tickTime parameter specifies, in milliseconds, how long each tick should be.
dataDir: This is the directory in which ZooKeeper will store data about the cluster. This directory should start out empty.
clientPort: This is the port on which clients (in our case, Solr) will access ZooKeeper.
initLimit: Amount of time, in ticks, to allow followers to connect and sync to a leader. In this case, you have 5 ticks, each of which is 2000 milliseconds long, so the server will wait as long as 10 seconds to connect and sync with the leader.
syncLimit: Amount of time, in ticks, to allow followers to sync with ZooKeeper. If followers fall too far behind a leader, they will be dropped.
server.X: These are the IDs and locations of all servers in the ensemble, along with the ports on which they communicate with each other. The server ID must additionally be stored in a myid file located in the dataDir of each ZooKeeper instance. The ID identifies each server, so for this first instance you would create the file /var/lib/zookeeper/myid with the content "1".

Once this file is in place, you’re ready to start the ZooKeeper instance.
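As a quick sanity check of the tick arithmetic described above, with the values from our zoo1.cfg:

```shell
# Each tick is tickTime milliseconds; initLimit and syncLimit are
# expressed in ticks, so the wall-clock limits are:
tickTime=2000
initLimit=5
syncLimit=2
echo "initLimit = $(( initLimit * tickTime / 1000 )) seconds"
echo "syncLimit = $(( syncLimit * tickTime / 1000 )) seconds"
```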

Then create the /var/lib/zookeeper directory and a myid file in it, so each node can identify itself:

#sudo mkdir -p /var/lib/zookeeper
#echo "1" > /var/lib/zookeeper/myid

Where "1" is the node number (so put "2" for the next node, and so on).
Do the same on all nodes.

Standalone Setup:

You can also set up multiple instances on localhost. You just need to create a separate data directory per instance (for storing the id and data) and make each instance listen on a different port:

clientPort=2181
clientPort=2182
clientPort=2183

dataDir=/var/lib/zookeeper/1/
dataDir=/var/lib/zookeeper/2/
dataDir=/var/lib/zookeeper/3/

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

#echo "1" > /var/lib/zookeeper/1/myid
#echo "2" > /var/lib/zookeeper/2/myid
#echo "3" > /var/lib/zookeeper/3/myid
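The per-instance directories and myid files above can be created in one loop. This is a sketch using a throwaway base directory so it needs no sudo; ZK_BASE is an assumption for illustration, while the post itself uses /var/lib/zookeeper:

```shell
#!/bin/sh
# Sketch: create a dataDir and myid file for each of the three local
# instances. ZK_BASE is hypothetical; substitute /var/lib/zookeeper
# (with sudo) for a real setup.
ZK_BASE="${ZK_BASE:-/tmp/zookeeper-demo}"
for i in 1 2 3; do
    mkdir -p "$ZK_BASE/$i"
    echo "$i" > "$ZK_BASE/$i/myid"
done
cat "$ZK_BASE/1/myid" "$ZK_BASE/2/myid" "$ZK_BASE/3/myid"
```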

Once you have each node set up, you can start ZooKeeper by issuing on each node:

bin/zkServer.sh start zoo1.cfg
bin/zkServer.sh start zoo2.cfg
bin/zkServer.sh start zoo3.cfg

Check that the servers are running:
$bin/zkServer.sh status zoo1.cfg
$bin/zkServer.sh status zoo2.cfg
$bin/zkServer.sh status zoo3.cfg

$echo status | nc localhost 2181
$echo status | nc localhost 2182
$echo status | nc localhost 2183

Connect client,

bin/zkCli.sh -server localhost:2181
[zk: localhost:2181(CONNECTED) 1] ls /
[zk: localhost:2181(CONNECTED) 2] ls /configs/
[zk: localhost:2181(CONNECTED) 3] ls /collections/
[zk: localhost:2181(CONNECTED) 4] get /configs/gettingstarted/solrconfig.xml
[zk: localhost:2181(CONNECTED) 5] quit

Stop them,

bin/zkServer.sh stop zoo1.cfg
bin/zkServer.sh stop zoo2.cfg
bin/zkServer.sh stop zoo3.cfg

Thank you.

Posted in Zookeeper.