What is Apache ZooKeeper?

An introduction to Apache ZooKeeper, the key concepts behind it, and its use cases

Bhanuka Dissanayake
Level Up Coding

Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination.

After a long time, I am adding a post to my blog, this time about my recent experience with distributed computing. This was the first time I worked with distributed systems, and I used Apache ZooKeeper for some interesting tasks over the last month. So I did some digging on ZooKeeper and thought that documenting my findings would benefit me as well as others who are getting started with it.

Introduction

First, let's briefly look at what ZooKeeper is. ZooKeeper is a service for coordinating and managing a large set of hosts in a distributed environment, and it does this with a simple architecture and API. To understand ZooKeeper's role properly, it helps to have some idea of what a distributed application is.

What is a Distributed Application?

A distributed application is an application that runs on multiple systems in a network, which coordinate among themselves to complete a task quickly and efficiently. The group of systems on which a distributed application runs is called a cluster, and each machine in the cluster is called a node.

A distributed application has two parts: servers and clients. Server applications expose a common interface, so that a client can connect to any server in the cluster and get the same result. Client applications are the tools used to interact with the distributed application.

Distributed systems are reliable and scalable, and they hide their internal complexity by acting as a single entity (transparency). They also bring some challenges. Race conditions must be handled, where two or more machines try to perform a task that should be done by only one machine at any given time. Deadlocks are also possible, where two or more operations wait indefinitely for each other to complete. Inconsistency must be handled too, where a partial failure leaves the data in an inconsistent state.

ZooKeeper’s Role

This is where Apache ZooKeeper comes in: it is a service that the nodes of a cluster use to coordinate among themselves and to maintain shared data with robust synchronization techniques.

Architecture

Next, let's discuss ZooKeeper's client-server architecture.

Clients are the nodes in the distributed application cluster that access information from the server.

Each client sends a message to the server at regular intervals to let the server know that the client is alive, and the server sends an acknowledgement when a client connects. If the connected server does not respond, the client automatically redirects the message to another server.

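An ensemble is a group of ZooKeeper servers. A replicated ensemble needs a minimum of three servers, and an odd number is recommended, because the service stays available only while a majority of the servers are up. Each server is a node in the ensemble and provides services to clients.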
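A ZooKeeper ensemble keeps working as long as a majority of its servers are up, which is why ensemble sizes are usually odd. Here is a small sketch of that arithmetic in Python. The function names quorum_size and tolerated_failures are my own for this illustration and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for size in (1, 2, 3, 4, 5):
        print(size, quorum_size(size), tolerated_failures(size))
```

With 3 servers the ensemble survives 1 failure, and with 4 servers it still survives only 1, so the extra even server adds no fault tolerance.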

The leader is elected when the service starts, and it performs automatic recovery if any of the connected nodes fails. Followers follow the leader's instructions; in particular, they forward client write requests to the leader.

The ZooKeeper data model

ZooKeeper keeps its data in memory as a hierarchical namespace, much like a file system. A node in this namespace is called a znode. Every znode is identified by a path whose elements are separated by “/”, so the namespace looks similar to a Unix filesystem. Every znode also maintains a stat structure, which holds metadata such as version numbers and timestamps.
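To get a feel for the data model, here is a toy in-memory version in Python. It is only an illustration and not the ZooKeeper API: the class names ZnodeTree and Stat, and the handful of fields I keep in the stat, are simplifications chosen for this sketch.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Stat:
    """Simplified per-znode metadata (the real stat has more fields)."""
    version: int = 0                                  # data changes so far
    ctime: float = field(default_factory=time.time)   # creation time
    mtime: float = field(default_factory=time.time)   # last modification time
    data_length: int = 0


class ZnodeTree:
    """Toy hierarchical namespace keyed by absolute path."""

    def __init__(self):
        self._nodes = {"/": (b"", Stat())}

    def create(self, path, data=b""):
        if not path.startswith("/") or path == "/" or path.endswith("/"):
            raise ValueError(f"invalid path: {path!r}")
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        self._nodes[path] = (data, Stat(data_length=len(data)))
        return path

    def get(self, path):
        """Return (data, stat) for a znode."""
        return self._nodes[path]

    def set(self, path, data):
        """Replace the data and bump the version in the stat."""
        _, stat = self._nodes[path]
        stat.version += 1
        stat.mtime = time.time()
        stat.data_length = len(data)
        self._nodes[path] = (data, stat)
        return stat

    def children(self, path):
        """Sorted names of the direct children of a znode."""
        if path not in self._nodes:
            raise KeyError(f"{path} does not exist")
        prefix = path.rstrip("/") + "/"
        names = []
        for candidate in self._nodes:
            if candidate != "/" and candidate.startswith(prefix):
                rest = candidate[len(prefix):]
                if "/" not in rest:
                    names.append(rest)
        return sorted(names)
```

In this sketch, creating /app and then /app/config makes config show up as a child of /app, and every call to set increases the version stored in the stat.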

Types of Znodes

Persistent znode — Stays alive even after the client that created it disconnects. All znodes are persistent unless specified otherwise.

Ephemeral znode — Stays alive only as long as the client that created it is alive. When that client disconnects from the ensemble and its session ends, the znode is deleted automatically. For this reason, ephemeral znodes are not allowed to have children. If an ephemeral znode is deleted, the next suitable node fills its position, which is why ephemeral znodes are important in leader election.

Sequential znode — Can be either persistent or ephemeral. When a sequential znode is created, ZooKeeper sets its path by appending a 10-digit sequence number to the original name. If two sequential znodes are created concurrently, ZooKeeper never gives them the same number. Sequential znodes are important in locking and synchronization.
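The 10-digit suffix is easier to picture with a small sketch. The toy class below imitates how a sequence number is appended to the requested name, using one counter per parent znode. SequenceNamer and create_sequential are names I made up for this illustration; a real client would simply ask ZooKeeper to create a sequential znode and get the final path back.

```python
from collections import defaultdict


class SequenceNamer:
    """Toy model of sequential znode naming (not the ZooKeeper API)."""

    def __init__(self):
        # parent path -> next sequence number to hand out under that parent
        self._next = defaultdict(int)

    def create_sequential(self, path_prefix):
        """Return path_prefix with a zero-padded 10-digit counter appended."""
        parent = path_prefix.rsplit("/", 1)[0] or "/"
        number = self._next[parent]
        self._next[parent] += 1
        return f"{path_prefix}{number:010d}"


if __name__ == "__main__":
    namer = SequenceNamer()
    print(namer.create_sequential("/locks/lock-"))
    print(namer.create_sequential("/locks/lock-"))
```

Because the counter only ever increases, two creations under the same parent can never end up with the same suffix, and sorting the names gives the creation order.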

Sessions

When a client connects to a server, ZooKeeper creates a session and assigns a session ID to the client. The client sends PING requests at regular intervals to keep the session alive. If the ZooKeeper service receives no PING from the client for longer than the session timeout, which is set when the session is established, the session ends, and the ephemeral znodes created during that session are deleted.
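Here is a minimal sketch of that session bookkeeping, written as a toy in Python. SessionTracker and its methods are hypothetical names for this example, and the current time is passed in explicitly (the now argument) so that the behaviour is easy to follow; a real server would of course use its own clock.

```python
class SessionTracker:
    """Toy session table: expires silent sessions and their ephemeral znodes."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._last_ping = {}    # session id -> time of the last PING
        self._ephemerals = {}   # session id -> set of ephemeral znode paths
        self._next_id = 1

    def open(self, now):
        """Start a session and return its id."""
        session_id = self._next_id
        self._next_id += 1
        self._last_ping[session_id] = now
        self._ephemerals[session_id] = set()
        return session_id

    def ping(self, session_id, now):
        """Record a PING; fails if the session has already expired."""
        if session_id not in self._last_ping:
            raise KeyError(f"session {session_id} has expired")
        self._last_ping[session_id] = now

    def create_ephemeral(self, session_id, path):
        """Remember that this ephemeral znode belongs to the session."""
        self._ephemerals[session_id].add(path)

    def expire(self, now):
        """End sessions that were silent too long; return deleted znode paths."""
        deleted = []
        for session_id, last in list(self._last_ping.items()):
            if now - last > self.timeout:
                deleted.extend(sorted(self._ephemerals.pop(session_id)))
                del self._last_ping[session_id]
        return deleted
```

A session that keeps pinging survives each call to expire, while a silent one is dropped together with the ephemeral znodes it owned.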

Watches

Watches are a simple mechanism that notifies clients about changes in the ensemble. A client can set a watch when it reads a znode. The watch fires when the znode changes, meaning that either the data associated with the znode is modified or its children change. A watch is triggered only once: a client that wants to be notified again must set a new watch with another read operation. When a session expires, the client is disconnected and its watches are removed.
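The one-time nature of watches is the part that surprised me most, so here is a toy sketch of it in Python. WatchedStore is a made-up class for illustration and not the ZooKeeper client API; it only models watches on data changes, not on children.

```python
from collections import defaultdict


class WatchedStore:
    """Toy key-value store with one-shot watches set during reads."""

    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)   # path -> callbacks waiting to fire

    def get(self, path, watch=None):
        """Read a value and optionally register a one-shot watch callback."""
        if watch is not None:
            self._watchers[path].append(watch)
        return self._data.get(path)

    def set(self, path, value):
        """Write a value, then fire and discard the watches on that path."""
        self._data[path] = value
        callbacks = self._watchers.pop(path, [])
        for callback in callbacks:
            callback(path)


if __name__ == "__main__":
    store = WatchedStore()
    events = []
    store.set("/config", "v1")
    store.get("/config", watch=events.append)
    store.set("/config", "v2")   # fires the watch
    store.set("/config", "v3")   # no watch left, nothing fires
    print(events)
```

After the first change the watch is gone, so the second change goes unnoticed unless the client reads again and passes a new watch.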

Zookeeper — Workflow

Once a ZooKeeper ensemble starts, it waits for clients to connect. A client connects to one of the nodes in the ensemble, which may be the leader or a follower. Once the client is connected, the node assigns it a session ID and sends an acknowledgement. If the client does not get an acknowledgement, it tries to connect to another node in the ensemble. Once connected, the client can read and write data as needed, and it PINGs the node at regular intervals to make sure that the connection is not lost.
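The connection step of this workflow can be sketched roughly as follows. This is a toy model under my own assumptions: ToyServer, its up flag, and the connect function are hypothetical, and the real client library handles server selection and reconnection for you.

```python
import itertools


class ToyServer:
    """Stand-in for one ensemble member; `up` simulates reachability."""

    # Shared counter so that session ids are unique across all toy servers.
    _session_ids = itertools.count(1)

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def accept(self):
        """Return a new session id as the acknowledgement, or None if down."""
        if not self.up:
            return None
        return next(ToyServer._session_ids)


def connect(servers):
    """Try each server in turn until one acknowledges the connection."""
    for server in servers:
        session_id = server.accept()
        if session_id is not None:
            return server.name, session_id
    raise ConnectionError("no server in the ensemble acknowledged the connection")
```

If the first server in the list is down, connect simply moves on and returns the name of the first server that answers, together with the session ID it handed out.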

Use cases

ZooKeeper provides the following common services:

  • Naming service − Identifying the nodes in a cluster by name.
  • Configuration management − Latest and up-to-date configuration information of the system for a joining node.
  • Cluster management − Joining / leaving of a node in a cluster and node status in real-time.
  • Leader election − Electing a node as a leader for coordination purposes.
  • Locking and synchronization service − Locking the data while modifying it. This helps with automatic failure recovery when connecting to other distributed applications.
  • Highly reliable data registry − Availability of data even when one or a few nodes are down.
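To show how the znode types come together in leader election, here is a sketch of one common recipe: every candidate creates an ephemeral sequential znode, and the candidate whose znode has the lowest sequence number is the leader. The code below only models the selection step on a plain list of names; the helper names and the n_ prefix are my own choices for this example.

```python
def sequence_number(znode_name):
    """Extract the 10-digit sequence suffix of a sequential znode name."""
    return int(znode_name[-10:])


def elect_leader(candidates):
    """Pick the candidate znode with the lowest sequence number."""
    if not candidates:
        raise ValueError("no candidates to elect from")
    return min(candidates, key=sequence_number)


if __name__ == "__main__":
    candidates = ["n_0000000002", "n_0000000000", "n_0000000001"]
    leader = elect_leader(candidates)
    print(leader)
    # The leader's session ends, so its ephemeral znode disappears.
    candidates.remove(leader)
    print(elect_leader(candidates))
```

Because the znodes are ephemeral, a crashed leader's znode is removed when its session ends, and running the same selection on the remaining znodes picks the next leader.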

I hope you now have an idea of Apache ZooKeeper: its role in distributed applications, its architecture, and its use cases. Thank you for reading!

References

  1. Apache Zookeeper documentation
  2. Zookeeper Tutorial — Tutorialspoint

Software Engineer | Computer Science & Engineering — University of Moratuwa