How to Install Zookeeper and Kafka Cluster

Larry Deng
Apr 17, 2023

This article explains how to install a ZooKeeper and Kafka cluster. For convenience, everything runs on a single server, but the steps are the same when the nodes are spread across multiple servers.


Key Concepts of ZooKeeper

ZooKeeper is a distributed coordination service that provides a hierarchical key-value store used to maintain configuration information, provide distributed synchronization, and offer group services. The key concepts of ZooKeeper are:

  1. Nodes: ZooKeeper stores data in a hierarchical namespace similar to a file system. Each node in the namespace is called a “znode” and can store a small amount of data, typically less than 1 MB.
  2. Watches: Clients can set watches on znodes to receive notifications when the znode changes. Watches are one-time triggers that are fired when the data associated with a znode changes or when a znode is deleted.
  3. Quorums: ZooKeeper is designed to operate in a replicated mode, which provides high availability and fault tolerance. ZooKeeper uses a consensus protocol called ZAB (ZooKeeper Atomic Broadcast) to maintain consistency across all the nodes in the cluster. To achieve this, a quorum of nodes must agree on any changes to the data stored in ZooKeeper.
  4. Sessions: When a client connects to ZooKeeper, it creates a session. The session is used to maintain the connection between the client and the server and can be used to associate watches with znodes. ZooKeeper sessions have timeouts, and clients must periodically renew their sessions to prevent them from expiring.
  5. ACLs: ZooKeeper provides access control lists (ACLs) to control access to znodes. ACLs can be used to restrict access to certain znodes or to certain operations on znodes.
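The quorum rule above is simple majority arithmetic: an ensemble of N servers needs floor(N/2)+1 of them up to make progress, so it tolerates the remaining N minus that many failures. A quick sketch of the numbers (plain shell, nothing ZooKeeper-specific):

```shell
# Majority quorum size for an ensemble of n servers, and the number
# of server failures that ensemble can survive while staying available.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerates() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5; do
  echo "ensemble=$n quorum=$(quorum "$n") tolerates=$(tolerates "$n")"
done
```

This is why production ensembles use an odd number of servers: growing from 3 to 4 raises the quorum from 2 to 3 yet still tolerates only 1 failure.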

Overall, ZooKeeper provides a simple and reliable way to coordinate distributed systems by providing a shared and consistent view of configuration information and synchronization primitives. It is a critical component in many distributed systems and is widely used in production environments.

Key Concepts of Kafka

Kafka is a distributed streaming platform that is used for building real-time data pipelines and streaming applications. The key concepts of Kafka are:

  1. Topics: A topic is a category or feed name to which messages are published. A topic is divided into partitions, which allows for scalability and parallelism.
  2. Partitions: A partition is an ordered sequence of messages in a topic. Each partition is stored as its own directory of segment files on a broker, and messages within a partition are ordered by their offset.
  3. Brokers: A broker is a Kafka server that receives messages from producers, stores them on disk, and serves them to consumers. A Kafka cluster consists of one or more brokers.
  4. Producers: Producers are processes that write messages to Kafka topics. They can specify which partition they want to write to, or they can rely on the default partitioner to select a partition.
  5. Consumers: Consumers are processes that read messages from Kafka topics. They can read from one or more partitions, and can maintain their own offset in each partition they consume from.
  6. Consumer Groups: Consumer groups are sets of consumers that work together to consume a topic. Each message is consumed by only one consumer in a consumer group, which allows for parallel consumption.
  7. Offsets: An offset is a unique identifier for each message within a partition. Consumers can keep track of the last message they read by storing the offset of the last message they consumed.
  8. Replication: Kafka provides replication of partitions for fault tolerance. Each partition can have multiple replicas, and each replica is stored on a different broker. This allows for high availability and durability of data.
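The interplay of keys, partitions, and ordering comes down to the mapping hash(key) mod partition-count. Kafka's default partitioner uses murmur2 on the key bytes; the cksum-based sketch below is not Kafka's algorithm, it only demonstrates the property that matters: the same key always lands in the same partition, so per-key ordering is preserved.

```shell
PARTITIONS=3

# Toy key -> partition mapping: hash(key) % PARTITIONS.
# Kafka's real default partitioner uses murmur2 on the key bytes;
# cksum here only shows that equal keys map to equal partitions.
partition_for() {
  h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $(( h % PARTITIONS ))
}

echo "user-42 -> partition $(partition_for user-42)"
echo "user-42 -> partition $(partition_for user-42)"  # same key, same partition
echo "user-7  -> partition $(partition_for user-7)"
```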

Overall, Kafka provides a scalable, fault-tolerant, and distributed messaging system that can handle large volumes of data in real-time. Its key features include topics, partitions, brokers, producers, consumers, consumer groups, offsets, and replication.

Install ZooKeeper Cluster

Download ZooKeeper

Download the ZooKeeper installation package from the Apache archive:

curl -o apache-zookeeper-3.7.1-bin.tar.gz https://archive.apache.org/dist/zookeeper/zookeeper-3.7.1/apache-zookeeper-3.7.1-bin.tar.gz

Extract the installation package:

tar xvf apache-zookeeper-3.7.1-bin.tar.gz

Create Configuration

Create the folder zk1 with a data subdirectory, and add a config like the following to zk1/zk.config. The client ports (2181–2183) match the status output later in this article; the quorum ports (2888:3888 and so on) must differ per node because all three nodes share one host:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=./zk1/data
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Repeat for zk2 and zk3, changing only the data directory and client port (the server.* lines stay identical). zk2/zk.config:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=./zk2/data
clientPort=2182
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

zk3/zk.config is the same with dataDir=./zk3/data and clientPort=2183.

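Each node also needs a myid file in its data directory whose number matches its server.N line in the config. A short sketch, assuming the data directories live at ./zk1/data, ./zk2/data, and ./zk3/data (this article's single-host layout):

```shell
# Create each node's data directory and write its myid file.
# The number in myid must match the server.N entry in zk.config.
for i in 1 2 3; do
  mkdir -p "./zk$i/data"
  echo "$i" > "./zk$i/data/myid"
done
```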
Start the servers

Start the 3 servers:

$ ./apache-zookeeper-3.7.1-bin/bin/zkServer.sh start ./zk1/zk.config
ZooKeeper JMX enabled by default
Using config: ./zk1/zk.config
Starting zookeeper ... STARTED

$ ./apache-zookeeper-3.7.1-bin/bin/zkServer.sh start ./zk2/zk.config
ZooKeeper JMX enabled by default
Using config: ./zk2/zk.config
Starting zookeeper ... STARTED

$ ./apache-zookeeper-3.7.1-bin/bin/zkServer.sh start ./zk3/zk.config
ZooKeeper JMX enabled by default
Using config: ./zk3/zk.config
Starting zookeeper ... STARTED

Health check

Check the status:

$ ./apache-zookeeper-3.7.1-bin/bin/zkServer.sh status ./zk1/zk.config
ZooKeeper JMX enabled by default
Using config: ./zk1/zk.config
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower

$ ./apache-zookeeper-3.7.1-bin/bin/zkServer.sh status ./zk2/zk.config
ZooKeeper JMX enabled by default
Using config: ./zk2/zk.config
Client port found: 2182. Client address: localhost. Client SSL: false.
Mode: leader

$ ./apache-zookeeper-3.7.1-bin/bin/zkServer.sh status ./zk3/zk.config
ZooKeeper JMX enabled by default
Using config: ./zk3/zk.config
Client port found: 2183. Client address: localhost. Client SSL: false.
Mode: follower

Connect to one server and create data:

$ ./apache-zookeeper-3.7.1-bin/bin/zkCli.sh -server localhost:2181

[zk: localhost:2181(CONNECTED) 0] create /pkslow
Created /pkslow
[zk: localhost:2181(CONNECTED) 1] create /pkslow/website
Created /pkslow/website

Connect to another server to check that the data has been replicated:

$ ./apache-zookeeper-3.7.1-bin/bin/zkCli.sh -server localhost:2182

[zk: localhost:2182(CONNECTED) 1] get /pkslow/website

Install Kafka Cluster

Download Kafka

Download the package from the Apache archive:

curl -o kafka_2.13-3.4.0.tgz https://archive.apache.org/dist/kafka/3.4.0/kafka_2.13-3.4.0.tgz

Extract the package:

tar -xzf kafka_2.13-3.4.0.tgz


Create Configuration

Configuration for broker1, in kafka1/server.properties (the listener ports 9091–9093 match the commands later in this article; the exact paths are up to you):

broker.id=1
listeners=PLAINTEXT://localhost:9091
log.dirs=./kafka1/logs
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183

Configuration for broker2, in kafka2/server.properties:

broker.id=2
listeners=PLAINTEXT://localhost:9092
log.dirs=./kafka2/logs
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183

Configuration for broker3, in kafka3/server.properties:

broker.id=3
listeners=PLAINTEXT://localhost:9093
log.dirs=./kafka3/logs
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
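The three broker configs differ only in broker.id, the listener port, and log.dirs, so they can be generated in one loop. A sketch assuming a kafkaN/server.properties layout and ports 9091–9093:

```shell
# Write a minimal server.properties for each of the three brokers.
# Only broker.id, the listener port and log.dirs vary between them.
for i in 1 2 3; do
  mkdir -p "./kafka$i"
  cat > "./kafka$i/server.properties" <<EOF
broker.id=$i
listeners=PLAINTEXT://localhost:909$i
log.dirs=./kafka$i/logs
zookeeper.connect=localhost:2181,localhost:2182,localhost:2183
EOF
done
```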

Start the brokers

Start the Kafka servers (each runs in the foreground, so either use one terminal per broker or add the -daemon flag):

./kafka_2.13-3.4.0/bin/kafka-server-start.sh ./kafka1/server.properties
./kafka_2.13-3.4.0/bin/kafka-server-start.sh ./kafka2/server.properties
./kafka_2.13-3.4.0/bin/kafka-server-start.sh ./kafka3/server.properties

Check and Test

Create topic:

$ kafka_2.13-3.4.0/bin/kafka-topics.sh --create --topic pkslow-topic --bootstrap-server localhost:9091,localhost:9092,localhost:9093 --partitions 3 --replication-factor 3
Created topic pkslow-topic.

List topic:

$ kafka_2.13-3.4.0/bin/kafka-topics.sh --list --bootstrap-server localhost:9091,localhost:9092,localhost:9093

Describe the topic:

$ kafka_2.13-3.4.0/bin/kafka-topics.sh --describe --topic pkslow-topic --bootstrap-server localhost:9091,localhost:9092,localhost:9093
Topic: pkslow-topic TopicId: 7CLy7iZeRvm8rCrn8Dw_mA PartitionCount: 3 ReplicationFactor: 3 Configs:
Topic: pkslow-topic Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: pkslow-topic Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: pkslow-topic Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1

Producer sends message to brokers:

$ kafka_2.13-3.4.0/bin/kafka-console-producer.sh --broker-list localhost:9091,localhost:9092,localhost:9093 --topic pkslow-topic
>My name is Larry Deng.
>My website is

Consumer receives message from brokers:

$ kafka_2.13-3.4.0/bin/kafka-console-consumer.sh --bootstrap-server localhost:9091,localhost:9092,localhost:9093 --topic pkslow-topic --from-beginning
My name is Larry Deng.
My website is


Please check the complete configuration on GitHub in pkslow-samples.