A brief description of Apache Kafka

Posted By : Lokesh Babu Sharma | 30-Aug-2019

Introduction

Apache Kafka is a publish/subscribe (pub/sub) messaging system and stream-processing software platform. It was originally developed at LinkedIn, open-sourced in early 2011, and donated to the Apache Software Foundation. Kafka is written in Java and Scala. The project aims to provide a durable, fast, scalable, and fault-tolerant publish-subscribe messaging system for handling real-time data feeds. Its storage layer is essentially a massively scalable publish/subscribe message queue, which makes it highly valuable for enterprise infrastructures that process streaming data. Kafka can also connect to external systems to import and export data via Kafka Connect, and it provides Kafka Streams, a Java stream-processing library.
Generally, Kafka is used for two broad classes of applications: building real-time streaming data pipelines that easily transfer data between systems or applications, and building real-time streaming applications that transform or react to streams of data.

Basic concepts of Apache Kafka:

1. Kafka runs as a cluster on one or more servers that can span multiple datacenters.
2. The Kafka cluster stores streams of records in categories called topics.
3. Each record consists of a timestamp, a key, and a value.
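
To make the three points above concrete, here is a minimal in-memory sketch in Python. It is not the Kafka client API: the `Record` class, the `topics` dictionary, and the topic names are hypothetical and exist only to show records (key, value, timestamp) being grouped by topic.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Toy stand-in for a Kafka record: a key, a value, and a timestamp."""
    key: Optional[str]
    value: str
    # Timestamp in milliseconds, filled in when the record is created.
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


# Hypothetical in-memory "cluster": topic name -> list of records.
topics = defaultdict(list)

topics["page-views"].append(Record(key="user-1", value="/home"))
topics["page-views"].append(Record(key="user-2", value="/pricing"))
topics["orders"].append(Record(key="user-1", value="order#17"))

for name, records in topics.items():
    print(name, [(r.key, r.value) for r in records])
```

The record is frozen (immutable) on purpose, since a record is not modified after it has been written.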

Apache Kafka Architecture

Kafka has four core APIs:
1. Producer API - Publishes streams of records to one or more Kafka topics.
2. Consumer API - Subscribes to one or more topics and processes the streams of records.
3. Streams API - Lets an application act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more topics.
4. Connector API - Builds and runs reusable producers and consumers that connect Kafka topics to existing applications or data systems.
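
The stream-processor idea from the list above can be sketched in a few lines of plain Python. This only illustrates the consume, transform, produce pattern and does not use Kafka Streams itself; `stream_processor`, `input_topic`, and `output_topic` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, Tuple

KeyValue = Tuple[str, str]


def stream_processor(
    input_stream: Iterable[KeyValue],
    transform: Callable[[str], str],
) -> Iterator[KeyValue]:
    """Consume records from an input stream and yield transformed records."""
    for key, value in input_stream:
        yield key, transform(value)


# Records a producer might have published to a hypothetical input topic.
input_topic = [("user-1", "hello"), ("user-2", "kafka")]

# The "output topic" is the result of running the processor over the input.
output_topic = list(stream_processor(input_topic, str.upper))
print(output_topic)  # [('user-1', 'HELLO'), ('user-2', 'KAFKA')]
```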

A Kafka cluster has the following components:

1. Kafka Broker - A Kafka cluster consists of one or more servers (Kafka brokers), which share the load. Brokers are stateless with respect to consumers, so they use ZooKeeper to maintain cluster state. A single Kafka broker instance can handle thousands of reads and writes per second, and each broker can handle terabytes of messages without a performance impact. Kafka topics are divided into partitions, and partitions can be placed on the same machine or on separate machines, which allows multiple consumers to read from a topic in parallel.
2. ZooKeeper - Apache ZooKeeper is a top-level Apache project that acts as a centralized service for maintaining configuration data and naming, and for providing robust and flexible synchronization within distributed systems. ZooKeeper keeps track of the status of the Kafka cluster nodes as well as Kafka topics, partitions, and so on. At the heart of ZooKeeper is the ZooKeeper Atomic Broadcast (ZAB) protocol.
3. Kafka Producer - Producers send data to brokers. When a new broker starts, producers discover it automatically and begin sending messages to it. Whether a producer waits for acknowledgments from the broker depends on its acks setting: with acks=0 it does not wait and sends messages as fast as the broker can handle, while with acks=1 or acks=all it waits for the write to be confirmed.
4. Kafka Consumers - Because the broker is stateless, consumers themselves keep track of how many messages they have consumed, using the partition offset. The consumer issues an asynchronous pull request to the broker to have a buffer of bytes ready to consume. When a consumer acknowledges a particular message offset, it implies that it has consumed all prior messages.
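
The pull model and consumer-side offset tracking described above can be illustrated with a small sketch. It is a toy model, not the Kafka consumer API: the `Broker` and `Consumer` classes and their methods are hypothetical, and a plain Python list stands in for the log.

```python
class Broker:
    """Toy broker: keeps an append-only log and no per-consumer state."""

    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)

    def fetch(self, offset, max_messages):
        """Return up to max_messages starting at the requested offset."""
        return self.log[offset:offset + max_messages]


class Consumer:
    """Toy consumer: remembers its own position (next offset to read)."""

    def __init__(self, broker):
        self.broker = broker
        self.position = 0

    def poll(self, max_messages=2):
        # The consumer pulls from the broker; the broker never pushes.
        batch = self.broker.fetch(self.position, max_messages)
        self.position += len(batch)
        return batch


broker = Broker()
for i in range(5):
    broker.append(f"message-{i}")

consumer = Consumer(broker)
first = consumer.poll()
second = consumer.poll()
print(first, second, consumer.position)
```

Because the position lives in the consumer, a second consumer created against the same broker would start again from offset 0 and read the same messages independently.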

Fundamental concepts:

1. Kafka Topics - A topic defines a stream of data of a particular type or classification. Producers publish data to topics, and consumers read that data topic by topic. Topic names must be unique within a Kafka cluster. There is no hard-coded limit on the number of topics. Data cannot be changed or updated once it has been published.
2. Partitions in Kafka - Topics are split into partitions, which are replicated across brokers. Each message in a partition is assigned an incremental ID called an offset, and messages are stored in sequence within the partition. An offset is meaningful only within its own partition.
3. Topic Replication Factor - Each topic keeps replicas of its partitions on other brokers. If a broker goes down, the replicas on another broker can take over and serve the data.
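
The following Python sketch illustrates partitions, offsets, and the replication factor together. It makes several simplifying assumptions: the `Topic` class and `assign_replicas` function are hypothetical, CRC32 is used as a stand-in hash for choosing a partition from the key, and replicas are placed round-robin, which is only one possible placement.

```python
import zlib


class Topic:
    """Toy topic: a fixed number of partitions, each an append-only list."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 stands in for the partitioner's hash function.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append to the key's partition; return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


def assign_replicas(num_partitions, brokers, replication_factor):
    """Round-robin placement of each partition on replication_factor brokers."""
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


topic = Topic("orders", num_partitions=3)
p1, o1 = topic.append("user-1", "created")
p2, o2 = topic.append("user-1", "paid")
print(p1 == p2, o1, o2)  # same key -> same partition, offsets 0 and 1

replicas = assign_replicas(3, ["broker-0", "broker-1", "broker-2"], 2)
# Simulate broker-1 going down and see which replicas are left.
surviving = {p: [b for b in bs if b != "broker-1"] for p, bs in replicas.items()}
print(surviving)
```

With a replication factor of 2, every partition in this sketch still has at least one replica after a single broker is lost.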

Set up Apache Kafka:

1. First, install the JDK using the following commands:

sudo apt-get update
sudo apt-get install default-jdk

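2. Download and set up Kafka using the following commands (if the mirror no longer hosts this release, older versions are kept in the Apache archive):

wget http://www-us.apache.org/dist/kafka/2.2.1/kafka_2.12-2.2.1.tgz

Extract the archive file:

tar xzf kafka_2.12-2.2.1.tgz

Move the extracted directory (writing to /usr/local usually requires root privileges):

sudo mv kafka_2.12-2.2.1 /usr/local/kafka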

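3. Change to the Kafka installation directory:

cd /usr/local/kafka

4. Start ZooKeeper first. The script runs in the foreground, so use a separate terminal for the next step:

bin/zookeeper-server-start.sh config/zookeeper.properties

5. Then start the Kafka broker: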

bin/kafka-server-start.sh config/server.properties
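
Once both processes are running, a quick way to check that they are accepting connections is to probe their TCP ports. The sketch below assumes the default ports from the shipped configuration files (2181 for ZooKeeper, 9092 for the Kafka broker) and that both run on localhost; `is_listening` is a hypothetical helper, not part of Kafka.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Assumed defaults: ZooKeeper on 2181, Kafka broker on 9092.
for name, port in (("ZooKeeper", 2181), ("Kafka broker", 9092)):
    state = "up" if is_listening("localhost", port) else "down"
    print(f"{name} (port {port}): {state}")
```

An open port only shows that the process is listening; it does not prove that the broker is fully healthy.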

Conclusion

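Apache Kafka is a widely used pub/sub messaging system that combines durability, scalability, and fault tolerance. As the setup steps above show, it is also quite simple to get running as a publish-subscribe system.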

About Author

Lokesh Babu Sharma

Lokesh is part of the backend team. He believes in smart work and is good at programming. He loves to play with code and has expertise in Java.