Introduction to Kafka

Author: gb_QA_log | Published: 2018-04-11 17:25

1 Introduction to message queues

Reference: https://agiledon.github.io/blog/2012/12/27/distributed-architecture-based-on-message/

2 Kafka intro

Kafka as a Messaging System

Combining both models

How does Kafka's notion of streams compare to a traditional enterprise messaging system?
Messaging traditionally has two models: queuing and publish-subscribe.

In a queue, a pool of consumers may read from a server and each record goes to one of them; in publish-subscribe the record is broadcast to all consumers.

Each of these two models has a strength and a weakness.

The strength of queuing is that it allows you to divide up the processing of data over multiple consumer instances, which lets you scale your processing. Unfortunately, queues aren't multi-subscriber—once one process reads the data it's gone.

Publish-subscribe allows you to broadcast data to multiple processes, but has no way of scaling processing since every message goes to every subscriber.

The consumer group concept in Kafka generalizes these two concepts. As with a queue the consumer group allows you to divide up processing over a collection of processes (the members of the consumer group). As with publish-subscribe, Kafka allows you to broadcast messages to multiple consumer groups.

The advantage of Kafka's model is that every topic has both these properties—it can scale processing and is also multi-subscriber—there is no need to choose one or the other.
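The way consumer groups combine the two models can be illustrated with a small in-memory simulation. This is only a sketch: the deliver function, the group names, and the round-robin choice of member are assumptions made for illustration, and real Kafka routes records through partition assignment rather than per-record round-robin.

```python
from collections import defaultdict
from itertools import cycle


def deliver(records, groups):
    """Deliver every record to each group, and to one member per group.

    groups maps a group name to a list of member names (hypothetical names).
    Groups without members are skipped.
    """
    received = defaultdict(list)
    # One round-robin iterator per non-empty group (illustrative only).
    pickers = {g: cycle(members) for g, members in groups.items() if members}
    for record in records:
        for group, picker in pickers.items():
            member = next(picker)
            received[(group, member)].append(record)
    return dict(received)


groups = {"billing": ["b1", "b2"], "audit": ["a1"]}
received = deliver(["r0", "r1", "r2", "r3"], groups)

for key in sorted(received):
    print(key, received[key])
```

Both groups see all four records (publish-subscribe across groups), while inside the billing group the records are split between b1 and b2 (queue semantics within a group).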

Ordering guarantees

Kafka has stronger ordering guarantees than a traditional messaging system, too.

A traditional queue retains records in-order on the server, and if multiple consumers consume from the queue then the server hands out records in the order they are stored. However, although the server hands out records in order, the records are delivered asynchronously to consumers, so they may arrive out of order on different consumers. This effectively means the ordering of the records is lost in the presence of parallel consumption. Messaging systems often work around this by having a notion of "exclusive consumer" that allows only one process to consume from a queue, but of course this means that there is no parallelism in processing.
(Consumers consume partitions in parallel):
Kafka does it better. By having a notion of parallelism (the partition) within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer is the only reader of that partition and consumes the data in order. Since there are many partitions this still balances the load over many consumer instances. Note however that a consumer group cannot usefully contain more consumer instances than partitions: any extra instances are assigned no partition and sit idle. Ordering is also guaranteed only within a partition, not across the partitions of a topic.
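The partition-to-consumer mapping described above can be sketched as follows. The assign_partitions function and the consumer names are hypothetical, and the plain round-robin rule is a simplification; Kafka's built-in assignors are more involved, but they preserve the same property that each partition has exactly one owner within a group.

```python
def assign_partitions(num_partitions, consumers):
    """Give each partition exactly one owner, round-robin over consumers."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions shared by two consumers: load is balanced.
balanced = assign_partitions(4, ["c1", "c2"])

# Two partitions but three consumers: the third consumer gets nothing.
oversized = assign_partitions(2, ["c1", "c2", "c3"])

print(balanced)
print(oversized)
```

In the second case c3 ends up with an empty list, which is why adding consumers beyond the partition count does not increase throughput.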

Kafka as a Storage System

Any message queue that allows publishing messages decoupled from consuming them is effectively acting as a storage system for the in-flight messages. What is different about Kafka is that it is a very good storage system.

Data written to Kafka is written to disk and replicated for fault-tolerance. Kafka allows producers to wait on acknowledgement so that a write isn't considered complete until it is fully replicated and guaranteed to persist even if the server written to fails.
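As a rough illustration of waiting on acknowledgement, the sketch below acknowledges a write only when enough replicas can store it. The Replica class, the replicated_write function, and the min_in_sync parameter are hypothetical names for this example; they loosely mirror Kafka's in-sync replica idea and are not the broker's actual implementation.

```python
class Replica:
    """In-memory stand-in for one broker's copy of a partition."""

    def __init__(self, name, in_sync=True):
        self.name = name
        self.in_sync = in_sync
        self.log = []


def replicated_write(record, replicas, min_in_sync):
    """Append to every in-sync replica; acknowledge only if enough exist."""
    in_sync = [r for r in replicas if r.in_sync]
    if len(in_sync) < min_in_sync:
        return False  # rejected: too few replicas to make the write durable
    for replica in in_sync:
        replica.log.append(record)
    return True


replicas = [Replica("leader"), Replica("follower-1"), Replica("follower-2")]
first_ack = replicated_write("order-1", replicas, min_in_sync=2)

# Simulate the leader failing after the acknowledged write.
replicas[0].in_sync = False
survivors = [r.log for r in replicas if r.in_sync]

# With only one replica left, a further write is refused.
replicas[1].in_sync = False
second_ack = replicated_write("order-2", replicas, min_in_sync=2)
```

After the simulated leader failure the acknowledged record is still present on both followers, which is the guarantee the paragraph describes.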

The disk structures Kafka uses scale well—Kafka will perform the same whether you have 50 KB or 50 TB of persistent data on the server.

As a result of taking storage seriously and allowing the clients to control their read position, you can think of Kafka as a kind of special purpose distributed filesystem dedicated to high-performance, low-latency commit log storage, replication, and propagation.
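The idea that clients control their own read position can be shown with a minimal append-only log. CommitLog is a hypothetical in-memory stand-in for a single partition, not a Kafka API; the point is only that the log keeps the records and each reader keeps its own offset.

```python
class CommitLog:
    """Append-only in-memory log; a stand-in for one partition."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read(self, offset, max_records=10):
        # Reading never removes data; the caller decides where to start.
        return self._records[offset:offset + max_records]


log = CommitLog()
for value in ["a", "b", "c"]:
    log.append(value)

# A reader tracks its own position and advances it after each read.
reader_offset = 0
batch = log.read(reader_offset)
reader_offset += len(batch)

# Another reader (or the same one) can rewind and replay from offset 0.
replay = log.read(0, max_records=2)
```

Because reads are non-destructive, replaying old data only requires resetting an offset, which is what distinguishes this storage model from a traditional queue.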

For details about Kafka's commit log storage and replication design, please read the design section of the official Kafka documentation.

Kafka for Stream Processing

Not implemented in cppkafka, but available in rdkafka.

3 Introduction to librdkafka - the Apache Kafka C/C++ client library

Similar to: Kafka Documentation - producerconfigs

batch.num.messages - the maximum number of messages to accumulate in the local queue before sending off a message set; a smaller batch is sent if queue.buffering.max.ms expires first. The corresponding code can be found in librdkafka/src/rdkafka_broker.c.

queue.buffering.max.ms - how long to wait for batch.num.messages to fill up in the local queue. A lower value improves latency at the cost of lower throughput and higher per-message overhead. A higher value improves throughput at the expense of latency. The recommended value for high throughput is > 50ms.
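The interaction between the two settings can be approximated with a small simulation. This is a sketch under stated assumptions, not librdkafka's code: batch_messages is a hypothetical function, timestamps are plain integers in milliseconds, and the timeout is checked only when the next message arrives, whereas a real producer uses a timer.

```python
def batch_messages(arrivals, batch_num_messages, max_wait_ms):
    """Group (arrival_ms, payload) pairs into batches.

    A batch is flushed when it reaches batch_num_messages entries, or when
    a message arrives max_wait_ms or more after the batch was started.
    """
    batches, current, started_at = [], [], None
    for arrival_ms, payload in arrivals:
        # Timeout trigger: the open batch has waited long enough.
        if current and arrival_ms - started_at >= max_wait_ms:
            batches.append(current)
            current = []
        if not current:
            started_at = arrival_ms
        current.append(payload)
        # Count trigger: the batch is full.
        if len(current) >= batch_num_messages:
            batches.append(current)
            current = []
    if current:
        batches.append(current)  # flush whatever is left at the end
    return batches


arrivals = [(0, "m0"), (1, "m1"), (2, "m2"), (60, "m3"), (61, "m4")]

# Small batches and a short wait: more, smaller sends (lower latency).
low_latency = batch_messages(arrivals, batch_num_messages=2, max_wait_ms=5)

# Large batches and a long wait: fewer, larger sends (higher throughput).
high_throughput = batch_messages(arrivals, batch_num_messages=100, max_wait_ms=100)
```

With the low-latency settings the five messages go out in three sends; with the high-throughput settings they go out in one, which reduces per-message overhead at the cost of the earliest message waiting longer.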

Source: https://www.haomeiwen.com/subject/pvhzhftx.html