Overview
The Kafka source tree consists of multiple modules, each responsible for a different part of the system. The following is an overview of the core modules and their functions:

Server-side source: implements the core functionality of the Kafka broker, including log storage, the controller, coordinators, metadata and state-machine management, delayed-operation mechanisms, consumer-group management, and the high-concurrency network architecture.
Java client source: implements the interaction between Producer/Consumer and the broker, plus the shared supporting components.
Connect source: used to build bidirectional streaming synchronization services between heterogeneous data systems.
Streams source: implements real-time stream-processing functionality.
Raft source: implements the Raft consensus protocol.
Admin module: Kafka's administration module; operates on and manages topics and partitions, including creating and deleting topics and expanding partitions.
Api module: responsible for data interchange; encodes and decodes the data exchanged between clients and the server.
Client module: contains the classes through which a Producer reads Kafka broker metadata, such as topics, partitions, and leaders.
Cluster module: contains entity classes such as Broker, Cluster, Partition, and Replica.
Common module: contains the various exception classes and error validation.
Consumer module: the consumer-processing module, responsible for client-side consumer data and logic.
Controller module: responsible for electing the central controller, electing partition leaders, assigning or reassigning replicas, and expanding partitions and replicas.
Coordinator module: manages the consumer groups that a broker is responsible for, along with their offsets.
Javaapi module: provides the Java-language Producer and Consumer API interfaces.
Log module: responsible for Kafka's file storage; reads and writes the message data of all topics.
Message module: wraps multiple records into a record set or a compressed record set.
Metrics module: responsible for internal state monitoring.
Network module: handles client connections; the network-event module.
Producer module: the producer implementation details, including synchronous and asynchronous message sending (a short client-side sketch follows this overview).
Security module: responsible for Kafka's security validation and management.
Serializer module: serializes and deserializes message content.
Server module: covers leader and offset checkpoints, dynamic configuration, delayed topic creation and deletion, leader election, and Admin and Replica management.
Tools module: contains a variety of tools, such as exporting consumer offsets, LogSegments information, a topic's log location information, and offsets stored in ZooKeeper.
Utils module: contains various utility classes, such as Json, ZkUtils, thread-pool utilities, and the KafkaScheduler common scheduler class.
Together these modules make up Kafka's overall architecture, enabling it to provide a high-throughput, highly available messaging service.
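As a quick illustration of the client-side Producer API mentioned above, here is a minimal sketch of synchronous versus asynchronous sending with the Java client. It assumes a broker at localhost:9092 and a topic named demo-topic; both are placeholders for illustration, not anything taken from the Kafka source itself.

import java.util.Properties
import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerConfig, ProducerRecord, RecordMetadata}

object ProducerModes extends App {
  val props = new Properties()
  props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed local broker
  props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer")
  props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer")
  val producer = new KafkaProducer[String, String](props)
  val record = new ProducerRecord[String, String]("demo-topic", "key", "value") // assumed topic

  // Synchronous send: block on the returned Future until the broker acknowledges the write
  val metadata = producer.send(record).get()
  println(s"sync send -> partition=${metadata.partition} offset=${metadata.offset}")

  // Asynchronous send: the callback fires on the producer's I/O thread once the broker responds
  producer.send(record, new Callback {
    override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
      if (exception != null) exception.printStackTrace()
      else println(s"async send -> partition=${metadata.partition} offset=${metadata.offset}")
  })
  producer.flush()
  producer.close()
}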
The Kafka source branch analyzed here is 1.0.2.

The kafka.controller.ControllerChannelManager class:
/**
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package kafka.controller

import java.net.SocketTimeoutException
import java.util.concurrent.{BlockingQueue, LinkedBlockingQueue}

import com.yammer.metrics.core.Gauge
import kafka.api._
import kafka.cluster.Broker
import kafka.common.{KafkaException, TopicAndPartition}
import kafka.metrics.KafkaMetricsGroup
import kafka.server.KafkaConfig
import kafka.utils._
import org.apache.kafka.clients._
import org.apache.kafka.common.metrics.Metrics
import org.apache.kafka.common.network._
import org.apache.kafka.common.protocol.ApiKeys
import org.apache.kafka.common.requests.UpdateMetadataRequest.EndPoint
import org.apache.kafka.common.requests._
import org.apache.kafka.common.security.JaasContext
import org.apache.kafka.common.security.auth.SecurityProtocol
import org.apache.kafka.common.utils.{LogContext, Time}
import org.apache.kafka.common.{Node, TopicPartition}

import scala.collection.JavaConverters._
import scala.collection.mutable.HashMap
import scala.collection.{Set, mutable}

object ControllerChannelManager {
  val QueueSizeMetricName = "QueueSize"
}

// In Kafka, ControllerChannelManager is a key component that manages communication between the controller and
// the other Kafka brokers. The controller is a special role in the Kafka cluster, responsible for managing the
// cluster's metadata and coordinating cluster operations such as partition assignment and replica synchronization.
//
// The main responsibilities of ControllerChannelManager are:
// - Managing communication channels: maintaining the network connections between the controller and each broker.
// - Sending control commands: sending requests to brokers, e.g. for partition reassignment or replica synchronization.
// - Receiving responses: receiving the responses and state updates that brokers send back.
// - Handling failures and retries: dealing with failures that may occur while communicating with a broker, and retrying.
// - Monitoring and statistics: monitoring communication state and performance, and collecting metrics.
//
// KafkaController communicates with brokers by sending specific requests. In this version the controller sends
// exactly three kinds of requests (the RequestSendThread below rejects anything else):
// - LeaderAndIsrRequest: updates a partition's leader and ISR (In-Sync Replicas).
// - StopReplicaRequest: stops a replica of a partition.
// - UpdateMetadataRequest: updates partition metadata.
// (Partition reassignment is also driven through these requests; there is no separate ReassignPartitionsRequest.)
//
// On initialization, ControllerChannelManager creates a ControllerBrokerStateInfo object for every node in the
// cluster. Its main parts are:
// - NetworkClient: the network connection object;
// - Node: the broker's node information;
// - BlockingQueue: the request queue;
// - RequestSendThread: the thread that sends the queued requests.
// The concrete implementation is shown below:
class ControllerChannelManager(controllerContext: ControllerContext, config: KafkaConfig, time: Time, metrics: Metrics,
                               stateChangeLogger: StateChangeLogger, threadNamePrefix: Option[String] = None) extends Logging with KafkaMetricsGroup {
  import ControllerChannelManager._

  protected val brokerStateInfo = new HashMap[Int, ControllerBrokerStateInfo]
  private val brokerLock = new Object
  this.logIdent = "[Channel manager on controller " + config.brokerId + "]: "

  newGauge(
    "TotalQueueSize",
    new Gauge[Int] {
      def value: Int = brokerLock synchronized {
        brokerStateInfo.values.iterator.map(_.messageQueue.size).sum
      }
    }
  )

  // Creates the per-broker state (request queue, network client and RequestSendThread) for every live broker
  controllerContext.liveBrokers.foreach(addNewBroker)

  def startup() = {
    brokerLock synchronized {
      brokerStateInfo.foreach(brokerState => startRequestSendThread(brokerState._1))
    }
  }

  def shutdown() = {
    brokerLock synchronized {
      brokerStateInfo.values.foreach(removeExistingBroker)
    }
  }

  // Sends a request to a broker (nothing is actually sent here; the request is only added to the broker's queue).
  // The actual sending is handled by the RequestSendThread that belongs to each broker.
  def sendRequest(brokerId: Int, apiKey: ApiKeys, request: AbstractRequest.Builder[_ <: AbstractRequest],
                  callback: AbstractResponse => Unit = null) {
    brokerLock synchronized {
      val stateInfoOpt = brokerStateInfo.get(brokerId)
      stateInfoOpt match {
        case Some(stateInfo) =>
          stateInfo.messageQueue.put(QueueItem(apiKey, request, callback))
        case None =>
          warn("Not sending request %s to broker %d, since it is offline.".format(request, brokerId))
      }
    }
  }

  def addBroker(broker: Broker) {
    // be careful here. Maybe the startup() API has already started the request send thread
    brokerLock synchronized {
      if (!brokerStateInfo.contains(broker.id)) {
        addNewBroker(broker)
        startRequestSendThread(broker.id)
      }
    }
  }

  def removeBroker(brokerId: Int) {
    brokerLock synchronized {
      removeExistingBroker(brokerStateInfo(brokerId))
    }
  }

  private def addNewBroker(broker: Broker) {
    val messageQueue = new LinkedBlockingQueue[QueueItem]
    debug("Controller %d trying to connect to broker %d".format(config.brokerId, broker.id))
    val brokerNode = broker.getNode(config.interBrokerListenerName)
    val logContext = new LogContext(s"[Controller id=${config.brokerId}, targetBrokerId=${brokerNode.idString}] ")
    val networkClient = {
      val channelBuilder = ChannelBuilders.clientChannelBuilder(
        config.interBrokerSecurityProtocol,
        JaasContext.Type.SERVER,
        config,
        config.interBrokerListenerName,
        config.saslMechanismInterBrokerProtocol,
        config.saslInterBrokerHandshakeRequestEnable
      )
      val selector = new Selector(
        NetworkReceive.UNLIMITED,
        Selector.NO_IDLE_TIMEOUT_MS,
        metrics,
        time,
        "controller-channel",
        Map("broker-id" -> brokerNode.idString).asJava,
        false,
        channelBuilder,
        logContext
      )
      new NetworkClient(
        selector,
        new ManualMetadataUpdater(Seq(brokerNode).asJava),
        config.brokerId.toString,
        1,
        0,
        0,
        Selectable.USE_DEFAULT_BUFFER_SIZE,
        Selectable.USE_DEFAULT_BUFFER_SIZE,
        config.requestTimeoutMs,
        time,
        false,
        new ApiVersions,
        logContext
      )
    }
    val threadName = threadNamePrefix match {
      case None => "Controller-%d-to-broker-%d-send-thread".format(config.brokerId, broker.id)
      case Some(name) => "%s:Controller-%d-to-broker-%d-send-thread".format(name, config.brokerId, broker.id)
    }

    val requestThread = new RequestSendThread(config.brokerId, controllerContext, messageQueue, networkClient,
      brokerNode, config, time, stateChangeLogger, threadName)
    requestThread.setDaemon(false)

    val queueSizeGauge = newGauge(
      QueueSizeMetricName,
      new Gauge[Int] {
        def value: Int = messageQueue.size
      },
      queueSizeTags(broker.id)
    )

    brokerStateInfo.put(broker.id, new ControllerBrokerStateInfo(networkClient, brokerNode, messageQueue,
      requestThread, queueSizeGauge))
  }

  private def queueSizeTags(brokerId: Int) = Map("broker-id" -> brokerId.toString)

  private def removeExistingBroker(brokerState: ControllerBrokerStateInfo) {
    try {
      // Shutdown the RequestSendThread before closing the NetworkClient to avoid the concurrent use of the
      // non-threadsafe classes as described in KAFKA-4959.
      // The call to shutdownLatch.await() in ShutdownableThread.shutdown() serves as a synchronization barrier that
      // hands off the NetworkClient from the RequestSendThread to the ZkEventThread.
      brokerState.requestSendThread.shutdown()
      brokerState.networkClient.close()
      brokerState.messageQueue.clear()
      removeMetric(QueueSizeMetricName, queueSizeTags(brokerState.brokerNode.id))
      brokerStateInfo.remove(brokerState.brokerNode.id)
    } catch {
      case e: Throwable => error("Error while removing broker by the controller", e)
    }
  }

  protected def startRequestSendThread(brokerId: Int) {
    val requestThread = brokerStateInfo(brokerId).requestSendThread
    if (requestThread.getState == Thread.State.NEW)
      requestThread.start()
  }
}

case class QueueItem(apiKey: ApiKeys, request: AbstractRequest.Builder[_ <: AbstractRequest],
                     callback: AbstractResponse => Unit)

class RequestSendThread(val controllerId: Int,
                        val controllerContext: ControllerContext,
                        val queue: BlockingQueue[QueueItem],
                        val networkClient: NetworkClient,
                        val brokerNode: Node,
                        val config: KafkaConfig,
                        val time: Time,
                        val stateChangeLogger: StateChangeLogger,
                        name: String)
  extends ShutdownableThread(name = name) {

  private val socketTimeoutMs = config.controllerSocketTimeoutMs

  override def doWork(): Unit = {

    def backoff(): Unit = CoreUtils.swallowTrace(Thread.sleep(100))

    val QueueItem(apiKey, requestBuilder, callback) = queue.take()

    var clientResponse: ClientResponse = null
    try {
      var isSendSuccessful = false
      while (isRunning.get() && !isSendSuccessful) {
        // if a broker goes down for a long time, then at some point the controller's zookeeper listener will trigger a
        // removeBroker which will invoke shutdown() on this thread. At that point, we will stop retrying.
        try {
          if (!brokerReady()) {
            isSendSuccessful = false
            backoff()
          }
          else {
            val clientRequest = networkClient.newClientRequest(brokerNode.idString, requestBuilder,
              time.milliseconds(), true)
            clientResponse = NetworkClientUtils.sendAndReceive(networkClient, clientRequest, time)
            isSendSuccessful = true
          }
        } catch {
          case e: Throwable => // if the send was not successful, reconnect to broker and resend the message
            warn(("Controller %d epoch %d fails to send request %s to broker %s. " +
              "Reconnecting to broker.").format(controllerId, controllerContext.epoch,
                requestBuilder.toString, brokerNode.toString), e)
            networkClient.close(brokerNode.idString)
            isSendSuccessful = false
            backoff()
        }
      }
      if (clientResponse != null) {
        val requestHeader = clientResponse.requestHeader
        val api = requestHeader.apiKey
        if (api != ApiKeys.LEADER_AND_ISR && api != ApiKeys.STOP_REPLICA && api != ApiKeys.UPDATE_METADATA)
          throw new KafkaException(s"Unexpected apiKey received: $apiKey")

        val response = clientResponse.responseBody

        stateChangeLogger.withControllerEpoch(controllerContext.epoch).trace("Received response " +
          s"${response.toString(requestHeader.apiVersion)} for a request sent to broker $brokerNode")

        if (callback != null) {
          callback(response)
        }
      }
    } catch {
      case e: Throwable =>
        error("Controller %d fails to send a request to broker %s".format(controllerId, brokerNode.toString), e)
        // If there is any socket error (eg, socket timeout), the connection is no longer usable and needs to be recreated.
        networkClient.close(brokerNode.idString)
    }
  }

  private def brokerReady(): Boolean = {
    try {
      if (!NetworkClientUtils.isReady(networkClient, brokerNode, time.milliseconds())) {
        if (!NetworkClientUtils.awaitReady(networkClient, brokerNode, time, socketTimeoutMs))
          throw new SocketTimeoutException(s"Failed to connect within $socketTimeoutMs ms")

        info("Controller %d connected to %s for sending state change requests".format(controllerId, brokerNode.toString))
      }

      true
    } catch {
      case e: Throwable =>
        warn("Controller %d's connection to broker %s was unsuccessful".format(controllerId, brokerNode.toString), e)
        networkClient.close(brokerNode.idString)
        false
    }
  }

}

class ControllerBrokerRequestBatch(controller: KafkaController, stateChangeLogger: StateChangeLogger) extends Logging {
  val controllerContext = controller.controllerContext
  val controllerId: Int = controller.config.brokerId
  // Maps each broker to the set of LeaderAndIsr partition states to be sent to it
  val leaderAndIsrRequestMap = mutable.Map.empty[Int, mutable.Map[TopicPartition, LeaderAndIsrRequest.PartitionState]]
  // Maps each broker to the set of StopReplica requests to be sent to it
  val stopReplicaRequestMap = mutable.Map.empty[Int, Seq[StopReplicaRequestInfo]]
  // The set of brokers to which pending UpdateMetadata requests will be sent
  val updateMetadataRequestBrokerSet = mutable.Set.empty[Int]
  // The topic partitions whose state the pending UpdateMetadata requests will carry
  val updateMetadataRequestPartitionInfoMap = mutable.Map.empty[TopicPartition, UpdateMetadataRequest.PartitionState]

  // A new batch may only be created once the previous batch has been completely sent; otherwise an exception is thrown.
  // This method checks whether the previous round of LeaderAndIsr, UpdateMetadata and StopReplica requests has been
  // sent out. Normally, after the controller calls sendRequestsToBrokers(), the requests in these collections are
  // all sent and the collections are cleared. In abnormal cases some collections may be left non-empty, which makes
  // newBatch() fail; the usual remedy then is to restart the controller. The controller's design is still fairly
  // complex, and in some situations exceptions can occur that are not recoverable.
  def newBatch() {
    // raise error if the previous batch is not empty
    if (leaderAndIsrRequestMap.nonEmpty)
      throw new IllegalStateException("Controller to broker state change requests batch is not empty while creating " +
        "a new one. Some LeaderAndIsr state changes %s might be lost ".format(leaderAndIsrRequestMap.toString()))
    if (stopReplicaRequestMap.nonEmpty)
      throw new IllegalStateException("Controller to broker state change requests batch is not empty while creating a " +
        "new one. Some StopReplica state changes %s might be lost ".format(stopReplicaRequestMap.toString()))
    if (updateMetadataRequestBrokerSet.nonEmpty)
      throw new IllegalStateException("Controller to broker state change requests batch is not empty while creating a " +
        "new one. Some UpdateMetadata state changes to brokers %s with partition info %s might be lost ".format(
          updateMetadataRequestBrokerSet.toString(), updateMetadataRequestPartitionInfoMap.toString()))
  }

  def clear() {
    leaderAndIsrRequestMap.clear()
    stopReplicaRequestMap.clear()
    updateMetadataRequestBrokerSet.clear()
    updateMetadataRequestPartitionInfoMap.clear()
  }

  // Adds a LeaderAndIsr request for the given brokers; the request is recorded in the leaderAndIsrRequestMap.
  // It also calls addUpdateMetadataRequestForBrokers() to add an UpdateMetadata request for this topic partition
  // for all brokers: whenever the leader or ISR changes, the partition's metadata is propagated to every broker,
  // which guarantees that each broker holds the latest metadata.
  def addLeaderAndIsrRequestForBrokers(brokerIds: Seq[Int], topic: String, partition: Int,
                                       leaderIsrAndControllerEpoch: LeaderIsrAndControllerEpoch,
                                       replicas: Seq[Int], isNew: Boolean = false) {
    val topicPartition = new TopicPartition(topic, partition)

    // Record the request for each target broker
    brokerIds.filter(_ >= 0).foreach { brokerId =>
      val result = leaderAndIsrRequestMap.getOrElseUpdate(brokerId, mutable.Map.empty)
      val alreadyNew = result.get(topicPartition).exists(_.isNew)
      result.put(topicPartition, new LeaderAndIsrRequest.PartitionState(leaderIsrAndControllerEpoch.controllerEpoch,
        leaderIsrAndControllerEpoch.leaderAndIsr.leader,
        leaderIsrAndControllerEpoch.leaderAndIsr.leaderEpoch,
        leaderIsrAndControllerEpoch.leaderAndIsr.isr.map(Integer.valueOf).asJava,
        leaderIsrAndControllerEpoch.leaderAndIsr.zkVersion,
        replicas.map(Integer.valueOf).asJava,
        isNew || alreadyNew))
    }

    // Updating the LeaderAndIsr information effectively updates the topic's metadata as well, so this topic's
    // metadata has to be sent to all live brokers
    addUpdateMetadataRequestForBrokers(controllerContext.liveOrShuttingDownBrokerIds.toSeq,
      Set(TopicAndPartition(topic, partition)))
  }

  // Adds a StopReplica request for the given topic partition to the given brokers
  def addStopReplicaRequestForBrokers(brokerIds: Seq[Int], topic: String, partition: Int, deletePartition: Boolean,
                                      callback: (AbstractResponse, Int) => Unit = null) {
    brokerIds.filter(b => b >= 0).foreach { brokerId =>
      stopReplicaRequestMap.getOrElseUpdate(brokerId, Seq.empty[StopReplicaRequestInfo])
      val v = stopReplicaRequestMap(brokerId)
      if (callback != null)
        stopReplicaRequestMap(brokerId) = v :+ StopReplicaRequestInfo(PartitionAndReplica(topic, partition, brokerId),
          deletePartition, (r: AbstractResponse) => callback(r, brokerId))
      else
        stopReplicaRequestMap(brokerId) = v :+ StopReplicaRequestInfo(PartitionAndReplica(topic, partition, brokerId),
          deletePartition)
    }
  }

  // Adds an UpdateMetadata request for a batch of partitions to the given brokers:
  // - First work out the partitions to send; if no partition list is specified, the default is to send the
  //   global metadata (all partitions with leadership info);
  // - Then remove the partitions already marked for deletion from that list;
  // - Add the target brokers to the updateMetadataRequestBrokerSet collection;
  // - Add the metadata of the filtered partitions to the updateMetadataRequestPartitionInfoMap collection;
  // - Also add the metadata of all partitions currently marked for deletion to updateMetadataRequestPartitionInfoMap,
  //   with their leader set to -2 beforehand, so that a broker receiving this partition's metadata knows the
  //   partition is flagged for deletion.
  /** Send UpdateMetadataRequest to the given brokers for the given partitions and partitions that are being deleted */
  def addUpdateMetadataRequestForBrokers(brokerIds: Seq[Int],
                                         partitions: collection.Set[TopicAndPartition] = Set.empty[TopicAndPartition]) {

    // Adds the topic partition's current state to the pending map
    def updateMetadataRequestPartitionInfo(partition: TopicAndPartition, beingDeleted: Boolean) {
      val leaderIsrAndControllerEpochOpt = controllerContext.partitionLeadershipInfo.get(partition)
      leaderIsrAndControllerEpochOpt match {
        case Some(l @ LeaderIsrAndControllerEpoch(leaderAndIsr, controllerEpoch)) =>
          val replicas = controllerContext.partitionReplicaAssignment(partition)
          val offlineReplicas = replicas.filter(!controllerContext.isReplicaOnline(_, partition))
          val leaderIsrAndControllerEpoch = if (beingDeleted) {
            val leaderDuringDelete = LeaderAndIsr.duringDelete(leaderAndIsr.isr)
            LeaderIsrAndControllerEpoch(leaderDuringDelete, controllerEpoch)
          } else {
            l
          }

          val partitionStateInfo = new UpdateMetadataRequest.PartitionState(leaderIsrAndControllerEpoch.controllerEpoch,
            leaderIsrAndControllerEpoch.leaderAndIsr.leader,
            leaderIsrAndControllerEpoch.leaderAndIsr.leaderEpoch,
            leaderIsrAndControllerEpoch.leaderAndIsr.isr.map(Integer.valueOf).asJava,
            leaderIsrAndControllerEpoch.leaderAndIsr.zkVersion,
            replicas.map(Integer.valueOf).asJava,
            offlineReplicas.map(Integer.valueOf).asJava)
          updateMetadataRequestPartitionInfoMap.put(new TopicPartition(partition.topic, partition.partition), partitionStateInfo)

        case None =>
          info("Leader not yet assigned for partition %s. Skip sending UpdateMetadataRequest.".format(partition))
      }
    }

    // Filter out the partitions to send
    val filteredPartitions = {
      // If partitions is empty, select all partitions (i.e. full metadata)
      val givenPartitions = if (partitions.isEmpty)
        controllerContext.partitionLeadershipInfo.keySet
      else
        partitions
      if (controller.topicDeletionManager.partitionsToBeDeleted.isEmpty)
        givenPartitions
      else
        givenPartitions -- controller.topicDeletionManager.partitionsToBeDeleted
    }

    // Add the target brokers to the pending set
    updateMetadataRequestBrokerSet ++= brokerIds.filter(_ >= 0)
    // Partitions whose metadata is simply being refreshed: beingDeleted = false
    filteredPartitions.foreach(partition => updateMetadataRequestPartitionInfo(partition, beingDeleted = false))
    // Partitions that are being deleted: beingDeleted = true
    controller.topicDeletionManager.partitionsToBeDeleted.foreach(partition => updateMetadataRequestPartitionInfo(partition, beingDeleted = true))
  }

  // Sends the batched requests to the brokers (again, this only puts them on the corresponding queues).
  // This method drains the three collections into the request queues of the corresponding brokers. In short:
  // - It builds the LeaderAndIsr requests from leaderAndIsrRequestMap, adds each to its broker's MessageQueue via
  //   the controller's sendRequest() method, and finally clears leaderAndIsrRequestMap;
  // - It builds the UpdateMetadata requests from updateMetadataRequestPartitionInfoMap, adds them to the brokers'
  //   MessageQueues via sendRequest(), and finally clears updateMetadataRequestBrokerSet and
  //   updateMetadataRequestPartitionInfoMap;
  // - It builds the StopReplica requests from stopReplicaRequestMap; the affected partitions are split into two
  //   groups by the delete flag, a callback is attached to the StopReplica requests that delete data, each request
  //   is added to its broker's MessageQueue via sendRequest(), and finally stopReplicaRequestMap is cleared.
  // At this point every request the controller wants to send sits in the MessageQueue of the corresponding broker;
  // the background RequestSendThread picks the requests off that queue and sends them to the broker.
  def sendRequestsToBrokers(controllerEpoch: Int) {
    try {
      val stateChangeLog = stateChangeLogger.withControllerEpoch(controllerEpoch)
      val leaderAndIsrRequestVersion: Short =
        if (controller.config.interBrokerProtocolVersion >= KAFKA_1_0_IV0) 1
        else 0

      // LeaderAndIsr requests
      leaderAndIsrRequestMap.foreach { case (broker, leaderAndIsrPartitionStates) =>
        leaderAndIsrPartitionStates.foreach { case (topicPartition, state) =>
          val typeOfRequest = if (broker == state.basePartitionState.leader) "become-leader" else "become-follower"
          stateChangeLog.trace(s"Sending $typeOfRequest LeaderAndIsr request $state to broker $broker for partition $topicPartition")
        }
        // The set of leader ids
        val leaderIds = leaderAndIsrPartitionStates.map(_._2.basePartitionState.leader).toSet
        val leaders = controllerContext.liveOrShuttingDownBrokers.filter(b => leaderIds.contains(b.id)).map {
          _.getNode(controller.config.interBrokerListenerName)
        }
        // Build the LeaderAndIsr request and add it to the broker's queue
        val leaderAndIsrRequestBuilder = new LeaderAndIsrRequest.Builder(leaderAndIsrRequestVersion, controllerId,
          controllerEpoch, leaderAndIsrPartitionStates.asJava, leaders.asJava)
        controller.sendRequest(broker, ApiKeys.LEADER_AND_ISR, leaderAndIsrRequestBuilder,
          (r: AbstractResponse) => controller.eventManager.put(controller.LeaderAndIsrResponseReceived(r, broker)))
      }
      // Clear the LeaderAndIsr collection
      leaderAndIsrRequestMap.clear()

      updateMetadataRequestPartitionInfoMap.foreach { case (tp, partitionState) =>
        stateChangeLog.trace(s"Sending UpdateMetadata request $partitionState to brokers $updateMetadataRequestBrokerSet " +
          s"for partition $tp")
      }

      val partitionStates = Map.empty ++ updateMetadataRequestPartitionInfoMap
      val updateMetadataRequestVersion: Short =
        if (controller.config.interBrokerProtocolVersion >= KAFKA_1_0_IV0) 4
        else if (controller.config.interBrokerProtocolVersion >= KAFKA_0_10_2_IV0) 3
        else if (controller.config.interBrokerProtocolVersion >= KAFKA_0_10_0_IV1) 2
        else if (controller.config.interBrokerProtocolVersion >= KAFKA_0_9_0) 1
        else 0

      // Build the UpdateMetadata request
      val updateMetadataRequest = {
        val liveBrokers = if (updateMetadataRequestVersion == 0) {
          // Version 0 of UpdateMetadataRequest only supports PLAINTEXT.
          controllerContext.liveOrShuttingDownBrokers.map { broker =>
            val securityProtocol = SecurityProtocol.PLAINTEXT
            val listenerName = ListenerName.forSecurityProtocol(securityProtocol)
            val node = broker.getNode(listenerName)
            val endPoints = Seq(new EndPoint(node.host, node.port, securityProtocol, listenerName))
            new UpdateMetadataRequest.Broker(broker.id, endPoints.asJava, broker.rack.orNull)
          }
        } else {
          controllerContext.liveOrShuttingDownBrokers.map { broker =>
            val endPoints = broker.endPoints.map { endPoint =>
              new UpdateMetadataRequest.EndPoint(endPoint.host, endPoint.port, endPoint.securityProtocol, endPoint.listenerName)
            }
            new UpdateMetadataRequest.Broker(broker.id, endPoints.asJava, broker.rack.orNull)
          }
        }
        new UpdateMetadataRequest.Builder(updateMetadataRequestVersion, controllerId, controllerEpoch, partitionStates.asJava,
          liveBrokers.asJava)
      }

      // Add the request to each broker's queue
      updateMetadataRequestBrokerSet.foreach { broker =>
        controller.sendRequest(broker, ApiKeys.UPDATE_METADATA, updateMetadataRequest, null)
      }
      updateMetadataRequestBrokerSet.clear()
      updateMetadataRequestPartitionInfoMap.clear()

      // StopReplica request handling
      stopReplicaRequestMap.foreach { case (broker, replicaInfoList) =>
        val stopReplicaWithDelete = replicaInfoList.filter(_.deletePartition).map(_.replica).toSet
        val stopReplicaWithoutDelete = replicaInfoList.filterNot(_.deletePartition).map(_.replica).toSet
        debug("The stop replica request (delete = true) sent to broker %d is %s"
          .format(broker, stopReplicaWithDelete.mkString(",")))
        debug("The stop replica request (delete = false) sent to broker %d is %s"
          .format(broker, stopReplicaWithoutDelete.mkString(",")))

        val (replicasToGroup, replicasToNotGroup) = replicaInfoList.partition(r => !r.deletePartition && r.callback == null)

        // Send one StopReplicaRequest for all partitions that require neither delete nor callback. This potentially
        // changes the order in which the requests are sent for the same partitions, but that's OK.
        val stopReplicaRequest = new StopReplicaRequest.Builder(controllerId, controllerEpoch, false,
          replicasToGroup.map(r => new TopicPartition(r.replica.topic, r.replica.partition)).toSet.asJava)
        controller.sendRequest(broker, ApiKeys.STOP_REPLICA, stopReplicaRequest)

        replicasToNotGroup.foreach { r =>
          val stopReplicaRequest = new StopReplicaRequest.Builder(
            controllerId, controllerEpoch, r.deletePartition,
            Set(new TopicPartition(r.replica.topic, r.replica.partition)).asJava)
          controller.sendRequest(broker, ApiKeys.STOP_REPLICA, stopReplicaRequest, r.callback)
        }
      }
      stopReplicaRequestMap.clear()
    } catch {
      case e: Throwable =>
        if (leaderAndIsrRequestMap.nonEmpty) {
          error("Haven't been able to send leader and isr requests, current state of " +
            s"the map is $leaderAndIsrRequestMap. Exception message: $e")
        }
        if (updateMetadataRequestBrokerSet.nonEmpty) {
          error(s"Haven't been able to send metadata update requests to brokers $updateMetadataRequestBrokerSet, " +
            s"current state of the partition info is $updateMetadataRequestPartitionInfoMap. Exception message: $e")
        }
        if (stopReplicaRequestMap.nonEmpty) {
          error("Haven't been able to send stop replica requests, current state of " +
            s"the map is $stopReplicaRequestMap. Exception message: $e")
        }
        throw new IllegalStateException(e)
    }
  }
}

case class ControllerBrokerStateInfo(networkClient: NetworkClient,
                                     brokerNode: Node,
                                     messageQueue: BlockingQueue[QueueItem],
                                     requestSendThread: RequestSendThread,
                                     queueSizeGauge: Gauge[Int])

case class StopReplicaRequestInfo(replica: PartitionAndReplica, deletePartition: Boolean,
                                  callback: AbstractResponse => Unit = null)

class Callbacks private (var stopReplicaResponseCallback: (AbstractResponse, Int) => Unit)

object Callbacks {
  class CallbackBuilder {
    var stopReplicaResponseCbk: (AbstractResponse, Int) => Unit = null

    def stopReplicaCallback(cbk: (AbstractResponse, Int) => Unit): CallbackBuilder = {
      stopReplicaResponseCbk = cbk
      this
    }

    def build: Callbacks = new Callbacks(stopReplicaResponseCbk)
  }
}
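The core design above — callers enqueue requests, and a dedicated per-broker thread drains the queue and performs the blocking network I/O — can be distilled into a short, self-contained sketch. Everything here (Request, BrokerSendThread, the fake response) is invented for illustration; it is not Kafka code, and it deliberately omits the retry, backoff and reconnect logic that the real RequestSendThread implements.

import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.atomic.AtomicBoolean

// A stripped-down model of the ControllerChannelManager pattern: sendRequest() only
// enqueues, and a per-broker send thread does the actual (simulated) sending.
object QueuePatternDemo extends App {
  case class Request(api: String, payload: String, callback: String => Unit)

  class BrokerSendThread(brokerId: Int, queue: LinkedBlockingQueue[Request]) extends Thread {
    private val running = new AtomicBoolean(true)
    override def run(): Unit =
      while (running.get()) {
        val req = queue.take() // blocks until a request is enqueued, like queue.take() in doWork()
        // In Kafka this is NetworkClientUtils.sendAndReceive with retry/backoff; here we fake a response
        val response = s"broker-$brokerId handled ${req.api}(${req.payload})"
        if (req.callback != null) req.callback(response)
      }
    def shutdown(): Unit = running.set(false)
  }

  val queue = new LinkedBlockingQueue[Request]()
  val sender = new BrokerSendThread(brokerId = 1, queue)
  sender.setDaemon(true)
  sender.start()

  // The sendRequest() analogue: enqueue only; the send thread does the actual work asynchronously
  queue.put(Request("LEADER_AND_ISR", "tp=demo-0", r => println(r)))
  queue.put(Request("UPDATE_METADATA", "tp=demo-0", r => println(r)))
  Thread.sleep(200) // give the daemon thread time to drain the queue before the demo exits
}

The handoff through the blocking queue is what lets the controller's event-handling thread stay responsive: enqueueing is cheap and lock-protected, while slow or failing brokers only stall their own send thread, never the controller itself.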