Running into LeaderNotAvailableException when using Kafka 0.8.1 with Zookeeper 3.4.6

Apache Zookeeper | Apache Kafka

Apache Zookeeper Problem Overview


I installed the stable version of Kafka (0.8.1 with Scala 2.9.2) as per their website and am running it with a 3-node Zookeeper ensemble (3.4.6). I tried to create a test topic, but keep seeing that there is no leader assigned to the partition of the topic:

[kafka_2.9.2-0.8.1]$ ./bin/kafka-topics.sh --zookeeper <zookeeper_ensemble> --describe --topic test-1
Topic:test-1	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: test-1	Partition: 0	Leader: none	Replicas: 0,1,2	Isr:

I tried to write to the topic anyway using the console producer, but ran into a LeaderNotAvailableException:

[kafka_2.9.2-0.8.1]$ ./kafka-console-producer.sh --broker-list <broker_list> --topic test-1

hello world

[2014-04-22 11:58:48,297] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,321] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,322] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: test-1 (kafka.producer.async.DefaultEventHandler)

[2014-04-22 11:58:48,445] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,467] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,467] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: test-1 (kafka.producer.async.DefaultEventHandler)

[2014-04-22 11:58:48,590] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,612] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,612] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: test-1 (kafka.producer.async.DefaultEventHandler)

[2014-04-22 11:58:48,731] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,753] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,754] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: test-1 (kafka.producer.async.DefaultEventHandler)

[2014-04-22 11:58:48,876] WARN Error while fetching metadata [{TopicMetadata for topic test-1 -> 
No partition metadata for topic test-1 due to kafka.common.LeaderNotAvailableException}] for topic [test-1]: class kafka.common.LeaderNotAvailableException  (kafka.producer.BrokerPartitionInfo)

[2014-04-22 11:58:48,877] ERROR Failed to send requests for topics test-1 with correlation ids in [0,8] (kafka.producer.async.DefaultEventHandler)

[2014-04-22 11:58:48,878] ERROR Error in handling batch of 1 events (kafka.producer.async.ProducerSendThread)
kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
	at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
	at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)
	at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)
	at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)
	at scala.collection.immutable.Stream.foreach(Stream.scala:547)
	at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)
	at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)

I should also state that this was working initially for a few days and then suddenly any topic that was created had this missing leader problem.

Apache Zookeeper Solutions


Solution 1 - Apache Zookeeper

Kafka uses an external coordination framework (by default Zookeeper) to maintain configuration. It seems the configuration is now out of sync with the Kafka log data. In this case, I'd remove the affected topic data and the related Zookeeper data.

For Test Environment:

  1. Stop the Kafka server and the Zookeeper server
  2. Remove the data directories of both services; by default they are /tmp/kafka-logs and /tmp/zookeeper
  3. Start the Zookeeper server and then the Kafka server again
  4. Create a new topic (see the sketch below)
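
A condensed sketch of these four steps for a single-broker test box (script names are from the Kafka 0.8.x distribution; the paths, Zookeeper address, and --replication-factor value are assumptions, so adjust them to your setup):

# 1. Stop both services
./bin/kafka-server-stop.sh
./bin/zookeeper-server-stop.sh

# 2. Remove the default data directories (this destroys all topics and offsets)
rm -rf /tmp/kafka-logs /tmp/zookeeper

# 3. Start Zookeeper first, then Kafka
./bin/zookeeper-server-start.sh config/zookeeper.properties &
./bin/kafka-server-start.sh config/server.properties &

# 4. Create a new topic (replication factor 1 for a single broker)
./bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic test-1 --partitions 1 --replication-factor 1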

Now you are able to work with the topic again.

For Production Environment:

As Kafka stores each topic's data in its own directories, you should remove only the directories of the affected topic. You should also remove the topic's znodes, /brokers/topics/{broken_topic} (and /config/topics/{broken_topic} on 0.8.x), from Zookeeper by using a Zookeeper client.
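
As a sketch, the znodes for a broken topic can be removed with the zkCli.sh client that ships with Zookeeper (test-1 is a placeholder topic name; inspect the paths on your ensemble before deleting anything):

./bin/zkCli.sh -server <zookeeper_ensemble>

# inside the zkCli shell: look first, then recursively delete
ls /brokers/topics
rmr /brokers/topics/test-1
rmr /config/topics/test-1    # topic config znode, present on 0.8.x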

Please read the Kafka documentation carefully and make sure you understand the configuration structure before deleting anything. Kafka is rolling out a topic-deletion feature (KAFKA-330), which should make this problem easier to solve.

Solution 2 - Apache Zookeeper

I had the same issue. It turns out that Kafka requires the machine's hostname to be resolvable in order to connect back to itself.

I updated the hostname on my machine and, after restarting Zookeeper and Kafka, the topic could be written to correctly.
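
A quick way to verify this on the broker machine (a sketch, assuming standard Linux tooling):

# the advertised name must resolve from the broker itself
hostname -f
ping -c 1 "$(hostname -f)"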

Solution 3 - Apache Zookeeper

I solved this problem by adding an entry to /etc/hosts mapping 127.0.0.1 to the fully qualified host name:

127.0.0.1       x4239433.your.domain.com x4239433

The producer and consumer then started working fine.

Solution 4 - Apache Zookeeper

I had the same problem. In the end I had to stop the Kafka nodes, then follow the advice here on how to delete Kafka topics. Once I had got rid of the broken topics, I was able to start Kafka again successfully.

I would like to know if there is a better approach, and how to avoid this happening in the future.

Solution 5 - Apache Zookeeper

I have run into this problem a couple of times and finally figured out why I was having the issue, so I am going to add my findings here too. I am on a Linux VM; the short answer is that I was having this issue because my VM got a new IP. If you look under the config directory and open up server.properties, you will see this line:

advertised.host.name=xx.xx.xx.xxx or localhost.

Make sure this IP matches your current IP; a quick way to check is sketched below.
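
A minimal check (a sketch; the property name is the one used by 0.8.x/0.9.x, and the config path assumes you are in the Kafka directory):

# what the broker advertises
grep advertised.host.name config/server.properties

# what the machine actually has right now (Linux)
hostname -I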

Once I fixed that, everything started to work properly. I am using version 0.9.0.0.

I hope this helps someone.

Solution 6 - Apache Zookeeper

I had the same problem; it was solved by changing the JDK from 1.7 to 1.6.

Solution 7 - Apache Zookeeper

I had the same problem. Make sure you have at least one topic on each partition your consumer/producer is using; Zookeeper will not find a leader for a partition if there are no topics using that partition.

Solution 8 - Apache Zookeeper

It was a problem with the JDK.

I had installed OpenJDK:

java version "1.7.0_51"
OpenJDK Runtime Environment (IcedTea 2.4.4) (7u51-2.4.4-0ubuntu0.12.04.2)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)

But I changed that to the Oracle JDK (following this link: http://www.webupd8.org/2012/06/how-to-install-oracle-java-7-in-debian.html):

java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
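
To verify which JDK is active and switch between installed ones, a sketch for Debian/Ubuntu (matching the linked guide):

# show the currently active JVM
java -version

# interactively pick among the installed JDKs
sudo update-alternatives --config java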

Now it works fine. Hope this helps.

Solution 9 - Apache Zookeeper

So, one more possible answer: the IP address in advertised.host.name in the Kafka config/server.properties may be mistyped with an extra space.

In my case I had

advertised.host.name=10.123.123.211_\n (where _ is an extra space)

instead of the correct

advertised.host.name=10.123.123.211\n

For some reason this had worked for six months without issues; presumably some library update removed the relaxed IP-address lookup that had been trimming off the extra space.

A simple fix of the config file and a restart of Kafka solves this problem.
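
Trailing whitespace like this is invisible in most editors; one way to expose it is a sketch using GNU cat -A, which marks the end of each line with $:

# a trailing space shows up as "...211 $" instead of "...211$"
grep advertised.host.name config/server.properties | cat -A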

Solution 10 - Apache Zookeeper

I faced exactly the same problem when I was trying to play with Kafka on my local system (Mac OS X El Capitan). The problem was with my Zookeeper: it was not referring to the correct config file. Restart Zookeeper, then Kafka, and execute the following command. Check that Leader is not none; if it is, delete that topic and re-create it.

kafka-topics --zookeeper localhost:2181 --describe --topic pytest

The output will look like:

Topic:pytest	PartitionCount:1	ReplicationFactor:1	Configs:
Topic: pytest	Partition: 0	Leader: 0	Replicas: 0	Isr: 0

I hope this helps.

Solution 11 - Apache Zookeeper

I faced the issue with Kafka and Zookeeper pods in OpenShift, where Kafka was TLS-enabled. I had to add the below environment variables to Kafka:

  • KAFKA_ZOOKEEPER_CONNECT

  • KAFKA_SSL_KEYSTORE_LOCATION

  • KAFKA_SSL_TRUSTSTORE_LOCATION

  • KAFKA_SSL_KEYSTORE_PASSWORD

  • KAFKA_SSL_TRUSTSTORE_PASSWORD

  • KAFKA_ADVERTISED_LISTENERS

  • KAFKA_INTER_BROKER_LISTENER_NAME

  • KAFKA_LISTENERS

After setting the variables, I had to delete and recreate the pods to make it work.
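
As a sketch only, with placeholder values: the variable names assume an image (such as the Confluent one) that maps KAFKA_* environment variables onto server.properties entries, so substitute your own hosts, ports, and keystore paths:

export KAFKA_ZOOKEEPER_CONNECT="zookeeper:2181"
export KAFKA_SSL_KEYSTORE_LOCATION="/etc/kafka/secrets/kafka.keystore.jks"
export KAFKA_SSL_KEYSTORE_PASSWORD="changeit"    # placeholder
export KAFKA_SSL_TRUSTSTORE_LOCATION="/etc/kafka/secrets/kafka.truststore.jks"
export KAFKA_SSL_TRUSTSTORE_PASSWORD="changeit"  # placeholder
export KAFKA_LISTENERS="SSL://0.0.0.0:9093"
export KAFKA_ADVERTISED_LISTENERS="SSL://kafka.example.com:9093"
export KAFKA_INTER_BROKER_LISTENER_NAME="SSL"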

Solution 12 - Apache Zookeeper

Add "advertised.host.name=localhost" in config/server.properties and restart the Kafka server. It worked for me

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user3561789 | View Question on Stackoverflow
Solution 1 - Apache Zookeeper | stanleyxu2005 | View Answer on Stackoverflow
Solution 2 - Apache Zookeeper | Lee Netherton | View Answer on Stackoverflow
Solution 3 - Apache Zookeeper | Igor Ostaptchenko | View Answer on Stackoverflow
Solution 4 - Apache Zookeeper | Mark Butler | View Answer on Stackoverflow
Solution 5 - Apache Zookeeper | Mhoque | View Answer on Stackoverflow
Solution 6 - Apache Zookeeper | user3603968 | View Answer on Stackoverflow
Solution 7 - Apache Zookeeper | Andrew | View Answer on Stackoverflow
Solution 8 - Apache Zookeeper | Ashwini Adlakha | View Answer on Stackoverflow
Solution 9 - Apache Zookeeper | Soren | View Answer on Stackoverflow
Solution 10 - Apache Zookeeper | suyash | View Answer on Stackoverflow
Solution 11 - Apache Zookeeper | Kannan Ramamoorthy | View Answer on Stackoverflow
Solution 12 - Apache Zookeeper | R Pidugu | View Answer on Stackoverflow