Hot questions for using Azure with Apache Kafka

Question:

I have about 150 TB of JSON documents stored on my personal Windows drives, and I am moving that data to a Microsoft Azure Storage account. I want to pull the JSON data and post it to Kafka, and then push it from Kafka to Couchbase using the Kafka Couchbase connector. What is the best approach and procedure, keeping data replication in mind?

Azure ---> Kafka ---> Couchbase

or Azure ---> Couchbase

or Windows drives ---> Couchbase


Answer:

Based on your needs, I can offer two alternatives.

The first option is to write your own program that reads data from Azure Blob Storage and pushes it to Kafka. You can run it as a WebJob in Azure App Service.

This option takes more time to implement, but it costs less. For more details on pushing data to Kafka from Java, see the Kafka producer documentation.
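The read-and-forward loop of the first option might look roughly like the sketch below. To stay free of external dependencies, it reads `.json` files from a local directory (standing in for blobs already fetched from Azure Blob Storage) and hands each one to a hypothetical `RecordSink` interface. `JsonForwarder`, `RecordSink`, and `forwardAll` are illustrative names, not part of any Kafka or Azure API; in a real job the sink would wrap a Kafka producer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.stream.Stream;

final class JsonForwarder {

    /** Hypothetical abstraction over the producer; not part of any Kafka API. */
    interface RecordSink {
        void send(String key, String json);
    }

    /**
     * Walks dir and forwards every *.json file to the sink, one document at a
     * time, using the file name as the record key. Returns the number forwarded.
     */
    static long forwardAll(Path dir, RecordSink sink) throws IOException {
        long count = 0;
        try (Stream<Path> walk = Files.walk(dir)) {
            Iterator<Path> it = walk
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".json"))
                    .iterator();
            while (it.hasNext()) {
                Path file = it.next();
                String json = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                sink.send(file.getFileName().toString(), json);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("json-docs");
        Files.write(dir.resolve("a.json"), "{\"id\":1}".getBytes(StandardCharsets.UTF_8));
        // Stand-in sink that only prints; a real sink would send to a Kafka topic.
        long n = forwardAll(dir, (key, json) -> System.out.println(key + " -> " + json));
        System.out.println(n + " document(s) forwarded");
    }
}
```

Processing files one at a time, rather than collecting them first, keeps memory use flat, which matters at the 150 TB scale mentioned in the question.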

The second option is to use the Azure HDInsight service and follow the official documentation to access data stored in Azure Blob Storage via the syntax: wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path>.
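As a small illustration of that syntax, the following sketch assembles a `wasb[s]` URI from its parts. The class name `WasbUri` and the container and account names are placeholders chosen for the example.

```java
final class WasbUri {

    /** Builds wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path>. */
    static String build(String container, String account, String path, boolean secure) {
        if (container == null || container.isEmpty() || account == null || account.isEmpty()) {
            throw new IllegalArgumentException("container and account are required");
        }
        String scheme = secure ? "wasbs" : "wasb";
        // Drop leading slashes so the result never contains "//" after the host.
        String cleanPath = path == null ? "" : path.replaceAll("^/+", "");
        return scheme + "://" + container + "@" + account + ".blob.core.windows.net/" + cleanPath;
    }

    public static void main(String[] args) {
        // "jsondocs" and "mystorageacct" are placeholder names.
        System.out.println(build("jsondocs", "mystorageacct", "2020/01/part-0001.json", true));
    }
}
```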

Then, download the HDFS (Sink) Connectors from this site to push the JSON data from HDInsight to Kafka.

This option saves time, but it costs more.

You could also refer to the SO thread "Kafka Connector for Azure Blob Storage" and choose one of the two options according to your needs.

Hope this helps.

Question:

Java 8, Flink 1.9.1, Azure Event Hubs

As of Jan 5th 2020, I can no longer connect to Azure Event Hubs with my Flink project. I was having the same issue with several Spring Boot apps, but it was resolved when I upgraded to Spring Boot 2.2.2, which also updated the Kafka clients and Kafka dependencies to 2.3.1. I have attempted to update Flink's Kafka dependencies without success. I've also submitted an issue:

https://issues.apache.org/jira/browse/FLINK-15557

2020-01-10 19:36:30,364 WARN org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-1, groupId=****] Bootstrap broker *****.servicebus.windows.net:9093 (id: -1 rack: null) disconnected

Connection properties:

"sasl.mechanism" = "PLAIN"
"security.protocol" = "SASL_SSL"
"sasl.jaas.config" = "Endpoint=sb://<FQDN>/;SharedAccessKeyName=<KeyName>;SharedAccessKey=<KeyValue>;EntityPath=<EntityValue>;"

Answer:

You are probably using an entity-level connection string, which is why your clients are seeing connection failures. The issue should be resolved when a namespace-level connection string is used.
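A minimal sketch of Kafka client properties along those lines is shown below. It assumes the JAAS format commonly documented for the Event Hubs Kafka endpoint (`PlainLoginModule` with the literal user name `$ConnectionString` and the connection string as the password) and treats any connection string containing `EntityPath` as entity-level. The class and method names are illustrative only.

```java
import java.util.Locale;
import java.util.Properties;

final class EventHubsKafkaConfig {

    /** True if the connection string is scoped to one event hub (has an EntityPath part). */
    static boolean isEntityLevel(String connectionString) {
        for (String part : connectionString.split(";")) {
            if (part.trim().toLowerCase(Locale.ROOT).startsWith("entitypath=")) {
                return true;
            }
        }
        return false;
    }

    /** Builds consumer/producer properties for a namespace-level connection string. */
    static Properties build(String namespaceFqdn, String connectionString) {
        if (isEntityLevel(connectionString)) {
            throw new IllegalArgumentException(
                    "Entity-level connection string detected; use a namespace-level one");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", namespaceFqdn + ":9093"); // port from the log above
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "PLAIN");
        props.setProperty("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"$ConnectionString\" "
                        + "password=\"" + connectionString + "\";");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own namespace and key.
        Properties props = build("<FQDN>",
                "Endpoint=sb://<FQDN>/;SharedAccessKeyName=<KeyName>;SharedAccessKey=<KeyValue>");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Failing fast on an `EntityPath` entry surfaces the problem at startup instead of as repeated broker disconnects in the log.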

Question:

Is there any API available for Azure Event Hubs to create an event hub via Java or shell, like the one for Amazon Kinesis shown below?

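import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.CreateStreamRequest;

AmazonKinesis kinesis = AmazonKinesisClientBuilder.standard().build();
CreateStreamRequest createStreamRequest = new CreateStreamRequest();
createStreamRequest.setStreamName(stream);
createStreamRequest.setShardCount(shards);
kinesis.createStream(createStreamRequest);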

Answer:

Here is the Azure Java management library sample for Event Hubs: https://github.com/Azure-Samples/eventhub-java-manage-event-hub

If you want to manage Event Hubs via SAS keys, your options are:

  1. CLI, PowerShell, or .NET libraries - https://github.com/Azure/azure-event-hubs/tree/master/samples/Management

  2. REST - https://docs.microsoft.com/en-us/rest/api/eventhub/event-hubs-management-rest
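For the SAS-key route, requests are typically authorized with a shared access signature token. The sketch below shows only the token-signing step in plain Java, assuming the usual scheme of an HMAC-SHA256 signature over the URL-encoded resource URI and the expiry time; it does not create an event hub by itself. The class name, namespace, hub, and key values are placeholders.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SasToken {

    /** Creates a SAS token for resourceUri that expires at the given Unix time (seconds). */
    static String create(String resourceUri, String keyName, String key, long expiryEpochSeconds)
            throws Exception {
        String encodedUri = URLEncoder.encode(resourceUri, "UTF-8");
        // The string to sign is the encoded URI, a newline, then the expiry.
        String stringToSign = encodedUri + "\n" + expiryEpochSeconds;

        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        String signature = Base64.getEncoder()
                .encodeToString(mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8)));

        return "SharedAccessSignature sr=" + encodedUri
                + "&sig=" + URLEncoder.encode(signature, "UTF-8")
                + "&se=" + expiryEpochSeconds
                + "&skn=" + keyName;
    }

    public static void main(String[] args) throws Exception {
        long expiry = System.currentTimeMillis() / 1000 + 3600; // valid for one hour
        // Placeholder namespace, hub, policy name, and key.
        System.out.println(create("https://myns.servicebus.windows.net/myhub",
                "RootManageSharedAccessKey", "placeholder-key", expiry));
    }
}
```

The resulting string is sent as the value of the `Authorization` header on the REST request.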