Hot questions for using Amazon S3 with Amazon Kinesis Firehose

Question:

I have been using AWS for the last six months and developed an application that sends batch put requests to Firehose. It worked fine until today, but when I redeployed it on my local system it failed with java.lang.ClassNotFoundException: com.amazonaws.ClientConfigurationFactory. I know what this error means, but why did I get this exception today? I am using the following dependencies in my project:

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <!-- <version>1.10.72</version> (used today only, for testing) -->
        <version>1.10.6</version>
    </dependency>
    <!-- <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
        <version>1.10.71</version>
    </dependency> -->
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-core</artifactId>
        <version>1.10.37</version>
        <optional>false</optional>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-kinesis</artifactId>
        <version>RELEASE</version>
    </dependency>

I searched for the ClientConfigurationFactory class but could not find it anywhere in my dependencies.

Where is this class located, and why did I get this error only today? I did not see it during my initial development six months ago, and I have not changed any dependencies or code today. I am not using this class directly in my project (I suspect the AWS SDK uses it internally).

Note: I am not asking whether I missed a dependency, because it was working fine before.

Please comment if anything is unclear. Thanks.


Answer:

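This is most likely caused by a mismatch between the AWS SDK versions you are including. Your active dependencies combine SDK versions 1.10.6, 1.10.37 and RELEASE, with 1.10.71 and 1.10.72 in the commented-out entries. Mixing versions like this is asking for trouble. In particular, RELEASE is resolved by Maven to the newest published version at build time, so a fresh build can pull a newer aws-java-sdk-kinesis that expects classes missing from the older aws-java-sdk-core you pinned, even though you changed nothing yourself. Change all of these to the same version and your problem will likely go away.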
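
As for where a class lives, one way to check is to ask the JVM at runtime which JAR, if any, supplies it. The following is a minimal sketch using only the JDK. The class name ClassLocator and the method locate are illustrative names, not part of the AWS SDK.

```java
import java.security.CodeSource;

class ClassLocator {
    /** Returns the location a class was loaded from, or a note if it cannot be loaded. */
    static String locate(String className) {
        try {
            // 'false' skips static initialization; we only want to find the class.
            Class<?> cls = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK usually report no code source.
            if (source == null || source.getLocation() == null) {
                return "JDK (no code source)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on the classpath";
        }
    }

    public static void main(String[] args) {
        // The class from the exception, plus one that has long been part of the core module.
        System.out.println(locate("com.amazonaws.ClientConfigurationFactory"));
        System.out.println(locate("com.amazonaws.ClientConfiguration"));
    }
}
```

Run inside the affected project, this should show whether the core JAR on the classpath contains the class that the newer service module expects.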

Question:

My pipeline is as follows:

Firehose -> Lambda (AWS' Java SDK) -> (S3 & Redshift)

An un-encoded (raw) JSON record is submitted to Firehose. It then triggers a Lambda function which transforms it slightly. Firehose then puts the transformed record into an S3 bucket and into Redshift.

For Firehose to add the transformed data to S3, it requires that the data be Base64 encoded (and Firehose decodes it before adding it to S3).

However, the data contains a URL, and after decoding, its = characters are replaced with the equivalent Unicode escape (\u003d), apparently because = is the character that Amazon's Base64 decoder uses as padding.

https://www.[snipped].com/...?returnurl\u003dnull\u0026referrer\u003dnull

How can I retain those = characters within the decoded data?

Note: I've tried using Base64.getUrlEncoder(), but AWS only seems to support Base64.getEncoder().


Answer:

It turns out that HTML escaping was enabled on the JSON library (Gson) that I was using when (de)serializing my Lambda record. To fix it, I just had to disable HTML escaping:

    new GsonBuilder().disableHtmlEscaping().create();
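
A quick check supports this: standard Base64 encoding and decoding returns the original bytes unchanged, so = and & inside the payload survive the round trip. The escapes \u003d and \u0026 are what Gson emits for those characters when HTML escaping is left on. A minimal sketch using only the JDK follows. The class name Base64RoundTrip and the example URL are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64RoundTrip {
    /** Encodes text with the standard Base64 encoder, then decodes it again. */
    static String roundTrip(String text) {
        String encoded = Base64.getEncoder()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical URL standing in for the snipped one in the question.
        String url = "https://www.example.com/page?returnurl=null&referrer=null";
        // Reports whether the decoded text matches the input exactly.
        System.out.println(roundTrip(url).equals(url));
    }
}
```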

Question:

I am getting the error below while processing records in an AWS Kinesis Firehose Lambda function that delivers to S3:

One or more record Ids were not returned. Ensure that the Lambda function returns all received record Ids.

Below is my code snippet:

    {
        List<KinesisFirehoseOutputRecord> results = event.getRecords().stream()
                .map(record -> {
                    KinesisFirehoseOutputRecord outRec = new KinesisFirehoseOutputRecord();
                    outRec.setRecordId(record.getRecordId());
                    outRec.setData(record.getData());
                    if (record.getData().toLowerCase().contains("moldovan")) {
                        outRec.setResult("Ok");
                    } else {
                        outRec.setResult("Dropped");
                    }
                    return outRec;
                }).collect(Collectors.toList());

        return new KinesisFirehoseResponse(results);
    }

If I remove the if/else condition and always call outRec.setResult("Ok"), it works as expected. Any idea how to resolve this?


Answer:

After the function runs, Firehose checks that the record IDs it gets back match the record IDs it sent to the function.

The check is made against the array returned at the end of the function.

If you drop a record during processing, you must still return it, with its result set to Dropped. The status model linked below defines three result values: Ok, Dropped and ProcessingFailed.

https://github.com/awsdocs/amazon-kinesis-data-firehose-developer-guide/blob/master/doc_source/data-transformation.md#data-transformation-status-model
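
The rule above can be sketched in plain Java. This is an illustration under stated assumptions, not the asker's actual classes: InRecord, OutRecord and returnsAllIds are hypothetical stand-ins, and the data is kept as plain text for readability, whereas real Firehose payloads are Base64 encoded.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class FirehoseStatusSketch {
    // Hypothetical stand-in for a record handed to the transformation function.
    static final class InRecord {
        final String recordId;
        final String data;
        InRecord(String recordId, String data) { this.recordId = recordId; this.data = data; }
    }

    // Hypothetical stand-in for a record returned to Firehose.
    static final class OutRecord {
        final String recordId;
        final String result; // "Ok", "Dropped" or "ProcessingFailed"
        final String data;
        OutRecord(String recordId, String result, String data) {
            this.recordId = recordId; this.result = result; this.data = data;
        }
    }

    /** Returns exactly one output record per input record; unwanted ones are marked Dropped. */
    static List<OutRecord> transform(List<InRecord> input) {
        List<OutRecord> output = new ArrayList<>();
        for (InRecord rec : input) {
            boolean keep = rec.data.toLowerCase(Locale.ROOT).contains("moldovan");
            output.add(new OutRecord(rec.recordId, keep ? "Ok" : "Dropped", rec.data));
        }
        return output;
    }

    /** Mirrors the check described above: every ID sent must come back. */
    static boolean returnsAllIds(List<InRecord> input, List<OutRecord> output) {
        Set<String> sent = new HashSet<>();
        for (InRecord rec : input) sent.add(rec.recordId);
        Set<String> returned = new HashSet<>();
        for (OutRecord rec : output) returned.add(rec.recordId);
        return sent.equals(returned);
    }

    public static void main(String[] args) {
        List<InRecord> input = new ArrayList<>();
        input.add(new InRecord("id-1", "Moldovan wine"));
        input.add(new InRecord("id-2", "something else"));
        List<OutRecord> output = transform(input);
        for (OutRecord rec : output) System.out.println(rec.recordId + " -> " + rec.result);
        System.out.println("all IDs returned: " + returnsAllIds(input, output));
    }
}
```

A filter step that removed unwanted records from the list, instead of marking them Dropped, would make returnsAllIds return false, which is the situation the error message describes.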