Hot questions for Using Amazon S3 in servlets

Question:

I need a servlet to return files from Amazon S3. Only the server has the credentials; the S3 bucket is not public, and I cannot change that. I was told to use data streams, but they are very slow. To test, I have a small project with thumbnails, and clicking one opens a new tab with the full image. A 5 MB image takes about a minute to load.

The function that reads from S3 and returns the data stream:

public void downloadDirectlyFromS3(String s3Path, String fileName, HttpServletResponse response) {
    AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
    s3Client.setEndpoint(S3ENDPOINT);

    S3Object s3object = s3Client.getObject(new GetObjectRequest(s3Path, fileName));

    byte[] buffer = new byte[5 * 1024 * 1024];

    try {
        InputStream input = s3object.getObjectContent();
        ServletOutputStream output = response.getOutputStream();
        for (int length = 0; (length = input.read(buffer)) > 0;) {
            output.write(buffer, 0, length);
        }
        output.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Answer:

There are two things that stand out that may be the cause of the problem.

public void downloadDirectlyFromS3(String s3Path, String fileName, HttpServletResponse response) {
    AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider()); // 1. new client for each request
    s3Client.setEndpoint(S3ENDPOINT);

    S3Object s3object = s3Client.getObject(new GetObjectRequest(s3Path, fileName)); // throws AmazonS3Exception if the key is missing; null only if request constraints are not met

    byte[] buffer = new byte[5 * 1024 * 1024];

    try {
        InputStream input = s3object.getObjectContent(); // 2. input stream is never closed
        ServletOutputStream output = response.getOutputStream();
        for (int length = 0; (length = input.read(buffer)) > 0;) {
            output.write(buffer, 0, length);
        }
        output.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The first change I would make is to create one client for the entire application and reuse it. This is probably the main cause of your issue, because a client created per request cannot reuse connections from earlier requests. AWS clients are considered thread-safe and can be used by multiple requests at the same time. The client handles connection pooling and reuse, which will help speed up multiple requests.
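One way to share a single client is the initialization-on-demand holder idiom. The sketch below is only an illustration: `FakeClient` is a hypothetical stand-in for `AmazonS3` so that it runs without the SDK, and `SharedClientDemo`, `client()` and `fetch()` are made-up names. In a real servlet the holder field would hold the actual S3 client instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedClientDemo {

    /** Hypothetical stand-in for AmazonS3, so the sketch runs without the SDK. */
    static final class FakeClient {
        static final AtomicInteger CREATED = new AtomicInteger();

        FakeClient() {
            CREATED.incrementAndGet(); // count how many clients get built
        }

        String fetch(String key) {
            return "content of " + key;
        }
    }

    /** The JVM initializes CLIENT once, on first access, in a thread-safe way. */
    private static final class Holder {
        static final FakeClient CLIENT = new FakeClient();
    }

    /** Every request handler calls this instead of constructing its own client. */
    static FakeClient client() {
        return Holder.CLIENT;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate several concurrent requests that all use the shared client.
        Thread[] requests = new Thread[8];
        for (int i = 0; i < requests.length; i++) {
            final int id = i;
            requests[i] = new Thread(() -> client().fetch("image-" + id));
            requests[i].start();
        }
        for (Thread t : requests) {
            t.join();
        }
        System.out.println("clients created: " + FakeClient.CREATED.get());
    }
}
```

However many requests run, only one client is constructed. A servlet could get the same effect by creating the client in `init()` and keeping it in a field.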

The second change would be to correctly close resources. input is never closed and output is not closed on exceptions. Consider using try-with-resources.

try (InputStream input = s3object.getObjectContent();
     ServletOutputStream output = response.getOutputStream()) {
    // copy input to output here, as in the original loop
} catch (FileNotFoundException e) {
    e.printStackTrace(); // not expected here: a missing key surfaces as AmazonS3Exception from getObject
} catch (IOException e) {
    e.printStackTrace(); // consider using a logger for exceptions
}

Also, according to the javadocs, getObject returns null only if constraints were specified on the request and not met. A missing object is reported as an AmazonS3Exception thrown by getObject, so you don't have to catch FileNotFoundException.
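The resource handling described above can be sketched as a small copy helper. This is a minimal version that assumes plain `java.io` streams: `StreamCopy` and `copy` are illustrative names, the 8 KB buffer is an arbitrary choice rather than a value from the question, and in the servlet the two streams would be `s3object.getObjectContent()` and `response.getOutputStream()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {

    // Assumed buffer size; a small buffer lets the first bytes reach the client early.
    private static final int BUFFER_SIZE = 8 * 1024;

    /** Copies everything from input to output and returns the number of bytes copied. */
    static long copy(InputStream input, OutputStream output) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int length;
        // read() returns -1 once the end of the stream is reached
        while ((length = input.read(buffer)) != -1) {
            output.write(buffer, 0, length);
            total += length;
        }
        output.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20_000];
        // try-with-resources closes both streams, even if copy() throws
        try (InputStream input = new ByteArrayInputStream(data);
             ByteArrayOutputStream output = new ByteArrayOutputStream()) {
            long copied = copy(input, output);
            System.out.println(copied + " bytes copied");
        }
    }
}
```

The helper leaves closing to the caller, so the try-with-resources block stays the single place where both streams are released.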

Another consideration is that the endpoint seems to be hardcoded. If the application runs on an EC2 instance and your development machine is configured correctly, you can simply use the default client.

AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();

The builder will look up the region, and with it the endpoint, for you.

When your application closes, consider calling s3Client.shutdown().

For more information I found this useful.

Question:

I am building an image server. There is a servlet that takes in an image, and the intention is to store the image in Amazon S3.

Part filePart = request.getPart("file");
InputStream inputStream = filePart.getInputStream();

How can I stream the contents to Amazon S3 without first writing the file to secondary storage?

The Amazon S3 API example is:

System.out.println("Uploading a new object to S3 from a file\n");
File file = new File(uploadFileName);
s3client.putObject(new PutObjectRequest(
        bucketName, keyName, file));

But this requires a file path on local storage. How can I transfer the image from the servlet directly to S3 without storing it on disk?


Answer:

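Yes, PutObjectRequest has another constructor that accepts an InputStream, so the upload can go to S3 directly. Set the content length on the ObjectMetadata (filePart.getSize() provides it); otherwise the SDK has to buffer the stream in memory to compute it.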

public PutObjectRequest(java.lang.String bucketName,
            java.lang.String key,
            java.io.InputStream input,
            ObjectMetadata metadata)

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html#PutObjectRequest(java.lang.String,%20java.lang.String,%20java.io.InputStream,%20com.amazonaws.services.s3.model.ObjectMetadata)