How to Read, Download, or Stream a Document from S3 Using AWS Lambda and Node.js

Sometimes we need to show stored static documents in a web application for review, or download them to the local file system. You should not keep all of those documents in your application's static folder; read them from server-side storage instead. Here I use an AWS S3 bucket, which is widely used for this purpose.

 


 

The Amazon S3 object store is one of the oldest services on AWS. S3 stands for Simple Storage Service. It's a web service that lets you store and retrieve data in an object store through an API reachable over HTTPS.

What do I want to achieve here?

- I have a PDF document in an AWS S3 bucket that I want to download in a client web application built with React. I also have a Node.js middleware layer that helps the React application read or download the file.

 


Let's Begin...

Step 1: Upload your file to the S3 bucket

You can upload files to the S3 bucket manually or programmatically. For a manual upload, you can use the drag-and-drop feature in the S3 console; if you want to upload files from the client system, I suggest going through another blog post that I wrote earlier.


Step 2: Set up your Node.js code for API Gateway and Lambda

Next, I wrote a Lambda function in Node.js. For Node.js with Lambda, you can check the following link to create Lambda-compatible code with Node.js and TypeScript.

That link gives you the context for setting up Node.js + TypeScript with Lambda. You need to do a few more things, though, in order to access a file in the S3 bucket from Node.js.

 

Add the following to your serverless.yml file:

service: download-file # Lambda service name

plugins:
  - serverless-offline
  - serverless-deployment-bucket

custom:
  headers:
    - Content-Type

provider:
  name: aws
  runtime: nodejs12.x
  region: ap-south-1
  profile: default
  deploymentBucket:
    name: <<Your Bucket Name>>
    serverSideEncryption: AES256

functions:
  readBooks:
    handler: <<Handler function name>>
    events:
      - http:
          path: read-books
          method: post
          cors:
            headers: ${self:custom.headers}
          private: false # no X-API-Key is required here; set this to true if you need one
    # Function-level IAM statements need the serverless-iam-roles-per-function plugin
    iamRoleStatementsName:
    iamRoleStatements:
      - Effect: Allow
        Action:
          - s3:GetObject
        # Your bucket resource ARN; the key prefix must match the folder used in the handler
        Resource: arn:aws:s3:::publicdocument/document/*

resources:
  Resources:
    NewResource:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: publicdocument # bucket name
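
For context, the http event in this configuration delivers each POST to read-books as an API Gateway proxy event, and the function has to answer with a proxy-style response object. The sketch below shows one minimal way to wire that up. It is not the original project's handler: createReadBooksHandler, jsonResponse, and the injected getDownloadUrl callback are hypothetical names, and the wildcard CORS origin is an assumption you would likely tighten.

```javascript
// Build an API Gateway proxy response with a JSON body.
// The wildcard origin is an assumption for illustration only.
function jsonResponse(statusCode, payload) {
  return {
    statusCode,
    headers: {
      'Content-Type': 'application/json',
      'Access-Control-Allow-Origin': '*',
    },
    body: JSON.stringify(payload),
  };
}

// Hypothetical factory: getDownloadUrl(fileName) is whatever function
// produces the value to return (for example, a pre-signed URL).
function createReadBooksHandler(getDownloadUrl) {
  return async function handler(event) {
    let requestBody;
    try {
      // With the proxy integration, the POST body arrives as a string.
      requestBody = JSON.parse(event.body || '{}') || {};
    } catch (error) {
      return jsonResponse(400, { message: 'Request body must be valid JSON' });
    }

    const { fileName } = requestBody;
    if (typeof fileName !== 'string' || fileName.length === 0) {
      return jsonResponse(400, { message: 'fileName is required' });
    }

    const url = await getDownloadUrl(fileName);
    return jsonResponse(200, { url });
  };
}
```

Passing the URL-producing function in as an argument keeps the request parsing separate from the S3 logic, so either of the options described later could be plugged in.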

 

 

You also need to install the following npm packages:

  • npm i aws-lambda @types/aws-lambda aws-sdk

Once done, open your code editor and update the code accordingly. You can choose either of the options below.

Option 1: Download the complete file in Node.js and then send the blob data to the client.

- In this case the file is downloaded completely in Node.js first, so it takes a long time if the file is large (> 20 MB). Lambda behind API Gateway also has some limitations:

  1. A synchronous Lambda invocation can return at most 6 MB in a single response, and base64-encoding binary data reduces the usable size further.
  2. API Gateway waits at most 29 seconds by default for the integration to respond (Lambda's own default timeout is only 3 seconds unless you raise it).
  3. With the API Gateway proxy integration used here, the response is returned as one buffered payload rather than streamed.

The following code downloads the complete file (not recommended):

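// Assumes the aws-sdk (v2) client is imported at the top of the file:
// import * as AWS from 'aws-sdk';
async servePDFStream() {
  const fileName = this.requestBody.fileName;
  const foldername = 'document';

  const params = {
    Bucket: 'publicdocument',
    Key: `${foldername}/${fileName}`,
  };

  const s3 = new AWS.S3({ region: process.env.AWS_REGION });

  // Loads the whole object into memory before returning it
  const { Body } = await s3.getObject(params).promise();

  return Body;
}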
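
Note that a raw Buffer cannot simply be handed back through API Gateway. With the Lambda proxy integration, binary data is normally returned base64-encoded with isBase64Encoded set to true, and API Gateway typically also needs binary media types configured. The following is a rough sketch of that wrapping step, not the original project's code: toBinaryResponse and maxResponseBytes are hypothetical names, and the caller supplies whatever size limit applies to their setup. Base64 grows the payload by roughly one third, which is why the check runs on the encoded length.

```javascript
// Wrap a file Buffer in an API Gateway proxy response.
// maxResponseBytes is a hypothetical parameter supplied by the caller.
function toBinaryResponse(fileBuffer, maxResponseBytes, contentType = 'application/pdf') {
  const encoded = fileBuffer.toString('base64');

  // Base64 output is ASCII, so its string length equals its size in bytes.
  if (encoded.length > maxResponseBytes) {
    return {
      statusCode: 413,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message: 'File is too large to return inline' }),
    };
  }

  return {
    statusCode: 200,
    headers: { 'Content-Type': contentType },
    isBase64Encoded: true,
    body: encoded,
  };
}
```

A file that fails this check is a good candidate for the pre-signed URL approach instead.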

 

 

Option 2: Get a pre-signed URL from S3 and share it with your client application. (Recommended)

The next option is to create a pre-signed URL for the document and share it with the client application. You can set how long the URL stays valid; once that time has elapsed, S3 rejects requests for the URL with a 403 Forbidden response.

// Assumes the aws-sdk (v2) client is imported at the top of the file:
// import * as AWS from 'aws-sdk';
async servePDFStream() {
  const fileName = this.requestBody.fileName;
  const foldername = 'document';

  const params = {
    Bucket: 'publicdocument',
    Key: `${foldername}/${fileName}`,
  };

  const s3 = new AWS.S3();

  // The URL stays valid for 2 minutes
  const signedUrlExpireSeconds = 60 * 2;

  const url = s3.getSignedUrl('getObject', {
    Bucket: params.Bucket,
    Key: params.Key,
    Expires: signedUrlExpireSeconds,
  });

  return url;
}
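
Because fileName comes straight from the request body, it can help to validate it before building the S3 key, so that only plain file names directly under the folder are requested and malformed input is rejected early. A minimal sketch, assuming files are stored with simple names and no sub-folders; buildObjectKey and the allowed-character pattern are illustrative choices, not part of the original code:

```javascript
// Build an S3 key under a fixed folder, rejecting anything that is not a
// plain file name. The allowed character set is an assumption; adjust it
// to match how your files are actually named.
function buildObjectKey(folderName, fileName) {
  const isPlainName =
    typeof fileName === 'string' &&
    /^[A-Za-z0-9][A-Za-z0-9._ -]*$/.test(fileName) &&
    !fileName.includes('..');

  if (!isPlainName) {
    throw new Error('Invalid file name');
  }
  return `${folderName}/${fileName}`;
}
```

The returned value could then be used as the Key when requesting the pre-signed URL.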

 

Never add the aws-sdk library to your client application just to fetch an S3 object. Accessing S3 that way requires your AWS access key and secret key, and those credentials should stay on the server side rather than being shipped to the client.
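
In practice this means the React app only calls your API and gets a URL back. Here is a minimal sketch of that call, assuming the endpoint responds with JSON of the form { url }. The response shape, the requestSignedUrl name, and the injectable fetchImpl parameter are assumptions for illustration, not something the original code defines.

```javascript
// Ask the backend for a pre-signed URL. No AWS credentials are involved
// on the client; only the file name is sent.
async function requestSignedUrl(apiUrl, fileName, fetchImpl = fetch) {
  const response = await fetchImpl(apiUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fileName }),
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  // Assumed response shape: { "url": "https://..." }
  const data = await response.json();
  return data.url;
}
```

The fetchImpl argument exists only so the function can be exercised with a stub in tests; in the browser you would normally omit it.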

 


Step 3: Download or read the document using the pre-signed URL

Once you have received the pre-signed URL, you can download the file or open it in a document viewer. In my use case, I download the PDF file and draw it on a canvas using PDF.js. That way the document stays secure and cannot be shared from one environment to another.

// setloading, isMobile, showPDFInViewer, convertBlobToBase64 and
// loadPDFWithBlob are assumed to be defined elsewhere in the component.
const downloadPDFFromURL = (url) => {
  const xhrObj = new XMLHttpRequest();
  xhrObj.open('GET', url, true);
  xhrObj.responseType = 'blob';

  xhrObj.addEventListener('loadstart', loadStartFunction, false);
  xhrObj.addEventListener('progress', progressFunction, false);
  xhrObj.addEventListener('error', downloadError, false);
  xhrObj.addEventListener('timeout', downloadTimeout, false);
  xhrObj.addEventListener('abort', downloadAbort, false);

  xhrObj.onreadystatechange = async () => {
    // Wait until the request has finished
    if (xhrObj.readyState !== XMLHttpRequest.DONE) {
      return;
    }
    try {
      if (xhrObj.status !== 200) {
        // For example, the pre-signed URL has expired or is invalid
        console.log('download error');
        setloading(false);
        return;
      }
      if (isMobile) {
        const blobData = new Blob([xhrObj.response], { type: 'application/pdf' });
        // This Blob can be passed to a PDF viewer or another file opener
        showPDFInViewer(blobData);
      } else {
        const pdfData = await convertBlobToBase64(xhrObj.response);
        loadPDFWithBlob(pdfData);
      }
    } catch (error) {
      console.error('File download exception: ', error);
      setloading(false);
    }
  };

  xhrObj.send(null);
};

 

const loadStartFunction = (event) => {
  console.log('File download started');
};

// Show the download progress as a percentage when the total size is known
const progressFunction = (event) => {
  if (event.lengthComputable) {
    const progress = Math.round((event.loaded / event.total) * 100) + '%';
    setprogressTxt(progress);
  }
};

const downloadError = () => {
  console.log('Network Error!');
};

const downloadTimeout = () => {
  console.log('Network Timeout!');
};

const downloadAbort = () => {
  console.log('Download Aborted!');
};
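
The download code calls convertBlobToBase64 without showing it. One possible implementation is sketched below. It is an assumption about what that helper does (here it returns the raw base64 string with no data: prefix), not the original code. It relies on Blob.arrayBuffer() and btoa, which are available in current browsers.

```javascript
// Convert a Blob to a base64 string (hypothetical version of the helper
// referenced in the download code).
async function convertBlobToBase64(blob) {
  const bytes = new Uint8Array(await blob.arrayBuffer());

  // Build the binary string in chunks so that large files do not exceed
  // the argument limit of String.fromCharCode.
  const chunkSize = 0x8000;
  let binary = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }

  return btoa(binary);
}
```

If your viewer expects a data URL instead, you could prepend the matching data: prefix for the file type to the returned string.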

 

I hope you like the content. If you have any suggestions, please comment below.

Thanks & Happy Coding!