ServerlessArchitecture#03 AWS Lambda, Lambda Execution Environment, Anatomy of a Lambda Function and Invocation Methods


What is AWS Lambda?

Since its launch in 2014, AWS Lambda has spread quickly among developers and cloud architects because it is easy to use and offers a significant cost benefit (you pay per use). AWS Lambda is a Function-as-a-Service (FaaS) computing platform provided by Amazon Web Services (AWS): it executes your code in the cloud and, like any serverless system, abstracts away the complexities of provisioning and managing cloud infrastructure.

It is commonly used when building microservices applications, but it also serves monolithic and other types of architectures.

There are several use cases for AWS Lambda and multiple benefits from using it; the use cases are listed in NOTE A below.


Refresher: Lambda Vs EC2

image.png

Note: the maximum timeout of a Lambda function has since changed to 900 seconds.



To summarize, this post covers:

  1. What is AWS Lambda?
  2. Where Lambda fits: commonly used for microservices applications, but it also serves monolithic and other types of architectures
  3. Lambda execution environment and available runtimes
  4. Building a custom runtime for AWS Lambda
  5. A sample Lambda function that is triggered automatically when you put an .xls file in an S3 bucket
  6. Anatomy of a Lambda function: {Handler Function, Event Object and Context Object}
  7. Invocation Methods: {Synchronous (RequestResponse), Asynchronous (Event), Dry Run (verification and validation only)}
  8. AWS Lambda Integrations
  9. How Lambda Works: A Bird's View
  10. AWS Lambda pricing and free tier

NOTE A: AWS Lambda Use Cases
Use Case 1: Swift Document Conversion
Use Case 2: Processing Video
Use Case 3: Operating Serverless Websites
Use Case 4: Security Alerts
Use Case 5: Automated File Synchronization
Use Case 6: Predictive Page Rendering



1.1 Lambda execution environment and available runtimes

  • Container based on a 64-bit Amazon Linux AMI
  • RAM: 128 MB to 10,240 MB, in 1 MB increments (before December 2020: up to 3,008 MB, in 64 MB increments)
  • CPU increases incrementally with RAM
  • Execution duration: up to 900 seconds (15 minutes)
  • Ephemeral disk space (/tmp): 512 MB by default
  • Compressed deployment package size: 50 MB
  • Uncompressed deployment package size: 250 MB

Lambda supports the following runtimes by default:

Runtime      Version
Python       3.8, 3.7, 3.6 and 2.7
Node.js      14.x, 12.x, 10.x
Java         8, 8 (Corretto), 11 (Corretto)
Go           1.x
Ruby         2.7, 2.5
.NET Core    3.1, 2.1 (C#/PowerShell)

Developers can implement any other custom runtime of their choosing. The custom runtime will run in the Lambda execution environment. It can be a shell script or an executable binary.


Building a custom runtime

A custom runtime's entry point is an executable file named bootstrap. The bootstrap file can be the runtime, or it can invoke another file that creates the runtime. The following example uses a bundled version of Node.js to run a JavaScript runtime in a separate file named runtime.js.

#!/bin/sh
cd $LAMBDA_TASK_ROOT
./node-v11.1.0-linux-x64/bin/node runtime.js

Your runtime code is responsible for completing some initialization tasks: { Retrieve settings, Initialize the function, Handle Errors }. Then it processes invocation events in a loop until it's terminated. The initialization tasks run once per instance of the function to prepare the environment to handle invocations.
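
The initialize-then-loop behaviour described above can be sketched in Python. This is a minimal illustration, not the runtime.js from the example: it assumes the documented Lambda Runtime API paths (version prefix 2018-06-01) and the AWS_LAMBDA_RUNTIME_API environment variable, while the names handle_one_invocation, make_http_transport, run_forever and example_handler are hypothetical.

```python
import json
import os
import urllib.request

API_VERSION = "2018-06-01"  # version prefix of the Lambda Runtime API paths


def handle_one_invocation(handler, fetch_next, post_result):
    """Process exactly one event: fetch it, run the handler, report the outcome.

    fetch_next() returns (request_id, event).
    post_result(request_id, kind, body) sends the outcome; kind is
    "response" on success or "error" on failure.
    """
    request_id, event = fetch_next()
    try:
        result = handler(event)
    except Exception as exc:
        # Report the failure instead of letting it end the processing loop
        post_result(request_id, "error",
                    {"errorMessage": str(exc), "errorType": type(exc).__name__})
    else:
        post_result(request_id, "response", result)
    return request_id


def make_http_transport(runtime_api):
    """Build fetch_next/post_result callables that talk to the Runtime API."""
    base = "http://{}/{}/runtime/invocation".format(runtime_api, API_VERSION)

    def fetch_next():
        # Blocks until Lambda has an event for this instance
        with urllib.request.urlopen(base + "/next") as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            return request_id, json.loads(resp.read())

    def post_result(request_id, kind, body):
        url = "{}/{}/{}".format(base, request_id, kind)
        data = json.dumps(body).encode("utf-8")
        req = urllib.request.Request(url, data=data, method="POST")
        urllib.request.urlopen(req).close()

    return fetch_next, post_result


def example_handler(event):
    # Hypothetical handler used only for illustration
    return {"echo": event}


def run_forever(handler):
    # Initialization: runs once per instance of the function
    fetch_next, post_result = make_http_transport(os.environ["AWS_LAMBDA_RUNTIME_API"])
    # Invocation loop: runs until the execution environment is terminated
    while True:
        handle_one_invocation(handler, fetch_next, post_result)
```

In a real custom runtime, the bootstrap file would start this script and call run_forever(example_handler). Passing fetch_next and post_result in as arguments is only a design choice that lets the per-event step be tried locally with stand-in functions.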


1.2 An Example Lambda Function

1.2.1 lambda function will be automatically triggered when you put .xls file in S3 bucket: Usage Instruction

*** Challenge for you: try to optimize this code in terms of caching, memory usage and processing time

Instructions
-------------
> Don't 'Test' this file by itself.
> This Lambda function is triggered automatically when you put an .xls file in the S3 bucket: read-excel-file-lambda-trigger-virtasant-test
> Your task is to read the content of this .xls file,
> then add a new column to it,
> then save the file in the Lambda /tmp directory,
> and finally upload the modified file as .csv to the same S3 bucket.

*** Note: our current xlrd version is 2.0.1. xlrd versions 2.0 and later no longer support the .xlsx extension; for that you have to install 'openpyxl' and pack it into your layer.

How I have created this Pandas,xlrd layer using Python 3.8
----------------------------------------------------------
> pip3 install pandas testresources xlrd -t build/python/lib/python3.8/site-packages --system

1.2.2 lambda function will be automatically triggered when you put .xls file in S3 bucket: Event Object

Event:  {'Records': #0
            [{
                'eventVersion': '2.1', 
                'eventSource': 'aws:s3', 
                'awsRegion': 'us-east-1', 
                'eventTime': '2021-02-04T12:50:07.629Z', 
                'eventName': 'ObjectCreated:Put', 
                'userIdentity': {'principalId': 'A37RXU4CDM1IE7'}, 
                'requestParameters': {
                    'sourceIPAddress': '26.147.204.47'
                }, 
                'responseElements': {'x-amz-request-id': 'D5948A9521C8DBB2', 'x-amz-id-2': '2He+GF44ooEVKJf378zmvwkn6J9S5nas7cze4ACJGkqSE0MsJa+wmVW4khrBTFSkt4ByDkT5jgqQAsp2CJnNokrhurBCUQRe'}, 
                's3': {
                        's3SchemaVersion': '1.0', 
                        'configurationId': '2906f736-b40b-45f0-ab14-5090d9b2ad18', 
                        'bucket': 
                                {
                                    'name': 'aws-lambda-trigger-virtasant', 
                                    'ownerIdentity': {'principalId': 'A37RXU4CDM1IE7'}, 
                                    'arn': 'arn:aws:s3:::aws-lambda-trigger-virtasant'
                                }, 
                        'object': 
                                {
                                    'key': 'problem_virtasant_fizzbuzz.py', 
                                    'size': 354, 
                                    'eTag': 'c36bf766b88721c84e6660bd73dd5eb4', 
                                    'sequencer': '00601BED83DCF96008'
                                }
                    }
                }]
        }
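
As a small sketch of how a handler might pull the bucket name and object key out of an event shaped like the one above: the helper name extract_s3_objects is hypothetical, and sample_event is a trimmed copy of the event shown here. One detail worth handling is that S3 delivers object keys URL-encoded, so a key containing spaces or parentheses needs decoding before it is passed to get_object.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return a list of (bucket, key) pairs, one per S3 record in the event."""
    pairs = []
    for record in event.get("Records", []):
        if record.get("eventSource") != "aws:s3":
            continue  # ignore records that did not come from S3
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded (a space becomes "+"), so decode them
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


# Trimmed copy of the event above, keeping only the fields used here
sample_event = {
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "aws-lambda-trigger-virtasant"},
            "object": {"key": "problem_virtasant_fizzbuzz.py", "size": 354},
        },
    }]
}

print(extract_s3_objects(sample_event))
# [('aws-lambda-trigger-virtasant', 'problem_virtasant_fizzbuzz.py')]
```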

1.2.3 lambda function will be automatically triggered when you put .xls file in S3 bucket: Response Returned

{
    'ResponseMetadata': 
    {
        'RequestId': 'F9FBB8B359C02B02', 
        'HostId': 'Ng9cyLHxXauyGwpkeEj9WiQi3ywVuFc4Jp7g4CyIzUX3TjxJYXbz9H2UrAw/Y0xhpFQ+9HJxnh4=', 
        'HTTPStatusCode': 200, 
        'HTTPHeaders': 
            {
                'x-amz-id-2': 'Ng9cyLHxXauyGwpkeEj9WiQi3ywVuFc4Jp7g4CyIzUX3TjxJYXbz9H2UrAw/Y0xhpFQ+9HJxnh4=', 
                'x-amz-request-id': 'F9FBB8B359C02B02', 
                'date': 'Thu, 04 Feb 2021 13:19:55 GMT', 
                'last-modified': 'Thu, 04 Feb 2021 13:19:51 GMT', 
                'etag': '"c36bf766b88721c84e6660bd73dd5eb4"', 
                'accept-ranges': 'bytes', 
                'content-type': 'text/x-python', 
                'content-length': '354', 'server': 'AmazonS3'
            }, 
        'RetryAttempts': 0
    }, 
    'AcceptRanges': 'bytes', 
    'LastModified': datetime.datetime(2021, 2, 4, 13, 19, 51, tzinfo=tzutc()), 
    'ContentLength': 354, 
    'ETag': '"c36bf766b88721c84e6660bd73dd5eb4"', 
    'ContentType': 'text/x-python', 
    'Metadata': {}, 
    'Body': <botocore.response.StreamingBody object at 0x7fdd5edde1c0>
}

1.2.4 lambda function will be automatically triggered when you put .xls file in S3 bucket: The Code

import json
import pandas as pd
import xlrd as xl
import numpy as np
import io
import boto3

def lambda_handler(event, context):
    #Step-1: Check if the event has been triggered
    s3_client=boto3.client('s3')
    if event:
        #Step-1.1: Event has been triggered, now read the byte contents of the .xls file uploaded into the S3 bucket
        print("Event S3 File/Upload/ReadFileName/ReadFileContent: ",event)
        s3_records=event['Records'][0] # this is a dict with keys
        '''
        dict_keys([ 'eventVersion', 
                    'eventSource', 
                    'awsRegion', 
                    'eventTime', 
                    'eventName', 
                    'userIdentity', 
                    'requestParameters', 
                    'responseElements', 
                    's3'])
        '''
        s3_bucket_name=s3_records['s3']['bucket']['name']
        s3_file_name=s3_records['s3']['object']['key']

        print("{} has been uploaded into {} ".format(s3_file_name,s3_bucket_name))

        s3_file_obj=s3_client.get_object(Bucket=s3_bucket_name,Key=s3_file_name) # this is a dictionary
        s3_file_content_as_bytes=s3_file_obj['Body'].read() # raw bytes; an Excel file is binary, so there is no .decode('utf-8') here

        #Step-1.2: wrap the bytes in a file-like object using io.BytesIO, since pandas.read_excel expects a path or a file-like object
        s3_file_content_as_file_like_object=io.BytesIO(s3_file_content_as_bytes)

        #Step-1.3: Using pandas read this excel data from the fileLikeObject generated using io
        df=pd.read_excel(s3_file_content_as_file_like_object)

        #Step-1.4 Create a new column named ProjectManager into this excel file using pandas
        df=df.assign(projectmanager="Jahidul Arafat")
        print(df)

        #Step-1.5: Save this file as csv into the lambda /tmp directory
        # Use only the last path component of the key, so that a key with a prefix (e.g. "reports/a.xls") still maps to a valid /tmp path
        s3_file_name_updated="update_{}.csv".format(s3_file_name.split('/')[-1])
        df.to_csv("/tmp/{}".format(s3_file_name_updated))
        '''
        if you want to save this file as .xlsx instead, use
        > df.to_excel()    --> this needs the openpyxl module, which is not installed in our lambda layer right now
        '''

        #Step-1.6: Upload this file in /tmp into S3 bucket using S3 resource
        '''
        Resources represent an object-oriented interface to Amazon Web Services (AWS). They provide a higher-level abstraction than the raw, 
        low-level calls made by service clients. To use resources, you invoke the resource() method of a Session and pass in a service name:

        # Get resources from the default session
            sqs = boto3.resource('sqs')
            s3  = boto3.resource('s3')
        '''
        s3_resource=boto3.resource('s3')
        s3_resource.Bucket(s3_bucket_name).upload_file("/tmp/{}".format(s3_file_name_updated),s3_file_name_updated)

    return "Simulated Pandas-Xlrd layer to read and modify .xlsx files"

1.3 Anatomy of a Lambda function

All Lambda functions consist of three key elements:

Key Element: Description

Handler function
- The main function that is executed each time an event occurs.
- Takes two arguments: an event object and a context object (the Node.js runtime also accepts an optional callback).

Event object
- The first argument to the handler function.
- Contains information about the invocation event.
- For example, an invocation event could be an API request event holding the HTTP request object, or an S3 notification such as the one in Section 1.2.

Context object
- Contains information about the Lambda runtime, such as the function name, version and memory limit.
- Provides a method that returns the number of milliseconds remaining before the function times out.
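
To make the three elements concrete, here is a minimal sketch of a Python handler that reads a few attributes of the context object. The attribute and method names (function_name, function_version, memory_limit_in_mb, get_remaining_time_in_millis) are the ones the Python runtime provides. FakeContext is a hypothetical stand-in so the handler can be tried outside Lambda.

```python
def lambda_handler(event, context):
    """Handler function: receives the event object and the context object."""
    return {
        # Information about the runtime, taken from the context object
        "function_name": context.function_name,
        "function_version": context.function_version,
        "memory_limit_mb": int(context.memory_limit_in_mb),
        # Time left before the configured timeout is reached
        "remaining_ms": context.get_remaining_time_in_millis(),
        # Information about the invocation, taken from the event object
        "record_count": len(event.get("Records", [])),
    }


class FakeContext:
    """Hypothetical stand-in for the real context object (local testing only)."""
    function_name = "demo-function"
    function_version = "$LATEST"
    memory_limit_in_mb = "128"

    def get_remaining_time_in_millis(self):
        return 3000


print(lambda_handler({"Records": [{}, {}]}, FakeContext()))
```

A common use of get_remaining_time_in_millis() is to stop a long-running loop early and save progress before the timeout cuts the function off.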

1.4 Invocation Methods

Developers can invoke an AWS Lambda function through an HTTP API. AWS also provides SDKs that wrap Lambda API endpoints and make it easier to interact with functions.

1.4.1 Synchronous

When a Lambda function is invoked synchronously, it will keep the connection open until the execution is finished. Once it’s done, it returns the response provided by the function’s code (or an error, if that’s the case).

"This is the default invocation type for all Lambda functions"

Although not necessary, to explicitly invoke a function synchronously, the API consumer provides the InvocationType parameter as RequestResponse.

1.4.2 Asynchronous

When you invoke a function asynchronously, Lambda sends the event to a queue.

  • A separate process reads events from the queue and runs your function.
  • When the event is successfully added to the queue, Lambda returns a success message to the client.

    To invoke a function asynchronously, the API client must explicitly specify the InvocationType parameter as Event.

1.4.3 Dry Run

This invocation type will validate the parameter values and verify whether the API client has appropriate permission to invoke the function.

It will not actually execute the function code, only carry out a verification and validation process.

Simply specify DryRun as the InvocationType parameter to invoke a function in this mode.
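
The three invocation types can be sketched with the boto3 Lambda client, whose invoke call accepts the InvocationType parameter described above. This is an illustration under assumptions: invoke_function is a hypothetical wrapper, and the client is passed in as an argument (for example boto3.client("lambda")) so that the snippet itself needs no AWS credentials to load.

```python
import json

# HTTP status code that Lambda returns for each invocation type
EXPECTED_STATUS = {"RequestResponse": 200, "Event": 202, "DryRun": 204}


def invoke_function(lambda_client, function_name, payload,
                    invocation_type="RequestResponse"):
    """Invoke a function and return its result (None unless synchronous)."""
    if invocation_type not in EXPECTED_STATUS:
        raise ValueError("unknown invocation type: {}".format(invocation_type))

    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType=invocation_type,
        Payload=json.dumps(payload).encode("utf-8"),
    )

    status = response["StatusCode"]
    if status != EXPECTED_STATUS[invocation_type]:
        raise RuntimeError("unexpected status code: {}".format(status))

    # Only a synchronous call carries the function's return value;
    # Event means "queued", DryRun means "validated, not executed"
    if invocation_type == "RequestResponse":
        return json.loads(response["Payload"].read())
    return None
```

With a real client this could be called as invoke_function(boto3.client("lambda"), "my-function", {"x": 1}, "Event"), where "my-function" is a placeholder name.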

1.5 AWS Lambda Integrations

Below is a list of services that integrate with Lambda. They invoke a given function either synchronously or asynchronously, depending on the service.

Service: Description

S3 (Simple Storage Service)
- S3 is a scalable and highly durable storage service by AWS. It can hold anything from images and videos to text and columnar data files.
- The main operations in S3 are PUT (save or update an object), GET (retrieve an object) and DELETE (remove an object).
- It can automatically send asynchronous event messages to AWS Lambda whenever an object is stored or deleted.

DynamoDB
- DynamoDB is a distributed, highly scalable NoSQL database by AWS.
- By enabling DynamoDB Streams, write operations (insert, update, delete) against a DynamoDB table are stored in a stream of events.
- Lambda polls the stream and invokes a function with batches of these events, so the function can perform any kind of job in response to the database changes.

SQS (Simple Queue Service)
- SQS is a message queue service by AWS. Lambda can poll the queue and invoke a given function synchronously.
- Messages are read from the queue in batches to improve performance and reduce costs.
- When the function successfully processes the queue messages, Lambda deletes them from the queue.

API Gateway
- Lambda can be used as the backend processing platform to answer REST API requests received by an API Gateway.
- Each request method (GET, PUT, POST) can be matched to a different Lambda function, or a single function can serve all requests from an endpoint (or a group of endpoints).
- API Gateway invokes Lambda synchronously.
- Beware that, even though the Lambda timeout limit is 15 minutes, API Gateway is limited to 29 seconds.

Load Balancer
- Similarly to the API Gateway integration, Lambda can also serve HTTP requests received by an Application Load Balancer (ALB).
- Like API Gateway, the ALB invokes the Lambda function synchronously.
- ALB does not have a hard timeout limit. Processing time is limited to the Lambda timeout of 15 minutes.

SES (Simple Email Service)
- SES is an email sending and receiving service by AWS.
- When configured to receive messages, it can invoke a Lambda function, passing the received email as a parameter.
- SES can invoke in both synchronous and asynchronous modes, depending on how the integration is intended to work.

CloudFront / Lambda@Edge
- CloudFront is a content delivery network (CDN) service by AWS. It serves static content efficiently by distributing and replicating it across hundreds of points of presence that are geographically closer to application users.
- Lambda@Edge is an integration between CloudFront and Lambda that makes it possible to customize content delivered through the CDN. This can be used to inspect cookies and customize content returned by CloudFront, or to manipulate headers in the CloudFront HTTP responses.

Kinesis Firehose (Stream Processing)
- Amazon Kinesis Data Firehose can automatically transform data received through streams.
- By integrating with AWS Lambda, it's possible to implement custom transformation processes.

Other Integrations
- Amazon Cognito
- Amazon Lex
- Amazon Alexa
- AWS Step Functions
- Amazon Simple Notification Service
- AWS CloudFormation
- Amazon CloudWatch Logs
- Amazon CloudWatch Events
- AWS CodeCommit
- AWS Config
- AWS IoT Events
1.6 How Lambda Works: A Bird's View

image.png

1.7 AWS Lambda pricing and free tier

Lambda functions are billed by the number of requests and by compute time measured in GB-seconds: the execution duration, rounded up to the nearest 1 ms, multiplied by the memory allocated to the function. Lambda also comes with a free tier of 1 million requests and 400,000 GB-seconds of compute time each month.

After the free tier, it costs $0.20 per 1 million requests and $0.00001667 per GB-second. For further details check out the Lambda Pricing Page or the Lambda Cost Calculator.
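
The arithmetic can be sketched as a small Python function. It uses only the prices and free-tier figures quoted in this section, and assumes they apply uniformly, which real bills may not (prices can vary by region and architecture). The name monthly_cost and the example workload are hypothetical.

```python
FREE_REQUESTS = 1_000_000          # free requests per month
FREE_GB_SECONDS = 400_000          # free compute per month
PRICE_PER_MILLION_REQUESTS = 0.20  # USD, as quoted in this section
PRICE_PER_GB_SECOND = 0.00001667   # USD, as quoted in this section


def monthly_cost(requests, avg_duration_ms, memory_mb):
    """Estimate the monthly Lambda bill in USD, rounded to cents."""
    # GB-seconds = invocations x duration in seconds x memory in GB
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)

    # Only usage above the free tier is billed
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)

    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + compute_cost, 2)


# 5 million requests per month, 200 ms each, 512 MB of memory:
# 500,000 GB-seconds in total, of which 100,000 are billable
print(monthly_cost(5_000_000, 200, 512))  # 2.47
```

Here the request charge is 4 million billable requests at $0.20 per million ($0.80), and the compute charge is 100,000 GB-seconds at $0.00001667 (about $1.67).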


AWS Lambda Use Cases

Here we will present you with some of the best AWS Lambda use cases.

Use Case 1: Swift Document Conversion

For those who provide documents to users, there is always a problem: no single document format satisfies the needs of all users. Some want HTML, while others want to download a PDF file or an even more specialized document format.

You can, of course, create and store copies in every format that users are likely to request. However, you will soon find that this takes a substantial amount of storage capacity, which is costly to extend. Alternatively, you can save those resources by using AWS Lambda, which can swiftly retrieve the required document, format and convert it, and finally serve it back to the user for download or for display in a page.

Use Case 2: Processing Video

This case is useful for those of you who store video files in an S3 bucket. Suppose that today you have an instance that polls the bucket on a regular basis. It sits idle until a new file is uploaded, then downloads the file, manipulates it in some way, and pushes it back to your origin server.

The problem is that the instance sits idle for a good part of the day, doing nothing because no new video is being uploaded. That costs you money, and compute resources are in use 24/7 for no good reason.

Using Lambda in this case allows you to upload your script (Java, JavaScript, Python, etc.) and configure it to be triggered by an event whenever a new file is uploaded to the bucket. The file is then processed and pushed back to your origin server. That is how easily Lambda can save your resources.

Use Case 3: Operating Serverless Websites

This might be the best way to take advantage of the pricing model of Lambda and S3 hosted static websites.

Think about hosting the web frontend on S3, and accelerating content transmission with CloudFront caching. The web frontend can send requests to Lambda functions using the API Gateway HTTPS endpoints. Lambda can handle the application logic and persist data to a fully managed database service (RDS for relational, or DynamoDB for a non-relational database).

Your Lambda functions and databases can be hosted within a VPC to separate them from other networks. For Lambda, S3 and API Gateway, you are charged according to the traffic incurred, so the only fixed expense is running the database service.

Use Case 4: Security Alerts

Do you need to be aware of any security breaches in your cloud infrastructure? Is it imperative to your overall cloud strategy to monitor your logs and keep an audit trail? Lambda can help you a lot in this situation.

You can write a Lambda function that reacts to a specific event in your CloudWatch/CloudTrail AWS activity logs and sends an alert to your designated on-call staff via email, or you could even write code that triggers AWS Lambda to call you on your phone.

Use Case 5: Automated File Synchronization

If you have a repository that needs to be regularly synchronized to a number of locations several times a day you might find AWS Lambda very useful. Many people don’t use a dedicated instance for things like this. Often they go and double, or even triple up on another instance and just assign this task to it. Depending on a security policy or contention for resources at specific periods during the day this will prove not to be the best option.

Instead of keeping an instance up and running all day long, chewing away at your budget every month, you can use a Lambda function that will be triggered by a scheduled event, run your synchronization job, and then just go away until the next cycle that it needs to run.

Use Case 6: Predictive Page Rendering

Are you using predictive page rendering to ready web pages for display based on the probability that the user will select them? You can use a Lambda-based application to retrieve documents and multimedia files, which might be used by the next page requested, and to conduct the beginning stages of rendering them for display.

If multimedia data is being served by an outside source, the AWS Lambda application can even check for its availability, and attempt to use alternate sources if they are not available.

AWS Lambda should be one of your prime go-to resources for approaching repetitive or time-consuming tasks, along with the other heavy-lifting jobs of the data-processing world.