Bachelor Informatica

Cookery in AWS Lambda

Timo Dobber

June 8, 2017

Supervisor(s): Adam Belloum, Mikołaj Baranowski

Signed: A.s.Z Belloum

Informatica, University of Amsterdam

Abstract

Cloud computing is becoming an increasingly prominent factor in daily life, but programming in the cloud is often hard and complicated. Cookery has been developed for this reason: to make developing and connecting cloud applications easier. However, most of the people who could develop applications using Cookery still do not have the financial means to realise them. This is where AWS Lambda comes in. The combination of AWS Lambda and Cookery makes it possible for people who do not have any programming experience to program in the cloud at relatively low cost. In this paper, we present a way to combine AWS Lambda and Cookery. We develop two functionalities that can be embedded in Cookery and later be extended when necessary. The result opens up a new field for developers who lack the financial support or the programming experience.

Contents

1 Introduction
  1.1 Research question
  1.2 Related work

2 Theoretical background
  2.1 Cloud computing
  2.2 Software-as-a-Service
  2.3 Function-as-a-Service

3 Cookery

4 Amazon Web Services
  4.1 AWS Lambda
  4.2 AWS CloudWatch
  4.3 boto3

5 Implementation
  5.1 Deployment
  5.2 Scheduling
  5.3 Use-cases

6 Results

7 Conclusion and Discussion
  7.1 Conclusion
  7.2 Discussion
  7.3 Future work

CHAPTER 1

Introduction

Cloud computing is becoming a more and more prominent factor in daily life. People use the cloud almost every day, sometimes without even knowing it. The cloud comes in many varieties, like storage services, computation services, video streaming services and much more. Dropbox [10] and Google Drive [16] are examples of cloud storage services that are widely used, with Dropbox claiming to have half a billion users in March 2016 [9]. At the same time, more and more people use the cloud for business purposes. In the cloud, it is easier to collaborate across geographically distributed locations. The Harvard Business Review report Cloud Computing Comes of Age [25] states that cloud software greatly reduces implementation time and does not need a big up-front investment. It also states that a cloud provider could have an application up and running in five weeks, contrary to the 18 months that it would take according to the IT business. On the other hand, the same report shows that the security of these cloud services is the biggest barrier.

Programming in the cloud, however, has become more difficult and complex, due to the many programming languages available and the extensive documentation that accompanies the APIs (Application Programming Interfaces). In order to make programming in the cloud easier, Cookery has been developed in the context of PhD research work at the UvA [6]. Cookery enables developers to combine multiple cloud applications in an easy way. On top of that, Cookery uses its own Domain Specific Language (DSL) to make it more accessible for people without any programming experience.

Many small start-ups cannot afford their own server; hence the cloud approach provides a solution. They can buy some computing time or storage somewhere in the cloud, which will be cheaper at the beginning than buying a server. Running a service in the cloud brings the next problem: you need to build a software infrastructure to handle all the requests your application will receive. This is a long and tedious job that not everybody can do, certainly not all small start-ups. This problem is solved by the introduction of Function-as-a-Service (FaaS), also known as serverless computing.

1.1 Research question

Function-as-a-Service makes it possible for smaller businesses to create an application in the cloud. Cookery, in turn, makes this easier for people who do not have any programming experience. Combined, these two would be a powerful tool for start-ups that lack the financial support or programming knowledge to create and manage their own applications. This leads us to the following research question:

How can we develop a framework based on Cookery and running on AWS Lambda?

With this research question, we aim to develop a framework that combines the advantages of serverless computing with the advantages of Cookery, such as its ease of use for end users. This can be achieved by extending the toolkit of Cookery with new implementations that connect with AWS Lambda [4].

This research question also brings some challenges with it. Cookery offers great opportunities to make programming easier, but it is a new framework and its toolkit contains little beyond some basic features, such as creating a new Cookery project, which instantiates a directory with the necessary files. Every new functionality therefore has to be developed from scratch, while keeping in mind that it can be extended later. AWS Lambda also has some limitations, such as the maximum time a function may run and the maximum allocation space. We also need to know how AWS Lambda handles errors and failures. Another challenge is security, as mentioned earlier in the introduction: how can we guarantee security on AWS Lambda and with the cloud service providers we reach out to?

1.2 Related work

Although the principle of Function-as-a-Service and AWS Lambda have existed for a while, there is not much research about rewriting applications in order to operate with AWS Lambda.

Spillner and Dorodko [26] have researched a tool that analyses Java code and transforms it into AWS Lambda functions. The results showed that simple and heterogeneous code can be transformed without problems, which suggests that Cookery will be able to run functions on AWS Lambda.

In Serverless Computation with OpenLambda, Hendrickson et al. [18] present a new platform for building applications in the serverless computation model. They also describe some key aspects of serverless computation and show some performance measurements of AWS Lambda.

Other studies report the findings of creating a service using serverless computation. Yan builds a chatbot using IBM OpenWhisk [19], an alternative to AWS Lambda [29]. Kiran uses a lambda service to construct a data-handling backend with high throughput but low costs [22]. Malawski presents an approach to combining scientific workflows with serverless computation, considering multiple architectural designs [23]. The results showed that the prototype scientific workflow on Google Cloud Functions [15], another alternative to AWS Lambda, introduced no significant delays. The paper also states that working with the serverless computation model can bring complications for bigger workflows that run for more than 5 minutes, and that preparing more complex applications to execute on a serverless computation service can be an issue.

In chapter 2 we discuss the theoretical background, which focuses on cloud computing and its different services. An in-depth view of Cookery is given in chapter 3. In chapter 4 we introduce AWS and the different services that are needed for this project. Our implementation is described in chapter 5, where we explain in detail what we have done in this project. The results are presented and discussed in chapter 6, while in chapter 7 we give a conclusion and discuss our project and what it enables for other people.

CHAPTER 2

Theoretical background

2.1 Cloud computing

In A View of Cloud Computing [2] it is stated that "cloud computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the data-centres that provide those services", where the services are referred to as Software-as-a-Service. The cloud would then be the data-centre hardware and software, and the service being sold is utility computing. In the end, it is stated that cloud computing is the sum of Software-as-a-Service and utility computing.

While this definition is a bit vague, it does show what cloud computing is. Another definition is given by Wang et al., in the early days of cloud computing: "A computing Cloud is a set of network enabled services, providing scalable, QoS (Quality of Service) guaranteed, normally personalized, inexpensive computing infrastructures on demand, which could be accessed in a simple and pervasive way" [28].

Cloud computing is thus a collection of multiple services, which are offered via the Internet. The cloud service offerings are divided into three main categories, namely SaaS (Software-as-a-Service), IaaS (Infrastructure-as-a-Service) and PaaS (Platform-as-a-Service). In the last few years a new service has been developed, which came to be known as Function-as-a-Service or FaaS. One of the reasons cloud computing has become so big is that the user does not need to download any software, but instead uses it via the Internet. This makes cloud computing extremely scalable and flexible, because it takes away the concerns of geographical location, hardware performance and software configuration [28].

2.2 Software-as-a-Service

Software-as-a-Service (SaaS) is basically what the name implies: software that is offered online as a service, with or without a subscription or on-demand payment. A user does not have to worry about downloading, installing, setting up, running and updating the program, because the service provider does that. An example of this is Netflix, a complete video streaming service offered with a subscription and accessible via a browser. SaaS can thus be seen as the application layer of cloud computing. While this is mostly beneficial for the end user, SaaS is also a solution for the IT business, as mentioned in the introduction.

2.3 Function-as-a-Service

The more recently established Function-as-a-Service (FaaS) is somewhat smaller in scope than Software-as-a-Service. Again, it is exactly what the name says: with FaaS a user only runs a function on an external server. This is also called serverless computing, because you do not need a server of your own for your application anymore; you simply have the function running on a server in the cloud. A user of the application provides the input and the function returns the output. This makes it possible for smaller businesses to develop their own application without buying servers. Developers also do not need a system administrator to maintain the servers, they do not need to write a complete infrastructure that can scale with the demands of the application, and they do not need to handle all the administration. As a result, applications can scale up rapidly without needing to start new servers [18].

In the article Serverless Computation with OpenLambda [18] the authors show that we have reached a new stage in the sharing model with FaaS, which is shown in figure 2.1. As seen in the figure, we have progressed from only sharing the hardware, which is done with virtual machines like VMWare, via sharing the hardware and the operating system, as seen with containers like Docker, to sharing the runtime of a system.

The handler is started in a container, which can only be used by the handler itself. Although multiple containers run in the same runtime, communication between containers is not possible. Otherwise, other functions would be able to intercept your functions and gain access to valuable information.

On the other hand, a user has to be aware of some places where performance issues can arise. The readiness latency, the time it takes to start, restart or unpause a container, can have consequences for the overall performance [18]. Other factors include the number of containers per unit of memory (container density), package support, cookies and sessions.

A study has compared the cost, performance and response time of different implementation architectures: a monolithic architecture, a microservice architecture operated by the cloud customer and a microservice architecture operated by AWS Lambda. With the microservice architecture a developer tries to build an application as a suite of small services [11], which each run in their own process. The results of this study show that a microservice operated by AWS Lambda is up to 77.08% cheaper per million requests than the other two methods, while the response times are faster than the cloud-customer-operated microservice architecture and about the same as the monolithic architecture [27].

There are multiple online platforms that offer FaaS. Some examples are Google Cloud Functions [15], Microsoft Azure [24], AWS Lambda and IBM OpenWhisk [19]. AWS Lambda is chosen for this project because, at the time of writing, it was the only one to offer the service in combination with Python.

Figure 2.1: The grey areas are shared [18].

CHAPTER 3

Cookery

Baranowski, in the context of his PhD work, developed Cookery, a framework that makes programming with other cloud applications a lot easier [6]. Cookery makes it possible for people to combine cloud services using a high-level language. This high-level language, the Cookery language, has a syntax that resembles English, which makes it possible for people without any programming experience to make applications and understand what is going on. An example of the Cookery language is: A = split File text_file.txt.

When we take a closer look at Cookery, we can see that it is composed of three layers, which can be seen in figure 3.1. The first layer is used by a user to develop Cookery applications using the Cookery language. In this layer activities can be defined and modified. The second layer is for developers and, unlike the previous layer, makes use of the Cookery Domain Specific Language (DSL). This layer is used for defining and modifying actions, subjects and conditions. The third and last layer is the Cookery back-end and is also intended for developers. Here developers can implement protocols, which are for the activities and data, and communication with execution environments.

As mentioned above, the Cookery language has a syntax based on English, which allows people not familiar with programming to understand what is happening. The sentences that one can make are called activities. These activities consist of other elements: variables, actions, subjects and conditions. Each of these elements has its own role within the Cookery language. Variables are optional; they assign the result of an activity to a label that can later be used as a reference, thus representing the data flows. An action refers to its implementation in the Cookery DSL and represents a remote operation. Subjects represent the input or output of an application, also known as remote data. They can, for example, be used for retrieving data from a cloud service. Both actions and subjects can be followed by arguments, and both implementations are divided between all three layers. Conditions are introduced by keywords, like if or with, to separate them from the rest of an activity. They are routines defined in the Cookery DSL and are meant to transform data before it is passed to an action. This data can be retrieved in different ways, including from a remote location in a subject or from a variable.

The middle layer provides the Cookery DSL, which allows users to define actions, subjects and conditions (Cookery elements). These elements all consist of a name, a regular expression and a Python routine. Cookery comes with a toolkit in order to make things easy. The toolkit enables a user to execute Cookery applications, generate new projects and evaluate expressions.
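To make the structure of an activity more concrete, the following is a minimal Python sketch. It is not Cookery's actual implementation or API. It only assumes, as described above, that every element has a name, a regular expression and a Python routine. The names Element, ACTIONS, SUBJECTS, FAKE_REMOTE_FILES and run_activity are invented for illustration, and the "remote data" is simulated with an in-memory dictionary.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Element:
    """Hypothetical stand-in for a Cookery element."""
    name: str
    pattern: str       # regular expression that recognises the element in a sentence
    routine: Callable  # Python routine that implements the element


# Simulated remote data; a real subject would contact a cloud service.
FAKE_REMOTE_FILES = {"text_file.txt": "one two three"}

SUBJECTS = [
    Element("File", r"File\s+(?P<arg>\S+)", lambda name: FAKE_REMOTE_FILES[name]),
]
ACTIONS = [
    Element("split", r"split", lambda data: data.split()),
]


def run_activity(sentence, variables):
    """Execute one activity of the form '[variable =] action subject argument'."""
    parsed = re.fullmatch(r"(?:(\w+)\s*=\s*)?(.*)", sentence.strip())
    label, rest = parsed.group(1), parsed.group(2)

    # Find the action that starts the sentence.
    for action in ACTIONS:
        found = re.match(action.pattern + r"\s+", rest)
        if found:
            remainder = rest[found.end():]
            break
    else:
        raise ValueError("unknown action in: " + sentence)

    # The subject supplies the data that the action works on.
    for subject in SUBJECTS:
        found = re.fullmatch(subject.pattern, remainder)
        if found:
            data = subject.routine(found.group("arg"))
            break
    else:
        raise ValueError("unknown subject in: " + sentence)

    result = action.routine(data)
    if label:
        variables[label] = result  # the variable records the data flow
    return result
```

Calling run_activity("A = split File text_file.txt", variables) on an empty dictionary stores the list of words under the label A. Conditions are left out of this sketch to keep it short.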

Figure 3.1: User roles and layers in Cookery [6]. The first layer is for developing applications with the Cookery language, the second layer is for defining actions, subjects and conditions using the Cookery DSL, and the third layer is the back-end where protocols are implemented.

CHAPTER 4

Amazon Web Services

Amazon responded early to the rise of cloud computing. In 2006, it started Amazon Web Services, offering IT infrastructure services to businesses in the form of web services [1]. AWS currently offers more than 70 services across many fields in the IT business, like artificial intelligence, storage, security, computing and many more. The possibility to connect most of the services makes AWS a very robust platform. Furthermore, AWS uses a pay-per-use billing model, which makes the platform very interesting from a financial point of view. The services that are of interest for this project are AWS Lambda and AWS CloudWatch, which are explained in more detail below, and AWS S3.

4.1 AWS Lambda

AWS Lambda is Amazon's service for serverless computation. It can be accessed via Amazon's web console, where all other AWS services can also be found, via the AWS Command Line Interface (CLI), or via the API using the boto3 package [7]. The latter allows programmers to connect with the AWS services from their own programs. The functions on AWS Lambda can be triggered in multiple ways. One way is via a website connecting with AWS API Gateway. Functions can also be manually invoked using boto3, as mentioned above. Finally, they can be triggered by so-called events, like a file being uploaded to AWS S3 or a schedule that is set in AWS CloudWatch. Each invoked function spawns a container to be executed in. While this helps against security issues like hijacking someone else's functions, it also blocks the possibility of letting your functions communicate with each other. The functions are also stateless, which means that results need to be returned or saved to a database in order to continue working with them. This is not particularly an issue for smaller applications, but for applications that run longer than 5 minutes, the maximum time a function can run on AWS Lambda, this can be troublesome. Another possible obstacle is that only the Python standard library and boto3 can be imported on AWS Lambda. Every other library that is needed has to be included in the deployment package.
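The statelessness and the manual invocation described above can be illustrated with a small sketch. It assumes a hypothetical function that sums a list of numbers; the names handler and invoke_function are our own, and lambda_client is expected to be the object returned by boto3.client("lambda"). Because nothing survives between invocations, the handler returns its result to the caller.

```python
import json


def handler(event, context):
    """Entry point in the form AWS Lambda expects for Python functions."""
    # No state is kept between invocations, so the result is returned
    # to the caller instead of being stored in a variable.
    numbers = event.get("numbers", [])
    return {"total": sum(numbers)}


def invoke_function(lambda_client, function_name, payload):
    """Synchronously invoke a deployed function and decode its JSON result.

    lambda_client is assumed to be boto3.client("lambda").
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The result is delivered as a stream that has to be read and decoded.
    return json.loads(response["Payload"].read().decode("utf-8"))
```

Passing the client in as a parameter keeps the sketch independent of how the credentials are configured and makes it possible to try the function with a stand-in object.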

4.2 AWS CloudWatch

AWS CloudWatch monitors operational and performance metrics of AWS cloud services, including AWS Lambda. It can be used to read the logs of all functions and see whether functions terminated successfully or not. It can also be used to create rules, for periodic invocations of Lambda functions and for invocations when a certain event pattern is matched, or to set alarms, for example to get an email whenever a function runs for more than a certain number of seconds. When a rule is triggered, it creates an event to invoke a Lambda function. We use AWS CloudWatch to make rules so that we can schedule our Lambda functions. This way, we can automatically run Lambda functions every day or every 2 hours, for example. It is also possible to run a function every 5 minutes, which allows us to run functions the whole day.

4.3 boto3

An extra Python library is needed in order to connect with AWS from Python scripts: boto3, the AWS SDK for Python, which serves as the API (Application Programming Interface) for AWS. We use this instead of the aforementioned console, which can be considered the front-end of the API. In order to connect to AWS services, you need to create a client or resource for a specific cloud service, like CloudWatch, CloudWatch Events or Lambda, as seen in listing 1. Resources represent object-oriented interfaces to AWS and provide a higher level of abstraction than the low-level clients, whose methods map close to 1:1 with service APIs [8]. For each cloud service there are specific functions that can be invoked via the corresponding client; for example, all functions related to Lambda can be called via the Lambda client.

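import boto3

# Low-level client for AWS Lambda; credentials come from the local AWS configuration.
client = boto3.client("lambda")

# Higher-level, object-oriented resource for S3.
s3 = boto3.resource("s3")

# Credentials and region can also be passed explicitly.
# ACCESS_KEY, SECRET_KEY and REGION are placeholders.
client = boto3.client(
    "lambda",
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    region_name=REGION,
)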

Listing 1: Example of creating clients using the boto3 library

CHAPTER 5

Implementation

As mentioned before, we try to run functions on AWS Lambda by creating a framework based on Cookery and extending its toolkit. The functionalities we considered for this project are the deployment of a Lambda function and the scheduling of a Lambda function. These functionalities are first implemented in Python without Cookery to simplify the process. After that we create a use case that makes use of both functionalities and can be useful for future work. Everything mentioned here is implemented in Python 3, just like Cookery, so it can be integrated and run with Cookery without extra effort.

5.1 Deployment

lambda_deploy.py contains every method that is needed to deploy a function. All the functions that make use of the boto3 library are found in the file aws_connect.py, so that they are separated from the rest. In order to access the services that AWS provides via the API, a user needs an access/secret key pair, which can be acquired from the web console or the AWS CLI. These keys can be configured on the system itself or given as variables when creating the client, as seen in listing 1. Users need to think about some parameters to create a Lambda function, as seen in the create_function(...) function of the Lambda client in listing 2.

A user has to provide a fitting function name by which the function is recognisable; AWS Lambda stores the function under this name. Variables like which runtime to use, how long the function may run before it times out, how much memory to use and the description of the function are straightforward. The role variable is a role for AWS Lambda to assume when it executes a function and looks something like this: arn:aws:iam::123456789012:role/service-role/testRole. Policies are attached to a role; these are simply permissions for invoking a Lambda function, getting full or read-only access to CloudWatch and so on. The roles and policies are basically the security system that AWS uses. A user has to explicitly add policies to roles, but can also easily delete them. A role is written in the ARN (Amazon Resource Name) format, which is used to identify all resources that are available within AWS. The handler variable is a concatenation of two names, the name of the file where the handler is located and the handler name: file_name.handler_function_name. The handler name is the name of the method with which AWS Lambda invokes the function, and it has the following syntax: def handler_name(event, context). The function can also accept environment variables, which are the same for every invocation of the function. Finally, the code of the function has to be supplied, which can be done in two ways. One way is to pass the contents of the zip file that contains the function. This method did not work for us, due to trouble with encoding and supplying the zip file. The other way is to zip the directory and upload it to AWS S3 (Simple Storage Service), a cloud storage service, with a given bucket name and a key to find the zip file within that bucket. The buckets that AWS S3 uses are basically directories to store files. The zip file is deleted after it has been sent.

lambda_deploy.py has to be invoked with some parameters. The directory to deploy, which must be in the same directory as lambda_deploy.py, holds the function that a user wants to deploy, with the file containing the handler in the root of that directory. Further parameters are the function name on AWS Lambda, the handler file name, the handler method name and the optional environment variables.

response = client.create_function(
    FunctionName="string",
    Runtime="python3.6",
    Role="string",
    Handler="string",
    Code={
        # Alternative to S3: pass the contents of the zip file directly.
        # "ZipFile": b"bytes",
        "S3Bucket": "string",
        "S3Key": "string",
    },
    Description="Description",
    Timeout=300,     # seconds
    MemorySize=128,  # MB
    Environment={
        "Variables": {
            "key": "value",
        }
    },
)

Listing 2: An example of the create_function(...) function of AWS Lambda from boto3, with the variables that we are interested in.
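The deployment steps described in this section (zip the directory, upload the archive to S3, create the function) can be sketched as follows. This is an illustration under assumptions, not the code of lambda_deploy.py: the function names zip_directory and deploy, the choice of S3 key and the in-memory archive are our own. s3_client and lambda_client are expected to be the objects returned by boto3.client("s3") and boto3.client("lambda"). The runtime follows listing 2; a current deployment would need a runtime that AWS still supports.

```python
import io
import os
import zipfile


def zip_directory(directory):
    """Return the contents of `directory` as an in-memory zip archive (bytes).

    Paths inside the archive are relative to `directory`, so the file that
    contains the handler ends up in the root of the archive.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(directory):
            for file_name in files:
                full_path = os.path.join(root, file_name)
                archive.write(full_path, os.path.relpath(full_path, directory))
    return buffer.getvalue()


def deploy(s3_client, lambda_client, directory, function_name, role_arn,
           handler_file, handler_method, bucket, environment=None):
    """Upload the zipped directory to S3 and create the Lambda function from it."""
    key = function_name + ".zip"  # assumed naming scheme for the S3 key
    s3_client.put_object(Bucket=bucket, Key=key, Body=zip_directory(directory))
    return lambda_client.create_function(
        FunctionName=function_name,
        Runtime="python3.6",
        Role=role_arn,
        # handler_file is the file name without the .py extension.
        Handler=handler_file + "." + handler_method,
        Code={"S3Bucket": bucket, "S3Key": key},
        Timeout=300,
        MemorySize=128,
        Environment={"Variables": environment or {}},
    )
```

Building the archive in memory avoids having to delete a temporary zip file afterwards; writing it to disk first, as described above, works just as well.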

5.2 Scheduling

As mentioned above, we use AWS CloudWatch to schedule AWS Lambda functions by creating rules. The scheduling function needs fewer parameters than the deployment function, which makes it easier to implement. All functions that are needed for the scheduling and that communicate with AWS using boto3 can also be found in aws_connect.py. A user needs to specify the period with which the function is invoked. This period consists of a number and a period specifier, like minutes, hours or days. We have imposed some constraints to make periods easier to work with: a user can only specify periods of 1-59 minutes, 1-23 hours and 1-30 days. This makes handling rules easier, since we do not have to normalise values such as 300 minutes (5 hours). On top of that, we can easily check whether a rule already exists. Next we need a name for the rule; to keep things simple we use the concatenation of the number and the period specifier, like 2hours and 1day. The description of the rule is a plain-language reflection of the rule. lambda_schedule.py first checks the given period and then uses the rule name to check whether this rule already exists. If it does, we just add the Lambda function as a target of the rule; if not, we first create the rule. We cannot add an unlimited number of Lambda functions to a rule, because AWS imposes a limit of 5 targets per rule and a maximum of 100 rules. One final thing needs to be done before the periodic event that the rule creates can trigger the Lambda function: we need to explicitly add a permission for the rule to invoke the Lambda function, otherwise AWS reports a FailedInvocation error. To add a permission, we need the name of the rule so we can acquire the ARN of the rule. With the parameters Action="lambda:InvokeFunction" and Principal="events.amazonaws.com", the function name and the rule ARN, we can add this permission via the add_permission() method.
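The scheduling steps above can be sketched in a few lines of Python. This is a simplified illustration, not the code of lambda_schedule.py: the names LIMITS, rule_for_period and schedule, the wording of the description and the format of the statement id are assumptions. events_client and lambda_client are expected to be boto3.client("events") and boto3.client("lambda"). The sketch looks up the rule with list_rules and ignores pagination and the limit of 5 targets per rule.

```python
# Allowed range per period specifier, following the constraints described above.
LIMITS = {"minutes": (1, 59), "hours": (1, 23), "days": (1, 30)}


def rule_for_period(number, unit):
    """Validate a period and return (rule_name, schedule_expression)."""
    if unit not in LIMITS:
        raise ValueError("unit must be one of: " + ", ".join(LIMITS))
    low, high = LIMITS[unit]
    if not low <= number <= high:
        raise ValueError("{} must be between {} and {}".format(unit, low, high))
    # Rate expressions use the singular unit for a value of 1, e.g. rate(1 day).
    unit_word = unit[:-1] if number == 1 else unit
    return ("{}{}".format(number, unit_word),
            "rate({} {})".format(number, unit_word))


def schedule(events_client, lambda_client, function_name, function_arn, number, unit):
    """Attach a Lambda function to the rule for the given period, creating it if needed."""
    rule_name, expression = rule_for_period(number, unit)

    # Reuse the rule if it already exists, otherwise create it.
    existing = events_client.list_rules(NamePrefix=rule_name)["Rules"]
    rule_arn = next((r["Arn"] for r in existing if r["Name"] == rule_name), None)
    if rule_arn is None:
        rule_arn = events_client.put_rule(
            Name=rule_name,
            ScheduleExpression=expression,
            State="ENABLED",
            Description="Runs every " + expression[len("rate("):-1],
        )["RuleArn"]

    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": function_name, "Arn": function_arn}],
    )
    # Without this permission the rule is not allowed to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=rule_name + "-" + function_name,
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    return rule_arn
```

For example, rule_for_period(2, "hours") yields the rule name 2hours together with the expression rate(2 hours), while a period of 300 minutes is rejected.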

5.3 Use-cases

A whole world of possibilities opens up with the combination of Cookery and AWS Lambda. To show what this combination is capable of and the field we are enabling, we create a use-case. A use-case can be anything and can involve multiple cloud services. An example is posting a picture on Instagram and immediately saving it on Dropbox, which is one of the possibilities offered by IFTTT (If This Then That) [20], or getting the last-minute news from a website. A use-case in AWS Lambda can be bigger than this, because AWS Lambda has more computational power at its disposal. For this project we therefore create a more complicated use-case: a GitHub [13] monitor. This program monitors a given repository for commits and notifies people by means of an email that shows all the changes made in that commit. Another reason to choose this use-case is that we can test it whenever we want and that testing is easy. With other use-cases we would depend on, for example, news being released, or we would have to build our own website.

The GitHub monitor makes use of the GitHub REST API [14] to get all the information, while being authenticated with a GitHub personal access token. The scopes of this token can be configured by the user, for example giving access to all private repositories without being able to delete them or to change user data. Making authenticated requests allows us to send 5000 requests per hour to the API, while unauthenticated requests are limited to 60 requests per hour. The program first checks whether the authenticated user has access to the given repository; for now the program only checks the user's own repositories. This is done by sending a request to the REST API with the Python library urllib. The request also has to authenticate to GitHub, which is done by adding a header with the personal access token to it, as can be seen in listing 3.

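import urllib.request

# github_token is assumed to be defined at module level.
def make_request(url):
    # Build the request and authenticate with the personal access token.
    request = urllib.request.Request(url)
    request.add_header("Authorization", "token " + github_token)
    # Send the request and decode the response body as text.
    response = urllib.request.urlopen(request)
    data = response.read().decode("utf-8")

    return data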

Listing 3: The function to make requests to GitHub with authentication using urllib

When we send a GET request to url = https://api.github.com/user/repos, we get a string containing JSON¹ as response, which can easily be deserialized to a Python object holding the accessible repositories. This way we can easily examine every repository and check whether the given one is among them. If we find the right one, we need to get all the commits of that repository. By sending another, slightly more specific request, we get a list of all the commits, sorted by time, with the last commit first.

url = "https://api.github.com/repos/" + repo["owner"]["login"] + "/" + repo["name"] + "/commits"

This list gives us a lot of information about each commit, but not the information we want: the changes that are made in that commit. It does, however, contain the commit time, which we can compare with a pre-set time that depends on how big the request interval is. A commit is considered new whenever its commit time is later than that pre-set time, in which case we would like to get the changes that were made. To get that extra information, we have to send yet another request, this time with the unique SHA of the commit:

url = "https://api.github.com/repos/" + repo["owner"]["login"] + "/" + repo["name"] + "/commits/" + data[0]["sha"]
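The two checks described above (finding the given repository in the response and deciding whether the latest commit is new) can be sketched as follows. This is a minimal illustration, not the code of the GitHub monitor itself: the function names `find_repository` and `is_new_commit` are hypothetical, and it is an assumption that the committer date is the "commit time" that is compared with the pre-set time.

```python
import json
from datetime import datetime, timedelta, timezone


def find_repository(repos_json, repo_name):
    """Return the repository with the given name from the JSON response, or None."""
    for repo in json.loads(repos_json):
        if repo["name"] == repo_name:
            return repo
    return None


def is_new_commit(commit, now, window_seconds):
    """Return True if the commit time lies within the last `window_seconds`."""
    # GitHub reports timestamps in UTC, e.g. "2017-05-25T12:00:00Z".
    # Assumption: the committer date is used as the commit time.
    commit_time = datetime.strptime(
        commit["commit"]["committer"]["date"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return commit_time > now - timedelta(seconds=window_seconds)
```

Because the list of commits is sorted with the last commit first, only the first element has to be passed to `is_new_commit`, with `now` set to `datetime.now(timezone.utc)`.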

¹ JSON (JavaScript Object Notation) is used to exchange and store data between a server and a browser as text, written in JavaScript object notation [21].


In the response, we can see how many additions and deletions were made and what the changes are in every file. This information is then put into the body of an email, which is sent with Gmail [17] using the Python SMTP library. The authentication with Gmail is done with an application key, just like with GitHub. The email looks like figure 5.1, with the complete link to the commit on GitHub at the bottom. As mentioned before, Lambda functions are stateless and terminate after at most 5 minutes. This means that we cannot use any variables after termination, unless we save them to a database or communicate them back. The GitHub monitor therefore works in the simplest way possible: it keeps no state and only compares commit times with the pre-set time. We have the same problem with specifying the repository to check. This information is lost after termination and needs to be given every time the function gets invoked. This is where environment variables come in. They can be given to a Lambda function as key-value pairs and they stay the same for every invocation once deployed. This also makes it possible to reuse the GitHub monitor to check another repository without changing the code.
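As an illustration of this step, the sketch below builds the notification email from the response for a single commit, sends it through Gmail with smtplib, and reads the settings from environment variables. It is not the monitor's actual code: the function names, the email layout and the variable names `GITHUB_REPOSITORY` and `GITHUB_TOKEN` are assumptions made for the example.

```python
import os
import smtplib
from email.message import EmailMessage


def build_email(commit, sender, recipient):
    """Build the notification email from a single-commit API response."""
    stats = commit["stats"]
    lines = [
        "Additions: %d, deletions: %d" % (stats["additions"], stats["deletions"]),
        "",
    ]
    for changed in commit["files"]:
        lines.append(changed["filename"])
        # The "patch" field may be missing, for example for binary files.
        lines.append(changed.get("patch", "(no textual diff available)"))
        lines.append("")
    lines.append(commit["html_url"])  # complete link to the commit at the bottom

    message = EmailMessage()
    message["Subject"] = "New commit " + commit["sha"][:7]
    message["From"] = sender
    message["To"] = recipient
    message.set_content("\n".join(lines))
    return message


def send_email(message, user, app_key):
    """Send the message through Gmail's SMTP server over SSL."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(user, app_key)
        server.send_message(message)


def load_settings(environ=os.environ):
    """Read per-deployment settings from (hypothetically named) variables."""
    return {
        "repository": environ["GITHUB_REPOSITORY"],
        "github_token": environ["GITHUB_TOKEN"],
    }
```

Since the settings come from the environment, the same deployment package can monitor a different repository by changing only the variables of the deployed function.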

Figure 5.1: An example of the email the user gets when a new commit is found. The complete link to the commit on GitHub is replaced.


CHAPTER 6

Results

We have developed the deployment and scheduling functionalities, which work as they should. The Lambda function immediately shows up in the web console and can instantly be invoked, as can be seen in figure 6.1. The same holds for the scheduling of a Lambda function: if needed, the rule is created directly and the function is made a target of the rule. The rule immediately starts sending its periodically triggered events to invoke the targeted functions (figure 6.2).

Figure 6.1: The AWS web console showing the available functions in AWS Lambda.

Figure 6.2: The AWS web console showing the rule in AWS CloudWatch which triggers every 5 minutes, with the targeted AWS Lambda functions.
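To make the two functionalities more concrete, the sketch below shows how deployment and scheduling could be expressed with the low-level boto3 clients for Lambda and CloudWatch Events. It is a simplified illustration and not the code that was added to Cookery: the function names and parameters are hypothetical, and the clients are passed in as arguments (for example `boto3.client("lambda")` and `boto3.client("events")`).

```python
def deploy_function(lambda_client, name, role_arn, handler, zip_bytes,
                    runtime, environment=None, timeout=300, memory=128):
    """Create a Lambda function from an in-memory zip deployment package."""
    response = lambda_client.create_function(
        FunctionName=name,
        Runtime=runtime,
        Role=role_arn,
        Handler=handler,
        Code={"ZipFile": zip_bytes},
        Timeout=timeout,        # seconds
        MemorySize=memory,      # megabytes
        Environment={"Variables": environment or {}},
    )
    return response["FunctionArn"]


def schedule_function(events_client, lambda_client, rule_name,
                      function_name, function_arn,
                      expression="rate(5 minutes)"):
    """Create a CloudWatch Events rule and make the function its target."""
    rule_arn = events_client.put_rule(
        Name=rule_name, ScheduleExpression=expression, State="ENABLED"
    )["RuleArn"]
    # Give the rule permission to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=rule_name + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # Register the function as a target of the rule.
    events_client.put_targets(
        Rule=rule_name, Targets=[{"Id": function_name, "Arn": function_arn}]
    )
    return rule_arn
```

Passing the clients as arguments keeps the sketch independent of credentials and makes it possible to try it out with stand-in objects before touching a real AWS account.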

The intention is to run the GitHub monitor 24 hours a day, without any other constraints, which means that we can create different schedules and see which one performs best. At first, we let the GitHub monitor make a request for the commits every second and check if there has been a commit within the last 1.5 seconds. This gives the most real-time notifications, but it is also easy to miss a commit, especially if we consider the time it takes to send an email, which is about 7 to 8 seconds. On top of that, the commit time is not the time at which a commit has been pushed. This means that a commit can have been made well before the time that we check for new commits, or that multiple commits have been made that day, while the monitor can only find the last. When we change the request interval to 59 seconds and the time window for a last commit to 70 seconds, we counter the emailing latency and the occasional missed commit. The interval is set to 59 seconds so that the time it takes to sleep and make a request together does not exceed 60 seconds. On the other hand, this method makes the monitor less real-time. Another problem tied to these two intervals lies with the termination of the function. AWS Lambda states that the maximum run time of a function is 5 minutes, but it also states that, depending on the event source, it may retry a failed Lambda function [12]. This means that we cannot just enter an infinite while loop, but have to terminate the function properly. We do this by letting the program sleep for 59 seconds before making another request and counting the number of requests we have made. We end the program after 5 requests, which roughly adds up to the available 5 minutes. Just running the GitHub monitor with a request interval of 1 minute gave interesting results. Most of the time the function would run correctly, as seen in figure 6.3. But there were some invocations where AWS Lambda tended to get "stuck" during run time, resulting in a failed function, as seen in figure 6.4.
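The bounded loop described above can be sketched as follows. This is only an illustration under assumptions: `check_for_commits` is a hypothetical callback that performs the request and sends the email, and the order of checking and sleeping within one iteration is a guess.

```python
import time

REQUEST_INTERVAL_S = 59   # sleep between requests, as described in the text
MAX_REQUESTS = 5          # roughly fills the 5-minute maximum run time


def run_fixed_interval(check_for_commits, sleep=time.sleep,
                       interval_s=REQUEST_INTERVAL_S,
                       max_requests=MAX_REQUESTS):
    """Call check_for_commits a fixed number of times, then return.

    Returning normally, instead of looping forever, lets the Lambda
    function terminate properly before the time limit is reached.
    """
    requests_made = 0
    while requests_made < max_requests:
        check_for_commits()   # hypothetical callback that queries GitHub
        requests_made += 1
        sleep(interval_s)     # wait before making the next request
    return requests_made
```

The `sleep` parameter defaults to `time.sleep`; making it replaceable is a design choice of this sketch so that the loop can be exercised without actually waiting.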

Instead of letting the program sleep for the hard-coded 59 seconds in the example above, we can use the get_remaining_time_in_millis() method that comes with the context object passed as a parameter to the handler method. This is the best way to keep track of the remaining time, because it takes the email latency into account. We set the maximum running time of the Lambda function to 5 minutes and we still check every minute for new commits. After every request, we divide the number of times to check, which is called count, by the remaining time the function has left and decrease count by one. We end the function when we have made five requests. In this way, the function can better handle dynamic latencies, like request latencies or the mentioned email latency, and can be done in roughly four minutes. With one invocation every 5 minutes, or 288 invocations per day, this can potentially save up to 288 computing minutes per day.
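One possible way to pace the checks with the context object is sketched below. It is an interpretation rather than the exact calculation used in the GitHub monitor: instead of working with count directly, it uses get_remaining_time_in_millis() to measure how much time has elapsed and sleeps only until the next check is due. The names `paced_checks` and `check` are hypothetical.

```python
import time

CHECK_INTERVAL_MS = 60000    # one check per minute, as in the text
CHECKS_PER_INVOCATION = 5    # the function ends after five requests


def paced_checks(context, check, sleep=time.sleep,
                 checks=CHECKS_PER_INVOCATION,
                 interval_ms=CHECK_INTERVAL_MS):
    """Run `check` once per interval, timed with the Lambda context object."""
    start_remaining = context.get_remaining_time_in_millis()
    for k in range(checks):
        check()
        if k == checks - 1:
            break  # no need to wait after the final check
        # Time spent so far, including request and email latency.
        elapsed = start_remaining - context.get_remaining_time_in_millis()
        # Sleep only until the next check is due, never a negative time.
        wait_ms = max(0, (k + 1) * interval_ms - elapsed)
        sleep(wait_ms / 1000.0)
    return checks
```

With five checks spaced one minute apart and no waiting after the last one, the invocation ends after roughly four minutes plus the latency of the final check.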

We can also run the program once every 5 minutes and check if there has been a commit in the last 5 minutes. This is the safest method considering the latencies, but it is also the least real-time.

Figure 6.3: GitHub monitor running correctly with a request interval of 59 seconds.

Figure 6.4: GitHub monitor gets "stuck" with a request interval of 59 seconds.


CHAPTER 7

Conclusion and Discussion

7.1 Conclusion

Our research shows that it is possible to get a framework based on Cookery working on AWS Lambda, which answers our research question: How can we develop a framework based on Cookery and running on AWS Lambda? We can deploy functions on AWS Lambda in an easy way by providing parameters and eventually using the Cookery high-level language. These functions can then be scheduled in the same way using the scheduling function. This allows us to run functions on AWS Lambda without manually invoking them. The performance of AWS Lambda is good for the tested use-case, as is the time it takes to deploy and schedule functions. The interesting discovery of the function that was "stuck" happened only once and shows that, even on AWS Lambda, such errors can occur. The error is likely to be a crash of the container or server, which can happen on any other server as well. Although AWS Lambda may retry a function if it fails, developers still need to be aware that this can occur while developing an application.

The use-case shows one of the many possibilities we can achieve by using AWS Lambda. The ability to deploy and schedule a function, without having to manually invoke it, can be very helpful. AWS Lambda is also very versatile and can be used in combination with other cloud providers and AWS services. The deployed functions can also be used as a back-end for an application. The developed functionalities make it easier to deploy applications using AWS Lambda, without creating an infrastructure or being dependent on funds.

The security of AWS Lambda lies entirely in the user's own hands. A user has to add policies explicitly to roles, such as permission to invoke a Lambda function or to add a rule to CloudWatch. The use of the access and secret key pair of AWS and the application keys of GitHub and Gmail makes authorization easier. The keys can be managed from the source itself. These keys are relatively safe compared to login names and passwords, considering that the keys are easily disposed of when security is breached. On top of that, the GitHub key can have different scopes, managed by the user, which narrow down what the key can be used for.

Thus this project opens up a field for developers who need more computational power, but lack the financial support or the programming experience, so that they are still able to develop and deploy applications in an easy way. With this project we have created an elementary building block for the Cookery ecosystem. The code of this project will be added to Cookery.

7.2 Discussion

There are some things that need to be noted. The event and context parameters that are passed to the handler method are useful, but are configured by AWS Lambda. This means that an application which uses these parameters needs to be tested on AWS Lambda, while applications without them can be tested locally. This does not mean that applications that run locally can immediately run on AWS Lambda, because you need to consider the configuration of the execution time and memory. Another thing to note is that AWS Lambda functions can only use the Python standard library and boto3 out of the box; any other library needs to be added to the deployment package. This can be an issue for applications that use many or large libraries.
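To illustrate what adding a library to the deployment package amounts to, the sketch below builds a zip archive in memory from the handler code and any extra library files. The function name `build_deployment_package` and the example paths are hypothetical; the resulting bytes are the kind of zip archive that AWS Lambda accepts as a deployment package.

```python
import io
import zipfile


def build_deployment_package(files):
    """Return the bytes of a zip archive containing the given files.

    `files` maps archive paths (for example "handler.py" or
    "somelib/__init__.py") to their contents as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in sorted(files.items()):
            archive.writestr(path, content)
    return buffer.getvalue()
```

Every extra library increases the size of this archive, which is why applications with many or large dependencies can become a problem.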

FaaS and SaaS are really different things from a developer's point of view. With SaaS the whole application has to be built from the ground up, including infrastructure, error handling and everything else, while FaaS is much more modular, because it only uses functions that can be used separately or as a full back-end. So while SaaS is more for full-stack applications, including front-end and back-end, FaaS is more basic, and can be used for the back-end of applications and to simply run functions. FaaS is thus more accessible than SaaS, especially when we consider that we do not have to build an infrastructure for FaaS. We can say that FaaS and SaaS have different purposes, which means that FaaS will not replace SaaS.

7.3 Future work

In the future Cookery can be extended with more services of AWS and other cloud providers, to create a broader framework and to enable more developers to create applications. For example, an interesting extension would be with AWS DynamoDB [3] or AWS RDS (Relational Database Service) [5] to make it easier to create and manage databases using Cookery. When we combine databases with the functions of AWS Lambda, we can create more complicated applications and deploy them using Cookery. This also means that the toolkit of Cookery can be extended with more services and functionalities in future projects.


Bibliography

[1] About AWS. url: https://aws.amazon.com/about-aws/ (visited on 05/10/2017).

[2] Michael Armbrust et al. “A View of Cloud Computing”. In: Communications of the ACM 53.4 (2010), pp. 50–58.

[3] AWS DynamoDB. url: https://aws.amazon.com/dynamodb/.

[4] AWS Lambda. url: https://aws.amazon.com/lambda/.

[5] AWS RDS. url: https://aws.amazon.com/rds/.

[6] Mikolaj Baranowski, Adam Belloum, and Marian Bubak. “Cookery: a Framework for Developing Cloud Applications”. In: 2015 International Conference on High Performance Computing & Simulation (HPCS). 2015, pp. 635–638.

[7] Boto3. url: https://boto3.readthedocs.io.

[8] Boto3 Low-level Clients. url: http://boto3.readthedocs.io/en/latest/guide/clients.html (visited on 04/20/2017).

[9] Celebrating half a billion users. url: https://blogs.dropbox.com/dropbox/2016/03/500-million/ (visited on 05/21/2017).

[10] Dropbox. url: https://www.dropbox.com/.

[11] Martin Fowler and James Lewis. Microservices: a definition of this new architectural term. 2014. url: https://martinfowler.com/articles/microservices.html (visited on 05/18/2017).

[12] Function Errors (Python). url: http://docs.aws.amazon.com/lambda/latest/dg/python-exceptions.html (visited on 05/28/2017).

[13] GitHub. url: https://github.com.

[14] Github REST API. url: https://developer.github.com/v3/ (visited on 05/25/2017).

[15] Google Cloud Functions. url: https://cloud.google.com/functions/.

[16] Google Drive. url: https://www.google.com/drive/.

[17] Google Mail. url: https://mail.google.com/.

[18] Scott Hendrickson et al. “Serverless Computation with OpenLambda”. In: HotCloud’16 Proceedings of the 8th USENIX Conference on Hot Topics in Cloud Computing. 2016, pp. 33–39.

[19] IBM OpenWhisk. url: https://developer.ibm.com/openwhisk/.

[20] If This Then That. url: http://ifttt.com.

[21] JSON. url: https://www.w3schools.com/js/js_json_intro.asp.

[22] Mariam Kiran et al. “Lambda Architecture for Cost-effective Batch and Speed Big Data processing”. In: BIG DATA ’15 Proceedings of the 2015 IEEE International Conference on Big Data (Big Data). 2015, pp. 2785–2792.

[23] Maciej Malawski. “Towards Serverless Execution of Scientific Workflows HyperFlow Case Study”. In: Proceedings of the 11th Workshop on Workflows in Support of Large-Scale Science. 2016, pp. 25–33.


[24] Microsoft Azure. url: https://azure.microsoft.com.

[25] Harvard Business Review Analytic Services. Cloud Computing Comes of Age. 2015.

[26] Josef Spillner and Serhii Dorodko. “Java Code Analysis and Transformation into AWS Lambda Functions”. In: CoRR abs/1702.05510 (2017).

[27] Mario Villamizar et al. “Infrastructure Cost Comparison of Running Web Applications in the Cloud using AWS Lambda and Monolithic and Microservice Architectures”. In: 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). 2016, pp. 179–182.

[28] Lizhe Wang et al. “Cloud Computing: a Perspective Study”. In: New Generation Computing 28.2 (2010), pp. 137–146.

[29] Mengting Yan et al. “Building a Chatbot with Serverless Computing”. In: MOTA ’16 Proceedings of the 1st International Workshop on Mashups of Things and APIs. 2016, 5:1–5:4.
