- You must have completed the AWS Academy Cloud Foundations course set in weeks 1-7
- You will require an AWS Academy Learner Lab account for the practical activity. You should receive an invite when this document is released. Please contact the LSDE Unit Director if you have not received the email or have problems with registration.
- A Secure Shell (SSH) client, such as macOS Terminal or PuTTY on Windows, for server administration.
Via the LSDE BlackBoard coursework assessment page, submit one zip file, named using your UOB username (‘username.zip’), containing:
- a Report (‘report.pdf’) in PDF format containing:
o Part 1
o Part 2
- a Text File (‘credentials.txt’) containing your AWS Academy account credentials (username, password), to enable us to access and review your Learner Lab account as required.
In this document we provide a detailed explanation of the tasks, and the approach to marking.
Task 2: Scaling the WordFreq Application (75%)
WordFreq is a complete, working application, built using the Go programming language.
[NOTE: you are NOT expected to understand the source code, and you are NOT permitted to modify it in any way]
The basic functionality of the application is to count words in a text file. It returns the top ten most frequent words found in a text document and can process multiple text files sequentially.
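To make the "top ten" idea concrete, here is a minimal Python sketch of word-frequency counting. It is an illustration only, not the WordFreq implementation (which is written in Go); the function name `top_words` and the tokenising rule are assumptions made for this example.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Assumed tokenising rule: lower-case the text and treat any run of
    # letters or apostrophes as a word. The real application may differ.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


if __name__ == "__main__":
    sample = "the cat and the hat and the bat"
    for word, count in top_words(sample, 2):
        print(word, count)
```

Processing several files sequentially would then amount to calling `top_words` once per file.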
The application uses a number of AWS services:
- S3: There are two S3 buckets used for the application.
o One is used for uploading and storing original text files from your local machine. This is your uploading bucket.
o The other is your processing bucket. Files are copied to it from the uploading bucket. The processing bucket has upload notifications enabled, so that when a file arrives, a notification message is automatically added to a wordfreq SQS queue and an email is sent to you.
- SQS: There are two queues used for the application.
o One is used for holding notification messages of newly uploaded text files from the S3 bucket. These messages are known as ‘jobs’, or tasks to be performed by the application, and specify the location of the text file on the S3 bucket.
o A second queue is used to hold messages containing the ‘top 10’ results of the processed jobs.
- SNS: Publishes messages to your email address and SQS queues.
- DynamoDB: A NoSQL database table is created to store the results of the processed jobs.
- EC2: The application runs on an Ubuntu Linux EC2 instance, which you will need to set up initially following the instructions given. This will include creating the S3, SQS and DynamoDB resources and identifying them to the application.
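As an illustration of how a 'job' message can specify the location of a text file, the following Python sketch extracts the bucket name and object key from an S3 event notification body. It assumes the standard S3 event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`) and that a notification relayed through SNS carries the event as a JSON string in the envelope's `Message` field. How the WordFreq worker itself parses messages is not described here; the function name `job_locations` and the example bucket name are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def job_locations(message_body):
    """Return a list of (bucket, key) pairs found in a notification body."""
    payload = json.loads(message_body)
    # If the event was relayed through SNS, the S3 event is a JSON string
    # stored in the "Message" field of the SNS envelope.
    if "Message" in payload and "Records" not in payload:
        payload = json.loads(payload["Message"])
    locations = []
    for record in payload.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        locations.append((bucket, key))
    return locations


if __name__ == "__main__":
    # Hypothetical event for a file named "my file.txt".
    event = {"Records": [{"s3": {"bucket": {"name": "example-processing"},
                                 "object": {"key": "my+file.txt"}}}]}
    wrapped = json.dumps({"Type": "Notification", "Message": json.dumps(event)})
    print(job_locations(wrapped))
```

Messages without a `Records` list, such as a test event, yield an empty list rather than an error.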
You will be required to initially set up and test the application, using instructions given with the zip download file. You will then need to implement auto-scaling for the application and improve its architecture based on principles learned in the CF course. Finally, you will write a report covering this process, along with some extra material.
Task A – Install the Application
Ensure you have accepted access to your AWS Academy Learner Lab account and have at least $40 credit (you are provided with $100 to start with). If you are running short of credit, please inform your instructor.
Refer to the WordFreq installation instructions (‘README.txt’) in the coursework zip download on the BlackBoard site, to install and configure the application in your Learner Lab account. These instructions do not cover every step – you are assumed to be confident in certain tasks, such as in the use of IAM permissions, launching and connecting via SSH to an EC2 instance, etc.
You will set up the database, storage buckets, queues and worker EC2 instance.
Finally, ensure that you can upload a file using the ‘run_upload.sh’ script and can see the results logged from the running worker service, before moving on to the next task.
[NOTE: The application code is in the Go language. You are NOT expected to understand or modify it. Any code changes will be ignored and may lose marks.]
Task B – Design and Implement Auto-scaling
Review the architecture of the existing application. Each job takes a random time of between 10 and 20 seconds to complete (artificially induced, but DO NOT modify the application source code!). To be able to process multiple uploaded files, we need to add scaling to the application.
This should initially function as follows:
- When a given maximum performance metric threshold is exceeded, an identical worker instance is launched and begins to also process messages on the queues.
- When the performance metric falls below a given minimum threshold, the most recently launched worker instance is removed (terminated).
- There must always be at least one worker instance available to process messages when the application architecture is ‘live’.
Using the knowledge gained from the Cloud Foundations course, architect and implement autoscaling functionality for the WordFreq application. Note that this will not be exactly the same as Lab 6 in Module 10, which is for a web application. You will not need a load balancer, and you will need to identify a different CloudWatch performance metric to use for the ‘scale out’ and ‘scale in’ rules.
The ‘Average CPU Utilization’ metric used in Lab 6 is not necessarily the best choice for this application.
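The three scaling rules listed above can be sketched as a small decision function. This is a simplified Python model for reasoning about thresholds, not an AWS Auto Scaling configuration: the metric is left abstract, the threshold values in the example are hypothetical, and the default `max_workers` of 9 is an assumption taken from the Learner Lab instance limit.

```python
def desired_workers(current, metric, scale_out_above, scale_in_below,
                    min_workers=1, max_workers=9):
    """Return the worker count after applying one round of scaling rules."""
    if metric > scale_out_above:
        target = current + 1   # scale out: launch one more identical worker
    elif metric < scale_in_below:
        target = current - 1   # scale in: terminate one worker
    else:
        target = current       # metric is inside the band: no change
    # Clamp so that at least min_workers are always available and the
    # assumed account limit is never exceeded.
    return max(min_workers, min(max_workers, target))


if __name__ == "__main__":
    # Hypothetical thresholds: scale out above 10, scale in below 2.
    print(desired_workers(1, 25, 10, 2))   # metric high, one worker running
    print(desired_workers(1, 0, 10, 2))    # metric low, already at minimum
```

Keeping a gap between the two thresholds helps avoid instances being launched and terminated in quick succession.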
Task C – Perform Load Testing
Once you have set up your auto-scaling infrastructure, test that it works. The simplest method is to create around 40 large text files and upload them to your uploading S3 bucket.
First purge (delete) all files from your processing S3 bucket, then copy all the .txt files from your uploading S3 bucket to your processing S3 bucket as follows:
- Connect to one of the instances in your Auto Scaling Group (via SSH connection).
- Copy all the .txt files from your uploading S3 bucket (e.g., zj-wordfreq-nov22-uploading) to your processing S3 bucket (e.g., zj-wordfreq-nov22-processing) by running the following command in your SSH terminal: aws s3 cp s3://<name of your uploading bucket> s3://<name of your processing bucket> --exclude "*" --include "*.txt" --recursive
While the load test runs, please do the following:
- Watch the behaviour of your application to check that the scale out (add instances) and scale in (remove instances) functionality works.
- Take screenshots of your copied files, the SQS queue page showing message status, the Auto Scaling Group page showing instance status and the EC2 instance page showing launched / terminated instances during this process.
- Take a screenshot of an email you have received from Amazon S3 Notification. Ideally you should receive 40 emails, but you only need a screenshot of one email to show the functionality of SNS.
- Try to optimise the scaling operation, for example so that instances are launched quickly when required and terminated soon (but not immediately) when not required. Note down settings you used and the fastest file processing time you achieved.
- Try using a few different EC2 instance types – with more CPU power, memory, etc. Note down any changes in processing time.
- Please delete all the .txt files in your processing S3 bucket after load testing.
- [NOTE: The Learner Lab accounts officially only allow a maximum of 9 instances running in one region, including auto-scaling instances. Learner Lab accounts are limited in which EC2 instance types and AWS services they can use. This is explained in the Lab Readme file on the Lab page; section ‘Service usage and other restrictions’.]
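When comparing measured processing times, a rough baseline can be useful. The Python sketch below estimates how long a batch of jobs takes to clear for a fixed number of workers. It assumes each worker handles one job at a time and uses 15 seconds per job (the midpoint of the 10 to 20 second range); it ignores instance launch time and queue polling delays, so real results will be slower. The function name is hypothetical.

```python
import math


def estimated_drain_seconds(jobs, workers, seconds_per_job=15.0):
    """Rough time to clear the queue if each worker runs one job at a time."""
    if workers < 1:
        raise ValueError("at least one worker is required")
    # Jobs are processed in 'rounds' of one job per worker.
    rounds = math.ceil(jobs / workers)
    return rounds * seconds_per_job


if __name__ == "__main__":
    for workers in (1, 2, 4, 8):
        print(workers, "workers:", estimated_drain_seconds(40, workers), "s")
```

Under these assumptions, 40 files take about 600 seconds on one worker and about 75 seconds on eight, which gives a lower bound against which to judge your scaling settings.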