Lab 7 Data Communication using Gephi and Tableau January 22, 2015

Overview of Lab 7: We will work with two cloud infrastructures, Amazon Web Services (AWS) and Google App Engine. We will use Amazon EC2 to launch a Windows machine and a Linux machine, and work extensively with the Windows machine. We will also use Amazon EMR (Elastic MapReduce) for a simple wordcount example. The Google cloud environment will be illustrated by deploying an application from the Eclipse environment.

Exercise 1: Log in to aws.amazon.com. We will use only free-tier instances for this exercise. We illustrate the use of EC2 with a very mundane example.

Consider this real scenario: I have a 32-bit UML tool that I use all the time. It will NOT install on my 64-bit Windows machine. I need a 32-bit machine now. Where do I go? How do I keep using this UML tool? We will examine whether any of Amazon’s cloud AMIs can help.

1. Select EC2 and you will see a screen that shows a blue launch button.

2. Study the various components. Click Launch Instance.

3. To find an AMI with a 32-bit Windows operating system, search the Marketplace with the keywords “Windows 2003 Server” and choose a 32-bit Windows machine to launch.

4. Click through the next screens, selecting only the micro instance type. It is free-tier eligible.

5. Create a new key pair: A key pair consists of a public key that AWS stores, and a private key file that you store. Together, they allow you to connect to your instance securely. For Windows AMIs, the private key file is required to obtain the password used to log into your instance. For Linux AMIs, the private key file allows you to securely SSH into your instance. Name it richs7, download the keypair and store it in a secure location.

6. Launch the instance and wait for it to be ready. Then select the instance and click Get Password. The password is decrypted using the richs7.pem file you saved in step 5. Until the password is ready, you will see the message “Password Not Available yet”.

7. The screen shot below shows the richs7.pem file loaded in the bottom window. Once the password is available, click Decrypt Password to decrypt it with richs7.pem. Keep the password in a secure location. You could save it as the last line of the richs7.pem file.

8. Connect to the instance using the public DNS shown in the bottom half of the launch screen. Click on Remote Desktop and Run as Administrator.

9. Click Connect. Enter the password you decrypted and see your new machine (Windows 32-bit) appear.

10. Navigate to your local machine’s drive and drag and drop any file you want into the new machine. We will drag and drop rosecppdemo.exe from your unzipped folder onto the desktop of the new machine. Double-click the copied file to install it and go through the steps. Hooray, I have my UML tool working. Goal accomplished with the help of the Amazon AWS cloud.

Exercise 2: Now we will connect to a Linux instance.

1. Log in to AWS, select EC2, and launch an instance. Select any of the free-tier eligible Linux AMIs.

2. We will choose the Amazon free-tier eligible Linux AMI.

3. Choose the existing key pair “richs7” that we created earlier.

4. Navigate to the launch window. While you wait for the launch, create the PuTTY key from richs7.pem.

5. PuTTY --> PuTTYgen --> Load --> browse to richs7.pem and load it (see the screen shot below).

6. Save the generated private key as richs7.ppk.

7. We will use this key to log in to the Linux instance we launched.

8. To connect to the instance we need (i) the public IP or DNS address of the launched machine, (ii) the PuTTY key, and (iii) an SSH/PuTTY connection.

9. Launch PuTTY --> enter the public DNS address from the bottom part of the Linux machine’s launch screen, then supply the key for authentication: SSH --> Auth --> browse to the richs7.ppk key.

10. When the Linux machine connects, log in with ec2-user as the username. You will be connected to your brand new Linux machine on the cloud.

11. Now you can install any software you want using “sudo yum install xyz”. You can transfer files in and out using ssh-based applications such as filezilla.
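For readers who use a command-line OpenSSH client instead of PuTTY, the connection in steps 8 to 10 can be sketched as below. This is only an illustration under stated assumptions: the helper names and the host name in the example are hypothetical, and OpenSSH reads the original richs7.pem directly (it typically refuses a key file with loose permissions), so the .ppk conversion is not needed on that path. The code only builds and prints the commands; it does not connect to anything.

```python
import shlex


def build_ssh_command(key_path, host, user="ec2-user"):
    """Return the OpenSSH argument list for logging in to the Linux instance.

    key_path is the .pem private key downloaded from AWS; host is the public
    DNS name (or public IP) shown for the instance in the EC2 console.
    """
    return ["ssh", "-i", key_path, f"{user}@{host}"]


def build_install_command(package):
    """Return the command from step 11 for installing a package with yum."""
    return ["sudo", "yum", "install", package]


if __name__ == "__main__":
    # Hypothetical host name; replace it with the public DNS of your instance.
    host = "ec2-203-0-113-10.compute-1.amazonaws.com"
    print(shlex.join(build_ssh_command("richs7.pem", host)))
    print(shlex.join(build_install_command("xyz")))
```

Printing the commands rather than running them keeps the sketch safe to try without a live instance.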


Exercise 3: Working with Google App Engine.

Goal: Deploy a DHTML (HTML + CSS + JavaScript) project on Google App Engine. (We will not include a CSS file now; you can always add one later.) We will also deploy a three.js-based project that we created in an earlier session.

Prerequisite: A DHTML project created and tested either using notepad++ or a similar editor or an Eclipse environment. For example the hangman project we created has hangman.html, hangman.js and the images (within a folder images). The three.js project is made with the moving sphere and cube. Both these code bases are available in the zip file for this session.

1. We have already installed Eclipse Kepler (version 4.3).

2. Next install the Google App Engine plug-in for the Eclipse you installed. (If you already have a version of Eclipse running, you may use that and get the right version of the plug-in.)

3. Run Eclipse, go to the Help menu on the top line, and select Install New Software from the drop-down list. The window shown below appears. Fill in https://dl.google.com/eclipse/plugin/4.3 for the Kepler version of Eclipse.

4. Click Next and accept/okay in the next few pop-up Windows. Wait for the installation to complete; it will take some time. See instructions here: https://developers.google.com/eclipse/docs/install-eclipse-4.3

5. Now gather all the artefacts (files, images etc.) of the application you want to deploy in a google app project.

6. File --> New Project --> Google --> Web Application Project, and fill in the details as shown below. You are filling in these details: the name of the project (use your naming convention and make it a meaningful one), then the package name: com.google.appengine.yourappname. The rest is as shown below. Also select a universally unique name for your app, since it shows up on the web and is used in the link to the app.

Select create new project in workspace

Uncheck web toolkit

Check google app engine

Leave app id blank

7. In Windows Explorer, open the newly created project and go to its war folder. Copy and paste all the components of the application you created into the war folder: hangman.html, hangman.js, and the images folder.

8. Edit index.html and make the href point to your .html file, in this case hangman.html.

9. Right-click the newly created application, select Google, and deploy the application on App Engine. You will have to sign in to App Engine and also create a new application id that is universally unique, as shown in the next two screen shots. We will create the application id hangmanrichs2 (for me). For each of you it has to be different. Just use a different digit at the end.

10. The screen shots on the next page will help in navigating through the above steps.

11. If everything goes fine, you will see the application being deployed and the web front for the application show up. Click the application and use it.

12. You will see your application deployed on Google cloud. It can be accessed through the URL:

http://1-dot-hangmanrich2.appspot.com/hangman.html from anywhere, anytime.

13. Monitor the various aspects of the app by studying the Google app Admin console and status indicators. You will also be able to undeploy and delete the app from the console. Study all the features of the Google App Engine and also the tutorials.
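Steps 7 and 8 (copying the application files into the war folder and pointing index.html at the start page) can also be scripted. The following is a minimal sketch rather than part of the plug-in workflow: the function name stage_app and the folder names in the demo are hypothetical, and for simplicity it overwrites index.html with a one-link page instead of editing the file that Eclipse generated.

```python
import shutil
import tempfile
from pathlib import Path


def stage_app(source_dir, war_dir, start_page="hangman.html"):
    """Copy every file and folder from source_dir into war_dir, then write an
    index.html in war_dir whose link (href) points at start_page.

    Returns the path of the index.html that was written.
    """
    source, war = Path(source_dir), Path(war_dir)
    war.mkdir(parents=True, exist_ok=True)
    for item in source.iterdir():
        target = war / item.name
        if item.is_dir():
            # Folders such as "images" are copied recursively.
            shutil.copytree(item, target, dirs_exist_ok=True)
        else:
            shutil.copy2(item, target)
    index = war / "index.html"
    # Assumption: a bare one-link page is enough as the landing page.
    index.write_text(
        f'<html><body><a href="{start_page}">Open the application</a></body></html>\n',
        encoding="utf-8",
    )
    return index


if __name__ == "__main__":
    # Demo on throwaway folders so that no real project is modified.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "hangman"
        (src / "images").mkdir(parents=True)
        (src / "hangman.html").write_text("<html></html>", encoding="utf-8")
        (src / "hangman.js").write_text("// game code", encoding="utf-8")
        index = stage_app(src, Path(tmp) / "war")
        print(index.read_text(encoding="utf-8"))
```

The recursive copy relies on the dirs_exist_ok option of shutil.copytree, which needs Python 3.8 or later.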

Exercise 4: We will repeat the above steps for another application we created in an earlier session with three.js.

1. The files for this project are in cubespherefiles.

2. Click on 05-control-gui.html and refresh your memory about how it works.

3. Then follow the steps given above in Exercise 3 and make sure you are able to deploy the application and see it working.

4. Access it using the URL and see it working for any user.
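The URL shown in Exercise 3 has the form version, then "-dot-", then the application id, followed by appspot.com and the page name. Assuming that pattern carries over to other applications, a small helper can build the link for the three.js project and check whether it answers. The function names and the application id in the demo are hypothetical, and the reachability check needs network access.

```python
import urllib.request


def appspot_url(app_id, page, version="1"):
    """Build an appspot.com URL following the pattern of the Exercise 3 link."""
    return f"http://{version}-dot-{app_id}.appspot.com/{page}"


def is_reachable(url, timeout=10):
    """Return True if the URL answers with HTTP status 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # Hypothetical application id; use the unique id you created at deploy time.
    url = appspot_url("cubesphere-example", "05-control-gui.html")
    print(url)
    print("reachable:", is_reachable(url))
```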

Exercise 5: For the last exercise we return to Amazon Web Services and its EMR (Elastic MapReduce) offering. We will run a wordcount job on EMR. EMR is not free-tier, since it involves provisioning a cluster of machines.

1. Create an S3 bucket with a unique name: I created richs2015 (all lowercase letters).

2. Create a folder input in this bucket and upload the files that need to be analyzed. We will use text (.txt) files.

3. Start EMR --> Sample applications --> Wordcount

4. Disable logging.

5. Edit the command and specify the input and output folder names as shown below.

6. Create cluster so that the cluster for MR wordcount can be provisioned and MR executed on the input data you have provided.

7. Monitor the cluster and wait until cluster completes the execution and is terminated.

8. Download the output files that provide the wordcounts of the words in the input file(s). Analyze it; understand and discuss the results.
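To help interpret the output, here is a small local sketch of what a MapReduce wordcount computes. It is not the EMR sample application itself: the function names are made up for illustration, and the tokenization rule (lowercasing, then keeping runs of letters, digits and apostrophes) is an assumption, so counts from the cluster may differ for punctuation or mixed-case words.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word found in one input line."""
    return [(word, 1) for word in re.findall(r"[a-z0-9']+", line.lower())]


def reduce_phase(pairs):
    """Reduce step: add up the counts emitted for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = []
    for line in lines:
        pairs.extend(map_phase(line))
    return reduce_phase(pairs)


if __name__ == "__main__":
    sample = ["the quick brown fox", "The lazy dog and the fox"]
    for word, count in sorted(word_count(sample).items()):
        print(f"{word}\t{count}")
```

On a cluster the map calls run in parallel over pieces of the input and the pairs are grouped by word before being reduced; the sketch runs both phases sequentially in one process.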