Analysis of File Systems Performance in Amazon EC2 Storage

ANALYSIS OF FILE SYSTEMS PERFORMANCE IN AMAZON EC2 STORAGE

Nagamani Pudtha

A Thesis submitted to the

School of Graduate Studies

in partial fulfilment of the requirements for the degree of

Master of Science

Department of Computer Science

University of Colorado Colorado Springs

December 2014

Colorado Springs, Colorado

Copyright by Nagamani Pudtha 2014

This thesis final report for the Master of Computer Science degree by

Nagamani Pudtha

has been approved for the

Department of Computer Science

______

Advisor: Jia Rao

______

Dr. C. Edward Chow

______

Dr. Xiaobo Zhou

______

Date

Nagamani Pudtha (M.S. Computer Science)

Analysis of File Systems Performance in Amazon EC2 Storage

Thesis directed by Professor Jia Rao, Department of Computer Science

ABSTRACT

Cloud computing has gained tremendous popularity in recent years. Cloud computing refers to using a third-party network of remote servers hosted on the Internet to store and manage data, rather than doing so locally. In simple terms, cloud services provide users with their own hard drive on the Internet. Both public and private clouds are available in the market. Whether a cloud is public or private, the key to success is creating an appropriate and efficient server, network and storage infrastructure in which all resources can be efficiently utilized and shared. In cloud computing, data storage becomes even more crucial, since all data resides on storage systems in a shared infrastructure model. It is therefore very important to understand the performance of a particular storage option before making the transition.

In this thesis we perform experiments on Amazon EC2 cloud storage. We select the Amazon public cloud platform because it is one of the most widely adopted public cloud platforms and offers Infrastructure-as-a-Service (IaaS). Prior work has shown that applications with significant communication or I/O tend to perform poorly in virtualized cloud environments. However, there is limited understanding of the I/O characteristics of cloud environments. In this thesis, several tests and benchmarks are used to evaluate I/O under different storage settings of the Amazon cloud with different types of file systems. The goal is to exercise and observe I/O performance from different perspectives using different workloads, with a special focus on long-running jobs. We use the FIO, FileBench and blktrace tools. Blktrace provides detailed block-level I/O analysis of both EBS and Instance Store disks. Through a set of detailed micro and macro benchmarking measurements, the tests reveal different levels of performance degradation in EBS and instance storage, depending on the type of file system and the type of workload.

Based on a detailed interpretation of the test results, in particular of how time is spent in the different I/O stages, we provide user guidance on how to select a storage option when choosing instances.

ACKNOWLEDGEMENTS

First and foremost, I would like to sincerely thank my advisor Jia Rao for all the guidance and interest he took in the progress of this work. I am very grateful to him for his constructive comments and feedback, and for providing resources such as Amazon Web Services during our research. I will consider myself lucky if I have imbibed at least a small percentage of his admirable qualities, such as devotion and single-minded dedication towards work.

I am extremely grateful to Dr. Edward Chow and Dr. Xiaobo Zhou for their valuable suggestions and advice on the thesis proposal and throughout this thesis work, without which it would have been very difficult for me to complete this work.

Many thanks go to my family members for their constant support and encouragement. I am grateful to my parents for all their love, affection and blessings, without which I would not have gotten this far in life. Finally, a special note of thanks goes to Venkat, my husband, for his continual encouragement, support, advice and patience, which have enabled me to accomplish things I never thought were possible.

CONTENTS

ABSTRACT

LIST OF TABLES

LIST OF GRAPHS

CHAPTER 1

INTRODUCTION

1.1 MOTIVATION

1.2 THESIS GOAL

1.3 THESIS ORGANIZATION

CHAPTER 2

BACKGROUND

2.1 AMAZON EC2

2.2 AMAZON EC2 STORAGE

2.2.1 ELASTIC BLOCK STORE

2.2.1.1 GENERAL PURPOSE VOLUMES

2.2.2 INSTANCE STORE

2.3 IO ANALYSIS TOOLS

2.3.1 BLKTRACE/BLKPARSE

CHAPTER 3

EXPERIMENT METHODOLOGY

3.1 OVERVIEW

3.2 PROPERTIES

3.3 INSTANCE TYPE SELECTION

3.4 FILE SYSTEM SELECTION

3.5 TEST BED SETUP

3.6 BENCHMARKING TOOLS

3.6.1 MACRO BENCHMARKS

3.6.2 MACRO BENCHMARKS

CHAPTER 4

EXPERIMENT RESULTS & ANALYSIS

4.1 FILEBENCH RESULTS

4.1.1 EBS

4.1.2 INSTANCE STORE

4.2 FIO RESULTS

4.2.1 EBS

4.2.2 INSTANCE STORE

CHAPTER 6

IO ANALYSIS

CHAPTER 7

DISCUSSION

CHAPTER 8

FUTURE WORK

CHAPTER 9

CONCLUSION

BIBLIOGRAPHY

APPENDIX A

FIO Benchmark 64k block size with random workload:

APPENDIX B

SAMPLE FILEBENCH OUTPUT:

APPENDIX C

SAMPLE BLKTRACE OUTPUT:

APPENDIX D

SAMPLE BTT OUTPUT

LIST OF FIGURES

Figure 1: Amazon EC2 Storage

Figure 2: blktrace General Architecture

LIST OF TABLES

Table 1 Experiment Setup – EBS

Table 2 FileBench Workloads

Table 3 IOPS of different filesystems with different workloads on EBS with 8k

Table 4 IOPS of different filesystems with different workloads on EBS with 512k

Table 5 IOPS of different filesystems with different workloads on Instance Store with 8k

Table 6 IOPS of different filesystems with different workloads on Instance Store with 512k

Table 7 FIO Benchmark Parameters for EBS and Instance store

Table 8 IOPS for 8 jobs & 4k block size- EBS vs Instance store

Table 9 IOPS for 16 jobs & 4k block size- EBS vs Instance store

Table 10 IOPS for 32 jobs & 4k block size- EBS vs Instance store

LIST OF GRAPHS

Graph 4.1:1 IOPS of different file systems with different workloads on EBS

Graph 4.1:2 Latency of different filesystems with different workloads on EBS

Graph 4.1:3 Bandwidth of file systems on EBS

Graph 4.1:4 IOPS of different filesystems with different workloads on Instance Store

Graph 4.1:5 Latency of different file systems with different workloads on Instance Store

Graph 4.1:6 Bandwidth of different filesystems with different workloads on Instance Store

Graph 4.1:7 EXT3 Random Read Write 8 jobs

Graph 4.1:8 EXT3 Random Read Write 16 jobs

Graph 4.1:9 EXT3 Random Read Write 32 jobs

Graph 4.1:10 EXT3 Sequential Read Write 8 jobs

Graph 4.1:11 EXT3 Sequential Read Write 16 jobs

Graph 4.1:12 EXT3 Sequential Read Write 32 jobs

Graph 4.1:13 EXT4 Rand Read Write 8 jobs

Graph 4.1:14 EXT4 Rand Read Write 16 jobs

Graph 4.1:15 EXT4 Rand Read Write 32 jobs

Graph 4.1:16 EXT4 Sequential Read Write 8 jobs

Graph 4.1:17 EXT4 Sequential Read Write 16 jobs

Graph 4.1:18 EXT4 Sequential Read Write 32 jobs

Graph 4.1:19 XFS Rand Read Write 8 jobs

Graph 4.1:20 XFS Rand Read Write 16 jobs

Graph 4.1:21 XFS Rand Read Write 32 jobs

Graph 4.1:22 XFS Sequential Read Write 8 jobs

Graph 4.1:23 XFS Sequential Read Write 16 jobs

Graph 4.1:24 XFS Sequential Read Write 32 jobs

Graph 4.1:25 EXT3 Rand Read Write 8 jobs

Graph 4.1:26 EXT3 Rand Read Write 16 jobs

Graph 4.1:27 EXT3 Rand Read Write 32 jobs

Graph 4.1:28 EXT3 Sequential Read Write 8 jobs

Graph 4.1:29 EXT3 Sequential Read Write 16 jobs

Graph 4.1:30 EXT3 Sequential Read Write 32 jobs

Graph 4.1:31 EXT4 Random Read Write 8 jobs

Graph 4.1:32 EXT4 Random Read Write 16 jobs

Graph 4.1:33 EXT4 Random Read Write 32 jobs

Graph 4.1:34 EXT4 Sequential Read Write 8 jobs

Graph 4.1:35 EXT4 Sequential Read Write 16 jobs

Graph 4.1:36 EXT4 Sequential Read Write 32 jobs

Graph 4.1:37 XFS Random Read Write 8 jobs

Graph 4.1:38 XFS Random Read Write 16 jobs

Graph 4.1:39 XFS Random Read Write 32 jobs

Graph 4.1:40 XFS Sequential Read Write 8 jobs

Graph 4.1:41 XFS Sequential Read Write 16 jobs

Graph 4.1:42 XFS Sequential Read Write 32 jobs

CHAPTER 1 INTRODUCTION

1.1 MOTIVATION

Amazon EC2, the leading IaaS (Infrastructure as a Service) provider and one of the offerings of Amazon Web Services, has had a significant impact on the business IT community and provides a reasonable and attractive alternative to locally owned infrastructure. Amazon Elastic Compute Cloud has been used by a host of small and medium-sized enterprises for various purposes. Amazon EC2 was introduced in 2006 and supports a wide range of instance types with different storage settings. Amazon EC2 provides Elastic Block Store (EBS)[2], Instance Store[3] and Amazon Simple Storage Service (S3).

There are many discussions and questions in the community about which storage setting a user should choose. This thesis analyzes the performance of Amazon EBS and Instance Store storage under different combinations of file systems, workload types, and read and write operations. We also extend prior work on nested file system performance[1], which focused on only one kind of storage on a single instance. In contrast, this thesis studies both EBS and Instance Store on Amazon EC2 by launching two instances on each storage type and testing them with different workloads under different file systems.

Understanding how data makes its way from the application to storage devices is key to understanding how I/O works. With this knowledge, users can make much better decisions about storage design and storage purchases for their applications. Monitoring the lowest level of the I/O stack, the block driver, is a crucial part of this overall understanding of I/O patterns.

1.2 THESIS GOAL

In this thesis, we present a measurement study that characterizes the performance implications of the storage options in the Amazon Elastic Compute Cloud (EC2)[24] data center. Performance measurement has a long tradition in storage research. We measure performance by analyzing the I/O characteristics, workload demand, and storage configuration of instances with General Purpose (SSD) volumes attached, and we provide user guidance on how to select a storage option in Amazon EC2.

This research aims to answer the following questions within the bounds of the environments tested.

Is there a wide range of performance variation between the Amazon EC2 storage options?

Can the block size be a cause of I/O performance degradation?

Which option delivers the better peak performance?

Which option delivers more consistent performance?

Is either of these two settings the consistent winner for all workloads, or is performance workload-dependent?

We ran many experiments on both EBS and Instance Store instances, collected detailed performance measurements, and analyzed them. We found that different workloads, not too surprisingly, have a large impact on system behavior. No single file system worked best for all workloads. Some file system features helped performance and others hurt it.

Our goal is to quickly observe the events in each layer and how they interact, and also to provide enough information to study even small details. To gather information from these components, we have used the existing blktrace mechanism[15][16].

1.3 THESIS ORGANIZATION

The remainder of this thesis is structured as follows.

In Chapter 2, we describe the background that led to this project. Chapter 3 describes the experiment methodology, including the test bed setup and the benchmarking tools used in our experiments. Chapter 4 presents the results of those benchmarks on the EXT3, EXT4 and XFS file systems with EBS and Instance Store volumes. Chapter 6 presents the IO analysis. Chapter 7 discusses the results. Chapter 8 focuses on future areas for this thesis work. Chapter 9 gives the conclusion.

CHAPTER 2

BACKGROUND

2.1 AMAZON EC2

Amazon Elastic Compute Cloud (Amazon EC2)[24] is a component of Amazon Web Services (AWS). EC2 is a central part of Amazon.com’s cloud computing platform. Amazon EC2 is a Web-based service from which users can rent virtual servers in the cloud, for a monthly or hourly fee, and run custom applications on those servers. Elasticity in EC2 refers to the ease with which users can scale server and application resources as their computing demands change.

Amazon EC2 uses the Xen virtualization technique[14] to manage physical servers. Several Xen virtual machines may run on one physical server. Each Xen virtual machine is called an instance in Amazon EC2. There are several types of instances, and each type provides a predictable amount of computing capacity. The input/output (I/O) capacity of an instance varies according to the storage attached to it. Allocated EC2 instances can be placed at different physical locations. Amazon organizes the infrastructure into different regions and availability zones.

To use EC2, a subscriber creates an Amazon Machine Image (AMI) containing the operating system, application programs and configuration settings. Then the AMI is uploaded to the Amazon Simple Storage Service (Amazon S3) and registered with Amazon EC2, creating a so-called AMI identifier (AMI ID). Once this has been done, the subscriber can requisition virtual machines on an as-needed basis. Capacity can be increased or decreased in real time from as few as one to more than 1000 virtual machines simultaneously. Billing takes place according to the computing, storage and network resources consumed.

2.2 AMAZON EC2 STORAGE

In the Amazon cloud there are three storage choices for an instance's boot disk, or root device: Instance Store, Elastic Block Store (EBS) and Simple Storage Service (S3). In this section, we briefly discuss these three storage settings, which are depicted in Figure 1[6].

Figure 1: Amazon EC2 Storage

2.2.1 ELASTIC BLOCK STORE

Amazon EBS volumes provide persistent block-level storage. Once an EBS volume is attached to an instance, we can create file systems on it and run databases on top of it. Amazon EBS provides three types of volumes: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic. The three volume types differ in performance characteristics and cost. In this thesis we attach only General Purpose volumes to our instances because of cost constraints.

2.2.1.1 GENERAL PURPOSE VOLUMES

General Purpose (gp2) is currently the default EBS volume type when creating a volume with “Create Volume” in the EC2 console. General Purpose volumes are backed by Solid-State Drives (SSDs) and are suitable for a broad range of workloads, including small to medium-sized databases, development and test environments, and boot volumes. General Purpose volumes provide the ability to burst up to 3,000 IOPS per volume, independent of volume size, to meet the performance needs of most applications. General Purpose volumes also deliver a consistent baseline of 3 IOPS/GB and provide up to 128 MBps of throughput per volume. I/O is included in the price of General Purpose volumes, so you pay only for each GB of storage you provision.
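To make the relationship between volume size and IOPS concrete, the following minimal Python sketch applies the two figures quoted above (a baseline of 3 IOPS/GB and a burst ceiling of 3,000 IOPS). The function name `gp2_iops` and the chosen volume sizes are illustrative assumptions, and the sketch ignores any limits not stated in this section.

```python
# Figures taken from the paragraph above; the function and its names are
# illustrative assumptions, not part of any AWS API.
BASELINE_IOPS_PER_GB = 3   # consistent baseline, scales with volume size
BURST_IOPS = 3000          # burst ceiling, independent of volume size


def gp2_iops(size_gb):
    """Return (baseline_iops, burst_iops) for a volume of size_gb gigabytes."""
    if size_gb <= 0:
        raise ValueError("volume size must be positive")
    baseline = BASELINE_IOPS_PER_GB * size_gb
    return baseline, BURST_IOPS


if __name__ == "__main__":
    # Hypothetical volume sizes, chosen only to show the scaling.
    for size in (8, 100, 500):
        baseline, burst = gp2_iops(size)
        print(f"{size:>4} GB: baseline {baseline} IOPS, burst up to {burst} IOPS")
```

Under these figures a small volume depends heavily on bursting, which is worth keeping in mind when interpreting results from long-running jobs.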

Measured in IOPS, General Purpose (SSD) volumes offer 10 times more input/output operations per second at one-tenth the latency of Magnetic volumes, as well as greater bandwidth and consistency. When we need more IOPS than General Purpose (SSD) volumes provide, or when we have a workload where performance consistency is critical, Amazon EBS Provisioned IOPS (SSD) volumes are the appropriate choice.

2.2.2 INSTANCE STORE

Instance-store volumes[3] are temporary storage: they survive a reboot of an EC2 instance, but when the instance is stopped or terminated (e.g., by an API call, or due to a failure), the stored data is lost. In contrast, EBS volumes are built on replicated storage, so that the failure of a single component will not cause data loss.

Because instance-store volumes are temporary, you should not rely on them to keep long-term data, or any other data that you would not want to lose when a failure happens (e.g., stopping and starting an instance, a failure of the underlying hardware, or terminating an instance). For these purposes, it is better to choose persistent storage such as EBS or S3. In addition, an instance backed by instance store cannot be upgraded, and its storage is not scalable. In exchange for being non-persistent, however, instance store is faster than EBS.

2.3 IO ANALYSIS TOOLS

Linux has some excellent tools for tracing I/O request queue operations in the block layer, for example iostat, iotop[20], and sar[19]. iotop reports a number of I/O statistics for a particular system, but it gives only an overall picture without much detail, so it is not recommended for determining how an application performs its I/O. iotop gives only an idea of the throughput, not the IOPS, that the application is generating.

iostat[18] is the go-to tool for Linux storage performance monitoring and allows you to collect quite a few I/O statistics. It is available nearly everywhere, it works on the vast majority of Linux machines, and it is relatively easy to use and understand. Relative to iotop, iostat provides a much larger array of statistics, but it does not separate I/O usage on a per-process basis; instead, it gives an aggregate view of all I/O usage.

sar[19] is one of the most common tools for gathering information about system performance. Like iotop, it runs on each compute node and gathers I/O statistics, but it examines the I/O pattern of an application only at a high level. To get around these limitations, we need to go deeper and watch I/O statistics in more depth.

2.3.1 BLKTRACE/BLKPARSE

The tools that come with the kernel for watching I/O statistics in depth are blktrace and blkparse. These are very powerful tools. Blktrace is a block layer IO tracing mechanism that provides detailed information about request queue operations up to user space. Blktrace transfers event traces from the kernel either into long-term storage or directly to blkparse for formatted output. Compared to the other tools discussed above, it provides far more detailed information about request queue operations. Blktrace needs no special support or configuration apart from having debugfs mounted on /sys/kernel/debug. The blkparse utility formats events stored in files, or formats data collected by blktrace directly as it is produced. The general architecture of blktrace is shown in Figure 2[21].

Figure 2: blktrace General Architecture
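As a rough illustration of how blkparse text output can be post-processed, the following Python sketch parses lines in blkparse's default per-event format (device, CPU, sequence number, timestamp, PID, action, RWBS flags) and counts events per action code. The sample lines are made up for illustration, and the helper names `parse_line` and `count_actions` are hypothetical; this is not the analysis script used in this thesis.

```python
import re
from collections import Counter

# Leading fields of a blkparse event line in its default format:
# major,minor  cpu  sequence  timestamp  pid  action  rwbs  ...
LINE_RE = re.compile(
    r"^\s*(?P<major>\d+),(?P<minor>\d+)"
    r"\s+(?P<cpu>\d+)"
    r"\s+(?P<seq>\d+)"
    r"\s+(?P<time>\d+\.\d+)"
    r"\s+(?P<pid>\d+)"
    r"\s+(?P<action>[A-Za-z]+)"
    r"\s+(?P<rwbs>\S+)"
)


def parse_line(line):
    """Parse one event line; return a dict, or None for non-event lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "device": (int(m["major"]), int(m["minor"])),
        "cpu": int(m["cpu"]),
        "seq": int(m["seq"]),
        "time": float(m["time"]),
        "pid": int(m["pid"]),
        "action": m["action"],
        "rwbs": m["rwbs"],
    }


def count_actions(text):
    """Count events per action code (Q, G, I, D, C, ...) in blkparse output."""
    events = (parse_line(line) for line in text.splitlines())
    return Counter(ev["action"] for ev in events if ev is not None)


# Made-up sample in the default blkparse layout (illustrative values only).
SAMPLE = """\
  8,0    3        1     0.000000000   697  Q   W 223490 + 8 [kjournald]
  8,0    3        2     0.000001829   697  G   W 223490 + 8 [kjournald]
  8,0    3        3     0.000004000   697  I   W 223490 + 8 [kjournald]
  8,0    3        4     0.000010000   697  D   W 223490 + 8 [kjournald]
  8,0    3        5     0.000500000     0  C   W 223490 + 8 [0]
"""

if __name__ == "__main__":
    for action, n in sorted(count_actions(SAMPLE).items()):
        print(action, n)
```

Lines that do not start with a `major,minor` device pair, such as the summary section blkparse prints at the end, are skipped rather than treated as errors.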

There are around 20 different events produced by blktrace, of which we use only a few. Below we list the events that we use; for a full list, refer to the blkparse manual page[17].

Request Inserted (I): Indicates that an IO request has been inserted onto the request queue.