Computation and Storage in the Cloud: Understanding the Trade-Offs (Elsevier Insights)

Dong Yuan

Language: English

Pages: 128

ISBN: 0124077676

Format: PDF / Kindle (mobi) / ePub


Computation and Storage in the Cloud is the first comprehensive and systematic work investigating the issue of computation and storage trade-off in the cloud in order to reduce the overall application cost. Scientific applications are usually computation and data intensive, where complex computation tasks take a long time for execution and the generated datasets are often terabytes or petabytes in size. Storing valuable generated application datasets can save their regeneration cost when they are reused, not to mention the waiting time caused by regeneration. However, the large size of the scientific datasets is a big challenge for their storage. By proposing innovative concepts, theorems and algorithms, this book will help bring the cost down dramatically for both cloud users and service providers to run computation and data intensive scientific applications in the cloud.

  • Covers cost models and benchmarking that explain the necessary tradeoffs for both cloud providers and users
  • Describes several novel strategies for storing application datasets in the cloud
  • Includes real-world case studies of scientific research applications

Kernel Adaptive Filtering: A Comprehensive Introduction

Elements of Computer Security (Undergraduate Topics in Computer Science)

Database Design for Mere Mortals (3rd Edition)

Effective DevOps: Building a Culture of Collaboration, Affinity, and Tooling at Scale (1st Edition)

Programming Massively Parallel Processors: A Hands-on Approach (2nd Edition) (Applications of GPU Computing Series)

An Introduction to Functional Programming Through Lambda Calculus (International Computer Science Series)

of its deleted predecessors, if any. This makes minimising the total application cost a very complex problem. Data provenance is an important kind of metadata that records the dependencies among data sets [70], i.e. how the data sets were generated. Data provenance is especially important for scientific applications in the cloud because regenerating data sets from the original data may be very time-consuming and therefore carry a high cost. With data provenance

g: In the CTT-SP algorithm, the rules for setting weights to the edges guarantee that the paths from the start data set $d_s$ to every data set $d_i$ in the CTT represent the storage strategies of the data sets $\{d_k \mid d_k \in DDG \wedge d_s \to d_k \to d_i\}$, and Corollary 5.1 further indicates that the SP represents the MCSS. As defined in Section 5.1.1, the weight of the edge $e\langle d_i, d_j\rangle$ is the sum of the cost rates of $d_j$ and the data sets between $d_i$ and $d_j$, supposing that only $d_i$ and $d_j$ are stored and the rest of data sets
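The edge-weight rule quoted above can be illustrated with a minimal sketch for a linear DDG. This is not the book's implementation: the function name `ctt_sp_linear` and the parameter names `x` (generation cost from the direct predecessor), `y` (storage cost rate) and `v` (usage frequency) are assumptions made for illustration, as is the assumption that the cost rate of a deleted data set is its usage frequency times its generation cost accumulated from the nearest stored predecessor. Virtual start and end nodes stand in for $d_s$ and $d_e$.

```python
import heapq

def ctt_sp_linear(x, y, v):
    """Sketch of a shortest-path search over a cost graph for a linear DDG.

    x[k], y[k], v[k] are the (assumed) generation cost, storage cost rate
    and usage frequency of data set d_(k+1). Returns the minimum total cost
    rate and the 1-based indices of the data sets to store.
    """
    n = len(x)
    start, end = 0, n + 1  # virtual start (ds) and virtual end (de) nodes

    def edge_weight(i, j):
        # Cost rate of storing d_j (the virtual end node costs nothing) ...
        w = y[j - 1] if j <= n else 0.0
        # ... plus the cost rates of the deleted data sets between d_i and d_j.
        gen = 0.0
        for k in range(i + 1, j):
            gen += x[k - 1]          # regeneration starts from stored d_i
            w += gen * v[k - 1]
        return w

    # Dijkstra over the graph with an edge from every node to every later node.
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == end:
            break
        for j in range(i + 1, end + 1):
            nd = d + edge_weight(i, j)
            if nd < dist.get(j, float("inf")):
                dist[j] = nd
                prev[j] = i
                heapq.heappush(heap, (nd, j))

    # Walk back from the end node; intermediate nodes are the stored data sets.
    stored = []
    node = prev[end]
    while node != start:
        stored.append(node)
        node = prev[node]
    stored.reverse()
    return dist[end], stored

if __name__ == "__main__":
    cost, stored = ctt_sp_linear([10, 1, 20], [1, 5, 2], [0.5, 0.1, 0.2])
    print(cost, stored)
```

Each path from the virtual start to the virtual end corresponds to one choice of stored data sets, so the shortest path yields the cheapest storage strategy under the stated cost assumptions. The sketch builds edge weights on demand, which keeps it short at the price of recomputing sums.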

have:

$$
\begin{aligned}
& TCR_{i,j} = TCR_{i',j'} \\
\Rightarrow\; & X \sum_{k=1}^{i-1} v_k + SCR_{i,j} + V \sum_{k=j+1}^{n_l} x_k = X \sum_{k=1}^{i'-1} v_k + SCR_{i',j'} + V \sum_{k=j'+1}^{n_l} x_k \\
\Rightarrow\; & \Big( \sum_{k=1}^{i'-1} v_k - \sum_{k=1}^{i-1} v_k \Big) X + \Big( \sum_{k=j'+1}^{n_l} x_k - \sum_{k=j+1}^{n_l} x_k \Big) V + \big( SCR_{i',j'} - SCR_{i,j} \big) = 0
\end{aligned}
\tag{5.8}
$$

Algorithm: Find S_All
Input: DDG_LS {d1, d2, … dnl}
Output: S_All
    Create CTT for DDG_S;
    Smin = Su,v = Dijkstra_Path (CTT, ds, de);
    Smax = S1,nl = Dijkstra_Path (CTT, d1, dnl);
    Add Smin, Smax to S_All;

we can see from the zoom-in chart (bottom plane) in Figure 7.4, the time for calculating a new minimum cost benchmark is generally on the order of seconds, and hence much more efficient. This is because we take advantage of the pre-calculated PSSs that are saved in the hierarchy (see Section 5.2.4) and only need to recalculate the PSS of the local DDG_LS to derive the new benchmark. Hence the complexity of calculating the new benchmark is more or less independent of the size of the DDG. More

Cost-Effective Storage Strategies
3.4 Summary
4 Cost Model of Data Set Storage in the Cloud
4.1 Classification of Application Data in the Cloud
4.2 Data Provenance and DDG
4.3 Data Set Storage Cost Model in the Cloud
4.4 Summary
5 Minimum Cost Benchmarking Approaches
5.1 Static On-Demand Minimum Cost Benchmarking Approach
5.1.1 CTT-SP Algorithm for Linear DDG
5.1.2 Minimum Cost Benchmarking Algorithm for DDG with One Block
