A Bucket Full Of Objects — Amazon S3

Amazon is not just about books anymore. Since 2001, Amazon Web Services (AWS) has found ways of deriving income from the excess capacity of the network infrastructure used to run its own e-commerce business. One of these products, the Simple Storage Service (S3), allows users to store their files, known as “objects,” online. As of March 2010, it was estimated that S3 contained over 102 billion objects.

Objects are referenced by their “keys,” which consist of an optional “pseudo folder” (directory) path name, followed by the name of the object. The keys “His-Stuff/test.txt” and “Her-Stuff/test.txt” refer to two separate instances of the “test.txt” object. Because the “folder” parts of the keys differ, the keys are unique, and so are the object instances. The term “pseudo folder” is used because S3 does not really store objects in folders the way that Windows, OS X, or Linux does. The entire object key is considered (by S3) to be the equivalent of a file name.
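To make the "pseudo folder" idea concrete, here is a minimal Python sketch. It models a bucket as a plain dictionary keyed by the full object key, which is an assumption made purely for illustration and says nothing about how S3 stores data internally. The helper name split_key is hypothetical.

```python
from typing import Dict, Tuple


def split_key(key: str) -> Tuple[str, str]:
    """Split an object key into (pseudo-folder prefix, object name).

    The split is only a reading convention: the storage side treats the
    whole key as one flat name.
    """
    prefix, _, name = key.rpartition("/")
    return prefix, name


# A toy "bucket": a flat mapping from full key to object contents.
bucket: Dict[str, bytes] = {
    "His-Stuff/test.txt": b"his data",
    "Her-Stuff/test.txt": b"her data",
}

for key in bucket:
    print(key, "->", split_key(key))

# Both keys end in the same object name, yet they are distinct entries
# because the full key, not the trailing name, identifies the object.
print(len(bucket))
```

Both entries survive side by side because the dictionary, like the bucket in the description above, only ever compares complete keys.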

Object keys are stored in “buckets.” Bucket names are unique across the entire S3 system. Each bucket is owned by an AWS account. An AWS account can own many buckets, but each bucket can be owned by only one AWS account at a time.

Buckets are assigned to geographical “regions,” in which objects contained in the bucket are stored. Currently, there are five regions: US East (N. Virginia), US West (N. California), EU West (Ireland), Asia Pacific Southeast (Singapore), and Asia Pacific Northeast (Tokyo). Objects within a bucket are stored across multiple data centers within the bucket’s region. A sixth umbrella region called US Standard refers to either US East or West. With the recent announcement of the Asia Pacific (Tokyo) region, it is expected that an umbrella “Asia Pacific Standard” region will also be created. Unless explicitly specified, network mapping is used to automagically determine the region to which newly created buckets are assigned.

There are three reasons why you would explicitly assign a bucket to a particular region: to reduce latency, for regulatory reasons, and to avoid regional volatility. If you are in an EU country, then by storing your objects in the EU region you reduce the amount of time needed to retrieve your data. Similarly, for reasons of compliance with EU privacy laws, objects containing customer information may have to be kept in buckets assigned to the EU region. Finally, if you have objects containing sensitive information and you are concerned about the ability of data centers within your default region to protect them, you might create buckets that are assigned to other, more stable or data-friendly regions.

Objects can be public or private (default). Public objects are accessible by anyone who knows the bucket and key names of the object. Private objects can only be accessed by the bucket’s owner, by explicit permission, or through an expiring URL request. E-commerce products like WP eStore use the power of expiring URL requests to enhance the download experience for digital products. Once a download has been authorized, an expiring URL request is generated and the user is redirected to S3 to download the digital product.

Expiring URL requests grant access to specific private objects within a bucket for a short period of time. A bookmarked URL request, even combined with knowledge of the bucket and object names, is useless once the URL request has expired.

By redirecting users to the S3 network for downloads, you dramatically reduce the resource burden associated with large downloads by shifting it from your own server to a faster and more reliable one. This is analogous to the concept of having physical products kept and shipped to customers from a warehouse location that is separate from the retail location.

Redirection of downloads to S3 is not the same as simply using S3 to store files and then copying them through your server to the user. Without redirection, your server is burdened not only with having to act as a middleman during the download process, but also with the additional bandwidth required to first copy the data to your own server and then resend it to the user. That is why it is important to look for native Amazon S3 integration when considering e-commerce products.
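A rough back-of-the-envelope comparison shows the difference. The sketch below assumes that a proxying server fetches the file from S3 on every download with no local caching, and that a redirect response costs about 500 bytes; both figures, and the function name, are illustrative assumptions rather than measurements.

```python
# Assumption: approximate size of one HTTP redirect response, in bytes.
REDIRECT_RESPONSE_BYTES = 500


def server_bytes(file_size: int, downloads: int, redirect: bool) -> int:
    """Bytes that pass through your own server for a batch of downloads."""
    if redirect:
        # The server only answers with a small redirect; S3 sends the file.
        return downloads * REDIRECT_RESPONSE_BYTES
    # Middleman: each download is copied in from S3, then sent out again.
    return downloads * 2 * file_size


file_size = 100 * 1024 ** 2   # a 100 MB digital product
downloads = 1000

proxied = server_bytes(file_size, downloads, redirect=False)
redirected = server_bytes(file_size, downloads, redirect=True)

print(f"proxied:    {proxied / 1024 ** 3:.1f} GB through your server")
print(f"redirected: {redirected / 1024:.1f} KB through your server")
```

Under these assumptions the proxied approach pushes roughly 195 GB through your server for a thousand downloads, while redirection costs it well under a megabyte.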

The cost of using S3 is quite low and there is no monthly minimum.  Monthly cost is dependent upon the amount of data stored, the number of times an object is uploaded/downloaded, and the amount of bandwidth used.  Amazon offers a Free Usage Tier that allows you to try out all AWS products, including S3, for one year.
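The three cost components can be combined into a simple estimate. The rates in this sketch are placeholder numbers chosen for easy arithmetic, not Amazon's actual prices, and the function and parameter names are hypothetical; check the current AWS price list before relying on any figure.

```python
def monthly_cost(stored_gb, requests, transfer_gb,
                 storage_rate, request_rate_per_1000, transfer_rate):
    """Estimate a monthly bill from storage, request count, and bandwidth."""
    storage = stored_gb * storage_rate
    request_fees = (requests / 1000) * request_rate_per_1000
    bandwidth = transfer_gb * transfer_rate
    return storage + request_fees + bandwidth


# Placeholder rates (assumptions for illustration only).
estimate = monthly_cost(
    stored_gb=50,                # data kept in the bucket
    requests=20_000,             # uploads plus downloads
    transfer_gb=200,             # data sent out to customers
    storage_rate=0.10,           # per GB stored per month
    request_rate_per_1000=0.01,  # per 1,000 requests
    transfer_rate=0.15,          # per GB transferred
)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

With these placeholder rates the estimate is dominated by bandwidth, which is typical for a store that sells large downloads.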
