[ec2-beta] s3 fuse in ubuntu

Jon Marston jon.marston at babelcentral.com
Mon Apr 20 22:08:56 BST 2009


Hello,

Thanks for the thoughtful response.

I've got ebs running for different purposes than what I'm using s3 for. My goal is to mount an s3 bucket as a read-only filesystem, to provide file-like read access to the contents of various s3 buckets. There are several use cases for this, but here is the one that got me thinking about it.

The problem I've got is that I'm storing user-generated assets in s3. What is the best way to back that up to a system completely outside of Amazon? I also want to have incremental backups of my ec2 servers outside of the Amazon system. Putting everything on an ebs volume and taking an ebs snapshot every day isn't an economical solution. Plus, I want redundancy in case my Amazon account gets hacked and somebody wipes everything out with a series of clicks in a web interface.

The solution I've set up is rdiff-backup running on each ec2 server, and a Drobo box in my basement that runs an rsync cron job every night to pull down the data that has changed. This gets me incremental and redundant backups of my servers. Unfortunately my Drobo box doesn't speak s3; it can only do rsync and ssh. My preferred solution is to have the ec2 instances mount the s3 buckets as volumes, and rsync the contents of those volumes to a remote location.
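To make the nightly pull concrete, here is a rough Python sketch of the rsync invocation the backup box could run against an ec2 instance that has a bucket mounted. The host name, mount point, and destination path are made up for illustration, and the flags are just one reasonable choice, not a tested recipe.

```python
import subprocess


def build_rsync_command(remote_host, remote_dir, local_dir, dry_run=False):
    """Build an rsync-over-ssh command that pulls remote_dir into local_dir.

    All arguments are hypothetical examples; nothing here is specific to
    any particular s3-fuse project.
    """
    # -a preserves the directory tree and timestamps, -z compresses in transit.
    # --delete is deliberately left out so that a wiped bucket does not
    # also wipe the off-site copy.
    command = ["rsync", "-az", "-e", "ssh"]
    if dry_run:
        command.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    source = f"{remote_host}:{remote_dir.rstrip('/')}/"
    command += [source, local_dir]
    return command


def pull_backup(remote_host, remote_dir, local_dir):
    """Run the pull and raise CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(remote_host, remote_dir, local_dir), check=True)
```

A cron job would then call something like pull_backup("backup@ec2-host.example.com", "/mnt/s3/assets", "/backups/assets/"), where all three values are placeholders.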

So I've got a fairly narrow and specific use case, but having fuse access to s3 buckets means that any app can work with s3 without needing to implement the entire API. In this case my app is rsync. Note that there are a number of s3-fuse projects available, but none of them is tested and packaged for Ubuntu.

And I fully understand that proper file security and log analysis are required when putting your credentials on a live server. As a youth, I was taught that the best way to keep a secret is to not tell anybody. These days I get queasy when I put private keys on public servers, and as such I tread very carefully when putting my aws credentials on the live server. Some sort of obfuscation can go a long way. JungleDisk implements such obfuscation. However, I was unable to make use of JungleDisk for my use case because my files are uploaded programmatically using jets3t and manually using s3fox. JungleDisk gets flaky when working with files uploaded with anything other than JungleDisk.

Jon Marston


On 4/20/09 4:31 PM, "Jim Cheetham" <jim at inode.co.nz> wrote:

2009/4/20 Jon Marston <jon.marston at babelcentral.com>:
> I'm interested in mounting s3 buckets as drives in my ubuntu ec2 instances.

S3 isn't a very good place to keep a filesystem in (yes, you can
present virtually anything as a filesystem these days, including Gmail
mailboxes ... ), but it is a very good place to keep file objects --
especially ones that are publicly available over HTTP.

EBS provides a better file-system-able storage medium, stored
effectively in the same place; in Amazon. The only downside to EBS
that I can see at the moment is that it isn't easy to access it
directly with a simple client program, you can only get to it from a
running EC2 instance.

What is your intended usage? And do you have any particular standard
in mind for representing a filesystem in S3? (i.e. the various tools
often use different conventions for directories -- for example,
inherently it's difficult to represent an empty directory in S3 and
the common tools seem to choose different incompatible mechanisms)
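As a rough illustration of the incompatibility, the Python sketch below recovers directory names from a flat list of keys under two marker conventions: a zero-byte object whose key ends in "/", and a key ending in "_$folder$". Treat both as example conventions only; which tool uses which is not established in this thread, and the function name is made up.

```python
FOLDER_SUFFIX = "_$folder$"  # example marker convention, assumed for illustration


def list_directories(keys):
    """Return the sorted set of directories implied by a flat list of keys."""
    directories = set()
    for key in keys:
        if key.endswith("/"):
            # Convention A: an empty object named "dir/" marks the directory.
            directories.add(key.rstrip("/"))
        elif key.endswith(FOLDER_SUFFIX):
            # Convention B: an empty object named "dir_$folder$" marks it.
            directories.add(key[: -len(FOLDER_SUFFIX)])
        # Either way, every "/"-separated prefix of a key is an implied parent.
        parts = key.split("/")[:-1]
        for depth in range(1, len(parts) + 1):
            directories.add("/".join(parts[:depth]))
    return sorted(directories)
```

A tool that only understands one convention will see the other's marker as an ordinary (and oddly named) empty file, which is why an empty directory created by one client can vanish or turn into a stray object in another.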

> I'm also not too keen to put my aws credentials as plain text on my ec2 instance.

I'm not sure I agree with you there; certainly instead of embedding
the AWS credentials in an AMI, you probably should start up an
instance, and then connect to it with a scripted process over ssh to
copy in the current credentials. Give the files very restricted file
permissions, and then only one user account can access them. If you
are concerned about someone cracking in to your instance while it is
running, then as well as setting up appropriate firewall configs (both
on the machine and on the EC2 gateway, restricting available services
to specific originating IPs) you should consider active log collection
for intrusion detection. Just the same as you would do on any other
internet-facing machine, really.
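The "very restricted file permissions" step could look something like the Python sketch below, run on the instance once the credentials have been copied in over ssh. The file path and the "id:secret" layout are assumptions for illustration; use whatever format your s3 tool actually expects.

```python
import os


def write_credentials(path, access_key_id, secret_access_key):
    """Write credentials to path so that only the owning user can read them."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # Create the file with mode 0600 from the start, so there is no window
    # in which it is readable by other accounts.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w") as handle:
        # Force 0600 even if the file already existed with looser permissions.
        os.fchmod(handle.fileno(), 0o600)
        # Hypothetical one-line layout, not any specific tool's format.
        handle.write(f"{access_key_id}:{secret_access_key}\n")
```

Creating the file with the restrictive mode up front, rather than writing it and running chmod afterwards, avoids briefly exposing the secret to other local users.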

-jim
