Octopus Cloud Storage System

Octopus is a software service designed to provide a highly available, cloud-based storage solution.

The following table lists some existing S3-compatible public cloud services and private cloud solutions:

Service	Public/Private Cloud	Note
Amazon S3	Public
Cloudian	Public
Connectria Cloud Storage	Public	It is unclear if this service is still available
Dunkel Cloud Storage	Public
Google Cloud Storage	Public
Host Europe Cloud Storage	Public	Defunct since the end of 2014
HP Helion Public Cloud	Public	Defunct since January 2016
Nirvanix	Public	Defunct since September 2013
Apache CloudStack	Private
Ceph	Private
Cumulus (Nimbus)	Private
Minio	Private
pWalrus	Private	Parallel version of Walrus
Riak Cloud Storage	Private
S3ninja	Private	Emulates the S3 API for development and testing purposes
Swift (OpenStack)	Private
Walrus (Eucalyptus)	Private

Octopus aims to support multiple S3-compatible services. Support for Amazon S3 and Walrus is currently implemented. See the list of implemented features below.

Publications

Octopus - A Redundant Array of Independent Services (RAIS). Christian Baun, Marcel Kunze, Denis Schwab, Tobias Kurze. Proceedings of the 3rd International Conference on Cloud Computing and Services Science (CLOSER 2013), Aachen. SCITEPRESS. ISBN: 978-989-8565-52-5, pp. 321-328.
Redundant Cloud Storage with Octopus. Christian Baun, Marcel Kunze. This unpublished paper from August 2011 summarizes the features and design of Octopus.

How it works

Octopus is designed to run inside a PaaS such as Google App Engine, AppScale, or TyphoonAE.

One benefit of a cloud platform is that users do not need to install any software on the client side.

Users can import their credentials for S3 and Walrus services into Octopus. Octopus then checks whether a bucket with the naming scheme octopus_storage-at- exists. If not, the bucket is created. Users can then upload files (called objects in the S3 world) with one click into the Octopus bucket on each connected storage service.
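The bucket check described above can be sketched as follows. This is a minimal illustration and not the actual Octopus code: `InMemoryService` is a hypothetical stand-in for an S3-compatible service, and the bucket name is passed in as a parameter because the full naming scheme is not reproduced here.

```python
class InMemoryService:
    """Hypothetical stand-in for an S3-compatible storage service."""

    def __init__(self, name):
        self.name = name
        self.buckets = {}  # bucket name -> {object key: object bytes}

    def bucket_exists(self, bucket_name):
        return bucket_name in self.buckets

    def create_bucket(self, bucket_name):
        self.buckets[bucket_name] = {}


def ensure_bucket(service, bucket_name):
    """Create the bucket if it is missing; return True if it was created."""
    if service.bucket_exists(bucket_name):
        return False
    service.create_bucket(bucket_name)
    return True
```

Calling `ensure_bucket` once per imported credential set keeps the check idempotent: a second call for the same bucket changes nothing.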

The following figure shows the steps to upload an object. After the customer logs in, the client requests (1) the Octopus website with the HTML form and the list of objects.

The object list is requested (2) from the storage services and transferred (3) to Octopus. Octopus then checks (4) the synchronicity of the objects using checksums. All S3-compatible services store an MD5 checksum for each object. These checksums are transferred automatically whenever a list of objects is requested, and they make it possible to verify whether the objects located at the different storage services are synchronized. Each time a list of objects is requested, Octopus checks whether the objects are still synchronized across the storage services. After the synchronicity check, the website with the HTML form is transferred (5) to the customer's browser. Once the customer has selected a local file and started the upload with the submit button, the object is transferred (6) to the first storage service. If the upload was successful, a confirmation message is sent (7) back to the browser. Steps 6 and 7 are repeated for each additional storage service used.
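The synchronicity check in step 4 can be illustrated with a short sketch. It assumes the object lists have already been fetched and reduced to plain dictionaries; the function name and the data layout are hypothetical and not taken from the Octopus source.

```python
def find_unsynchronized(listings):
    """Return the sorted keys whose checksums differ or are missing somewhere.

    `listings` maps a service name to a {object key: MD5 hex digest} dict,
    i.e. the information that comes back with each object list.
    """
    all_keys = set()
    for objects in listings.values():
        all_keys.update(objects)

    out_of_sync = []
    for key in sorted(all_keys):
        # A missing object yields None, which never matches a real digest.
        digests = {objects.get(key) for objects in listings.values()}
        if len(digests) > 1:
            out_of_sync.append(key)
    return out_of_sync
```

An object counts as synchronized only if every service reports the same checksum for it, so an object that exists on one service but not on another is reported as well.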

A drawback of Octopus is that files to be uploaded to the cloud storage services cannot be cached by Octopus itself, because applications inside the PaaS cannot store files. As a consequence, every file must be transferred to each connected storage service: if a user has credentials for multiple storage services, the file is transferred from the client (browser) to the storage services one after another.

Because each object is transferred directly from the customer's browser to every connected storage service, the amount of data to be transferred increases linearly with each additional storage service. Using multiple storage services therefore leads to correspondingly longer transfer times.
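The linear growth can be made concrete with a small calculation. The sketch below assumes sequential uploads over a constant uplink and ignores protocol overhead; both function names are hypothetical.

```python
def total_upload_bytes(object_size, service_count):
    """Bytes the browser must send when the object goes to every service."""
    return object_size * service_count


def total_upload_seconds(object_size, service_count, bytes_per_second):
    """Estimated transfer time for sequential uploads over a constant uplink."""
    return total_upload_bytes(object_size, service_count) / bytes_per_second
```

Under these assumptions, a 100 MB object sent to two services over a 1 MB/s uplink takes about 200 seconds, twice as long as sending it to a single service.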

Octopus is written in Python and JavaScript. Communication with the S3-compatible storage services is handled by boto, a Python interface to Amazon Web Services. The user interface is HTML (generated with Django) with some JavaScript (jQuery).

Implemented Features

Import of credentials for Amazon S3 and Walrus.
RAID-1 mode. Upload to one or two storage services with a single click.
Check for synchronicity with the help of MD5 checksums.
Erase objects inside different storage services with a single click.
Erase all objects in all storage services with a single click.
Alter Access Control List (ACL) of objects inside one or two storage services with a single click.

Next Steps

Implementation of automatic repair when the synchronicity check fails.
Currently, each user can import credentials for only one Amazon S3 account and a single Walrus Private Cloud storage service.
Implement support for Google Blobstore. Objects (called blobs) of up to 2 GB can be uploaded into the Blobstore via HTTP POST and then accessed from App Engine applications. Blobstore could be used as a proxy for Octopus to avoid multiple uploads from the browser to the storage services.

Google Storage could also be used as a proxy for Octopus, because objects inside Google Storage can be accessed from applications running inside App Engine.
Implementation of a RAID-5 mode. The benefits would be that no provider holds a full (working) copy of the customer's data and that the data remains available if a provider ceases operation.
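The idea behind the planned RAID-5 mode can be sketched with simple XOR parity. This is only an illustration of the principle under simplifying assumptions, not a design taken from Octopus: it treats the whole object as a single stripe with one dedicated parity chunk (real RAID-5 rotates parity across stripes), pads the data with zero bytes, and uses hypothetical function names.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data, provider_count):
    """Split data into provider_count - 1 chunks plus one XOR parity chunk."""
    if provider_count < 3:
        raise ValueError("this sketch needs at least three providers")
    data_chunks = provider_count - 1
    chunk_len = -(-len(data) // data_chunks)  # ceiling division
    padded = data.ljust(chunk_len * data_chunks, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_chunks)]
    chunks.append(reduce(xor_bytes, chunks))  # parity chunk goes last
    return chunks


def rebuild_chunk(chunks, missing_index):
    """Rebuild the chunk at missing_index by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != missing_index]
    return reduce(xor_bytes, survivors)
```

Each provider would store exactly one chunk, so none of them holds the complete object, and any single lost chunk can be recomputed from the others. A real implementation would also need to record the original object length in order to strip the padding.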

Challenges and Limitations

Cumulus does not yet support uploading objects via POST. Future releases may add this feature, which would allow Octopus to use Cumulus.
In S3 and Google Storage, the MD5 checksums are enclosed in double quotes. In Walrus, they are not.
If an object is uploaded into Walrus from a form without a submit button, some bytes of garbage data are appended to the object.
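The difference in checksum quoting means the values have to be normalized before they can be compared across services. A minimal sketch, assuming the checksum arrives as a plain string; `normalize_checksum` is a hypothetical helper and not part of Octopus or boto:

```python
def normalize_checksum(raw):
    """Strip surrounding whitespace and one pair of enclosing double quotes."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    # Lowercase so hex digests compare equal regardless of letter case.
    return value.lower()
```

With this helper, the quoted form reported by S3 or Google Storage and the unquoted form reported by Walrus reduce to the same string for the same object.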

