Introduction

DepSky is a system that improves the availability, confidentiality and integrity of data stored in the cloud. It does so by encrypting, encoding and replicating all the data across a set of different clouds, forming a cloud-of-clouds. The current implementation of the system, and the text below, assume a cloud-of-clouds formed by four clouds.

More specifically, DepSky addresses four important limitations:

Protocols

Below is a brief explanation of the DepSky protocols for storing data in a cloud-of-clouds. All of them replicate the data to every cloud in use, but the data is only guaranteed to be properly stored in three of them (due to the Byzantine quorums).
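The "three out of four" guarantee can be read as simple quorum arithmetic. The sketch below assumes the usual Byzantine setting of n >= 3f + 1 clouds tolerating f faulty ones, with an operation waiting for n - f replies. The class and method names are hypothetical and are not part of the DepSky API.

```java
// Illustrative arithmetic only; QuorumSketch is a hypothetical name,
// not a class shipped with DepSky.
final class QuorumSketch {

    /** Largest number of faulty clouds tolerated with n clouds (n >= 3f + 1). */
    static int maxFaults(int n) {
        return (n - 1) / 3;
    }

    /** Number of clouds that must acknowledge an operation (n - f). */
    static int quorumSize(int n) {
        return n - maxFaults(n);
    }

    public static void main(String[] args) {
        int n = 4; // the four-cloud setup described in this document
        System.out.println("f = " + maxFaults(n) + ", quorum = " + quorumSize(n));
    }
}
```

With four clouds this gives f = 1 and a quorum of three, which matches the values N = 4 and F = 1 that the client constructor sets.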

DepSky-A

This protocol replicates all the data in clear text in each cloud.

DepSky-CA

This protocol uses secret sharing and erasure coding techniques to replicate the data in a cloud-of-clouds. The image below shows how this is done. First, an encryption key is generated and the original data block is encrypted with it. Then the encrypted data block is erasure coded and key shares of the encryption key are computed. In this case we get four erasure-coded blocks and four key shares because we use four clouds. Lastly, each cloud stores a different coded block together with a different key share.
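As a rough illustration of the first two steps only (key generation and encryption), here is a minimal sketch using the standard javax.crypto API. The cipher choice (AES-128 in CBC mode) and the class name EncryptStepSketch are assumptions made for this example; this document does not state which cipher DepSky uses. The erasure coding and key sharing steps are left out.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper, not part of DepSky: generates a fresh key and
// encrypts a data block with it.
final class EncryptStepSketch {

    /** Generates a fresh random AES key (128 bits is an assumption). */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Generates a random 16-byte initialization vector for CBC mode. */
    static byte[] newIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plain) {
        return crypt(Cipher.ENCRYPT_MODE, key, iv, plain);
    }

    static byte[] decrypt(SecretKey key, byte[] iv, byte[] encrypted) {
        return crypt(Cipher.DECRYPT_MODE, key, iv, encrypted);
    }

    private static byte[] crypt(int mode, SecretKey key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, key, new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the protocol described above, the output of encrypt would then be erasure coded, and the key would be split into shares rather than stored whole.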

<FIGURE>

DepSky-only-JSS

This protocol uses only secret sharing. An encryption key is generated and the data is encrypted. Then four key shares of the key are generated. Finally, the encrypted data is sent to each cloud together with a different key share.
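To make the idea of key shares concrete, the following is a minimal sketch of Shamir secret sharing over a prime field, written with java.math.BigInteger. It is an illustration of the general technique, not the scheme implemented by the JSS library. The threshold of two shares, the field prime and the class name KeySharingSketch are assumptions chosen for this example.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Hypothetical illustration of threshold secret sharing; not DepSky code.
final class KeySharingSketch {

    // 2^521 - 1 is a Mersenne prime, comfortably larger than a 128- or 256-bit key.
    static final BigInteger P = BigInteger.ONE.shiftLeft(521).subtract(BigInteger.ONE);
    private static final SecureRandom RNG = new SecureRandom();

    /** Splits secret into n shares; share i (0-based) is the polynomial evaluated at x = i + 1. */
    static BigInteger[] split(BigInteger secret, int threshold, int n) {
        BigInteger[] coeffs = new BigInteger[threshold];
        coeffs[0] = secret; // the constant term hides the secret
        for (int i = 1; i < threshold; i++) {
            coeffs[i] = new BigInteger(P.bitLength() - 1, RNG);
        }
        BigInteger[] shares = new BigInteger[n];
        for (int x = 1; x <= n; x++) {
            BigInteger y = BigInteger.ZERO;
            for (int i = threshold - 1; i >= 0; i--) { // Horner evaluation
                y = y.multiply(BigInteger.valueOf(x)).add(coeffs[i]).mod(P);
            }
            shares[x - 1] = y;
        }
        return shares;
    }

    /** Recovers the secret by Lagrange interpolation at x = 0 from points (xs[j], ys[j]). */
    static BigInteger combine(int[] xs, BigInteger[] ys) {
        BigInteger secret = BigInteger.ZERO;
        for (int j = 0; j < xs.length; j++) {
            BigInteger num = BigInteger.ONE;
            BigInteger den = BigInteger.ONE;
            for (int m = 0; m < xs.length; m++) {
                if (m == j) continue;
                num = num.multiply(BigInteger.valueOf(xs[m])).mod(P);
                den = den.multiply(BigInteger.valueOf(xs[m] - xs[j])).mod(P);
            }
            BigInteger term = ys[j].multiply(num).multiply(den.modInverse(P)).mod(P);
            secret = secret.add(term).mod(P);
        }
        return secret;
    }
}
```

Under these assumptions, split(key, 2, 4) yields one share per cloud, and any two of them passed to combine recover the key, while a single share alone reveals nothing about it.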

DepSky-only-JEC

This protocol, on the other hand, uses only erasure codes to replicate the data. The data is erasure coded into four different blocks and each of them is stored in a different provider.

This protocol may be useful when your application already encrypts the data.
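The sketch below shows one way a "four blocks, any two suffice" erasure code can work. It is a toy Reed-Solomon-style code over GF(2^8), not the JEC implementation: each pair of data bytes (a, b) is treated as the polynomial p(x) = a + b*x, and block i stores p(i + 1). The class name ErasureCodeSketch and the choice of evaluation points 1 to 4 are assumptions made for illustration.

```java
// Hypothetical toy erasure code; not the encoder used by DepSky.
final class ErasureCodeSketch {

    /** Multiplies two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
    static int mul(int a, int b) {
        int product = 0;
        while (b != 0) {
            if ((b & 1) != 0) product ^= a;
            a <<= 1;
            if ((a & 0x100) != 0) a ^= 0x11d;
            b >>= 1;
        }
        return product;
    }

    /** Multiplicative inverse by exhaustive search (fine for a 256-element field). */
    static int inv(int a) {
        for (int c = 1; c < 256; c++) {
            if (mul(a, c) == 1) return c;
        }
        throw new IllegalArgumentException("zero has no inverse");
    }

    /** Encodes data into n blocks, each about half the size of the input. */
    static byte[][] encode(byte[] data, int n) {
        int half = (data.length + 1) / 2;
        byte[][] blocks = new byte[n][half];
        for (int j = 0; j < half; j++) {
            int a = data[2 * j] & 0xff;
            int b = (2 * j + 1 < data.length) ? data[2 * j + 1] & 0xff : 0; // zero padding
            for (int i = 0; i < n; i++) {
                blocks[i][j] = (byte) (a ^ mul(b, i + 1)); // p(i + 1) = a + b * (i + 1)
            }
        }
        return blocks;
    }

    /** Rebuilds the data from any two blocks with distinct evaluation points x1 and x2. */
    static byte[] decode(int x1, byte[] y1, int x2, byte[] y2, int length) {
        byte[] data = new byte[length];
        int invDx = inv(x1 ^ x2);
        for (int j = 0; j < y1.length; j++) {
            int b = mul((y1[j] ^ y2[j]) & 0xff, invDx); // slope of the line through both points
            int a = (y1[j] & 0xff) ^ mul(b, x1);        // intercept
            data[2 * j] = (byte) a;
            if (2 * j + 1 < length) data[2 * j + 1] = (byte) b;
        }
        return data;
    }
}
```

Note that the original length has to be kept alongside the blocks so that padding can be dropped on decode; how DepSky records this is not covered here.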

Costs

One might expect a DepSky client to pay four times more (with a cloud-of-clouds of four providers) than it would pay for a single cloud. With the DepSky-CA protocol this is not the case, thanks to erasure coding. The erasure coding technique used (see JEC) stores in each of the four cloud providers only half of the original data block size. So, using DepSky, the client pays only about twice as much as with a single cloud.

For more information, see the DepSky paper (EuroSys'11).
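The factor of two can be checked with a small calculation. The sketch below assumes each of n clouds stores 1/k of the block (k = 2 for the half-size blocks mentioned above, k = 1 for full replication) and ignores the small overhead of key shares and metadata. StorageCostSketch is a hypothetical name.

```java
// Illustrative arithmetic only; not part of DepSky.
final class StorageCostSketch {

    /** Total bytes stored across n clouds when each cloud holds 1/k of the block. */
    static long totalStored(long blockSize, int n, int k) {
        long perCloud = (blockSize + k - 1) / k; // ceiling division
        return perCloud * n;
    }

    public static void main(String[] args) {
        long block = 1_000_000L; // a 1 MB block
        System.out.println("full replication: " + totalStored(block, 4, 1) + " bytes");
        System.out.println("erasure coded:    " + totalStored(block, 4, 2) + " bytes");
    }
}
```

For a 1 MB block this gives 4 MB in total with full replication and 2 MB with half-size coded blocks.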


Getting Started with DepSky

This section explains how to create the provider accounts needed to form a cloud-of-clouds environment. If you want to test DepSky without creating the accounts, you can use local storage instead; see the section Testing DepSky below.

First of all, download the latest stable version available and extract it. Make sure you have Java 1.7 or later installed.

Once this is done, fill in the accounts.properties file (you can find it inside the config folder). To fill in this file you first need to create accounts with the cloud providers we support. To do that, follow the links below:

After creating the accounts you will have access to your API keys, which you can then use to fill in the accounts.properties file. To help you find your keys, follow the steps below.

If you only want to use Amazon S3 as your cloud storage provider, you can create a single account at Amazon S3 and use the example file provided (config/accounts_amazon.properties). To do that, copy the content of the accounts_amazon.properties file into config/accounts.properties. In this case, four different Amazon S3 locations will be used to store the data (US_Standard, EU_Ireland, US_West and AP_Tokyo).

Now all the setup is finished and DepSky is ready to be used.

Testing DepSky

To test DepSky we provide a simple main class, src.depskys.core.LocalDepSkySClient. To run it, use the DepSky_Run.sh script at the root of the project, providing 3 arguments:

Here is an example. If we run DepSky with the command below, we start a session with client id 0, all the data is replicated using erasure codes and secret sharing, and it is stored on the cloud providers.

$ ./DepSky_Run.sh 0 1 0

This main lets you read, write and delete data. Five commands are available:

This main is not enough to take advantage of all the functionality provided by DepSky. To learn more about what you can do with DepSky, read the next section.

Using DepSky as a Library

To start, you need to create a src.depskys.core.LocalDepSkySClient object. As you can see below, the constructor receives the client id and a boolean. If the boolean is set to false, local storage is used; otherwise cloud storage is used.

public LocalDepSkySClient(int clientId, boolean useModel) throws StorageCloudException {

        this.clientId = clientId;
        DepSkySKeyLoader keyLoader = new DepSkySKeyLoader(null);
        if(!useModel){
                this.cloud1 = new LocalDiskDriver("cloud1");
                this.cloud2 = new LocalDiskDriver("cloud2");
                this.cloud3 = new LocalDiskDriver("cloud3");
                this.cloud4 = new LocalDiskDriver("cloud4");
                this.drivers = new IDepSkySDriver[]{cloud1, cloud2, cloud3, cloud4};
        }else{  
                List<String[][]> credentials = null;
                try {
                        credentials = readCredentials();
                } catch (FileNotFoundException e) {     
                        System.out.println("accounts.properties file dosen't exist!");
                        e.printStackTrace();
                } catch (ParseException e) {
                        System.out.println("accounts.properties misconfigured!");               
                        e.printStackTrace();
                }
                this.drivers = new IDepSkySDriver[4];
                String type = null, driverId = null, accessKey = null, secretKey = null;
                for(int i = 0 ; i < credentials.size(); i++){
                        for(String[] pair : credentials.get(i)){
                                if(pair[0].equalsIgnoreCase("driver.type")){
                                        type = pair[1];
                                }else if(pair[0].equalsIgnoreCase("driver.id")){
                                        driverId = pair[1];
                                }else if(pair[0].equalsIgnoreCase("accessKey")){
                                        accessKey = pair[1];
                                }else if(pair[0].equalsIgnoreCase("secretKey")){
                                        secretKey = pair[1];
                                }
                        }
                        drivers[i] = DriversFactory.getDriver(type, driverId, accessKey, secretKey);
                }
        }       
        this.manager = new DepSkySManager(drivers, this, keyLoader);
        this.replies = new HashMap<Integer, CloudRepliesControlSet>();
        this.N = drivers.length;
        this.F = 1;
        this.encoder = new ReedSolEncoder(2, 2, 8);
        this.decoder = new ReedSolDecoder(2, 2, 8);

        if(!startDrivers()){
                System.out.println("Connection Error!");
        }
  }

The second step is to create as many src.depskys.core.DepSkySDataUnit objects as you want. Each object of this type represents a unit of our storage model. Concretely, a src.depskys.core.DepSkySDataUnit refers to an object that has one associated metadata file and all the versions written to it. The example below illustrates this.

  exampleFilemetadata
  exampleFilevalue1004
  exampleFilevalue2004
  exampleFilevalue3004
  ...

Each DepSkySDataUnit object contains information about the protocol used to replicate the data, the metadata information, the written versions, etc. Furthermore, each one of these objects (by that we mean all the files associated with it) can be stored in a different bucket. There are two ways to create a DepSkySDataUnit object. The first constructor below (1) writes to a container named regId (which will contain the regIdmetadata and regIdvalue files) inside DepSky's default bucket. The second (2) lets the user specify the bucket where the data will be stored.

  (1)
  public DepSkySDataUnit(String regId) {
  ...

  (2)
  public DepSkySDataUnit(String regId, String bucketName) {
  ...

After creating a DepSkySDataUnit object, you need to specify which protocol will be used to replicate the data written to this container. By default, each DepSkySDataUnit object uses DepSky-A (data is replicated in clear text). To use one of the other three protocols, call the corresponding setter shown below.

  DepSkySDataUnit dataUnit = new DepSkySDataUnit("container");
  // Call only the setter for the protocol you want to use:
  dataUnit.setUsingPVSS(true); // DepSky-CA
  dataUnit.setUsingErsCodes(true); // erasure codes only
  dataUnit.setUsingSecSharing(true); // secret sharing only

Whenever you perform an operation on the LocalDepSkySClient object (read, write, etc.), you have to pass it a DepSkySDataUnit object.

Write

To use the write operation, pass the DepSkySDataUnit object you want to write to and the data to be written. As shown below, this operation returns a byte[] containing a SHA-1 hash of the written data. The client must save this hash if it wants to use the read matching operation later (see below).

  public synchronized byte[] write(DepSkySDataUnit reg, byte[] value) throws Exception {
  ...

Read

This operation takes only the DepSkySDataUnit object as its argument. It reads the last version written to this DepSkySDataUnit.

  public synchronized byte[] read(DepSkySDataUnit reg) throws Exception {
  ...

Read Matching

This operation reads an old version of a given DepSkySDataUnit. To do that, pass a byte[] containing the hash of the version you want to read. This hash is the one returned by the write operation.

  public synchronized byte[] readMatching(DepSkySDataUnit reg, byte[] hashMatching) throws Exception{
  ...
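To illustrate how write, read and readMatching relate to each other, here is a toy in-memory model of a single data unit. It only mimics the semantics described above (write returns a SHA-1 hash, read returns the latest version, readMatching returns the version whose hash matches). VersionedUnitSketch is a hypothetical class, it talks to no cloud, and returning null for a missing version is a choice made for this sketch rather than documented DepSky behaviour.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the versioning semantics; not the DepSky client.
final class VersionedUnitSketch {

    private final List<byte[]> versions = new ArrayList<byte[]>();

    /** Computes the SHA-1 hash of the given bytes. */
    static byte[] sha1(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stores a new version and returns its SHA-1 hash, which the caller should keep. */
    byte[] write(byte[] value) {
        versions.add(value.clone());
        return sha1(value);
    }

    /** Returns the most recently written version, or null if nothing was written. */
    byte[] read() {
        return versions.isEmpty() ? null : versions.get(versions.size() - 1).clone();
    }

    /** Returns the version whose hash equals hashMatching, or null if none matches. */
    byte[] readMatching(byte[] hashMatching) {
        for (byte[] version : versions) {
            if (Arrays.equals(sha1(version), hashMatching)) {
                return version.clone();
            }
        }
        return null;
    }
}
```

A caller would keep the hash returned by the first write, write a second version, and later pass the saved hash to readMatching to get the first version back while read returns the second.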

Delete

The delete operation deletes all the files associated with the given DepSkySDataUnit, including all the written versions and the metadata file.

  public synchronized void deleteContainer(DepSkySDataUnit reg) throws Exception{
  ...

SetAcl

The setAcl operation changes the permissions of a specified DepSkySDataUnit. Specifically, it changes the permissions of the bucket where the objects are stored, as well as the permissions of the objects within it. To do that, the bucket has to be shared in all four clouds in use (since the data is replicated among them). The protocols to share a bucket in these clouds can be found in this paper.

  public synchronized LinkedList<Pair<String, String[]>> setAcl(DepSkySDataUnit reg, String permission,                                      
     LinkedList<Pair<String, String[]>> cannonicalIds) throws Exception {
  ...

The operation receives 3 arguments. The first is the DepSkySDataUnit that will be shared. The second specifies the permission that other users will have on the specified DepSkySDataUnit: "r" for read, "w" for write, or "rw" for read and write. The last argument carries information about the user who will have access to the shared resource. It must be constructed following the example below, where each line represents one entry (a Pair) in the LinkedList.

 -> <"AMAZON-S3", [canonicalId]>
 -> <"GOOGLE-STORAGE", [email]>
 -> <"RACKSPACE", [name, email]>
 -> <"WINDOWS-AZURE", []>

For Amazon S3, the grantee can find the canonicalId on the same page as the access credentials (see the beginning of this page). For the other clouds, the information is quite intuitive. For Google Storage, only the email of the grantee is needed (it must be a Gmail account). For RackSpace, the name and the email of the grantee are needed. Finally, for Windows Azure nothing is needed (see this paper).
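The list layout above could be built along these lines. Because the Pair class used by setAcl belongs to the DepSky code base, this sketch substitutes java.util.AbstractMap.SimpleEntry as a stand-in so that it compiles on its own. The class AclListSketch and its parameter names are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical helper; Map.Entry stands in for DepSky's Pair type.
final class AclListSketch {

    /** Builds one entry per cloud, in the order shown in the example above. */
    static LinkedList<Map.Entry<String, String[]>> buildGranteeList(
            String canonicalId, String googleEmail, String rackspaceName, String rackspaceEmail) {
        LinkedList<Map.Entry<String, String[]>> ids = new LinkedList<Map.Entry<String, String[]>>();
        ids.add(new SimpleEntry<String, String[]>("AMAZON-S3", new String[] {canonicalId}));
        ids.add(new SimpleEntry<String, String[]>("GOOGLE-STORAGE", new String[] {googleEmail}));
        ids.add(new SimpleEntry<String, String[]>("RACKSPACE", new String[] {rackspaceName, rackspaceEmail}));
        ids.add(new SimpleEntry<String, String[]>("WINDOWS-AZURE", new String[0])); // nothing needed
        return ids;
    }
}
```

With the real library, the same four entries would be created as Pair objects and passed as the third argument of setAcl.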

This operation returns a LinkedList<Pair<String, String[]>> with the same organization as the one given as argument. This list must be given to the grantee, along with the name of the DepSkySDataUnit, so that the grantee can access the shared resource. But first the user who is sharing must add some information to it: his own canonicalId to the AMAZON-S3 pair, and his email to the GOOGLE-STORAGE pair.

Once the grantee has this list, he can use it in the other operations (read, write, delete) to operate on the shared bucket.