Adam S.Z. Belloum
a.s.z.belloum@uva.nl / a.belloum@esciencecenter.com
Large OBject Cloud Data storagE fedeRation (LOBCDER)
Motivation
-
Provide a loosely coupled, flexible, distributed, easy-to-use architecture
-
Build on top of existing solutions
-
Use data storage federation to address these problems.
-
Present them via a standardized protocol that can also be mounted
-
Provide a file system abstraction
-
Introduce a common management layer that loosely couples independent storage resources
-
Distributed applications have a global shared view of the whole available storage space
-
Applications can be developed locally and deployed on the cloud platform without changing the data access parameters
-
Use storage space efficiently with the copy-on-write strategy
-
Replication of data can be based on efficiency cost measures
-
Reduce the risk of vendor lock-in in clouds, since no large amount of data resides with a single provider
Design Considerations
-
LOBCDER is a storage federation service that makes available distributed unstructured data stored in various storage frameworks and held by independent providers
-
LOBCDER loosely couples a variety of storage technologies such as OpenStack Swift, iRODS, and GridFTP
-
LOBCDER is a distributed file system that aims for "transparency" in a number of aspects
-
It can be "invisible" to clients, which "see" a system similar to a local file system
-
Behind the scenes, it handles locating files and transporting data, providing:
-
Access transparency: clients are unaware that files are distributed and can access them in the same way as local files
-
Location transparency: a consistent namespace encompasses remote files. The name of a file does not give its location
-
Concurrency transparency: all clients have the same view of the state of the file system
-
Heterogeneity: the service is provided across different hardware and operating system platforms
-
Replication transparency: files are replicated across multiple servers without clients being aware of it
-
Migration transparency: files are able to move around without the client's knowledge
-
It can also provide more advanced functionality to the rest of the modules in the VPH-Share cloud platform
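The location, replication, and migration transparencies listed above can be illustrated with a minimal sketch. It is not LOBCDER's implementation: the `Namespace` and `LogicalFile` classes, the replica-selection rule (first replica), and the storage URIs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogicalFile:
    """A file as clients see it: a logical path plus hidden replica locations."""
    path: str
    replicas: List[str] = field(default_factory=list)  # physical URIs (hypothetical)


class Namespace:
    """Hypothetical catalog mapping logical paths to physical replicas."""

    def __init__(self) -> None:
        self._files: Dict[str, LogicalFile] = {}

    def add(self, path: str, *replicas: str) -> None:
        self._files[path] = LogicalFile(path, list(replicas))

    def list_dir(self, prefix: str) -> List[str]:
        # Clients only ever see logical names, never storage locations.
        return sorted(p for p in self._files if p.startswith(prefix))

    def resolve(self, path: str) -> str:
        # Assumption: simply pick the first replica; a real system could
        # choose by cost or proximity.
        return self._files[path].replicas[0]

    def migrate(self, path: str, old: str, new: str) -> None:
        # Move one replica; the logical path is left untouched.
        replicas = self._files[path].replicas
        replicas[replicas.index(old)] = new


ns = Namespace()
ns.add("/data/scan.dat", "swift://site-a/abc123", "gridftp://site-b/scan.dat")
before = ns.list_dir("/data/")
ns.migrate("/data/scan.dat", "swift://site-a/abc123", "irods://site-c/scan.dat")
after = ns.list_dir("/data/")
```

In this sketch the client-visible listing is identical before and after the migration, while `resolve` returns the new physical location.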
System Overview (Frontend Layer)
-
The frontend provides access control, authentication and authorization
-
It is a WebDAV servlet; WebDAV is an RFC standard, which provides interoperability
-
It enables network transparency through the use of numerous clients that are able to mount WebDAV
-
It supports versioning, locking, and custom properties
-
Authentication and authorization are delegated to the authentication service
-
The authentication service authenticates the user based on a security token
-
The authentication service validates the token and returns information about the user
-
For clients that want control over properties that depend on the infrastructure, we have implemented a REST interface
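As a rough illustration of how a client might talk to a WebDAV frontend, the sketch below builds a standard PROPFIND request (defined by the WebDAV RFC) using only the Python standard library. The host name, the collection path, and the way the security token is transmitted (here as a Bearer `Authorization` header) are assumptions; the slides do not specify how LOBCDER expects the token.

```python
import urllib.request

# Standard WebDAV PROPFIND body asking for two live properties.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<D:propfind xmlns:D="DAV:">'
    "<D:prop><D:getcontentlength/><D:getlastmodified/></D:prop>"
    "</D:propfind>"
)


def build_propfind(url: str, token: str, depth: str = "1") -> urllib.request.Request:
    """Build (but do not send) a PROPFIND request for a WebDAV collection."""
    request = urllib.request.Request(
        url, data=PROPFIND_BODY.encode("utf-8"), method="PROPFIND"
    )
    request.add_header("Depth", depth)  # "1" = the collection and its direct children
    request.add_header("Content-Type", "application/xml; charset=utf-8")
    # Assumption: the token travels as a Bearer header; the real scheme may differ.
    request.add_header("Authorization", "Bearer " + token)
    return request


# Placeholder endpoint and token, for illustration only.
request = build_propfind("https://lobcder.example.org/dav/data/", "example-token")
# urllib.request.urlopen(request) would send it; omitted because the host is a placeholder.
```

A WebDAV server normally answers such a request with a 207 Multi-Status XML document listing the requested properties for each resource.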
System Overview (Resource Layer)
-
The resource layer creates a logical representation of the physical storage space and manages the physical files
-
The WebDAVResourceFactory and the WebDAVResource provide a WebDAV representation of the LogicalResource
-
The ResourceCatalog connects to the persistence layer and queries LogicalResources
-
The Task component manages the physical files. It schedules file replication and deletion
-
The LogicalResources hold basic metadata such as modification date, length, etc.
-
The PDRI component represents the physical data.
-
The StorageSite component provides a description for the storage resources
-
The backend layer provides the necessary abstraction to uniformly access physical storage resources. It is a Virtual Resource System API
-
The VFSClient can perform file system operations on physical data.
-
Different VFSDriver implementations allow transparent access to storage resources
-
The persistence layer is a relational database that holds the logical data represented by the LogicalResource
-
It provides Atomicity, Consistency, Isolation and Durability (ACID).
-
These properties are necessary in a multiuser environment for maintaining a synchronized and consistent view of the shared file system.
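The role of ACID guarantees in keeping the logical view consistent can be sketched as follows. This is a minimal illustration, not LOBCDER's schema: SQLite stands in for the relational database, and the table names, column names, and the `register_file` helper are hypothetical. A logical resource and its physical replica records are written in one transaction, so a failure part-way through leaves no half-registered file visible to other users.

```python
import sqlite3

# Hypothetical schema: one row per logical resource, one row per physical replica.
SCHEMA = """
CREATE TABLE logical_resource (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE,
    length INTEGER NOT NULL,
    modified TEXT NOT NULL
);
CREATE TABLE pdri (
    id INTEGER PRIMARY KEY,
    resource_id INTEGER NOT NULL REFERENCES logical_resource(id),
    storage_site TEXT NOT NULL,
    physical_name TEXT NOT NULL
);
"""


def open_catalog() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def register_file(conn, path, length, modified, replicas):
    """Insert a logical resource and all of its replicas atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO logical_resource (path, length, modified) VALUES (?, ?, ?)",
            (path, length, modified),
        )
        for site, physical_name in replicas:
            conn.execute(
                "INSERT INTO pdri (resource_id, storage_site, physical_name) "
                "VALUES (?, ?, ?)",
                (cur.lastrowid, site, physical_name),
            )


def count(conn, table):
    return conn.execute("SELECT COUNT(*) FROM " + table).fetchone()[0]


conn = open_catalog()
register_file(conn, "/data/a.dat", 1024, "2013-08-26T10:00:00",
              [("swift-site", "obj-1"), ("irods-site", "a.dat")])

failed = False
try:
    # The second replica violates NOT NULL, so the whole registration is rolled back.
    register_file(conn, "/data/b.dat", 2048, "2013-08-26T10:05:00",
                  [("swift-site", "obj-2"), ("irods-site", None)])
except sqlite3.IntegrityError:
    failed = True
```

After the failed call, the catalog still contains only the first file and its two replicas.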
More details about this work can be found in:
[1] S. Koulouzis, D. Vasyunin, R.S. Cushing, A.S.Z. Belloum, Cloud Data Storage Federation for Scientific Applications, In Proceedings of the Euro-Par 2013: Parallel Processing Workshops, Lecture Notes in Computer Science, Aachen, Germany, Aug 2013.
[2] S. Koulouzis, R. Cushing, D. Vasyunin, A.S.Z. Belloum, M.T. Bubak, Cloud Federation for Sharing Scientific Data, 8th IEEE International Conference on eScience (eScience 2012), Chicago, Illinois, 8-12 October 2012. [poster]