Adam S.Z. Belloum
a.s.z.belloum@uva.nl / a.belloum@esciencecenter.com
SDN-Aware Data Transfer for Scientific Applications
Motivation
- Improve the performance and reliability of data transfers with programmable networks
- Take advantage of high-speed research networks
- Offer automated tools capable of moving data effectively
- By combining data management services with SDN, scattered data and loosely coupled infrastructure can be integrated easily and transparently to provide scientists with the data they need
Problem statement
- Most scientific applications in Research Infrastructures (RI) require access to geographically distributed data
- Data is often scattered over a number of sources across a network that spans several sites and/or domains
- Access to data is often provided by a data access service, which:
1. Locates data sources
2. Selects a suitable backend
3. Transfers data from the storage backend to the consumer
The main research problem we address is how to improve existing data access services using SDN, and how to optimize the QoS of large data transfers between a consumer and a set of sources streaming data from a backend
Infrastructure Model
- We model the RI as a single consumer requesting data that can be transferred from multiple sources
- We assume the RI uses SDN solutions, so the state of the network is available and controllable
- The QoS optimization problem is represented as the Multiple Source Shortest Path (MSSP) problem
- We want to discover an optimal path from a set of data sources to a destination
- The infrastructure is modeled as a bidirectional weighted graph G(V, E)
- V is the set of all vertices in the network
- E is the set of all edges (links)
- A single vertex c from V represents the consumer
- D = {d1, d2, ..., dn} ⊂ V is the set of data sources
- S = {s1, s2, ..., sm} ⊂ V is the set of switches
- S and D are disjoint sets
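The MSSP formulation above can be illustrated with a minimal Python sketch. The poster does not name a specific algorithm; this example swaps in Dijkstra's algorithm, run once from the consumer c, and stops at the first data source it settles. It assumes edge weights are symmetric (the graph is bidirectional), so the cheapest path from c to a source is also the cheapest path from that source to c. The function name `best_source`, the dictionary-based graph encoding, and the sample weights are all hypothetical.

```python
import heapq

def best_source(graph, consumer, sources):
    """Return (cost, path) for the cheapest source-to-consumer path, or None.

    graph: dict mapping vertex -> {neighbour: weight}, each edge listed in
    both directions with the same weight (assumption of this sketch).
    """
    dist = {consumer: 0.0}
    prev = {}
    heap = [(0.0, consumer)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u in sources:
            # The first source settled is the cheapest one; rebuild the
            # path in the direction the data flows (source -> consumer).
            path = [u]
            while path[-1] != consumer:
                path.append(prev[path[-1]])
            return d, path
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None  # no data source is reachable from the consumer

# Hypothetical topology: consumer c, switches s1 and s2, sources d1 and d2.
graph = {
    "c": {"s1": 1.0},
    "s1": {"c": 1.0, "s2": 2.0, "d1": 5.0},
    "s2": {"s1": 2.0, "d2": 1.0},
    "d1": {"s1": 5.0},
    "d2": {"s2": 1.0},
}
result = best_source(graph, "c", {"d1", "d2"})
```

With these sample weights, d2 is selected through s2 and s1 at a total cost of 4.0, even though d1 is fewer hops away.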
- The optimization goal of the weight function is primarily performance
- It needs to include measures of bandwidth, latency and load
- The weight assigned to the edge between s_i and s_j is defined as:
w_{s_i,s_j} = MTT_{s_i,s_j} + L_{s_i,s_j}
where:
- MTT is the minimum transfer time, defined as the time to move a file of size FS between switches s_i and s_j assuming exclusive access to the link between them
- MTT depends only on the bandwidth and latency of the link
- L represents the additional delay caused by other traffic traversing the link at the same time
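A short Python sketch of the edge weight defined above. The poster states only that MTT depends on the bandwidth and latency of the link; the concrete form used here (latency plus file size divided by bandwidth) is an assumption, and the load delay L is simply taken as an input. All function and parameter names are hypothetical.

```python
def minimum_transfer_time(file_size_bits, bandwidth_bps, latency_s):
    # Assumed model of MTT: link latency plus the time needed to push
    # the whole file through an otherwise idle link.
    return latency_s + file_size_bits / bandwidth_bps

def edge_weight(file_size_bits, bandwidth_bps, latency_s, load_delay_s):
    # w = MTT + L, following the definition in the text.
    mtt = minimum_transfer_time(file_size_bits, bandwidth_bps, latency_s)
    return mtt + load_delay_s

# Example: a 1 GB file (8e9 bits) over a 1 Gbit/s link with 10 ms latency
# and 2 s of extra delay caused by competing traffic.
w = edge_weight(8e9, 1e9, 0.010, 2.0)
```

Under these assumptions the idle-link transfer takes about 8.01 s, and the weight rises to about 10.01 s once the load term is added.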
SDN-Aware Data Access Service
The SDN-Aware Data Access Service aims to enable:
- Flexible data transfer independent of any specific protocol
- Use of the best data source, switching sources autonomously if the current one is experiencing heavy load
- Identification and selection of the best network path during a transfer, without requiring the transfer to be restarted
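One way to picture switching sources without restarting a transfer is a chunked loop that re-selects the source before each chunk and resumes from the current byte offset. The Python sketch below is purely illustrative and is not the service's actual mechanism: the `transfer` function, the `pick_source` and `read_chunk` callbacks, the replica names, and the load values are all hypothetical.

```python
def transfer(total_size, chunk_size, pick_source, read_chunk):
    """Copy total_size bytes in chunks, re-selecting the source per chunk."""
    received = bytearray()
    used = []
    offset = 0
    while offset < total_size:
        source = pick_source()  # may differ from the previous chunk's source
        length = min(chunk_size, total_size - offset)
        received += read_chunk(source, offset, length)
        used.append(source)
        offset += length  # resume where the last chunk ended, never restart
    return bytes(received), used

# Hypothetical setup: two replicas of the same data and a sequence of
# load snapshots, one per chunk (d1 becomes heavily loaded after chunk 1).
data = b"0123456789"
replicas = {"d1": data, "d2": data}
loads = iter([
    {"d1": 0.2, "d2": 0.5},
    {"d1": 0.9, "d2": 0.5},
    {"d1": 0.9, "d2": 0.4},
])

def pick_source():
    load = next(loads)
    return min(load, key=load.get)  # least-loaded source wins

def read_chunk(source, offset, length):
    return replicas[source][offset:offset + length]

payload, used = transfer(len(data), 4, pick_source, read_chunk)
```

In this run the first chunk comes from d1 and the remaining two from d2, while the reassembled payload is identical to the original data.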
The SDN-Aware Data Access Service extends the LOBCDER service [1]. More details about this work can be found in:
[1] S. Koulouzis, D. Vasyunin, R.S. Cushing, A.S.Z. Belloum, Cloud Data Storage Federation for Scientific Applications, in Proceedings of Euro-Par 2013: Parallel Processing Workshops, Lecture Notes in Computer Science, Aachen, Germany, Aug 2013.
[2] S. Koulouzis, R. Cushing, D. Vasyunin, A.S.Z. Belloum, M.T. Bubak, Cloud Federation for Sharing Scientific Data, 8th IEEE International Conference on eScience (eScience 2012), Chicago, Illinois, 8-12 October 2012. [poster]