Download - 4+Grid10-Presentation Wei Chen
-
8/3/2019 4+Grid10-Presentation Wei Chen
1/23
Exploiting Deadline Flexibility inGrid Workflow Rescheduling
Wei Chen
Alan Fekete
Young Choon Lee
-
8/3/2019 4+Grid10-Presentation Wei Chen
2/23
Agenda
Introduction
Deadline Guaranteed Rescheduling
Workflow Scheduling
Task Rescheduling
Performance Study
Conclusion
-
8/3/2019 4+Grid10-Presentation Wei Chen
3/23
Computational Grid andWorkflow Application
Computational Grid:
Heterogeneous Computing Site (Resource Instance)
Advance Reservation
5
4
32
1
Workflow Application
Directed Acyclic Graph (DAG)
Job(V, E), where Vis the set of tasks andEis directed edges represent precedenceconstraints between corresponding tasks
}......,,{ 21 nRRRGrid
-
8/3/2019 4+Grid10-Presentation Wei Chen
4/23
Grid Workflow Scheduling
List scheduling heuristics
Heterogeneous Earliest-Finish-Time (HEFT) Greedy Best-First Strategy
It lacks an overall consideration in schedulingdifferent workflow jobs
-
8/3/2019 4+Grid10-Presentation Wei Chen
5/23
Agenda
Introduction
Deadline Guaranteed Rescheduling
Workflow Scheduling
Task Rescheduling
Performance Study
Conclusion
-
8/3/2019 4+Grid10-Presentation Wei Chen
6/23
The Approach we build on: DeadlineGuaranteed Rescheduling (DGR)
Deadline-based scheduling: it allows each job tocome with a deadline, and from this, each task
of the job can be placed more flexibly (not onlyat the earliest possible timeslot)
A rescheduling mechanism: the tasks of an
earlier job might be rearranged to other timeslots or resource instances, giving extraresource availability for more urgent tasks
-
8/3/2019 4+Grid10-Presentation Wei Chen
7/23
3
4
An Example of Scheduling andRescheduling Workflow Jobs
Deadline (B)
Deadline (A)
4
32
1
5
2
(A) 1
(B)
(a)
A
1
A2A3
A
4
A
5
R1 R2 R3
(b)
A
1
A2
B
3
A
3
A
4
A
5
B
1
B
2
B
4
R1 R2 R3
(c)
A
1
A
2
B
3A
3
A
4
A
5
B
1
B
2
B
4
R1 R2 R3
A
2
-
8/3/2019 4+Grid10-Presentation Wei Chen
8/23
The Key Points of Our Approach
First, our approach loosely distributes tasks along the time axisaccording to the deadline of the workflow job, but not squeezesthem on the earliest finish time. It is more flexible in rescheduling toallow urgent tasks get required resource availability.
Second, our approach is not to reconsider schedules of the wholejob again. Each task is rescheduled within a time slot boundary sothat it does not affect the current schedules of all its predecessorsand successors. This simplifies the complexity of our algorithm.
Third, our rescheduling can be made not only in time dimension(another time slot), but also in space dimension (different resourceinstances). This increases the flexibility in rescheduling.
Our rescheduling is to rearrange advance reservations of tasksbefore they are submitted for execution. This approach does notincur the cost in task migration.
-
8/3/2019 4+Grid10-Presentation Wei Chen
9/23
Agenda
Introduction
Deadline Guaranteed Rescheduling
Workflow Scheduling
Task Rescheduling
Performance Study
Conclusion
-
8/3/2019 4+Grid10-Presentation Wei Chen
10/23
Weighted DAG
Task Deadlines
(6))}(')(')(min{)(,)( ijjjiiji eDTvETvdeadlinevdeadlinevsuccessorvVv
(5)''/
(4))}(max{(3))()}()(max{)(),(,
(2)/)()(D,
(1)/)()(,
bandwidthnetworkaveragethe:
speedncomputatioaveragethe:
ratioDTDTratioETETmakespandeadlineratio
VvvEFTmakespanvETeDTvEFTvEFTvrpredecessovVv
BedataeTEe
SvworkloadvETVv
ii
iijjiiji
Bijijij
Siii
An advisable deadline for each task
The deadline of a workflow job can be guaranteed if all of its tasks arefinished before their deadlines. These advisable deadlines reasonablybalance the time for each task based on their workload proportions.
-
8/3/2019 4+Grid10-Presentation Wei Chen
11/23
Scheduling Algorithm
Input a DAG Output scheduling of the job
calculate deadlines for each task; rank tasks into a priority list
for each task in the list do
schedule task within its deadline
if it fails then
schedule task in the earliest finish time
ifthis finish time > jobs deadline then break the loop
end if
end for
if scheduling is not done then
rollback schedules have been made
for each task in the list do
schedule task in the earliest finish time
ifthis finish time > jobs deadline then reject the job
end for
end if
-
8/3/2019 4+Grid10-Presentation Wei Chen
12/23
Agenda
Introduction
Deadline Guaranteed Rescheduling
Workflow Scheduling
Task Rescheduling
Performance Study
Conclusion
-
8/3/2019 4+Grid10-Presentation Wei Chen
13/23
Time Slot Boundary
The time slot boundary is calculated when a task tries tobe rescheduled on a specific resource instance
At the moment, the actual schedules of the tasks
predecessors and successors are known Since the target resource is specified, the actual network
bandwidths between the resource instance and that ofthe tasks predecessors or successors are also known
})()()(min{),(
})()()(max{),()(),(,
ikikki
ijijji
ikiji
ebandwidthedatavASTRvLF T
ebandwidthedatavAFTRvESTvsuccessorvvrpredecessovVv
-
8/3/2019 4+Grid10-Presentation Wei Chen
14/23
TTT
Bipartite Graph Matching
We make all tasks one part of nodes T (no matter whichworkflow job the task belongs to), and all resourceinstances the other part R.
Every task is linked with all its satisfiable resources. Thearrow of the line shows whether the task has beenscheduled on (or matched with) a resource instance,which is represented by an arrow pointing to the task.
1
(a)
R
2
3
1
2
1
(b)
R
2
3
1
2
1
(c)
R
2
3
1
2
-
8/3/2019 4+Grid10-Presentation Wei Chen
15/23
Rescheduling Algorithm
Input a task Output scheduling of the task
push the task into an empty stack S
while S is not empty
pop a task from S
for each satisfiable resource of the task do
calculate EST and LFT
if it can be scheduled in the boundary then
return: the scheduling
else if a task can be removed then
push it into S
end if
end for
end while
return: scheduling fails
-
8/3/2019 4+Grid10-Presentation Wei Chen
16/23
Agenda
Introduction
Deadline Guaranteed Rescheduling
Workflow Scheduling
Task Rescheduling
Performance Study
Conclusion
-
8/3/2019 4+Grid10-Presentation Wei Chen
17/23
Experiment Setup
Heterogeneous Grid 1,000 heterogeneous computing sites Different setting in resource properties, computation
capacity and speed Computing sites are fully connected by varying
network bandwidths
Workflow Jobs
various sizes and parallelism degrees both computation intensive and communicationintensive ones
some are more urgent than others
-
8/3/2019 4+Grid10-Presentation Wei Chen
18/23
Acceptance Rate
Overall Acceptance Rate
20%
40%
60%
80%
100%
0 100 200 300
Number of Jobs Submitted
HEFT
DGRDGR-L
-
8/3/2019 4+Grid10-Presentation Wei Chen
19/23
Resource Utilization
Resource Utilization
0%
20%
40%
60%
80%
100%
0 100 200 300
Number of Submitted Jobs
HEFT
DGRDGR-L
-
8/3/2019 4+Grid10-Presentation Wei Chen
20/23
Running Time of Algorithms
Running Time
0
50
100
150
200
250
300
0 100 200 300
Number of Submitted Jobs
Time(ms)
HEFT
DGRDGR-L
-
8/3/2019 4+Grid10-Presentation Wei Chen
21/23
Agenda
Introduction
Deadline Guaranteed Rescheduling
Workflow Scheduling
Task Rescheduling
Performance Study
Conclusion
-
8/3/2019 4+Grid10-Presentation Wei Chen
22/23
Conclusion
A deadline-based strategy to schedule andreschedule workflow jobs; individual tasks canbe rescheduled, based on the requirements of
later jobs as they arrive. The approach satisfies Grid users as more jobs
can be finished before their deadlines, and italso benefits the Grid owner by improvingresource utilization.
By using appropriate heuristics, the cost of thescheduling decision-making is quite acceptableand scalable to a large number of tasksscheduled in the system.
-
8/3/2019 4+Grid10-Presentation Wei Chen
23/23
Thanks
Questions