4+grid10-presentation wei chen

Upload: prathyusai1990

Post on 06-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    1/23

    Exploiting Deadline Flexibility inGrid Workflow Rescheduling

    Wei Chen

    Alan Fekete

    Young Choon Lee

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    2/23

    Agenda

    Introduction

    Deadline Guaranteed Rescheduling

    Workflow Scheduling

    Task Rescheduling

    Performance Study

    Conclusion

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    3/23

    Computational Grid andWorkflow Application

    Computational Grid:

    Heterogeneous Computing Site (Resource Instance)

    Advance Reservation

    5

    4

    32

    1

    Workflow Application

    Directed Acyclic Graph (DAG)

    Job(V, E), where Vis the set of tasks andEis directed edges represent precedenceconstraints between corresponding tasks

    }......,,{ 21 nRRRGrid

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    4/23

    Grid Workflow Scheduling

    List scheduling heuristics

    Heterogeneous Earliest-Finish-Time (HEFT) Greedy Best-First Strategy

    It lacks an overall consideration in schedulingdifferent workflow jobs

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    5/23

    Agenda

    Introduction

    Deadline Guaranteed Rescheduling

    Workflow Scheduling

    Task Rescheduling

    Performance Study

    Conclusion

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    6/23

    The Approach we build on: DeadlineGuaranteed Rescheduling (DGR)

    Deadline-based scheduling: it allows each job tocome with a deadline, and from this, each task

    of the job can be placed more flexibly (not onlyat the earliest possible timeslot)

    A rescheduling mechanism: the tasks of an

    earlier job might be rearranged to other timeslots or resource instances, giving extraresource availability for more urgent tasks

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    7/23

    3

    4

    An Example of Scheduling andRescheduling Workflow Jobs

    Deadline (B)

    Deadline (A)

    4

    32

    1

    5

    2

    (A) 1

    (B)

    (a)

    A

    1

    A2A3

    A

    4

    A

    5

    R1 R2 R3

    (b)

    A

    1

    A2

    B

    3

    A

    3

    A

    4

    A

    5

    B

    1

    B

    2

    B

    4

    R1 R2 R3

    (c)

    A

    1

    A

    2

    B

    3A

    3

    A

    4

    A

    5

    B

    1

    B

    2

    B

    4

    R1 R2 R3

    A

    2

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    8/23

    The Key Points of Our Approach

    First, our approach loosely distributes tasks along the time axisaccording to the deadline of the workflow job, but not squeezesthem on the earliest finish time. It is more flexible in rescheduling toallow urgent tasks get required resource availability.

    Second, our approach is not to reconsider schedules of the wholejob again. Each task is rescheduled within a time slot boundary sothat it does not affect the current schedules of all its predecessorsand successors. This simplifies the complexity of our algorithm.

    Third, our rescheduling can be made not only in time dimension(another time slot), but also in space dimension (different resourceinstances). This increases the flexibility in rescheduling.

    Our rescheduling is to rearrange advance reservations of tasksbefore they are submitted for execution. This approach does notincur the cost in task migration.

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    9/23

    Agenda

    Introduction

    Deadline Guaranteed Rescheduling

    Workflow Scheduling

    Task Rescheduling

    Performance Study

    Conclusion

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    10/23

    Weighted DAG

    Task Deadlines

    (6))}(')(')(min{)(,)( ijjjiiji eDTvETvdeadlinevdeadlinevsuccessorvVv

    (5)''/

    (4))}(max{(3))()}()(max{)(),(,

    (2)/)()(D,

    (1)/)()(,

    bandwidthnetworkaveragethe:

    speedncomputatioaveragethe:

    ratioDTDTratioETETmakespandeadlineratio

    VvvEFTmakespanvETeDTvEFTvEFTvrpredecessovVv

    BedataeTEe

    SvworkloadvETVv

    ii

    iijjiiji

    Bijijij

    Siii

    An advisable deadline for each task

    The deadline of a workflow job can be guaranteed if all of its tasks arefinished before their deadlines. These advisable deadlines reasonablybalance the time for each task based on their workload proportions.

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    11/23

    Scheduling Algorithm

    Input a DAG Output scheduling of the job

    calculate deadlines for each task; rank tasks into a priority list

    for each task in the list do

    schedule task within its deadline

    if it fails then

    schedule task in the earliest finish time

    ifthis finish time > jobs deadline then break the loop

    end if

    end for

    if scheduling is not done then

    rollback schedules have been made

    for each task in the list do

    schedule task in the earliest finish time

    ifthis finish time > jobs deadline then reject the job

    end for

    end if

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    12/23

    Agenda

    Introduction

    Deadline Guaranteed Rescheduling

    Workflow Scheduling

    Task Rescheduling

    Performance Study

    Conclusion

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    13/23

    Time Slot Boundary

    The time slot boundary is calculated when a task tries tobe rescheduled on a specific resource instance

    At the moment, the actual schedules of the tasks

    predecessors and successors are known Since the target resource is specified, the actual network

    bandwidths between the resource instance and that ofthe tasks predecessors or successors are also known

    })()()(min{),(

    })()()(max{),()(),(,

    ikikki

    ijijji

    ikiji

    ebandwidthedatavASTRvLF T

    ebandwidthedatavAFTRvESTvsuccessorvvrpredecessovVv

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    14/23

    TTT

    Bipartite Graph Matching

    We make all tasks one part of nodes T (no matter whichworkflow job the task belongs to), and all resourceinstances the other part R.

    Every task is linked with all its satisfiable resources. Thearrow of the line shows whether the task has beenscheduled on (or matched with) a resource instance,which is represented by an arrow pointing to the task.

    1

    (a)

    R

    2

    3

    1

    2

    1

    (b)

    R

    2

    3

    1

    2

    1

    (c)

    R

    2

    3

    1

    2

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    15/23

    Rescheduling Algorithm

    Input a task Output scheduling of the task

    push the task into an empty stack S

    while S is not empty

    pop a task from S

    for each satisfiable resource of the task do

    calculate EST and LFT

    if it can be scheduled in the boundary then

    return: the scheduling

    else if a task can be removed then

    push it into S

    end if

    end for

    end while

    return: scheduling fails

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    16/23

    Agenda

    Introduction

    Deadline Guaranteed Rescheduling

    Workflow Scheduling

    Task Rescheduling

    Performance Study

    Conclusion

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    17/23

    Experiment Setup

    Heterogeneous Grid 1,000 heterogeneous computing sites Different setting in resource properties, computation

    capacity and speed Computing sites are fully connected by varying

    network bandwidths

    Workflow Jobs

    various sizes and parallelism degrees both computation intensive and communicationintensive ones

    some are more urgent than others

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    18/23

    Acceptance Rate

    Overall Acceptance Rate

    20%

    40%

    60%

    80%

    100%

    0 100 200 300

    Number of Jobs Submitted

    HEFT

    DGRDGR-L

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    19/23

    Resource Utilization

    Resource Utilization

    0%

    20%

    40%

    60%

    80%

    100%

    0 100 200 300

    Number of Submitted Jobs

    HEFT

    DGRDGR-L

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    20/23

    Running Time of Algorithms

    Running Time

    0

    50

    100

    150

    200

    250

    300

    0 100 200 300

    Number of Submitted Jobs

    Time(ms)

    HEFT

    DGRDGR-L

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    21/23

    Agenda

    Introduction

    Deadline Guaranteed Rescheduling

    Workflow Scheduling

    Task Rescheduling

    Performance Study

    Conclusion

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    22/23

    Conclusion

    A deadline-based strategy to schedule andreschedule workflow jobs; individual tasks canbe rescheduled, based on the requirements of

    later jobs as they arrive. The approach satisfies Grid users as more jobs

    can be finished before their deadlines, and italso benefits the Grid owner by improvingresource utilization.

    By using appropriate heuristics, the cost of thescheduling decision-making is quite acceptableand scalable to a large number of tasksscheduled in the system.

  • 8/3/2019 4+Grid10-Presentation Wei Chen

    23/23

    Thanks

    Questions