**Chapter 1:** Street Networks

Operations research

Goal: determine how to carry out a sequence of tasks most efficiently

Efficient: least cost, least time, least distance, ...

Example: reading parking meters - need to walk down each street that has meters on it. Efficient: try not to walk down the same sidewalk twice!

Model of the problem: a graph = a bunch of points (vertices) connected together by line segments/curves (edges).

E.g., vertices = intersections, edges = sidewalks

Figure
A **path** in a graph is a way to walk from vertex to vertex along the edges.

A **circuit** is a path that begins and ends at the same vertex.

*Finding efficient circuits:* Need to cross every edge (read all
the meters!), the best possible will cross each edge exactly once
= **Euler circuit**.

Does every graph have an Euler circuit? No...

The **valence** of a vertex is the number of (ends of) edges that meet at
the vertex.

A graph is **connected** if for any pair of vertices, you can find a path in the graph
going from one vertex to the other (i.e., the graph doesn't come in pieces...)

**Euler's Theorem** A graph has an Euler circuit if and only if (i.e., exactly when)
it is both connected and every vertex has even valence.
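Euler's Theorem turns the question into two mechanical checks. A minimal sketch (the function name and edge-list representation are my own choices, not from the notes):

```python
from collections import defaultdict, deque

def has_euler_circuit(edges):
    """Euler's Theorem test: connected, and every vertex has even valence.
    `edges` is a list of (u, v) pairs; parallel edges are allowed."""
    adj = defaultdict(list)
    valence = defaultdict(int)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
        valence[u] += 1
        valence[v] += 1
    if not edges:
        return True
    if any(d % 2 for d in valence.values()):
        return False                      # some vertex has odd valence
    # connectivity check: breadth-first search from any vertex on an edge
    seen = {edges[0][0]}
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(valence)

# a square: connected, every valence is 2 -> Euler circuit exists
print(has_euler_circuit([(1, 2), (2, 3), (3, 4), (4, 1)]))          # True
# add one diagonal: vertices 1 and 3 get valence 3 -> no Euler circuit
print(has_euler_circuit([(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]))  # False
```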

Figure
Finding an Euler circuit: start anywhere, and imagine erasing each edge *just
before* you cross it; if erasing it disconnects what's left of the graph,
**don't go that way!** Pick a different edge to continue along.

If the graph has no Euler circuit, what is the best route you can find?

The sum of the valences of a graph = 2×(the number of edges) is *even*.
So the number of vertices with *odd* valence is also even. If we pair
them up, find a path between each pair, and duplicate the edges of our graph
(put in a second, parallel, edge next to the ones in our path), the resulting
graph *will* have an Euler circuit (this is called **Eulerizing** the
graph). The best Eulerization will have the fewest number of additional
edges. This means: the sum of the lengths of the paths pairing up vertices
is as small as possible. How do we find the best paths (Matching Problem)?
(Practice, practice, practice?) With 2n vertices of odd valence, you will
need to add at least n edges.

Figure Finding the best circuit: find an Euler circuit for the Eulerized graph, and "squeeze" it onto the original (imagine crossing the same edge twice, instead of walking across the added parallel edge).

Complicate things? Introduce one-way streets (add arrows to edges = a
**directed** graph = digraph)! Take into account the **lengths**
of edges, when finding the most efficient circuit.

**Chapter 2:** Visiting Vertices

Different problem, different goal: finding efficient delivery routes. Instead of crossing every street exactly once, try to visit every vertex exactly once!

vertices = destinations , edges = routes between destinations.

A circuit that passes through each vertex exactly once is called a
**Hamiltonian circuit**.

For **complete graphs** (every pair of vertices is joined by an edge), finding
Hamiltonian circuits is easy; keep walking to a new vertex until you visit them
all, then return to start.

Think: each edge is labelled with a number (=distance b/w vertices, cost of
shipping between destinations....). New goal: find **least cost**
Hamiltonian circuit. Problem: e.g., for a complete graph with 12 vertices, there
are 11×10×⋯×2×1 = 11! (= 39,916,800) Hamiltonian circuits!
How do you find the best one?

You don't. But we can think of fast ways (algorithms) to find circuits that probably won't do too badly:

Nearest neighbor algorithm: Start at a vertex. Start building a Hamiltonian circuit by walking to the closest vertex (shortest length edge); repeat the process, walking to new vertices, until all have been visited, then walk back home.
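The nearest neighbor algorithm in a few lines, assuming a complete graph stored as a dict of dicts of edge lengths (the example distances are made up for illustration):

```python
def nearest_neighbor_circuit(dist, start):
    """Build a Hamiltonian circuit greedily: from the current vertex,
    always walk the shortest edge to a not-yet-visited vertex."""
    unvisited = set(dist) - {start}
    circuit = [start]
    current = start
    while unvisited:
        nxt = min(unvisited, key=lambda v: dist[current][v])  # greedy choice
        circuit.append(nxt)
        unvisited.discard(nxt)
        current = nxt
    circuit.append(start)  # then walk back home
    return circuit

d = {
    'A': {'B': 3, 'C': 5, 'D': 4},
    'B': {'A': 3, 'C': 2, 'D': 6},
    'C': {'A': 5, 'B': 2, 'D': 1},
    'D': {'A': 4, 'B': 6, 'C': 1},
}
print(nearest_neighbor_circuit(d, 'A'))  # ['A', 'B', 'C', 'D', 'A']
```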

Sorted edges algorithm: Idea: use the shortest edges you can. Sort the
edges in increasing order of length. Pick edges by starting with the
shortest one; keep choosing the shortest edge remaining that can be used with the
ones we picked to build a Hamiltonian circuit. I.e. **discard** the
shortest edge remaining if:

(a) you will have three edges sharing a vertex, or

(b) you will build a circuit that doesn't include all of the vertices.
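A sketch of sorted edges, with vertices numbered 0..n-1. Rule (a) is a degree check; for rule (b) I use union-find to detect a premature circuit — an implementation choice, not something the notes specify:

```python
def sorted_edges_circuit(edges, n):
    """edges: list of (length, u, v) for a complete graph on vertices 0..n-1.
    Greedily take the shortest edges, discarding any that would give a
    vertex three chosen edges or close a circuit missing some vertices."""
    parent = list(range(n))          # union-find, to detect circuits
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    degree = [0] * n
    chosen = []
    for length, u, v in sorted(edges):
        if degree[u] == 2 or degree[v] == 2:
            continue                 # rule (a): three edges at one vertex
        ru, rv = find(u), find(v)
        if ru == rv and len(chosen) < n - 1:
            continue                 # rule (b): closes a circuit too soon
        chosen.append((u, v))
        degree[u] += 1
        degree[v] += 1
        parent[ru] = rv
        if len(chosen) == n:         # a Hamiltonian circuit has n edges
            break
    return chosen

edges = [(3, 0, 1), (5, 0, 2), (4, 0, 3), (2, 1, 2), (6, 1, 3), (1, 2, 3)]
print(sorted_edges_circuit(edges, 4))  # the circuit 0-1-2-3-0
```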

Figure These are both "greedy" algorithms; at each stage they both choose the shortest edge meeting certain criteria.

Different problem, different goal: stringing phone lines between cities.

Complete graph with edges labelled (by distance, cost to string wire between,...)
What is the most efficient way to wire them together? Route calls through
other cities - don't need a direct connection. Just need to wire a **path**
between each pair. What is the shortest total collection of edges that includes
a path between every pair?

Idea: **Don't** build circuits! They waste wire (remove an edge, and all the
cities in the circuit are still wired together...)

A collection of edges that form no circuit is called a **tree**. A tree that
contains all of the vertices is called a **spanning tree**. Our shortest total collection
of edges will form what is called a **minimum cost spanning tree** for the graph.

**Kruskal's algorithm:** Steal from sorted edges algorithm! Pick the shortest
edge; continue
to add the shortest edge that **doesn't** build a circuit. Stop when all vertices
are connected together (Fact: for n vertices, you will need exactly n-1 edges.)

This applies just as well to graphs that aren't complete (just pretend that the missing edges each cost $1,000,000 - then the algorithm would never look at them).

Figure
Fact: Kruskal's algorithm will **always find** the minimum cost spanning tree.

Critical Path Analysis:

Idea: We have a project which involves several tasks, some of which must be completed before others can be begun (e.g., constructing a house, cooking dinner). What is the minimum amount of time required to complete all of the tasks?

We assume we have as many people as we need to assign each task to.

Model the project with a directed graph. Tasks = vertices; a directed edge
points from one task to another that cannot be begun until the first is finished
(e.g., build foundation → install refrigerator): the **order-requirement
digraph**.

Basic idea: every directed path through the order-requirement digraph (**ORD**)
represents a collection of
tasks each of which must wait for the previous tasks to finish before
starting. So the time to finish *just those tasks* is the sum of the times
to do the tasks. The whole project cannot be completed in less than
this time.

A **critical path** for a digraph is the longest (in terms of total
completion time) directed path in the graph.

*Fact:* The shortest completion time for the collection of tasks is *equal* to the
total time along the critical path(s).

Figure
Finding the completion time: set up the ORD so that every arrow goes left to right.
Working from left to right, assign to each arrow the minimum completion time for the
task it points **from**. This time is the largest of the times
on arrows pointing **to** the task, plus the length of time for the task. For tasks
not required for anything else, write the corresponding time to the right of the
vertex. The largest number you write is the shortest completion time. The sequence of
vertices that forced this number to be as high as it is represents your critical path.
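The left-to-right pass amounts to computing, for each task, the largest finish time among its prerequisites plus its own time. A small sketch (task names, times, and prerequisites are hypothetical):

```python
def completion_time(tasks, prereqs):
    """tasks: dict task -> duration.  prereqs: dict task -> list of tasks
    that must finish first.  Returns each task's earliest finish time and
    the overall shortest completion time (the critical path length)."""
    finish = {}
    def earliest_finish(t):
        if t not in finish:
            # largest finish time among the tasks pointing to t...
            wait = max((earliest_finish(p) for p in prereqs.get(t, [])), default=0)
            finish[t] = wait + tasks[t]   # ...plus t's own time
        return finish[t]
    for t in tasks:
        earliest_finish(t)
    return finish, max(finish.values())

tasks = {'A': 3, 'B': 2, 'C': 4, 'D': 1}
prereqs = {'C': ['A', 'B'], 'D': ['C']}
finish, total = completion_time(tasks, prereqs)
print(total)  # 8: the critical path A -> C -> D has length 3 + 4 + 1
```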

Tasks on the critical path(s) represent bottlenecks; places to focus efforts to save time.
Shortening a task that does not lie on *all* of the critical paths will not
decrease the completion time!

**Chapter 3:** Planning and Scheduling

In the real world, we do not have an unlimited number of people (= processors) to assign to our tasks. How do things work if we have a set number of processors?

simplifying assumptions:

- each processor can do any of the tasks (no specialists)

- once a task is started, it is completed without stopping

- when a processor finishes, it will immediately start a task if one is available

- an order-requirement digraph tells us when a task is available (i.e., if its prerequisite tasks have been completed)

- a **priority list** tells us which jobs we *want* to do first (e.g.,
shortest first, longest first, the one that gets us paid first...)

The goal of our scheduling procedure might vary depending on what we want to focus on:

1. Shortest completion time with available resources

2. Shortest amount of idle time for each processor

3. Determine the minimum number of processors needed to complete the project in a fixed amount of time.

An easy algorithm to find the best completion time has not been (will never be?) found.

**List processing algorithm**: As each processor finishes a task, assign it the
highest priority task that the order requirement digraph says is available. If
no task is available, stand idle until one is available.

Priority list: T_{1},T_{2},T_{3},T_{4},T_{5},T_{6},T_{7}
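The list-processing algorithm can be sketched as a small event-driven simulation. The task times and order requirements below are made up for illustration, since the figure isn't reproduced here:

```python
def list_processing(tasks, prereqs, priority, m):
    """Simulate the list-processing algorithm with m processors.
    tasks: dict task -> duration; prereqs: dict task -> prerequisites;
    priority: all tasks, in priority order.  Returns the completion time."""
    finish = {}                  # finish time of each assigned task
    free_at = [0] * m            # when each processor next becomes free
    scheduled = set()
    time = 0
    while len(scheduled) < len(tasks):
        # tasks whose prerequisites have all finished by `time`
        ready = [t for t in priority if t not in scheduled
                 and all(p in finish and finish[p] <= time
                         for p in prereqs.get(t, []))]
        idle = [i for i in range(m) if free_at[i] <= time]
        while ready and idle:    # hand the highest-priority task to a free processor
            t, i = ready.pop(0), idle.pop(0)
            finish[t] = time + tasks[t]
            free_at[i] = finish[t]
            scheduled.add(t)
        # otherwise stand idle: advance time to the next event
        future = [f for f in list(finish.values()) + free_at if f > time]
        time = min(future) if future else time
    return max(finish.values())

tasks = {'T1': 3, 'T2': 2, 'T3': 2, 'T4': 4}
prereqs = {'T3': ['T1'], 'T4': ['T2']}
print(list_processing(tasks, prereqs, ['T1', 'T2', 'T3', 'T4'], 2))  # 6
```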

Figure Some strange behavior:

- shorten each task in the project above by 1: longer completion time!

- add a third processor: longer completion time!

- ignore the order requirement digraph: longer completion time!

Moral: We chose a very bad priority list....

Critical path analysis tells us that tasks on the critical path(s) are potential bottlenecks, slowing down overall completion time. The first task on a critical path should be given highest priority.

**Critical path scheduling:** Build the priority list using critical paths. The
first task(s) on critical path(s) are given highest priority. Pretend that the
first task is *completed*, and find the **new critical path(s)** for the
remaining tasks. Give the first
task on the new path the next highest priority. Continue through the ORD until
every task has been placed on the priority list. Then schedule the tasks using
the ORD and this priority list.

**Independent tasks:** Get rid of the ORD!

Many projects naturally have no ORD; e.g., processing a large number of
photocopying orders, grading term papers (time=length?). The individual
tasks may be completed in any order; no task needs to wait until
another has finished. These are called **independent tasks**.

A fairly efficient way to schedule independent tasks is using a
**decreasing time** (priority) **list**. Idea: a common source of
inefficiency is having one processor working alone long after
all others have finished. A long task left until
the end will tie up only one processor, while everyone else sits idle.
So, make sure that the last task is as short as possible! Sort the
tasks in **decreasing** order of time to complete, and use this as
your priority list.
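For independent tasks the simulation collapses to a few lines: sort longest-first, and always hand the next task to the processor that frees up soonest (a heap makes that lookup cheap — my implementation choice, not the notes'):

```python
import heapq

def decreasing_time_schedule(times, m):
    """Schedule independent tasks on m processors using a
    decreasing-time priority list.  Returns the completion time."""
    loads = [0] * m                         # finishing time of each processor
    heapq.heapify(loads)
    for t in sorted(times, reverse=True):   # decreasing-time priority list
        earliest = heapq.heappop(loads)     # processor that frees up first
        heapq.heappush(loads, earliest + t)
    return max(loads)

# total work 25 on 2 processors; this schedule finishes at time 13,
# which matches the lower bound of 25/2 rounded up
print(decreasing_time_schedule([7, 5, 4, 3, 2, 2, 2], 2))  # 13
```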

Fact: If T is the best possible completion time for m processors,
then the decreasing time list will give a completion time of **at
most** (4/3 - 1/(3m))T (e.g., for 2 processors, this is (7/6)T).

**Bin Packing:** Set the completion time! How many processors do you
need to finish the project in a set amount of time T? (Note: no single
task can take more than time T...!)

Think: objects of differing heights (= completion times) being stacked on top of one another (= being assigned to each processor) into bins of height T (=for a total time of at most T each). How many bins (=how many processors) will you need to hold (= finish) all of the tasks?

Some likely procedures:

Next fit: keep adding items to a bin until the next one won't fit, then move on to the next bin.

First fit: put each item into the **first** bin it will fit in.

Worst fit: put each item into the open bin with the most available space (if possible), otherwise open a new bin.

These can all be inefficient; large objects appearing late in the list will not
be packed efficiently. (We all know that when packing, the **big** stuff goes in
first, and the little stuff is packed around it...)

More efficient: create a decreasing time list **first**, so that big stuff gets packed
away first. List all objects in decreasing order of size, first. Otherwise, proceed as
before. This gives:

Next fit decreasing, First fit decreasing, Worst fit decreasing.
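First fit decreasing as a sketch (items are task times, capacity is the set time T; assumes no single item exceeds the capacity, per the note above):

```python
def first_fit_decreasing(items, capacity):
    """Pack items into bins of height `capacity`: sort biggest-first,
    then put each item into the first bin it fits in."""
    bins = []
    for item in sorted(items, reverse=True):
        for b in bins:
            if sum(b) + item <= capacity:
                b.append(item)
                break
        else:
            bins.append([item])   # no bin fits: open a new one
    return bins

# 3 bins of height 8 suffice for these tasks
print(first_fit_decreasing([6, 5, 4, 3, 2, 2], 8))
```

Next fit decreasing and worst fit decreasing differ only in the inner loop: next fit looks at the current bin alone, and worst fit picks the open bin with the most room left.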
