Steven R. Dunbar

Department of Mathematics

203 Avery Hall

University of Nebraska-Lincoln

Lincoln, NE 68588-0130

http://www.math.unl.edu

Voice: 402-472-3731

Fax: 402-472-8466

Topics in

Probability Theory and Stochastic Processes

Steven R. Dunbar

__________________________________________________________________________

Smoothed Analysis of Linear Optimization

_______________________________________________________________________

Note: To read these pages properly, you will need the latest version of the Mozilla Firefox browser, with the STIX fonts installed. In a few sections, you will also need the latest Java plug-in, and JavaScript must be enabled. If you use a browser other than Firefox, you should be able to access the pages and run the applets. However, mathematical expressions will probably not display correctly. Firefox is currently the only browser that supports all of the open standards.

_______________________________________________________________________________________________

Mathematicians Only: prolonged scenes of intense rigor.

_______________________________________________________________________________________________

__________________________________________________________________________

- The performance of an algorithm is usually measured by its running time, expressed as a function of the input size of the problem it solves.
- The performance profiles of algorithms across the landscape of input instances can differ greatly.
- Average-case analyses employ distributions with concise mathematical descriptions, such as Gaussian random vectors, uniform vectors, and other standard distributions. The drawback of using such distributions is that the inputs in practice may have little resemblance to the inputs that are likely to be generated.
- An alternative is to identify typical properties of real data, define an input model that captures these properties, and then rigorously analyze the performance of algorithms assuming their inputs have these properties. Smoothed analysis is a step in this direction.

__________________________________________________________________________

- The worst case measure is defined as
$${WC}_{A}\left[n\right]=\underset{x\in {\Omega}_{n}}{max}{T}_{A}\left[x\right].$$
- Suppose $S$
provides a distribution over each ${\Omega}_{n}$,
the average case measure corresponding to $S$
is:
$${Ave}_{A}^{S}\left[n\right]=E\left[{T}_{A}\left[x\right]\right]$$
where the expectation is over $x{\in}_{S}{\Omega}_{n}$ indicating that $x$ is randomly chosen from ${\Omega}_{n}$ according to distribution $S$.

- A Gaussian random vector of variance ${\sigma}^{2}$, centered at the origin in ${\Omega}_{n}={\mathbb{R}}^{n}$ is a vector in which each entry is an independent Gaussian random variable of variance ${\sigma}^{2}$ and mean $0$.
- The smoothed complexity of $A$
with $\sigma $-Gaussian
perturbations is given by
$${Smoothed}_{A}^{{\sigma}^{2}}\left[n\right]=\underset{{x}_{0}\in {\left[-1,1\right]}^{n}}{max}E\left[{T}_{A}\left({x}_{0}+g\right)\right]$$
where $g$ is a Gaussian random vector of variance ${\sigma}^{2}$.

__________________________________________________________________________

The performance of an algorithm is usually measured by its running time, expressed as a function of the input size of the problem it solves. The performance profiles of algorithms across the landscape of input instances can differ greatly and can be quite irregular. Some algorithms run in time linear in the input size on all instances, some take quadratic or higher-order polynomial time, while some may take an exponential amount of time on some instances. For example, we showed in Worst Case and Average Case Behavior of the Simplex Algorithm that on the Klee-Minty example in ${\mathbb{R}}^{n}$ the Simplex Algorithm with Dantzig’s Rule for pivoting takes ${2}^{n}-1$ steps.

Although we normally evaluate the performance of an algorithm by its running time, other performance parameters are often important. These performance measures include the amount of memory space required, the number of bits of precision required to achieve a given output accuracy, the number of cache misses, the error probability of a decision algorithm, the number of random bits needed in a randomized algorithm, the number of calls to a given subroutine, and the number of examples needed in a learning algorithm.

When $A$ is an algorithm for solving problem $P$, we let ${T}_{A}\left[x\right]$ denote the running time of algorithm $A$ on input instance $x$. An input domain $\Omega $ of all input instances is usually viewed as the union of a family of subdomains $\left\{{\Omega}_{1},{\Omega}_{2},\dots ,{\Omega}_{n},\dots .\right\}$ where ${\Omega}_{n}$ represents all instances in $\Omega $ of size $n$.

The worst case measure is defined as

$${WC}_{A}\left[n\right]=\underset{x\in {\Omega}_{n}}{max}{T}_{A}\left[x\right].$$

For example, the Klee-Minty example in Worst Case and Average Case Behavior of the Simplex Algorithm shows that

$${WC}_{\text{Simplex}}\left[n\right]\ge C\cdot {2}^{n}$$

where $C$ is some constant measuring the running time at each pivot.

The average-case measures have more parameters. In each average-case measure, one first determines a distribution of inputs and then measures the expected performance of the algorithm assuming inputs are drawn from this distribution. Supposing $S$ provides a distribution over each ${\Omega}_{n}$, the average-case measure corresponding to $S$ is:

$${Ave}_{A}^{S}\left[n\right]=E\left[{T}_{A}\left[x\right]\right]$$

where the expectation is over $x{\in}_{S}{\Omega}_{n}$ indicating that $x$ is randomly chosen from ${\Omega}_{n}$ according to distribution $S$. One would ideally choose the distribution of inputs that occurs in practice, but it is rare that one can determine or cleanly express these distributions. Furthermore, the distributions can vary greatly from one application to another. Instead, average-case analyses have employed distributions with concise mathematical descriptions, such as Gaussian random vectors, uniform vectors, and other standard distributions. The drawback of using such distributions is that the inputs in practice may have little resemblance to the inputs that are likely to be generated.
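To make the worst-case and average-case measures concrete, both can be estimated empirically. The sketch below is our own toy illustration, not from the article: it takes ${T}_{A}\left[x\right]$ to be the number of comparisons insertion sort performs on input $x$, draws inputs uniformly from ${\left[0,1\right]}^{n}$ as the distribution $S$, and estimates the average by Monte Carlo sampling.

```python
import random

def insertion_sort_cost(x):
    """A running-time proxy T_A[x]: the number of comparisons
    insertion sort performs while sorting x."""
    a, cost = list(x), 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            cost += 1                      # one comparison
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return cost

def average_case(n, trials=2000, seed=0):
    """Monte Carlo estimate of Ave_A^S[n] = E[T_A[x]], with x drawn
    uniformly from [0,1]^n (our choice of distribution S)."""
    rng = random.Random(seed)
    total = sum(insertion_sort_cost([rng.random() for _ in range(n)])
                for _ in range(trials))
    return total / trials

# The worst case WC_A[n] is attained at a reverse-sorted input:
# n(n-1)/2 comparisons, well above the average of roughly n^2/4.
worst = insertion_sort_cost(list(range(20, 0, -1)))  # n(n-1)/2 = 190 for n = 20
```

For $n=20$ the worst case is $190$ comparisons while the average is roughly ${n}^{2}/4$, illustrating how far apart the two measures can sit for one and the same algorithm.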

An alternative is to identify typical properties of real data, define an input model that captures these properties, and then rigorously analyze the performance of algorithms assuming their inputs have these properties. Smoothed analysis is a step in this direction. It is motivated by the observation that real data is often subject to some small degree of noise. For example, in industrial optimization and economic prediction, the input parameters could be obtained by physical measurements, and the measurements usually have some low-magnitude uncertainty. At a high level, each input is generated from a two-stage model. In the first stage, an instance of the problem is formulated according to, say, physical, industrial or economic considerations. In the second stage, the instance from the first stage is slightly perturbed. The perturbed instance is the input to the algorithm.

In smoothed analysis, we assume the input to the algorithm is subject to a slight random perturbation. The smoothed measure of an algorithm on an input instance is its expected performance over the perturbations of that instance. Define the smoothed complexity of the algorithm to be the maximum smoothed measure over the input instances.

A Gaussian random vector of variance ${\sigma}^{2}$, centered at the origin in ${\Omega}_{n}={\mathbb{R}}^{n}$, is a vector in which each entry is an independent Gaussian random variable of variance ${\sigma}^{2}$ and mean $0$, meaning that the probability density of each entry in the vector is

$$\frac{1}{\sqrt{2\pi {\sigma}^{2}}}\,{e}^{-{x}^{2}/\left(2{\sigma}^{2}\right)}.$$

For a vector ${x}_{0}\in {\mathbb{R}}^{n}$, the $\sigma $-Gaussian perturbation of ${x}_{0}$ is a random vector $x={x}_{0}+g$ where $g$ is a Gaussian random vector of variance ${\sigma}^{2}$.
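Numerically, a $\sigma $-Gaussian perturbation is straightforward to generate. This sketch uses NumPy; the helper name is our own, not notation from the article.

```python
import numpy as np

def gaussian_perturbation(x0, sigma, rng=None):
    """Return x = x0 + g, the sigma-Gaussian perturbation of x0:
    each entry of g is an independent N(0, sigma^2) random variable."""
    rng = np.random.default_rng() if rng is None else rng
    g = rng.normal(loc=0.0, scale=sigma, size=np.shape(x0))
    return np.asarray(x0) + g

# A slight perturbation of a point in [-1,1]^3:
x0 = np.array([0.5, -0.25, 1.0])
x = gaussian_perturbation(x0, sigma=0.01, rng=np.random.default_rng(0))
```

Passing an explicit `rng` makes the perturbation reproducible; note that `scale` is the standard deviation $\sigma $, not the variance ${\sigma}^{2}$.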

Definition. Suppose $A$ is an algorithm with ${\Omega}_{n}={\mathbb{R}}^{n}$. Then the smoothed complexity of $A$ with $\sigma $-Gaussian perturbations is given by

$${Smoothed}_{A}^{{\sigma}^{2}}\left[n\right]=\underset{{x}_{0}\in {\left[-1,1\right]}^{n}}{max}E\left[{T}_{A}\left({x}_{0}+g\right)\right]$$

where $g$ is a Gaussian random vector of variance ${\sigma}^{2}$.

In words, this definition says:

- Perturb the original input ${x}_{0}$ to obtain the input ${x}_{0}+g$.
- Feed the perturbed input into the algorithm.
- For each original input, measure the expected running time of the algorithm $A$ on random perturbations of that input.
- The smoothed complexity is then the maximum of these expected running times, taken over all possible original inputs.
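The steps above translate directly into a small estimator. This is our own toy illustration: the cost function is a stand-in for the running time ${T}_{A}$, and the maximum is scanned over a short list of candidate instances rather than all of ${\left[-1,1\right]}^{n}$ as the true definition requires.

```python
import numpy as np

def smoothed_measure(cost, x0, sigma, trials=500, seed=0):
    """Expected cost over sigma-Gaussian perturbations of a single
    instance x0 (the first three steps above)."""
    rng = np.random.default_rng(seed)
    return float(np.mean([cost(x0 + rng.normal(0.0, sigma, size=x0.shape))
                          for _ in range(trials)]))

def smoothed_complexity(cost, instances, sigma):
    """Maximum smoothed measure over candidate instances (the last step);
    the true definition maximizes over all of [-1,1]^n."""
    return max(smoothed_measure(cost, x0, sigma) for x0 in instances)

# Toy cost: the number of inversions of a vector, a proxy for sorting time.
def inversions(x):
    n = len(x)
    return sum(x[i] > x[j] for i in range(n) for j in range(i + 1, n))

candidates = [np.linspace(1, -1, 10), np.zeros(10)]  # reverse-sorted vs. flat
value = smoothed_complexity(inversions, candidates, sigma=0.1)
```

With small $\sigma $ the reverse-sorted instance keeps nearly all of its $45$ inversions, so it, not the flat instance, attains the maximum.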

By varying ${\sigma}^{2}$ between $0$ and infinity, one can use smoothed analysis to interpolate between worst-case and average-case analysis. When ${\sigma}^{2}=0$, one recovers the ordinary worst-case analysis. As ${\sigma}^{2}$ grows large, the random perturbation $g$ dominates the original ${x}_{0}$ and one obtains an average-case analysis. We are often interested in the case when $\sigma $ (the standard deviation, measured in the same units as $\parallel x\parallel $) is small relative to $\parallel x\parallel $, in which case $x+g$ is a slight perturbation of $x$. Smoothed analysis often demonstrates that a perturbed problem is less time-consuming to solve.
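This interpolation is visible in a toy experiment of our own devising: take the number of inversions of a length-$10$ vector as a stand-in running time, start from the reverse-sorted worst case, and watch the expected cost move from the worst-case value toward the average-case value as $\sigma $ grows.

```python
import numpy as np

def inversions(x):
    """Toy running-time proxy: the number of out-of-order pairs."""
    n = len(x)
    return sum(x[i] > x[j] for i in range(n) for j in range(i + 1, n))

def expected_cost(x0, sigma, trials=2000, seed=0):
    """Mean cost over sigma-Gaussian perturbations of x0."""
    rng = np.random.default_rng(seed)
    return float(np.mean([inversions(x0 + rng.normal(0.0, sigma, len(x0)))
                          for _ in range(trials)]))

x0 = np.linspace(1, -1, 10)   # worst case: all 45 pairs inverted
costs = {sigma: expected_cost(x0, sigma) for sigma in (0.0, 0.1, 1.0, 10.0)}
# sigma = 0 recovers the worst case (45 inversions); as sigma grows,
# the noise drowns out x0 and the cost approaches the average-case
# value for a random vector (45/2 = 22.5).
```

The numbers slide monotonically from $45$ down toward $22.5$, which is the interpolation between worst case and average case described above.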

Definition. Algorithm $A$ has polynomial smoothed complexity if there exist positive constants ${n}_{0}$, ${\sigma}_{0}$, $c$, ${k}_{1}$ and ${k}_{2}$ such that for all $n\ge {n}_{0}$, and $0\le \sigma \le {\sigma}_{0}$

$${Smoothed}_{A}^{{\sigma}^{2}}\left[n\right]\le c\cdot {\sigma}^{-{k}_{1}}\cdot {n}^{{k}_{2}}.$$

Recall Markov’s Inequality: If $X$ is a random variable that takes only nonnegative values, then for any $a>0$:

$$\mathbb{P}\left[X\ge a\right]\le E\left[X\right]/a.$$
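Markov’s Inequality is easy to check empirically. This sketch, our own illustration, compares the sampled tail probability of an exponential random variable with $E\left[X\right]=1$ against the bound $E\left[X\right]/a$:

```python
import random

rng = random.Random(1)
samples = [rng.expovariate(1.0) for _ in range(100000)]  # E[X] = 1
mean = sum(samples) / len(samples)

for a in (2.0, 5.0, 10.0):
    tail = sum(x >= a for x in samples) / len(samples)
    # Markov: P[X >= a] <= E[X]/a.  For this distribution the true
    # tail e^{-a} sits far below the bound.
    assert tail <= mean / a
```

The bound is loose here (${e}^{-2}\approx 0.135$ versus $0.5$), which is typical: Markov’s Inequality trades tightness for assuming nothing beyond nonnegativity.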

Therefore, if an algorithm $A$ has smoothed complexity $T\left(n,\sigma \right)$, then

$$\underset{{x}_{0}\in {\left[-1,1\right]}^{n}}{min}\mathbb{P}\left[{T}_{A}\left[{x}_{0}+g\right]\le {\delta}^{-1}T\left[n,\sigma \right]\right]\ge 1-\delta $$

Proof. By the definition of smoothed complexity, $E\left[{T}_{A}\left[{x}_{0}+g\right]\right]\le T\left[n,\sigma \right]$ for every ${x}_{0}\in {\left[-1,1\right]}^{n}$. Since running times are nonnegative, Markov’s Inequality with $a={\delta}^{-1}T\left[n,\sigma \right]$ gives $\mathbb{P}\left[{T}_{A}\left[{x}_{0}+g\right]\ge {\delta}^{-1}T\left[n,\sigma \right]\right]\le \delta $, so the complementary event has probability at least $1-\delta $. This holds for every ${x}_{0}$, in particular for the worst one. □

This says that if $A$ has polynomial smoothed complexity, then for any ${x}_{0}$, with probability at least $1-\delta $, $A$ can solve a random perturbation of ${x}_{0}$ in time polynomial in $n$, $1/\sigma $, and $1/\delta $.

This probabilistic upper bound does not imply that the smoothed complexity of $A$ is $O\left(T\left[n,\sigma \right]\right)$. Blum, Dunagan, Beier, and Vöcking introduced a relaxation of polynomial smoothed complexity:

Definition. Algorithm $A$ has probably polynomial smoothed complexity if there exist constants ${n}_{0}$, ${\sigma}_{0}$, $c$ and $\alpha $ such that for $n\ge {n}_{0}$ and $0\le \sigma \le {\sigma}_{0}$,

$$\underset{{x}_{0}\in {\left[-1,1\right]}^{n}}{max}E\left[{T}_{A}{\left[{x}_{0}+g\right]}^{\alpha}\right]\le c\cdot {\sigma}^{-1}\cdot n.$$

They show that some algorithms have probably polynomial smoothed complexity, in spite of the fact that their smoothed complexity is unbounded.

Spielman and Teng considered the smoothed complexity of the simplex algorithm with the shadow-vertex pivot rule developed by Gass and Saaty. They show that the smoothed complexity of the algorithm is polynomial. Vershynin improved their result to obtain a smoothed complexity of

$$O\left(max\left({n}^{5}\cdot {\left(log\left(m\right)\right)}^{2},{n}^{9}\cdot {\left(log\left(m\right)\right)}^{4},{n}^{3}\cdot {\sigma}^{-4}\right)\right)$$

This section is adapted from the article “Smoothed Analysis: An attempt to explain the behavior of algorithms in practice” by Daniel A. Spielman and Shang-Hua Teng, [1].

_______________________________________________________________________________________________

__________________________________________________________________________

[1] Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis: An attempt to explain the behavior of algorithms in practice. Communications of the ACM, 52(10):77–84, October 2009.

__________________________________________________________________________

__________________________________________________________________________


Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1

Email to Steve Dunbar, sdunbar1 at unl dot edu

Last modified: Processed from LATEX source on January 27, 2011