Discrete Optimization



Constraint Programming


Linear Programming

\[\begin{align*} \min~c_1x_1+\cdots&+c_nx_n\\ a_{11}x_1+\cdots&+a_{1n}x_n= b_1\\ \cdots&\\ a_{m1}x_1+\cdots&+a_{mn}x_n= b_m\\ x_i\geq& 0 \end{align*}\]

where $b_i\geq 0$ for $i=1:m$

The solution space is a convex set (a polyhedron), and if an optimal solution exists, it is attained at a vertex.

Simplex Algorithm

  1. A basic feasible solution (BFS) can be written as:

    \[\begin{align*} x_1 &= b_1 + \sum\limits_{i=m+1}^na_{1i}x_i\\ &\cdots\\ x_m &= b_m + \sum\limits_{i=m+1}^na_{mi}x_i \end{align*}\]

    If we take $x_i=0$ for $i=m+1,\cdots,n$, then $x_i=b_i$ for $i=1,\cdots,m$.

    Here $x_i$ for $i=1:m$ are basic variables and $x_i$ for $i=m+1:n$ are non-basic variables.

  2. To move from a BFS to a better one,

    • Select an entering variable $x_e$ at the right-hand side with reduced cost $c_e<0$
    • Select a leaving variable $x_l$ at the left-hand side by the minimum ratio test: among the rows with $a_{ie}<0$ (those for which increasing $x_e$ decreases $x_i$), take the one minimizing $b_i/(-a_{ie})$
    • Perform Gaussian elimination to eliminate the basic variables

    If $c_e<0$ and all $a_{ie}\geq 0$, then $x_e$ can be increased arbitrarily without violating any constraint, so the objective decreases without bound: the problem is unbounded.

    If $b_i=0$, making $x_i$ leave does not change the objective (a degenerate pivot).

    To guarantee termination, we can always select the first possible entering variable (Bland's rule).

  3. To detect whether a BFS is optimal

    If, after eliminating all the basic variables, the objective function is of the form

    \[z = z_0 + \sum\limits_{i=m+1}^n c_ix_i\]

    with $\forall i, ~c_i\geq 0$, then the BFS is optimal: the non-basic variables are at $0$, and increasing any of them cannot decrease the objective.

  4. To find the first BFS, we can introduce artificial variables:

    \[\begin{align*} \min~c_1x_1+\cdots&+c_nx_n\\ a_{11}x_1+\cdots&+a_{1n}x_n+y_1 =b_1\\ \cdots&\\ a_{m1}x_1+\cdots&+a_{mn}x_n+y_m =b_m\\ x_i\geq& 0\\ y_j\geq& 0\\ \end{align*}\]

    It gives a BFS with the $y_j$ at the left-hand side (as basic variables).

    In order to remove the $y_j$, we first solve the following auxiliary problem:

    \[\begin{align*} \min~y_1+\cdots&+y_m\\ a_{11}x_1+\cdots&+a_{1n}x_n+y_1 =b_1\\ \cdots&\\ a_{m1}x_1+\cdots&+a_{mn}x_n+y_m =b_m\\ x_i\geq& 0\\ y_j\geq& 0 \end{align*}\]

    If its optimal value is $0$, the $y_j$ can be dropped and the resulting BFS is feasible for the original problem.
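The pivoting steps above can be sketched in code. Below is a minimal tableau simplex for problems of the form $\min c^\top x$ s.t. $Ax\leq b$, $x\geq 0$ with $b\geq 0$, so that the slack variables provide the initial BFS and no artificial variables are needed. Note that in the equality (tableau) form $Ax+s=b$, the ratio test runs over positive column entries, which mirrors the $a_{ie}<0$ condition in the dictionary form above. The function name and the toy instance are illustrative only.

```python
def simplex(c, A, b):
    """Minimize c.x subject to A x <= b, x >= 0, assuming b >= 0
    (so the slack variables directly give an initial BFS)."""
    m, n = len(A), len(c)
    # Tableau: one row per constraint plus the objective row;
    # the last column holds b (and -z in the corner).
    T = [list(A[i]) + [1.0 if j == i else 0.0 for j in range(m)] + [float(b[i])]
         for i in range(m)]
    T.append(list(c) + [0.0] * (m + 1))
    basis = list(range(n, n + m))            # slacks are basic initially
    while True:
        # Entering variable: first column with negative reduced cost (Bland's rule).
        e = next((j for j in range(n + m) if T[-1][j] < -1e-9), None)
        if e is None:
            break                            # all reduced costs >= 0: optimal
        # Leaving variable: minimum ratio test over rows with positive pivot entry.
        rows = [(T[i][-1] / T[i][e], i) for i in range(m) if T[i][e] > 1e-9]
        if not rows:
            raise ValueError("problem is unbounded")
        _, l = min(rows)
        # Gaussian elimination on the pivot column.
        piv = T[l][e]
        T[l] = [v / piv for v in T[l]]
        for i in range(m + 1):
            if i != l:
                f = T[i][e]
                T[i] = [a - f * p for a, p in zip(T[i], T[l])]
        basis[l] = e
    x = [0.0] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x, -T[-1][-1]                     # the corner of the tableau holds -z

# Toy instance: min -x1 - x2  s.t.  x1 + 2 x2 <= 4,  3 x1 + x2 <= 6
x, z = simplex([-1.0, -1.0], [[1.0, 2.0], [3.0, 1.0]], [4.0, 6.0])
# optimum at x = (1.6, 1.2) with objective z = -2.8
```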

Matrix Representation

\[\begin{array}{ccc|c} c_1 & \cdots & c_n & -z\\ \hline a_{11} & \cdots & a_{1n} & b_1\\ \vdots & & \vdots & \vdots\\ a_{i1} & \cdots & a_{in} & b_i\\ \vdots & & \vdots & \vdots\\ a_{m1} & \cdots & a_{mn} & b_m \end{array}\]


Duality

For the primal problem $\min\{c^\top x \mid Ax=b,~x\geq 0\}$, the dual problem is $\max\{b^\top y \mid A^\top y\leq c\}$. If the primal has an optimal solution, the dual has an optimal solution with the same cost.



It can be interpreted as follows: the dual problem uses linear combinations of the constraints to find a lower bound on the primal objective.

Why use the dual problem?

Suppose we have an optimal solution of the primal problem and we add a new constraint that makes this solution infeasible. The dual solution remains feasible: we only need to add a new dual variable. We can then re-optimize the dual, and at dual optimality the corresponding solution becomes feasible for the primal.

Moreover, the primal and dual problems use the same matrix tableau, so the same data can be reused.


Boyd, Stephen, and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

Primal problem

We consider the following primal problem:

\[\begin{align*} \text{minimize}~f_0(x)\\ \text{s.t.}\quad f_i(x) &\leq 0,\quad i=1:m\\ h_i(x) &= 0,\quad i=1:p \end{align*}\]

Dual problem

We can rewrite the primal problem as an unconstrained problem:

\[\begin{align*} \text{minimize}~f_0(x)+\sum\limits_{i=1}^mI_-(f_i(x))+\sum\limits_{i=1}^pI_0(h_i(x)) \end{align*}\]


where

\[I_-(u)=\begin{cases} 0 & u\leq 0\\ \infty & u>0 \end{cases} ,\quad I_0(u)=\begin{cases} 0 & u = 0\\ \infty & u\neq 0 \end{cases}\]

These functions $I_-,I_0$ express our displeasure with the constraint function value $f_i(x)$: the penalty is zero when the constraint is satisfied and infinite otherwise.

We can then replace $I_-(u)$ with the lower bound $\lambda_i u$ where $\lambda_i\geq 0$, and $I_0(u)$ with $v_i u$ where $v_i\in\mathbb{R}$.

Since $\lambda_i u\leq I_-(u)$ and $v_i u \leq I_0(u)$, we have

\[\begin{align*} L(x,\lambda,v) & := f_0(x)+\sum\limits_{i=1}^m\lambda_i f_i(x)+\sum\limits_{i=1}^p v_i h_i(x)\\ & \leq f_0(x)+\sum\limits_{i=1}^mI_-(f_i(x))+\sum\limits_{i=1}^pI_0(h_i(x)) \end{align*}\]

We define the Lagrange dual function $g(\lambda,v)$ as

\[g(\lambda,v)=\inf\limits_x L(x,\lambda,v)\]

and the dual problem:

\[\begin{align*} \text{maximize}&\quad g(\lambda,v)\\ \text{s.t.}&\quad \lambda_i \geq 0,\quad i=1:m \end{align*}\]

This dual problem is a convex optimization problem: the objective is the pointwise infimum of a family of affine functions of $(\lambda,v)$, hence concave, and we maximize it. This is the case whether or not the primal problem is convex.

Weak/Strong duality

Moreover, we can note that

\[\begin{align*} \sup\limits_{\lambda\geq 0,v}L(x,\lambda,v) &= \sup\limits_{\lambda\geq 0,v}(f_0(x)+\sum\limits_{i=1}^m\lambda_i f_i(x)+\sum\limits_{i=1}^p v_i h_i(x))\\ &=\begin{cases} f_0(x) & f_i(x)\leq 0,~h_i(x)=0\\ \infty & \text{else} \end{cases} \end{align*}\]

So we can express the optimal value of the primal problem as

\[p^*=\inf\limits_x \sup\limits_{\lambda\geq 0,v} L(x,\lambda,v)\]

and by the definition of the dual problem, we can also express the optimal value of the dual problem as

\[d^*=\sup\limits_{\lambda\geq 0,v}\inf\limits_x L(x,\lambda,v)\]

So we have the weak duality:

\[d^*\leq p^*\]


Indeed, by the definition of the infimum,

\[\forall \lambda\geq 0,v\quad \inf\limits_x L(x,\lambda,v) \leq L(x,\lambda,v)\]

so, for each $x$,

\[d^*=\sup\limits_{\lambda\geq 0,v}\inf\limits_x L(x,\lambda,v) \leq \sup\limits_{\lambda\geq 0,v}L(x,\lambda,v)\]

Therefore, taking the infimum over $x$:

\[d^* \leq \inf\limits_x\sup\limits_{\lambda\geq 0,v}L(x,\lambda,v)=p^*\]
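As a concrete check on a toy problem (an illustrative example, not from the text above): minimize $f_0(x)=x^2$ subject to $f_1(x)=1-x\leq 0$. The Lagrangian $x^2+\lambda(1-x)$ is minimized at $x=\lambda/2$, giving $g(\lambda)=\lambda-\lambda^2/4$, and both weak duality and (here) strong duality can be verified numerically:

```python
# Toy problem: minimize x^2 subject to 1 - x <= 0; primal optimum p* = 1 at x = 1.
# Dual function: g(lam) = inf_x (x^2 + lam*(1 - x)) = lam - lam**2/4  (minimum at x = lam/2).
def g(lam):
    return lam - lam ** 2 / 4

p_star = 1.0
# Crude grid search for d* = sup_{lam >= 0} g(lam); the true maximizer is lam* = 2.
d_star = max(g(k / 100) for k in range(0, 501))

assert d_star <= p_star + 1e-12       # weak duality: d* <= p*
assert abs(d_star - p_star) < 1e-12   # strong duality holds for this problem
```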



If the optimal values coincide, $d^*=p^*$, then we say that strong duality holds.

KKT Optimality Conditions

Karush-Kuhn-Tucker conditions:

\[\begin{align*} f_i(x) &\leq 0,\quad i=1:m\\ h_i(x) &= 0,\quad i=1:p\\ \lambda_i & \geq 0,\quad i=1:m\\ \lambda_if_i(x) & = 0,\quad i=1:m\\ \nabla_xL(x,\lambda,v) &= 0 \end{align*}\]

For any optimization problem for which strong duality obtains, any pair of primal and dual optimal points must satisfy the KKT conditions (necessary). When the primal problem is convex ($f_i$ are convex and $h_i$ are affine), the KKT conditions are also sufficient for the points to be primal and dual optimal.
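On a toy problem (min $x^2$ s.t. $1-x\leq 0$, with primal/dual optimum $x^*=1$, $\lambda^*=2$; an illustrative example, not from the source), the KKT conditions can be verified directly:

```python
# Toy problem: minimize x^2 subject to f1(x) = 1 - x <= 0.
# Candidate primal/dual optimal pair: x* = 1, lambda* = 2.
x, lam = 1.0, 2.0

assert 1 - x <= 0           # primal feasibility: f1(x) <= 0
assert lam >= 0             # dual feasibility
assert lam * (1 - x) == 0   # complementary slackness
assert 2 * x - lam == 0     # stationarity: d/dx [x^2 + lam*(1 - x)] = 0
```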


Economic Interpretation

Lagrange duality has an interesting economic interpretation.

Suppose the variable $x$ denotes how an enterprise operates and $f_0(x) $ denotes the cost. Each constraint $f_i(x)\leq 0$ represents a limit on resources (e.g., warehouse space, labor). The operating condition that minimizes cost while respecting the limits can be found by solving the problem:

\[\begin{align*} \text{minimize}~f_0(x)\\ \text{s.t.}\quad f_i(x) &\leq 0,\quad i=1:m \end{align*}\]

The resulting optimal cost is $p^*$.

Suppose now that the limits can be violated by paying an additional cost that is linear in the amount of violation: $\lambda_if_i(x)$. The coefficient $\lambda_i$ is interpreted as the price for violating $f_i(x)\leq 0$. We assume $\lambda_i \geq 0$: the firm must pay for violations, and it receives income when a constraint is not tight ($f_i(x)<0$).

So the total cost to the firm, for operating condition $x$, and constraint prices $\lambda_i$, is:

\[L(x,\lambda)=f_0(x)+\sum\limits_{i=1}^m \lambda_if_i(x)\]

The firm will obviously operate so as to minimize its total cost $L(x,\lambda)$, which yields a cost $g(\lambda):=\inf\limits_x L(x,\lambda)$. The optimal dual value, $d^*=\sup\limits_{\lambda\geq 0}g(\lambda)$, is the optimal cost to the enterprise under the least favorable set of prices.

$d^*\leq p^*$ since in the second scenario some income can be derived from the constraints that are not tight. And if strong duality holds, the $\lambda^*$ is a set of prices for which there is no advantage to the firm in being allowed to pay for constraint violations or receive payments for nontight constraints. $\lambda^*$ is sometimes called a set of shadow prices for the primal problem.

Slater’s condition

For the following problem:

\[\begin{align*} \text{minimize}~f_0(x)\\ \text{s.t.}\quad f_i(x) &\leq 0,\quad i=1:m\\ Ax&=b \end{align*}\]

If there exists an $x\in\text{relint}D$ with

\[\begin{align*} f_i(x)\leq & 0,\quad i=1:k\\ f_i(x)< & 0,\quad i=k+1:m\\ Ax=&b \end{align*}\]

where $D=\bigcap\limits_{i=1}^m\text{dom}~f_i$ and $f_i(x),i=1:k$ are affine,

then strong duality holds.


where the relative interior and the affine hull are defined by

\[\text{relint}~C=\{x\in C~|~ B(x,r) \cap \text{aff}~C\subset C~\text{for some}~r>0\}\]

\[\text{aff}~C=\{\theta_1x_1+\cdots+\theta_kx_k~|~x_1,\cdots,x_k\in C,~\theta_1+\cdots+\theta_k=1\}\]

For example, the affine hull of the unit circle is the whole plane:

\[\text{aff}\{x\in\mathbb{R}^2~|~x_1^2+x_2^2=1\}=\mathbb{R}^2\]

Mixed Integer Programming

Some of the variables are required to be integers:

\[\begin{align*} \min~c_1x_1+\cdots&+c_nx_n\\ a_{11}x_1+\cdots&+a_{1n}x_n= b_1\\ \cdots&\\ a_{m1}x_1+\cdots&+a_{mn}x_n= b_m\\ x_i\geq& 0\\ x_i\in& \mathbb{N}~(i\in I) \end{align*}\]

Integer variables are typically 0/1 variables.

Branch and bound

Solve the linear relaxation (drop the integrality constraints). If some $x_i$ ($i\in I$) takes a fractional value $v$ in the relaxed optimum, branch into two subproblems with $x_i\leq\lfloor v\rfloor$ and $x_i\geq\lceil v\rceil$; the relaxation value gives a bound used to prune subproblems that cannot beat the best integer solution found so far.
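The scheme can be sketched on a 0/1 knapsack, where the linear relaxation has a closed-form greedy solution (a minimal sketch under that simplification; the function name and instance are illustrative):

```python
def knapsack_bb(values, weights, cap):
    """Branch and bound for 0/1 knapsack (a maximization), using the LP
    (fractional) relaxation as the bound. Items must be sorted by
    value/weight ratio in decreasing order for the greedy bound to be valid."""
    n = len(values)
    best = [0]

    def bound(i, w, v):
        # LP relaxation from item i on: fill greedily, last item fractionally.
        for j in range(i, n):
            if weights[j] <= cap - w:
                w += weights[j]
                v += values[j]
            else:
                return v + values[j] * (cap - w) / weights[j]
        return v

    def branch(i, w, v):
        if v > best[0]:
            best[0] = v                       # update the incumbent
        if i == n or bound(i, w, v) <= best[0]:
            return                            # prune: bound cannot beat incumbent
        if weights[i] <= cap - w:
            branch(i + 1, w + weights[i], v + values[i])   # branch x_i = 1
        branch(i + 1, w, v)                                # branch x_i = 0

    branch(0, 0, 0)
    return best[0]

best_value = knapsack_bb([60, 100, 120], [10, 20, 30], 50)  # -> 220
```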

Big M Transformation

The constraint that two integer variables $x,y$ are different, $x\neq y$, is equivalent to the disjunction

\[x\leq y-1 \quad\text{or}\quad x\geq y+1\]

Since disjunctions are not allowed in a MIP model, we can introduce a 0/1 variable $b$ and a large constant $M$:

\[\begin{align*} x\leq&~ y -1+bM\\ x\geq&~ y+1-(1-b)M \end{align*}\]

But the linear relaxation will simply choose $b=0.5$, which makes both constraints non-binding, so the relaxation is very weak.
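The encoding itself can be sanity-checked by brute force over a small integer range, with $M$ large enough for that range (the values below are illustrative):

```python
# Verify that the big-M encoding of x != y is equivalent to the disjunction
# x <= y - 1  or  x >= y + 1, for integers in [-5, 5] and M = 100.
M = 100
mismatches = [
    (x, y)
    for x in range(-5, 6)
    for y in range(-5, 6)
    if any(x <= y - 1 + b * M and x >= y + 1 - (1 - b) * M for b in (0, 1))
       != (x != y)
]
assert not mismatches  # the encoding is exact once b is forced to be 0/1
```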

Gomory Cut

The idea is to add a linear constraint that cuts off the current fractional solution without removing any integer solution.

For example, suppose the optimal tableau contains the following row, where $b_1$ is fractional:

\[x_1+\sum\limits_{j=m+1}a_{1j}x_j = b_1\]
Since $\sum\limits_{j=m+1}\lfloor a_{1j}\rfloor x_j \leq \sum\limits_{j=m+1}a_{1j}x_j$ (as $x_j\geq 0$ and $\lfloor a_{1j}\rfloor\leq a_{1j}$), we have

\[x_1+\sum\limits_{j=m+1}\lfloor a_{1j}\rfloor x_j\leq b_1\]

Since the left-hand side is an integer in any integer solution,

\[x_1+\sum\limits_{j=m+1}\lfloor a_{1j}\rfloor x_j\leq \lfloor b_1 \rfloor\]


Subtracting this inequality from the original row gives

\[\sum\limits_{j=m+1}(a_{1j}-\lfloor a_{1j}\rfloor) x_j\geq b_1 - \lfloor b_1 \rfloor\]

In practice, the cut is added via a slack variable $s$ which must be non-negative:

\[s=\sum\limits_{j=m+1}(a_{1j}-\lfloor a_{1j}\rfloor) x_j - (b_1 - \lfloor b_1 \rfloor)\]
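Computing the cut from a tableau row amounts to taking fractional parts of the coefficients (the row data below is hypothetical):

```python
import math

# Hypothetical optimal-tableau row: x1 + 0.75*x3 - 0.25*x4 = 3.5 (b1 fractional).
row = {3: 0.75, 4: -0.25}
b1 = 3.5

# Gomory cut: sum_j (a_1j - floor(a_1j)) x_j >= b1 - floor(b1)
cut_coeffs = {j: a - math.floor(a) for j, a in row.items()}
cut_rhs = b1 - math.floor(b1)
# cut_coeffs == {3: 0.75, 4: 0.75}  (note floor(-0.25) = -1), cut_rhs == 0.5
```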


Polyhedral Cuts

Cuts that represent the facets of the convex hull of the integer solutions.

If we had all such cuts, we could solve the problem with linear programming alone, since every vertex of the convex hull is an integer solution.


Cover Cuts

Consider the following constraint:

\[\sum\limits_{j=1}^n a_jx_j\leq b\]

A set $C\subset\{1:n\}$ is a cover if

\[\sum\limits_{j\in C}a_j > b\]

and a cover is minimal if no proper subset of it is a cover.

If $C$ is a cover, then every feasible 0/1 solution needs to satisfy

\[\sum\limits_{j\in C}x_j \leq |C|-1\]

which is equivalent to

\[\sum\limits_{j\in C}(1-x_j) \geq 1\]

Given a fractional solution $x^*$, we would like to know whether there exists a cover cut that cuts it off, i.e. whether there is a cover $C$ whose cover inequality is violated by $x^*$:

\[\sum\limits_{j\in C}a_j > b,\quad\sum\limits_{j\in C}(1-x_j^*) < 1\]

To answer this question, we only need to solve the following problem:

\[\begin{align*} \min\quad &\sum\limits_{j\in\{1:n\}}(1-x_j^*)z_j\\ \text{s.t.}\quad & \sum\limits_{j\in\{1:n\}}a_jz_j > b\\ &z_j\in\{0,1\} \end{align*}\]

If the minimum value is lower than $1$, then we have a cut; otherwise no violated cover cut exists.

By replacing $z_j$ with $1-y_j$, this separation problem becomes a knapsack problem.
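For small instances, the separation problem can also be solved by brute force over the subsets instead of the knapsack reformulation (the instance data below is illustrative):

```python
from itertools import combinations

# Knapsack-type constraint sum_j a_j x_j <= b and a fractional solution x*.
a = [3, 4, 5, 6]
b = 9
x_star = [1.0, 1.0, 0.5, 0.0]

# Find the cover C minimizing sum_{j in C} (1 - x*_j); a value < 1 means the
# cover inequality sum_{j in C} x_j <= |C| - 1 cuts off x*.
best = None
for r in range(1, len(a) + 1):
    for C in combinations(range(len(a)), r):
        if sum(a[j] for j in C) > b:          # C is a cover
            violation = sum(1 - x_star[j] for j in C)
            if best is None or violation < best[0]:
                best = (violation, C)

# best == (0.5, (0, 1, 2)): the cut x1 + x2 + x3 <= 2 separates x*,
# since x*_1 + x*_2 + x*_3 = 2.5 > 2.
```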

Column Generation

Column Generation for Cutting Stock