
Contiguous minimum single-source multi-sink
cuts in weighted planar graphs,
Ivona Bezakova and Zachary Langley.
Proceedings of the 18th Annual International Conference on Computing and Combinatorics (COCOON),
Lecture Notes in Computer Science, Volume 7434, 49–60, Springer, 2012.
We present a fast algorithm for uniform sampling of contiguous minimum cuts separating a source vertex from a set of sink vertices
in a weighted undirected planar graph with $n$ vertices embedded in the plane. The algorithm takes O(n) time per sample, after an
initial O(n^{3}) preprocessing time during which the algorithm computes the number of all such contiguous minimum cuts.
Contiguous cuts (that is, cuts where a naturally defined boundary around the cut set forms a simply connected planar region) have
applications in computer vision and medical imaging.

Counting and sampling minimum (s,t)-cuts in weighted
planar graphs in polynomial time,
Ivona Bezakova and Adam J. Friedlander.
Theoretical Computer Science, Elsevier. Available online (May 2011), to appear in print. (Special issue for MFCS '10.)
Extended abstract in the Proceedings of the 35th Annual Symposium on Mathematical Foundations of Computer Science (MFCS),
Lecture Notes in Computer Science, Volume 6281/2010, 126–137, Springer, 2010.
A presentation can be found here.
We give an O(nd + n log n) algorithm computing the number of minimum
(s,t)-cuts in weighted planar graphs, where n is the number of vertices and d
is the length of the shortest s-t path in the corresponding unweighted graph.
Previously, Ball and Provan gave a polynomial-time algorithm for unweighted
planar graphs with both s and t lying on the outer face. Our results hold for
all locations of s and t and weighted graphs, and have direct applications in
computer vision.
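To make the counting problem concrete, here is a brute-force sketch (my own toy illustration, not the paper's O(nd + n log n) algorithm): enumerate every (s,t)-cut of a small weighted graph and count those achieving the minimum weight.

```python
from itertools import combinations

def count_min_st_cuts(vertices, edges, s, t):
    """edges: dict mapping frozenset({u, v}) -> weight."""
    others = [v for v in vertices if v not in (s, t)]
    best, count = float("inf"), 0
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            side = {s, *extra}                     # s-side of the cut
            weight = sum(w for e, w in edges.items()
                         if len(e & side) == 1)    # edges crossing the cut
            if weight < best:
                best, count = weight, 1
            elif weight == best:
                count += 1
    return best, count

# Unit-weight 4-cycle s-a-t-b: minimum cut weight 2, achieved by 4 cuts.
edges = {frozenset(e): 1
         for e in [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]}
print(count_min_st_cuts(["s", "a", "b", "t"], edges, "s", "t"))   # (2, 4)
```

The enumeration takes exponential time, which is exactly what the paper's planarity-based algorithm avoids.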

Path lengths in protein–protein interaction networks
and biological complexity,
Ke Xu, Ivona Bezakova, Leonid Bunimovich, and Soojin V. Yi
Proteomics,
Volume 11, Issue 10, 1857–1867, Wiley, May 2011.
We investigated the biological significance of path lengths in 12 protein–protein interaction
(PPI) networks. We put forward three predictions, based on the idea that biological
complexity influences path lengths. First, at the network level, path lengths are generally
longer in PPIs than in random networks. Second, this pattern is more pronounced in more
complex organisms. Third, within a PPI network, path lengths of individual proteins are
biologically significant. We found that in 11 of the 12 species, average path lengths in PPI
networks are significantly longer than those in randomly rewired networks. The PPI network
of the malaria parasite Plasmodium falciparum, however, does not exhibit deviation from
rewired networks. Furthermore, eukaryotic PPIs exhibit significantly greater deviation from
randomly rewired networks than prokaryotic PPIs. Thus our study highlights the potentially
meaningful variation in path lengths of PPI networks. Moreover, node eccentricity, defined as
the longest shortest path from a protein to any other, is significantly correlated with the levels of gene
expression and dispensability in the yeast PPI network. We conclude that biological
complexity influences both global and local properties of path lengths in PPI networks.
Investigating variation of path lengths may provide new tools to analyze the evolution of
functional modules in biological systems.
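The two path-length statistics discussed above can be sketched with plain breadth-first search; the adjacency dict below is hypothetical toy data, not one of the PPI networks from the study.

```python
from collections import deque

def bfs_distances(adj, src):
    """Shortest-path (hop) distances from src in an unweighted graph."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def average_path_length(adj):
    total, pairs = 0, 0
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if v != u:
                total, pairs = total + d, pairs + 1
    return total / pairs          # average over ordered reachable pairs

def eccentricity(adj, u):
    return max(bfs_distances(adj, u).values())   # longest shortest path from u

adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(average_path_length(adj))   # average distance is 5/3 on this path graph
print(eccentricity(adj, "a"))     # 3
```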

Computing and Counting Longest Paths on Circular-Arc Graphs in Polynomial Time,
George B. Mertzios and Ivona Bezakova.
Proceedings of the 6th Latin-American Algorithms, Graphs, and Optimization Symposium (LAGOS),
Electronic Notes in Discrete Mathematics, Volume 37, 219–224, Elsevier, 2011.
Full version in preparation for submission.
A presentation can be found here.
The longest path problem asks for a path with the largest number of vertices in a
given graph. In contrast to the Hamiltonian path problem, until recently polynomial
algorithms for the longest path problem were known only for small graph classes, such
as trees. Recently, a polynomial algorithm for this problem on interval graphs has been
presented by Ioannidou et al. with running time O(n^{4}) on a graph with n vertices, thus answering
the open question posed by Uehara and Uno. Even though interval and circular-arc graphs look
superficially similar, they differ substantially, as circular-arc graphs are not perfect; for
instance, several problems, e.g. minimum coloring, are NP-hard on circular-arc graphs,
although they can be efficiently solved on interval graphs. In this paper, we prove that
for every path P of a circular-arc graph G, we can appropriately "cut" the circle, such
that the obtained (not induced) interval subgraph G' of G admits a path P' on the same
vertices as P. This nontrivial result is of independent interest, as it suggests a generic
reduction of a number of path problems on circular-arc graphs to the case of interval
graphs with a multiplicative linear time overhead of O(n). As an application of this
reduction, we present the first polynomial algorithm for the longest path problem on
circular-arc graphs. In addition, by exploiting the structure of circular-arc graphs more deeply,
we manage to get rid of the linear time overhead of the reduction, and thus this algorithm
turns out to have the same running time O(n^{4}) as the one on interval graphs. Our
algorithm, which significantly simplifies the approach of Ioannidou et al., computes in the same time
an n-approximation of the (exponentially large in the worst case) number of different vertex
sets that provide a longest path; in the case where G is an interval graph, we compute
the exact number. Moreover, in contrast to Ioannidou et al., this algorithm can be directly extended
with the same running time to the case where every vertex has an arbitrary positive weight.
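For contrast with the polynomial algorithm described above, the problem itself can be stated in a few lines of brute-force search (exponential time in general; the toy graph here is a hypothetical example, not from the paper):

```python
def longest_path(adj):
    """Brute-force longest simple path via depth-first enumeration."""
    best = []
    def extend(path):
        nonlocal best
        if len(path) > len(best):
            best = path[:]
        for v in adj[path[-1]]:
            if v not in path:       # keep the path simple
                path.append(v)
                extend(path)
                path.pop()
    for s in adj:
        extend([s])
    return best

# 4-cycle 0-1-2-3 with a pendant vertex 4 attached to 0.
adj = {0: [1, 3, 4], 1: [0, 2], 2: [1, 3], 3: [2, 0], 4: [0]}
print(len(longest_path(adj)))   # 5 (e.g. the path 4-0-1-2-3)
```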

On the Diaconis–Gangolli Markov Chain for
Sampling Contingency Tables with Cell-Bounded Entries,
Ivona Bezakova, Nayantara Bhatnagar, and Dana Randall.
Journal of Combinatorial Optimization, Springer. Available online (May 2010), to appear in print.
(Special issue for COCOON '09.)
Extended abstract in the Proceedings of the 15th
Annual International Conference on Computing and Combinatorics (COCOON),
Lecture Notes in Computer Science, Volume 5609/2009, 307–316, Springer, 2009.
A presentation can be found here.
The problems of uniformly sampling and approximately
counting contingency tables have been widely studied, but efficient solutions
are only known in special cases. One appealing approach is the
Diaconis and Gangolli Markov chain which updates the entries of a random
2 x 2 submatrix. This chain is known to be rapidly mixing for
cell-bounded tables only when the cell bounds are all 1 and the row and
column sums are regular. We demonstrate that the chain can require
exponential time to mix in the cell-bounded case, even if we restrict to
instances for which the state space is connected. Moreover, we show the
chain can be slowly mixing even if we restrict to natural classes of problem
instances, including regular instances with cell bounds of either 0 or
1 everywhere, and dense instances where at least a linear number of cells
in each row or column have nonzero cell bounds.
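One step of the Diaconis and Gangolli chain with cell bounds can be sketched as follows (my own minimal illustration; the paper analyzes the chain's mixing time, not an implementation):

```python
import random

def dg_step(table, bounds, rng=random):
    """One Diaconis-Gangolli move: +/-1 along the diagonals of a random
    2 x 2 submatrix, which preserves all row and column sums; the move is
    rejected if any cell would leave the range [0, bound]."""
    n, m = len(table), len(table[0])
    i, j = rng.sample(range(n), 2)      # two distinct rows
    a, b = rng.sample(range(m), 2)      # two distinct columns
    s = rng.choice([1, -1])
    update = [(i, a, s), (j, b, s), (i, b, -s), (j, a, -s)]
    if all(0 <= table[r][c] + d <= bounds[r][c] for r, c, d in update):
        for r, c, d in update:
            table[r][c] += d            # accept the move
    return table

table = [[1, 0, 2], [0, 2, 1], [2, 1, 0]]
bounds = [[2] * 3 for _ in range(3)]
for _ in range(100):
    dg_step(table, bounds)
print([sum(row) for row in table])          # row sums stay [3, 3, 3]
print([sum(col) for col in zip(*table)])    # column sums stay [3, 3, 3]
```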

Sampling Edge Covers in 3-Regular Graphs,
Ivona Bezakova and William A. Rummler.
Proceedings of the 34th Annual Symposium on Mathematical Foundations of Computer Science (MFCS),
Lecture Notes in Computer Science, Volume 5734/2009, 137–148, Springer, 2009.
A presentation can be found here.
An edge cover C of an undirected graph is a set of edges such
that every vertex has an adjacent edge in C. We show that a Glauber
dynamics Markov chain for edge covers mixes rapidly for graphs with
degrees at most three. Glauber dynamics have been studied extensively
in the statistical physics community, with special emphasis on lattice
graphs. Our results apply, for example, to the hexagonal lattice. Our
proof of rapid mixing introduces a new cycle/path decomposition for the
canonical flow argument.

Sampling Binary Contingency Tables,
Ivona Bezakova.
Computing in Science & Engineering, 10(2):26–31, 2008. (Invited publication.)
A presentation can be found here.
The binary contingency tables problem has sparked the interest of statisticians and
computer scientists alike. Although it's been studied from both theoretical and practical
perspectives, a truly usable algorithm remains elusive. The author presents several approaches to the problem.

Accelerating Simulated Annealing Algorithm for the Permanent
and Combinatorial Counting Problems,
Ivona Bezakova, Daniel Stefankovic, Vijay V. Vazirani, and Eric Vigoda.
SIAM Journal on Computing, 37(5):1429–1454, 2008.
Extended abstract
in the Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), ACM Press, 900–907, 2006.
Relevant presentations can be found here.
We present an improved "cooling schedule" for simulated annealing algorithms for
combinatorial counting problems. Under our new schedule the rate of cooling accelerates as the
temperature decreases. Thus, fewer intermediate temperatures are needed as the simulated
annealing algorithm moves from the high temperature (easy region) to the low temperature (difficult
region). We present applications of our technique to colorings and the permanent (perfect matchings
of bipartite graphs). Moreover, for the permanent, we improve the analysis of the Markov chain
underlying the simulated annealing algorithm. This improved analysis, combined with the faster
cooling schedule, results in an O(n^7 log^4 n) time algorithm for approximating the permanent of a
0/1 matrix.
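A rough numeric illustration of why accelerating the cooling rate helps (a hypothetical geometric schedule for counting purposes, not the paper's actual schedule): taking ever-larger steps in the inverse temperature reaches the target with far fewer intermediate temperatures than a uniform schedule.

```python
def uniform_schedule(beta_max, step):
    """Inverse temperatures 0, step, 2*step, ..., beta_max."""
    betas, b = [0.0], 0.0
    while b < beta_max:
        b = min(b + step, beta_max)
        betas.append(b)
    return betas

def accelerating_schedule(beta_max, first_step, ratio=2.0):
    """Step size in inverse temperature grows geometrically."""
    betas, b, step = [0.0], 0.0, first_step
    while b < beta_max:
        b = min(b + step, beta_max)
        betas.append(b)
        step *= ratio               # the cooling rate accelerates
    return betas

print(len(uniform_schedule(10.0, 0.5)))        # 21 temperatures
print(len(accelerating_schedule(10.0, 0.5)))   # 6 temperatures
```

Fewer temperatures means fewer Markov chain simulations, which is where the speedup in the overall annealing algorithm comes from.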

Sampling Binary Contingency Tables with a Greedy Start,
Ivona Bezakova, Nayantara Bhatnagar, and Eric Vigoda.
Random Structures and Algorithms, 30(1–2):168–205, 2007.
Extended abstract
in the Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), ACM Press, 414–423, 2006.
Relevant presentations can be found here.
We study the problem of counting and randomly sampling binary contingency tables. For
given row and column sums, we are interested in approximately counting (or sampling) 0/1 n x m
matrices with the specified row/column sums. We present a simulated annealing algorithm with
running time O((nm)^2 D^3 d_max log^5(n + m)) for any row/column sums where D is the number
of nonzero entries and d_max is the maximum row/column sum. In the worst case, the running
time of the algorithm is O(n^11 log^5 n) for an n x n matrix. This is the first algorithm to directly
solve binary contingency tables for all row/column sums. Previous work reduced the problem
to the permanent, or restricted attention to row/column sums that are close to regular. The
interesting aspect of our simulated annealing algorithm is that it starts at a nontrivial instance,
whose solution relies on the existence of short alternating paths in the graph constructed by a
particular Greedy algorithm.

Analysis of Sequential Importance Sampling for Contingency Tables,
Ivona Bezakova, Alistair Sinclair, Daniel Stefankovic, and Eric Vigoda.
Proceedings of the 14th Annual European Symposium on Algorithms (ESA),
Lecture Notes in Computer Science, Volume 4168/2006, 136–147, Springer, 2006.
Full version available from arXiv. Submitted.
Relevant presentations can be found here.
The sequential importance sampling (SIS) algorithm has gained considerable popularity for its empirical success.
One of its noted applications is to the binary contingency tables problem, an important problem in statistics,
where the goal is to estimate the number of 0/1 matrices with prescribed row and column sums. We give a family
of examples in which the SIS procedure, if run for any subexponential number of trials, will underestimate the
number of tables by an exponential factor. This result holds for any of the usual design choices in the SIS algorithm,
namely the ordering of the columns and rows. These are apparently the first theoretical results on the efficiency of
the SIS algorithm for binary contingency tables. Finally, we present experimental evidence that the SIS algorithm is
efficient for row and column sums that are regular. Our work is a first step in determining rigorously the class of
inputs for which SIS is effective.
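The basic SIS scheme for binary contingency tables can be sketched as follows, with a deliberately naive uniform proposal (my own illustration; practical SIS implementations use smarter proposals and orderings). The matrix is filled row by row, each sample is weighted by the reciprocal of its proposal probability, and dead ends contribute zero, which keeps the estimator unbiased.

```python
import random
from math import comb

def sis_estimate(row_sums, col_sums, trials, rng=random):
    """Estimate the number of 0/1 matrices with the given row/column sums."""
    total = 0.0
    for _ in range(trials):
        remaining = list(col_sums)
        weight = 1.0
        for r in row_sums:
            avail = [j for j, c in enumerate(remaining) if c > 0]
            if len(avail) < r:
                weight = 0.0                  # dead end: no valid completion
                break
            weight *= comb(len(avail), r)     # 1 / proposal probability
            for j in rng.sample(avail, r):
                remaining[j] -= 1
        if weight and any(remaining):
            weight = 0.0                      # column sums not met
        total += weight
    return total / trials

# 2 x 2 tables with all row/column sums 1: exactly 2 such matrices.
print(sis_estimate([1, 1], [1, 1], 1000))     # 2.0 (every sample weighs 2 here)
```

On the degenerate instances above every sample carries the same weight, so the estimate is exact; the paper's negative result concerns instances where the weights vary wildly across samples.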

Graph Model Selection using Maximum Likelihood,
Ivona Bezakova, Adam Kalai, and Rahul Santhanam.
Proceedings of the 23rd International Conference on Machine Learning (ICML), ACM International Conference Proceeding Series 148, 105–112, 2006.
A more networking-oriented version is here. Preprint.
A presentation and a poster can be found here.
In recent years, there has been a proliferation
of theoretical graph models, e.g., preferential
attachment and small-world models, motivated by real-world graphs such as
the Internet topology. To address the natural question
of which model is best for a particular data
set, we propose a model selection criterion for
graph models. Since each model is in fact a
probability distribution over graphs, we suggest using Maximum Likelihood to compare
graph models and select their parameters.
Interestingly, for the case of graph models,
computing likelihoods is a difficult algorithmic task. However, we design and implement
MCMC algorithms for computing the maximum likelihood for four popular models: a
power-law random graph model, a preferential attachment model, a small-world model,
and a uniform random graph model. We
hope that this novel use of ML will objectify
comparisons between graph models.
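For the simplest of the four models, the uniform random graph G(n, p), the maximum-likelihood computation is closed-form; this sketch is my own illustration of the ML idea, not the MCMC machinery needed for the other models. An observed graph with m edges out of C(n, 2) possible pairs has likelihood p^m (1-p)^(C(n,2)-m), maximized at p = m / C(n, 2).

```python
from math import comb, log

def gnp_log_likelihood(n, m, p):
    """Log-likelihood of an n-vertex, m-edge graph under G(n, p)."""
    pairs = comb(n, 2)
    return m * log(p) + (pairs - m) * log(1 - p)

def gnp_mle(n, m):
    return m / comb(n, 2)          # closed-form maximum-likelihood estimate

n, m = 5, 4                        # hypothetical graph: 5 vertices, 4 edges
p_hat = gnp_mle(n, m)
print(p_hat)                       # 0.4
print(gnp_log_likelihood(n, m, p_hat) >= gnp_log_likelihood(n, m, 0.5))  # True
```

Model selection then compares, across models, the maximized likelihood each assigns to the same observed graph.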

Faster Markov Chain Monte Carlo Algorithms for the
Permanent and Binary Contingency Tables,
Ph.D. Thesis at the University of Chicago, 2006.
Random sampling and combinatorial counting are important building
blocks in many practical applications. However, for some problems
exact counting in deterministic polynomial time is provably
impossible (unless P=NP), in which case the best hope is to devise
suitable approximation algorithms. Markov chain Monte Carlo (MCMC)
methods give efficient approximation algorithms for several
important problems.
This thesis presents the fastest known (approximation) algorithms
for two such problems, the permanent and binary contingency
tables. The permanent counts the number of (weighted) perfect
matchings of a given bipartite graph, and it has applications in
statistical physics, statistics, and several areas of computer
science. Binary contingency tables count the number of bipartite
graphs with prescribed degree sequences, and they are used in
statistical tests in a number of fields including ecology,
education and psychology.
Our MCMC algorithms use the simulated annealing approach, a
popular heuristic in combinatorial optimization. It utilizes
intuition from the physical annealing process, where a substance
is heated and slowly cooled so that it moves from a disordered
configuration into a crystalline (i.e., ordered) state. In
optimization, simulated annealing starts at an easy instance at
high temperature, and slowly "cools" into the desired hard
instance at low temperature. Simulated annealing is widely used,
with much empirical success but little theoretical backing.
This work presents a new cooling schedule for simulated annealing,
with provable guarantees for several counting problems. We also
improve results on the rate of convergence for the Markov chains
which are at the core of the simulated annealing algorithm. As a
consequence, we obtain a new algorithm, which for a graph with $n$
vertices, approximates the permanent of a 0/1 matrix (within an
arbitrarily close approximation) in time O*(n^7). We also
present a new simulated annealing algorithm for binary contingency
tables, which relies on an interesting combinatorial lemma. In
fact, we obtain a more general result: we give a polynomial-time
algorithm for counting bipartite graphs with prescribed degree
sequence and an arbitrary set of forbidden edges.
Finally, we discuss the importance sampling method, which is a
popular method for binary contingency tables. For any input
instance, this technique converges asymptotically to the correct
answer and in many cases it appears to converge fast, making it
especially suited for practical purposes. In contrast, we
present a negative result: we describe a very simple instance for
which, in any subexponential number of steps, the sequential
importance sampling estimate differs from the correct answer by an
exponential factor (with high probability).

Allocating Indivisible Goods,
Ivona Bezakova and Varsha Dani.
Preprint.
Survey of results invited for publication in ACM SIGecom Exchanges, Vol. 5.3, 2005.
The Max-Min Fairness problem is as follows: Given m indivisible
goods and k players, each with a specified valuation function on the
subsets of the goods,
how should the goods be split between the players so as to maximize
the minimum valuation? Viewing the problem from a game-theoretic
perspective, we show that for two
players and additive valuations the expected minimum of the
(randomized) cut-and-choose mechanism is a 1/2-approximation of the
optimum.
To complement this result we show that no truthful mechanism can
compute the exact optimum. We also consider the algorithmic
perspective when the (true)
additive valuation functions are part of the input. We present a simple
1/(m-k+1) approximation algorithm which allocates to every player at
least a 1/k fraction of the value of all but the k-1 heaviest items
and we give an algorithm with additive error against the fractional
optimum bounded by the value of the largest item. The two
approximation algorithms are incomparable in the sense that there
exist instances when one outperforms the other.
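One run of cut-and-choose for two players with additive valuations can be sketched by brute force (my own illustration with hypothetical valuations; the cutter is fixed here, whereas the mechanism analyzed above randomizes who cuts):

```python
from itertools import combinations

def cut_and_choose(cutter_vals, chooser_vals):
    """Cutter partitions to maximize her minimum bundle value;
    chooser takes her preferred bundle; cutter keeps the other."""
    goods = range(len(cutter_vals))

    def value(vals, bundle):
        return sum(vals[g] for g in bundle)

    best = max(
        ((set(b), set(goods) - set(b))
         for r in range(len(cutter_vals) + 1)
         for b in combinations(goods, r)),
        key=lambda p: min(value(cutter_vals, p[0]), value(cutter_vals, p[1])),
    )
    chosen = max(best, key=lambda b: value(chooser_vals, b))
    kept = best[1] if chosen is best[0] else best[0]
    return value(cutter_vals, kept), value(chooser_vals, chosen)

# Cutter splits {4,3,2,1} into 4+1 vs 3+2 (min 5); chooser grabs the
# bundle containing her prized item 7.
print(cut_and_choose([4, 3, 2, 1], [1, 1, 1, 7]))   # (5, 8)
```

Note that the cutter is guaranteed at least half of her total value here (5 out of 10), matching the 1/2-approximation flavor of the result above.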

The
Poincare Constant
of a Random Walk in High-Dimensional Convex Bodies,
Master Thesis at the
University of Chicago, 2002.
Estimating the volume of a convex body is an important algorithmic
problem. High dimensions pose an especially difficult task for the
algorithm designer. To be considered efficient, the running time must
be polynomial in the dimension. The first provably efficient
approximation algorithm was presented by Dyer, Frieze, and Kannan
in 1989, and steadily improved by various authors over the next
decade. Fundamental to these works is the analysis of random walks for
generating random points from a convex body.
Kannan, Lovasz, and Simonovits recently analyzed a natural random
walk known as the ball walk. By bounding the so-called conductance,
they obtain (via a Cheeger-type inequality) a bound on the Poincare
constant of the random walk. Their bound proves
the walk converges to a random point in time O*(n^3 D^2), where n
is the dimension and D is the diameter of the body. We survey and
present a selfcontained view of their work. In addition, following an
outline proposed by Jerrum, we slightly
modify the KLS analysis to bound the Poincare constant without using
a Cheeger-type inequality.
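A toy simulation of the ball walk for the simplest convex body, the unit ball (illustration only; the thesis concerns the analysis, and practical volume algorithms must tune the step radius delta carefully):

```python
import random

def random_in_ball(n, radius, rng=random):
    """Uniform point in an n-dimensional ball via rejection from the cube."""
    while True:
        p = [rng.uniform(-radius, radius) for _ in range(n)]
        if sum(x * x for x in p) <= radius * radius:
            return p

def ball_walk(n, steps, delta=0.1, rng=random):
    """Ball walk inside the unit ball: propose a step uniform in a ball of
    radius delta and reject proposals that leave the body."""
    x = [0.0] * n                   # start at the center
    for _ in range(steps):
        step = random_in_ball(n, delta, rng)
        y = [a + b for a, b in zip(x, step)]
        if sum(v * v for v in y) <= 1.0:
            x = y                   # accept only if still inside the body
    return x

x = ball_walk(n=3, steps=1000)
print(sum(v * v for v in x) <= 1.0)   # True: the walk never leaves the body
```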

Compact
Representations of Graphs and Adjacency Testing,
"Magister" Thesis at the Comenius University, Bratislava, Slovakia, 2000.
A section of the thesis appeared at the Student Science Conference, Bratislava, Slovakia, 2000.
A succinct representation of a graph is a widely studied problem. We
examine this representation in two aspects: (1) its space complexity,
and (2) the time complexity of the basic "graph operations": the
adjacency test and update operations (adding or removing an edge).
The adjacency matrix is optimal with respect to the adjacency test but
for n vertices its space complexity is always O(n^2) bits,
regardless of the number of edges m. The adjacency list
representation is spaceoptimal but the adjacency test takes more
time than necessary. The aim of this thesis was to find a
representation as close as possible to the optimal space complexity of
O(m log n) bits, and to the optimal time complexity of O(log n)
bit-operations for the adjacency test (and the update operations).
We present a representation using the technique of hashing: it takes
O(m log n) bits and the adjacency test can be performed in
O(log n log log n) bit-operations. The
"list-of-all-neighbors" procedure takes O((k + log log n) log n)
bit-operations, where k is the number of neighbors to be listed, and
the expected amortized time complexity of the update operations is
O(log n log log n).
We also suggest a new representation based on computing the distance
from all vertices to some given vertex. We show that this
representation is optimal (in both time and space complexities) for
the family of planar graphs.
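A high-level modern analogue of the hashing-based representation (Python hash sets rather than the thesis's bit-operation model, so the bounds here are expected O(1) dictionary operations rather than the O(log n log log n) bit-operation bounds stated above):

```python
class HashedGraph:
    """Adjacency sets: O(m) space, expected O(1) adjacency test and update,
    versus the O(n^2)-bit adjacency matrix."""

    def __init__(self, n):
        self.adj = [set() for _ in range(n)]   # one hash set per vertex

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)

    def adjacent(self, u, v):
        return v in self.adj[u]                # expected O(1) hash lookup

    def neighbors(self, u):
        return list(self.adj[u])               # "list-of-all-neighbors"

g = HashedGraph(4)
g.add_edge(0, 1)
g.add_edge(1, 2)
print(g.adjacent(0, 1), g.adjacent(0, 2))   # True False
g.remove_edge(0, 1)
print(g.adjacent(0, 1))                     # False
```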