# Network Topolgy and Iteration Complexity

## Weight/Mixing Matrix

In distributed optimization, existing works always employ a **weight matrix/mixing matrix** \(W\) to aggregate the information from neighbors. In this blog, I am going to first go through these assumptions and explicitly show their relations, for any undirected graph \(G=(\mathcal{V},\mathcal{E})\). Suppose there are \(n\) nodes, \(\mathcal{V} = \{1,2,\ldots,n\}\), and every node has a self-loop.

Assumption 1: The mixing matrix \(W = [w_{ij}]\in [0,1]^{n\times n}\) is defined according to the network topology: \(w_{ij}>0\) if \(\{i,j\}\in\mathcal{E}\); \(w_{ij}=0\), otherwise. W is doubly stochastic (\(W\mathbf{1} = \mathbf{1}, \mathbf{1}^T W = \mathbf{1}^T\)), and the spectral radius \(\rho(W-\frac{\mathbf{11}^T}{n})=\sigma<1\).

*Note: There are many variations of the above assumption. For example, one way is to replace the doubly stochasticity with symmetry and stochasticity for undirected graph only!.*

#### Why do we need such an assumption?

To make all the agents achieve consensus! From [1], Assumption 1 holds if and only if

\[\lim_{k\to \infty} W^k = \frac{\mathbf{11}^T}{n}.\]In order words, following the update \(x^{k+1} = Wx^k\), we can derive the consensus

\[x^{\infty} = \frac{\mathbf{11}^T}{n} x^0.\]

### Useful Facts of Stochastic Matrix

Fact 1: For any stochastic matrix \(A\), its absolute eigenvalues lie in \([0,1]\).

Fact 2: For any stochastic matrix \(A\), \(A\mathbf{1} = \mathbf{1}\).

*Proof*: See [2].

### Eigenvalues of Weight Matrix

It’s easy to show that the number of connected components = \(\text{rank}(W-I)\). If \(G\) is connected, there is only one eigenvalue \(\lambda_1 = 1\). Let the eigenvalues of \(W\) be \(1=\lambda_1>|\lambda_2|\ge |\lambda_3|\ge \cdots\ge|\lambda_n|\). The the spectral gap of \(W = 1-|\lambda_2|:=\theta\).

### What’s the relation between the spectral gap of \(W\) and \(\sigma\).

If \(W\) is symmetric, then \(\rho(W-\frac{\mathbf{11}^T}{n})= \Vert W-\frac{\mathbf{11}^T}{n}\Vert_2 = \lambda_{\max}(W-\frac{\mathbf{11}^T}{n}) = |\lambda_2|\).

*Proof:* Denote \(W = Q\Lambda Q^T\) as the eigendecomposition of \(W\). Then,

The \(\lambda_{\max}(Z) = \vert\lambda_{2}\vert\), which completes the proof.

Therefore, the spectral gap of \(W = 1-\sigma\).

## Indications of \(\sigma\)

Apparently, \(\sigma\) indicates the connectivity of the graph \(G\).

- If \(\sigma = 0\), the graph is fully connected.
- Smaller \(\sigma\) implies better connectivity.
- Using lazy Metropolis weights, the relationship between \(n\) and \(\frac{1}{1-\lambda}\) could be found in the Proposition 5, [3].

## References

[1] Xiao, L., & Boyd, S. (2004). Fast linear iterations for distributed averaging. Systems & Control Letters, 53(1), 65-78.

[2] Javad Mashreghi (2017). Slides: https://www.birs.ca/workshops/2017/17w2668/files/Eigenvalues-Doubly-Stochastic-Matrices-javad%20-%202017%20-%20Banff.pdf.

[3] A. Nedić, A. Olshevsky and M. G. Rabbat, “Network Topology and Communication-Computation Tradeoffs in Decentralized Optimization,” in Proceedings of the IEEE, vol. 106, no. 5, pp. 953-976, May 2018, doi: 10.1109/JPROC.2018.2817461.

Feedback? email, anonymously or comment below: