Skip to content

Proof of Tighter Upper Bound on Non Zero Elements for CC

Evan West edited this page Jun 20, 2022 · 2 revisions

Proof Sketch

The basics

Recall that we build our sketches from vectors of every edge in the graph. We will use the following example graph for this proof sketch.

    1
 /  |  \
2 - 3 - 4
      \ |
        5

For this graph, the vector for node 1 is as follows:

(1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5)
  1     1     1     0     0     0     0     0     0     0

These sketches can be combined to sample edges from an arbitrary cut in the graph. For example, the vector for nodes 2, 3 and 5 is:

(1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5)
  1     1     0     0     0     0     0     1     0     1

Note that we have edges from multiple origins in this vector (thus increasing the number of possible non-zeros in the vector) and there are no edges within the set of nodes defining the cut (thus reducing the number of non-zeros).

Proof

Imagine a complete graph. This will maximize the number of non-zero entries in the characteristic vector. The vector of a single node will contain n-1 non-zero entries (one for each other node in the graph).

Adding another node to this characteristic vector will change the number of non-zero entries to

= total_edges - 2*edges_within_set
= 2(n-1) - 2

Before combining, each vector had (n-1) non-zero entries within it. After combining, one non-zero from each vector is removed resulting in 2 canceled elements. Thus in total there are 2n - 4 non-zero entries. From here it is simple to arrive at the general equation for a vector with a set of a nodes within it.

= total_edges - 2*edges_within_set
= a(n-1) - 2a(a-1)

Maximize this equation as follows:

= a(n-1) - 2a(a-1) = a(n-1) - 2a^2 + 2a = a(n+1) - 2a^2
d_a(a(n+1) - 2a^2) = n+1 - 4a
a = (n + 1) / 4
Therefore, max = (n^2 + n) / 4 - (1/2)(n+1)((n + 1) / 4 - 1) < n^2 / 4

Done 😄