Prüfer code¶
In this article we will look at the so-called Prüfer code (or Prüfer sequence), which is a way of encoding a labeled tree into a sequence of numbers in a unique way.
With the help of the Prüfer code we will prove Cayley's formula (which specifies the number of spanning trees in a complete graph). We also show the solution to the problem of counting the number of ways of adding edges to a graph to make it connected.
Note, we will not consider trees consisting of a single vertex - this is a special case in which multiple statements clash.
Prüfer code¶
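The Prüfer code is a way of encoding a labeled tree with $n$ vertices using a sequence of $n - 2$ integers in the interval $[0; n-1]$. This encoding also acts as a bijection between all spanning trees of a complete graph and these numerical sequences.
Although storing and operating on trees via the Prüfer code is impractical because of the nature of the representation, Prüfer codes are used frequently, mostly in solving combinatorial problems.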
The inventor - Heinz Prüfer - proposed this code in 1918 as a proof for Cayley's formula.
Building the Prüfer code for a given tree¶
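The Prüfer code is constructed as follows.
We will repeat the following procedure $n - 2$ times: we select the leaf of the tree with the smallest number, remove it from the tree, and write down the number of the vertex that was connected to it. After $n - 2$ iterations only $2$ vertices remain, and the algorithm ends.
Thus the Prüfer code for a given tree is a sequence of $n - 2$ numbers, where each number is the label of a vertex, i.e. it lies in the interval $[0; n-1]$.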
The algorithm for computing the Prüfer code can be implemented easily with $O(n \log n)$ time complexity, simply by using a data structure that extracts the minimum (for instance set or priority_queue in C++), which contains all the current leaves.
#include <set>
#include <vector>
using namespace std;

vector<vector<int>> adj;  // adjacency lists of the tree, vertices 0..n-1

vector<int> pruefer_code() {
    int n = adj.size();
    set<int> leaves;  // current leaves, ordered by label
    vector<int> degree(n);
    vector<bool> killed(n, false);
    for (int i = 0; i < n; i++) {
        degree[i] = adj[i].size();
        if (degree[i] == 1)
            leaves.insert(i);
    }

    vector<int> code(n - 2);
    for (int i = 0; i < n - 2; i++) {
        // remove the leaf with the smallest label
        int leaf = *leaves.begin();
        leaves.erase(leaves.begin());
        killed[leaf] = true;

        // find its only neighbour that is still part of the tree
        int v = -1;
        for (int u : adj[leaf]) {
            if (!killed[u])
                v = u;
        }

        code[i] = v;
        if (--degree[v] == 1)
            leaves.insert(v);
    }
    return code;
}
However the construction can also be implemented in linear time. Such an approach is described in the next section.
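As a small illustration of the procedure above, here is a minimal sketch of the same $O(n \log n)$ idea using priority_queue instead of set. The function name pruefer_code_from_edges and the edge-list input are assumptions made for this example, not part of the original code.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper (name and signature are assumptions): builds the
// Prüfer code of a labeled tree given as an edge list over vertices 0..n-1.
std::vector<int> pruefer_code_from_edges(
        int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 2 && (int)edges.size() == n - 1);
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }

    std::vector<int> degree(n);
    std::vector<bool> removed(n, false);
    // min-heap holding the labels of all current leaves
    std::priority_queue<int, std::vector<int>, std::greater<int>> leaves;
    for (int i = 0; i < n; i++) {
        degree[i] = adj[i].size();
        if (degree[i] == 1)
            leaves.push(i);
    }

    std::vector<int> code;
    for (int i = 0; i < n - 2; i++) {
        int leaf = leaves.top();  // smallest current leaf
        leaves.pop();
        removed[leaf] = true;
        for (int u : adj[leaf]) {
            if (!removed[u]) {    // exactly one neighbour is still present
                code.push_back(u);
                if (--degree[u] == 1)
                    leaves.push(u);
            }
        }
    }
    return code;
}
```

For the tree with edges 0-3, 1-3, 3-4, 2-4, 4-5 the leaves 0, 1, 2, 3 are removed in this order, which gives the code 3, 3, 4, 4 and leaves the vertices 4 and 5.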
Building the Prüfer code for a given tree in linear time¶
The essence of the algorithm is to use a moving pointer, which will always point to the current leaf vertex that we want to remove.
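At first glance this seems impossible, because during the construction of the Prüfer code the number of leaves could seemingly both increase and decrease.
However, on closer inspection this is not true.
The number of leaves never increases. Either it decreases by one (we remove one leaf vertex and don't gain a new one), or it stays the same (we remove one leaf vertex and gain another one).
In the first case there is no other way than searching for the next smallest leaf vertex.
In the second case, however, we can decide in $O(1)$ time whether we can continue with the vertex that just became a leaf, or whether we have to search for the next smallest leaf vertex.
To do this we will use a variable $\text{ptr}$, which guarantees that among the vertices between $0$ and $\text{ptr}$ there is at most one leaf vertex, namely the current one. All other vertices in that range are either already removed from the tree or still have more than one adjacent vertex.
This variable is already very helpful in the first case.
After removing the current leaf node, we know that there cannot be a leaf node between $0$ and $\text{ptr}$, therefore we can start the search for the next one directly at $\text{ptr} + 1$ instead of going back to vertex $0$.
In the second case, if the newly gained leaf is smaller than $\text{ptr}$, it must be the next leaf to remove, since no other leaf is smaller than $\text{ptr}$; otherwise it is larger than $\text{ptr}$ and the search again resumes at $\text{ptr} + 1$.
Even though we might have to perform multiple linear searches for the next leaf vertex, the pointer $\text{ptr}$ only increases, and therefore the total time complexity is $O(n)$.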
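#include <vector>
using namespace std;

vector<vector<int>> adj;  // adjacency lists of the tree, vertices 0..n-1
vector<int> parent;       // parent[v] in the tree rooted at n-1

void dfs(int v) {
    for (int u : adj[v]) {
        if (u != parent[v]) {
            parent[u] = v;
            dfs(u);
        }
    }
}

vector<int> pruefer_code() {
    int n = adj.size();
    parent.resize(n);
    parent[n-1] = -1;
    dfs(n-1);  // root the tree at the vertex that is never removed

    // ptr starts at the smallest leaf
    int ptr = -1;
    vector<int> degree(n);
    for (int i = 0; i < n; i++) {
        degree[i] = adj[i].size();
        if (degree[i] == 1 && ptr == -1)
            ptr = i;
    }

    vector<int> code(n - 2);
    int leaf = ptr;
    for (int i = 0; i < n - 2; i++) {
        int next = parent[leaf];
        code[i] = next;
        if (--degree[next] == 1 && next < ptr) {
            // the new leaf is smaller than every other leaf
            leaf = next;
        } else {
            // otherwise continue the linear search after ptr
            ptr++;
            while (degree[ptr] != 1)
                ptr++;
            leaf = ptr;
        }
    }
    return code;
}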
In the code we first find for each vertex i its ancestor parent[i], i.e. the neighbour that this vertex will still have at the moment we remove it from the tree.
We can find this ancestor by rooting the tree at the vertex $n-1$, which is possible because the vertex $n-1$ is never removed from the tree.
We also compute the degree of each vertex.
ptr is the pointer that bounds the remaining leaf vertices from below: every leaf vertex except the current one, leaf, has a label larger than ptr.
The next current leaf is either next, if that vertex has just become a leaf and is smaller than ptr, or the smallest leaf vertex found by a linear search that increases the pointer.
It can easily be seen that this code has the complexity $O(n)$.
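The recursive dfs can become a problem on very deep trees. As a hedged alternative that is not part of the original code, the following sketch keeps the same moving-pointer idea but replaces parent[] with an XOR trick: xr[v] stores the XOR of the labels of all neighbours of v that are still in the tree, so for a leaf it equals its single neighbour. The function name pruefer_code_linear and the edge-list input are assumptions made for this example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical variant (name and signature are assumptions): linear-time
// Prüfer code from an edge list, without recursion.
std::vector<int> pruefer_code_linear(
        int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 2 && (int)edges.size() == n - 1);
    std::vector<int> degree(n, 0), xr(n, 0);
    for (const auto& e : edges) {
        degree[e.first]++;
        degree[e.second]++;
        xr[e.first] ^= e.second;   // XOR of all neighbour labels
        xr[e.second] ^= e.first;
    }

    int ptr = 0;
    while (degree[ptr] != 1)       // smallest leaf
        ptr++;
    int leaf = ptr;

    std::vector<int> code;
    code.reserve(n - 2);
    for (int i = 0; i < n - 2; i++) {
        int next = xr[leaf];       // the only remaining neighbour of the leaf
        code.push_back(next);
        degree[leaf] = 0;          // mark the leaf as removed
        xr[next] ^= leaf;          // drop the leaf from next's neighbour set
        if (--degree[next] == 1 && next < ptr) {
            leaf = next;
        } else {
            ptr++;
            while (degree[ptr] != 1)
                ptr++;
            leaf = ptr;
        }
    }
    return code;
}
```

The pointer logic is unchanged, so the running time stays $O(n)$; only the way of finding the neighbour of the current leaf differs.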
Some properties of the Prüfer code¶
- After constructing the Prüfer code two vertices will remain.
One of them is the highest vertex $n-1$, but nothing else can be said about the other one.
- Each vertex appears in the Prüfer code a fixed number of times: its degree minus one.
This can be easily checked, since the degree of a vertex gets smaller every time we record its label in the code, and we remove the vertex once its degree is $1$. The same holds for the two remaining vertices.
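The second property can be turned around: the degree of every vertex is its number of occurrences in the code plus one. A minimal sketch, assuming a hypothetical helper named degrees_from_code:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (name is an assumption): recovers all vertex degrees
// from a Prüfer code via degree(v) = occurrences of v in the code + 1.
std::vector<int> degrees_from_code(const std::vector<int>& code) {
    int n = code.size() + 2;
    std::vector<int> degree(n, 1);
    for (int v : code) {
        assert(0 <= v && v < n);  // labels must lie in [0, n-1]
        degree[v]++;
    }
    return degree;
}
```

For the code 3, 3, 4, 4 this yields the degrees 1, 1, 1, 3, 3, 1, so the vertices 0, 1, 2 and 5 are the leaves of the tree.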
Restoring the tree using the Prüfer code¶
To restore the tree it suffices to focus on the property discussed in the last section: each vertex appears in the code one time fewer than its degree. We therefore already know the degree of all the vertices in the desired tree. From this we can find all leaf vertices, and also the leaf that was removed in the first step (it has to be the smallest leaf). This leaf vertex was connected to the vertex corresponding to the number in the first cell of the Prüfer code.
Thus we found the first edge that was removed when the Prüfer code was generated. We can add this edge to the answer and reduce the degrees at both ends of the edge.
We will repeat this operation until we have used all numbers of the Prüfer code:
we look for the minimum vertex with degree equal to $1$, connect it with the next vertex from the Prüfer code, and reduce the degrees.
In the end we only have two vertices left with degree equal to $1$. These are the vertices that were not removed while the Prüfer code was generated. We connect them to get the last edge of the tree. One of them is always the vertex $n-1$.
This algorithm can be implemented easily in $O(n \log n)$: we use a data structure that supports extracting the minimum (for example set<> or priority_queue<> in C++) to store all the leaf vertices.
The following implementation returns the list of edges corresponding to the tree.
#include <set>
#include <utility>
#include <vector>
using namespace std;

vector<pair<int, int>> pruefer_decode(vector<int> const& code) {
    int n = code.size() + 2;
    // degree = number of occurrences in the code + 1
    vector<int> degree(n, 1);
    for (int i : code)
        degree[i]++;

    set<int> leaves;
    for (int i = 0; i < n; i++) {
        if (degree[i] == 1)
            leaves.insert(i);
    }

    vector<pair<int, int>> edges;
    for (int v : code) {
        // the smallest leaf was attached to v
        int leaf = *leaves.begin();
        leaves.erase(leaves.begin());

        edges.emplace_back(leaf, v);
        if (--degree[v] == 1)
            leaves.insert(v);
    }
    // the two remaining vertices form the last edge
    edges.emplace_back(*leaves.begin(), n-1);
    return edges;
}
Restoring the tree using the Prüfer code in linear time¶
To obtain the tree in linear time we can apply the same technique used to obtain the Prüfer code in linear time.
We don't need a data structure to extract the minimum. Instead we can notice that, after processing the current edge, at most one vertex becomes a new leaf. Therefore we can either continue with this vertex (if it is smaller than the pointer), or we find the next leaf with a linear search by moving the pointer.
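#include <utility>
#include <vector>
using namespace std;

vector<pair<int, int>> pruefer_decode(vector<int> const& code) {
    int n = code.size() + 2;
    // degree = number of occurrences in the code + 1
    vector<int> degree(n, 1);
    for (int i : code)
        degree[i]++;

    // ptr starts at the smallest leaf
    int ptr = 0;
    while (degree[ptr] != 1)
        ptr++;
    int leaf = ptr;

    vector<pair<int, int>> edges;
    for (int v : code) {
        edges.emplace_back(leaf, v);
        if (--degree[v] == 1 && v < ptr) {
            // v became a leaf and is smaller than every other leaf
            leaf = v;
        } else {
            // otherwise continue the linear search after ptr
            ptr++;
            while (degree[ptr] != 1)
                ptr++;
            leaf = ptr;
        }
    }
    // the two remaining vertices form the last edge
    edges.emplace_back(leaf, n-1);
    return edges;
}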
Bijection between trees and Prüfer codes¶
For each tree there exists a Prüfer code corresponding to it. And for each Prüfer code we can restore the original tree.
It follows that also every Prüfer code (i.e. a sequence of $n-2$ numbers in the range $[0; n-1]$) corresponds to a tree.
Therefore the mapping between labeled trees and Prüfer codes is a bijection (a one-to-one correspondence).
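For small $n$ the bijection can be checked by brute force. The following sketch enumerates every sequence of $n-2$ labels, decodes it with the set-based algorithm, and counts the distinct edge sets that appear. The names decode_to_edge_set and count_distinct_trees are assumptions made for this example, and the approach is only feasible for very small $n$.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Hypothetical helper: decodes a Prüfer code into a set of edges,
// each edge normalised to (smaller label, larger label).
std::set<Edge> decode_to_edge_set(const std::vector<int>& code) {
    int n = code.size() + 2;
    std::vector<int> degree(n, 1);
    for (int v : code)
        degree[v]++;
    std::set<int> leaves;
    for (int i = 0; i < n; i++)
        if (degree[i] == 1)
            leaves.insert(i);

    std::set<Edge> tree;
    for (int v : code) {
        int leaf = *leaves.begin();
        leaves.erase(leaves.begin());
        tree.insert({std::min(leaf, v), std::max(leaf, v)});
        if (--degree[v] == 1)
            leaves.insert(v);
    }
    tree.insert({*leaves.begin(), n - 1});
    return tree;
}

// Hypothetical helper: decodes all n^(n-2) sequences and returns the
// number of distinct trees obtained.
long long count_distinct_trees(int n) {
    assert(n >= 2);
    std::vector<int> code(n - 2, 0);
    std::set<std::set<Edge>> seen;
    while (true) {
        seen.insert(decode_to_edge_set(code));
        // advance the sequence like an odometer in base n
        int pos = 0;
        while (pos < (int)code.size() && ++code[pos] == n) {
            code[pos] = 0;
            pos++;
        }
        if (pos == (int)code.size())
            break;  // every sequence has been visited
    }
    return seen.size();
}
```

If two different codes decoded to the same tree, the count would be smaller than the number of sequences; for $n = 5$ all $125$ sequences give different trees.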
Cayley's formula¶
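Cayley's formula states that the number of spanning trees in a complete labeled graph with $n$ vertices is equal to:
$$n^{n-2}$$
There are multiple proofs for this formula. Using the Prüfer code concept this statement comes as no surprise.
In fact any Prüfer code with $n-2$ numbers from the interval $[0; n-1]$ corresponds to some tree with $n$ vertices, and there are $n^{n-2}$ different such Prüfer codes. Because each such tree is a spanning tree of a complete graph with $n$ vertices, the number of such spanning trees is also $n^{n-2}$.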
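Since $n^{n-2}$ grows very quickly, the count is usually needed modulo some number. A minimal sketch using binary exponentiation; the function name cayley_mod and the restriction to moduli below $2^{31}$ (so that products fit into 64 bits) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (name is an assumption): number of labeled trees on
// n vertices, n^(n-2), modulo mod. Assumes n >= 2 and 1 <= mod < 2^31.
std::uint64_t cayley_mod(std::uint64_t n, std::uint64_t mod) {
    assert(n >= 2 && mod >= 1);
    std::uint64_t result = 1 % mod;
    std::uint64_t base = n % mod;
    std::uint64_t exp = n - 2;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % mod;  // multiply in the current bit
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}
```

For example, cayley_mod(5, 1000000007) gives $5^3 = 125$.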
Number of ways to make a graph connected¶
The concept of Prüfer codes is even more powerful. It allows us to derive much more general formulas than Cayley's formula.
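In this problem we are given a graph with $n$ vertices and $m$ edges. The graph currently has $k$ components. We want to compute the number of ways of adding $k-1$ edges so that the graph becomes connected (obviously $k-1$ is the minimum number of edges necessary to make the graph connected).
Let us derive a formula for solving this problem.
We use $s_1, \dots, s_k$ for the sizes of the connected components in the graph. We cannot add edges within a connected component. Therefore this problem is very similar to counting the spanning trees of a complete graph with $k$ vertices. The only difference is that each vertex actually has the size $s_i$: each edge attached to the vertex $i$ multiplies the answer by $s_i$.
Thus in order to calculate the number of possible ways it is important to count how often each of the $k$ vertices is used in the connecting tree. To obtain a formula for the problem we then sum the answer over all possible degrees.
Let $d_1, \dots, d_k$ be the degrees of the vertices in the tree after connecting the vertices. The sum of the degrees is twice the number of edges:
$$\sum_{i=1}^k d_i = 2k - 2$$
If the vertex $i$ has degree $d_i$, then it appears $d_i - 1$ times in the Prüfer code. The Prüfer code for a tree with $k$ vertices has length $k-2$. So the number of ways to choose a code with $k-2$ numbers where the number $i$ appears exactly $d_i - 1$ times is equal to the multinomial coefficient
$$\binom{k-2}{d_1-1, d_2-1, \dots, d_k-1} = \frac{(k-2)!}{(d_1-1)! (d_2-1)! \cdots (d_k-1)!}.$$
The fact that each edge adjacent to the vertex $i$ multiplies the answer by $s_i$ gives the answer under the assumption that the degrees of the vertices are $d_1, \dots, d_k$:
$$s_1^{d_1} \cdot s_2^{d_2} \cdots s_k^{d_k} \cdot \binom{k-2}{d_1-1, d_2-1, \dots, d_k-1}$$
To get the final answer we need to sum this for all possible ways to choose the degrees:
$$\sum_{\substack{d_i \ge 1 \\ \sum_{i=1}^k d_i = 2k-2}} s_1^{d_1} \cdot s_2^{d_2} \cdots s_k^{d_k} \cdot \binom{k-2}{d_1-1, d_2-1, \dots, d_k-1}$$
Currently this looks like a really horrible answer, however we can use the multinomial theorem, which says:
$$(x_1 + \dots + x_r)^p = \sum_{\substack{c_i \ge 0 \\ \sum_{i=1}^r c_i = p}} x_1^{c_1} \cdot x_2^{c_2} \cdots x_r^{c_r} \cdot \binom{p}{c_1, c_2, \dots, c_r}$$
This already looks pretty similar.
To use it we only need to substitute with $e_i = d_i - 1$:
$$\sum_{\substack{e_i \ge 0 \\ \sum_{i=1}^k e_i = k-2}} s_1^{e_1+1} \cdot s_2^{e_2+1} \cdots s_k^{e_k+1} \cdot \binom{k-2}{e_1, e_2, \dots, e_k}$$
After applying the multinomial theorem we get the answer to the problem:
$$s_1 \cdot s_2 \cdots s_k \cdot (s_1 + s_2 + \dots + s_k)^{k-2} = s_1 \cdot s_2 \cdots s_k \cdot n^{k-2}$$
By accident this formula also holds for $k = 1$.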
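To make the result concrete, here is a minimal sketch that evaluates $s_1 \cdot s_2 \cdots s_k \cdot n^{k-2}$ modulo a given number. The component sizes are found with a disjoint set union; the function name count_ways_to_connect, the edge-list input and the restriction to moduli below $2^{31}$ are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (name and signature are assumptions): number of ways
// to add k-1 edges that make the graph connected, modulo mod.
// Assumes vertices 0..n-1 and 1 <= mod < 2^31.
std::uint64_t count_ways_to_connect(
        int n, const std::vector<std::pair<int, int>>& edges,
        std::uint64_t mod) {
    assert(n >= 1 && mod >= 1);
    std::vector<int> root(n), comp_size(n, 1);
    std::iota(root.begin(), root.end(), 0);
    // find with path halving
    auto find = [&](int v) -> int {
        while (root[v] != v) {
            root[v] = root[root[v]];
            v = root[v];
        }
        return v;
    };
    for (const auto& e : edges) {
        int a = find(e.first), b = find(e.second);
        if (a == b)
            continue;
        if (comp_size[a] < comp_size[b])
            std::swap(a, b);
        root[b] = a;                     // union by size
        comp_size[a] += comp_size[b];
    }

    int k = 0;                           // number of components
    std::uint64_t result = 1 % mod;      // product s_1 * ... * s_k
    for (int v = 0; v < n; v++) {
        if (find(v) == v) {
            k++;
            result = result * (comp_size[v] % mod) % mod;
        }
    }
    if (k == 1)
        return 1 % mod;                  // already connected: s_1 * n^(-1) = 1
    for (int i = 0; i < k - 2; i++)      // multiply by n^(k-2)
        result = result * (n % mod) % mod;
    return result;
}
```

With $n = 4$ and the two edges 0-1 and 2-3 there are two components of size $2$, and the sketch returns $2 \cdot 2 \cdot 4^0 = 4$, matching the four possible connecting edges. With no edges at all it reduces to Cayley's formula.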