Nicolò Andronio

Nicolò Andronio

Full-stack developer, computer scientist, engineer
Evil Genius in the spare time

Journey to the moon

It has been so long since I wrote an article! Probably around 10 years, a remarkable period full of void and nothingness! For the first post in my new blog I decided to pick a subject that always fascinated me, despite its apparent simplicity and “stapleness” in a developer’s toolbox. I am talking about graph theory! Yes, that very boring class you took ages ago, that one you always suspected was stolen from the mathematicians and shoved down your throat because y’know… someone said graphs are kinda important and blah blah blah

Well, brace yourselves because I’m going to propose a quite trivial solution to this HackerRank problem, named “Journey to the moon”. As experience tells us, computer scientists are always there when some self-important manager screws up with documents and forgets papers about astronauts’ nationalities. That’s serious business. Take your time to read the problem carefully and meet me here in 5 minutes.

Done? Great! So, there are surely several trivial solutions to the problem; however the best in my opinion is based on a kind of data structured called union/find forests or disjoint sets. They came to my mind right away because this problem basically is yelling their name out load: it’s the most flamboyant and immediate application of disjoint sets, since we are required to answer the query “do these elements belong to the same set?” in a very short time. Union/find forests support the following operations:

In principle, both find and merge have a worst-case time complexity O(n). However, with some smart optimizations, namely path compression and union by rank, they can be reduced to O(α(n)). Since α is an extremely slowly increasing function - the inverse of the exploding Ackermann function - the final average case amortized time complexity is reduced to O(1) for any possible value of n that can be valid in the physical universe. Pretty cool huh?

The basic idea behind my solution is simple. Each astronaut will be a vertex in the forest. When we get a pair of astronauts, we know that they belong to the same country, therefore we can add an edge between their vertices and join the trees they belong to. Thus a tree in the forest will represent all astronauts from the same country. By counting the number of disjoint trees and their size we get the exact subdivision of all our participants! From that it’s trivial to compute the total amount of possible combinations.

Before proposing the (hurried and quite messy) implementation that I used to solve the problem, I’d like to point out a rather important fact that is essential to reach a correct solution. Look at the second example. They motivate their own answer with the following sentence:

Persons numbered 0 and 2 belong to the same country, and persons 1 and 3 don’t share countries with anyone else, so they belong to unique countries on their own.

I highlighted the logical connection between the two statements because it contains some core information. It tells us that all astronauts that do not appear in the given dataset belong to their own country. This is a direct consequence of the premises fixed in the prologue but it is nonetheless quite easy to miss. In fact we are allowed to make such an assumption:

Assume that you are provided enough pairs to let you identify the groups of astronauts even though you might not know their country directly.

Now it’s time to code!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

class Solution
{
static void Main(String[] args)
{
var header = Console.ReadLine().Split(' ').Select(s => Int32.Parse(s));
var n = header.First();
var i = header.Last();

// The union/find forest will help us keep count of all the astrounauts' countries
var astrounauts = new UnionFindGraph(n);

for (int j = 0; j < i; j++)
{
var pair = Console.ReadLine().Split(' ').Select(s => Int32.Parse(s)).ToArray();
var a0 = pair[0];
var a1 = pair[1];

// So if two astrounauts belong to the same country they will go in the same tree
astrounauts.AddEdge(a0, a1);
}

// If n is greater than the number of defined vertices in the union/find forest,
// it means some astrounauts have not been named. I called this condition "hiddenVertices"
var hiddenVertices = (n > astrounauts.UsedVertices);
// The roots of the union/find forest represent the various defined countries
var roots = astrounauts.Roots.ToArray();
// This array holds the number of astronauts for each country...
var countriesCount = new long[roots.Length + 1];

// ... which is nothing more than the size of the country tree!
for (int j = 0; j < roots.Length; j++)
countriesCount[j] = astrounauts.GetSize(roots[j]);

// This is a special value that contains the number of "untold" astronauts
countriesCount[roots.Length] = (n - astrounauts.UsedVertices);

long ways = 0;

// Now to get the total ways we can combine them, just perform a trivial product.
// Notice that we are including the number of untold astronauts too implicitly in
// the last slot of the countriesCount array
for (int j = 0; j < countriesCount.Length; j++)
for (int h = j + 1; h < countriesCount.Length; h++)
ways += countriesCount[j] * countriesCount[h];

// Bonus points: recall that the remaining astronauts belong each one to their own country
// so we must take into accounts the total number of ways we can combine them too
if (hiddenVertices)
{
var t = countriesCount.Last();
ways += t * (t - 1) / 2;
}

Console.WriteLine(ways);
}
}

// The following implementation uses several underlying arrays to keep track of vertices.
// I made the choice because the size of this problem is given, and bounded to a reasonable threshold, therefore
// I know for sure that this graph will always fit in memory. Arrays are a nice choice
// because, despite their clumsiness and C-like-ness, they provide fast random access and we
// are working in a case where nodes are uniquely identified by densely packed integers (i.e. the best
// possible case).
// Given the premise, the number of a vertex is used as index and content in these arrays.
// We also need flags that tell us whether the i-th cell refers to an actually used vertex.

class UnionFindGraph
{
public const int Missing = -1;

// parents[i] contains the index of the parent of i
private int[] parents;
// treeSizes[i] contains the size (height) of the tree that i belongs to
private int[] treeSizes;
// used[i] is true iif the vertex i has been used in any edge
private bool[] used;

// By construction, vertices that are parents of themselves are the roots of their trees
public IEnumerable<int> Roots
{
get
{
for (int i = 0; i < parents.Length; i++)
if (parents[i] == i)
yield return i;
}
}

public int UsedVertices { get; private set; }

public UnionFindGraph(int size)
{
// size is the maximum number of vertices in this tree

this.parents = new int[size];
this.treeSizes = new int[size];
this.used = new bool[size];

this.UsedVertices = 0;

this.InitializeParents();
}

private void InitializeParents()
{
// Initially, no vertex has any parent because there is no vertex at all in the graph
for (int i = 0; i < parents.Length; i++)
parents[i] = Missing;
}

// This is a wrapper for the union routine and handles edge cases like missing roots
public void AddEdge(int head, int tail)
{
int headRoot = GetRoot(head);
int tailRoot = GetRoot(tail);

if ((headRoot == Missing) && (tailRoot == Missing))
{
// Both vertices don't belong to any connected component. Create a new tree rooted in head
// with only 2 vertices
SetParent(head, head);
SetParent(tail, head);
SetSize(head, 2);
}
else if (headRoot == Missing)
{
// Head doesn't belong to any connected component, but tail does. Attach head to the root of
// the tail component
SetParent(head, tailRoot);
SetSize(tailRoot, GetSize(tailRoot) + 1);
}
else if (tailRoot == Missing)
{
// Tail deson't belong to any connected component, but head does. Attach tail to the root of
// the head component
SetParent(tail, headRoot);
SetSize(headRoot, GetSize(headRoot) + 1);
}
else if (headRoot != tailRoot)
{
// If headRoot and tailRoot are the same, they already are in the same connected component.
// Otherwise, merge their connected components.
Merge(headRoot, tailRoot);
}

if (!used[head])
{
used[head] = true;
UsedVertices++;
}

if (!used[tail])
{
used[tail] = true;
UsedVertices++;
}
}

// Find
public int GetRoot(int vertex)
{
int parent = parents[vertex];

if ((parent == vertex) || (parent == Missing))
return parent;
else
{
// This is path compression: speeds up future queries by attaching
// this vertex directly to its computed parent, squashing the tree down to a shallower version
int root = GetRoot(parent);
SetParent(vertex, root);
return root;
}
}

public int GetSize(int vertex)
{
return treeSizes[GetRoot(vertex)];
}

private void SetParent(int vertex, int parent)
{
parents[vertex] = parent;
}

private void SetSize(int vertex, int size)
{
treeSizes[GetRoot(vertex)] = size;
}

// Union
private void Merge(int vertex1, int vertex2)
{
int root1 = GetRoot(vertex1);
int root2 = GetRoot(vertex2);

if (root1 == root2)
return;

int size1 = GetSize(root1);
int size2 = GetSize(root2);

// This is the union by rank, where rank in this case is simply called size.
// It just avoids the formation of long chains by trying to keep tree sizes similar

if (size1 < size2)
{
// Attach root1 to root2
SetParent(root1, root2);
SetSize(root2, size1 + size2);
}
else
{
// Attach root2 to root1
SetParent(root2, root1);
SetSize(root1, size1 + size2);
}
}
}