Nodes, weights, Uber and Tinder

I’m talking about matching at the moment, because it’s a niche interest of mine and also because I attended a talk on mentoring. I’ve already spoken about how quickly matching people gets ugly and difficult. The programme leader was experiencing this first hand — having grown the programme by 500%, he was starting to struggle to make matches based on qualitative data. He asked me to write something up to offer potential solutions. Here’s my very brief rundown.

There are a few methods by which people are matched. These can be widely categorised into:

  • Monogamous relationships
  • Polyamorous relationships
  • Relationships for the collective good

Let’s start with polyamory

Many-to-one, one-to-many, and many-to-many

This is the approach taken by most dating websites. The optimal outcome isn’t that one person is matched to one person. Instead the algorithm matches individuals to multiple others and then lets the users decide what to do next. Applying this approach to a mentor/mentee matching program is problematic, because the optimal outcome is a 1:1 exclusive pairing.

Relationships for the collective good

A typical example given for this kind of matching is employees and tasks. Every employee can carry out every task, but there’s a different cost in time — which I measure in dollars — for each employee. As a manager, I want to make sure I’m spending my employees’ time in the most efficient way.⁰ I lay out my table as in the previous blog and then find the “cheapest” route through.

Here the optimal matches are:

  • Armond: Clean bathroom
  • Francine: Sweep floors
  • Herbert: Wash windows

As you can see, it would be easy to convert tasks to people and calculate a score based on a number of traits that you’d like to match. The reason I call these “collective good matches” is because the people in them have no say. Armond may prefer to wash windows because he likes to see the sky, for example, but it is inefficient at an organisational level to match him to that task — unless part of our matching score takes into account the employee’s desire to do the task.¹

Although this task is easy with a 3 x 3 grid, at scale it requires an algorithm to manage.

Monogamous relationships

These relationships are a one-to-one match, like the ones above. We divide the population into groups and ask them to list their preferences. We can then apply an algorithm to match the two groups together. Unlike the above, where the group as a whole gets the best matches, this approach strives towards individual optimisation. This problem is usually stated as follows:

Given n men and n women, where each person has ranked all members of the opposite sex in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. When there are no such pairs of people, the set of marriages is deemed stable.²

The population you’re matching is divided into two groups; proposers and reviewers. Proposers ask their highest preference; if the reviewer prefers this proposal to their current partner then they will dump their current parter and hitch themselves to the proposer; if not, they’ll turn down the proposer.³ Once every proposer has asked, the process begins again, continuing until the round in which no reviewers swap partners.

A consequence of this process is that the preferences of the proposers are always privileged over the preferences of the reviewer.

Applying a monogamous model to the problem of mentor-mentee matches benefits the preferences of one group over the other. If this approach were to be taken, I’d recommend making the mentor the group whose preferences are most respected, as they are the more scarce resource. By contrast, applying the collective good approach might lead to poorer individual matches but will ensure everyone gets a good match, as opposed to a mix of lower and higher scoring matches.

A hybrid approach that generates a “collective good” score and uses that to proxy the list of preferences that would be generated by a “monogamous relationship” approach seems to offer the best of both worlds. It would generate the preferences “blind” — avoiding gender and race bias that we all have — as well as removing the necessity for mentors and mentees to exhaustively research and analyse hundreds of profiles from the opposing group.

⁰ The manager in this scenario is of course entirely fictional
¹ Whether or not it is wise for managers to take their employees’ feelings into account when assigning tasks is not really a question: it is self-evidently a good idea.
² For the purposes of this problem, heteronormativity reigns supreme.
³ Equally, reviewers would rather have someone than no-one

The difficulty of matching stuff

Math is hard.

This is something I’ve been getting my head around since I was about 16. I’m still doing it, and I love it. I love it mostly for the pure math problems which have zero application in real life, but there are occasional fun⁰ opportunities to apply them in real life.

Here’s an example: factorials get really big, really quickly. Factorials are written as n!, and represent n · (n-1) · (n-2)…2 · 1

For example, 3! is 3 · 2 · 1 = 6

4! is 24. 5! is 120. 6! is 720. 10! is three million, six hundred and twenty-eight thousand, eight hundred, or 3,628,800

Factorials get big quickly.

This is a nice fact, but not much use in the real world. Here’s a real world example.

Suppose there are three equally qualified candidates for positions, and three such positions. There are consequently 3! ways we could combine candidates and positions to produce the map of all possible outcomes. You can prove this for yourself by combining A, B, and C with 1, 2, 3 in as many ways as you can with no repetition. You should get the answer 6.

With just six routes through this table, we can just guess — and since everyone’s equally qualified it doesn’t really matter which we pick.

However, if we map candidate skills against skills required, we’ll have a much easier time — each value will get weighted, and we ought to be able to just pick the best ones. If the combo of candidate A and position 2 is the best fit, we’ll pick that one. Easy. And with just three candidates and three positions, odds are it’ll work out so everyone gets the thing they’re best at.¹

So far so good. What if you’ve got 10 candidates and 10 positions, you’ve carried out your calculations and found there’s a tie. To whom do you give the job? And what do you do with the loser? Moving them to their next best match might displace someone better suited:

A1 and B3 both have a score of 9.8 out of 10. B loses. Their next highest score is 7.2 out of 10 for position 8 - but candidate F has a solid 8.0 match for that post. 

You can keep shuffling B down, but eventually they’ll just be stuck with something terrible. You can tell B there’s no position for them — but you’ll be short a person. Or you can move everyone down one, which means everyone will be a bit less happy but you keep candidate B.²

And this is with just 10 candidates, 10 posts, and one tie. A real world, enterprise level version of this could theoretically try to match 400 people with 600 posts. It produces a matrix of 240,000 combinations which has 5.35 x 10¹⁰³² potential routes through it, assuming all candidates are equally well suited to all roles.

That number, by the way, runs out of words to describe it. 10¹⁰⁰ is a Googol, so this is ten Googols. It is a number with one thousand and thirty-three numbers in it. It is a Big Number.



It’s also only approximately right.As you can see, after a certain number of millions my laptop gave up entirely and just assumed that the lower numbers were zeros.

Of course that means for every role you can just roll a 600-sided die, but quite frankly that thing is for Dungeons and Dragons only and besides takes three people to roll.

So you try out algorithms, which is where we’ll move next time.

⁰ Yes, fun

¹ If you were particularly perverse, or ran a graduate training scheme, you might do the inverse: select the combinations with the lowest value to force the candidates to improve to meet the job spec.

² I don’t actually have answers here, I’m just posing the question.