Internship Lottery

As mentioned in the previous post, the new lottery is composed of 3 steps:

Simulate a large amount of RSDs and collect the results.
Successivly trade assignments between students to maximize satisfaction
Use the new assignments to produce a final assignment that’s acceptable on all parties.

What is this RSD? What does “trading” mean? What is this “final” assignment?

To understand what each step means, we need to understand the previous step that feeds it. So let’s start with RSD.

Random Serial Dictatorship

This is how the assignment used to work. And it was a pretty straightforward lottery:

A student’s names is pulled out of a bag
They are assigned their highest ranking preference that’s still available (i.e. the hospital still has vacancies)
The assigned hospital’s vacancies is decreased by 1.
Back to (1), until all students were assigned.

from random import sample

def match(preferences: list[int], hospitals: list[int]) -> int:
    for preference in preferences:
        if hospital[preference] > 0:
            return preference

def rsd(
    students: dict[str, list[int]],
    hospitals: list[int]
) -> Iterator[tuple[int, int]]:
    remaining = hospitals.copy()
    for student, preferences in sample(students.items(), k=len(students)):
        allocation = match(preferences, remaining)
        yield student, allocation
        remaining[allocation] -= 1

Example

In this example, there are 10 students ranking 4 hospitals (represented by colored squares).

On the left, are the students’ preferences from highest (left) to lowest (right).

On the right, are the hospitals’ vacancies. A full block is free, while an empty is clear.

The demo will cycle through the students, trying to grab the highest vacant position for each one. Which will be marked by a full square:

Assignment Trading

It is not unlikely, that due to the random nature of the lottery, one student might get an assignment that favorable to someone else. If this is mutual, then the two students can trade assignments. This is actually common in practice.

Sometimes, there are more complex cases where more than two students are involved in a “chain” of trades (which benefit all of them).

Assignment Probability & Happiness

As you can probably tell from playing with the example above, the assignment results are pretty random. We would like to eliminate the randomness and examine whether there’s some underlying order.

To do that, we will run the same RSD, but with many different picking order (shuffle the order in which students get to pick a hospital out of the hat). For each draw, we will record the result, and display the accumulated time each student got his i^th preference (visualized as vertical colored bars):

Rounds: 0

As you can see, after several thousands of RSD rounds, the results stabilize. At this point, we can say that each student has a certain probability of obtaining each of their preferences.

Formally, for each student, we count the number of times they got assigned their i^th preference:

\[\mathbf{n} = \left(n_1, ..., n_M\right)\]

If we divide this number by the amount of runs $n_{RSD}$, we obtain an assignment probability:

\[p_i = \frac{n_i}{n_{RSD}}\]

For each student, we define the student’s happiness by calculating a weighted sum of their assignment probabilities:

\[h = p_1 M^2 + p_2 \left(M-1\right)^2 + ... + p_M 1^2 = \sum_{i=1}^M p_i \left(M - i + 1\right)^2 = \mathbf{w}^T \cdot \mathbf{p}\]

Probability Trading

Extrapolating from the concept of assignment trading, we introduce a new concept called probability trading.

Suppose we have two students A and B with assignment probabilities $\mathbf{p}_A$ and $\mathbf{p}_B$.

We define “trading of probabilities” between the student. Student A can give a portion of their probability to be assigned to hospital i to student B, and in return, student B will give the same portion to student A.

Single probability trading

Since probability can be traded only between the same hospital, it will do us good to rewrite the formulas a bit.

For a student A, we define their preference mapping as $\pi_A : \mathcal{H} \rightarrow \mathrm{Preference}$.

If we store the assignment probabilities sorted by hospital, rather than preference, we can rewrite the happiness as:

\[h_A = \sum_{i \in \mathcal{H}} w_{\pi_A(i)} p_{A,i} = \mathbf{w}^T \cdot \mathbf{\pi_A} \cdot \mathbf{p_A}\]

In the simplest case, only one “dose” of probability is traded between students:

A gives up a chance of q to get hospital i (and B gains that chance)
B gives up an equal chance q to get hospital j (and A gains that chance)

In terms of happiness, A lost $q \cdot w_{\pi_A(i)}$ and gained $q \cdot w_{\pi_A(j)}$, whereas B lost $q \cdot w_{\pi_B(j)}$ and gained $q \cdot w_{\pi_B(i)}$:

\[\begin{eqnarray} \hat{h}_A = h_A - q w_{\pi_A(i)} + q w_{\pi_A(j)} = h_A + q \left(w_{\pi_A(j)} - w_{\pi_A(i)}\right)\\ \hat{h}_B = h_B - q w_{\pi_B(j)} + q w_{\pi_B(i)} = h_B + q \left(w_{\pi_B(i)} - w_{\pi_B(j)}\right) \end{eqnarray}\]

Since we want the trade to be mutually beneficial, we must require:

\[\begin{eqnarray} \Delta_A = q \left(w_{\pi_A(j)} - w_{\pi_A(i)}\right) & \gt & 0 \\ \rightarrow w_{\pi_A(j)} & \gt & w_{\pi_A(i)} \\ \rightarrow \left(M - \pi_A(j) + 1\right)^2 & \gt & \left(M - \pi_A(i) + 1\right)^2 \\ \rightarrow 1 \leq \pi_A(j) & \lt & \pi_A(i) \leq M \\ \Delta_B = q \left(w_{\pi_B(i)} - w_{\pi_B(j)}\right) & \gt & 0 \\ \rightarrow 1 \leq \pi_B(i) & \lt & \pi_B(j) \leq M \\ \end{eqnarray}\]

Which means, we can only trade between specific hospitals:

\[\mathrm{Trades} = \left\lbrace i, j \in \mathcal{H} : \pi_A(j) \lt \pi_A(i) \land \pi_B(i) \lt \pi_B(j)\right\rbrace\]

We want to find the size of the traded probability which maximizes the overall increase in happiness:

\[\Delta = q \left(w_{\pi_A(j)} - w_{\pi_A(i)} + w_{\pi_B(i)} - w_{\pi_B(j)}\right)\]

In addition, we have the following constraints:

\[\begin{eqnarray} q & \leq & p_{A,i} \\ q & \leq & p_{B,j} \\ p_{A,j} + q & \leq & 1 \\ p_{B,i} + q & \leq & 1 \end{eqnarray}\]

Which can be summarized as:

\[q \leq \min \left\lbrace p_{A,i}, p_{B,j}, 1 - p_{A,j}, 1 - p_{B,i}\right\rbrace\]

Since the target function $\Delta$ is linear in $q$, to maximize it, we just need to select $q$ to be the largest possible value under the constraints. So we take $q$ on the edge of the boundary (the other end of the boundary is 0, which matches the minimum):

\[\overline{q}_{i,j} = \min \left\lbrace p_{A,i}, p_{B,j}, 1 - p_{A,j}, 1 - p_{B,i}\right\rbrace\]

The above is true for a specific selection of i, j. To find the overall maximum, we need to consider all possible trades:

\[\overline{q} = \max_{\mathrm{Trades}} \overline{q}_{i,j} \left(w_{\pi_A(j)} - w_{\pi_A(i)} + w_{\pi_B(i)} - w_{\pi_B(j)}\right)\]

W = {h: (M - i)**2 for i, h in enumerate(Hospitals)}

@attr.s(auto_attribs=True)
class Student:
    ranking: List[Hospital]
    pi: Dict[Hospital, int]
    p: List[float]

    def give(self, h: Hospital, p: float):
        this.p[h] -= p

    def receive(self, h: Hospital, p: float):
        this.p[h] += p

    @property
    def happiness() -> float:
        return sum([p * W[this.pi[h]] for h, p in zip(this.ranking, this.p)])

def feasable_trades(
    A: Student,
    B: Student
) -> Iterator[Tuple[Hospital, Hospital]]:
    for i, j in pairs(A.pi.keys()):
        if A.pi[j] < A.pi[i] and B.pi[i] < B.pi[j]:
            yield i, j

def iterate_all(
    A: Student,
    B: Student
) -> Iterator[Tuple[float, Hospital, Hospital]]:
    for i, j in feasable_trades(A, B):
        p = min(A[i], B[j], 1 - A[j], 1 - B[i])
        # Increase in overall happiness
        delta = p * (W[A.pi[j]] - W[A.pi[i]] + W[B.pi[i]] - W[B.pi[j]])
        yield delta, p, i, j

def optimal_trade(A: List[float], B: List[float]):
    delta, p, i, j = max(iterate_all(A, B), operator.itemgetter(0))
    A.give(i, p)
    B.receive(i, p)
    B.give(j, p)
    A.receive(j, p)

Single Probability Trading Demo

Click through next to cycle mutually beneficial trades.

The number on the top is the happiness of each student (1 being the highest), with the number below it representing the total increase in happiness due to the trade(s).

The numbers on the colored bars represent the amount of traded assignments.

Multiple probability trading

In reality, there could be more complex trading schemes which can squeeze out even more happiness for each student. For example, imagine this scenario:

Student A gives $p$ of his probability to get hospital X
Student B gives $\frac{p}{4}$ of his probability to get hospital Y and $\frac{3p}{4}$ of his probability to get hospital Z

This should give you a good sense about the type of these transactions, and you can probably imagine just how complex these could become.

We can generalize these types of trades in the following manner:

Student A gives probabilities $\alpha$ to get hospitals $T_A \subset \mathcal{H}$ to student B.
In return student B gives probabilities $\beta$ for hospitals $T_B \subset \mathcal{H}$ to student A.

Of course, we require that $T_A \cap T_B = \emptyset$.

The changes in happiness will be as follows:

\[\begin{eqnarray} \Delta_A = \hat{h}_A - h_A = - \sum_{i \in T_A} w_{\pi_A(i)} x^-_i + \sum_{i \in T_B} w_{\pi_A(i)} x^+_i\\ \Delta_B = \hat{h}_B - h_B = - \sum_{i \in T_B} w_{\pi_B(i)} x^+_i + \sum_{i \in T_A} w_{\pi_B(i)} x^-_i \end{eqnarray}\]

We can somewhat simplify the problem by eliminating the requirements for $T_A$ and $T_B$. We can just assume there are trades between all hospitals, and that it’s possible for a student to give probability $x^-_i$ and receive $x^+_i$ for the same hospital. It’s a bit silly, but it makes sense that if an optimal trade exists, it will not be have “redundant” individual trades. And even if it does, they will cancel out.

So the changes in happiness simplify to:

\[\begin{eqnarray} \Delta_A = \sum_{i \in \mathcal{H}} w_{\pi_A(i)} \left(x^+_i - x^-_i\right) \\ \Delta_B = \sum_{i \in \mathcal{H}} w_{\pi_B(i)} \left(x^-_i - x^+_i\right) \\ \Delta = \Delta_A + \Delta_B = \sum_{i \in \mathcal{H}} \left(w_{\pi_A(i)} - w_{\pi_B(i)}\right) \left(x^+_i - x^-_i\right) \end{eqnarray}\]

We require that each of the individual changes in happiness are positive:

\[\begin{eqnarray} \Delta_A \gt 0 \\ \Delta_B \gt 0 \end{eqnarray}\]

In addition we require that changes in probability make sense:

\[\begin{eqnarray} x^-_i \leq p_{A,i} \\ p_{A,i} + x^+_i \leq 1 \\ x^+_i \leq p_{B,i} \\ p_{B,i} + x^-_i \leq 1 \end{eqnarray}\]

Which can be rewritten as:

\[\begin{eqnarray} x^-_i \leq \min\left\lbrace p_{A,i}, 1 - p_{B,i} \right\rbrace \\ x^+_i \leq \min\left\lbrace p_{B,i}, 1 - p_{A,i} \right\rbrace \end{eqnarray}\]

In addition, we have the following “preservation” requirement:

\[\sum_{i \in \mathcal{H}} x^-_i = \sum_{i \in \mathcal{H}} x^+_i\]

To summarize it all:

\[\begin{eqnarray} \mathrm{maximize} \ & \Delta = \sum_{i \in \mathcal{H}} \left(w_{\pi_A(i)} - w_{\pi_B(i)}\right) \left(x^+_i - x^-_i\right) \\ \mathrm{such\ that} & \sum_{i \in \mathcal{H}} w_{\pi_A(i)} \left(x^+_i - x^-_i\right) \gt 0 \\ & \sum_{i \in \mathcal{H}} w_{\pi_B(i)} \left(x^-_i - x^+_i\right) \gt 0 \\ & \sum_{i \in \mathcal{H}} \left( x^+_i - x^-_i \right) = 0 \\ & 0 \le x^-_i \leq \min\left\lbrace p_{A,i}, 1 - p_{B,i} \right\rbrace \\ & 0 \le x^+_i \leq \min\left\lbrace p_{B,i}, 1 - p_{A,i} \right\rbrace \end{eqnarray}\]

This is a Linear Programming problem, and solving them is quite tricky, which is why we’re going to take a small detour and survey linear programming in the next few posts. Stay tuned.