Monday, October 31, 2011

Problem 19 Answer: "Rating Trading, Part 2"

We prove the problem NP-complete by reduction to the subset-sum problem.

Given an instance of the subset-sum problem, where we have a set A1 ... An, and we want to determine if there is a subset that sums to exactly B.

Let X be the sum A1 + ... + An. (Assume that B <= X; otherwise the problem is trivial.) Consider an instance of the Generalized Math Trade Problem with two participants P and Q. Person P has objects which he values at (2*A1) ... (2*An) and person Q values at (A1 ... An) respectively. Person Q has one object which he values at B and P values at 3X. Choose a target value of 5X-B. The only way to achieve the target value under the constraints is for Q to give P his object, and P to give Q objects from his set that Q values at a total of exactly B. Thus if we had a way to determine whether this was possible, we would have solved the subset-sum problem.

Problem 18 Answer: "Rating Trading"

Problem 18A. Create a graph in which each node represents a player, and there is an edge from A to B if and only if B wants the game A has. Each edge has capacity 1 and cost -1, and each node has node capacity 1. If the edge from A to B has flow, then that means A gives his game to B. The conservation-of-flow constraints ensure that each person ends up with the same number of games he started out with (exactly one), the node capacity constraints ensure that a player isn't trying to give his game to two or more people at the same time, and "minimizing the cost" means to maximize the number of trades.

Problem 18B. Split each node into two nodes: an "in" node and an "out" node. All the incoming edges go into the "in" node, and all the outgoing edges come out of the "out" node. Add a "synthetic edge" going from the "in" node to the "out" node with cost equal to 0 and capacity equal to the node capacity. This guarantees that the amount flowing through that node is not greater than the node capacity.

Problem 18C. Suppose that I am person A and the people B1 ... Bn all have the same game that I want. What I do is create a "fake" person B that wants each of B1 ... Bn's games, and I say that I want B's game. Then the "person B" will win at most one of the games (and then pass it on to me, since I am the only one who wants B's game).

---

The current "gold standard" for software to do the Math Trades is TradeMaximizer, developed in 2008. The algorithm used is described at the bottom of that page, and as far as I can tell, the graph TradeMaximizer generates is exactly the same as the graph described above - the only difference is that TradeMaximizer formulates it as a bipartite graph matching problem while I formulated it as a network flow problem; the main difference between the formulations is that whether or not there is flow on each "synthetic edge" is inverted. I just hadn't noticed that the graph was bipartite so you can use the special "bipartite graph matching" algorithms, but even normal network flow algorithms are polynomial time (in the case where all the costs are 1), which is much better than the exponential time algorithms they were using for years before TradeMaximizer. Oh, well, if only I had found out about the math trades a few years earlier I could have been famous on BoardGameGeek.

Friday, March 18, 2011

Problem 17 Solution: "Seeds of Victory"

No. The problem is that there are 12 ways of choosing a Final Four with seeds 1,1,2,3, and only one way of choosing a Final Four with seeds 1,1,1,1. If I choose the Final Four with seeds 1,1,1,1, I will have a 1 in 38.02 chance of success. However. if I choose a Final Four with seeds 1,1,2,3, then I will only have the correct Final Four if both (a) the final four has seeds 1,1,2,3 (this has probability 1 in 14.05), and (b) I correctly predicted which of the twelve (1,1,2,3) combinations would happen (this has probability 1 in 12), for a total probability of 1 in 168.6.

To give a simpler analogy, let's say I flip two biased coins that each have probability 0.6 of coming up heads. It's easy to see that the probability of two heads is 0.36 while the probability of one tail and one head is 0.48. But that's because the "one tail and one head" is really two outcomes, (head, tail) and (tail, head). If I was forced to predict the results of each of the two coin tosses separately, a prediction of (heads, heads) would be right with probability 0.36 while a prediction of (heads, tails) would be right with probability 0.24. And in March Madness you predict each conference separately, not the Final Four seed distribution as a whole. The situation above is the same, where "1,1,2,3" is really 12 different outcomes. ((1,1,2,3),(1,1,3,2),(1,2,1,3)
, etc.)

(Yes, I did send this analysis to Jacobson. I just did it a few minutes ago, so I haven't heard back from him yet, though.)

Thursday, February 24, 2011

Problem 16 Solution: "Fall Out Pick Up"

(Note: We will identify an object of weight a and value b by the notation [a,b].)

(a) and (b).

No. Suppose there was an algorithm with competitive ratio K > 0. Then the adversary operates as follows:

- Choose an integer T that is high enough so that 1/T < K.
- Give the algorithm a capacity (T^2).
- At first, provide the item [T^2, T]. If the algorithm doesn't take the item, then stop there; the algorithm got 0 while it could have gotten T, contradicting the fact that it has competitive ratio K. If it does, then keep providing [1,1] items. If the algorithm ever drops the [T^2, T] item to pick up a [1,1], then stop there; the algorithm got 1 when it could have gotten T. Otherwise, stop after you have provided T^2 [1,1] items; by this point the algorithm got T when it could have gotten T^2. So the algorithm has only achieved 1/T of the optimal, contradicting the competitive ratio K.

So by contradiction, there is no algorithm with competitive ratio K > 0.

(c) Each time you see a new item, pick up the item, then drop the items with the lowest value-to-weight ratio until you get within capacity (this may include dropping the new item; in this case you essentially don't pick it up). If this never involves dropping any items, then you will have picked up every item you got so of course it is optimal. If this does involve dropping items, then at the end you will have at least C-M out of your capacity C filled (because each time you drop an item it reduces the total by at most M, and you don't drop until you're over C). Also, since you're storing the items with the highest cost-to-weight ratio, the space that you did fill you are getting as much value out of it as you can. So the competitive ratio is at least (C-M)/C = 1-(M/C).

Saturday, February 19, 2011

Problem 15 Solution: "For Science"

Specify a bijective mapping that associates each possible ordering with an integer in the range [1,24]. It does not matter how you do this; one way is in lexicographic order (say the wormholes are marked A, B, C, D, then you would have 1 = ABCD, 2 = ABDC, 3 = ACBD, 4 = ACDB, etc). Then you can use binary search to find the correct number by asking questions of the form "Is the number corresponding to the correct ordering less than or equal to X?"

Since there are 24 possibilities, this takes ceil(log_2(24)) = 5 questions.

The generalization is obvious - if there are N wormholes then you map each possible ordering to an integer int he range [1,N!], and so the algorithm takes ceil(log_2(N!)) = O(n log n) questions.

The proof of optimality is easy: If there are k questions, then the number of possible ways of answering those questions is at most 2^k, so if 2^k < N! (which happens if k < log_2(N!)) there is no possible way of distinguishing between the N! possibilities using just the k questions.

[The trick to this question is that you get to ask *any* yes-or-no question; you are not limited to questions of the form "Does X come before Y" that most sorting algorithms use. As an aside, the proof of optimality above is also the standard argument use to show that no comparison-based sorting algorithm can run in better than O(n log n) time.]

Wednesday, February 24, 2010

Problem 14 Solution - "Community Events"

Suppose there are N attributes - number the attributes from 1 to N. Then define an N-by-N matrix T as follows: The value in the ith row and jth column of T is equal to the intensity of the link going from attribute j to attribute i, or zero if there is no such link.
Now consider the N-dimensional column vector of attribute values. Suppose the initial event adds the vector J to the vector of attribute values (e.g. if only the i'th attribute was affected, J would have zeros in all positions but the i'th position, and the i'th position would ahve how much was added). Then the amount that is added in addition to that due to links is TJ. But these changes also affect other attributes through links, for an additional effect of (T^2)J. And then (T^3)J, and so on.

So the total change is (I + T + (T^2) + (T^3)...)J. We want to see if (I + T + T^2 + T^3 ...) converges. Since this is a power series of matrices, we can use the eigenvalue decomposition of T - see the section titled "functional calculus" in the linked article for information on how to use the eigenvalue decomposition to find power series of matrices. Since we have all the a_i = 1 (using the notation in the article), it just becomes the regular geometric series, so it converges provided all the eigenvalues of T have absolute value less than 1.

Monday, February 22, 2010

Problem 13 Solution: "Characteristic Characters"

Consider the system with figured characteristics. Suppose there are N total characteristics. We will make the following definitions:
Let T be an N-dimensional row vector representing the cost per point for each characteristic.
Let P be an N-dimensional column vector representing the number of points of each characteristic that is purchased.
Let R be an N-dimensional column vector representing the number of points of each characteristic that the player actually ends up with.
Let X be the total number of points the player spends on characteristics.
We will now find relationships between these variables.
First, observe that X is just the dot product of T and P - or in other words TP.
Now consider the relationship between P and R. R is equal to P plus the vector V representing the number of "free points" of figured characteristics obtained. But since each element of V is a linear combination of elements of P (of course zero in the case of base characteristics), we can represent this as AP where A is a matrix representing the set of linear equations. So we have R = P+V = P+AP = (I+A)P where I is the N-by-N identity matrix.

In other words, the player gets a final characteristic vector of (I+A)P for a cost of TP points.
Now observe that I+A has ones all along its main diagonal, and is upper triangular (if you list the base characteristics before the figured ones), which is sufficient to show that it is nonsingular, and thus has an inverse, which we will designate as Z.

Now consider a system without figured characteristics, and with a cost vector of TZ. Then in order to get the same characteristic vector as before - (I+A)P - the player he simply buys those characteristics directly, for a total cost of TZ(I+A)P. But since Z and I+A are inverses, they cancel each other out, so the total cost is TP, just as before. This shows that the two systems have the same cost, regardless of what characteristics the player wants to buy.