Sunday, April 17, 2016

A greedy week

This week started with TopCoder Open 2016 Round 1B on Tuesday (problems, results, top 5 on the left). Just like in Round 1A, the top of the scoreboard featured many contestants who don't compete much anymore — but the first place went to Kankuro who's actually quite active in other competitions, just doesn't like doing TopCoder SRMs for some reason. Congratulations, Vlad!

TopCoder SRM 688 followed on Friday (problems, results, top 5 on the left, my screencast). The main story of this round was that of the 1000-pointer, which went like this: you are given a sequence of opening and closing parentheses. You want to color each parentheses either red or blue, in such a way that all red parentheses form a correct parentheses sequence, and all blue parentheses form a correct parentheses sequence. Out of all ways to do that, you want to pick the one that minimizes the total cost of the two resulting parentheses sequences. The cost of a parentheses sequence, in turn, is defined as the sum of costs of all its matching parentheses pairs. And finally, the cost of a matching parentheses pair in a parentheses sequence is defined as the square of the maximum level of nesting of parentheses present inside that pair (i.e. the cost of "()" is 1, the cost of outer parentheses pair in "(()())" is 4, and the total cost of "(()())" is 6).

Almost all accepted solutions for this problem, to the best of my knowledge, implement the following greedy algorithm: go from left to right, and color each parentheses (either red or blue) in such a way that the balance of the red sequence is always greater than or equal to the balance of the blue sequence, and they differ by at most 1. It's not hard to see that this rule determines the color of each parenthesis uniquely.

For example, consider input string "((())()(()))". Here's how the greedy solution works on this input:
Char  Color  Red String  Red Balance  Blue String  Blue Balance
---------------------------------------------------------------
   (    red           (            1                          0
   (   blue           (            1            (             1
   (    red          ((            2            (             1
   )    red         (()            1            (             1
   )   blue         (()            1           ()             0
   (   blue         (()            1          ()(             1
   )   blue         (()            1         ()()             0
   (   blue         (()            1        ()()(             1
   (    red        (()(            2        ()()(             1
   )    red       (()()            1        ()()(             1
   )   blue       (()()            1       ()()()             0
   )    red      (()())            0       ()()()             0

Intuitively it seems like this algorithm splits the parentheses most evenly into two halves, and thus the cost of each parenthesis pair will be as small as possible, and thus the total cost will also be as small as possible. A formal proof, however, evades me. The problemsetters also didn't expect this greedy solution, and expected something much more complex. Do you see how to prove (or find a counterexample) to this solution?

Some further points:
  • This solution passes stress test using random sequences of length up to 16.
  • It seems that this solution also works with any other non-decreasing cost function (from nesting level to cost).
  • Minor details are important: for example, if you don't always maintain that it's the red sequence that has higher balance when the total balance is odd, and allow the blue sequence to have higher balance sometimes, then the solution isn't optimal anymore. For example, a somewhat logical implementation is to always put the next parenthesis to the red sequence if the current balances are equal, regardless of it being closing or opening. It fails, for example, the following testcase: "((())(()))". It splits it as "(())(())" + "()" for the total cost of 4+4+1+1+1=11, while "(()())" + "()()" is better at 4+1+1+1+1=8.
At the same time with the SRM, CROC Championship 2016 Final Round took place in Moscow (problems, results, top 5 on the left, online round results with 4/5 problems in common, partial analysis in Russian). The dynamic scoring system on an onsite competition with just 50 contestants meant that just the two most difficult problems really mattered. Gennady was able to claim the victory by solving one of those problems well before anybody else could — congratulations!

Problem C was the most interesting for me in this round, in particular because it had a very simple and beautiful statement: you are given a 20x100000 table of zeroes and ones. You are allowed to invert (replace 0s with 1s and vice versa) rows and columns of this table. What is the smallest number of ones you can make the table contain?

Google Code Jam 2016 continued with the Round 1A on Saturday (problems, results, top 5 on the left, analysis). nika has managed to solve all three problems in just 21 minutes — amazing! Congratulations to everybody who advanced to Round 2, and those who don't have two more chances in the coming rounds 1B and 1C.

Thanks for reading, and check back next week!

No comments:

Post a Comment