After blogs had stopped being cool in 2004, I decided that it was now safe to start one. This is a blog mainly about algorithms, machine learning, complexity, and often closely related topics like cycling. There will be technical posts, but I will try to make an effort to keep each technical post accessible to more than a general theory audience. It's not going to be a math blog. I generally prefer short blog posts over long ones. Maybe I should decide on a word limit.
One particular theme that I will likely cover is social concerns and how they affect the way we should design algorithms.
Privacy is a concern that has gained solid traction in the academic community. The theory is relatively far developed. As some folks know, I'm an enthusiastic supporter of Differential Privacy and excited by the progress that's been made in that community. I will probably blog about Differential Privacy to some extent. In particular, I'd like to cover some recent and ongoing attempts at putting Differential Privacy to use in non-academic settings. Applying Differential Privacy is challenging for a number of interesting reasons that I hope to discuss.
But there are other social questions apart from privacy that I've been thinking about which are much less developed. Fairness is one of them. The problem arises in nearly all (algorithmic) classification tasks that deal with human beings. It's started to appear on the academic radar even within Computer Science and related areas. I hope to cover relevant developments.
Classifying humans is fundamentally different from classifying, say, rocks, for at least one other reason that I can think of. Human beings are strategic agents, unlike most rocks. This leads to problems such as gaming and manipulation. A classifier that distinguishes between poor students and good students by looking at the number of books they own might have very high classification accuracy on historic data. Nevertheless, it is quite unlikely to perform well as a high-school entrance test, for example. It would only incentivize students to purchase a lot of books. The signal found in historic data would disappear quickly as a result. This phenomenon is also a major hurdle in policy making. It's known as Goodhart's law in financial policy. Despite its importance, computer scientists have spent little time thinking about how it affects the design of machine learning algorithms. My next post will return to this problem.
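To make the book example a bit more concrete, here is a toy simulation in Python. It is only a sketch under made-up assumptions: the book-count ranges, the 50/50 split between good and poor students, the threshold of 23 books, and the helper names (`make_students`, `game`, and so on) are all hypothetical and chosen purely for illustration. The point is just that a threshold rule can look accurate on historic data and then drop to roughly the base rate once everyone buys enough books to pass.

```python
import random


def make_students(n, seed=0):
    """Generate (books_owned, is_good_student) pairs for a toy population.

    Assumption (made up for illustration): good students historically own
    20-60 books, poor students own 0-25, and each group is about half the
    population.
    """
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        good = rng.random() < 0.5
        books = rng.randint(20, 60) if good else rng.randint(0, 25)
        students.append((books, good))
    return students


def classify(books, threshold):
    """Predict 'good student' whenever the book count reaches the threshold."""
    return books >= threshold


def accuracy(students, threshold):
    """Fraction of students whose predicted label matches the true label."""
    correct = sum(classify(books, threshold) == good for books, good in students)
    return correct / len(students)


def game(students, threshold):
    """Strategic response: every student buys books until they pass the test.

    The true label does not change, only the observed feature does.
    """
    return [(max(books, threshold), good) for books, good in students]


THRESHOLD = 23  # hypothetical cutoff, picked by hand for this toy data

historic = make_students(1000)
gamed = game(historic, THRESHOLD)

acc_historic = accuracy(historic, THRESHOLD)
acc_gamed = accuracy(gamed, THRESHOLD)

print(f"accuracy on historic data: {acc_historic:.2f}")
print(f"accuracy after gaming:     {acc_gamed:.2f}")
```

After gaming, every student clears the threshold, so the classifier labels everyone as good and its accuracy is simply the fraction of good students in the population. The feature has stopped carrying any signal, which is the effect described above.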
In the near term I also hope to cover some of the exciting talks and workshops at the new Simons Institute at Berkeley. I will be a participant in the Theory of Big Data program and I plan on selectively attending some of the events in the Real Analysis program, as well. So, stay tuned for some coverage.
Oh, and why did I choose that name? Moody Rd is one of my favorite roads in the Bay Area. It used to be my escape route out of Mountain View into the Los Altos Hills. Moody Rd ends in a vicious climb that's followed by an even longer climb to the top of Skyline. If you continue to descend through the redwoods, you'll eventually reach the Pacific Ocean. Make sure to eat a goat cheese pizza in Pescadero while you're out there. Moody Rd symbolizes a transition from the busy life in the Bay Area to the quiet Santa Cruz Mountains. It's a transition from intellectual exercise to physical exercise. It's a change in mood from busy to free. Googling "Moody Rd", I found an impressive video of a guy barrelling up the 10% average grade on Moody Rd on his bike. So, if you're curious, I invite you to take a shaky ride on Moody Rd.