9.12.10

Yacc is of the Living Dead

The scientific process seems to be evolving towards something like pop culture and the music industry. For starters, where they have record companies, we have publishing houses.

And then, there is a scientific article with the title "Yacc is dead" [1] which, quite possibly, has reached more people than all the accepted papers of the conference to which it was submitted.

Russ Cox criticized this paper quite harshly in an insightful blog post [2].

So, in short, the paper proposes a method that yields all parse trees and claims that the approach "parses efficiently on average". The authors run a benchmark, but of course a benchmark is not exactly a proof. Russ's argument (by analogy with regexps) goes something like this: we can make up grammars that simulate NFAs, and since there are regexps whose NFA has O(exp(n)) possible successful runs on an input of length n (maybe something like a*a*), asking for all parse trees is asking for trouble.
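
To see how bad it can get, here is a back-of-the-envelope illustration of my own (it appears in neither the paper nor the blog post): in the ambiguous grammar S ::= S S | 'a', every way of bracketing n leaves into a binary tree is a distinct parse of a^n, so the number of parse trees is the Catalan number C(n-1). A few lines of Scala count them:

import scala.collection.mutable

object AmbiguityBlowup {
  private val memo = mutable.Map[Int, BigInt]()

  // Number of parse trees of  S ::= S S | 'a'  on the input "a" * n:
  // one tree per bracketing of n leaves, i.e. the Catalan number C(n-1).
  def trees(n: Int): BigInt = memo.getOrElseUpdate(n,
    if (n == 1) BigInt(1)
    else (1 until n).map(k => trees(k) * trees(n - k)).sum)

  def main(args: Array[String]): Unit =
    for (n <- List(5, 10, 20)) println(n + " -> " + trees(n))
    // prints 5 -> 14, 10 -> 4862, 20 -> 1767263190
}

An input of just twenty characters already has about 1.8 billion distinct parse trees, so any parser that promises to return all of them is in trouble on such grammars.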

Where I am not so sure is whether the authors are really unaware of this. It is possible to understand "parses efficiently on average" as "works efficiently for the average grammar programmers want to parse". Writing an ambiguous grammar by accident is one thing; writing something like a*a* is something else. Agreed, a statement like "works well for the average grammar" does not have much scientific worth -- and Russ's criticism still stands -- but it may well work out in practice.

The math that derives working parsers from their specification is beautiful... the issue for practical purposes is that there are other approaches to combinator parsing -- and those are what the derivative-based method should be compared with, not yacc.

There is at least one paper (a journal publication) out there that shows how memoization can make top-down functional parsers linear [3]; unfortunately, I cannot access it, so I cannot comment on how applicable it is here. What's next, DMCA takedowns for papers? J'accuse.

Regardless of what is up with yacc or memoization, the functional programming literature offers a plethora of papers that use parser combinators to achieve polynomial parsing (the first reference that comes to my mind is Swierstra and Duponcheel's paper [4]), and the Wikipedia page on combinator parsing [5] links to more recent -- 2007 -- work by Frost and Hafiz that looks interesting.
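
Since I cannot get at [3], here is merely my own sketch of the general idea, not the construction from any of these papers: represent a recognizer as a function from a start position to the set of end positions it can reach, and memoize it per position, so that no (recognizer, position) pair is ever computed twice. (Left-recursive rules need extra machinery, which is, as far as I can tell, what the Frost and Hafiz work addresses.)

import scala.collection.mutable

object MemoParsing {
  val input = "aaaa"

  // A recognizer maps a start position to the set of reachable end positions.
  type P = Int => Set[Int]

  // Memoize a recognizer per input position.
  def memo(p: => P): P = {
    lazy val target = p
    val table = mutable.Map[Int, Set[Int]]()
    i => table.getOrElseUpdate(i, target(i))
  }

  def lit(c: Char): P =
    i => if (i < input.length && input(i) == c) Set(i + 1) else Set.empty
  def seq(p: P, q: P): P = i => p(i).flatMap(q)
  def alt(p: P, q: P): P = i => p(i) ++ q(i)

  // S ::= 'a' S | 'a'  -- thanks to the memo table, each s(i) is computed
  // at most once, where a naive backtracker would re-parse the same
  // suffix over and over.
  lazy val s: P = memo(alt(seq(lit('a'), i => s(i)), lit('a')))

  def main(args: Array[String]): Unit =
    println(s(0))  // S can stop after 1, 2, 3 or 4 characters
}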

There are no conclusions here. Combinator parsing is not new, and with the rising popularity of Scala it is pretty much alive and kicking. After all, Scala ships with a packrat parser combinator library. For my next parser, I will certainly use that one (see the snippet below) and not Might and Darais's approach. Still, I think the approach described in their paper has educational value, and I sincerely hope the authors will come up with a longer, more polished version.
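
For the record, this is roughly what it looks like (a minimal sketch of mine using scala.util.parsing.combinator; note that PackratParsers even copes with the left-recursive rules below):

import scala.util.parsing.combinator._

object Arith extends RegexParsers with PackratParsers {
  lazy val expr: PackratParser[Int] =
    expr ~ ("+" ~> term) ^^ { case a ~ b => a + b } | term
  lazy val term: PackratParser[Int] =
    term ~ ("*" ~> factor) ^^ { case a ~ b => a * b } | factor
  lazy val factor: PackratParser[Int] =
    "(" ~> expr <~ ")" | """\d+""".r ^^ (_.toInt)

  def main(args: Array[String]): Unit =
    println(parseAll(expr, "2*3+4"))  // [1.6] parsed: 10
}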

[1] Matthew Might, David Darais. Yacc is dead.

[2] Russ Cox. Yacc is not dead.

[3] Frost, Szydlowski. Memoizing purely functional top-down backtracking language processors. Science of Computer Programming, Volume 27, Issue 3, November 1996, Pages 263-288.

[4] Swierstra, Duponcheel. Deterministic, error-correcting combinator parsers.

[5] http://en.wikipedia.org/wiki/Parser_combinator

19.8.10

Romantic enthusiasm for verification

Recently, a workshop got me interested in contract programming again. In essence, programming by contract means we annotate our source code with pre- and postconditions:

@Requires(numChars % 2 == 0)
@Ensures(result.length() == numChars)
String foo(int numChars) {
...
}

This is like saying: "the only proper way to call this method is with a numChars parameter that is even; in return, you will get a string of exactly numChars characters." Sounds like a deal.

There is probably a lot to say about contracts and how they can help developers understand their code. There are frameworks that will take these annotations and turn these into run-time checks. My interest is more in verification though. Automated program verification is really hard to do, because several undecidability results apply -- but in the words of Prof Jürgen Giesl this also means that one can always improve the state of the art.
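
As an aside, Scala ships a run-time flavour of this in its Predef: require checks a precondition on entry, ensuring a postcondition on exit. Here is my transliteration of the example above (the method body is made up, of course):

def foo(numChars: Int): String = {
  require(numChars % 2 == 0, "precondition numChars % 2 == 0 not satisfied")
  "ab" * (numChars / 2)  // some string of the promised length
} ensuring (result => result.length == numChars)

In the REPL, foo(4) yields "abab", while foo(3) throws an IllegalArgumentException. Verification is about moving that failure from run time to compile time.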

The previous time I got excited about contracts was when I had the pleasure of hosting Prof Peter Müller at Google, where he gave a talk and demo of Spec#. The article "Specification and Verification: The Spec# Experience" gives a glimpse of the programming system the authors built. An amusing quote from there: "When programmers see our demos, they often develop a romantic enthusiasm that does not correspond to verification reality."

Let me state that dream here: a program is made of method calls. If the compiler, or IDE, "understands" the pre- and postconditions of those method calls, it should be able to chain these "facts" together and achieve some form of "understanding" of the program. This could give developers immediate feedback on whether the lines they add or change are correct.
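
To make the chaining concrete, continuing my Scala sketch from above: a checker that knows bar's postcondition can discharge foo's precondition in the composed call, with no run-time test left over.

def bar(x: Int): Int = {
  2 * x
} ensuring (result => result % 2 == 0)

foo(bar(21))  // bar's postcondition implies foo's precondition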

Going back to the example above: imagine writing foo(3) in your IDE and having it underlined in squiggly red, with a mouse-over message stating "precondition numChars % 2 == 0 not satisfied".

Spec# translates the pre- and postconditions into problems for its underlying engine, Boogie, which in turn discharges the generated verification conditions with a Satisfiability Modulo Theories (SMT) solver. One such solver is Z3, which has a readable tutorial. Reading up on these, it is quite striking that Spec# works on a real programming language, and that neat interfaces do exist between the layers of byte code, verification source, and the nitty-gritty predicate logic formulae.
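
To get a feeling for what such a translation produces, take the toy example from above (deliberately simplified by me; real Boogie output looks quite different): the call foo(3) gives rise to the proof obligation 3 % 2 == 0, the solver refutes it, and the tool maps the refutation back to the offending call -- which is exactly the squiggly red line from the dream.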

And then, there still is the following flaw, quoted here:

"At the moment you find an error, your brain may disappear because of the Heisenberg uncertainty principle, and be replaced by a new brain that thinks the proof is correct."
-- L.A. Levin

Just kidding. Robert Pollack's article is concerned with the problem that we can come up with a system that will verify our program, but we won't necessarily be able to trust it, because it is a program itself. Here is the bit that is relevant though: "My guess is that technologies of automated proof search will be highly developed because of economic pressure for reliable hardware and software, and these, applied as tactics to proof checking, will make formal mathematics a practical reality in the foreseeable future."

It seems we still have a long way to go before developers see the light of some form of automated program verification in their IDEs. I have a hope that if one goes back to natural deduction style proofs, logical frameworks, or simple logic programming with SLD resolution, it may become possible to write reasonable (logical) annotations and get proper tool support. Let (expert) programmers write more of the reasoning themselves. Only experimentation can show whether this is a good idea.

14.8.10

Good news and minarets

Good news first:
  • I got married to Şerife exactly one week ago : )
  • It is the blessed month of Ramadan, and my best wishes go to all who observe it.
  • The world has not ended yet.
Apart from that, today's news has been full of bad stories from all over the world. Additionally, I had been reading a lot of stuff all over the web, and naturally some of it had the potential to pull me down (think Palestine, think victims of human trafficking). But today the Swiss news also featured Obama supporting religious freedom in the US.

I got curious about the Cordoba Foundation (http://www.thecordobafoundation.com/) and found their journal, Arches. It proved to be quite fascinating, as the points of view expressed in its articles are so different from what one usually sees in the media. The latest edition is Vol 3 Edition 5.

Not all of the Arches articles are to my taste. For instance, there is a Kurdish author who represents the home-coming of PKK guerillas in Turkey (which derailed the peace process very efficiently) as the "home-coming of members of a peace group" and describes how brainwashed the Turkish people are to misunderstand this event. Well, it seems that in 2010 "peace groups" sport fashionable war uniforms, even when they lay down their arms.

The one article I like most is "Understanding 'Land of War' and 'Land of Islam'" by Dr Jasser Auda, p. 41.
'Land of War' is a concept that, unfortunately, has currency among some Muslims. For them, it is essentially the clash-of-civilizations hypothesis. One glossary gives the definition: a land ruled by infidels that might, through war, become the "Abode of Islam."

In essence, there really are people who believe that in the "Land of War" they are free to cheat or lie. Maybe they think Muslims will "conquer" this land one day, and then, only then, will it become necessary to act. It seems pretty clear that terrorists would subscribe to this idea.

It sounds silly to think of a country like Switzerland as a "Land of War". You would think that Turkey, which is mostly Muslim, would not qualify either. Yet, I once overheard a discussion where somebody qualified the whole of Turkey as un-Islamic, since "the women would not wear the veil there". Uh-huh. What can one say to that? Something is pretty wrong with people who think in terms of a "Land of War".

The concept has been refuted many times, and practicing Muslims who take their religion seriously enough to do some research should know that it is false. Dr Auda's article is special in that it is approachable for Muslims and non-Muslims alike, and reads like a sound scholarly discourse on the topic.

First, the article points out the difference between Shari'a and Fiqh: whereas the former is an ideal (like "Justice"), the latter is a necessarily incomplete, changing attempt to achieve that ideal through norms and rules (like "Law"). He then places the "Land of War" concept in the realm of Fiqh. In other words, yes, there was a time and a place where learned scholars argued that outside the secure realm of the Islamic empire, things get rough. Fiqh is and must always be adapted to the social circumstances of a particular time and place. (My remark: a great book that explains why is Tariq Ramadan's "Radical Reform".)

Back to the article. Auda then considers what the "Land of Peace" should look like. Should it be predominantly Muslim? Should it have a Muslim ruler? Should its legal system be Shari'a compliant? Is it enough for Muslims to be able to practice their religion?

He arrives at the conclusion that the proper way of viewing things is to see "Land of Islam" as synonymous with "Land of Justice". This has its basis in the realization that the purpose of "Islamic Law" is to ensure justice and well-being. That includes Muslims' right to perform "public acts of worship" -- among other things, the construction of mosques, including minarets.

There, finally, is a great explanation of why many "Western" countries are better developed and why Muslims find it easier to practice their religion there. The legal systems of many so-called "Islamic countries" are not up to the task of actually achieving justice.

I am willing to draw the conclusion that it absolutely does not matter whether the laws were derived from the Noble Qur'an or not: what matters is whether they achieve justice (or Shari'a). Muslims can well approximate the "divine way of living" in a secular democracy. In democracies, too, humans make errors when deriving and judging law. They introduce bugs. It seems the legal systems of "Islamic countries" are still buggy.

Democracy is not a guarantee for justice. In Switzerland, a not-so-nice bug got introduced: at the time of this writing, building minarets is still forbidden. People voted on this on 2009-11-29, after a big smear campaign by the SVP. On the other hand, the constitution still claims that everybody is free to practice their religion, and that everybody is equal before the law. Am I the only one who sees a contradiction here?

Barack Obama supported building the community center (including a mosque) near Ground Zero, on the basis that Muslims have the right to practice their religion.