When I received my December issue of Genii (very good issue, Richard) I eagerly read Mr. Racherbaumer's column, On the Slant, as I always do. What a teaser! He mentions a fascinating approach to figuring out if Shakespeare's works were all written by Shakespeare or not:
For years, scholars have argued about the provenance of Shakespeare's works, and it was recently reported that "a team of researchers at Beth Israel Deaconess Medical Center believe they have settled the debate, using a new computer program..."
But you don't tell us how they settled the debate. What a sneaky way to encourage us to actually go out and look for knowledge on our own instead of having it spoon-fed to us.
After a few minutes of interaction with our favorite oracle, Google, I found an article titled "Information categorization approach to literary authorship disputes" by Yang, Peng, Yien, and Goldberger. The official PDF of the article may not be available to everyone. It looked like a subscription was needed, but my university IP address may have authorized me; I really don't know.
Two of the interesting findings in the paper: Edward III seems more likely to have been written by Christopher Marlowe, and The Two Noble Kinsmen seems to have been co-authored by John Fletcher and Shakespeare.
The authors of this paper also analyze a classic work of Chinese literature, The Dream of the Red Chamber, and a classic American work, The Federalist Papers.
As interesting as this work is, I don't think it clearly settles the debate. The authors have an indication of statistical similarity based on the words each author uses. It is fascinating, but can we find counterexamples? Places where it doesn't work? Can authors consciously take on different styles that look different under this approach, or are they unable to do so? I think I would enjoy doing some research on this.
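To make "statistical similarity based on words" a little more concrete, here is a toy sketch of one way such a comparison could work. This is my own simplification and not the method from the paper: the function names, the `top_n` cutoff, and the averaging scheme are all assumptions made for illustration. It ranks each text's most frequent words and measures how far the shared words move in rank between the two texts.

```python
import re
from collections import Counter


def word_ranks(text, top_n=50):
    """Map each of the top_n most frequent words to its rank (0 = most frequent)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))[:top_n]
    return {word: rank for rank, word in enumerate(ordered)}


def rank_distance(text_a, text_b, top_n=50):
    """Average absolute rank difference over words that both texts share.

    Hypothetical measure for illustration only: smaller means more similar.
    Returns infinity when the two texts share no ranked words.
    """
    ranks_a = word_ranks(text_a, top_n)
    ranks_b = word_ranks(text_b, top_n)
    shared = set(ranks_a) & set(ranks_b)
    if not shared:
        return float("inf")
    return sum(abs(ranks_a[w] - ranks_b[w]) for w in shared) / len(shared)
```

A text compared with itself gets a distance of zero, and the number grows as the word-usage habits diverge. On real plays you would want long samples and some thought about spelling variants, which is exactly where I suspect the counterexamples might hide.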