How to Digitize a Million Books

“Ultimately, Clancy says, Google would like Book Search to give the same result as someone going to a library, looking in its stacks, and serendipitously finding a book that’s interesting or useful. One way to do this would be to link books to each other by categories and themes, he suggests. The task becomes more complicated, though, when linking works by Virginia Woolf, for instance, to criticisms of her work, works that inspired her, or authors who wrote during the same era. Designing algorithms that can effectively organize all of this new information, Clancy says, is ‘one of the grand challenges and will take many years.’”
“Reddy says CMU researchers are trying to tackle this challenge by using a ‘statistical approach’ to organizing the information. In this approach, Virginia Woolf’s stream-of-consciousness sentences, for example, would be analyzed by an algorithm that would find patterns based on sentence length, structure, and punctuation. This technique might find a work by James Joyce, one of Woolf’s influences — or that of an obscure author whose writings might otherwise never have been found.”
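
The "statistical approach" quoted above can be illustrated with a small sketch. This is not the CMU researchers' algorithm, which the excerpt does not describe in detail. It is a minimal example that assumes style can be summarized by average sentence length, the spread of sentence lengths, and punctuation rates, and that texts are compared by cosine similarity. The names `style_features`, `cosine_similarity`, and `most_similar`, and the choice of punctuation marks, are hypothetical.

```python
import math
import re

# Assumed set of punctuation marks to count; the source names no specific marks.
PUNCTUATION = ",;:-()!?"


def style_features(text):
    """Return a small vector of style statistics for a text."""
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    n_words = sum(lengths)
    if n_words == 0:
        raise ValueError("text contains no words")
    mean_len = n_words / len(lengths)
    variance = sum((n - mean_len) ** 2 for n in lengths) / len(lengths)
    features = [mean_len, math.sqrt(variance)]
    # Punctuation marks per 100 words, one entry per mark.
    features += [100 * text.count(ch) / n_words for ch in PUNCTUATION]
    return features


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, library):
    """Return the title in `library` whose style is closest to `query`."""
    q = style_features(query)
    return max(
        library,
        key=lambda title: cosine_similarity(q, style_features(library[title])),
    )


query = ("She walked on, thinking of the sea, of the light; and the waves, "
         "which rose and fell, seemed to her, as she went, like breathing.")
library = {
    "flowing": ("He wandered through the city, past the river, past the shops; "
                "and the voices, which came and went, followed him, as he "
                "moved, like music."),
    "terse": "He ran. He stopped. The door was shut. He knocked. Nobody came.",
}
print(most_similar(query, library))
```

With these invented sample passages, the long, comma-heavy query matches the "flowing" text and not the "terse" one. A real system would need far more features, feature scaling, and much longer samples; the sketch only shows how surface statistics could link one author to another with no shared keywords.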
