I am one of the thousands of volunteers at Distributed Proofreaders. We’re Distributed because we’re located all over the globe, and we’re Proofreaders because we read text looking for errors. We turn out-of-copyright printed books into eBooks, which have selectable, searchable text and are also suitable for text-to-speech software, and then make those eBooks available to all, for free, via Project Gutenberg.
Once we have a scanned image of a page from a printed book, we run Optical Character Recognition (OCR) software on it to turn the image of text into actual editable text. OCR accuracy is good, but it still tends to leave many mistakes (what we call “scannos”) in the resulting text. We then verify the OCR’s results in multiple passes.
To keep the quality of the finished eBooks high, we aim for consistent results from our many different volunteers. We achieve this by following a set of Proofreading Guidelines, which explain what to change and how to do it.
And to help people familiarize themselves with the Guidelines, we have a set of Proofreading Quizzes and Tutorials. These act as an instructional aid for people to learn what to do and also as an ongoing refresher course, as it is strongly recommended that all volunteers redo the Proofreading Quizzes every six months or so.
The Proofreading Quizzes start with the basics and gradually introduce more and more elements, covering what to do with things found in easier books through to quite hard and challenging books. Each quiz is accompanied by a brief tutorial which explains everything one needs to know to complete the quiz.
Part of such a scanned image of a page from a printed book might look like this:
and the OCR software may have generated for it the text:
We then compare that OCR-generated text with the scanned image of the printed page and correct any mistakes the OCR made, so that the text matches the image:
It’s very much like those spot-the-differences types of games/puzzles. Whilst Proofreading we ignore things like italics and just verify the text has the correct characters. Layout and style issues, such as italics, are dealt with in later Formatting rounds of our process.
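For readers who like to see the spot-the-differences idea in code, here is a minimal sketch, not a tool that Distributed Proofreaders uses. It assumes we already have both the raw OCR text and a corrected version, and uses Python's standard `difflib` module to list the words that differ. The function name `find_scannos` and the sample sentence are made up for illustration.

```python
import difflib


def find_scannos(ocr_text: str, corrected_text: str) -> list[tuple[str, str]]:
    """Return (ocr_words, corrected_words) pairs wherever the two texts differ.

    Hypothetical helper for illustration only: it compares word by word,
    so it reports which words changed, not which characters.
    """
    ocr_words = ocr_text.split()
    good_words = corrected_text.split()
    matcher = difflib.SequenceMatcher(a=ocr_words, b=good_words, autojunk=False)

    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # "replace", "delete", or "insert"
            differences.append(
                (" ".join(ocr_words[i1:i2]), " ".join(good_words[j1:j2]))
            )
    return differences


# Invented sample text containing three typical scannos.
ocr = "He sat arid read the modem book by tbe fire."
fixed = "He sat and read the modern book by the fire."

for wrong, right in find_scannos(ocr, fixed):
    print(f"{wrong!r} -> {right!r}")
# 'arid' -> 'and'
# 'modem' -> 'modern'
# 'tbe' -> 'the'
```

In real proofreading there is no corrected text to compare against. The volunteer's eyes do the comparing, with the page image as the reference, which is exactly why attention to detail matters.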
The quizzes let volunteers try their hand at proofreading as they work through the tutorials. The answers are provided online automatically, so you don’t have to wait for feedback.
Here’s a little quiz to start you off:
- Do you have an eye for detail?
- Do you like those spot-the-differences games?
- Do you like learning new things and facing new challenges?
If you have answered yes to these questions, you may enjoy being a Proofreader at Distributed Proofreaders. Try the Proofreading Quizzes and find out!
This post was contributed by FallenArchangel, a DP volunteer.