The Proofreading Quizzes

February 1, 2019

I am one of the thousands of volunteers at Distributed Proofreaders. We’re Distributed because we’re located in different places all over the globe and we’re Proofreaders because we read text looking for errors. We turn out-of-copyright printed books into eBooks, which have selectable/searchable text and which are also suitable for text-to-speech software, and then make those eBooks available to all, for free, via Project Gutenberg.

Once we have a scanned image of a page from a printed book, we run Optical Character Recognition (OCR) software on it to turn the image of text into actual editable text. OCR accuracy is good, but it still tends to leave many mistakes (what we call “scannos”) in the created text. We then, in multiple passes, verify the OCR’s results.

To keep the finished eBooks at a high quality, we aim for a consistent result from all the many different volunteers. This is achieved by following a set of Proofreading Guidelines which explain what to change and how to do it.

And to help people familiarize themselves with the Guidelines, we have a set of Proofreading Quizzes and Tutorials. These act as an instructional aid for people to learn what to do and also as an ongoing refresher course, as it is strongly recommended that all volunteers redo the Proofreading Quizzes every six months or so.

The Proofreading Quizzes start with the basics and gradually introduce more and more elements, covering what to do with things found in easier books through to quite hard and challenging books. Each quiz is accompanied by a brief tutorial which explains everything one needs to know to complete the quiz.

Part of such a scanned image of a page from a printed book might look like this:

[Image: part of a scanned page used in the quiz]

and the OCR software may have generated for it the text:

[Image: the raw OCR text for that page]

We then compare that OCR-generated text with the scanned image of the printed page and correct any mistakes the OCR made, so that the text matches the image:

[Image: the corrected text]

It’s very much like those spot-the-differences types of games/puzzles. Whilst Proofreading we ignore things like italics and just verify the text has the correct characters. Layout and style issues, such as italics, are dealt with in later Formatting rounds of our process.

The quiz process lets volunteers actually try their hands at proofreading as they work through the quizzes and tutorials. And it provides the answers online in an automated way — you don’t have to wait for feedback.

Here’s a little quiz to start you off:

  • Do you pay attention to detail?
  • Do you like those spot-the-differences games?
  • Do you like learning new things and facing new challenges?

If you have answered yes to these questions, you may enjoy being a Proofreader at Distributed Proofreaders. Try the Proofreading Quizzes and find out!

This post was contributed by FallenArchangel, a DP volunteer.


Preserving the Past … For the Future

January 1, 2019

Preserving the Past … For the Future … One Dig at a Time

Looking forward to another day at the archaeology dig. Putting on the coffee and getting breakfast. Water containers to be filled with fresh water — it’s going to be HOT today, so need to take extra. Grabbing some food to throw into my pack along with the water. A trip to the barn to check on my animals — fresh water, everyone looks good. Throwing my pack into my vehicle and away I go!

Need to dig carefully — looks like someone broke a clay pot — all in pieces — and each piece needs to be carefully extracted from the soil. The pot will be reconstructed in the lab at a future time. Notes, notes, notes, never ending — this is the important stuff — keeping track of soil changes, artifacts found, any “stains” in the soil that may be the remains of poles holding up ancient structures. Here’s some rock debris — someone chipping away on a precious piece of rock to make a projectile point, scraper, or other implement. Each piece of rock must be collected and labeled carefully. Some charcoal here — an ancient fire pit, rock-lined — need to photograph and draw a rough sketch. Wonder what they were cooking: deer? rabbit? fish? Maybe some of the potsherds from the broken clay pot can be sent out for protein analysis.

One never knows what is going to be found at a dig — but each little bit tells the story of the past and must be carefully preserved for future generations.

I’m very dirty and very tired and mosquito-eaten — but it’s been a good day and I feel great!

Preserving the Past … For the Future … One Page at a Time

That’s what I did as an archaeologist volunteer — but it’s not so very different from what I do as a Distributed Proofreaders volunteer.

Getting up in the morning and turning on the computer before doing anything else. Putting on the coffee and grabbing some breakfast. Logging into Distributed Proofreaders.

What shall be read today? Sometimes science, sometimes travel, sometimes anthropology, sometimes just choosing something different that I never even considered reading. Every book is important — the 5-page books to the 1,000-page books. The religious books — books of poems — science books — fictional books — travel books — music books — medical books — all interesting and need to be carefully proofed.

Here’s a book on engineering — wonder what sorts of things engineers were working on way back then? Another on an African tribe — a culture different from mine — thinking and doing things according to their needs and wants — wonder what they would think of Western culture? And another book on ocean biology — maybe will read this one for a while. All those Latin names of shells and sea creatures — they require a reader’s full attention. Here’s another book on submarines — somewhat technical — think I’ll read this next. Some math formulae and engineering terms — wonder how submarines have changed from past times to today?

Never know what books will be in the queue to be proofed but every one is important, each book tells a story of the past and must be meticulously proofed, formatted and preserved for future generations.

My back hurts, I need more coffee, my eyes are glazing over — but it’s been a good day and I feel great!

This post was contributed by eyecrochet, a DP volunteer.

The DP Blog wishes all its readers a very happy and healthy New Year!


A Spell of Proofing

December 1, 2017

“I have some free time. I get to proof!” Proofing (as we call proofreading at Distributed Proofreaders) is relaxing. I get into a flow where time and place disappear and I am just in the page — in the zone.

“What shall I proof today? The project I have been chipping away at, a page at a time, has moved on. Oh, this project that I’ve been dipping into appears to be stuck in the round. What’s stopping it? Ah, it’s a page with a lot of Greek on it. I don’t think I can leave that page better than I found it. I’ll leave it for someone else.” Perhaps I’ll post about it in the Greek Team forum.

“Look, here’s a book someone proofed up to the Table of Contents (ToC).” I enjoy proofing ToCs because they often hold a few missed errors. “See — that page number is 33, not 38. It’s a bit obscure, but since the next entry is for page 35, it’s likely 33.” I’ll leave a note.

33[**38]

“Ooh look, it’s one of those old-fashioned detailed ToC entries that lists out subjects covered in the chapter separated by dashes. This line starts with a dash so the dash and the word following it need to move up to the prior line. The word is followed by a dash so that needs to move up too.” I change:

porches–rocking chairs–stoops
–steps–lazy conversation–sunset

to

porches–rocking chairs–stoops–steps–lazy
conversation–sunset

“The post-processor is going to have fun with that!”
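For readers who like to see a rule spelled out, the ToC change above can be sketched in a few lines of Python. This is only an illustration, not a DP tool: the function name `pull_up_leading_dashes` and the choice to treat "everything up to the first space" as the piece that moves up are my own assumptions, picked because they reproduce the example shown above.

```python
# Hyphen, en dash and em dash: any of these at the start of a line
# triggers the move. (Which dash characters to include is an assumption.)
DASHES = ("-", "\u2013", "\u2014")


def pull_up_leading_dashes(lines):
    """Move a dash-led first chunk of a line up onto the previous line.

    The chunk is everything up to the first space, so a leading dash,
    the word after it, and any dash-joined word that follows all move
    together.
    """
    result = []
    for line in lines:
        stripped = line.lstrip()
        if result and stripped.startswith(DASHES):
            # Split the dash-led chunk from the rest of the line.
            head, _, rest = stripped.partition(" ")
            result[-1] = result[-1].rstrip() + head
            if rest:
                result.append(rest)
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    toc_entry = [
        "porches\u2013rocking chairs\u2013stoops",
        "\u2013steps\u2013lazy conversation\u2013sunset",
    ]
    for fixed_line in pull_up_leading_dashes(toc_entry):
        print(fixed_line)
```

Run on the two lines from the ToC entry, this gives the same two lines I ended up with by hand. It only handles a dash at the start of a line; a dash left dangling at the end of a line would need its own check.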

I’m at the bottom of the page. Let me hit WordCheck (DP’s version of spellcheck). “Hunh. I didn’t notice ‘explain’ was mis-typeset ‘explarn’. I’d better exit and add a note.”

explarn[**explain]

I return to WordCheck. “Looks good.” Save and close.

“I’ve wrapped up the ToC and Illustrations pages. I’m not really interested in the content of this project. What else is available?”

“Oh, I see a novel, a Western. That should have different types of errors to seek out and find.”

I open a page. “Ugh — dialect. I’ll do just this page then find something else.” But dialect means dialogue. Dialogue often means quotation marks misplaced in the text — often mis-spaced ones or ones attached to the speaker instead of the conversation. “Yep, there’s one.”

he said,” Bring that thar hoss over hyar.”

I change that to:

he said, “Bring that thar hoss over hyar.”
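The kind of check I do by eye here can also be sketched in Python. This is a rough illustration under my own assumptions, not anything DP provides: the name `find_suspect_quotes` is made up, and the heuristic is simply "a quote mark that hugs the word before it, at a point where no quotation is open, is probably attached to the wrong side."

```python
# Straight and curly double quotes are all treated alike here.
QUOTES = '"\u201c\u201d'


def find_suspect_quotes(line):
    """Return positions of quote marks that hug the preceding text
    even though no quotation is open at that point in the line."""
    suspects = []
    inside = False  # True while we are between an opening and a closing quote
    for i, ch in enumerate(line):
        if ch not in QUOTES:
            continue
        before = line[i - 1] if i > 0 else " "
        after = line[i + 1] if i + 1 < len(line) else " "
        # "Hugging left" means text right before the mark and a space
        # (or the end of the line) right after it.
        hugs_left = not before.isspace() and after.isspace()
        if not inside and hugs_left:
            suspects.append(i)
        inside = not inside
    return suspects


if __name__ == "__main__":
    as_printed = "he said,\u201d Bring that thar hoss over hyar.\u201d"
    print(find_suspect_quotes(as_printed))
```

The sketch looks at one line at a time, so a quotation that carries over from the previous line would be flagged wrongly. It is a prompt to look closer, not a substitute for comparing against the page image.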

Novels, juveniles, and Westerns often seem to have the worst typesetting: missing or misplaced quotation marks, missing periods at the ends of sentences, misspellings. They’re laced with dialect that at times makes reading and understanding the intended word difficult at best.

Speaking of reading: There’s proofing and there’s reading. It really helps to do both to find errors — but not at the same time. “Oh, this is really interesting.” “I didn’t know that.” “What happens next?” Sliding from proofing to reading can mean my eyes gloss over errors, unconsciously fixing instances where a word is repeated, not noticing misplaced quotation marks, but still laser-focusing on typos, incorrect word usage and lack of continuity. Proofing to match letters and punctuation marks can mean I miss a typo because the letters match. These are all important errors to catch. Making separate reading passes and proofing passes while the page is open can help me find different kinds of errors. Muddling both into a single pass risks missing things.

“What? My free hour is up? How can that be? I just got started!”

This post was contributed by WebRover, a Distributed Proofreaders volunteer.


The Typesetters, the Proofreaders, and the Scribes

February 1, 2017

At Distributed Proofreaders, we are all volunteers. We are under no time pressure to proof a certain number of pages, lines or characters. When we check out a page, we can take our careful time to complete it.

We can choose a character-dense page of mind-numbing lists of soldiers’ names, ships’ crews, or index pages. We are free to select character-light pages of poetry, children’s tales or plays. Of course these come with their own challenges such as punctuation, dialogue with matching quotes or stage directions. We can pick technical manuals with footnotes, history with side notes, or science with Latin biology names. We can switch back and forth to chip away at a tedious book interspersed with pages from a comedy or travelogue.

Every so often though, I stop and think about the original typesetters.

They didn’t get to pick their subject material, their deadline or their quota. They worked upside-down and backwards. They didn’t get to sit in their own home in their chosen desk set-up, with armchair, large screen, laptop or other comforts. Though we find errors in the texts that they set, many books contain very few of these errors. When I pause between tedious pages, I wonder how they did it.

Beyond the paycheck, what motivated them to set type on the nth day of the nnth page of a book that consisted mostly of lists, or indices? Even for text that would be more interesting to the typesetter, the thought of them having to complete a certain number of pages in a given day to meet a printing deadline is just impressive.

I know many have jobs today that require repetitive activities. But how many are so detail-oriented, with no automation, and leave a permanent record of how attentive you were vs. how much you were thinking about lunch? Maybe it was easier to review and go back and fix errors than I picture it to be. Maybe they got so they could set type automatically and be able to think of other things or converse.

When I’m proofing a challenging page, I sometimes think of that person who put those letters together for that page. I realize my task is so much easier. If I want I can stop after that page and hope some other proofer will do a page or two before I pick up that project again. I can stop, eat dinner, and come back tomorrow to finish the page when I’m fresh.

I imagine a man standing at a workbench with his frames of letters and numbers and punctuation at one side, picking out the type one by one, hoping that the “I” box doesn’t contain a misplaced “l” or “1.” I see him possibly thinking about how much easier life is for him than it was for the medieval scribe. The scribe was working on a page for days, weeks, even months, one hand-drawn character at a time. I see the typesetter appreciating how much improved his own life is and how much more available his work makes books to his current readers. And I smile as I see him smile.

This post was contributed by WebRover, a DP volunteer.


Proofing with Maps

August 8, 2015

While proofing for Distributed Proofreaders, I often find myself opening up a mapping application to locate rivers, towns, buildings, forts, streets, etc. that are mentioned, described, or central to a project. Sometimes it’s to figure out where they are. Sometimes it’s to try to see what’s being described.

[Image: map]

For example, Early Western Travels, 1748-1846, Volume XXIII, describes some rock formations that the footnote identified as being in Dawson and Valley Counties, Montana. Using that information, I was able to view a photo of the rock formations. I’ve also found remote tiny towns that still exist in the American West — one even had a preserved historical district.

Florizel’s Folly (in progress at DP) led me to Brighton, England, and Yellowstone’s Living Geology: Earthquakes and Mountains (also in progress) to Old Faithful.

I posted in the DP forums about this and found another proofreader who was using mapping software to locate parks that were mentioned in old bird books as locations of certain birds. This person was interested in whether the parks have the same birds.

Of course, I look at maps because I love maps. So starting with a specific reference point from a book, I can get lost for half an hour or more exploring, envisioning, and virtually visiting. Anywhere. And how exciting when I get a chance to visit in person a site I’ve visited before via mapping software; for example, the Pony Express Statue in Sacramento Old Town.

If you haven’t tried this before, do! You may find yourself addicted.

This post was contributed by WebRover, a DP volunteer.

