My Distributed Proofreaders Journey

April 1, 2024

When I first discovered Distributed Proofreaders via a Google search, my quest was like that of many who were looking for an interesting and challenging activity during the pandemic. Turning public domain books into free e-books for Project Gutenberg – preserving history by proofreading one page at a time – seemed ideal. And one thing that particularly drew me to Distributed Proofreaders was that there was an active forum where like-minded folks could share knowledge and chat.

The Bookworm, Carl Spitzweg, c. 1850

I stalked that forum for a few days and noticed one particular forum thread was quite active – P3 Diehards. I said to myself, “I want to join that fun team.” Problem was, I was not a P3! P3 is the top of three levels of proofreaders, and the privilege must be earned.

And so my Distributed Proofreaders journey began, going through the registration process, taking the training quizzes, and starting at the lowest proofreading level, P1. I had my blinders on and was only interested in proofreading so I could qualify for level P3. Occasionally, on my breaks, I would browse the forum to see what everyone was talking about. There was always something complex that people were working through with help from others. It all sounded too bothersome and beyond my abilities at the time. So, back to proofing I would go.

And then one day it happened. A private message arrived in my inbox announcing my access to P3! I think it was maybe 30 seconds later I became a member of the P3 Diehards team. What a great group and a great goal – to save the old books from languishing in the P3 round for more than 100 days. Of course, there are other great teams at Distributed Proofreaders; this one just spoke to me. (I even wrote a poem about it!)

Fast forward two years, and a new interest was sparked to investigate some other parts of Distributed Proofreaders. There’s Formatting, Content Providing, Project Management, and Post-Processing. I decided I wanted to try them all!

Starting with formatting seemed like the logical first step. At first it’s difficult to understand which tags (such as for italics and poetry and so forth) go where, and more importantly why, but it is rewarding when the second level formatter, F2, confirms that my page was done correctly.

Next step: Project Manager (PM). A PM guides a book project through the rounds of proofreading and formatting and answers questions in the project discussion. I was prompted to this endeavor by a plea for more content to be provided for new members. However, I couldn’t yet create my own projects to manage, so I adopted a few from others for a while. With kindness and expert knowledge, my mentor, Fay Dunn, guided me through setting up my first project and several more after that. It is fun to watch my projects going through the rounds, some more popular than others, but they are all like “my kids.”

So that meant the next step needed to be learning Content Providing (CP). CPs select books for processing through Distributed Proofreaders, harvesting page images from online sources and converting them into editable text using optical character recognition. For this step, there were nearly a dozen software applications that first needed to be downloaded. And, of course, about half of them were problematic with my device for one reason or another. BUT, there’s a forum with lots of knowledgeable people from all over the world. They came to my rescue, and finally everything was working.

It was slow going at first, what with learning all this new software AND the CP steps. One of those steps is obtaining copyright clearance from Project Gutenberg, because everything we work on has to be in the public domain. I can’t describe the thrill when I see the clearance email from the copyright team approving a project. I’ve now CP’d 10 projects and was recently handed my graduation diploma by my wonderful mentor!

Post Processing (PP) was the last on the list of areas to tackle. PPers convert the proofread and formatted text into its final e-book format for Project Gutenberg. This turned out to be the most arduous process. There was more software to download and learn, as well as learning new concepts and new lingo used in the Guidelines. Luckily my PM/CP mentor felt brave enough to help me through it all. After a crash course in HTML and CSS and some peeking at finished projects, I somehow managed to crank out my first PP project recently. A good amount of time will need to lapse before I tackle another PP project, but I have so much respect for those who get these books onto the Project Gutenberg site.

The amount of intelligence in this Distributed Proofreaders pool of volunteers is mind-boggling. Whether it is solving difficult software issues, answering proofing questions about any language known to man, or devising detailed search functions, there’s always someone to provide the answer.

This journey has been one of the best experiences of my life. I conquered some complicated (for me) challenges. I almost gave up a couple of times, but my mentor kept encouraging and reassuring me. I’m so glad I persisted, as it gave me the opportunity to rub elbows with some brilliant, clever, and creative minds from around the world. It is a privilege to be among them.

This post was contributed by Susan E., a Distributed Proofreaders volunteer.


Post-Processing Fornander’s Hawaiian Antiquities

March 1, 2024

When searching for works to prepare as e-books at Distributed Proofreaders, I always try to find works that are still interesting today, add some diversity to Project Gutenberg’s collection, or are of significant cultural or historical importance.

Another criterion is that the works should be manageable by the volunteers here at Distributed Proofreaders, and in this, I like to explore the edges of what is possible. Each e-book on the site goes through multiple proofreading and formatting rounds, with volunteers carefully comparing the image of each page with the text the computer generated from it. Once all the pages have completed these steps, a post-processor carefully assembles them into an e-book.

Collections of folklore are always popular and interesting. They are timeless and offer an insight into the culture of a people. Over the years, I’ve added a couple of books with Hawaiian folklore from various authors, and, while digging deeper for more, I hit upon the mother-lode of many of these works: the Fornander Collection of Hawaiian Antiquities and Folk-Lore, a huge collection of material collected in the late 19th Century by Abraham Fornander, published between 1916 and 1920, in three large volumes, by the Bishop Museum Press in Honolulu.

Abraham Fornander was born in Sweden, on the island of Öland, on 4 November 1812, the son of a clergyman. He studied theology at the University of Uppsala, but dropped out and left Sweden to become a whaler. In 1838, he arrived in Hawaii. Here, he became a coffee planter, land-surveyor, and journalist. He also officially became a citizen of the (then still independent) Kingdom of Hawaii, and married Pinao Alanakapu, a Hawaiian chiefess. He started to promote public education and took up various official roles as inspector, governor, and judge. This allowed him to travel on the Hawaiian islands and collect a lot of information about Hawaiian mythology and the Hawaiian language. He used much of his collected materials to publish his Account of the Polynesian Race (a work I hope to tackle at some later date). After his death, he left a massive collection of notes and papers. These ended up in the Bernice P. Bishop Museum and ultimately were published, together with English translations, from 1916 to 1920. The first volume of Fornander’s collection is now available on Project Gutenberg (the following two volumes are still in progress at Distributed Proofreaders at the time of writing).

The volumes are bilingual, with the English translation on the left and Hawaiian original on the right. Since the Hawaiian language, as written at that time, used only standard letters and no diacritics, it is not that difficult for non-speakers to deal with. In fact, the Hawaiian alphabet is surprisingly short, with just 13 letters: five vowels, a e i o u (each with a long pronunciation and a short one, but here not distinguished); seven consonants, h k l m n p w; and the glottal stop (not shown in this text). Since all syllables in Hawaiian are a single consonant followed by a vowel or diphthong, to non-natives some words may appear long and repetitious, and in particular names can become pretty long, although there are also plenty of very short words to compensate.

Like many indigenous languages, Hawaiian is an endangered language. It was still widely spoken in the 19th Century, when the Hawaiian islands were an independent kingdom that maintained diplomatic relations with many countries. The Hawaiian Kingdom’s constitution was written in Hawaiian. Literacy was promoted and newspapers were regularly printed. However, through the machinations of American businessmen, the government of Queen Liliʻuokalani was overthrown in 1893, and after being run as a “Republic” for a short while, the territory was annexed by the United States in 1898. This led to the demise of the Hawaiian language. In 1896, English was made the sole official language, and the use of Hawaiian in schools was systematically suppressed. Only in the 1950’s did this trend slowly begin to reverse, with renewed interest in the language and indigenous culture, though Hawaii became a U.S. state in 1959. Hawaiian dictionaries were published, and a revival movement gained traction in the 1970’s, with schools once again teaching children the language. However, it is still spoken by only a small fraction of the current population of Hawaii.

Having Fornander’s collection easily accessible will be very valuable to learners of the language (even though the language used is probably archaic and the spelling differs a bit from modern Hawaiian) and to students of its folklore and history. The collection starts off, appropriately, with a mythological description of the discovery of the islands and the origins of the Hawaiian people. The first volume further includes, among many others, the popular story of Umi, a fifteenth-century chief or king, who usurped the throne from his older half-brother, then ruled for about 35 years and united the Hawaiian islands into a single kingdom.

Since today only about 24,000 speakers of Hawaiian remain, the hope of finding enough native speakers to help us out with this project was limited. We needed to ask non-Hawaiian-speaking volunteers to work on Hawaiian pages, even if they didn’t know a single word. Hawaiian is an Austronesian language, remotely related to languages such as Malay or Tagalog, so speakers of those might occasionally recognize a word, although it will often require some linguistic training to see the relationship (and that really is no help in proofreading those pages). Hawaiian is more closely related to Polynesian languages such as Tongan, Samoan, or Tahitian, and speakers of those languages can probably get some of the gist of the stories (but speakers of those languages are also not easily found).

So how to deal with such a massive and complex work?

Well, first, praise where praise is due: The many volunteers at Distributed Proofreaders dutifully ploughed through the Hawaiian pages and fixed a lot of errors left behind by the optical character recognition process (which turns scanned images into editable text). When I received the work to post-process, most of the hard work had already been done.

Still, post-processing a work like this is a considerable challenge. Post-processors have to create both text and HTML files for Project Gutenberg and make them both easily readable. First, I needed to untwine the English and Hawaiian text (which in the original book were on alternating pages), such that both the English and Hawaiian text became continuous texts, at least at the chapter level. To do this, I simply made two copies of the text file, then removed the English text from one copy and the Hawaiian text from the other. Then I recombined them chapter by chapter, so that the Hawaiian follows the corresponding English chapter.
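For readers who like to see such a step spelled out, here is a minimal Python sketch of the untwining idea. It is not the tooling actually used for this project: the function name `untwine`, the `(chapter, language, text)` tuples, and the language codes "en" and "haw" are all assumptions made for illustration.

```python
def untwine(pages):
    """Regroup alternating-language pages so that, per chapter, all the
    English text comes first and the Hawaiian text follows.

    `pages` is a hypothetical list of (chapter, language, text) tuples
    in book order, with language either "en" or "haw".
    """
    chapters = {}  # dicts keep insertion order, so chapter order is preserved
    for chapter, language, text in pages:
        parts = chapters.setdefault(chapter, {"en": [], "haw": []})
        parts[language].append(text)

    result = []
    for chapter, parts in chapters.items():
        result.append((chapter, "en", " ".join(parts["en"])))
        result.append((chapter, "haw", " ".join(parts["haw"])))
    return result
```

Fed a book whose pages alternate English and Hawaiian, this returns one continuous English block per chapter, each followed by its Hawaiian counterpart.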

Once the untwining was done, I started to add tags to demarcate chapter headings, poetry, tables, and footnotes, convert quotation marks to their proper curly shapes, etc., and deal with the issues the proofreaders noted. Then I came to the task of checking the entire text for remaining spelling issues, and that in a language I do not speak, without the help of a spelling-checker, and in an obsolete spelling.

Luckily, I’ve done this a few times before, and developed a few tools to help me make this easier. During my preparation, I tagged each fragment of text in my file with the language it is written in. This enables me to create word-lists, which I can inspect. Words that occur many times can be safely ignored, but those that are rare or unique may need some further inspection. Since I color-code by frequency, rare words jump out.
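The word-list idea can be sketched in a few lines of Python. This is only an illustration, not my actual tool: the function names, the `(language, text)` fragment format, and the rarity threshold are assumptions.

```python
import re
from collections import Counter, defaultdict

# One or more letters (any script); digits and underscores are excluded.
WORD = re.compile(r"[^\W\d_]+")


def word_lists(fragments):
    """Count word frequencies separately for each language.

    `fragments` is a hypothetical list of (language, text) pairs, one
    pair for each language-tagged piece of the file.
    """
    counts = defaultdict(Counter)
    for language, text in fragments:
        counts[language].update(w.lower() for w in WORD.findall(text))
    return counts


def rare_words(counter, max_count=1):
    """Return the words seen at most `max_count` times, sorted."""
    return sorted(w for w, n in counter.items() if n <= max_count)
```

Calling `rare_words` on the counter for one language gives the short list of candidates that deserve a closer look, while the frequent words can be skipped.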

Using the word-list, I can identify suspect words, but that doesn’t always help. Then I can turn to another tool and generate a KWIC (Keyword in Context) index. This allows me to see how each word is used, and, based on that, I can often decide how to deal with it.
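A bare-bones KWIC index is easy to sketch in Python. The version below is a simplified illustration rather than my own tool; the function name `kwic` and the `width` parameter are assumptions, and it works on one plain string without any language tagging or color-coding.

```python
import re


def kwic(text, keyword, width=30):
    """Return one line per occurrence of `keyword`, showing up to
    `width` characters of context on either side.

    Matching is case-insensitive and restricted to whole words.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}}[{match.group(0)}]{right}")
    return lines
```

Printing the returned lines one below the other puts every occurrence of the word in the same column, which makes inconsistent spellings or odd usages easy to spot.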

The illustration below shows how this works for the name Kekakapuomaluihi. At a glance, I can see this is used in Hawaiian and English. It is mentioned in the index (yellow background), pointing to the page where it can actually be found, and its meaning is explained in a footnote (pink background).

Finally, I wanted to align the text in parallel columns, such that the English and Hawaiian could be read side-by-side, as in the original. This is less straightforward than it sounds, because sometimes a paragraph on the left is the equivalent of two on the right, and sometimes paragraph boundaries do not match. To make this work, I give all paragraphs in one language a label, and give the matching paragraphs in the other language the same label. This way, my software knows which paragraphs to place next to each other.
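The labeling approach can be illustrated with a short Python sketch. It is a simplification under assumed names and data shapes (`align`, lists of `(label, text)` pairs), not the software I actually use.

```python
def align(english, hawaiian):
    """Pair up paragraphs that share a label.

    Each argument is a hypothetical list of (label, text) pairs in
    reading order. Several paragraphs may carry the same label, which
    is how one paragraph on the left can face two on the right.
    Returns rows of (label, english_paragraphs, hawaiian_paragraphs).
    """
    def group(paragraphs):
        grouped = {}  # insertion order keeps the labels in reading order
        for label, text in paragraphs:
            grouped.setdefault(label, []).append(text)
        return grouped

    en, haw = group(english), group(hawaiian)
    # Labels found only in the Hawaiian text are appended at the end.
    labels = list(en) + [label for label in haw if label not in en]
    return [(label, en.get(label, []), haw.get(label, [])) for label in labels]
```

Each returned row can then be rendered as one row of a two-column layout, with the English paragraphs in the left cell and the Hawaiian ones in the right.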

Having gone through all those steps, I was at last able to submit the work to Project Gutenberg. Now the first volume of Fornander’s monumental collection is freely available to all those interested in Hawaiian culture. At the time of writing, volume two is almost ready as well, and volume three is in the final formatting round at Distributed Proofreaders.

This post was contributed by Jeroen Hellingman, a Distributed Proofreaders volunteer who was the Project Manager and Post-Processor for the Fornander Collection.


Celebrating 47,000 Titles

December 20, 2023

This post celebrates the 47,000th unique title Distributed Proofreaders has posted to Project Gutenberg: the Betty Crocker Picture Cooky Book. Congratulations and thanks to all the Distributed Proofreaders and Project Gutenberg volunteers who worked on it!

Everybody loves cookies! So proclaims the fictional Betty Crocker in the introduction to the seminal 1948 booklet on baking them, the Betty Crocker Picture Cooky Book. Thanks to the volunteers at Distributed Proofreaders – especially the Cookbook Lovers team – you can re-create the delicious goodies that post-World War II American housewives made for the growing Baby Boom generation.

This early version of the Betty Crocker Picture Cooky Book is only a 46-page booklet, but it crams in “128 of the most popular tested recipes from her collection … with 70 ‘how-to-do’ tips, 50 success pointers and 175 illustrations.” All that for just 25 cents, if you sent it to Betty at General Mills, the Minnesota food conglomerate behind the icon. Later editions in full-size book form – particularly the classic 1963 edition – were greatly expanded to include hundreds of recipes. And, of course, some recipes were modernized based on changing tastes and eating habits. But the goal remained the same: to make it easy for busy homemakers to bring the comforting, nostalgic aromas and flavors of cookies to their families.

Snickerdoodles baked by Lisa Corcoran

This milestone is particularly meaningful because the booklet was contributed by a Distributed Proofreaders volunteer, Lisa Corcoran (Leebot). It belonged to her mother, and Lisa still bakes from it. She recalls:

“My mom collected lots of these promotional cookbooklets through the years, as well as recipes from various TV and cooking shows. She cooked and baked from scratch. My sister and I loved coming home from school to a batch of chocolate chip or peanut butter cookies, or Snickerdoodles (the recipe in the photo). At Christmas she made many types of cookies, many of which she’d assemble into gift boxes for neighbors and friends. The Berliner Kranser (little wreaths) recipe remains a favorite.”

Lisa prefers to substitute real butter for shortening in these recipes. She advises, “If you do substitute butter, make a couple of test cookies first as you may need to adjust the ratio of flour. If they spread out and are too buttery, work more flour into the dough until you get the right consistency.”

Distributed Proofreaders is proud to celebrate its 47,000th title with this very special cookbook. Many thanks to everyone who made it possible. And Happy Holidays to all!

This post was contributed by Linda Cantoni, a Distributed Proofreaders volunteer, with contributions from Lisa Corcoran (Leebot), leader of the Distributed Proofreaders Cookbook Lovers team.


The Life of a P3 Diehard

January 1, 2023

Note for those who don’t yet know the e-book creation workflow at Distributed Proofreaders: After a scanned book is turned into editable text, it goes through three rounds of proofreading. The third, P3, is the most challenging, as it requires the most expertise and the closest attention, so sometimes a project has to wait awhile until P3-qualified proofreaders can get to it. The P3 Diehards team has dedicated itself to rescuing projects that are languishing in that round.

I woke up this morning with a minor headache,
So the first line of business: There’s coffee to make!

The headache was due to some major proofreading,
But I wanted to help move some books to smooth reading.

I found my new passion at DP last year
‘Cause there’s so much to do plus there’s fellowship here.

The challenge was huge, but my spirit was keen,
And each day I leaned on my good friend: Caffeine.

Each day I made progress, but slow in my mind.
The goal seemed beyond reach; I felt so behind.

This goal was to help in the most needed place
Where projects sat languished, forgotten, misplaced.

But one thing that kept me on path to my goal
Was seeking that something that feeds mind and soul.

So, first was the hurdle of gaining the level
Where trust must be earned and to demonstrate mettle.

I thought I was hopeless to learn any more,
But my mentors worked wonders with guidance galore!

Then one day my inbox had news I had hoped for:
Clear access to P3; it made my heart soar!

Team Diehards is where I went skipping so quickly
To help with those projects abandoned and prickly.

Some projects are tricky or boring or fun,
And sharing with teammates is second to none.

The visions from Surgery of so many leeches
Are far from the thought of a bowl full of peaches!

The sad Roll of Honour brought tears to my eyes,
But the story of bravery and valor survives.

My headache is gone; I give thanks with “Amen.”
And tomorrow I can’t wait to do it again!

This poem was contributed by Susan E., a Distributed Proofreaders volunteer. Hot off the Press wishes all its readers a happy and book-filled New Year!


Creating E-Book Covers

April 1, 2022

Distributed Proofreaders volunteers work hard to make the e-books they contribute to Project Gutenberg as user-friendly as possible. Among the things we do to that end is creating e-book cover images to make it easy for readers to find e-books of interest to them.

The role and requirements

Book covers in the digital age have taken a different role. Where in the past covers and dust jackets served to protect and later also advertise the book, they now mainly serve to advertise the book and make it easy to quickly locate it on a computer or e-reader screen. With that changed role, the requirements for book covers have also changed.

In short, the role of a book cover in the digital age is to

  • Invite a potential reader to give it some attention.
  • Provide an easy-to-locate icon in e-book readers or computer screens, so it can be found quickly.
  • Provide a reasonably sized, readable short title and the author’s name, so people can ascertain they have selected the book they want.
  • Give some impression of the type of content to expect.

All the while considering that a digital cover is now often just the size of a postage stamp.

A short history

Historically, decorated book covers are a relatively new invention. Books started to be sold in neatly designed covers only by the end of the 19th Century, and in some countries even later. Book buyers were expected to provide their own cover and binding, as desired and fitting for their personal library. So the publisher just sold the book as a bound stack of pages with a nondescript paper cover. That is why old libraries often look very uniform, with all those similarly and often richly bound and decorated volumes. (Our 34,000th title contributed to Project Gutenberg was a manual of artistic bookbinding published in 1878.)

Since books are stored in bookcases or cabinets with only their spine visible, the publisher needed only to put identifying information, such as the title, on the spine. The cover could remain boringly neutral, or, as with some ancient bibles, heavily decorated, but there was no need to put a title on them.

Fortunately, many of the originals we work from at Project Gutenberg are late 19th- and early 20th-Century titles, which often do have nice book covers. However, even when the book we are digitizing does have a cover, it is the part of a book (after the spine) that is most likely to suffer from wear and tear, stained with ink and coffee, mutilated by repeated unprofessional repairs, and defaced by libraries who like to put stickers with shelf locations and bar-codes on them. They are also most exposed to sunlight and so end up discolored.

Even then, such covers were designed to be attractive when placed in the book shop’s window, on a table, or when pulled from a shelf by a prospective buyer, so the requirements for large-size titles and author names are quite different from those you’ll need on a postage-stamp-sized digital image.

Challenges

When dealing with book covers, we at Distributed Proofreaders face a number of challenges. It is our intention to reproduce the original book in its full glory, “the book, the whole book, and nothing but the book.” Of course, with the transition to a digital format, we will lose some of the artifacts of the paper medium, such as page headers and page numbers, although we often retain the latter as small notes in the margin. Similarly, book covers will have to be reinvented for our books reincarnated in their digital form.

When preparing a book for Project Gutenberg, we will address these challenges in different ways.

Locating a good quality cover

First of all, we prefer to use an image of the original cover, so if we have one, we can use that as a starting point. In that case, it often requires some digital restoration. But before we invest in the labor-intensive process of restoration, we’ll seek out alternatives. If we don’t have a good quality cover, but have some idea of what it looks like, our first step is an internet search. Surprisingly often, better-quality scans of the same cover can be found, and sometimes those can be used. We need to be sure to pick only scans of a truly matching cover (i.e., same edition and printing), both to avoid a copyright violation, and to maintain the integrity of the e-book edition we’re making. Covers tend to appear in far more variations than the book itself, even within a single print-run.

Digitally restoring a damaged cover

If our search fails to unearth a good-enough cover, we will fire up our photo-editing software to restore what we do have. My personal guideline in digital restoration is not to try to reconstruct an as-new cover (it would be nice if such a cover were still available), but only to remove mutilations like bar-codes and disfiguring damage, such as scratches and stains. Smaller aspects of wear and tear I will leave as is: it is not a shame to be old and look it. What I will also try to do is brighten up the colors, and restore color balance. Of course this involves a lot of guesswork, but again, if we can find alternative images on-line, even if tiny photographs, they can give us an indication of the original colors if our copy is particularly discolored.

Removing disfiguring stains from the cover of Van de Noordpool naar den Aequator
Improving the cover colors for Belgian Fairy Tales
Removing the bar code from the cover of The Mason Wasps

Adding titles and authors to original covers

As explained above, the original cover will often not mention the title or author at all. In such cases, to make it easier to recognize a book, we can decide to digitally add the title to the front (that is, if the original design leaves space for it, which it often does). When adding the title and author, it makes sense to use a typeface matching that of the spine (if known), or the title page. Sometimes we can also use the title from the title page directly, manipulating the color and appearance to blend in with the original cover design.

Adding the title and author to the cover of Myths of the Cherokee

Designing our own cover

Then we come to the point where we have no cover to start with at all. The book at hand is in a generic, unmarked cover, or we have none at all, for example when we work from a set of scans produced elsewhere. In that case, we will design a new cover. From here onward I will concentrate mostly on the way I do this, as other volunteers may have different procedures. It may be tempting to go all overboard and design something really fancy, but here I normally try to restrict myself and keep it functional.

One starting point I often use is the scanned cloth pattern of a book’s back to serve as a generic background. I derived a range of color variants from it. I will pick one color, depending on my mood and gut feeling of what would be appropriate for the book, and will add the original title, author, and year of original publication in a centered design. If the book itself includes a suitable illustration, often the frontispiece, I will use that. If not, I will slightly emboss a generic “PGDP” design on it, but won’t use artwork not present in the source, because of the copyright implications that might have. Balancing out the letters takes some puzzling with font-sizes, splitting lines, and letter-spacing, but normally, I am able to produce a reasonable new cover in some 15 to 30 minutes. Not perfect, probably not to everybody’s taste, but better than auto-generated.

DP-created covers for Narrative of a Five Years’ Expedition Against the Revolted Negroes of Surinam, Serbian Fairy Tales, and De Hogerveldt’s Oorspronkelijk Tooneelspel in 3 Bedrijven

I normally use serif typefaces, capital letters, and symmetrical design, because that was the standard in the era most of our books were produced. Asymmetric designs only started to come into vogue after the 1920’s, and thus are inconsistent with most books’ age. I still don’t feel the need to fully emulate an old style cover: I typically use somewhat brighter and larger letters, and prominently place the year of the original copy at the bottom of my design. This should immediately signal to the reader they are dealing with an old book in a new digital cover.

Some things that work less well

An alternative I regularly see is to use the title page as a replacement for a cover. I am not a big fan of that, because title pages are far more similar to each other and often black-and-white, so they lack distinctiveness. Besides that, they often include more detailed information, like the publisher’s name, author credentials, and such, given in a much smaller type. Imagine what it does to your ability to spot the book you’re looking for on a screen filled with postage-stamp-sized title pages in an e-reader.

Not all books in Project Gutenberg have book covers, so as a stop-gap measure, PG has a system to generate generic covers automatically. The results are not always satisfactory, because the software we use isn’t smart enough to understand what part of the title is most significant and to tweak letter sizes and spacing accordingly to obtain a pleasing result.

Finally, a little searching on some large commercial e-book platforms will reveal a range of newly designed covers for public domain books (the texts for which are often harvested from Project Gutenberg’s offerings in bulk), which range from boring to utterly hilarious: using inappropriate photographs on designs that make serious literary classics, even non-fiction, look worse than cliché Harlequin romances. Such things should not happen at Project Gutenberg, except when we keep the original pulp magazine cover that happens to be equally cringe-worthy, such as this:

Cover of the Dutch pulp magazine Lord Lister No. 8

This post was contributed by Jeroen Hellingman, a Distributed Proofreaders volunteer.


In Memoriam Stephen Hutcheson

March 1, 2022

With heavy but grateful hearts, the volunteers at Distributed Proofreaders bid farewell to our Beloved Emeritus Stephen Hutcheson (1956-2021), who uploaded his final book to Project Gutenberg on September 27, 2021, one day before he passed away.

Stephen joined DP in July 2004 under the user name “hutcheson” and ultimately became one of our most prolific contributors. Although he proofread and formatted over 75,000 pages, his primary roles were as a Content Provider, Project Manager, and Post-Processor for numerous projects that he shepherded from the beginning steps (copyright clearances, image scanning) to final upload to PG. He also graciously processed items from the collections of other volunteers, with a “kid in the candy store” glee over the latest find. (Anything pertaining to his beloved home state of Tennessee would get top priority!) He completed over 1,000 projects and was also active with Distributed Proofreaders Canada, completing around 200 titles in the Canadian public domain. One of his projects, French Painting of the 19th Century in the National Gallery of Art, was selected as Distributed Proofreaders’ 37,000th title posted to PG and was celebrated in this Hot off the Press blog post.

Stephen was the oldest and only boy of six children reared on a farm in Murfreesboro. His sister, Libby Smelser, recalls that his hay fever kept him indoors, reading voraciously, listening to classical music, playing solo chess. “He was very cerebral, very focused, with wide-ranging interests … his mind had so many tendrils. We sisters thought he was just terribly smart!” Stephen followed in his father’s footsteps, graduating from Middle Tennessee State University and becoming a computer programmer. He and wife Ruth were married for over twenty years, and had two children, Laura and David. Although separated, he and Ruth remained dear friends; she enjoyed accompanying him on his book scouting forays to secondhand shops.

Stephen spent years developing his own tools for post-processing DP projects, requiring a special set of proofreading and formatting methods. Volunteers who braved the learning curve of his “Hutcheson Wiki” guidelines were rewarded with a rich variety of topics that reflected his own eclectic interests: old buildings, “interesting places” (as he phrased it), anthropology, archaeology, linguistics, history of inventions and technology, arts and crafts, cookbooks, botany, U.S. history and geography/geology, mining/minerals, religious history and hymnology, classical music, ornithology and zoology, juvenile mystery/adventure series, and science fiction. He loved coming upon cross-references between books at Project Gutenberg, saying, “That’s the thing about a library: the bigger it gets, the more the books start talking to each other.”

Stephen processed many field guides for U.S. National and State Parks, monuments, nature parks, museums, and locations with historical importance, with a view to having an eBook guide available to any traveler with a smartphone. These were his favorite projects to work on, and he had a penchant for maps and atlases. He inherited a love of birding from his family, and contributed many books about flora and fauna.

Stephen participated in DP’s “Project Not Quite Nancy Drew,” featuring various juvenile series in which young people ran around “unsupervised and unchecked,” solving mysteries and having adventures. His contributions helped expand and even complete PG’s collection of series such as the Camp Fire Girls, Jean Craig, Judy Bolton, Motor Girls, Dorothy Dale, Go Ahead Boys, The Airship Boys, and many more. His sardonic sense of humor was evident in his project comments: “What to expect: Ghosts. Cemeteries. Midnight vigils. Ominous telegrams. Disguises. Trafficking in illegitimate rubber products. City kids lost in the woods on a snowy night. Most frightening of all, efficiency experts in the newsroom. Amnesiacs. And … I forget. But I’m sure our blundering but persistent detective figured it all out, and her father published everything in a special edition of the Star.”

Soon after being diagnosed with leukemia, he was hospitalized in December 2020 and remained in the hospital until his passing in September 2021, but he continued diligently working on DP projects from his hospital room. He often remarked that DP was what kept him sane, and he worked every day except when the chemotherapy affected his vision. Even when he was in the ICU on a breathing machine, he made the effort to connect to DP. He was determined to reach a personal milestone of 1,000 projects uploaded to PG, which he achieved with about 80 to spare in his final weeks. Ruth recalled,

“Stephen loved his work and his friendships at Distributed Proofreaders. This spilled over into his contacts with the hospital staff as they learned about DP. His ability to continue with DP kept him going throughout his long hospitalization…. I was so thankful he could continue his passion project until almost the last day. It brought him joy, fed his thirst for knowledge, and gave him goals to work toward even on the most difficult days. The hospital staff encouraged him, inquired daily about his projects, kept track of his book count on his patient whiteboard, and celebrated each book completed. After he reached 200 books [posted to PG] while in hospital, staff gave him a celebration party.”

The DP community can certainly relate to Ruth’s phrase “passion project.” Stephen’s passion and dedication are an inspiration. DP offered Stephen the perfect venue for his love of books, his insatiable curiosity, and his desire to stay productive until his very last day of life. In turn, he has left an enduring legacy of preserving lovely books across many genres for all the world to have at their fingertips – a treasure indeed. Rest in peace, Stephen. You were one of a kind.

This post was contributed by Lisa Corcoran (Leebot), a Distributed Proofreaders volunteer. Many thanks to Ruth Hutcheson and Libby Hutcheson Smelser for their valuable insight. Photos of Stephen courtesy of Ruth Hutcheson.


Buffalo Bill

May 1, 2021

Thanks to movie director Quentin Tarantino, most folks are familiar with the term “pulp fiction,” but the more common “dime novel” was used to describe everything from the pulp magazines starting around 1860 to the “penny dreadfuls” popular in the United Kingdom, featuring such characters as Sweeney Todd and Varney the Vampire. In the United States, fictional characters like Nick Carter were popular, but the stories about real-life American Wild West heroes like Buffalo Bill Cody really drove the genre.

Buffalo Bill was born in Iowa Territory, fought in the Civil War for the Union, was a U.S. Army scout during the Indian Wars, received the Congressional Medal of Honor and later became an entertainer, featured in his own Buffalo Bill Wild West show that toured the U.S. and Europe, giving command performances for Queen Victoria and the Pope. Even today, as a lasting tribute to his legacy, there’s an American football team named after him.

Mark Twain wrote a novel, A Horse’s Tale, about Buffalo Bill’s horse Soldier Boy, and E.E. Cummings penned a poem called “Buffalo Bill ‘s” (yes, there’s a space before the ‘s), but it was the countless dime novels written about Buffalo Bill that made a lasting impression, especially those by Colonel Prentiss Ingraham, almost as colorful a character as Buffalo Bill himself. Published in the early 20th Century by Street and Smith, renowned for their strategic re-use of material, the Buffalo Bill Border Stories have been making their appearance on Project Gutenberg thanks to the volunteers at Distributed Proofreaders. (See the list below for links to the e-books.)

Each novel features an adventure, or a series of related adventures, where our intrepid hero Bill manages to outwit the bad guys and save the helpless. In the books, Bill lives by a strict code, a set of rules of what is good and what is bad, of what must be done and what must never be done. As was common in those times, certain groups of people are portrayed as stereotypes, especially people of color and Native Americans, as well as people from other countries. For us today, such stereotypes are offensive, but they do serve to show how far we have come in terms of accepting the rich diversity of America.

In the novels, Bill is the ultimate hero, and he is usually accompanied by one or more “pards,” or sidekicks, who help him on his adventures. There is the Baron, a Prussian whose speech is difficult to decipher because it tries to mimic an exaggerated German accent. There’s also Nomad, an older Scout who nevertheless defers to the younger Bill for direction. And there’s Little Cayuse, the Paiute youngster who has yet to learn proper English. Bill, as the hero of course, always speaks in perfect English, never in any vernacular. Whether some, all or none of Bill’s adventures really happened will only be known to the Colonel, but the stories are “durned” fine to read.

In addition to the Buffalo Bill series, Project Manager David Edwards (De2164) also has other dime novel series and pulp magazines on their way to Project Gutenberg, including the many Nick Carter adventures as well as Frank Merriwell, the Frank Reade Library and the American Indian Weekly magazine. David has been sharing his own collection in order to preserve them before time takes its toll. Each project has to be scanned by hand and run through OCR software before it even makes it to Distributed Proofreaders, and recently David acquired thirty more titles that we can look forward to.

As the post-processor – a Distributed Proofreaders volunteer who stitches a final e-book together after other volunteers have proofread and formatted it – I love working on these projects. I did my first Buffalo Bill last year while still an apprentice post-processor, and to date I’ve sent ten to Project Gutenberg. Each project presents its own challenges, especially when there are advertising pages, which the publishers made frequent use of, since their business depended on quantity rather than quality. I hope that, along with David and the countless unsung heroes who are our volunteers at DP, including the wonderful Smooth Readers who faithfully read each project to catch stray errors, we will continue to provide the dime novels, a unique slice of literature, for many years to come.

This post was contributed by Susan L. Carr (Skeeter451), a Distributed Proofreaders volunteer.


Buffalo Bill Border Stories

Buffalo Bill, the Border King (No. 1)

Buffalo Bill’s Spy Trailer (No. 41)

Buffalo Bill’s Still Hunt (No. 44)

Buffalo Bill’s Weird Warning (No. 66)

Buffalo Bill’s Girl Pard (No. 77)

Buffalo Bill’s Ruse (No. 82)

Buffalo Bill’s Pursuit (No. 83)

Buffalo Bill’s Bold Play (No. 101)

Buffalo Bill, Peacemaker (No. 102)

Buffalo Bill’s Big Surprise (No. 103)

Buffalo Bill’s Boy Bugler (No. 128)

Buffalo Bill Entrapped (No. 137)

Buffalo Bill’s Best Bet (No. 171)

Buffalo Bill among the Sioux (No. 176)


A Walter Crane Bouquet

January 1, 2021

When I was a child, I had one of those big treasury-type books of nursery rhymes and fairy tales called Young Years, published in 1960. I say “I” had it, but really I had to share it with my younger brother, whose main interest in it was embellishing the text with abstract crayon art. It didn’t need his help, because it was already lavishly illustrated in a variety of styles. I loved the pictures as much as I loved the stories.

There was one particular story, “Beauty and the Beast,” whose illustrations were hauntingly gorgeous. The florid colors of Beauty’s rich gowns, of the Beast’s splendid 17th-Century coat and breeches, of his elegant chateau and his rose-filled garden, never failed to send me into a state of wonder. I kept the book into adulthood – I have it still, though it’s falling apart – just for those illustrations.

It wasn’t until I joined Distributed Proofreaders that I understood what a treasure that book really was. A Project Manager had come across three very pretty children’s books – The Baby’s Opera, The Baby’s Bouquêt, and The Baby’s Own Aesop – containing music notation. (You can read the lovely story of how he found them at an elderly friend’s home in this post.) Knowing that I was a music transcriber who could create audio files from the notation, he asked me if I’d like to work on them.

As soon as I saw the first one, I was immediately struck by the style of the illustrations – could it be the same artist who had made those marvelous “Beauty and the Beast” illustrations in my fairy-tale book? I pulled out my book – yes, it was none other than Walter Crane. In fact, that old children’s treasury of mine had pictures by pretty much every major children’s illustrator of the 19th and early 20th Centuries, including Kate Greenaway and Arthur Rackham. No wonder I loved it.

But Walter Crane’s illustrations were, and are, special to me. And I learned that his art is special to many DP volunteers who love working on the books he wrote and/or illustrated. In fact, one of his beautiful volumes, A Flower Wedding, was DP’s 33,000th title a few years ago. DP volunteers have contributed over 40 Walter Crane books to Project Gutenberg. Most are children’s books, but there are also works designed for grownups with vividly colored illustrations, like Flowers from Shakespeare’s Garden, posted to Project Gutenberg just last week. Crane also wrote and illustrated his own poetry as well as nonfiction works on art and design, and he decorated the work of other authors. You can even color your own Walter Crane creation with Walter Crane’s Painting Book.

See below for links to more of the wonderful world of Walter Crane, thanks to the volunteers at Distributed Proofreaders and Project Gutenberg.

This post was contributed by Linda Cantoni, a Distributed Proofreaders volunteer. Hot off the Press wishes all its readers a very Happy New Year!

Walter Crane Books at Project Gutenberg

For Children

The Absurd ABC
An Alphabet of Old Friends
The Baby’s Bouquêt
The Baby’s Opera
The Baby’s Own Aesop
The Buckle My Shoe Picture Book
Carrots (by Mrs. Molesworth)
A Christmas Posy (by Mrs. Molesworth)
The Cuckoo Clock (by Mrs. Molesworth)
Don Quixote of the Mancha (by Judge Parry)
A Flower Wedding
The Frog Prince and Other Stories
Goody Two Shoes
Grandmother Dear (by Mrs. Molesworth)
King Arthur’s Knights (by Henry Gilbert)
Little Miss Peggy (by Mrs. Molesworth)
Mother Goose’s Nursery Rhymes (with other illustrators)
Mother Hubbard, Her Picture Book
The Necklace of Princess Fiorimonde (by Mary de Morgan)
Princess Belle-Etoile
The Rectory Children (by Mrs. Molesworth)
The Sleeping Beauty Picture Book
The Song of Sixpence Picture Book
The Tapestry Room (by Mrs. Molesworth)
“Us,” an Old-Fashioned Story (by Mrs. Molesworth)
The Vision of Dante (by Elizabeth Harrison)
Walter Crane’s Painting Book
A Winter Nosegay
A Wonder Book for Girls & Boys (by Nathaniel Hawthorne)

Poetry

A Floral Fantasy in an Old English Garden
Renascence: A Book of Verse
Queen Summer

Nonfiction

The Bases of Design
Ideals in Art
India Impressions
Line and Form
Of the Decorative Illustration of Books Old and New
William Morris to Whistler

Other

Eight Illustrations to Shakespeare’s Tempest
Flowers from Shakespeare’s Garden
A Masque of Days (from essays by Charles Lamb)
The New Forest, Its History and Its Scenery (by John R. Wise)
The Shepheard’s Calender (by Edmund Spenser)


Harrods for Everything

August 1, 2020

Harrods for Everything is the apt title of a huge 1,525-page catalogue from about 1912, produced by the famous London department store which is still in existence today.

Some idea of the vast quantity of items that Harrods stocked or had available to order can be taken from the general index, which runs for 68 pages, five columns to a page. The catalogue illustrates over 15,000 products. While many customers would visit the store, or have their provisions delivered by Harrods’s own fleet of vans that covered London and the outlying suburbs, Harrods had also perfected shopping by telephone – something we think of now as a modern invention. As they put it in their advert (on p. 1524 of the catalogue):

– RING UP –
“WESTERN ONE”
FOR ANYTHING
AT ANY TIME
DAY OR NIGHT

80 MAIN LINES

While you could buy all the sorts of things you would expect in a modern large department store, because of the time period you will find some things that are uncommon in most households now, like churns for making butter.

The Victorians and Edwardians had a penchant for cures for this, that, and the other, with various powders or preparations sold for all manner of ailments, and you could buy things like chloroform or throat pastilles in dozens of varieties, even those containing cocaine!

While most domestic appliances still required manual application, the use of electricity was starting to become more widespread. From a company called Ferranti you could buy electric stoves, fires, and irons. But the high cost of electricity (4d a unit, equivalent to £2/US$2.50 in today’s money) meant the larger houses, hotels, offices, and public buildings often had their own generators. Harrods could supply and fit these, of course.

Looking through the provisions department lists, you will notice examples from brands such as Cadbury’s and Crosse & Blackwell’s (British) or Lindt (Swiss), and you could buy American chewing gum or Californian peaches. But some items stand out from a bygone age like turtle soup, and I was surprised to find okra and pineapples on offer.

To keep to their motto “Harrods for Everything,” you could also hire bands or musicians, plus tents or marquees for outdoor gatherings. You could rent steam, electric, or petrol launches to go down a river, or, if you set your sights further afield, there were “exploring, scientific and shooting expeditions … completely equipped and provisioned for any part of the world.”

Putting Harrods for Everything through Distributed Proofreaders was a mammoth and long-running task, which started sometime in early 2007 with me scanning the original to produce a text that other DP volunteers could work on. While the books we work on sometimes have a few pages of advertisements, this project was ALL advertisements. Pages were split into three to five parts to make proofreading and checking easier. Three rounds of proofreading started in September 2007, and the project did not finish the first formatting round (F1) until March 2010. Fortunately, those volunteers who normally do second-round formatting (F2) were spared Harrods for Everything, as it really needed one person working on it (myself) to achieve a consistent format.

As the assigned post-processor, I worked behind the scenes from 2010 to 2014 preparing the 15,000+ illustrations, but there were long gaps when other commitments prevented me from working on it. I began officially post-processing the text in 2014, but again with many gaps in working on it. In October 2019 it went out for smooth reading (SR), a round in which DP volunteers read through the project as if for pleasure in order to spot remaining errors. It was finally released to Project Gutenberg on the 1st May 2020. Sincere thanks to all who worked on it!

This post was contributed by Eric Hutton, a Distributed Proofreaders volunteer.


Working on Grote’s History of Greece

July 1, 2020

George Grote: You are not buried at Westminster Abbey for nothing! This thought summarizes my admiration for George Grote and his lifelong achievement, History of Greece, in twelve volumes, now complete at last at Project Gutenberg.

This History is a perfect example of the sound scholarship coming from 19th-Century English universities. But the author was not a scholar. He was not even a university graduate. He was a banker. His parents were rich enough to have him schooled in an upper-class secondary school where he became enamored with ancient Greek and ancient Greece. But his father did not allow him to enter a university to complete his education. George was needed at the family banking business in the City of London, and a good banker he became. His love for Greece was developed as a hobby, along with his taste for languages, philosophy and politics (radical politics, not usual in a banker).

Dissatisfied with the available accounts of Greek history in English, he began in 1822 to write his own in his spare time. Twenty-four years later, he decided at last to abandon his banking activities to focus on finishing this History which had developed into twelve volumes. It was published over a ten-year period, from 1846 to 1856.

Grote’s History at Distributed Proofreaders

Five years ago, I stumbled on this magnificent work at Distributed Proofreaders (DP). It was half-abandoned. A prolific Project Manager (PM) had prepared most of the volumes of this work starting from page scans of a somewhat simplified American edition without maps and side-notes. Some volumes were already proofread, others were in progress, others were not even begun, and one volume was missing. The PM had apparently left DP, so there was no one to keep an eye on the project’s progress.

I was not then a PM myself, only a post-processor (PPer, the person who assembles and finalizes a book after it has been proofread and formatted) who was looking for something exciting to post-process. Volume 9 of Grote’s History had just been given up by another PPer because of the huge number of quotations in ancient Greek. I had studied (and forgotten) some ancient Greek at Madrid University in a prior reincarnation, and I foolishly decided to have a try at this rejected volume.

Oh, my! The English text was interesting, but the amount of Greek was indeed daunting. Not willing to give up on this task, I began to invade an alien territory full of traps. There were 10,503 footnotes in the 12 volumes, an average of 875 footnotes per volume. About 40% of these footnotes included ancient Greek – not a word or two, but full paragraphs. And some 10% of the footnotes consisted solely of ancient Greek text. How was I going to handle all this?

The Greek challenge

But fortunately at DP you are never alone: unexpected resources appear when needed. DP resident gurus in Greek philology made me aware of the Perseus Digital Library, a website where most of the ancient classical texts in Latin and Greek are found in native and translated versions. For the most part it was a matter of finding the quoted Greek text and copy-pasting it into the project. But finding the quotation was not easy: a fair number of the references were not accurate or were simply missing. It was a matter of reading lots of Greek texts to locate the quotation or, if not found, to type in the Greek quotation myself. Later, I learned to perform searches in Greek, something rather difficult to do at Perseus.

When the reference was found at Perseus, it appeared as modern scholarly conventions require for ancient Greek. But in Grote’s volumes, quoted text was rendered according to the 19th-Century orthographical style, which had to be preserved, so some retyping was always needed: for instance, to incorporate middle dots instead of semicolons (ἀνθρώπων· versus ἀνθρώπων;), breathing marks over the rhos (as in παῤῥησία), or at least to change vertical modern Greek acute accents to slanted ancient Greek ones, as DP experts recommended.
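For readers curious about what that retyping amounts to at the character level, here is a minimal Python sketch of two of the changes: swapping the modern "tonos" accented vowels for their polytonic "oxia" counterparts, and adding breathing marks to a doubled rho. It is only an illustration under my own assumptions, not the tool or method actually used on the project; the function name `to_old_style` and the mapping table are hypothetical, only lowercase vowels are covered, and the middle dot is left alone because deciding between a Greek question mark and a middle dot needs a human eye on the printed page.

```python
# Hypothetical helper for illustration only; not DP's or Perseus's tooling.

# Modern (monotonic) accented vowels mapped to the polytonic acute ("oxia")
# code points in the Unicode Greek Extended block. Lowercase only.
TONOS_TO_OXIA = {
    "\u03ac": "\u1f71",  # alpha with acute
    "\u03ad": "\u1f73",  # epsilon with acute
    "\u03ae": "\u1f75",  # eta with acute
    "\u03af": "\u1f77",  # iota with acute
    "\u03cc": "\u1f79",  # omicron with acute
    "\u03cd": "\u1f7b",  # upsilon with acute
    "\u03ce": "\u1f7d",  # omega with acute
}
_ACCENT_TABLE = str.maketrans(TONOS_TO_OXIA)

RHO = "\u03c1"          # plain rho
RHO_SMOOTH = "\u1fe4"   # rho with smooth breathing (psili)
RHO_ROUGH = "\u1fe5"    # rho with rough breathing (dasia)


def to_old_style(text: str) -> str:
    """Apply two 19th-century conventions to a Greek string.

    1. Replace tonos vowels with oxia vowels.
    2. Mark a doubled rho with smooth + rough breathing.
    """
    text = text.translate(_ACCENT_TABLE)
    return text.replace(RHO + RHO, RHO_SMOOTH + RHO_ROUGH)


if __name__ == "__main__":
    # A modern-style spelling with plain double rho and a tonos on the iota.
    modern = "\u03c0\u03b1\u03c1\u03c1\u03b7\u03c3\u03af\u03b1"
    print(to_old_style(modern))
```

One caution if you try something like this: Unicode treats the oxia vowels as canonically equivalent to the tonos ones, so running NFC normalization on the result would quietly turn the accents back into the modern form.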

Moreover, Grote had the habit of retouching the original quoted text without warning, and this retouching also had to be preserved. But typing or retyping Greek is hard: my trials with the Greek keyboard in Windows were disappointing. Fortunately, one of our experts directed me to a simple HTML page (with lots of JavaScript underneath) where it was easy to type Latin characters in order to get Greek output, and then cut-and-paste this Greek into your file.

One of my tricks when I feel insecure during post-processing is to have at hand a paper copy of the book I am working on. This is invaluable to check errors and typos or to re-scan illustrations. Through eBay, I was fortunate enough to find, at an affordable price, a complete set of the twelve volumes of Grote’s History, published in London in 1883 but printed in Leipzig, where printing houses were famous for producing classical texts devoid of typos (not so in this case, as it turned out, but still better printed than the American edition I was working on).

Every bit of Greek text was checked with this later edition, which brought a second opinion into the checking process. It was also invaluable to check some other modern language misprints. Grote was very fond of quoting in original languages, and he included Latin, French, German, Italian, and Spanish excerpts, sometimes lengthy and always in footnotes.

Finishing the full set

Well, I discovered I was able to accomplish the PP task of this first volume. After finding a kind PM for the half-baked remaining volumes, and scans for the missing one, I committed myself to finishing the other eleven volumes, which took five years.

Finding, checking, and typing all the Greek in a volume needed more than two months: it was a tiring task and had to be alternated with working on other things to be bearable. At least another month was needed to perform the rest of the PP work and then another month for smooth-reading the outcome. Smooth-reading proved to be essential (it always is!): a fair number of mistakes which had not been detected in the DP proofreading rounds showed up at this stage, on top of my own mistakes in handling the Greek and other languages in the text.

I was fortunate enough to have had very competent smooth-readers, not only for the Greek text (I believe that accurately checking lots of Greek text worked on by another person ought to bring you directly to heaven) but also for the English main part, finding out, for instance, that Acharnians and Phokæns are suspect words (the correct forms are Acharnanians and Phokæans) and other similar things of which I was not aware.

What Grote, or perhaps his publishers and printers, was somewhat lacking in was accuracy in citing authors, titles, and editions. Fortunately, in Internet times, it is possible, with some patience, to find a digital copy of almost every book cited, view its title page, and correct the names and references as originally printed.

DP’s added value

Now that all 12 volumes of Grote’s masterwork are available at Project Gutenberg, it is time to remember that a transcription like this would have been almost impossible to achieve outside of Distributed Proofreaders. A vast array of DP volunteers contributed their talents and efforts to this project. Lots of people have painstakingly checked, proofed, transliterated, formatted, and distilled their wisdom in the associated forum for each volume, with the constant help of DP administrators, project facilitators, and other DP roles.

It is wholly unfair that those tasks like post-processing that are not distributed absorb so great a part of the final credit for a DP project. The undistributed tasks are pointless without the distributed ones, which are the bulk and the force of DP’s contributing model. The truth is that these 12 volumes are an achievement of DP as a whole, of the DP model of distributed work, of the DP way of building and maintaining consensus among its members. I bow and take my hat off to all of them.

This post was contributed by rpajares, a Distributed Proofreaders volunteer.