As a side project in late 2018 I started to produce ebooks for Standard Ebooks. I had been wanting to broaden my knowledge of epubs so I went casting about the internet for some good starting places. And I stumbled across this project:

Standard Ebooks takes ebooks from sources like Project Gutenberg, formats and typesets them using a carefully designed and professional-grade style manual, fully proofreads and corrects them, and then builds them to create a new edition that takes advantage of state-of-the-art ereader and browser technology.

It sounded perfect. The use of current semantics and web standards in their construction also makes the ebooks more accessible, which was something I had been looking into as well. Volunteers pick a book project, and after the base code is reworked the text is modernized, proofread again and issued on the Standard Ebooks website in multiple formats. The whole system is also set up to allow maintenance after publication, with fixes and updates by both the original producer and readers at large using GitHub. If you are thinking of downloading an ebook from Gutenberg, I would absolutely recommend checking Standard Ebooks first to see if it has been worked on. It’s a much better choice.

So what do you do?

Well, first off you need to subscribe to their Google Groups mailing list. Then you pick a book that is out of copyright in the U.S. and propose it to the group. They prefer that the first project be short (~40,000 words) to encourage you to finish it rather than getting bogged down and abandoning the book. I get the sense that this happens a lot. Once the proposed title is approved, you head over to their website and follow their handy step-by-step guide. Step one is downloading their tools. These are a set of Python-based command line tools that take care of a lot of the technical bits. If you’ve never used the command line (Terminal on a Mac) it can be a bit intimidating, but if you are interested you really shouldn’t let that stop you. Downloading the tools can take a lot of time, so be patient.

The process

Essentially you follow these basic steps:

  1. Find the book on Gutenberg (or some other archive). Also locate online scans of the original text.
  2. Create a basic ebook template using the downloaded files (via the toolset).
  3. Clean up the files and make them conform to Standard Ebooks standards.
  4. Fix the typography (via the toolset).
  5. Check the typography against their thorough typography manual.
  6. Add semantics (again first via the toolset, then by using their semantics manual).
  7. Modernize spelling and punctuation.
  8. Find a cover (this is a difficult and time-consuming step because they insist you find a CC0 public domain image or something that was printed prior to 1922).
  9. Complete the ToC and add links to various pages.
  10. Finish off the metadata (usually just a matter of writing a synopsis and filling in some blanks).
  11. Proofread, proofread, proofread.
  12. Submit for approval (and inevitably revise based on things you’ve missed).

Interestingly enough

This is a project started by and mostly inhabited by bibliophilic computer geeks. They take a programmer’s approach to structure, methodology and problem solving, and rely on all sorts of computer tools like GitHub. The things I have learned about regexes (high-powered search-and-replace patterns) make me giggle in glee. I can’t say that, as a book designer, I always agree with them and some of their stricter choices, but the results speak for themselves. The main thing their approach brings is an easily updated and maintained ebook that suffers very little from the idiosyncratic problems I find in “professionally” designed ebooks. And their collaborative approach ensures that multiple contributions by multiple contributors can be managed swiftly and easily, something that almost never happens in the real publishing world.

My ebooks so far…

After the first couple of books I settled into doing mostly plays. It’s a form I have always enjoyed, a genre that I am really familiar with and the technical challenges make them much more interesting to work on. And they’re fairly short which works well with my short attention span.



If you’d like to see a current list of books I will try to keep the page over at current.

In conclusion

I plan to continue doing this as long as I have time. I am learning an incredible amount about ebooks, ebook structure, programming tools, CSS and HTML, art, literature, and even a bit about copyright and the open source community. I am trying to talk L into collaborating on a project with me (we are thinking William Carlos Williams’ pre-1923 poetry) where I will focus on the tech end and she can do the “boring” proofing and editorial. There might be an opportunity to work straight from a scanned original, bypassing the Gutenberg process altogether. That will make it much more challenging. And I will probably start adding some notes about my various processes and fixes to the site. After all, I did originally start it as a way to save my bits and bobs of computer experimentation for posterity. So if you start seeing things like:

Find stage direction in brackets: [maid dusts the mantlepiece]

    \[(.*?)(.*)\]

Replace with:

    <i epub:type="z3998:stage-direction">\u\1\2\.</i>

…you will know what it’s all about.
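For anyone who would rather script that kind of fix than run it in a text editor, here is a rough Python sketch of the same idea. It is my own adaptation rather than anything from the Standard Ebooks toolset: the function name `tag_stage_directions` is made up for illustration, the pattern is simplified to "anything between one pair of square brackets", and because Python's `re` module has no `\u` replacement escape, the capitalisation is done in a small helper function instead.

```python
import re

# Matches one bracketed stage direction, e.g. "[maid dusts the mantlepiece]".
# [^\]]+ stops at the first closing bracket, so two directions on the
# same line are treated as two separate matches.
STAGE_DIRECTION = re.compile(r"\[([^\]]+)\]")


def tag_stage_directions(text: str) -> str:
    """Wrap bracketed stage directions in a semantic <i> element (hypothetical helper)."""

    def wrap(match: re.Match) -> str:
        direction = match.group(1)
        # Capitalise the first character and add a closing period,
        # mirroring the \u and \. in the editor replacement string.
        direction = direction[0].upper() + direction[1:]
        return f'<i epub:type="z3998:stage-direction">{direction}.</i>'

    return STAGE_DIRECTION.sub(wrap, text)


print(tag_stage_directions("[maid dusts the mantlepiece]"))
# prints: <i epub:type="z3998:stage-direction">Maid dusts the mantlepiece.</i>
```

Like the editor version, this blindly adds a period, so a direction that already ends in punctuation would still need a manual check afterwards.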