I used my first API!!!

Booksonix, the book management software Orca uses to manage their publications and workflow, released an API (Application Programming Interface) a few weeks ago, so I started playing with it to see if I could automate some of my work.

After much playing with Flask, Jinja and JSON files I made a thing of beauty. Just plug in the print ISBN and it delivers me the cover for download and all the metadata, formatted and everything, so I can just copy and paste it into the epub’s .opf file. That’s what an API does: it lets you get info out of a system without having to go through the regular interface so you can manipulate it the way you want. All the best websites have one: Facebook, YouTube, Twitter etc.
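For the curious, the last step of a tool like that might look something like the sketch below. It is only an illustration: the `record` dictionary and its field names are made up to stand in for whatever the API actually returns, and the function simply turns those values into Dublin Core lines ready to paste into an .opf file.

```python
from xml.sax.saxutils import escape


def opf_metadata(record):
    """Format a metadata record as Dublin Core lines for an .opf file.

    `record` and its keys (isbn, title, author, publisher, pub_date) are
    hypothetical; a real API response will use its own field names.
    """
    lines = [
        f'<dc:identifier id="pub-id">urn:isbn:{escape(record["isbn"])}</dc:identifier>',
        f'<dc:title>{escape(record["title"])}</dc:title>',
        f'<dc:creator>{escape(record["author"])}</dc:creator>',
        f'<dc:publisher>{escape(record["publisher"])}</dc:publisher>',
        f'<dc:date>{escape(record["pub_date"])}</dc:date>',
    ]
    return "\n".join(lines)


# Made-up sample record, for illustration only.
sample = {
    "isbn": "9780000000002",
    "title": "Salt & Cedar",
    "author": "A. N. Author",
    "publisher": "Example Press",
    "pub_date": "2024-01-01",
}
print(opf_metadata(sample))
```

Escaping matters here because titles with an ampersand would otherwise break the XML.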

Standard ebooks update

All good things…

A (final?) addition to the ongoing list of projects I have worked on for Standard Ebooks, as work there has ground to a halt. I did exactly one book last year and that was Orlando, another collab between L and me: I did the mechanical stuff and she did the editorial.

It’s not quite the end but I’ve been so busy doing books at Orca that I don’t have much free time (or energy) left to do volunteer work at Standard. My output had trickled to a stop, the Manual of Style had gone through multiple iterations without me keeping up on changes, and I was becoming less and less confident in my ability to make good decisions. As a result I decided to step down from the editorial position at the beginning of the year. A screenshot as a final memento…

Hopefully I will eventually free up some time and mental space and add some more books—I have a few plays I really want to add.

As usual the full (and up-to-date) list can be seen over at astart.ca/publishing/ebooks/.

Here’s what I’ve added since the last update:

 

As always I really encourage you to go take a look and enjoy some of the books. And consider contributing if that tickles your fancy. It’s not really that hard once you learn the basics. I even wrote a beginners’ guide to help out: Standard Ebooks Hints and Tricks.

Book(s) of the year 2023

Welcome to the end of the year. Or the beginning of the year. It’s been a lot of fun. Due to work commitments I have done exactly zero Standard ebooks this past year and thus all my reading has been in the modern era. I did add a new non-sci-fi series so there was a bit of branching out.

But without any further ado…
Here is 2023-doo-dee-doo:

January (10)

  • The Way to Glory David Drake (2005)
    Book 4 of Lt. Leary – ebook; reread
  • Whispers Under Ground Ben Aaronovitch (2012)
    Book 3 of Rivers of London – ebook;
  • Some Golden Harbour David Drake (2006)
    Book 5 of Lt. Leary – ebook; reread
  • The Wrong Stars Tim Pratt (2017)
    Book 1 of The Axiom – ebook;
  • Broken Homes Ben Aaronovitch (2014)
    Book 4 of Rivers of London – ebook;
  • Foxglove Summer Ben Aaronovitch (2015)
    Book 5 of Rivers of London – ebook;
  • When the Tide Rises David Drake (2008)
    Book 6 of Lt. Leary – ebook; reread
  • Risen Empire Scott Westerfeld (2003)
    Book 1 of The Succession Duology – ebook;
  • The Killing of Worlds Scott Westerfeld (2003)
    Book 2 of The Succession Duology – ebook;
  • In the Stormy Red Sky David Drake (2009)
    Book 7 of Lt. Leary – ebook; reread

February (8)

  • Babel R. F. Kuang (2022)
    – ebook;
  • Kushiel’s Dart Jacqueline Carey (2001)
    Book 1 of Kushiel’s Legacy – ebook; reread
  • Rubicon J.S. Dewes (2023)
    – ebook;
  • What Distant Deeps David Drake (2010)
    Book 8 of Lt. Leary – ebook;
  • The Hanging Tree Ben Aaronovitch (2017)
    Book 6 of Rivers of London – ebook;
  • The Road of Danger David Drake (2012)
    Book 9 of Lt. Leary – ebook; reread
  • The Sea Without a Shore David Drake (2014)
    Book 10 of Lt. Leary – ebook;
  • Lies Sleeping Ben Aaronovitch (2017)
    Book 7 of Rivers of London – ebook;

March (13)

  • Death’s Bright Day David Drake (2016)
    Book 11 of Lt. Leary – ebook;
  • Tales from the Folly Ben Aaronovitch (2020)
    Book 0 of Rivers of London – ebook;
  • Though Hell Should Bar the Way David Drake (2018)
    Book 12 of Lt. Leary – ebook;
  • When the Tiger Came Down the Mountain Nghi Vo (2020)
    Book 2 of The Singing Hills Cycle – ebook;
  • Rivers of London: Body Work Ben Aaronovitch, Andrew Cartmel (2016)
    Vol. 1, Issues 1–5 – graphic novel
  • The Dreaming Stars Tim Pratt (2018)
    Book 2 of The Axiom – ebook;
  • The Castlemaine Murders Kerry Greenwood (2003)
    Book 13 of Phryne Fisher Mysteries – ebook;
  • The Forbidden Stars Tim Pratt (2019)
    Book 3 of The Axiom – ebook;
  • False Value Ben Aaronovitch (2020)
    Book 8 of Rivers of London – ebook;
  • To Clear Away the Shadows David Drake (2019)
    Book 13 of Lt. Leary – ebook;
  • Richard Bolitho: Midshipman Alexander Kent (1975)
    Book 1 of Bolitho – ebook;
  • Midshipman Bolitho and the Avenger Alexander Kent (1978)
    Book 2 of Bolitho – ebook;
  • Band of Brothers Alexander Kent (2006)
    Book 3 of Bolitho – ebook;

April (10)

  • Stand into Danger Alexander Kent (1980)
    Book 4 of Bolitho – ebook;
  • City of Last Chances Adrian Tchaikovsky (2022)
    – ebook;
  • In Gallant Company Alexander Kent (1977)
    Book 5 of Bolitho – ebook;
  • Amongst Our Weapons Ben Aaronovitch (2022)
    Book 9 of Rivers of London – ebook;
  • Sloop of War Alexander Kent (1977)
    Book 6 of Bolitho – ebook;
  • We Are Legion (We Are Bob) Dennis Taylor (2016)
    Book 1 of Bobiverse – ebook;
  • In Fury Born David Weber (2006)
    – ebook; reread
  • To Glory We Steer Alexander Kent (1967)
    Book 7 of Bolitho – ebook;
  • Command a King’s Ship Alexander Kent (1973)
    Book 8 of Bolitho – ebook;
  • Tsalmoth Steven Brust (2023)
    Book 16 of Vlad Taltos – ebook;

May (15)

  • Fair Trade Sharon Lee & Steve Miller (2022)
    Book 3 of Jethri Gobelyn – ebook;
  • Into the Riverlands Nghi Vo (2022)
    Book 3 of The Singing Hills Cycle – ebook;
  • Drowned Country Emily Tesh (2022)
    Book 2 of Greenhollow Duology – ebook;
  • Some Desperate Glory Emily Tesh (2023)
    – ebook;
  • Passage to Mutiny Alexander Kent (1976)
    Book 9 of Bolitho – ebook;
  • The Lost War Justin Lee Anderson (2022)
    Book 1 of Eidyn – ebook;
  • The Atlas Paradox Olivie Blake (2022)
    Book 2 of Atlas Series – ebook;
  • Mutineer’s Moon David Weber (1991)
    Book 1 of Dahak – ebook; reread
  • The Armageddon Inheritance David Weber (1993)
    Book 2 of Dahak – ebook; reread
  • Kushiel’s Chosen Jacqueline Carey (2011)
    Book 2 of Kushiel’s Legacy – ebook; reread
  • Kushiel’s Avatar Jacqueline Carey (2011)
    Book 3 of Kushiel’s Legacy – ebook; reread
  • Cyteen C. J. Cherryh (1988)
    Book 1 of Union – ebook; reread
  • With All Despatch Alexander Kent (1988)
    Book 10 of Bolitho – ebook;
  • Form Line Of Battle! Alexander Kent (1969)
    Book 11 of Bolitho – ebook;
  • Enemy In Sight! Alexander Kent (1970)
    Book 12 of Bolitho – ebook;

June (11)

  • The Flag Captain Alexander Kent (1971)
    Book 13 of Bolitho – ebook;
  • Untethered Sky Fonda Lee (2023)
    – ebook;
  • Signal—Close Action Alexander Kent (1974)
    Book 14 of Bolitho – ebook;
  • Witch King Martha Wells (2023)
    – ebook;
  • The Inshore Squadron Alexander Kent (1978)
    Book 15 of Bolitho – ebook;
  • A Big Ship at the Edge of the Universe Alex White (2018)
    Book 1 of The Salvagers – ebook;
  • The Atrocity Archives Charles Stross (2004)
    Book 1 of The Laundry – ebook; reread
  • West of Honor Jerry Pournelle (1976)
    Book 1 of Falkenberg – ebook; reread
  • Mercenary Jerry Pournelle (1977)
    Book 2 of Falkenberg – ebook; reread
  • Prince of Mercenaries Jerry Pournelle (1989)
    Book 3 of Falkenberg – ebook; reread
  • A Bad Deal for the Whole Galaxy Alex White (2018)
    Book 2 of The Salvagers – ebook;

July (10)

  • Dune Frank Herbert (1965)
    – ebook; reread
  • Rosemary and Rue Seanan McGuire (2009)
    Book 1 of October Daye – ebook;
  • A Tradition of Victory Alexander Kent (1981)
    Book 16 of Bolitho – ebook;
  • Translation State Ann Leckie (2023)
    Book 4 of Imperial Radch – ebook;
  • A Local Habitation Seanan McGuire (2010)
    Book 2 of October Daye – ebook;
  • Time for the Stars Robert A. Heinlein (1956)
    – ebook; reread
  • The Better Part of Valor Tanya Huff (2002)
    Book 2 of Confederation – ebook; reread
  • The Heart of Valor Tanya Huff (2007)
    Book 3 of Confederation – ebook; reread
  • Valor’s Trial Tanya Huff (2008)
    Book 4 of Confederation – ebook; reread
  • The Truth of Valor Tanya Huff (2010)
    Book 5 of Confederation – ebook; reread

August (11)

  • An Artificial Night Seanan McGuire (2010)
    Book 3 of October Daye – ebook;
  • Queen of the Flowers Kerry Greenwood (2004)
    Book 14 of Phryne Fisher Mysteries – ebook;
  • Success to the Brave Alexander Kent (1983)
    Book 17 of Bolitho – ebook;
  • The Frugal Wizard’s Handbook for Surviving Medieval England Brandon Sanderson (2023)
    – ebook;
  • The Blighted Stars Megan E. O’Keefe (2023)
    Book 1 of The Devoured Worlds – ebook;
  • Phule’s Paradise Robert Asprin (1990)
    Book 2 of Phule’s Company – ebook; reread
  • Fletcher’s Fortune John Drake (1992)
    Book 1 of Fletcher – ebook;
  • Legends & Lattes Travis Baldree (2022)
    – ebook;
  • Imperium Restored Walter Jon Williams (2022)
    Book 6 of Dread Empire’s Fall – ebook;
  • Phule’s Company Robert Asprin (1990)
    Book 1 of Phule’s Company – ebook; reread
  • Colours Aloft! Alexander Kent (1986)
    Book 18 of Bolitho – ebook;

September (9)

  • The Mystery at Dunvegan Castle T. L. Huchu (2023)
    Book 3 of Edinburgh Nights – ebook;
  • Lords of Uncreation Adrian Tchaikovsky (2023)
    Book 3 of The Final Architecture – ebook;
  • Nevernight Jay Kristoff (2017)
    Book 1 of The Nevernight Chronicles – ebook; reread
  • Godsgrave Jay Kristoff (2017)
    Book 2 of The Nevernight Chronicles – ebook; reread
  • Darkdawn Jay Kristoff (2019)
    Book 3 of The Nevernight Chronicles – ebook; reread
  • Late Eclipses Seanan McGuire (2011)
    Book 4 of October Daye – ebook;
  • Starter Villain John Scalzi (2023)
    – ebook;
  • Honour This Day Alexander Kent (1997)
    Book 19 of Bolitho – ebook;
  • The Book That Wouldn’t Burn Mark Lawrence (2023)
    Book 1 of The Library Trilogy – ebook;

October (13)

  • Killing Gravity Corey J. White (2017)
    Book 1 of Voidwitch Saga – ebook; reread
  • Void Black Shadow Corey J. White (2018)
    Book 2 of Voidwitch Saga – ebook; reread
  • Static Ruin Corey J. White (2018)
    Book 3 of Voidwitch Saga – ebook; reread
  • One Salt Sea Seanan McGuire (2011)
    Book 5 of October Daye – ebook;
  • The Goblin Emperor Katherine Addison (2014)
    Book 1 of The Goblin Emperor – ebook; reread
  • The Witness for the Dead Katherine Addison (2021)
    Book 1 of The Cemeteries of Amalo – ebook; reread
  • Go Tell the Spartans Jerry Pournelle & S.M. Stirling (1991)
    Book 4 of Falkenberg – ebook; reread
  • The Grief of Stones Katherine Addison (2021)
    Book 2 of The Cemeteries of Amalo – ebook; reread
  • Prince of Sparta Jerry Pournelle & S.M. Stirling (1993)
    Book 5 of Falkenberg – ebook; reread
  • The Fifth Ward: First Watch Dale Lucas (2017)
    Book 1 of The Fifth Ward – ebook; reread
  • The Fifth Ward: Friendly Fire Dale Lucas (2018)
    Book 2 of The Fifth Ward – ebook; reread
  • Live Free or Die John Ringo (2010)
    Book 1 of Troy Rising – ebook; reread
  • Citadel John Ringo (2010)
    Book 2 of Troy Rising – ebook; reread

November (8)

  • The Hot Gate John Ringo (2011)
    Book 3 of Troy Rising – ebook; reread
  • The Last Devil to Die Richard Osman (2023)
    Book 4 of A Thursday Murder Club Mystery – ebook;
  • Mélusine Katherine Addison (2005)
    Book 1 of The Doctrine of Labyrinths – ebook;
  • The Fractured Dark Megan E. O’Keefe (2023)
    Book 2 of The Devoured Worlds – ebook;
  • The Virtu Katherine Addison (2006)
    Book 2 of The Doctrine of Labyrinths – ebook;
  • The Only Victor Alexander Kent (1990)
    Book 20 of Bolitho – ebook;
  • Ashes of Honor Seanan McGuire (2012)
    Book 6 of October Daye – ebook;
  • The Mirador Katherine Addison (2007)
    Book 3 of The Doctrine of Labyrinths – ebook;

December (9)

  • Chimes at Midnight Seanan McGuire (2013)
    Book 7 of October Daye – ebook;
  • System Collapse Martha Wells (2023)
    Book 7 of The Murderbot Diaries – ebook;
  • Corambis Katherine Addison (2009)
    Book 4 of The Doctrine of Labyrinths – ebook;
  • Scholar L. E. Modesitt Jr. (2011)
    Book 4 of The Imager Portfolio – ebook; reread
  • Princeps L. E. Modesitt Jr. (2012)
    Book 5 of The Imager Portfolio – ebook; reread
  • Imager’s Battalion L. E. Modesitt Jr. (2013)
    Book 6 of The Imager Portfolio – ebook; reread
  • Antiagon Fire L. E. Modesitt Jr. (2013)
    Book 7 of The Imager Portfolio – ebook; reread
  • Rex Regis L. E. Modesitt Jr. (2014)
    Book 8 of The Imager Portfolio – ebook; reread
  • Madness in Solidar L. E. Modesitt Jr. (2015)
    Book 9 of The Imager Portfolio – ebook; reread

Show me the numbers!

Total read: 127

46 rereads
10.6 books/month
2.4 books/week
.35 books/day

Books by women: (15/45 authors)
Non SF/Fantasy: 23
Oldest 1956 (Time for the Stars Robert A. Heinlein)
1 print (a graphic novel)

40 different series
11 non-series books

Of Note

Some books that made an impression:

  • Ben Aaronovitch’s Rivers of London series: I mentioned the first one last year and continued to consume this series in huge gulps, always wanting more. It helps we’ve been watching a lot of British TV lately but I love Aaronovitch’s sense of wry humour, and the premise of a magical cop is handled in a way that defies the trite “The perfect blend of CSI and Harry Potter” promo line they keep using.
  • I continued to read some of Adrian Tchaikovsky’s stand-alone fiction and really enjoyed City of Last Chances.
  • The Lost War by Justin Lee Anderson: he wrote a great story then upended it completely in the last few pages. The sequel is just out and I am actually trepidatious about reading it because it has the potential to go so wrong… or be absolutely fantastic if he gets it right.
  • Babel by R. F. Kuang— I made Leslie read this one. Enough said.

Some other books that made an impression (not that I liked them so much as that they were noteworthy):

  • Rivers of London: Body Work Ben Aaronovitch, Andrew Cartmel — a graphic novel. Probably the first graphic novel I have ever read completely and the only hard copy book of the year. Still not my jam.
  • The Frugal Wizard’s Handbook for Surviving Medieval England by Brandon Sanderson: one of Sanderson’s big releases from his whole Kickstarter thing. Meh. Not bad but I really didn’t see it as anything other than a quick little project that was acceptable. It didn’t help that I read the ebook and it was chock-a-block full of graphics, making it tedious to read on my poor little Kobo. That’s another trend that I don’t particularly admire.
  • The much hyped and marketed Starter Villain by John Scalzi: it was a fun premise, well written and in no way a chore to read. But of any substance whatsoever? No. Scalzi himself alluded to it as sort of a post-Covid relief valve. I enjoyed it, but I was kind of amused by the hype surrounding a throw-away book of the type that used to have me referring to my reading habits as “trashy sci-fi.” I do hope he writes more, but I think they should save the over-the-top accolades and marketing dollars for others.
  • Rosemary and Rue by Seanan McGuire and the rest of the series (I am up to book 8)—I started this based on a conversation with someone and it was well-written enough to become a habit. But is a habit a good enough excuse to continue on? So far McGuire is winning by continuing to suck me in… but I imagine the $22.99 price point for the latest in the series might be a stopper (more on this later).

Some non-SF and my continued whinging

This year I started in on the 30+ book series of Richard Bolitho. It’s another Napoleonic era, Age of Sail series that starts with a young Richard Bolitho as midshipman and follows his career. I am only up to Book 20 so I have a few more to go, and I believe the series moves on to the title character’s nephew as the protagonist in the last few books…doesn’t look good for Richard’s health long term 🙂

The problem is that many of the books are geo-blocked in Canada. It really really grinds my gears that any book is geo-blocked, but when only some of the books in a series are available…well…that’s just bloody unacceptable. Needless to say I tried hard to get around it legitimately (i.e. actually pay for them) but it proved impossible and I ended up downloading a bunch of crappy OCR’d versions and doing my best to clean them up. Ironically and irritatingly, a bunch of the later books finally showed up as available here in Canada so I bought them and chucked out all my work on the half-assed versions. Now if they will just release the rest…

I also added two more Phryne Fisher mysteries to the total to bring me up to book 15 and complete my list of non-SF/Fantasy reading.

Two more bitches

Prices: $23 for an ebook? What the serious fuck? Pricing a book used to be 5 x pp&b (five times the print, paper and binding costs). If manufacturing a book cost $2/unit then the retail price would be around $10. This covered all the other creation costs and paid the author a “fair” royalty. I know things have changed but creating an ebook is under a hundred dollars for unlimited copies…and probably much cheaper than that for the big guys. This is an outright money grab, trying to create “hardback”/new release pricing around a purely digital product that in most cases YOU DON’T EVEN OWN! The hardcover is only $37 for goodness sake… the pricing makes no sense! Sigh.

Romantasy: Seriously? Goodreads had it as an actual category in their Choice Awards this year… I can’t even.

Conclusions

Not much to say about books this year. I am trying to read more new books and more new authors. I think the tendency towards writing dense sci-fi and fantasy continues and when I am in the mood for it, it’s some pretty great stuff. But as much as I twit about Scalzi, it’s nice to read some good old fashioned, well-written schlock. Especially as we all recover from the mass trauma that was “Humanity during Covid.” So let’s hope there is plenty of that to read as well (although I realize I have decades of old stuff to catch up on so there is always something “schlocky” to read 😉).

Here’s to a lot of great books in 2024!


My fellow book counters: Dr. la ass dean 2023 book and music list and the one, the only, the original, Earl J. Woods’ Last But Not Least: Books I Read in 2023 .

Links to previous years’ book count posts:

15 years of tweets

Read most of the nonsense here: https://macblaze.ca/?cat=9

Note: Since Elon’s takeover, the removal of APIs and the subsequent X thing, my digest stopped working on May 11, 2023. So while I actually managed to stay on Twitter (X) for 15 years, the tweets for the last 7 months have gone unrecorded. It might also be time to blow this joint. Although I will say that since all the hardcore Twitter users abandoned it, it has been a much happier place (if you avoid the idjuts…).


BruceKeith is alive and well and living on Twitter. We’ll see how this goes…


Now Tweeting.

Python in Ebook Production

The following is a blog post I originally wrote on behalf of Orca Book Publishers for the APLN (Accessible Publishing Learning Network) website. I had done a brief online Q&A on behalf of eBound talking about our Benetech certification and there were questions about my python workflow. So I tried to write it out, which was a good exercise in and of itself.


In 2022 Orca Book Publishers had a dozen accessible titles that had been remediated via programs with BooksBC and eBound as well as another group that had been created as mostly accessible epubs by outsourced ebook developers. When Orca made a commitment to creating accessible ebooks the immediate goal was to pursue Benetech Certification with an eye to adopting a born accessible workflow and to start remediating backlist titles.

Orca has three main streams of books from an epub point of view: highly illustrated non-fiction, fiction with few or no images, and picture books. We started by remediating the fiction titles that were already mostly accessible and bringing them up to Benetech standards.

Concurrently we brought the non-fiction production in-house to begin to develop a functional accessible workflow. Non-fiction titles usually feature 80 plus illustrations and photographs, multiple sidebars, a glossary, index, and a bibliography.

In publishing circles a fair amount of time is spent bemoaning the shortcomings of InDesign as a platform for creating good epubs, let alone making accessible ones. With a complex design, you can spend a lot of time and effort prepping an InDesign file to export a “well-formed” file and still end up with a “messy” end result. Instead, Orca’s approach was to ignore InDesign as much as possible, export the bare necessities (styles, ToC, page markers etc.), clean out the junk in the epub it produces using a series of scripted search and replaces, and then rely on post-processing to produce a well-formed, accessible epub in a more efficient manner.

To that end we started building two things: a comprehensive standard structure and its accompanying documentation for an Orca ebook, and a series of python scripts to apply that structure to epubs. These scripts needed to be robust enough both to work with new books and to remediate older titles that spanned everything from old epub2’s to mostly-accessible titles that didn’t quite meet Benetech standards.

Python in Epub production

Python was the obvious choice for these tasks. Python is a programming language suited for text and data manipulation that is highly extensible, with thousands of external libraries available, and has a focus on readability. It is readily available on macOS (older versions of Mac OS X shipped with it already installed) and is easily added to both Windows and Linux.

Python is easy to learn and fairly easy to use. You can simply write a python script in a text file, e.g.:

print('Enter your name:')
name = input()
print('Hello, ' + name)

Then save it as script.py and run it using a python interpreter. As a general rule writing and running python scripts from within an IDE (integrated development environment) like Visual Studio Code, a free IDE created and maintained by Microsoft, makes this pretty simple. Using VS Code allows a developer to easily modify scripts and then run them from within the same application.

Regular Expressions

The other important part of the process, and one well worth learning as much as you can about even if you don’t dive into python, is regular expressions (regex). This is a system of patterns that allows you to search and replace highly complex strings.

For instance if you wanted to replace all the <p>’s in a glossary with <li>’s:

<p class="glossary"><b>regular Expression</b>: is a sequence of characters that specifies a match pattern in text.</p>

You could search for:

<p class="glossary">(.*?)</p>

where the bits in parentheses are wildcards…and replace it with:

<li class="glossary">\1</li>.

For each occurrence found, the bit in the parentheses would be stored and then reinserted correctly in the new string.
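As a rough illustration, here is how that same search and replace could be run from python using the standard `re` module. The sample string is the glossary entry from above; nothing else is assumed.

```python
import re

html = ('<p class="glossary"><b>regular Expression</b>: is a sequence of '
        'characters that specifies a match pattern in text.</p>')

# (.*?) captures whatever sits between the tags; \1 puts it back.
pattern = r'<p class="glossary">(.*?)</p>'
replacement = r'<li class="glossary">\1</li>'

result = re.sub(pattern, replacement, html)
print(result)
```

One thing to keep in mind: by default the `.` does not match a line break, so this version only catches paragraphs that sit on a single line.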

Once you start to use regexes you’ll quickly get addicted to the power and flexibility and quite a few text editors (even InDesign via grep) support regular expressions.

Scripting Python

With these two tools you can write a fairly basic script that opens a folder (an uncompressed epub), loops through it to find a file named glossary.xhtml and replaces each <p class="glossary"> tag with a <li>, or whatever else you might need. You can add more regexes to change the <title> to <title>Glossary</title>, add in the proper section epub:type’s and roles and more. Since InDesign tends to export fairly regular epub code once you clean out the junk, if you create a standard set of styles, it means you can easily clean and revise the whole file in a few key strokes.
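A minimal sketch of such a script might look like this. It is not the actual Orca script; the function name `fix_glossary` is made up for the example, and it assumes each glossary paragraph sits on one line.

```python
import re
from pathlib import Path


def fix_glossary(epub_folder):
    """Rewrite glossary paragraphs as list items in every glossary.xhtml found.

    Returns the list of files that were changed.
    """
    changed = []
    # rglob walks the whole uncompressed epub, whatever its folder layout.
    for path in Path(epub_folder).rglob("glossary.xhtml"):
        text = path.read_text(encoding="utf-8")
        new_text = re.sub(r'<p class="glossary">(.*?)</p>',
                          r'<li class="glossary">\1</li>', text)
        # Give the file a proper title, whatever the export put there.
        new_text = re.sub(r"<title>.*?</title>",
                          "<title>Glossary</title>", new_text)
        if new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path)
    return changed
```

You would call it with the path to the unzipped epub, e.g. `fix_glossary("my-book")`, where the folder name is whatever you unzipped the epub to.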

Taking that one step further, if you ensure that the individual files in an epub are named according to their function, e.g. about-the-author.xhtml, copyright.xhtml, dedication.xhtml etc., you can easily have custom lists of search/replaces that are specific to each file. That ensures things like applying epub:types and aria roles are done automatically, and it lets you replace existing text with new standardized text in things like the .opf file.
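One way to picture the per-file idea is a dictionary that maps file names to their own list of search/replace pairs. The rules below are invented for the example (they assume the cleaned files wrap their content in a bare `<section>` tag) and are not Orca’s actual lists.

```python
import re
from pathlib import Path

# Hypothetical rules: file name -> list of (pattern, replacement) pairs.
RULES = {
    "dedication.xhtml": [
        (r"<section>", '<section epub:type="dedication" role="doc-dedication">'),
    ],
    "copyright.xhtml": [
        (r"<section>", '<section epub:type="copyright-page">'),
    ],
}


def apply_rules(text, rules):
    """Run each (pattern, replacement) pair over the text in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


def process_epub(epub_folder, rules_by_file=RULES):
    """Apply the matching rule list to each file that has one."""
    for path in Path(epub_folder).rglob("*.xhtml"):
        rules = rules_by_file.get(path.name)
        if rules:
            text = path.read_text(encoding="utf-8")
            path.write_text(apply_rules(text, rules), encoding="utf-8")
```

Files with no entry in the dictionary are simply left alone, so new file types can be added one at a time.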

If you build basic functions to perform search and replaces, then you can continually update and revise the list of things you want it to fix as you discover both InDesign and your designer’s quirks, things like moving spaces outside of spans or restructuring the headers. If you can conceptualize what you want to do, you can build a regex to do it and just add it to the list.
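As a sketch of that “just add it to the list” approach, the fixes can live in one list of pattern/replacement pairs that a single function works through. The patterns here are examples written for this post (they assume simple spans with no nested tags), not the real Orca list.

```python
import re

# Each entry is (pattern, replacement). New quirks get appended here.
FIXES = [
    # Move a trailing space from inside a span to just after it.
    (r"(<span[^>]*>)([^<]*?) +(</span>)", r"\1\2\3 "),
    # Move a leading space from inside a span to just before it.
    (r"(<span[^>]*>) +([^<]*)(</span>)", r" \1\2\3"),
    # Collapse any runs of spaces left behind.
    (r" {2,}", " "),
]


def clean(text, fixes=FIXES):
    """Apply every fix in order and return the cleaned text."""
    for pattern, replacement in fixes:
        text = re.sub(pattern, replacement, text)
    return text


print(clean('a<span class="i">word </span>b'))
```

Because the order of the list is the order the fixes run in, tidy-up rules like the space collapsing one belong at the end.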

You can also build multiple scripts for different stages of the process or expand into automating other common tasks. For instance the Orca toolset currently has the following scripts:

  • clean_indesign (cleans all the crud out and tidies up some basic structures),
  • clean_epub (which replaces all the headers, adds a digital rights file, rewrites the opf file to our standard, standardizes the ToC and landmarks and more…),
  • alt-text-extract (extracts image names, alt text and figcaptions to an excel spreadsheet),
  • update_alt_text (loads an excel spreadsheet that has alt text and, based on image file names, inserts it into the proper <img alt="">),
  • run_glossary (which searches the glossary.xhtml and creates links to the glossed words in the text),
  • extract_metadata (which loops through all the epubs in a folder and pulls the specified metadata e.g. pub date, modified date, a11y metadata, rights etc.),
  • extract_cover_alt (loops through a folder of epubs and extracts the cover alt text into an excel spreadsheet),
  • increment_pagenumber (some of our older epubs were made from pdfs that had different page numbering from the printed book, so this script goes through and bumps them by a specified increment)
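To give a flavour of what one of these might involve, here is a simplified sketch of an alt text extraction. It is not the Orca script: it writes a CSV file rather than an excel spreadsheet so that it needs nothing beyond python’s standard library, it skips figcaptions, and the names `ImgCollector` and `extract_alt_text` are made up for the example.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class ImgCollector(HTMLParser):
    """Collect (src, alt) for every <img> in a document."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            attrs = dict(attrs)
            self.images.append((attrs.get("src") or "", attrs.get("alt") or ""))


def extract_alt_text(epub_folder, csv_path):
    """Write one row per image (file, image name, alt text) to a CSV file."""
    rows = []
    for path in sorted(Path(epub_folder).rglob("*.xhtml")):
        parser = ImgCollector()
        parser.feed(path.read_text(encoding="utf-8"))
        for src, alt in parser.images:
            rows.append([path.name, Path(src).name, alt])
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "image", "alt"])
        writer.writerows(rows)
    return rows
```

Images with missing alt text show up as empty cells, which makes the gaps easy to spot in the spreadsheet.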

You can see the InDesign cleaning script here: github.com/b-t-k/epub-python-scripts as a basic example. As we continue to clean up and modify the rest they will slowly be added to the repository.

So you can see, learning and using python in your workflow can speed up a lot of repetitive and time consuming tasks and actually ensure a better quality and more standardized book, which incidentally means making future changes to epubs becomes much more efficient.

Documentation

Concurrently to all this Orca maintains and continually revises a set of documents that records all the code and standards we have decided on. It is kept in a series of text files that automatically update a local website and it contains everything from the css solutions we use to specific lists of how ToC’s are presented, our standard schema, how we deal with long descriptions, lists of epub-types and aria roles and a record of pretty much any decision that is made regarding how Orca builds epubs. Because the website is searchable, a quick search easily finds the answer to most questions.

Our Books

This type of automation has allowed us to produce accessible non-fiction titles in-house and in a reasonable time frame. Books like Open Science or Get Out and Vote! can be produced in a Benetech certifiable epub in just a few days even though they feature things like indexes, linked glossaries, long descriptions for charts and a lot of alt text that was written after the fact.

And if we have the alt text ready (which is now starting to happen as a part of the workflow), producing non-fiction titles will usually take less than two days. This time frame does grow if we are remediating old epubs that were produced out-of-house and we are giving serious thought to going back and redoing them as it might be quicker. Also with the establishment of the new workflow, remediating fiction titles (or producing them from scratch) now just takes a couple of hours! (excluding QA).

Producing an Accessible epub

Orca’s production process has been continually evolving. We started by focussing on making accessible non-fiction epubs without alt text, and then brought alt text into the mix after about 9 months (two seasons)—the scripts meant it was easy to go back and update those titles after alt text was created. Meanwhile we pursued Benetech certification for our fiction titles that were produced out-of-house and developed a QA process to ensure compliance. And just recently we have brought fiction production in-house as well.

At this point, as soon as the files have been sent to the printer, the InDesign files are handed over to produce the epub. Increasingly before this stage, the alt text is produced and entered in a spreadsheet. Then this is merged into the completed epub. A “first draft” is produced and run through Pagina’s EPUBCheck and Ace by DAISY to ensure compliance. Then, along with a fresh export of the alt text in a separate excel file, it is sent over to our production editor who has a checklist of code elements to work through using BBEdit, and then he views the files in Thorium and Apple Books, and occasionally Colibrio’s excellent online Vanilla Reader, checking styles, hierarchy, visual presentation and listening to the alt text.

Changes come back and usually within one or two rounds it is declared finished and passed on to the distribution pipeline. There our Data Specialist does one last check of the metadata, ensuring it matches the ONIX files, and reruns EPUBCheck and Ace before sending it out.

Spreading the load

In the background we have marketing and sales staff working on spreadsheets of all our backlist, writing and proofing alt text for the covers and interior illustration of the fiction books so it is ready to go as titles are remediated. The hope is to incorporate this cover alt text into all of our marketing materials and websites as the work is completed.

The editors meanwhile are just starting to incorporate character styles in Word (especially in specifying things like languages and italics vs. emphasis) and working with authors to build in alt text creation alongside the existing caption-writing process.

The designers are slowly incorporating standardized character and paragraph styles into their design files and changing how they structure their documents to facilitate epub exports. They are also working with the illustrators to collect and preserve their illustration notes in order to help capture the intent of illustrations so those notes can be used as a basis for alt text. They are also working to document cover discussion as a way to help facilitate more interesting and accurate cover alt text.

It will take a few more years but eventually the whole process for producing born accessible, reflowable epubs should be fully in place.

The Future

Orca is currently working towards a goal of 300 Benetech accessible epub titles in our catalog for February 2024, including everything back to 2020. And then we will continue to remediate all our backlist of over 1200 titles over the next few years.
As soon as the process for fiction epubs has solidified, we’d also like to start in on our picture books and ensure that these fixed epubs are as accessible as possible. It is currently an extremely time-consuming task, but we have hope that we can eventually work out a way to automate a lot of the repetitive work.
This means we need to continue to educate ourselves and our suppliers and work towards a way to standardize as many aspects of the workflow as possible. The more standards we create and maintain the more automation we can employ.
And of course, this means learning even more python…

Instagram Since Last Time

Checking out the Valley Line for something to do. Mill Woods here we come!
My Uncle Phil and my father beside his T33 in Calgary after he flew in to date my mom. #lestweforget
Scary!
Me and my mom.
Golden gate baby!

Instagram Since Last Time

Me and mom gonna give a different kind of cruise a try. Off to SF @strathmore_cruise_expert
Sunrise at sea, 50 nm off Cape Mendocino #princesscruises @strathmore_travel_agent
I really don’t know where this little guy came from, but there he is, in the middle of my lawn, all happy and yellow.
It’s a good ‘un this morning #nofilter
Sourdough day
Now I remember why I hate Ferris wheels. #unstable @fortedmontonpark
It’s a crap picture that really doesn’t do it justice, but… freaky sunset tonight
I “accidentally” made cinnamon buns. Happy Sunday!