Add a python app

Just a computer note…

I wanted to run a Flask web app for L’s Moodle conversions and doing it on the Mac server seemed the best idea. I put the web app into the www folder. Here were the steps…

cd /to_folder

python3 -m venv venv

source venv/bin/activate

pip install Flask

pip install pypandoc

python3 main.py
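For reference, a minimal main.py in the spirit of the above might look like this. The route and return text are placeholders, not the actual converter; the real app would take an upload and run it through pypandoc.

```python
# main.py -- a minimal sketch; the actual conversion routes are hypothetical
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # the real app would accept a file upload and convert it with pypandoc
    return "Moodle converter is up"

# adding `if __name__ == "__main__": app.run()` is what lets
# `python3 main.py` start the dev server
```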

 

To come…

Add Gunicorn

New(ish) Kobo

An image of a Kobo Libra ereader

Back in July I thought I would spend a little money on a treat for myself. I didn’t need a new ereader (my old Aura was still working ok, albeit a bit battered) but I wanted to check out the side handle on the Libra to see if it was more comfortable when reading at night.

A recap

I started back in 2009 with a Sony PRS-650. I used that until its battery died. Then I switched to a Sony PRS-T1 in 2013.

In 2014 I got a new Aura HD because I wanted a backlight (and was immediately a bit annoyed when I realized you needed to have a Kobo account just to use it to read books). I worked around that by having an account for the ereader that I never used and a separate account for any purchases I made and side loaded all my books through Calibre. This got replaced in 2017 with a new generation Aura after an unfortunate accident involving a suitcase and hard airport floor.

A kobo ereader with a screen that is full of weird shapes and broken images.

New reader

Which brings me to this year and my new Kobo Libra Colour. It’s not bad. I don’t have much use for the colour screen and the variable colour backlight is a bit annoying (more on that later). But overall I like the way it feels in the hand, I like the fact that the page turn buttons are back (haven’t seen them since the Sony) and I appreciate the move to USB-C.

Account registration

One of the bonuses was that I looked into the whole “must have an account to use it” idiocy again and found a hack:

  1. Connect Libra Colour to PC.
  2. Go to \.kobo\Kobo and find the Kobo eReader.conf file.
  3. Open the file and add the line SideloadedMode=true under the [ApplicationPreferences] section.
  4. Done! The device will now display the “My Books” tab straight away, and the online store will be disabled. You can re-enable everything by doing the same thing and adding SideloadedMode=false instead of true.
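The edited section of the Kobo eReader.conf file ends up looking something like this (any other keys already in the section stay as they are):

```ini
[ApplicationPreferences]
SideloadedMode=true
```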

You lose the main screen that shows Books, Reading etc., but since I rarely used that it’s no great loss.

The downside is there are a number of “ghost” titles in the lists, showing up under the Author view, that must have had something to do with promotions. I assume they would have been deleted in the initial setup, but they annoyingly show up when I am browsing books on the ereader itself.

I have screwed around with editing the hidden KoboReader.sqlite file on the ereader and while I can find the books in some tables, deleting them doesn’t seem to get rid of them. I imagine there are associated tables I need to clean as well but so far no joy unless I want to delete everything and start over. Still a possibility…
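For anyone else digging, a read-only inspection sketch like this is a safer starting point than deleting rows. The table and column names (and the ContentType value) are my assumptions from browsing the file, so treat it as a guess, and back up KoboReader.sqlite before experimenting.

```python
import sqlite3

def list_books(db_path):
    """Return (Title, ContentID) rows from the Kobo 'content' table.

    Table/column names and the ContentType value are assumptions from
    poking around the file; back up KoboReader.sqlite first.
    """
    with sqlite3.connect(db_path) as conn:
        return conn.execute(
            "SELECT Title, ContentID FROM content WHERE ContentType = 6"
        ).fetchall()
```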

Battery life

I had/have a beef with the battery though. The Libra has a fancy dimming function. If you set bedtime for, let’s say, 11:00 pm, it will start slowly changing the colour of the backlight from cool (blue) to warm (orange) and dimming the screen for reading in the dark. I thought this was the cat’s meow.

But.

The battery life sucked. I mean, I was recharging every 4 days! One night I hit the 10% battery warning, which used to mean I had several hours of charge left, and it died on me in less than 15 minutes. This was beyond unacceptable. I didn’t want colour at all, let alone if it was going to be such a huge battery suck. But googling yielded no solutions, or even anyone else noticing the problem, and I was starting to think maybe I got a dud ereader.

Then I started fiddling with settings and one night turned off the automatic dimmer and dialed the backlight down to 30%, which is a bit below my preferred level but workable. The next morning the battery indicator showed it was still full. Aaargh. I have left it that way and played with the colour and level, and so far so good. It has only been a couple of days but the improvement is significant. I will keep on playing and see if I can nail down the exact battery-draining culprit, but so far it seems it’s just the timing function. Why Kobo would ship such a ridiculous battery suck as a part of the software is beyond me.

Happy?

I think so. I like the handle and page turn buttons. Now that the battery is better I have more faith in it (although I now keep my old Aura up to date and charged just in case). It is a bit bulkier and doesn’t fit in a pocket anymore, but that’s not an issue I care too much about except when my hands are full. And it is zippy! Interface things the Aura used to do badly, like scrolling, actually make sense now that the response is immediate instead of having to pause between touches.

But if I am feeling rich again I might finally break down and try a Boox or a Pocketbook. Still annoyed at Kobo? Why yes, how could you tell…? 🙂

A Kobo libra showing a list of books

Immich Implemented

Well, I am 90% sure this is the way I will go. Back in New Image Management? I mentioned I was testing Immich as a possible replacement for Apple Photos and the iPhone app. I tested it out, even used it at a family visit to Brooks, and so far it has performed flawlessly. I covered some of this in the original post but I thought I would add a bit more detail.

Server

I have this installed on my Mac Mini (2019) which has basically become my server these days. Maybe a post later about how that is set up now. Much of this is based on the tutorial at https://rhettbull.github.io/osxphotos/tutorial.html.

Step 1 Docker

Everything is Docker (containerized) so, other than the Docker overhead, it is a pretty clean install. I decided to go with Docker Desktop for Mac rather than jump through the hoops of installing it on the system itself… lower overhead vs. easier management… and I was lazy.

Then I followed Immich’s install guide. The only real tricky part was that on a Mac, if you wget the .env file directly (wget -O .env https://github.com/immich-app/immich/releases/latest/download/example.env), the file is immediately invisible in the Finder. So I modified the command to wget -O example.env https://github.com/immich-app/immich/releases/latest/download/example.env and then, after editing it, renamed it to .env in Terminal (mv example.env .env).

The .env (Environment)

I changed the location of my library to a folder on my Mini and decided to put the Postgres database there as well. Not sure if this is best practice, but I am tired of hunting for elements years later when I need to change or clean up something. And then I set the time zone to America/Edmonton.

Just a note to remind you: if you want to edit the .env in a text editor, hit cmd-shift-. (period). This shows all the invisible files so you can just drag the .env file to BBEdit etc. Just remember to hit cmd-shift-. again to hide them, or all the hidden files will clutter up everything.

Save the file.

Docker Compose

Make sure you are in the right directory (the one with all these files) and run docker compose up -d. This will invoke Docker and run through the compose file to download and set up everything; the -d ensures it runs in the background.

That’s it. You should be up and running at http://<machine-ip-address>:2283

Step 2 Apple Photos

The next big challenge is to get the photos from Photos to Immich.

If, like L, you have a bunch of images on your phone or iPad that aren’t on your desktop, the easiest way is to download the Immich app from the store and set it up to back up all the images. Once that’s done you can set it to only back up Recents, and every time you open Immich on your phone it will upload the images to the server.

 

As for the main Photos Library I covered that back in the original post (New Image Management?).

Settings

A few settings and tweaks.

  • I enabled tags (Account Settings > Features > Tags). This shows all the tags I had set in Photos and allows me to set more.
  • I changed the library so it organizes itself in a more understandable way. Originally all the images were stored in folders like /Immich/upload/8e403d85-6740-4ba7-8549-0feb702f0cb3/6a/04. By going to Administration > Settings > Storage Template and enabling it, you can set the folders to year/month/date etc. and it will migrate all the images to the new structure, e.g. /Immich/library/User-1/1956/1956-01-01/001 – 1956.jpg.
    • Then you have to go to Administration > Jobs > Storage template migration and click Start

A Second Library

One of the neater things is that you can set up a second user and use the server for them as well. I did this and set up all L’s images to be stored in a different folder. Note that when you set up the user you should then go to Administration > Users and click the three dots to edit the account. Change the Storage label to whatever you want the folder to be called.

Then, either before or after you migrate their images, make sure you repeat the Storage template steps above.

Sharing

Now, if you want, you can set it up so the other user can see all of your images in their own Immich instance (Account Settings > Settings > Partner Sharing). Add the other user and they can see pretty much everything by clicking on Sharing in the main sidebar.

External Access

I am still not sure if I will use this as something that is accessible outside my firewall. I did briefly set it up on one of my test domains while I was away and it worked just as advertised. I took a picture, opened the Immich app, and voila, it was available pretty much instantly on the web interface.

But if I leave it inaccessible, the app stores all the thumbnails, so I can still see the images wherever I am, and when I am back on my own network I can upload the images then, which is pretty much how I did it with the old system. The only difference being that if I want a high-res version of a photo when I am away from home I am screwed unless I VPN back in… and I rarely leave the VPN running unless I am away for extended periods.

Hmmm….

New Image Management?

I’ve decided to give an alternative to Apple Photos a try. Photos has been increasingly frustrating in its organization on the iPhone, and it’s pretty darn slow on my desktop if I try to offload the database to an external drive.

Immich

An open source, self-hosted photo solution, Immich is also free. And it has an iPhone app that will sync with your camera roll to automatically back up photos from the camera.

 

A screen shot of the Immich app showing thumbnails of photos.

I will flesh out the documentation, but basically I followed baty.net/posts/2025/12/from-apple-photos-to-immich/

Immich is a Docker container, but my Pis aren’t robust enough (I think), so I installed it on my Intel Mac Mini (2019), which is doing duty as my Calibre server and Jellyfin server. This means I have to use the clunky Docker Desktop for Mac, but c’est la vie. It installs via a docker-compose file (docs.immich.app/install/docker-compose). The wget didn’t work, so I just downloaded the file from GitHub. I had to use Terminal to rename the .env file. Then I fired up the container by running the compose file and that was pretty much it.

Visit the app by going to 192.168.1.x:2283, create an account and add a password.

Preliminary testing

cd into your directory and docker compose up -d

I played around with it and the iPhone app, decided it was going to work, and proceeded to move my main library of 30,000+ images. What I did was copy the Photos library (Photos Library.photoslibrary) over to the Mac Mini (150 GB, 1+ hours) and then use a utility to export it. In retrospect that was silly, since I exported it to an external SSD for temporary storage anyway. What I should have done was run the utility directly from the original library to the SSD and saved myself a lot of time.

osxphotos

To install:

brew tap rhettbull/osxphotos

brew install osxphotos

It’s a pretty fancy little utility but I went with the basics:

osxphotos export /Volumes/External_SSD/DestinationFolder \
--skip-original-if-edited \
--sidecar XMP \
--touch-file \
--directory "{folder_album}" \
--download-missing \
--library '/Users/admin/Desktop/photo library/Photos Library.photoslibrary'

Note: the last line is only needed if you want to export a library other than the default Photos library. Otherwise you can just eliminate it.

It didn’t take too long and it had exported all my images in a fancy folder structure on the SSD.

immich cli

Then I had to install the command-line interface for Immich. I discovered it was on brew, which was easier than the suggested npm.

brew install immich-cli

Then you need to go to the Immich web interface (192.168.1.x:2283) and add an API key (Account Settings > API Keys). Then add the key to the following and log in.

immich login http://192.168.1.x:2283/api APIKEYxxxXXXxxxXXXXxxXXXXXx

Once you are logged in, let ’er rip.

immich upload --recursive /Volumes/External_SSD/DestinationFolder --album

Less than an hour later the images were imported. It took a couple of hours for the thumbnails to appear and overnight for things like geo-location and face recognition to finish running.

Conclusion

So far it is pretty slick. I can take a picture on my phone and it is set to upload the image as soon as I open the app (if I am at home). The response is snappy, and the AI-assisted search is wonderful (for finding images with two cats, or bread, or pizza etc.).

A grid of many different home-baked pizzas

If I keep it I will likely set up a domain and an SSL cert so I can access the images from outside my firewall. But for now the big downside is that if I want to do anything but look at an image on my iPhone from outside my personal network, I am out of luck: the app only stores thumbnails. And maybe a followup review where I can add a bit more detail…

 

Electronic waste

I dug up this picture of my desk from 2010… 15 years ago. All of the electronics visible (and there are a lot if you look closely… including the calculator and the camera) have since been relegated to the recycling bin. (Actually, so have the desk, shelves and chair, lol…) The very last thing I was using was the Apple keyboard, which got replaced last week… so all in all I think the sometimes outrageous price of Apple products is worthwhile.

A desk covered with electronics

I still have the iPad (1st gen) and one of the Apple mice, but both are in a box and only used in an emergency. Actually, looking closer, I still use those speakers, so I guess they are the winners 🙂

I used my first API!!!

Booksonix, the book management software Orca uses to manage their publications and workflow, released an API (Application Programming Interface) a few weeks ago, so I started playing with it to see if I could automate some of my work.

After much playing with Flask, Jinja and JSON files I made a thing of beauty. Just plug in the print ISBN and it delivers the cover for download and all the metadata, formatted and everything, so I can just copy and paste it into the epub’s .opf file. That’s what an API does: it lets you get info out of a system without having to go through the regular interface, so you can manipulate it the way you want. All the best websites have one: Facebook, YouTube, Twitter etc.

15 years of tweets

Read almost all of the nonsense here: https://macblaze.ca/?cat=9

Note: Since Elon’s takeover, the removal of APIs and the subsequent X thing, my digest stopped working on May 11, 2023. So while I actually managed to stay on Twitter (X) for 15 years, the tweets from the last 7 months have gone unrecorded. It might also be time to blow this joint. Although I will say that since all the hardcore Twitter users abandoned it, it has been a much happier place (if you avoid the idjuts…).


BruceKeith is alive and well and living on Twitter. We’ll see how this goes…


Now Tweeting.

Python in Ebook Production

The following is a blog post I originally wrote on behalf of Orca Book Publishers for the APLN (Accessible Publishing Learning Network) website. I had done a brief online Q&A on behalf of eBound talking about our Benetech certification and there were questions about my python workflow. So I tried to write it out, which was a good exercise in and of itself.


In 2022 Orca Book Publishers had a dozen accessible titles that had been remediated via programs with BooksBC and eBound, as well as another group that had been created as mostly accessible epubs by outsourced ebook developers. When Orca made a commitment to creating accessible ebooks, the immediate goal was to pursue Benetech certification with an eye to adopting a born-accessible workflow and to start remediating backlist titles.

Orca has three main streams of books from an epub point of view: highly illustrated non-fiction, fiction with few or no images, and picture books. We started by remediating the fiction titles that were already mostly accessible and bringing them up to Benetech standards.

Concurrently we brought the non-fiction production in-house to begin to develop a functional accessible workflow. Non-fiction titles usually feature 80 plus illustrations and photographs, multiple sidebars, a glossary, index, and a bibliography.

In publishing circles a fair amount of time is spent bemoaning the shortcomings of InDesign as a platform for creating good epubs, let alone accessible ones. With a complex design, you can spend a lot of time and effort prepping an InDesign file to export a “well-formed” file and still end up with a “messy” result. Instead, Orca’s approach was to ignore InDesign as much as possible, export the bare necessities (styles, ToC, page markers etc.), clean out the junk in the epub it produces using a series of scripted search and replaces, and then rely on post-processing to produce a well-formed, accessible epub in a more efficient manner.

To that end we started building two things: a comprehensive standard structure and its accompanying documentation for an Orca ebook, and a series of Python scripts to apply that structure to epubs. These scripts needed to be robust enough to work with both new books and to remediate older titles that spanned everything from old epub2s to mostly accessible titles that didn’t quite meet Benetech standards.

Python in Epub production

Python was the obvious choice for these tasks. Python is a programming language suited for text and data manipulation that is highly extensible, with thousands of external libraries available, and has a focus on readability. It comes preinstalled on macOS and is easily added to both Windows and Linux.

Python is easy to learn and fairly easy to use. You can simply write a Python script in a text file, e.g.:

print('Enter your name:')
name = input()
print('Hello, ' + name)

Then save it as script.py and run it using a Python interpreter. As a general rule, writing and running Python scripts from within an IDE (integrated development environment) like Visual Studio Code, a free IDE created and maintained by Microsoft, makes this pretty simple. VS Code lets a developer easily modify scripts and then run them from within the same application.

Regular Expressions

The other important part of the process, and well worth learning as much as you can about even if you don’t dive into Python, is regular expressions (regex). This is a system of patterns that allows you to search and replace highly complex strings.

For instance if you wanted to replace all the <p>’s in a glossary with <li>’s:

<p class="glossary"><b>regular Expression</b>: is a sequence of characters that specifies a match pattern in text.</p>

You could search for:

<p class="glossary">(.*?)</p>

where the bit in parentheses is a wildcard… and replace it with:

<li class="glossary">\1</li>.

For each occurrence found, the bit in the parentheses would be stored and then reinserted correctly in the new string.
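In Python, that same search and replace is a single call to re.sub:

```python
import re

# The glossary search/replace above, expressed with Python's re module.
line = '<p class="glossary"><b>regular expression</b>: a match pattern.</p>'
fixed = re.sub(r'<p class="glossary">(.*?)</p>',
               r'<li class="glossary">\1</li>',
               line)
print(fixed)
# <li class="glossary"><b>regular expression</b>: a match pattern.</li>
```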

Once you start to use regexes you’ll quickly get addicted to the power and flexibility, and quite a few text editors (even InDesign, via GREP) support regular expressions.

Scripting Python

With these two tools you can write a fairly basic script that opens a folder (an uncompressed epub), loops through it to find a file named glossary.xhtml, and replaces each <p class="glossary"> tag with a <li>, or whatever else you might need. You can add more regexes to change the <title> to <title>Glossary</title>, add in the proper section epub:types and roles, and more. Since InDesign tends to export fairly regular epub code once you clean out the junk, if you create a standard set of styles you can easily clean and revise the whole file in a few keystrokes.

Taking that one step further, if you ensure that the individual files in an epub are named according to their function, e.g. about-the-author.xhtml, copyright.xhtml, dedication.xhtml etc., you can easily have custom lists of search/replaces specific to each file, ensuring things like epub:types and aria roles are applied automatically, or you can edit or replace existing text with new standardized text in things like the .opf file.

If you build basic functions to perform search and replaces, then you can continually update and revise the list of things you want fixed as you discover both InDesign’s and your designers’ quirks: things like moving spaces outside of spans or restructuring the headers. If you can conceptualize what you want to do, you can build a regex to do it and just add it to the list.
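A sketch of that “list of fixes” idea might look like the following; the example rules are illustrative only, not Orca’s actual list:

```python
import re

# Each fix is a (pattern, replacement) pair; handling a newly discovered
# quirk means adding one line to this list. Example rules only.
FIXES = [
    (r"\s+</span>", "</span> "),   # move trailing spaces outside spans
    (r"<h2([^>]*)>", r"<h1\1>"),   # promote h2 headers...
    (r"</h2>", "</h1>"),           # ...and their closing tags
]

def apply_fixes(text, fixes=FIXES):
    """Run every search/replace in order over a file's text."""
    for pattern, replacement in fixes:
        text = re.sub(pattern, replacement, text)
    return text
```

The same function can then be called on every .xhtml file in the unzipped epub folder, with per-file rule lists keyed to the file names.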

You can also build multiple scripts for different stages of the process or expand into automating other common tasks. For instance the Orca toolset currently has the following scripts:

  • clean_indesign (cleans all the crud out and tidies up some basic structures),
  • clean_epub (which replaces all the headers, adds a digital rights file, rewrites the opf file to our standard, standardizes the ToC and landmarks and more…),
  • alt-text-extract (extracts image names, alt text and figcaptions to an Excel spreadsheet),
  • update_alt_text (loads an Excel spreadsheet that has alt text and, based on image file names, inserts it into the proper <img alt=""> attribute),
  • run_glossary (which searches the glossary.xhtml and creates links to the glossed words in the text),
  • extract_metadata (which loops through all the epubs in a folder and pulls the specified metadata e.g. pub date, modified date, a11y metadata, rights etc.),
  • extract_cover_alt (loops through a folder of epubs and extracts the cover alt text into an Excel spreadsheet),
  • increment_pagenumber (some of our older epubs were made from pdfs that had different page numbering from the printed book, so this script goes through and bumps them by a specified increment)

You can see the InDesign cleaning script here: github.com/b-t-k/epub-python-scripts as a basic example. As we continue to clean up and modify the rest they will slowly be added to the repository.

So you can see, learning and using Python in your workflow can speed up a lot of repetitive and time-consuming tasks and actually ensure a better-quality, more standardized book, which incidentally means making future changes to epubs becomes much more efficient.

Documentation

Concurrent with all this, Orca maintains and continually revises a set of documents that records all the code and standards we have decided on. It is kept in a series of text files that automatically update a local website, and it contains everything from the CSS solutions we use to specific lists of how ToCs are presented, our standard schema, how we deal with long descriptions, lists of epub:types and aria roles, and a record of pretty much every decision made about how Orca builds epubs. Because the website is searchable, a quick search easily finds the answer to most questions.

Our Books

This type of automation has allowed us to produce accessible non-fiction titles in-house and in a reasonable time frame. Books like Open Science or Get Out and Vote! can be produced as a Benetech-certifiable epub in just a few days, even though they feature things like indexes, linked glossaries, long descriptions for charts and a lot of alt text that was written after the fact.

And if we have the alt text ready (which is now starting to happen as a part of the workflow), producing non-fiction titles will usually take less than two days. This time frame does grow if we are remediating old epubs that were produced out-of-house, and we are giving serious thought to going back and redoing them from scratch, as it might be quicker. Also, with the establishment of the new workflow, remediating fiction titles (or producing them from scratch) now takes just a couple of hours (excluding QA)!

Producing an Accessible epub

Orca’s production process has been continually evolving. We started by focussing on making accessible non-fiction epubs without alt text, and then brought alt text into the mix after about 9 months (two seasons)—the scripts meant it was easy to go back and update those titles after alt text was created. Meanwhile we pursued Benetech certification for our fiction titles that were produced out-of-house and developed a QA process to ensure compliance. And just recently we have brought fiction production in-house as well.

At this point, as soon as the files have been sent to the printer, the InDesign files are handed over to produce the epub. Increasingly before this stage, the alt text is produced and entered in a spreadsheet. Then this is merged into the completed epub. A “first draft” is produced and run through Pagina’s EPUBCheck and Ace by DAISY to ensure compliance. Then, along with a fresh export of the alt text in a separate excel file, it is sent over to our production editor who has a checklist of code elements to work through using BBEdit, and then he views the files in Thorium and Apple Books, and occasionally Colibrio’s excellent online Vanilla Reader, checking styles, hierarchy, visual presentation and listening to the alt text.

Changes come back and usually within one or two rounds it is declared finished and passed on to the distribution pipeline. There our Data Specialist does one last check of the metadata, ensuring it matches the ONIX files, and reruns EPUBCheck and Ace before sending it out.

Spreading the load

In the background we have marketing and sales staff working on spreadsheets of all our backlist, writing and proofing alt text for the covers and interior illustration of the fiction books so it is ready to go as titles are remediated. The hope is to incorporate this cover alt text into all of our marketing materials and websites as the work is completed.

The editors meanwhile are just starting to incorporate character styles in Word (especially in specifying things like languages and italics vs. emphasis) and working with authors to build in alt text creation alongside the existing caption-writing process.

The designers are slowly incorporating standardized character and paragraph styles into their design files and changing how they structure their documents to facilitate epub exports. They are also working with the illustrators to collect and preserve their illustration notes in order to help capture the intent of illustrations so those notes can be used as a basis for alt text. They are also working to document cover discussion as a way to help facilitate more interesting and accurate cover alt text.

It will take a few more years but eventually the whole process for producing born accessible, reflowable epubs should be fully in place.

The Future

Orca is currently working towards a goal of 300 Benetech accessible epub titles in our catalog for February 2024, including everything back to 2020. And then we will continue to remediate our backlist of over 1200 titles over the next few years.

As soon as the process for fiction epubs has solidified, we’d also like to start in on our picture books and ensure that these fixed-layout epubs are as accessible as possible. It is currently an extremely time-consuming task, but we have hope that we can eventually work out a way to automate a lot of the repetitive work.

This means we need to continue to educate ourselves and our suppliers and work towards standardizing as many aspects of the workflow as possible. The more standards we create and maintain, the more automation we can employ.

And of course, this means learning even more Python…

Smart lights

…And Home Assistant

So last year I bought a couple of smart lights to play around with. I ended up installing Home Assistant in a Docker container on my Pi (the Pi 400) to control them.

I have an Office light and a TV room light.

  • The TV light turns on 1/2 hour before sunset and off at 10:10 pm every day.
  • The Office light turns on 1/2 hour before sunset, fades to a calm colour at 7 pm, and turns off completely at 10:10 pm.
  • In the morning, the Office light turns on at 7 am and then off 1/2 hour after sunrise.

But I don’t want the office light turning on and off in the morning when the days are long, so I had to write some YAML, which I’ve had to revise several times.

This is a trigger to fire my office light (in the morning) between September 25 and May 15. It still may not be working right, but so far so good.

Edit YAML mode:


condition: template
value_template: >
  {% set n = now() %}
  {{ (n.month == 9 and n.day >= 25) or n.month > 9
     or (n.month == 5 and n.day <= 15) or n.month < 5 }}
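The date window in that template can be sanity-checked in plain Python; this is just a throwaway translation of the boolean logic, not anything Home Assistant runs:

```python
def in_morning_window(month, day):
    # True between September 25 and May 15, mirroring the YAML template
    return ((month == 9 and day >= 25) or month > 9
            or (month == 5 and day <= 15) or month < 5)

print(in_morning_window(12, 25))  # True (mid-winter)
print(in_morning_window(7, 1))    # False (long summer days)
```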

https://community.home-assistant.io/t/automation-during-date-range/133814/50