Web scraping Python code

In my previous post I explained that I was looking for a way to use web scraping to extract data from my Calibre-Web shelves and automatically post them to my Books Read page here on my site. In this post I will step through my final Python script to explain to my future self what I did and why.

Warning: Security

A heads up. I make no guarantees that this code is secure enough to use in a production environment; in fact, I would guess it isn't. But my Calibre-Web webserver is local to my home network, and I trust that my hosted server (macblaze.ca) is secure enough. Since you are passing passwords and the like back and forth, though, I wouldn't count on any of this being secure without a lot more effort than I am willing to put in.

The code in bits

# import various libraries

import requests
from bs4 import BeautifulSoup
import re

This loads the various libraries the script uses. Requests is an HTTP library that lets you send requests to websites, BeautifulSoup is a library for pulling data out of HTML, and re is Python's regular-expression library, used here for custom searches.

# set variables

# set header to avoid being labeled a bot
headers = {
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

# set base url
urlpath='http://urlpath'

# website login data
login_data = {
    'next': '/',
    'username': 'username',
    'password': 'password',
    'remember_me': 'on',
}

# set path to export as markdown file 
path_folder="/Volumes/www/books/"
file = open(path_folder+"filename.md","w")

This sets up the various variables used for login, including a header to try and avoid being labeled a bot, the base URL of the Calibre-Web installation, and the login data, and it specifies a location and name for the resulting markdown file. The open() call uses the 'w' mode flag, so the script writes a new file every time it is executed, overwriting the old one.
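The effect of "w" mode can be checked in isolation. A minimal sketch, writing to a throwaway temp file instead of the real path_folder:

```python
import os
import tempfile

# Throwaway location standing in for path_folder + "filename.md"
path = os.path.join(tempfile.gettempdir(), "filename.md")

# "w" truncates on open: every run starts the file from scratch
with open(path, "w") as f:
    f.write("# Books Read\n")
with open(path, "w") as f:
    f.write("# Books Read (regenerated)\n")

with open(path) as f:
    content = f.read()
print(content)  # only the second write survives
```

Opening with "a" instead would append to the old file, which is exactly what we don't want for a page that is regenerated wholesale on each run.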

# log in and open http session

with requests.Session() as sess:
    url = urlpath+'/login'
    res = sess.get(url, headers=headers)
    res = sess.post(url, data=login_data)

Then, using Requests, I open a session on the webserver: the GET request fetches the login page and the POST submits the login data.

Writing the File

Note: The code has matching file.write() and print() statements throughout. The print() statements just write to the terminal app and allow me to see what is being written to the actual file using file.write(). They are completely unnecessary.

# Set Title

file.write("# Books Read\n")
print("# Books Read\n")

Pretty basic: write the words Books Read followed by a newline, tagged with a # to mark it as an H1 heading. This will become the actual page name.

# find list of shelves

shelfhtml = sess.get(urlpath)
soup = BeautifulSoup(shelfhtml.text, "html.parser")
shelflist = soup.find_all('a', href=re.compile('/shelf/[1-9]'))
print (shelflist)

So now we use the session we opened earlier to fetch the main page into the variable shelfhtml. Using BeautifulSoup we parse the HTML and search for all <a> links whose href matches the regular expression '/shelf/[1-9]'. (Hopefully I won't have more than 9 shelves or I will have to redo this bit.) The variable now contains a list of all the links that match that pattern and looks like this:

[<a href="/shelf/3"><span class="glyphicon glyphicon-list private_shelf"></span>2018</a>, <a href="/shelf/2"><span class="glyphicon glyphicon-list private_shelf"></span>2019</a>, <a href="/shelf/1"><span class="glyphicon glyphicon-list private_shelf"></span>2020</a>]

This, as you can see, contains the links to all three of my current year shelves, displayed in ascending numerical order.
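The same find_all-plus-regex trick can be tried on a canned snippet. The markup below is invented for illustration (only the href pattern matters); the unrelated link shows what gets filtered out:

```python
import re
from bs4 import BeautifulSoup

# Invented markup mimicking the shelf links; the /me link should be ignored
html = ('<a href="/shelf/3">2018</a>'
        '<a href="/shelf/1">2020</a>'
        '<a href="/me">Profile</a>')
soup = BeautifulSoup(html, "html.parser")

# Only anchors whose href matches /shelf/ plus a digit are returned
shelves = soup.find_all('a', href=re.compile('/shelf/[1-9]'))
hrefs = [a['href'] for a in shelves]
print(hrefs)  # ['/shelf/3', '/shelf/1']
```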
 

#reverse order of shelflist

dateshelflist = list(shelflist)
dateshelflist.reverse()
print (dateshelflist)

I wanted to display my book lists from newest to oldest, so I used Python to reverse the items in the list.
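reverse() flips the list in place, which works because the page happens to list shelves oldest-first. A sketch with plain strings standing in for the anchor tags, plus a slightly more defensive alternative:

```python
# Stand-ins for the shelf anchors; scraped order is oldest-first today
shelves = ["2018", "2019", "2020"]

# What the script does: copy, then reverse in place
newest_first = list(shelves)
newest_first.reverse()
print(newest_first)  # ['2020', '2019', '2018']

# More defensive: sort explicitly, newest first, so the result
# no longer depends on the page's own ordering
assert sorted(shelves, reverse=True) == newest_first
```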

First loop: the shelves

The first loop iterates over all the shelves (in this case three of them) and starts the process of building a book list for each.

# loop through sorted shelves

for shelf in dateshelflist:
    #set shelf page url
    res = sess.get(urlpath+shelf.get('href'))
    soup = BeautifulSoup(res.text, "html.parser")

    # find year from the shelf page heading and format
    shelfyear = soup.find('h2')
    year = re.search("([0-9]{4})", shelfyear.text)
    file.write("### {}\n".format(year.group()))
    print("### {}\n".format(year.group()))

In the first iteration of the loop, the script builds the shelf page URL from the base URL plus the href extracted from the list entry with .get('href'), and parses the HTML of the resulting page. Then the script finds the year info, which is in an H2, extracts the 4-digit year with the regex ([0-9]{4}), and writes it to the file, formatted as an H3 heading and followed by a line break.
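The year extraction works on any text containing a 4-digit run. A minimal sketch, where the heading text is a made-up example of what a shelf page's H2 might contain:

```python
import re

# Hypothetical H2 text from a shelf page
heading = "2020 Books: shelf contains 42 books"

# group() returns the matched 4-digit substring
match = re.search("([0-9]{4})", heading)
if match:
    line = "### {}\n".format(match.group())
print(line)  # ### 2020
```

Note that re.search returns None when nothing matches, so checking the match object before calling group() avoids an AttributeError on an oddly named shelf.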

# find all books

books = soup.find_all('div', class_='col-sm-3 col-lg-2 col-xs-6 book')

Using BeautifulSoup we extract the list of books from the page, knowing they are all wrapped in a div with the class col-sm-3 col-lg-2 col-xs-6 book.
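One subtlety worth remembering: when class_ is a multi-word string, BeautifulSoup matches the class attribute as written, so the CSS classes must appear in the same order as on the page. A sketch with invented markup:

```python
from bs4 import BeautifulSoup

# Invented markup: two "book" cards and one unrelated div
html = ('<div class="col-sm-3 col-lg-2 col-xs-6 book">A</div>'
        '<div class="col-sm-3 col-lg-2 col-xs-6 book">B</div>'
        '<div class="row">C</div>')
soup = BeautifulSoup(html, "html.parser")

# Multi-word class_ strings match the attribute value exactly,
# so the order of the CSS classes has to match the page
books = soup.find_all('div', class_='col-sm-3 col-lg-2 col-xs-6 book')
titles = [b.text for b in books]
print(titles)  # ['A', 'B']
```

If Calibre-Web ever reorders those classes, matching on class_='book' alone would be the more robust option.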

Second loop: the books

#loop through books. Each book is a new BeautifulSoup object.

for book in books:
        title = book.find('p', class_='title')
        author = book.find('a', class_='author-name')
        seriesname = book.find('p', class_='series')
        pubdate = book.find('p', class_='publishing-date')
        coverlink = book.find('div', class_='cover')
        if None in (title, author, coverlink, seriesname, pubdate):
            continue
        # extract year from pubdate
        pubyear = re.search("([0-9]{4})", pubdate.text)

This is the beginning of the second loop. For each book we use soup to extract the title, author, series, pubdate and cover (which I don't end up using). Each search is based on the class assigned to it in the original HTML code. Because I only want the publication year and not the full date, I again use a regex to extract the 4-digit year. The if None… statement is there in case one of the fields is missing: find() returns None for an absent tag, and skipping the entry prevents the script from crashing when it later asks for .text.
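The guard can be seen in miniature with plain values standing in for the tags BeautifulSoup returns, where None plays the role of a missing tag:

```python
# Tuples standing in for (title, author, series, pubdate, cover) lookups;
# None mimics what find() returns when the tag is absent
records = [
    ("The Cloud Roads", "Martha Wells", "Raksura", "2011", "cover"),
    ("Mystery Book", None, "Raksura", "2012", "cover"),  # no author tag
]

kept = []
for title, author, series, pub, cover in records:
    if None in (title, author, series, pub, cover):
        continue  # skip incomplete entries instead of blowing up on .text
    kept.append(title)

print(kept)  # ['The Cloud Roads']
```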

# construct line using markdown

newstring = "* ***{}*** — {} ({})\{} – ebook\n".format(title.text, author.text, pubyear.group(), seriesname.text)
file.write(newstring)
print (newstring)

Next we construct the book entry based on how we want it to appear on the web page. In my case I want each entry to be a list item that ends up looking like this:

  • The Cloud Roads — Martha Wells (2011)
    Book 1.0 of Raksura – ebook

Python's format() method fills in the {} placeholders from the variables listed in its arguments, in order, which makes for easier formatting. The script then writes the line to the open markdown file and heads back to the beginning of the loop to grab the next book.
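A minimal sketch of the positional filling, with made-up book data (an f-string is the equivalent, more modern spelling):

```python
title, author, year = "The Cloud Roads", "Martha Wells", "2011"

# Positional {} placeholders are filled left to right by format()'s arguments
line = "* ***{}*** — {} ({})\n".format(title, author, year)
print(line)

# An f-string produces the same result with the names inline
assert line == f"* ***{title}*** — {author} ({year})\n"
```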

More loops

That’s pretty much it. It loops through the books until it runs out and heads back to the first loop to see if there is another shelf to process. After it processes all the shelves it drops to the last line of the script:

file.close()

which closes the file and that is that—c’est tout. The file will now be accessed the next time someone visits the Books Read page on my site.

In Conclusion

Hopefully this is clear enough that when I forget every scrap of Python in the years to come, I can still recreate this after the inevitable big crash. The script, called scrape.py in my case, is executed in Terminal by going to the enclosing folder, typing python3 scrape.py and hitting Enter. Automating that is something I will ponder if this book list thing becomes my ultimate methodology for recording books read. Its big failing is that it only records ebooks in my Calibre library. I might have to redo the entire thing for something like LibraryThing where I can record all my books…lol. Hmmm… maybe…

The Final Code

Here is the final script in its entirety.

# import various libraries
import requests
from bs4 import BeautifulSoup
import re

# set header to avoid being labeled a bot
headers = {
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}

# set base url
urlpath='http://urlpath'

# website login data
login_data = {
    'next': '/',
    'username': 'username',
    'password': 'password',
    'remember_me': 'on',
}

# set path to export as markdown file
path_folder="/Volumes/www/home/books/"
file = open(path_folder+"filename.md","w")

with requests.Session() as sess:
    url = urlpath+'/login'
    res = sess.get(url, headers=headers)
    res = sess.post(url, data=login_data)

# Note: print() commands are purely for terminal output and unnecessary

# Set Title
file.write("# Books Read\n")
print("# Books Read\n")

# find list of shelves
shelfhtml = sess.get(urlpath)
soup = BeautifulSoup(shelfhtml.text, "html.parser")
shelflist = soup.find_all('a', href=re.compile('/shelf/[1-9]'))
# print (shelflist)

#reverse order of shelflist
dateshelflist = list(shelflist)
dateshelflist.reverse()
# print (dateshelflist)

# loop through sorted shelves
for shelf in dateshelflist:

    #set shelf page url
    res = sess.get(urlpath+shelf.get('href'))
    soup = BeautifulSoup(res.text, "html.parser")

    # find year and format
    shelfyear = soup.find('h2')
    year = re.search("([0-9]{4})", shelfyear.text)
    file.write("### {}\n".format(year.group()))
    print("### {}\n".format(year.group()))

    # find all books
    books = soup.find_all('div', class_='col-sm-3 col-lg-2 col-xs-6 book')

    #loop through books. Each book is a new BeautifulSoup object.
    for book in books:
        title = book.find('p', class_='title')
        author = book.find('a', class_='author-name')
        seriesname = book.find('p', class_='series')
        pubdate = book.find('p', class_='publishing-date')
        coverlink = book.find('div', class_='cover')
        if None in (title, author, coverlink, seriesname, pubdate):
            continue
        # extract year from pubdate
        pubyear = re.search("([0-9]{4})", pubdate.text)
        # construct line using markdown
        newstring = "* ***{}*** — {} ({})\{} – ebook\n".format(title.text, author.text, pubyear.group(), seriesname.text)
        file.write(newstring)
        print (newstring)

file.close()

Note 12/2021

There has been an update to the Calibre-Web code, so I had to make some changes to the Python script.

Making a “Books Read” page

So recently I came across a web page called How I manage my ebooks by a fellow named Aleksandar Todorović. He is a developer who wanted to track his reading on his webpage. He introduced me to a Calibre project called Calibre-Web, which is basically a web interface for Calibre with a few extra bells and whistles. Reading through his explanation, it seemed pretty simple to implement except for this statement:

As a final step in the chain, I have created a script that allow me to publish the list of books I’ve read on my website. Since Calibre-Web doesn’t have an API, I ended up scraping my own server using Python Requests  and BeautifulSoup . After about one hundred lines of spaghetti code gets executed, I end up with two files:

  • books-read.md, which goes straight to my CMS, allowing me to publicly share the list of books I have read, sorted by the year in which I’ve finished reading them.

The Process

So I set about to try and implement my own version of Aleksandar’s project. In my typical trial and error fashion it took a couple of days of steady work and I learned a ton along the way.

Calibre-Web

I went ahead and downloaded Calibre-Web and wrestled with getting it running on my test server (my old Mac mini). It is a Python script, and I am still a bit fuzzy about the proper way to actually deploy it. I ended up writing a shell script to run the command "nohup python /Applications/calibre-web-master/cps.py" and then made it executable from my desktop. I still have some work to do there to finalize that solution.

I have to say I like the interface of Calibre-Web much more than desktop Calibre's, and although there are a few quirks, I will likely be using the web version much more than the desktop from now on.

Then I made a few shelves with the books I had read in 2019 and 2020 and was good to go. Now I just needed to get those Shelves onto my website somehow.

Web Scraping

Now, I had never heard of the term web scraping, but the concept was familiar, and it turns out it is quite the thing.

Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites
Web scraping, Wikipedia

The theory being that since all the info is accessible in the basic code of the Calibre-Web pages, all I needed to do was extract and format it, then repost it to this site. So I did. Voila: My Books Read page.

I guess I skipped the tough part…

Starting out I understood Python was a programming language, but had no idea what Python Requests or BeautifulSoup were. Turns out that Requests is an HTTP library that fetches the raw HTML of a page, and BeautifulSoup is a library for extracting and formatting long strings of markup into useful data.

Start with Google

I started by a quick search and found a few likely examples to follow along with.

https://medium.com/the-andela-way/introduction-to-web-scraping-87edf94ac692
https://medium.com/the-andela-way/learn-how-to-scrape-the-web-2a7cc488e017
https://www.dataquest.io/blog/web-scraping-beautifulsoup/

These were helpful in explaining the structure and giving me some basic coding ideas, but I mostly relied on https://realpython.com/beautiful-soup-web-scraper-python/ to base my own code on.

Step one

I got everything running (this included sorting out the mess that is Python on my computer, but that is another story) and tried to get a basic Python script to talk to my Calibre installation. Turns out that even though my web browser was logged into Calibre-Web, my script wasn't. Some more googling found me this video (Website login using request library in Python) and it did the trick for writing the login portion of my script.

Step two

Then I wrote a basic script that extracted data (much more on this later) and saved it to a markdown file on the webserver. I figured markdown was easier to implement than html and knew WordPress could handle it.

Or could it? Turns out the Jetpack implementation was choking on my markdown file for some reason. I fought with it for a while and eventually decided to see if I could find a different WordPress plugin to do the job. It turned out I could kill two birds with one stone with Mytory Markdown, which loads (and reloads) a correctly formatted remote .md file into a page every time someone visits.

Step three

After I got a sample page loaded on the website, I realized it was missing the pub date and series name, which, if you have ever visited one of my annual books read posts (Last Books of the decade: 2019, Books 2018—Is this the last year? etc.), is essential information. So I had to go into the Calibre-Web code and add those particular pieces of info to the shelf page so I would be able to scrape it all at the same time. I ended up adding this:

{% if entry.series|length > 0 %}
    <p class="series">
        {{_('Book')}} {{entry.series_index}} {{_('of')}} <a href="{{url_for('web.books_list', data='series',sort='abc', book_id=entry.series[0].id)}}">{{entry.series[0].name}}</a>
    </p>
{% endif %}


{% if entry.pubdate[:10] != '0101-01-01' %}
    <p class="publishing-date">{{entry.pubdate|formatdate}} </p>
{% endif %}

…to shelf.html in the /templates folder of the Calibre-Web install. I added it around line 45 (just after the {% endif %} for the author section). It took a bit of fussing to make it look good but it worked out great.

Step four

Now all I have to do is figure out how to run my scrape.py script. For now I will leave it a manual process and just run it after I update my Calibre-Web shelves, but making that automatic is on the list for “What’s Next…”

Ta-da

So between this post and Aleksandar’s I hope you have a basic idea of what you need to do in order to try and implement this solution. More importantly, when future me comes back and tries to figure out what the hell all this gobbledey-gook mess is, I can rebuild the system based on these sketchy notes. I will end this here and continue in a new post on the actual Python/BeautifulSoup code I came up with to get the web scraping done.

Instagram This Week

The calm before the storm. Last spring in Rebecca Spit and the Octopus Islands. Can’t wait for spring 2020. #pnw #desolationsound #sailing #beautifulbc
I’d always wondered. #healthyingredients
Things to do, when it’s minus 32. It’s an internal heat ?

Instagram This Week

2019. The year I mastered this. And, if last night’s creation was any indication, the whole pizza making process.
Happy new year everyone! Here’s to another year of learning and growing.

Last Books of the decade: 2019

Well, it’s that time again. I am a little late this year as I haven’t actually written anything before New Year’s Day, so the additional commentary might not be all that well thought out. But as The Raes said back in 1978, que sera, sera. So without further ado, here is what I read in 2019:

Books 2019

January (10)

The Gate Thief Orson Scott Card (2013)
Mithermage Book 2 – ebook;

Gatefather Orson Scott Card (2015)
Mithermage Book 3 – ebook;

The Consuming Fire John Scalzi (2018)
The Interdependency Book 2 – ebook;

Friendly Fire Dale Lucas (2018)
Fifth Ward Book 2 – ebook;

The Gap into Conflict: The Real Story Stephen R Donaldson (1990)
The Gap Cycle Book 1 – ebook; reread

The Gap into Vision: Forbidden Knowledge Stephen R Donaldson (1991)
The Gap Cycle Book 2 – ebook; reread

Stand by for Mars Carey Rockwell (1952)
Tom Corbett: Space Cadet Book 1 – ebook; reread

The Gap into Power: A Dark and Hungry God Arises Stephen R Donaldson (1992)
The Gap Cycle Book 3 – ebook; reread

The Gap into Madness: Chaos and Order Stephen R Donaldson (1994)
The Gap Cycle Book 4 – ebook; reread

Major Barbara George Bernard Shaw (1905)
– ebook; reread

February (10)

The Gap into Ruin: This Day All Gods Die Stephen R Donaldson (1996)
The Gap Cycle Book 5 – ebook; reread

Six Characters in Search of an Author Luigi Pirandello (1921)
– ebook; reread

Star Hunter Andre Norton (1961)
– ebook;

The Armored Saint Myke Cole (2018)
The Sacred Throne Book 1 – ebook;

Our American Cousin Tom Taylor (1858)
– ebook;

Firebird Jack McDevitt (2011)
Alex Benedict Book 6 – ebook;

The Queen of Crows Myke Cole (2018)
The Sacred Throne Book 2 – ebook;

Pygmalion George Bernard Shaw (1913)
– ebook;

An Ember in the Ashes Sabaa Tahir (2015)
An Ember in the Ashes Book 1 – ebook;

A Torch Against the Night Sabaa Tahir (2016)
An Ember in the Ashes Book 2 – ebook;

March (13)

A Reaper at the Gates Sabaa Tahir (2018)
An Ember in the Ashes Book 3 – ebook;

Coming Home Jack McDevitt (2014)
Alex Benedict Book 7 – ebook;

Short Fiction Ivan Bunin (1907)
– ebook;

The Second Mrs. Tanqueray Arthur Pinero (1893)
– ebook; reread

Chanur’s Venture C.J. Cherryh (1984)
Chanur Book 2 – ebook; reread

Sing the Four Quarters Tanya Huff (1994)
Quarters Book 1 – ebook;

Dr Faustus Christopher Marlowe (1604)
– ebook; reread

Under a Graveyard Sky John Ringo (2013)
Black Tide Rising Book 1 – ebook; reread

To Sail a Darkling Sea John Ringo (2014)
Black Tide Rising Book 2 – ebook; reread

Islands of Rage and Hope John Ringo (2014)
Black Tide Rising Book 3 – ebook; reread

Strands of Sorrow John Ringo (2015)
Black Tide Rising Book 4 – ebook; reread

Fifth Quarter Tanya Huff (1995)
Quarters Book 2 – ebook;

No Quarter Tanya Huff (1996)
Quarters Book 3 – ebook;

April (18)

The Alchemist Ben Jonson (1610)
– ebook; reread

Alice Payne Rides Kate Heartfield (2019)
Alice Payne Book 2 – ebook;

A Memory Called Empire Arkady Martine (2019)
Teixcalaan Book 1 – ebook;

Shards of Honor Lois McMaster Bujold (1986)
Vorkosigan Book 1 – ebook; reread

Barrayar Lois McMaster Bujold (1991)
Vorkosigan Book 2 – ebook; reread

The Warrior’s Apprentice Lois McMaster Bujold (1986)
Vorkosigan Book 3 – ebook; reread

Mountains of Mourning Lois McMaster Bujold (1989)
Vorkosigan Book 4 – ebook; reread

The Vor Game Lois McMaster Bujold (1990)
Vorkosigan Book 5 – ebook; reread

Cetaganda Lois McMaster Bujold (1995)
Vorkosigan Book 6 – ebook; reread

Borders of Infinity Lois McMaster Bujold (1989)
Vorkosigan Book 7 – ebook; reread

Brothers in Arms Lois McMaster Bujold (1989)
Vorkosigan Book 8 – ebook; reread

Ethan of Athos Lois McMaster Bujold (1986)
Vorkosigan Book 6.5 – ebook; reread

Mirror Dance Lois McMaster Bujold (1994)
Vorkosigan Book 9 – ebook; reread

Memory Lois McMaster Bujold (1996)
Vorkosigan Book 10 – ebook; reread

Komarr Lois McMaster Bujold (1998)
Vorkosigan Book 11 – ebook; reread

A Civil Campaign Lois McMaster Bujold (2000)
Vorkosigan Book 12 – ebook; reread

Winterfair Gifts Lois McMaster Bujold (2004)
Vorkosigan Book 12.5 – ebook; reread

Diplomatic Immunity Lois McMaster Bujold (2002)
Vorkosigan Book 13 – ebook; reread

May (15)

Cryoburn Lois McMaster Bujold (2010)
Vorkosigan Book 14 – ebook; reread

Captain Vorpatril’s Alliance Lois McMaster Bujold (2012)
Vorkosigan Book 15 – ebook; reread

The Gentleman Jole and the Red Queen Lois McMaster Bujold (2016)
Vorkosigan Book 16 – ebook; reread

The Flowers of Vashnoi Lois McMaster Bujold (2018)
Vorkosigan Book 16.5 – ebook; reread

A Passage of Stars Kate Elliott (1990)
Highroads Book 1 – ebook; reread

Revolution’s Shore Kate Elliott (1990)
Highroads Book 2 – ebook;

The Price of Ransom Kate Elliott (1990)
Highroads Book 3 – ebook;

Falling Free Lois McMaster Bujold (1998)
– ebook; reread

Finders Melissa Scott (2018)
Firstborn, Lastborn Series Book 1 – ebook;

The Cloud Roads Martha Wells (2011)
Raksura Book 1 – ebook; reread

The Serpent Sea Martha Wells (2012)
Raksura Book 2 – ebook; reread

The Siren Depths Martha Wells (2012)
Raksura Book 3 – ebook; reread

The Edge of Worlds Martha Wells (2016)
Raksura Book 4 – ebook;

Blackcollar Timothy Zahn (1983)
Blackcollar Book 1 – ebook; reread

Backlash Mission Timothy Zahn (1986)
Blackcollar Book 2 – ebook;

June (6)

Tarnsman of Gor John Norman (1966)
Gor Book 1 – ebook; reread

Spinning Silver Naomi Novik (2018)
– ebook;

Cold Welcome Elizabeth Moon (2017)
Vatta’s Peace Book 1 – ebook;

Into the Fire Elizabeth Moon (2018)
Vatta’s Peace Book 2 – ebook;

Ancestral Nights Elizabeth Bear (2018)
White Space Book 1 – ebook;

Starless Jacqueline Carey (2018)
– ebook;

July (10)

Madness in Solidar L. E. Modesitt Jr. (2015)
The Imager Portfolio Book 9 – ebook; reread

Treachery’s Tools L. E. Modesitt Jr. (2016)
The Imager Portfolio Book 10 – ebook; reread

Assassin’s Price L. E. Modesitt Jr. (2017)
The Imager Portfolio Book 11 – ebook; reread

Endgames L. E. Modesitt Jr. (2019)
The Imager Portfolio Book 12 – ebook;

Short Fiction Mack Reynolds (2019)
– ebook;

Imager L. E. Modesitt Jr. (2009)
The Imager Portfolio Book 1 – ebook; reread

Imager’s Challenge L. E. Modesitt Jr. (2009)
The Imager Portfolio Book 2 – ebook; reread

Imager’s Intrigue L. E. Modesitt Jr. (2010)
The Imager Portfolio Book 3 – ebook; reread

The Merry Wives of Windsor William Shakespeare (1605)
– ebook;

Terminal Uprising Jim C. Hines (2019)
Janitors of the Post-Apocalypse Book 2 – ebook;

August (10)

Octavia Gone Jack McDevitt (2019)
Alex Benedict Book 8 – ebook; reread

Henry V William Shakespeare (1599)
– ebook; reread

Merchanter’s Luck C.J. Cherryh (1982)
Alliance-Union– ebook; reread

Finity’s End C.J. Cherryh (1997)
Alliance-Union – ebook; reread

Empress of Forever Max Gladstone (2019)
– ebook;

Warhorse Timothy Zahn (1990)
– ebook; reread

Pawn Timothy Zahn (2018)
Sibyl’s War Book 1 – ebook;

The Orphans of Raspay Lois McMaster Bujold (2019)
Penric and Desdemona Book 7 – ebook;

Red Sister Mark Lawrence (2017)
Book of the Ancestors Book 1 – ebook;

Grey Sister Mark Lawrence (2018)
Book of the Ancestors Book 2 – ebook;

September (7)

Rimrunners C.J. Cherryh (1989)
Alliance-Union – ebook;

Nevernight Jay Kristoff (2016)
The Nevernight Chronicle Book 1 – ebook; reread

Godsgrave Jay Kristoff (2017)
The Nevernight Chronicle Book 2 – ebook; reread

The Jeeves Stories P.G. Wodehouse (1920)
– ebook;

DarkDawn Jay Kristoff (2019)
The Nevernight Chronicle Book 3 – ebook;

Good Company Dale Lucas (2019)
Fifth Ward Book 3 – ebook;

Holy Sister Mark Lawrence (2019)
Book of the Ancestors Book 3 – ebook;

October (6)

Through Fiery Trials David Weber (2019)
Safehold Book 12 – ebook;

Hammered Elizabeth Bear (2005)
Jenny Casey Book 1 – ebook; reread

Scardown Elizabeth Bear (2005)
Jenny Casey Book 2 – ebook; reread

Worldwired Elizabeth Bear (2005)
Jenny Casey Book 3 – ebook; reread

Denver is Missing D. F. Jones (1971)
– ebook; reread

On the Beach Nevil Shute (1957)
– ebook; reread

November (11)

The End of the Matter Alan Dean Foster (1977)
Pip and Flinx Book 4 – ebook; reread

Flinx in Flux Alan Dean Foster (1988)
Pip and Flinx Book 5 – ebook; reread

Mid-Flinx Alan Dean Foster (1995)
Pip and Flinx Book 6 – ebook; reread

Reunion Alan Dean Foster (2001)
Pip and Flinx Book 7 – ebook;

Flinx’s Folly Alan Dean Foster (2001)
Pip and Flinx Book 8 – ebook;

Sliding Scales Alan Dean Foster (2004)
Pip and Flinx Book 9 – ebook;

Crystal Singer Anne McCaffrey (1982)
Crystal Singer Book 1 – ebook;

Killashandra Anne McCaffrey (1985)
Crystal Singer Book 2 – ebook; reread

Crystal Line Anne McCaffrey (1992)
Crystal Singer Book 3 – ebook;

Running from the Deity Alan Dean Foster (2005)
Pip and Flinx Book 10 – ebook;

The Ruins of Gorlan John Flanagan (year)
Ranger’s Apprentice Book 1 – ebook;

December (7)

Bloodhype Alan Dean Foster (1973)
Pip and Flinx Book 11 – ebook;

Trouble Magnet Alan Dean Foster (2006)
Pip and Flinx Book 12 – ebook;

Knight Timothy Zahn (2019)
Sibyl’s War Book 2 – ebook;

Velocity Weapon Megan E. O’Keefe (2019)
The Protectorate Book 1 – ebook;

The Harbors of the Sun Martha Wells (2017)
Raksura Book 5 – ebook;

Patrimony Alan Dean Foster (2007)
Pip and Flinx Book 13 – ebook;

Flinx Transcendent Alan Dean Foster (2008)
Pip and Flinx Book 14 – ebook;

((\
(-.-)
o_(“)(“)

The Stats

123 books
64 rereads
0 audiobooks
10.25/month, .337/day

My ebook library now sits at 735 books.

This is significant because for the first time in my entire life I have a backlog of unread books to get through. Frankly I am a bit ashamed: 87 unread books! Now granted, around 31 are ebook copies of paper books I have previously read, and a further 16 are classic fiction, emergency books (like Around the World in 80 Days or Middlemarch) that I downloaded many years ago “just in case,” but that still leaves a whopping 40 titles I need to get through to establish some equilibrium. Time to cut down on the rereads I guess…

What I Read

As usual it was primarily SF and fantasy. Due to my work on the Standard Ebooks project I did add a bit of variety, including 9 plays and 4 non-SF/F titles, among them a massive collection of depressing yet fascinating Russian short stories and a bunch of the original Jeeves stories. I commend both to your attention.

Significant among the rereads was Stephen R Donaldson’s Gap Cycle which, while I was among the legion of Thomas Covenant fanboys back in the day, seems to me the much better work and certainly more able to stand the test of time. I also revisited Elizabeth Bear’s first published books, the Jenny Casey series, which were still great though they bore the rough edges of a new writer. I say this only because she has gone on to become probably my most revered author of the modern age—man, that woman can spin a good story… again and again… and again. I also reread the entire Bujold Miles Vorkosigan tale, all 16 books with associated side stories and novels, and John Ringo’s Black Tide Rising zombie apocalypse series — both of which were as enjoyable as ever.

I did reread and then finish off two series. The first was Martha Wells’ Raksura, where I reread the first three books and finished off the last two. I have to say it was OK, but it paled in comparison to her much more excellent Murderbot series of novellas. Part of that is that the Raksura novels had a very clunky, episodic feel—I admit to being a bit nervous about the forthcoming Murderbot novel… maybe her forte is shorter fiction? Speaking of the test of time, the second series I finished off definitely suffered, although I don’t know if that was me or the books. I first encountered Alan Dean Foster’s Flinx and his sidekick minidrag Pip back in the late 70s. It was certainly some of the earliest SF I ever read. I faithfully read along as he published new novels until about 1995 (Book 6) and then sort of dropped the ball for almost 25 years. As of today I have one more to read (Strange Music, Book 15, published in 2017) and then I assume he is done. What started as a sort of advanced YA morphed into a more adult-oriented series, but I am not sure the style suited it. Suffice it to say I was not as enamoured of the later books, and even the early ones read a bit below my expectations and memories on the reread.

Another eye-opener was my decision to reread Tarnsman of Gor, the first in the Gor series written by John Norman. Written in the style of Burroughs’ Barsoom books, they are definitely not recommended for any reader who can’t situate themselves in a 50s or 60s mindset. Seriously. They would probably cause a brain aneurysm in most younger, modern readers. And while the first one isn’t that bad, I seem to remember that by Book 8 or 9 he started to spend whole chapters talking about the natural servility of women and other pretty ridiculous ideologies. It was good to remind myself of the past, but I find myself pretty settled in the future now, thank you very much.

One last word on the past. I edited a collection of Mack Reynolds stories for Standard Ebooks. Written in the 50s, mostly for SF rags, they are a pretty amazing look into the future of human political systems and technology. I was truly impressed by how much he got right. An underrated author, if you ask me.

Modern SF

One last bit on the theme. It occurred to me this year that modern science fiction and fantasy (let’s say the last 15 or so years) is a lot tighter and better written than the older stuff. I am too lazy to seriously look at what that means or why it is (I left all that behind with my English degree), but overall the craftsmanship is way up. I am sure a lot of that is because the people in the trade these days—both writers and editors—are standing on the shoulders of giants, and that the freeing of the publishing world from the oppressive yoke of traditional publishing has contributed to greater exposure for authors. (Note: I am being extremely sarcastic about the oppressive yoke bit, but not about the potential contribution. See Hugh Howey and Andy Weir.)

Whatever the reasons, I have found a new interest in fantasy, an interest which had almost died out with the never-ending, multi-book, soap-opera-like series that have dominated the market these last bunch of years, and I was delighted several times this past year by authors like Sabaa Tahir, Mark Lawrence and Jay Kristoff. Even the now venerable Jacqueline Carey stretched her wings with a most excellent stand-alone novel: Starless.

And the SF has kicked it up a notch too; check out some truly “novel” and exciting stuff by people like Arkady Martine and Megan E. O’Keefe.

All this to say, I am enjoying the new crop of my chosen genre’s publishing efforts. Congratulations to each and every one of you who has contributed to what I will deign to call a resurgence 😉


C’est tout. As I said, I am behind the times, so hopefully you can already catch Leslie’s 2019 book & music list here and the one, the only, the original Earl’s list here.

Links to previous years’ book posts:


Vlogging 2019

I finally got around to editing all my video footage from April 2019. It really only needed 3 videos to cover the trip, but I wanted to do the week-plus spent travelling with the Calgary Yacht Club as a separate video, so I ended up making 4, with the last one mostly just a round-up of the final week, easing back into the real world and cleaning up.

I also thought I would publish a bit of background on the vids themselves. Some of the online forums I participate in are filled with curmudgeons who insist that YouTube is filled with freeloaders and people with no, or bad, work ethic. If the amount of effort I put into my — admittedly bad — videos can be taken as a measure, then those who produce high- (or even medium-) quality videos for their sailing channels every week are in no way suffering from work-ethic issues.

My Channel

All 23 of my videos can be found here on the Never for Ever channel. It has a whopping 21 subscribers, half-a-dozen likes, and 6.8k lifetime views. The most viewed, at 1.7k, is a short, mostly unedited video of our second solo transit through Dodd Narrows on a Bayliner 38. I guess people just want to see what all the hype is about? The fewest views (not counting the new ones) is 62, for Part 6 of our 2017 trip to Desolation—I guess people were getting bored by then, as Part 1 has 222 views.

For those who have never been to the channel, here is a summary of what you will find:

  • 1 early and long video of a flotilla trip to Broughtons — 2014
  • 3 short Broughtons’ videos — 2015
  • 3 test promo videos for NYCSS
  • 7 medium-length videos of Desolation Sound — 2017
  • 3 long Broughtons’ videos — 2018
  • 4 medium Desolation videos —  2019

Videos

Why do I make these? Successful YouTube channels have a hook or theme—something to attract and retain viewers. Me? Not so much. I got to thinking about it during the last round of editing and realized my imagined audience (the enormous number of sadly ignorant people who really should come to know and love the PNW) and my real audience (family and friends) were worlds apart. Given the videos I do create, I would guess my subconscious realizes this, as I would definitely characterize them as a “travel log” for people who know us.

To expand the channel by any degree I think I would have to make the leap to official travelogue. This format has a long and storied history in the world of television but it would take a bunch more research and filming to really show the essence of places we visit. And it’s not really feasible for me to move into the vlog world. At the very least, that would take a more cooperative (and less camera shy) partner and a lot more talking to camera…or even any talking to camera. Hmmmm…

And there is no way we could really be a true sailing channel — not with all the motoring we do 🙂 I can just imagine the scathing comments.

One of the main reasons I make the videos is to practice and sharpen my software skills. My “Map” is a perfect example of this. I needed a map to display routes, and rather than steal (and it actually is stealing) one off the internet I decided to make my own based on various sources. I traced a detailed outline into Adobe Illustrator and then modified it and added layers. This was then imported into Adobe After Effects, where routes were added and animated, and finally placed into the main Premiere video file for integration with the rest of the footage. It is labour-intensive and complex, and I end up learning something new every time I attempt it. Good practice, lousy efficiency.

The 3 videos I did to promote Nanaimo Yacht Charters were very much done for practice and proof of concept. It started when Water Dragon, a 2017 Lagoon 42, was going into charter and the owner had done some videos to promote the boat — and I really hated his splash screens. So I volunteered to see if I could add something to them.

Then I redid an interview video he had done, tweaking the audio and lighting, and played with a sample cruise itinerary of the Gulf Islands, which incidentally is my second-most-viewed video despite the fact that it is beyond horrible.

I have also discovered subtitles recently. When I took my one brother for a tour through the Broughtons, I checked to see if YouTube had CC (closed caption) capability (he’s deaf). It turns out they have an automatic captioning tool that, like most “autocorrect”-type features, produces some awesomely funny results. Fortunately you can edit and add your own captions, so I have started doing that for all my videos.

Tools

Hardware-wise I mostly use my iPhone 7 (or iPhone 5 in past videos), a Nikon Coolpix L80 with 28x optical zoom for long shots, and an SJCam (which is a cheap GoPro knockoff) for wide angle, timelapse and underwater shots. I long for a drone but keep talking myself out of it. At home I use my 2015 Macbook Pro to edit and last year invested in a 32-inch Samsung monitor after my old 21″ burned out.

As you can see, one of the “hardest” things about cruising is giving up all that delicious screen real estate for the puny 13″ monitor on my laptop. 😉

I am lucky enough to have the full Adobe Creative Suite, so I use After Effects for animations and titles, Audition to prep and balance the audio, and Premiere to put it all together. This year’s videos feature a lot of colour grading, which is a new skill for me, and a little experimentation with 2.33:1 anamorphic aspect ratios. I will usually open Photoshop and Illustrator at least once during a project to tweak an image or build some sort of graphic like the compass rose.

2019 Offerings

I had originally meant to say something about time invested and work ethic, but I think I have blithered on enough. Suffice it to say I shot these videos in April and it is now December. The rest I will leave for a future post. So, without further ado, here are the four 2019 videos.

Part one is YEG to Lund.

Part two is Desolation Sound: Lund to Shark Spit.

Part three is a loop through the Discovery Islands with my brother and the Calgary Yacht Club‘s annual flotilla and our return to Smuggler Cove.

Part four is just our trip home from Smuggler, an encounter with the start of the VanIsle 360, and cleaning up.


—Bruce #Cruising, #Equipment

Instagram This Week

CC-130 does a flyby at Griesbach while we spent a moment at the RCAF memorial there. #rcaf #remembranceday #lestweforget
The 439th at ready in Marville in 1963. Remember. #remembranceday #lestweforget #rcaf #1wing #CF-86