Time to put out the results of one of my latest projects.
If you want to jump straight to the end result, just check it out: Top Read and Tweeted Kindle Books
It is currently set to update automatically every hour.
For more details of what this is and what I did, read on…
Over the last few months, 15 to 30 minutes at a time whenever I had a few moments, I’ve been putting together something I’d been curious about for a long time.  A while back, a feature was added to Kindles that lets you share that you have finished a book.  When you get to the last page, it asks whether you want to post a note on Facebook or Twitter saying you have finished it.
This naturally leads one to wonder…  well, at least it leads me to wonder…  which books people are finishing and how that compares to standard lists of what books people are buying.  After all, probably most books that are bought do NOT actually get read, certainly not all the way through.  These social media posts might give at least some window into that.
Now, to be clear, in the end, looking at these can NOT tell you what people in general are reading.  For one thing, it is just Kindle books.  For another thing, it is only people who bother to connect their social sites to their Kindles.  And then it is only the books that they choose to share publicly…  there is surely lots of reading people just don’t want to share.
But I thought it would be interesting anyway.  I concentrated on the Twitter side because I thought I had an idea how to do that.  When people finish their books they can choose to edit and customize what they Tweet, but if they don’t, then the tweets have a standard format, and I could grab and parse those tweets.  So I started collecting that data.  Then I set things up to strip out as much of the “extra” material in the tweets as I could (although when people add custom text, I can’t really catch that), and then to sort and count what was left to come up with a ranked list.  The parsing is by no means perfect, but it is good enough for now.
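For the curious, here is a minimal Python sketch of that kind of strip, parse, and count pipeline.  It is not the actual code behind the list.  The tweet wording in TEMPLATE is an assumed stand-in (the real standard format isn't spelled out here), and extract_book and rank_books are made-up names for illustration.

```python
import re
from collections import Counter

# Links, hashtags and mentions count as "extra" material to strip out.
URL_OR_TAG = re.compile(r"https?://\S+|[#@]\w+")

# ASSUMPTION: a hypothetical standard wording, used only for illustration.
# The greedy title group splits on the LAST " by ", so titles that
# themselves contain "by" still parse.
TEMPLATE = re.compile(
    r"^I finished reading (?P<title>.+) by (?P<author>.+)$", re.IGNORECASE
)


def extract_book(tweet):
    """Return (title, author) for a standard-format tweet, else None."""
    cleaned = URL_OR_TAG.sub(" ", tweet)
    cleaned = " ".join(cleaned.split())  # collapse leftover whitespace
    match = TEMPLATE.match(cleaned)
    if match is None:
        return None  # customized tweets fall through here
    return match.group("title").strip(), match.group("author").strip()


def rank_books(tweets, top_n=20):
    """Count parsed books and return the top_n as ((title, author), count)."""
    counts = Counter()
    for tweet in tweets:
        book = extract_book(tweet)
        if book is not None:
            counts[book] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = [
        "I finished reading Dune by Frank Herbert http://t.co/abc #Kindle",
        "I finished reading Dune by Frank Herbert http://t.co/xyz #Kindle",
        "I finished reading Emma by Jane Austen #Kindle",
        "Loved it! Best book ever",
    ]
    for (title, author), count in rank_books(sample):
        print(count, title, "by", author)
```

Anything that doesn't match the template is simply skipped, which is the same limitation described above: heavily customized tweets don't get counted.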
I tried looking at the last 10,000 tweets, but there were still way too many ties in the top 20.  So I looked at the last 20,000 tweets, but given the current rate of these tweets you would have to go back farther in time than I wanted, so it would be pretty slow to respond to changes.  For now I’ve settled at the last 16,384 tweets.  Why 16,384?  I am a geek, it is a power of two, it is between 10,000 with too many ties, and 20,000 with too much time, and at the current rate of tweeting it is pretty close to a month of tweets.
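The window-size trade-off can be sketched in Python too.  This is only an illustration under assumptions, not the real setup: span_days and ties_in_top are hypothetical helper names, and the two timestamps are simply the "Includes tweets from … to …" bounds reported further down in this post.

```python
from collections import Counter, deque
from datetime import datetime

WINDOW = 2 ** 14  # 16,384, a power of two between 10,000 and 20,000


def span_days(start, end):
    """Length of the interval between two datetimes, in days."""
    return (end - start).total_seconds() / 86400


def ties_in_top(counts, top_n=20):
    """How many of the top_n entries share their count with another one."""
    top = [count for _, count in counts.most_common(top_n)]
    freq = Counter(top)
    return sum(1 for count in top if freq[count] > 1)


# A deque with maxlen keeps only the newest WINDOW items: once it is
# full, appending a new tweet silently drops the oldest one.
recent = deque(maxlen=WINDOW)

if __name__ == "__main__":
    oldest = datetime(2013, 2, 6, 20, 54, 17)
    newest = datetime(2013, 3, 10, 19, 58, 16)
    days = span_days(oldest, newest)
    print(f"{WINDOW} tweets over {days:.2f} days")
    print(f"about {WINDOW / days:.0f} tweets per day")
```

A bigger window means fewer ties but a longer span, so the list reacts more slowly; running ties_in_top on the counts for a few candidate sizes is one way to pick a compromise.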
In any case, I put the last tweaks on this in the last 24 hours, and I figure now it is ready to go live.
To get the latest up-to-the-hour counts, go to the page I’ve set up for this:  Top Read and Tweeted Kindle Books
As of the hour I am posting this though, here is what the list looks like:
Data as of 2013-03-10 20:00:16 UTC, covering 16384 tweets over 31.96 days.
Includes tweets from 2013-02-06 20:54:17 UTC to 2013-03-10 19:58:16 UTC.
And there it is.  Not quite the same as the bestseller lists, but fun to look at and see how it changes over time.
Oh, and yes, I know that it would be trivial to manipulate this list, since it just counts tweets in a specific format, and anybody could tweet as many tweets as they wanted in that format, no reading of a book required.  But hey, still fun.