Best Selling Book Covers

After I finished my master's degree in San Diego, a good friend of mine gifted me a book he thought I'd enjoy. Probably unbeknownst to this friend, my parents and family had long since given up on trying to get me to read books for pleasure. While I'd pore over pages on the internet, and have always loved cinema, I stopped reading (outside of school) when I was about 16.

I was 25, the book was classic science fiction, and it literally changed my life. I read it every day while walking to and from my office on campus. Strolling slowly to school I would get in about an hour of reading per day, and it still took over a month to finish! Not reading fiction for a decade leaves your mind out of shape. Now I love books, and have been trying to consume the classics that were recommended to me so many years ago.

This metamorphosis has made me passionate about books again, concerned for libraries, and an active reader. One thing I noticed right away, especially when buying used paperback science fiction, is how bizarre book cover art can be. Covers range from basic solid hues to gaudy airbrushed scenes of romance. This was a culture, an entire art scene, that I knew nothing about!

I do know a bit about movie posters, particularly from my youth working in a movie theater. Movie poster styles rely a lot on templates, basic layouts if you will. These are often very similar within the same genre (e.g. the heroes in a V formation).

Color choice in movie posters is also fascinating (e.g. orange/blue contrast use in serious/action movies). There was an AWESOME blog post a few months ago by Vijay Pandurangan on the distribution of movie poster colors over time. Seriously, if you like my blog go read that post here! He found that blue has become much more prevalent in the last ~20 years. Neato!

I started to wonder: are there trends to be found among book covers?

Popular colors? Common layouts? Once again late night musings necessitate data!

Gathering the Data

So I gathered the book covers for the Top-10 best-selling books from USA Today. I wrote a script to grab the Top-10 covers every week (actually 4 weeks per month, 48 weeks a year) from 2000 to 2012. USA Today does a great job of aggregating book sales information from tons of sources, and their Top-10 list is easier to use than most other similar digests if you want a broad census of what people are reading.

How does one visualize ~6000 book covers (about 1300 individual books)? ALL AT ONCE

Here is what 12 years of Top-10 book covers look like. The image is organized in 1-year "bricks", with 2000-2011 running top to bottom. Each brick has 48 columns (weeks), with Jan 1 on the left and New Year's on the right. Ranks 1 through 10 run top to bottom within each brick.



The Top-10 Best Seller book list from 2000-2011. Each brick contains 1 year, with weeks increasing left to right. Ranks 1-10 go from top to bottom in each brick.
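For anyone who wants to build a similar layout, here is a minimal sketch of how the bricks could be assembled with NumPy. It is not the script used for the image above. It assumes the covers have already been loaded and resized to identical thumbnails, and the function name `build_mosaic`, the array layout, and the `gap` parameter are all made up for illustration.

```python
import numpy as np

def build_mosaic(covers, gap=4):
    """Stack one "brick" per year into a single tall image.

    covers: uint8 array shaped (years, weeks, ranks, h, w, 3).
            Assumes every thumbnail was resized to the same h x w.
    gap:    height in pixels of the white strip between bricks.
    """
    years, weeks, ranks, h, w, c = covers.shape
    pieces = []
    for y in range(years):
        # (weeks, ranks, h, w, c) -> (ranks, h, weeks, w, c), then flatten
        # so that rank runs down the rows and week runs across the columns.
        brick = covers[y].transpose(1, 2, 0, 3, 4).reshape(ranks * h, weeks * w, c)
        pieces.append(brick)
        if y < years - 1:
            # White separator between consecutive years.
            pieces.append(np.full((gap, weeks * w, c), 255, dtype=covers.dtype))
    return np.concatenate(pieces, axis=0)
```

With 12 years, 48 weeks, and 10 ranks, the input would hold 5760 thumbnails, which matches the "~6000 covers" figure above.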

Book Positions

There is a TON of fascinating structure within this visualization. Early on, in the wake of Y2K and the rise of the dotcom era, many book covers were white. White is still popular, but a decade on, bright saturated hues have become a mainstay. The "boy who lived" was the biggest chart topper, along with a couple of self-help books, and of course Dan Brown's tremendous showing starting in 2003. In 2008 a massive black swatch took hold of the top spots, and crushed records.

You can also see the "decay pattern", as books drop in popularity over a few weeks. As you might expect, books tend to jump up the chart very fast, and decay a bit more slowly.
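One rough way to put a number on that asymmetry is to count the weeks a book spends climbing to its best rank versus the weeks it spends on the list after leaving that rank. The sketch below is only an illustration. The function name `rise_and_decay` is hypothetical, and it assumes you have one book's weekly ranks in chronological order for the weeks it was on the Top-10.

```python
def rise_and_decay(ranks):
    """Return (weeks_before_peak, weeks_after_peak) for one book.

    ranks: list of weekly chart positions in chronological order,
           where 1 is the best rank. Only weeks on the list are included.
    """
    peak = min(ranks)
    # First and last week at which the book held its best rank.
    first_peak = ranks.index(peak)
    last_peak = len(ranks) - 1 - ranks[::-1].index(peak)
    return first_peak, len(ranks) - 1 - last_peak

# A book that debuts at #4, holds #1 for two weeks, then slides off the chart.
print(rise_and_decay([4, 1, 1, 3, 6, 9]))
```

Keep in mind that a Top-10 list hides everything below rank 10, so both numbers are truncated for books that spent time further down the chart.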

A few interesting conclusions that came from analyzing this visualization:

1) Most books hit the Top-10 for a week or two.
Histogram of # total weeks on the Top-10 list for all books from 2000-2011.

2) Books that reach high on the chart can stay on the chart longer (obvious).
The peak rank versus total number of weeks on the Top-10 list for
every book from 2000-2011. Note: points overlap

3) Persistence: Books that reach #1 tend to stay #1. In other words, first place has the least variance in books.
Variance (# unique books / # total weeks considered) versus Top-10
rank for all books from 2000-2011.
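The three quantities plotted above can all be tallied in a single pass over the chart data. Here is a hedged sketch of one way to do it. The `(week, rank, title)` tuple format and the name `chart_stats` are assumptions for illustration, and "variance" here follows the caption's definition (# unique books / # total weeks) rather than the statistical one.

```python
from collections import Counter, defaultdict

def chart_stats(entries):
    """Summarize a chart given (week, rank, title) tuples, one per slot.

    Returns:
      weeks_on_list: title -> total weeks on the Top-10
      peak_rank:     title -> best (lowest) rank reached
      variance:      rank  -> unique titles / weeks observed at that rank
    """
    weeks_on_list = Counter()
    peak_rank = {}
    titles_at_rank = defaultdict(set)
    weeks_at_rank = Counter()
    for week, rank, title in entries:
        weeks_on_list[title] += 1
        peak_rank[title] = min(rank, peak_rank.get(title, rank))
        titles_at_rank[rank].add(title)
        weeks_at_rank[rank] += 1
    variance = {r: len(titles_at_rank[r]) / weeks_at_rank[r] for r in weeks_at_rank}
    return weeks_on_list, peak_rank, variance

# Tiny made-up chart with two ranks over three weeks.
entries = [(1, 1, "A"), (1, 2, "B"), (2, 1, "A"), (2, 2, "C"), (3, 1, "A"), (3, 2, "B")]
weeks_on_list, peak_rank, variance = chart_stats(entries)
histogram = Counter(weeks_on_list.values())  # weeks on list -> number of books
```

A low value at rank 1 means the same few titles keep holding the top spot, which is the persistence described in point 3.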

Big Winners


A few books stood out as big winners, spending inordinately long amounts of time on the bestseller list. For years 2000 through 2011, here is the Top 10 Top 10...

57 weeks: The Da Vinci Code
58 weeks: The Shack
65 weeks: HP The Prisoner of Azkaban
69 weeks: Breaking Dawn
76 weeks: HP Chamber of Secrets
79 weeks: Who Moved My Cheese?
81 weeks: Twilight
81 weeks: Twilight New Moon
82 weeks: HP Sorcerer's Stone
93 weeks: Twilight Eclipse


This cracks me up... The list is made up of teen lit, a largely word-of-mouth-publishing success story, and Who Moved My Cheese? 

Amazing.

Book Covers/Colors

I thought it would also be fun to see what the average bestselling book cover looked like since 2000:

the average best seller book cover

It's kind of creepy looking! Like an ancient piece of parchment or something. This was created by adding all the cover images together, and then readjusting the contrast. Note the gradient on the right side of the image, due to books not all having the same dimensions! You can definitely pick out common features. Small white letters on the top, lighter images (probably people) in the center. The most obvious next step in this would be to do PCA on the image stack. It should go without saying that this image should be the cover of my first book...
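If you want to try this on your own image stack, the idea can be sketched in a few lines of NumPy. This is an illustration rather than the exact procedure behind the image above. It assumes the covers are aligned at the top-left corner of a shared canvas (one plausible reason for the fade on the right edge) and uses a simple min-max stretch for the contrast step. The name `average_cover` is hypothetical.

```python
import numpy as np

def average_cover(images):
    """Sum covers of possibly different sizes, then stretch the contrast.

    images: list of uint8 arrays shaped (h, w, 3).
    Assumption: every cover is placed at the top-left of the canvas.
    """
    height = max(img.shape[0] for img in images)
    width = max(img.shape[1] for img in images)
    total = np.zeros((height, width, 3), dtype=np.float64)
    for img in images:
        h, w = img.shape[:2]
        total[:h, :w] += img  # narrower covers contribute nothing further right
    lo, hi = total.min(), total.max()
    if hi == lo:
        # Flat image: nothing to stretch.
        return np.zeros(total.shape, dtype=np.uint8)
    # Min-max stretch to the full 0-255 range.
    stretched = (total - lo) / (hi - lo) * 255.0
    return stretched.round().astype(np.uint8)
```

Because narrow covers add nothing to the right-hand columns, those columns end up with a smaller sum, which shows up as the fade described above.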


Perhaps my favorite result from this whole project was determining the average color of a best-selling book cover. This was inspired by the famous determination of the average color of the Universe, dubbed "Cosmic Latte". I present to the world, for the first time, Bestseller Brown! If you're curious, the RGB value is [127, 112, 101], and the hex code is 7F7065.
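For the curious, averaging a color and converting it to hex takes only a few lines. The sketch below assumes a plain per-channel mean over whatever pixels you feed it; whether to average all pixels together or average each cover first is a choice left open here. The helper names `mean_color` and `to_hex` are made up for illustration.

```python
def mean_color(pixels):
    """Per-channel mean of an iterable of (r, g, b) tuples (0-255 each)."""
    sums = [0, 0, 0]
    count = 0
    for r, g, b in pixels:
        sums[0] += r
        sums[1] += g
        sums[2] += b
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return tuple(round(s / count) for s in sums)

def to_hex(rgb):
    """Format an (r, g, b) tuple as a six-digit uppercase hex string."""
    return "{:02X}{:02X}{:02X}".format(*rgb)

print(to_hex((127, 112, 101)))  # 7F7065
```

The last line confirms that the RGB triple and the hex code quoted above describe the same color.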



The Future


I have so far only presented the data from full years 2000-2011, but the Top-10 list for 2012 is so interesting that it's worth discussing/visualizing even before it's complete! Here is the current "brick" for 2012:

The Top-10 Best-Seller "brick" for most of 2012. Each column (left to right) is 1 week. 1st place is top, 10th on bottom.

Holy bimodality, Batman! I mentioned persistence in the top book ranks, but this is ridiculous. The very gradual decay of the Hunger Games books is fascinating, and is dramatically foreshadowing the 2013 release of the second movie. This summer the Fifty Shades series hit it big, and in a two-week span crushed all other challengers and knocked the Hunger Games series from its long-held top spot. How long will Fifty Shades of ___ stay on the list? It's anyone's guess... though if I had sales information on every book in this database I could make a reasonable prediction. All the controversy (aka free press) surrounding it is certainly driving up sales. Hunger Games is poised to be a huge player in 2013, however, and I would expect to see these two franchises battling for middle spots on the Top-10 for at least another year, with Hunger Games having a bit more staying power.


Update: You can now buy a poster version of the book cover visualization. Neat!


Best Selling Book Covers Poster by IfWeAssume

5 comments:

  1. Another cool visualization would be to "sort by color" rather than by week. You could do something as simple as "take the average HSV" color, and then sort the books via a comparator that first compares V, then H, then S. That way it would be more "clear" what types of covers are the dominant ones.

  2. The Hunger Games (particularly the second book) will probably experience a resurgence when the second movie comes out. The Hobbit might even jump back up once that comes out in theaters.

  3. Thanks! My new book cover is going to be that cosmic color.

  4. It also would be interesting to include the non-bestsellers for comparison.

