The Dimensions of Art

Some good soul on reddit posted a link to a very neat dataset: Metadata from the Tate Collection. The files contain lots of interesting bits of information, but one particularly stood out to me: the dimensions of every piece of art that the Tate owns.

A major caveat: a lot of the art is 3D (e.g. sculpture) and has a third dimension I'm not considering. For your thoughtful viewing pleasure, here is the distribution of the aspect ratios for 65k pieces of artwork held by the Tate, as a function of their width.

Art dimensions, a technical view

Pixel color (light to dark) indicates the density of pieces. There are some interesting clumps in this space. Here are some thoughts:

1. On the whole, people prefer to make 4x3 artwork. 

This may largely be driven by stock canvas sizes available from art suppliers.

2. There are more tall pieces than wide pieces.

I find this fascinating, and speculate it may be due to portraits and paintings.

3. People are using the Golden Ratio.

Despite the lack of any obvious basis for its use, there are clumps for both wide and tall pieces at the so-called "Golden Ratio", approximately 1:1.618 (as a tribute, that's the ratio I rendered the above figure at).
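
For anyone who wants to check the three observations above on their own copy of the data, here is a minimal sketch of the bookkeeping involved. It is not the code from my notebook: the `classify` and `orientation` helpers, the 2% tolerance, and the handful of width/height pairs are all made up for illustration, and real Tate data would need to be loaded in place of the toy list.

```python
import math
from collections import Counter

# The Golden Ratio, (1 + sqrt(5)) / 2, is about 1.618.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def classify(width, height, tol=0.02):
    """Label a piece by the nearest named ratio, within a relative tolerance.

    Hypothetical helper: wide and tall pieces are folded together by
    dividing the long side by the short side.
    """
    ratio = max(width, height) / min(width, height)
    named = {"square": 1.0, "4x3": 4 / 3, "golden": GOLDEN_RATIO}
    for name, target in named.items():
        if abs(ratio - target) / target <= tol:
            return name
    return "other"


def orientation(width, height):
    """Return 'wide', 'tall', or 'square' for a piece."""
    if width > height:
        return "wide"
    if height > width:
        return "tall"
    return "square"


# Made-up (width, height) pairs for illustration only, NOT Tate data.
pieces = [
    (400, 300), (300, 400), (610, 377), (377, 610),
    (500, 500), (210, 297), (900, 200),
]

shape_counts = Counter(classify(w, h) for w, h in pieces)
orientation_counts = Counter(orientation(w, h) for w, h in pieces)

print(shape_counts)
print(orientation_counts)
```

The tolerance is a judgment call: too tight and the clumps vanish into rounding noise in the recorded dimensions, too loose and neighboring ratios start to blur together.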


Art becomes data becomes art

What I learned very quickly after producing the first figure is that nobody understands it. Even though it's very information rich and accurate, I'm violating a basic rule of data visualization: make it understandable! People gave me lots of feedback saying they couldn't wrap their heads around the figure, and I did almost nothing to break it down...

Because this is art, I felt compelled to re-visualize this into something more... visceral. Here is the same data (for art up to 3m x 3m), with each piece represented as a thin wire box.
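
The wire-box idea can be sketched in a few lines. This is an illustration under stated assumptions rather than my actual plotting code: the dimensions are assumed to be in millimetres, each box is assumed to be anchored with one corner at the origin, and `wire_box`, `boxes_up_to`, and the sample pieces are hypothetical.

```python
MAX_SIDE_MM = 3000  # 3 m cutoff, assuming dimensions are stored in millimetres


def wire_box(width, height):
    """Return the closed outline of a width x height box with a corner at the origin."""
    return [(0, 0), (width, 0), (width, height), (0, height), (0, 0)]


def boxes_up_to(pieces, max_side=MAX_SIDE_MM):
    """Keep only pieces whose width and height both fit within max_side."""
    return [wire_box(w, h) for w, h in pieces if w <= max_side and h <= max_side]


# Made-up (width, height) pairs for illustration only, NOT Tate data.
pieces = [(400, 300), (3500, 1200), (1000, 3000)]
outlines = boxes_up_to(pieces)

print(len(outlines))
```

Each outline is just a list of corner points, so it can be handed to whatever line-drawing call you prefer. Drawing every box as a thin, mostly transparent line lets the overlaps build up the density on their own.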




Play along at home

If you'd like to play with this data and make your own version of these figures, I have replicated (nearly) the figures from this blog post in an IPython notebook, which is up on GitHub! (link to notebook).

Talk - Beauty in Data

A few months ago I gave this talk at the Seattle Nerd Nite. It was a great event, and the small crowd of ~75 people who came out to the bar to see me and the other speaker were friendly and chatted me up with questions for probably another 30 minutes afterwards!

Enjoy!

Excel vs Python vs IDL

My favorite quote about camera gear is this:
"Your camera doesn't matter" -Ken Rockwell
If you're reading my blog, odds are you would laugh at the notion of "professional grade plots" being generated using Excel. I've been guilty of this sin as well. We're all wrong. Your software doesn't matter.

There's a lot of geekery, pride, and often vitriol when it comes to visualization tools. If your graph looks dated, or is clearly created using tools that have fallen out of vogue, people will be more dismissive of your scientific results (according to my observations at least). I have observed such viz-bias in PhD scientists and undergrads alike, and have caught myself thinking it as well.

Speaking strictly for visualization (though you can extend this to many aspects of scientific computing presently): as a practitioner in Astronomy these days, you're antiquated if you don't use Python (or better yet D3), IDL is considered very unfashionable, and Excel is forbidden.

I say phooey to that.

I'm not dismissing the deep value, or plain superiority in some areas, of Python over IDL. D3 is downright amazing. But when it comes to the bread and butter plots, the ones that get science done quickly and cleanly, one tool is no better than another. Because I have keywords/settings adjusted already, I can zip out a publication-quality plot in a single easy-to-read command in IDL (many fine examples can be found within this website). If I had been using Python continuously for the last 8 years I could do the same in that language too. With patience you can do the same in Excel.

To prove my point, here is a quick attempt to generate the same basic plot in IDL (v7.1), Python/matplotlib, and Excel (2011). The data is about 2.5 days of a lightcurve from Kepler. To try and make things fair I've scaled them all to similar resolutions, and placed ugly red labels on them in Preview.

Can you guess which plot is which?



There are aspects of each figure that I really like, and while comparing them I find I am truly satisfied with none. I'm not an expert in Python or Excel. Maybe the answers were super obvious (let me know!) but if I saw any of these figures in a research paper I wouldn't stop to wonder about the tools being used, nor question the credibility of the researchers who made them.

So in summary, your visualization tool doesn't matter. Similarly, there's just no excuse for ugly and illegible graphs from any tool. As I've tried to say time (and time and time and time) again: Visualization is first and foremost about asking good questions and clearly communicating your message. If it's artistically pleasing, so much the better!

answers: A=Python, B=Excel, C=IDL


Update:
My good friend, Meredith Rawls, graciously re-made the same figure using SuperMongo (SM). Check it out:
Also, as many people have pointed out in the comments and on Twitter, this example is very selective in the graphical skills required. I heartily agree! Your visualization tool doesn't matter, provided it is capable of rendering the visualization you need!

A Summer at Microsoft Research


It's autumn now, a time of harvest and reflection, and the beginning of the academic year. The blog has been dormant for about a month because I've been working very hard in Astro-land.

I spent the last few months only 10 miles away from UW, just across the lake in Redmond. Since lots of people have been interested in how the experience was, and since corporate internships seem to be fairly uncommon in Astronomy, I thought it would be worthwhile recapping my summer at Microsoft Research. (Apologies for a super lengthy post. tl;dr MSR was fun, challenging, would recommend)