The Science of David

David has been a popular name for ages, and according to Wikipedia it accounts for over 3% of the American population(!). So perhaps I shouldn't be terribly surprised by the amusing discovery I made last year:

If you want a postdoc position in Astronomy, you should be named David.

This was the result of a simple "word cloud" investigation of the ever-important Astronomy Rumor Mill. At the end of winter quarter last year (my records indicate it was March 10th, 2011) I simply copied the present state of the Rumor Mill text into Wordle and made the following word cloud:

I love word clouds (aka word histograms or Wordles), not because they report precise/quantifiable data, but because they (can) convey both real data and an emotional context. A word cloud captures the soul of a topic, and to me reads like beautiful free-form poetry.

Perhaps a better example of this emotional context is what I made a few years ago. I searched Google for "fraternity" and copied the text from the first 5 or 6 pages into a single Wordle:

It reads like a Greek-system advertisement, or maybe a mid-life politician's therapy session, and that's exactly why they (word clouds) make great figures! You can look at it and say "this confirms all my worst fears about the maniacal delusion of Greek life", or you can see in it "this system trains the community leaders of tomorrow". I guess the result, as with all statistics and data presentation, is in the context given.

So back to David, and his apparently prolific status in our field.

Wordle offers the useful feature of returning the actual data on words and their frequency of use, but alas I lost the text file... There aren't enough hours in the day to fully exploit the interesting data contained in the Rumor Mill. It is a living study in the culture, anthropology, and psychology of an entire generation of Astronomers & Astrophysicists. It contains much of our hope and fear, jealousy and joy. You would think people as smart as we would abandon such a tortuous, biased, and unreliable source for job information. Self-worth is invested almost completely in our job and status in the field, and we'd love to think it rooted in the merit of our work, that our colleagues will read our papers earnestly and judge us to be worthy of acclaim. This evidently was the case for David last year.
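Since the word-frequency file is gone, here is a minimal sketch of how that kind of table could be rebuilt from pasted text. This is not how Wordle does it internally; the stop-word list, the function name `word_frequencies`, and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; Wordle's real list is not reproduced here.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "at", "for", "is"}


def word_frequencies(text, top_n=10):
    """Return the top_n (word, count) pairs, ignoring case and stop words."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up sample text, NOT actual Rumor Mill data.
sample = "David accepted the fellowship. David declined. Maria accepted an offer."
for word, count in word_frequencies(sample, top_n=3):
    print(word, count)
```

Pasting the real Rumor Mill text in place of `sample` would give the numbers behind the cloud, which is what I would need to say anything quantitative about the Davids.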

Good luck to my many friends on the job hunt this year. I hope to see all your names, big and bold, in the next Rumor Mill word cloud, and pray all the Davids got hired last time!!!

Battery Life

Like everyone, I take my work home with me. This comes in many forms; some nights and weekends I am observing remotely on a telescope, others I am writing or programming for a project or paper. The more insidious and subtle way my work comes home with me is my absurd desire to plot things and seek out data in everyday situations. I suspect this is due to being a graduate student, but probably bodes well for my long-term job prospects...

I've written about the health of my laptop's battery before, and since I had already figured out how to get the data, this seemed a fitting topic to post about again!

Premise: I would like to track the health of my laptop's battery over its operational life, and to estimate the maximum lifetime. To do this I have been using a lightweight program called CoconutBattery. The program is in my login items, and whenever I restart my computer (every 1-2 days or so) it will pop up with a reading of my battery. There is an option to save the reading, and I have been doing so since a few months after I purchased my laptop.

To retrieve the data I opened the XML file in Safari, pasted the text into emacs, worked some simple search/replace magic, and voila:

Alas, there are no error bars... Time=0 was on 2009 May 07. My battery reads today 681 loadcycles, and I have 338 data points.

My initial analysis was only on the first ~1.7 years of data, and could only be fit with a line. This laughably suggested my battery would last 68 years, assuming a constant linear decay of battery capacity. I then started taking data more often, and from 1.7 to 2.5 years I quickly found out that the decay was most certainly not linear.
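To make the "68 years" style of estimate concrete, here is a rough sketch of a least-squares line fit and its extrapolation to zero capacity. The readings below are synthetic round numbers chosen for illustration, not my actual battery data, and the helper names `fit_line` and `years_until_empty` are my own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def years_until_empty(slope, intercept):
    """Time at which the fitted line crosses zero, or None if it never does."""
    if slope >= 0:
        return None
    return -intercept / slope


# Synthetic readings (assumption): years since purchase, capacity in percent.
years = [0.0, 0.5, 1.0, 1.5]
capacity = [100.0, 99.0, 98.0, 97.0]

slope, intercept = fit_line(years, capacity)
print("slope (%/yr):", slope)
print("years until 0% capacity:", years_until_empty(slope, intercept))
```

A shallow slope over a short baseline always extrapolates to an absurdly long lifetime, which is why the linear answer should not be trusted once the decay starts to curve.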

There is a discontinuity at ~1.9 years that I noticed, and I'm not sure what happened here. The sampling cadence was high because I was restarting my computer almost every day to play video games with some friends, and has since dropped off because I am not playing much this year. The sampling cadence appears to be roughly constant over this discontinuity, with ~8 measurements per week, suggesting my usage is not likely to blame. My theory instead is that software updates are to blame (or to thank). My Mac's history of Installed Software states that OS X 10.6.6 and 10.6.7 came out in early 2011, and I believe the latter caused a recalibration of the battery measurements.

The fit looks embarrassingly good, and that's simply because I cheated. From 0 to 1.7 it is a linear fit, and from 1.5 to 2.7 it's a 4th order polynomial. These don't remotely match at time < 1.7, and I should have fit two separate functions before/after the time = 1.9 discontinuity. I did this late last night, so as we say in the biz "it's good enough to first order..."

The inset shows the "prediction", extending the 4th order fit out to where the capacity = 0. I do not believe this behavior will hold exactly as predicted; instead (based on anecdotal evidence) the capacity will probably flatten out around 30%, or ~1 hour of operational life. Batteries fail for many reasons, and their lifespan and capacity are dependent on parameters such as their operating temperature, charge depth, and charging voltage. Oh, and age!

I'm planning to buy a new laptop this summer, as my warranty is only valid for 3 years (and the battery capacity drops dramatically at that time range!). I will be keeping more data on this laptop until then, but would still like to write a cron script to take this data passively every hour. I might have even started working on this at one point... more investigation definitely required! 
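For the passive logging idea, here is a minimal sketch of what such a script might look like. It assumes the Mac reports battery fields through `ioreg -r -c AppleSmartBattery` as lines of the form `"Key" = 123`, and that the keys `CycleCount`, `MaxCapacity`, and `DesignCapacity` are present; key names and their meaning can differ between OS versions, so check the output on your own machine first. The function names and the sample text are hypothetical.

```python
import csv
import re
import subprocess
from datetime import datetime

# Keys assumed to appear in the ioreg output; verify on your own system.
FIELDS = ("CycleCount", "MaxCapacity", "DesignCapacity")


def parse_battery(ioreg_text):
    """Pull integer battery fields out of lines shaped like '"Key" = 123'."""
    values = {}
    for key in FIELDS:
        match = re.search(r'"%s"\s*=\s*(\d+)' % key, ioreg_text)
        if match:
            values[key] = int(match.group(1))
    return values


def log_reading(path):
    """Append one timestamped reading to a CSV file (only works on a Mac)."""
    text = subprocess.run(
        ["ioreg", "-r", "-c", "AppleSmartBattery"],
        capture_output=True, text=True, check=True,
    ).stdout
    reading = parse_battery(text)
    row = [datetime.now().isoformat()] + [reading.get(k, "") for k in FIELDS]
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow(row)


# Made-up sample in the assumed format, used only to exercise the parser.
sample = '"CycleCount" = 12\n"MaxCapacity" = 4000\n"DesignCapacity" = 5000\n'
print(parse_battery(sample))
```

Saved as a script that calls `log_reading` on a file of your choosing, this could be run hourly from a crontab entry along the lines of `0 * * * * /usr/bin/python3 /path/to/log_battery.py`, where the path is a placeholder.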

Five by Five

First posting, here it is. A blog is a funny thing; so easy to start, so hard to continue. The sheer volume of blogs started per day on Blogger alone must be staggering, and I would be fascinated to see the cumulative distribution of blogs with increasing numbers of posts. The yearly trends of when blogs are created would be interesting as well. Blogger/Google no doubt have this data available somewhere internally... more research is definitely warranted.

[Prediction: 90% of blogs are abandoned within 1 year or 25 posts. Blogs are preferentially created at certain times of the year, and it probably aligns with academic schedules]

Here is an (ugly) infographic on blog statistics that a cursory Google search yielded.

This blog is envisioned to be a clearinghouse of my ideas, with a loosely scientific theme. I am not a statistician, and I abhor most "infographics". I promise to post only what I find interesting, and will try to generate my own content through my own ideas. This may include research, unvetted notions, or calculations of things I want to know, with no promise of continuity. I will try to maintain some level of professionalism, but will fail at this at times.

At time of writing I am halfway through my PhD in Astronomy. The blog title "If We Assume" comes from a favorite phrase that is often said when stating physics/astronomy problems. A classic example is predicting milk production from a cow. The physicist might say "... if we assume a spherical cow, with udders uniformly emitting milk radially..."

I believe there is great power in this method of problem solving, and that it is both intuitive and practical in daily life. Many complex problems or systems can be understood with surprising accuracy using appropriate/realistic assumptions. "You'd be surprised how much you intuitively know..." a colleague once told me. This is frequently the basis of so-called "Order of Magnitude" calculations that astronomers are so famous for. Some of them I love, some of them I hate. Some are both brilliant and disastrous.

So that's all for now it would seem. I have a healthy back burner of post ideas that have nothing to do with my own research, and have plenty to say on things academical as well. Let's see if this dog will hunt...