Last summer, as part of my internship working with these awesome people at MSR, I spent a lot of time playing with public data sources. One fascinating dataset that I chose as a benchmark (for what is currently known as Tempe at MSR) is the White House Visitor records, which (as of last July) had over 3 million records of visitors to the White House during the Obama administration.
This dataset has been in the news before, and is (in my opinion) a great example of the kind of public disclosure that we should be pushing for in government. How and when such records should be released, and by whom, is of course a whole other conversation. The White House Visitor dataset is also known to be incomplete: some records are withheld for national or personal security reasons, and maybe for other reasons too.
Here is just one question I came up with: Do more men or women visit the White House? My guess was that a majority of visitors would be men.