Name your child for success!

Shakespeare famously posed the question:
"What's in a name?"
The answer may actually be: quite a bit!

Your given name (and your family name, for that matter) likely contains a lot of subtle information about you and your history. For example, we often assume names correlate with gender (as I have in previous articles)... Except the gender identity of some common names has changed over history (examples here)! Your name may also correlate with your political affiliation or what job you have.

I recently wondered: do certain names correlate with brilliance or high intellectual achievement? 

To find out, I gathered a large dataset of full names from people with PhDs in science (from the IAU and AAAS), as well as the names of lawyers using several recent years of bar exam "pass lists" provided by WA, NY, and TX. In total I was able to easily (read: quickly) gather over 36,000 full names of scientists and lawyers!

With this corpus of highly educated names in hand, let's look at which are most common!
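For anyone who wants to try this at home, the counting step can be sketched in a few lines of Python. This is only an illustration, not the exact code behind this post: the function names and the tiny `sample` list are made up, and it assumes the first whitespace-separated token of a full name is the given name (which breaks for entries like "J. Robert ...").

```python
from collections import Counter

def first_name(full_name):
    """Return the first whitespace-separated token, title-cased.

    Assumption: the given name comes first in the full name.
    """
    parts = full_name.strip().split()
    return parts[0].title() if parts else ""

def top_first_names(full_names, n=10):
    """Count first names and return the n most common as (name, count) pairs."""
    counts = Counter(first_name(name) for name in full_names)
    counts.pop("", None)  # drop empty entries produced by blank lines
    return counts.most_common(n)

# Hypothetical sample, standing in for the real 36,000-name corpus.
sample = ["John Smith", "maria Garcia", "John Doe", "Susan Lee", "  John  Adams "]
print(top_first_names(sample, n=2))
```

Title-casing the token makes "maria" and "Maria" count as the same name; a real dataset would probably need more cleanup than that.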

The most common names of scientists:

Right away you can see a dramatic trend: it's mostly dudes. In fact, of the top 100 most common names for scientists, only 14 are female!

The most common names of lawyers:

While men's names still dominate, there are definitely more women in the top list. For comparison, of the top 100 most common names for lawyers, 50 are female! That difference is shocking to me.



Lawyers have more diverse names


Here I compare the distribution of name frequencies between the datasets. For the top 500 names of lawyers and scientists, I've counted the occurrence rate of each name. You can clearly see that the scientists (red line) are much more concentrated in the first 10-ish names. (This is also visible in the Wordles above.) This might seem surprising, given that the lawyer names come from only 3 US states, while the scientists come from all over the world.
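One simple way to put a number on "more concentrated" is the share of all people covered by the top few names. The sketch below is my own illustration of that idea, not necessarily how the figure was made; `cumulative_share` and the toy list are hypothetical.

```python
from collections import Counter

def cumulative_share(first_names, top_k=10):
    """Fraction of all entries that carry one of the top_k most common names."""
    counts = Counter(first_names)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(top_k))
    return top / total

# Toy data: 10 people, with the single most common name held by 5 of them.
toy = ["John"] * 5 + ["Maria"] * 3 + ["Susan"] * 2
print(cumulative_share(toy, top_k=1))
print(cumulative_share(toy, top_k=2))
```

Running this on each dataset with the same `top_k` gives two directly comparable numbers: a higher share means the names are piled up in fewer favorites.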

I next wondered: how do the ranks of scientist and lawyer names compare? Obviously names like "Michael" are high in both lists, but how well are they correlated? The answer: Not Very Well!

Huh! Something else is at work here. These names, scientists and lawyers, obviously do not come from the same distributions. I then realized: the scientists sampled are more "senior", while the lawyers are all very "young". The two groups may be separated by more than 25 years in age. This lack of name rank correlation, and maybe the different top names, might be related to the era these names are from. The larger number of women's names among lawyers is surely due to a great increase in the fraction of female attorneys compared to yesteryear.
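If you want a single number for how well two ranked name lists agree, a Spearman rank correlation over the names they share is one reasonable choice. To be clear, this is a sketch of one possible measure, not necessarily the one used for the comparison above; `spearman_shared` is a hypothetical helper, and it assumes each list is ordered most-common-first with no repeated names.

```python
def spearman_shared(ranked_a, ranked_b):
    """Spearman rank correlation over the names present in both lists.

    Assumptions: each list is ordered most common first and has no
    duplicates, so the re-ranked positions contain no ties.
    """
    pos_b = {name: i for i, name in enumerate(ranked_b)}
    # Names in both lists, kept in list-A order, so rank in A is the index.
    shared = [name for name in ranked_a if name in pos_b]
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared names")
    # Re-rank the shared names by their order in list B.
    order_b = sorted(shared, key=pos_b.get)
    rank_b = {name: r for r, name in enumerate(order_b)}
    d2 = sum((rank_a - rank_b[name]) ** 2 for rank_a, name in enumerate(shared))
    # Tie-free Spearman formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Toy lists for illustration only.
list_a = ["Michael", "John", "David", "James"]
list_b = ["John", "Michael", "James", "David", "Peter"]
print(spearman_shared(list_a, list_b))
```

A value near 1 means the two lists order their shared names the same way, near 0 means no relationship, and near -1 means the orders are reversed. Names that appear in only one list are ignored here, which is itself a choice worth keeping in mind.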

To investigate this further, I gathered two "ground truth" datasets from the Social Security Administration (SSA): the 100 top baby names over the past 100 years, and the 100 top baby names of 1989. These datasets were broken down by gender, so the rest of my analysis is as well.

Here I'm showing the rank of the top 100 names for lawyers, compared with the name rank according to the SSA. For lawyers, male name ranks correlate better with 1989 than the 100-year historic set. Female name ranks are slightly better associated with the historic 100-year set.

For scientists (again, note the dearth of women's names), male name ranks are much more correlated with the 100-year historic set, while female names prefer the 1989 data.


The top names are...

Female Scientists:
Susan
Maria
Mary
Barbara
Nancy
Elizabeth
Linda
Judith
Karen
Elena
Carol
Patricia
Margaret
Laura

Male Scientists:
John
Robert
David
Michael
Richard
James
William
Peter
Thomas
Paul
Charles
George
Donald
Stephen

Female Lawyers:
Sarah
Elizabeth
Jennifer
Jessica
Lauren
Katherine
Laura
Stephanie
Rebecca
Emily

Male Lawyers:
Michael
Matthew
David
Daniel
John
Andrew
Christopher
James
Robert
Jonathan

5 comments:

  1. Lawyers are on par with scientists?

    1. Lawyers and scientists are both highly educated.

  2. How did you break down the names into male and female? I'm working on a project where I need to do just that for a large sample of names, and I've been looking at nltk.

    1. For this study I relied on the genders assigned by the SSA "Top 100" data.

      In previous projects (linked in post) I have used a larger dataset (also from SSA) that has tens of thousands of names and genders. You can use this data to get an estimate of a name's gender. There are also several python packages and online APIs that do similar things.
