Web Scraping – FreeBMD

Scraping FreeBMD is trivial. Why? Because there’s a button on the site which does just what we want. It’s labelled “Download” (you’ll find it just to the left of the Key, under “Save Search”), and it downloads to your computer the data displayed on the screen. One important point to make here is that there is a limit on the number of results FreeBMD will display: currently 3000, which is probably rather ambitious anyway!

For those of you in The Surname Society, Colin Spencer has made an excellent video of the process (it’s clear and concise for those, like me, who don’t really go for video tutorials – I prefer the written word, obviously, or I wouldn’t be writing this!) You can find the video in the Members section of the Society’s web site – on the menu, look for Surname School videos, then scroll down to Data Extraction.

For those of you not in the Society (why not? it’s only £5!), downloading the file gives you a Tab Separated Values (TSV) file (like CSV, but with tabs instead of commas). You can then import this file into Excel. Colin passes it through a text editor (he and I both like Notepad++) to convert it to a CSV, but I just right-clicked on the file in the file explorer and used “Open With” to load it into Excel (I use Excel 2003, and I hope other versions work similarly). Colin also uses Notepad to remove the extra column inserted by the FreeBMD format, but that’s easy to do directly in Excel.
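If you’d rather not do the conversion by hand, it can be scripted. What follows is only a minimal sketch in Python, not Colin’s method or mine: the function name `tsv_to_csv` and the sample rows are made up for illustration, and which column (if any) you need to drop is something to check against your own download.

```python
import csv
import io


def tsv_to_csv(tsv_text, drop_column=None):
    """Convert tab-separated text to comma-separated text.

    drop_column is the zero-based index of a column to discard,
    or None to keep every column.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Only drop the column on rows long enough to contain it.
        if drop_column is not None and drop_column < len(row):
            row = row[:drop_column] + row[drop_column + 1:]
        writer.writerow(row)
    return out.getvalue()


# Made-up sample rows for illustration only, not real FreeBMD output.
sample = "Surname\tGiven\tExtra\tDistrict\nSMITH\tJohn\tx\tLeeds\n"
print(tsv_to_csv(sample, drop_column=2))
```

To use it on a real download, read the file’s contents into a string first and write the returned text out to a new .csv file. The csv module takes care of quoting any field that itself contains a comma.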

There’s one major point I’d add to Colin’s tutorial. The format of the results changes part-way through the indexes, so a search that spans the change gives you rows with different numbers of columns. Either you’ll need to add a column to part of the resulting spreadsheet to line up the before and after results (prone to errors), or you can run the search in two parts (recommended). For births, mother’s maiden name is added from September quarter 1911. For marriages, spouse’s surname is added from March quarter 1912. For deaths, (alleged!) age at death is added from March quarter 1866.
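To work out where to split a search, the three change-over dates above can be put in a small lookup. This is just a sketch in Python under my own conventions: the names `FORMAT_CHANGE`, `previous_quarter` and `split_search` are hypothetical, and quarters are numbered 1 to 4 for March, June, September and December.

```python
# Quarter numbers: 1 = March, 2 = June, 3 = September, 4 = December.
# First (year, quarter) in which the extra column appears.
FORMAT_CHANGE = {
    "births": (1911, 3),     # mother's maiden name added
    "marriages": (1912, 1),  # spouse's surname added
    "deaths": (1866, 1),     # age at death added
}


def previous_quarter(year, quarter):
    """Return the (year, quarter) immediately before the one given."""
    return (year - 1, 4) if quarter == 1 else (year, quarter - 1)


def split_search(event, start, end):
    """Split an inclusive (year, quarter) range at the format change.

    Returns one or two (start, end) ranges, each of which has a
    single column layout.
    """
    change = FORMAT_CHANGE[event]
    if end < change or start >= change:
        return [(start, end)]  # range lies wholly on one side of the change
    return [(start, previous_quarter(*change)), (change, end)]


print(split_search("births", (1900, 1), (1920, 4)))
# [((1900, 1), (1911, 2)), ((1911, 3), (1920, 4))]
```

Each range that comes back can be run as its own search, so every download has the same columns throughout.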

Marriages of course show only the spouse’s surname, and that only from March quarter 1912. Getting the possible spouses before that date, and/or the spouse’s given name after that date, requires a more advanced technique.
