Advances in Data Linking Methodology at the ABS

The Official Statistics Section of the SSA and the Australian Bureau of Statistics presented this webinar on Monday 21 September. Over 60 people joined online to hear Daniel’s presentation.

This talk was a follow-up to Anders Holmberg’s presentation in mid-August. Unfortunately I was double-booked that day, but was pleased to discover that this talk stood alone very nicely and gave a clear and comprehensive view of the state of the art of data linking at the ABS.

There’s a bit of terminology to keep straight in your head first. Matches are what exist in the data: a record of a person over here in the Medicare database and a record of the same person over there in the Tax Office database. A link is what the analyst makes between two records, and it can be correct or not. The precision of a linking method is the same as the positive predictive value (the proportion of links that are true matches), and the recall of a linking method is the same as the sensitivity (the proportion of true matches that are linked).
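To make the terminology concrete, here is a small sketch in Python. The record identifiers and the `link_quality` helper are made up for illustration and were not part of the talk; in practice the true matches are usually unknown and have to be estimated, for example by clerical review of a sample.

```python
def link_quality(links, matches):
    """Return (precision, recall) for a set of links against the true matches.

    Both arguments are collections of (record_id_a, record_id_b) pairs.
    """
    links = set(links)
    matches = set(matches)
    correct = links & matches  # links that are also true matches
    precision = len(correct) / len(links) if links else 0.0
    recall = len(correct) / len(matches) if matches else 0.0
    return precision, recall


# Hypothetical example: four people appear in both databases (true matches),
# and the analyst made three links, one of them wrong.
true_matches = {("m1", "t1"), ("m2", "t2"), ("m3", "t3"), ("m4", "t4")}
links_made = {("m1", "t1"), ("m2", "t2"), ("m3", "t9")}

precision, recall = link_quality(links_made, true_matches)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here two of the three links are correct, so precision is 2/3, and two of the four true matches were found, so recall is 1/2.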

Daniel described deterministic and probabilistic methods of linking, and the possibility that they may be equivalent. At the end of the talk he highlighted several areas of current research, including improvements in precision, incorporating the uncertainty of a link in the standard errors of any linked data analysis, and the possibility of using an instrumental variables model.
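For readers who haven’t met the two approaches, here is a minimal sketch of the difference. It assumes an exact-agreement rule for the deterministic method and Fellegi-Sunter style agreement weights for the probabilistic one; the talk summary doesn’t say which models the ABS actually uses, and the field names, m- and u-probabilities, and thresholds below are invented for illustration.

```python
import math

FIELDS = ("surname", "birth_year", "postcode")

# Hypothetical values, for illustration only:
# M[f] = P(field agrees | records are a true match)
# U[f] = P(field agrees | records are not a match)
M = {"surname": 0.95, "birth_year": 0.98, "postcode": 0.90}
U = {"surname": 0.01, "birth_year": 0.02, "postcode": 0.05}


def deterministic_link(a, b):
    """Link only if every field agrees exactly."""
    return all(a[f] == b[f] for f in FIELDS)


def match_weight(a, b):
    """Sum of log2 likelihood ratios over fields (Fellegi-Sunter style)."""
    total = 0.0
    for f in FIELDS:
        if a[f] == b[f]:
            total += math.log2(M[f] / U[f])
        else:
            total += math.log2((1 - M[f]) / (1 - U[f]))
    return total


def probabilistic_link(a, b, threshold=5.0):
    """Link if the total agreement weight reaches the threshold."""
    return match_weight(a, b) >= threshold


# Two hypothetical records that differ only in postcode.
rec_a = {"surname": "Nguyen", "birth_year": 1980, "postcode": "2600"}
rec_b = {"surname": "Nguyen", "birth_year": 1980, "postcode": "2612"}

print(deterministic_link(rec_a, rec_b))  # exact rule rejects the pair
print(probabilistic_link(rec_a, rec_b))  # weighted rule accepts it
```

With these made-up numbers, raising the threshold to 16 means only pairs that agree on every field are linked, which reproduces the deterministic rule. That is one simple sense in which the two approaches can coincide, though not necessarily the one Daniel had in mind.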
