Using payroll transaction data: statistical opportunities and challenges

The Canberra Branch of the Statistical Society of Australia held its May meeting on Tuesday 25th. I Zoomed in with around 30 others. The speaker was Mr Cristian Rotaru, Assistant Director and Principal Statistical Analyst at the Australian Bureau of Statistics.

He spoke about the innovative statistical methods that the ABS decided to employ in order to use Single Touch Payroll data to provide near-real-time information on changes in the labour market. This was a vital series for the ABS to get up and running quickly as the COVID-19 pandemic unfolded through 2020 and onwards into this year. These methods gave rise to opportunities as well as challenges.

The main takeaways from Cristian’s talk centred on the importance, recognised by the ABS, of building a robust system to make best use of this novel data. There was always going to be a tradeoff between official-statistical values, in particular between timeliness and accuracy. I think it is fair to say that the ABS decided to prioritise timeliness and still managed to maintain a high degree of accuracy. Cristian also reiterated the importance of collaboration at the end of his talk.

School of Demography Joint Seminars

The School of Demography has a policy of inviting grant holders to speak about their projects to the School near the beginning of the project timeline. Tuesday 18 May saw two such projects presented.

The first was by Associate Professor Vladimir Canudas-Romo who spoke on Life expectancy among disease-diagnosed (life left). Mental or physical disorders shorten people’s life spans, and most importantly they occur at different ages, thereby having different impacts upon life left. This project will develop metrics which can measure these differential impacts, by analysing national linkage data such as the Australian Human Mortality Database and MADIP integrated datasets from the ABS. Similar efforts have worked well with Danish data, and Vladimir will build on that to produce critical insights into recent gains and losses in Australian life expectancy.

The second was by Dr Bernard Baffour, who spoke on Mapping the prevalence of smoking with increased precision in sparsely sampled regions of Australia – the SPARSE project funded by NHMRC and led by myself and Bernard. He presented two supporting projects that are in progress while the application for Datalab access progresses. First, he spoke about the mapping of small area estimates based on the Australian Early Development Index (work led by Mu Li and published in June 2020). Then he spoke about mapping chronic under-nutrition in Bangladesh across 22 years, 64 districts and yearly ages from 0 to 5 (work led by Sumon Das).

There were a number of interesting questions at the end of the talk which we’ll need to think about as SPARSE progresses. Propensity to self-report smoking is potentially an issue that can bias our prevalence results. Sales or expenditure data could be used to borrow some of the strength we seek, as could strength from areas that are not necessarily close geographically but close demographically or in other senses. Thanks to all who came up with ideas; we’ll let you know how they go!

Enhancing systematic review via the application of machine learning to the bibliome: a collaboration with TenWise to reveal insights into ME/CFS

Associate Professor Brett Lidbury of NCEPH gave this talk to RSPH on Thursday 13 May. Although his regular visits to the Netherlands have been curtailed in the last 12 months, this collaboration has flourished to the point where ANU has an agreement with software developers TenWise to access their ground-breaking software for meta-analysis. The idea is to use text mining and machine learning to support the onerous task in systematic review of reviewing titles, abstracts and papers for inclusion or exclusion in a review. Brett showed examples of the software in action, applied to the condition he has been researching for at least a decade, namely ME/CFS.

One of the key items on the to-do list for Brett and the team is to validate the software by lining its results up against a traditional hand-made meta-analysis. Interesting times ahead for developers and users alike!

Women in Maths Day

May 12 is the birthday of both Florence Nightingale and Maryam Mirzakhani, and is celebrated worldwide as Women in Mathematics Day.

I attended two events – one local, one international on Zoom. The Mathematical Sciences Institute at ANU put on a display of posters celebrating the achievements of Australian women in mathematics. One was the ANU academic Joan Licata, whom I regularly meet on the stairs in my work building. Another was Cheryl Praeger, whose contribution to mathematics and the place of women in that field has been recognised with the naming of a lecture theatre at the University of Western Australia. A third that I recognised was Inge Koch, who is as much a statistician as a mathematician, and she was quoted about this on the poster. I hope the set of over a dozen posters will be available for viewing more than just once a year!

In the evening I turned on Zoom and followed a series of talks from Italy under the name of “Women: Statistically Significant”. Their event was organised by the Italian Statistical Society (SIS) and the International Statistical Institute (ISI) as the final event of the International Year of Women in Statistics and Data Science which ran from 12 May 2020.

I heard the following presentations.

Pierluigi Conti, Sapienza University Roma “The lady with the lamp of statistics”. He did a good job of covering Florence’s life and achievements in the statistical sphere. I began to wonder what Italians would make of this British woman and her exploits, whether they would resonate, and then the next talk came along.

Daniela Cocchi, University of Bologna “Cristina di Belgiojoso: the romantic princess who didn’t disdain numbers”. I had no idea this woman existed! A contemporary of Florence Nightingale, she led an exciting and varied life that I really enjoyed hearing about.

Massimo Attanasio, University of Padua “Gender gap in Italian academia”. It’s potentially dispiriting to be hearing about the gender gap in another country’s academe, but Massimo used some interesting discrete-time Cox models to describe the Italian situation.

At the end, there were a number of video messages from female statisticians worldwide, including Helen MacGillivray from Australia, past ISI President. I stayed around to hear Ada van Krimpen of the Netherlands, ISI Director; Delia North of South Africa; and Atinuke Adebanji of Ghana.

I’m glad some institutions are continuing with the Zoom meetings to allow those of us on the other side of the globe to participate in events like this. Happy Women in Maths Day everyone!

Women in Mathematics Day 2021

On May 12, it’ll be International Women in Maths Day. This is a joyful opportunity for the mathematical community to celebrate women in mathematics. The goal of the day is to inspire women everywhere to celebrate their achievements in mathematics, and to encourage an open, welcoming and inclusive work environment for everybody. The celebration takes place every year, all around the world. The first Women in Mathematics Day was held in 2019. But why May 12? Because that is the birthday of Maryam Mirzakhani (1977 – 2017). In 2014, Maryam Mirzakhani was awarded the Fields Medal for her outstanding contributions to the dynamics and geometry of Riemann surfaces and their moduli spaces, becoming the first woman to be recognised for her mathematical achievements by this top mathematical prize.

In 2019, the ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS) created posters that celebrate women in mathematics and statistics, mainly in South Australia but with representatives from across the world.

In 2020 we were all in lockdown due to COVID-19 but the day did not go by unmarked! Have a look at ACEMS Women in Maths to find videos from prominent women in mathematics (and statistics!) including SCU consultant Marijke Welvaert and SCU Director Alice Richardson. In 2021 the main event from ACEMS is a virtual panel discussion on the day itself, 12 May.

It doesn’t escape my attention that the same day will this year mark the 201st birthday of Florence Nightingale. You can read the Conversation piece I co-wrote for her 200th birthday last year, which focused on the healing power of data. Florence was a prodigious writer, which possibly makes her something of a role model for HDR students struggling to put pen to paper (or fingers to keyboard!). Indeed, I think Florence would do very well with the text-based communication of the 21st century, and I imagine her with smartphone in hand like this.

Her vast array of correspondence means that she has provided us with a large number of quotable quotes for many situations. My favourite is the comment she made about her time in the Crimea (1854 – 1856), and the data she drew together from that experience to inform the reform of the British health and military systems: “However exhausted I might be, the sight of long columns of numbers was perfectly reviving to me.”

I think many statistical consultants would feel the same way. The sight of a long column of numbers means data is available to address a research question, and it’s the intersection between questions, data and methods where advice from a statistical consultant can make the biggest difference to a research project.

#May12 #WomeninMaths #May12WIM

Associate Professor Alice Richardson is Director of the Statistical Consulting Unit (SCU) at the Australian National University. Her research interests are in linear models and robust statistics; statistical properties of data mining methods; and innovation in statistics education. In her role at the SCU she applies statistical methods to large and small data sets, especially for research questions in population health and the biomedical sciences.