COVID-19 vaccine fatigue in Scotland: how do the trends in attrition rates for the second and third doses vary by age, sex and council area?

Robin Muegge is a PhD candidate at the University of Glasgow. He is currently visiting Professor Andrew Zammit Mangion at the University of Wollongong, and luckily for the ACT Branch of the Statistical Society, he was able to make the trip up the escarpment to present this talk on Tuesday 30 April. I attended on Zoom with about half a dozen others, and many more attended in person.

First we learnt that vaccine fatigue (failure to complete the full course of doses) is treated differently to vaccine hesitancy (unwillingness to even begin the course of doses). Robin modelled vaccine fatigue across three doses, so two transitions, using a binomial model for the cumulative count of people who received each dose. Age, sex and council area (32 of them) were the predictor variables in the model.
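To give a flavour of the two-transition idea (a toy sketch with invented counts, not Robin's actual model), each transition between successive doses can be treated as a binomial outcome, with the attrition rate being one minus the estimated uptake probability:

```python
# Toy illustration: attrition between successive vaccine doses as
# two binomial transitions. Counts are invented for illustration only.
n_dose1 = 10_000   # people who received dose 1
n_dose2 = 8_500    # of those, people who went on to dose 2
n_dose3 = 6_800    # of those, people who went on to dose 3

def attrition_rate(n_next: int, n_prev: int) -> float:
    """Attrition = 1 - binomial MLE of the uptake probability."""
    return 1 - n_next / n_prev

rate_1_to_2 = attrition_rate(n_dose2, n_dose1)  # 0.15
rate_2_to_3 = attrition_rate(n_dose3, n_dose2)  # 0.20
```

In the real model these probabilities are not simple proportions but depend on the predictors (age, sex, council area) and a spatial random effect.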

There was some visual evidence of a spatial effect, so Robin used a BYM model and INLA estimation for the parameters. In the end, space was highly correlated with deprivation (a familiar story to those of us researching in the Australian population health setting) and both age and sex, and their interaction, were also important effects.

Robin also described a second project in progress, pondering what happens if Tobler's law doesn't hold. His context here was the smoothing of standardised incidence ratios (SIRs), and wanting to identify three types of outlier: global, contextual and collective. Robin has begun by proposing a Poisson model for the counts that feed into the SIRs, and he is now searching for a robust spatial smoothing method that is not too computationally expensive, so that it is attractive to practitioners, not just researchers.

I wish Robin well with his projects – they are shaping up to be a strong contribution to statistical methodology and application.

Spatial prediction of non-negative spatial processes using asymmetric losses

Distinguished Professor Noel Cressie visited RSFAS on Wednesday 24 April and because of the Anzac Day holiday, Noel gave his seminar a day earlier than usual. I was pleased to join around twenty others in person whose schedules accommodated the change in day.

Noel’s title is quite technical and much of his talk addressed formulations and theoretical results. Nonetheless, as always, his research has been motivated by pressing real problems, particularly in the area of climate change. One of the applications discussed today was how to deal with predictions of extreme weather events, e.g. flood levels, where underprediction has very different implications to overprediction. This is the asymmetric loss of the talk’s title.

The proposed asymmetric loss function in this talk was a power-divergence loss, which is based on a ratio rather than a difference. The well-known Kullback-Leibler distance is a special case of this family of loss functions, giving it a nice grounding in familiar territory.
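For readers who want the flavour of the family, here is my own coding of the standard Cressie–Read power-divergence form for a single positive value and its prediction (an illustrative sketch, not necessarily the exact loss used in the talk). Being ratio-based, it penalises under- and over-prediction differently:

```python
def power_divergence_loss(y: float, pred: float, lam: float = 1.0) -> float:
    """Cressie-Read power-divergence between a positive value y and a
    positive prediction `pred` (lam != 0, -1). At lam = 1 this reduces
    to the Pearson form (y - pred)**2 / pred, which is asymmetric in
    y and pred because it scales by the prediction."""
    return 2.0 / (lam * (lam + 1)) * (y * ((y / pred) ** lam - 1) + lam * (pred - y))

# Asymmetry: underpredicting a flood level of 4 m as 2 m costs more
# than overpredicting a level of 2 m as 4 m.
under = power_divergence_loss(4.0, 2.0)  # (4 - 2)**2 / 2 = 2.0
over = power_divergence_loss(2.0, 4.0)   # (2 - 4)**2 / 4 = 1.0
```

The Kullback-Leibler-type loss appears in the limit as `lam` goes to 0.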

Noel illustrated the theory with an oldie-but-a-goodie example of zinc concentrations in soil on the floodplain of the River Meuse in the Netherlands. He also alluded to recent flood events in Australia to help bring local relevance to the examples.

Sample size for monitoring disease in free-ranging wildlife populations

Professor James Booth of Cornell University has been a longtime colleague of what is now the ANU Research School of Finance, Actuarial Studies & Statistics. It was a pleasure to attend his seminar in the School on Thursday 11 April. Around 25 people attended in person.

James’ talk reported on one of those conversations that turn into a full-blown methodology project. In this case, a conversation with a veterinary scientist led to sample size estimation for an evidence base on the eradication of chronic wasting disease in the free-ranging deer populations of the USA.

Neither a hypergeometric model nor a binomial model was going to account for the lack of independence between animals, so James ended up with a beta-binomial model. In answer to the question “What would Bayesians do?”, James then added priors on the beta distribution parameters. The final addition to the model was to take account of the fact that animals are not independent at all, but tend to live in family groups of four to five.
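To see why the beta-binomial matters, here is a sketch with invented parameter values (not James's actual calculation): when prevalence varies from herd to herd, the chance of detecting at least one positive animal in a fixed sample drops relative to the independent-animal binomial case.

```python
import math

def detect_prob(n: int, a: float, b: float) -> float:
    """P(at least one positive among n sampled animals) when prevalence
    follows a Beta(a, b) distribution (beta-binomial sampling):
    P(no positives) = E[(1 - p)^n] = B(a, b + n) / B(a, b)."""
    def log_beta(x: float, y: float) -> float:
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return 1.0 - math.exp(log_beta(a, b + n) - log_beta(a, b))

# Mean prevalence is 5% in both cases, but the beta-binomial allows
# extra variation, which lowers the detection probability for the
# same sample size of 60 animals:
p_binomial = 1 - 0.95 ** 60                  # independent animals
p_betabin = detect_prob(60, a=5.0, b=95.0)   # overdispersed prevalence
```

This kind of gap is what drives the wide range of sample sizes quoted below.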

And the final answer? A sample maybe as large as 300, or maybe as small as 16, depending on correlations!

Enhancing Bayesian small area level methods with applications in health

It’s been a pleasure collaborating with the spatio-temporal researchers at QUT and the Cancer Atlas, in particular with Susanna Cramb and her PhD student Jamie Hogg. His PhD exit seminar was on Monday 8 April, and I joined over 25 people online (plus the crowd in person) to hear his presentation.

As always, Jamie’s talk did not disappoint. He took us through the pillars of his research, rooted in cancer and the burden of disease. His focus has also been on risk factors, modelled using small area methods right down to SA2 level. He has had to overcome several methodological hurdles around sparse data, the need to develop indices of risky behaviour, and the potential of a two-stage logistic normal approach. MrP (multilevel regression and post-stratification) was also invited to the methodological party.

As well as all that, Jamie came second and first in two hackathons, made two trips to Canada to present his research, and attended another nine conferences (one of which was organised by my team, see the report here!) Congratulations on making it to the final stages of a PhD, Jamie, and all the best for the final write-up!

Optimal dynamic treatment regime inference – a tale of two methods

Dr Weichang Yu of the University of Melbourne came to RSFAS to present this seminar on Thursday 4 April. Around 25 people attended in person to hear this talk about management of patients with ongoing disease.

The two methods of the title were a maximum likelihood method and a so-called Bayesian method, although Weichang did say he thought the Bayesian method was not well named! Nonetheless, he took the audience through the detail of the expressions to be minimised, along with a couple of small examples to show how changing the treatment mid-way through a clinical trial could indeed be beneficial.
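The idea that switching treatment mid-way can help is the core of a dynamic treatment regime. Here is a standard textbook-style backward-induction (Q-learning) sketch with invented numbers, not Weichang's model: choose the best stage-2 action in each intermediate state first, then value the stage-1 treatments against those optimal continuations.

```python
# Toy two-stage dynamic treatment regime via backward induction.
# Q2[state][action] = expected final outcome (higher is better);
# all numbers are invented for illustration.
Q2 = {
    "responded":     {"continue": 8.0, "switch": 6.0},
    "not_responded": {"continue": 3.0, "switch": 7.0},
}

# Optimal stage-2 rule and value in each intermediate state.
V2 = {s: max(acts.values()) for s, acts in Q2.items()}
rule2 = {s: max(acts, key=acts.get) for s, acts in Q2.items()}

# Stage-1 value of each initial treatment, given assumed response rates,
# assuming the optimal stage-2 rule is then followed.
response_rate = {"drug_A": 0.7, "drug_B": 0.4}
Q1 = {a: r * V2["responded"] + (1 - r) * V2["not_responded"]
      for a, r in response_rate.items()}
rule1 = max(Q1, key=Q1.get)  # best first-line treatment
```

With these numbers the optimal regime starts on drug_A but switches treatment for non-responders, which is exactly the mid-trial change of treatment proving beneficial.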

Defining Disclosure and Optimising the Twin Uniform Distribution for Multiplicative Noise Data Masking

Dr Pauline Ding of the University of Wollongong has been President of the Canberra Branch of the Statistical Society for the last two years and so it falls to her to give the outgoing President’s address after the Branch AGM. This all took place on Tuesday 26 March.

Her talk focused on recent research into multiplicative noise masking, a well-known method to perturb data for privacy protection purposes. The twin uniform distribution has been introduced in the literature as a distribution for multiplicative noise, given the simplicity of its mathematical form and its ability to provide good protection of values without sacrificing statistical utility. Pauline showed us how to optimise the choice of the two distribution parameters for best performance in terms of privacy protection and utility loss when applying twin uniform noise for data masking. She finished with an example of a real accounts payable dataset, which showed that the twin uniform distribution (basically two uniform distributions with a gap in between) yields good results for both privacy and utility.
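To make the "two uniforms with a gap" concrete, here is my own illustrative parameterisation (the exact form and parameter choices in Pauline's work may differ): noise is drawn symmetrically around 1, and the gap guarantees every value gets perturbed by some minimum amount.

```python
import random

def twin_uniform(a: float, b: float) -> float:
    """Draw multiplicative noise from a 'twin uniform' distribution:
    uniform on [1 - b, 1 - a] or on [1 + a, 1 + b] with equal
    probability, so the noise has mean 1, but the gap (1 - a, 1 + a)
    guarantees each value is perturbed by at least a fraction a."""
    lo, hi = (1 - b, 1 - a) if random.random() < 0.5 else (1 + a, 1 + b)
    return random.uniform(lo, hi)

def mask(values, a=0.1, b=0.3, seed=42):
    """Mask a list of values with twin uniform multiplicative noise."""
    random.seed(seed)
    return [v * twin_uniform(a, b) for v in values]

masked = mask([120.0, 250.0, 990.0])
```

The optimisation Pauline described is over the two parameters (here `a` and `b`): a wider gap means more privacy, but also more distortion of the masked data.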

At the AGM prior to the talk, the Branch voted to change its name to the ACT Branch. Look out for the new name in posts to come!

Policy, politics and a healthy human future

Monday 25 March – It was a great privilege to spend an hour and a half in the company of Rt Hon Helen Clark, former Prime Minister of New Zealand, along with a few hundred others of course! She was joined on stage at the ANU Manning Clark Lecture Theatre by ANU Professors Sharon Friel, Arnagretta Hunter and Bina D’Costa, in a conversation organised by the Australian Global Health Alliance. The event was opened by Professor Genevieve Bell, the new ANU vice-chancellor. She spoke about the many firsts she brings to the role (first female, first social scientist, etc., but only the second redhead!).

The conversation ranged widely across the title topics of policy, politics and a healthy human future. I particularly appreciated Helen’s words on leadership being for the people, and the importance of asking questions like: is this idea good for inclusion? For resilience? For sustainability? I also liked that Bina mentioned the importance of data and statistics in countering false claims – so I can claim that this talk was relevant to my working role too!

There was only time for a couple of questions at the end, followed by closing affirmations about the importance of interdisciplinary research. A rousing way to end the first day of the working week!

Prediction models in healthcare: a playground for researchers

One of the benefits of ISCB membership is invitations to seminars on interesting biostatistics topics from around the world, especially Europe.

Professor Richard Riley of the University of Birmingham presented this talk to an online audience mid-morning European time, so after dinner for me.

He concentrated his attention on models for individualised prediction, using anything from regression models to random forests or neural networks.

Firstly, Richard pointed out the vast number of prediction models already published, and encouraged researchers to consider publishing validations rather than yet another model. He also urged us to stop dichotomising continuous variables, and to think really hard about instability. Richard had some startling examples of where AI or even explainable AI (X-AI) doesn’t solve instability; indeed, accepting the default settings for a random forest could lead to confidence limits for a risk from 0 to 100%! Richard has a number of papers on related topics, and R packages called pmsampsize and pmvalsampsize, and he also referred to a paper on models for COVID by van Smeden and others.

Design advice for species occurrence studies

Darryl’s top five tips on this topic proved to be a huge drawcard for the New Zealand Statistical Association, who promoted this talk: over 100 people showed up online to hear Darryl MacKenzie of Proteus impart his wisdom gained from over 20 years of design and analysis of species occurrence studies.

I thought about whether I should spill the beans on what those tips were, and I decided I will, firstly because this blog post cannot possibly delve into all the detail which Darryl provided around these tips. Nor was I able to stay for the Q&A at the end of the talk, which no doubt drew out many more subtle points around the tips.

So, to the tips. Here they are!
1. Clearly define the population of interest. This is likely to be the landscape, not the species on it!
2. Define the extent of a “presence”. This could be spatial, or temporal, and may be bigger or smaller than you initially think.
3. Limiting where/how data is collected will limit inferences. For example, if you only collect data along a road or a transect.
4. Conduct repeat surveys to collect data on the detection process. This can be repeating in space or time (or both).
5. Evaluate how much effort is enough. A simulation study is a valuable tool.
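Tips 4 and 5 can be illustrated with a back-of-envelope calculation (my own sketch, under an assumed per-visit detection probability, not a figure from the talk): with imperfect detection, the chance of recording a species at an occupied site climbs quickly with repeat surveys, and a small calculation or simulation tells you how much effort is enough.

```python
def detection_by_visits(p_detect: float, n_visits: int) -> float:
    """P(species detected at least once at an occupied site), given a
    per-visit detection probability and a number of repeat surveys."""
    return 1 - (1 - p_detect) ** n_visits

def visits_needed(p_detect: float, target: float = 0.95) -> int:
    """Smallest number of repeat visits reaching the target probability."""
    n = 1
    while detection_by_visits(p_detect, n) < target:
        n += 1
    return n

# With a 40% chance of detecting the species on any one visit, six
# visits are needed to be 95% sure of detecting it where present.
n_visits = visits_needed(0.4)
```

A fuller simulation study would also vary occupancy, the number of sites and the survey costs, which is exactly the trade-off tip 5 asks you to evaluate.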

I suppose the point of these five tips is that at their heart they speak to design tips for all sorts of studies – what is the population? What is the outcome? What are the limitations? What about reproducibility? And how powerful (in qualitative terms as much as quantitative) are the results? I hope most of the 100+ participants were able to recognise those connections to their research projects too.

Bayesian semi-parametric mechanistic modelling with greta for ecology and epidemiology

Professor Nick Golding of Telethon Kids Institute & Curtin University is in town for a workshop on his software package, greta. It was a good idea for the Canberra Branch of the Statistical Society to ask Nick to present a talk to the Branch, so that those who weren’t going to the workshop could gain some insight into the capabilities of greta too. Attendance online (40+) well outweighed attendance in person (15), which demonstrates the strong interest in this content both locally and more broadly.

Nick started with an overview of the applied statistical modelling work his research group at Telethon Kids does. Nick’s roots are in vector-borne diseases, in particular malaria, but like so many disease modellers since 2020, COVID has kind of taken over for a few years. The semi-mechanistic models that Nick’s team built for the pandemic even featured in some of the early press conferences around disease transmission.

The models are built using Nick’s R package greta, and he ran a short live demo of the main functions, emphasising their simple interface and speed. Questions at the end picked up on a number of the technical aspects of greta, especially around its interaction with other products such as Stan and TensorFlow.

There’s a great joke somewhere that starts with “Stan and Greta went into a bar …” but I haven’t come up with the punchline yet!