School of Population Health



Antibiotic resistance is recognised as one of the greatest threats to human health worldwide. The threat is compounded by the lack of any new classes of antibiotic drugs in several decades and by the growth in antibiotic-resistant infections in recent years, which shows no sign of stopping.

The genetics of resistance to multiple antibiotics is complex, involving hundreds of genes and the mobile elements that allow those genes to be readily shared, even between different species of bacteria. A better understanding of the assembly, evolution and spread of antibiotic resistance is vital to tracking and containing its emergence.

We are inventing new ways to understand DNA by analogy to natural language. We have developed a novel gene-grammar that can describe the evolution of DNA not just as a series of misspellings and typos (called single nucleotide polymorphisms) but also as vast changes in word order, sentence structures and cut-and-paste errors of large portions of "text".
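The distinction between "typos" and changes in "word order" can be illustrated with a minimal sketch. This is not the gene-grammar or the Attacca engine described on this page; the function names and the gene labels in the usage note are assumptions chosen purely for illustration.

```python
def count_snps(ref_seq, alt_seq):
    """Count single-letter differences between two equal-length DNA strings."""
    if len(ref_seq) != len(alt_seq):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(ref_seq, alt_seq) if a != b)


def classify_change(ref_genes, alt_genes):
    """Classify a change at the level of gene 'words' rather than letters.

    Each argument is an ordered list of gene labels for one DNA region.
    """
    if ref_genes == alt_genes:
        return "identical"
    if sorted(ref_genes) == sorted(alt_genes):
        # Same words, different order: a rearrangement.
        return "reordered"
    remaining = iter(alt_genes)
    if len(alt_genes) > len(ref_genes) and all(g in remaining for g in ref_genes):
        # Every reference word still appears in order, with extra words
        # pasted in between: an insertion.
        return "insertion"
    return "other"
```

For example, count_snps("ACGT", "ACCT") counts one letter-level difference, while classify_change(["intI1", "aadA2", "qacE"], ["intI1", "qacE", "aadA2"]) reports a reordering that a letter-by-letter comparison would not describe well.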

In addition to the grammars, we are developing novel natural language processing, machine-learning algorithms and data visualisation methods that sift through vast stores of biological knowledge and help pinpoint disease-causing genes.

Our work on infectious diseases is carried out in a longstanding partnership with researchers at the Centre for Infectious Diseases and Microbiology (CIDM) at Westmead Hospital.

Higher-order DNA language and antibiotic resistance

We have demonstrated a novel method to systematically sift through bacterial DNA sequences to find new mechanisms by which antibiotic resistance (ABR) genes spread. This ground-breaking work on the higher-order language of DNA continues to advance our understanding of the biomolecular mechanisms underlying the growing antibiotic resistance problem. Results of this work have been published in several journals, including the prestigious Bioinformatics.

Repository of Antibiotic-resistance Cassettes

The Repository of Antibiotic-resistance Cassettes (RAC) is an important tool that makes DNA language technology available to researchers of ABR genetics.

Gene cassettes are small pieces of DNA that can move between bacteria, even between different species, and are the most complex mechanisms for antibiotic resistance transfer.

We developed the RAC as a way to provide a definitive listing of all known antibiotic resistance cassettes. It also offers a way for researchers to contribute new cassettes as they are discovered. Access to our grammatical annotation engine Attacca is free.

Visualising networks of relationships

We have developed a new tool for visualising networks of relationships in the biomedical literature. The networks allow researchers to visualise how organisms relate to diseases, show changing patterns of association over time and allow viewers to zoom in from organism to gene.
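One simple way to think about such a network is as a set of weighted organism-disease links, counted separately for each publication year. The sketch below assumes that representation; it is not the tool described above, and the records and placeholder names are entirely hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (publication year, organisms mentioned, diseases mentioned).
# The names are placeholders, not real literature data.
records = [
    (2001, {"organism_A"}, {"disease_X"}),
    (2001, {"organism_A", "organism_B"}, {"disease_X"}),
    (2005, {"organism_B"}, {"disease_X", "disease_Y"}),
]


def build_edges_by_year(records):
    """Count organism-disease co-mentions per year.

    Returns a mapping: year -> Counter of (organism, disease) -> count.
    """
    edges = defaultdict(Counter)
    for year, organisms, diseases in records:
        for organism in organisms:
            for disease in diseases:
                edges[year][(organism, disease)] += 1
    return edges


edges = build_edges_by_year(records)
```

Comparing the counters for different years gives a rough picture of how the strength of an association changes over time, which a visualisation could then render as edges of varying weight.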

We have applied this tool to help cerebral palsy (CP) researchers explore the literature and connect the dots between CP syndromes and potentially undiscovered agents.


We investigate how genetics and epigenetics contribute to the genesis and progression of cancer and how genetic tests can be used to improve cancer screening and treatment outcomes. We build computational tools that identify such genes using DNA grammars and literature-based discovery.

Project Members
Dr Guy Tsafnat
Casual Academic
Professor Enrico Coiera
Visiting Professor
Project Collaborators: External

Dr Sally Partridge

Professor Jon Iredell