Oral Presentation International Pasteurellaceae Conference 2014

Investigating the pan-genome of Haemophilus parasuis (#13)

Kate J. Howell 1, Lucy A. Weinert 1, Roy R. Chaudhuri 2, Shi-Lu Luan 1, Sarah E. Peters 1, Jukka Corander 3, David Harris 4, Oystein Angen 5, Virginia Aragon 6, Julian Parkhill 4, Paul R. Langford 7, Andrew N. Rycroft 8, Brendan W. Wren 9, Matthew T.G. Holden 4, Alexander W. Tucker 1, Duncan J. Maskell 1, on behalf of the BRaDP1T consortium
  1. Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
  2. Department of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK
  3. Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
  4. Wellcome Trust Sanger Institute, Cambridge, UK
  5. National Veterinary Institute, Oslo, Norway
  6. Centre de Recerca en Sanitat Animal, UAB-IRTA, Barcelona, Spain
  7. Department of Medicine, Imperial College London, London, UK
  8. The Royal Veterinary College, Hatfield, UK
  9. Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, UK

Haemophilus parasuis is a pathogen of pigs, with strains described to date ranging from non-virulent to hyper-virulent. However, the relationship between genetics and disease outcome is not well understood: a relationship between serovar and virulence is assumed and many virulence factors have been suggested, but many of these have not been tested for statistical significance on a large dataset. We have sequenced over 200 strains of H. parasuis from several European countries, from both clinical and non-clinical backgrounds and including all 15 serovars and their reference strains, to gauge the diversity of this bacterium. We have used these data to estimate the pan-genome with OrthoMCL, separating genes into the core genome (1,049 genes) and the accessory genome and finding 7,431 predicted genes. We have analysed the population structure of this bacterium and tested the core genome for recombination using Bayesian analysis of population structure (BAPS) and the Bayesian recombination analysis tool (BRAT), identifying two separate populations of H. parasuis, one clonal and one recombining. In addition, we have compared the phylogeny to the current MLST scheme and estimated the synteny of the pan-genome, identifying several areas of variation associated with the capsule loci and several phage, and also the distribution of antibiotic resistance genes throughout this strain collection. Finally, we have used the population genetic approach of genome-wide association to examine the relationship between genetics and both clinical status and serovar. We performed a series of statistical analyses to look for associations of genes with clinical phenotypes, examining the overrepresentation of genes and using discriminant analysis of principal components (DAPC) and generalised linear modelling to assess the importance of individual genes or groups of genes with respect to phenotypes.
Using this top-down approach, we have assessed the genetic composition and diversity of H. parasuis as a first step towards understanding its genetics.
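
As an illustration of the core/accessory split mentioned above, the following minimal Python sketch partitions orthologous gene clusters by how many strains carry them. It is not the pipeline used in this work: the function name, the input layout (cluster identifier mapped to the set of strains carrying it), the toy data and the default rule that a core gene must be present in every strain are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]],
    strains: Set[str],
    core_fraction: float = 1.0,
) -> Tuple[Set[str], Set[str]]:
    """Partition gene clusters into core and accessory sets.

    presence maps a (hypothetical) cluster ID to the strains carrying it.
    A cluster is called core when it occurs in at least core_fraction of
    the strains; the default of 1.0 (every strain) is an assumption.
    """
    if not strains:
        raise ValueError("at least one strain is required")
    core: Set[str] = set()
    accessory: Set[str] = set()
    for cluster, members in presence.items():
        n_present = len(members & strains)
        if n_present == 0:
            continue  # cluster absent from the strains under study
        if n_present >= core_fraction * len(strains):
            core.add(cluster)
        else:
            accessory.add(cluster)
    return core, accessory


if __name__ == "__main__":
    # Toy data only; strain and cluster names are invented.
    strains = {"s1", "s2", "s3"}
    presence = {
        "cluster_A": {"s1", "s2", "s3"},
        "cluster_B": {"s1"},
        "cluster_C": {"s2", "s3"},
    }
    core, accessory = split_core_accessory(presence, strains)
    print(len(core), len(accessory), len(core) + len(accessory))
```

Lowering core_fraction (for example to 0.95) gives a more permissive "soft core" that tolerates genes missed in a few draft assemblies; which threshold is appropriate depends on assembly quality and is not specified in this abstract.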