Group B Streptococcus (GBS) is a gut commensal and a cause of neonatal invasive disease, potentially following vaginal dysbiosis(1). Because maternal colonization is linked to neonatal disease, understanding GBS colonization dynamics across body sites is critical. One hypothesis is that vaginal colonization is seeded from the gut, where GBS is also carried(2). In the absence of whole-genome sequences from either niche, metagenomic samples offer an alternative way to explore GBS strain diversity. However, the low abundance of GBS in the gut complicates strain typing.
We conducted a simulation study to assess GBS strain typing at low abundance. Using Seq2mgs, we created 78 synthetic metagenomes by spiking GBS reads into a background metagenome (accession: SRR12344432) at relative abundances of 0.01X, 0.005X, and 0.001X(3). For each clonal complex (CC), we generated synthetic metagenomes from 2-3 publicly available GBS genomes. Using the k-mer-based tool StrainGE, we tested these synthetic metagenomes against 130 RefSeq reference genomes(4).
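The spike-in design described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Seq2mgs workflow itself: the function names, the placeholder CC and genome labels, and the reading of each abundance level as the fraction of reads in the final mix are all assumptions. If every genome is spiked at each of the three levels, 78 metagenomes would correspond to 26 spiked genomes.

```python
from itertools import product

# Abundance levels as written in the text, with the trailing "X" dropped.
# Treating them as read fractions of the final mix is an assumption.
ABUNDANCE_LEVELS = (0.01, 0.005, 0.001)


def plan_spike_ins(genomes_by_cc, levels=ABUNDANCE_LEVELS):
    """Return one (cc, genome, level) tuple per synthetic metagenome."""
    plan = []
    for cc, genomes in genomes_by_cc.items():
        for genome, level in product(genomes, levels):
            plan.append((cc, genome, level))
    return plan


def reads_to_spike(background_reads, fraction):
    """Number of GBS reads so that GBS makes up `fraction` of the final mix."""
    # Solve g / (g + background_reads) = fraction for g.
    return round(background_reads * fraction / (1 - fraction))


# Hypothetical placeholder labels, not the genomes used in the study.
example_genomes = {
    "CC_A": ["genome_1", "genome_2", "genome_3"],
    "CC_B": ["genome_4", "genome_5"],
}

plan = plan_spike_ins(example_genomes)
print(len(plan))  # 5 genomes x 3 levels = 15 synthetic metagenomes
```

Keeping the design as an explicit list of (CC, genome, abundance) tuples makes it straightforward to check afterwards which combinations were typed correctly.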
GBS CCs were accurately identified for all synthetic metagenomes except those from CC12 and CC452. CC12 samples were identified as a mix of ST7 and ST283, while 3/9 CC452 samples were identified as a mix of ST23 and ST452. These ST mismatches likely reflect the limitations of ST typing (which is based on only seven housekeeping genes) and the particular heterogeneity of CC12(5,6).
Ultimately, our simulation shows that GBS strain types can be accurately recapitulated despite low abundance in complex microbial backgrounds. This supports the potential of applying this approach to empirical metagenomic samples to investigate GBS dynamics across carriage niches.