Highlights
• Detection of specific mutations induced by pks+ E. coli strains, including Nissle 1917
• Mutation classifier indicates 12% of CRC display colibactin mutagenesis
• Colibactin-associated CRC cases have APC mutations at colibactin motifs
• Colibactin-CRC cases have a younger age of onset in multiple cohorts
Summary
Co-culture of intestinal organoids with a colibactin-producing pks+ E. coli strain (EcC) revealed mutational signatures also found in colorectal cancer (CRC). E. coli Nissle 1917 (EcN) remains a commonly used probiotic, despite harboring the pks operon and inducing double-strand DNA breaks. We determine the mutagenicity of EcN and three CRC-derived pks+ E. coli strains with an analytical framework based on sequence characteristics of colibactin-induced mutations. All strains, including EcN, display varying levels of mutagenic activity. Furthermore, a machine learning approach attributing individual mutations to colibactin reveals that patients with colibactin-induced mutations are diagnosed at a younger age and that colibactin can induce a specific APC mutation. These approaches allow the sensitive detection of colibactin-induced mutations in ∼12% of CRC genomes and even in whole exome sequencing data, representing a crucial step toward pinpointing the mutagenic activity of distinct pks+ E. coli strains.
Graphical abstract

Keywords
cancer genomics
genotoxins
organoids
mutational signatures
colorectal cancer
colibactin
probiotics
bacteria
mutagenesis
machine learning
Introduction
E. coli strains associated with increased colorectal cancer (CRC) risk harbor the polyketide synthase (pks) operon.1,2,3,4,5 This operon is responsible for the production of the genotoxin colibactin. Recent studies demonstrate that colibactin can alkylate adenines bivalently and cause DNA cross-links.6,7 Indeed, pks+ strains induce DNA double-strand breaks (DSBs) in cell lines.2 Furthermore, the co-culture of pks+ E. coli with intestinal organoids and subsequent whole genome sequencing (WGS) revealed its ability to cause single base substitutions (SBS) and short insertions-deletions (ID) in the form of mutational signatures SBS88 and ID18, respectively.8 SBS88- and ID18-related mutations are characterized by T > N substitutions and T deletions in adenine- and thymine-rich genomic regions,8 in line with other reports on colibactin-induced DSBs.9 Simultaneous presence of SBS88 and ID18 could be detected in tumor genomes, of which the majority were CRC cases, pointing to colibactin as a source of mutations in CRC genomes.10
Several E. coli strains that belong to specific B2 phylogroup lineages harbor the pks operon,11 but it is not clear if they have an equal capability to induce mutations in the epithelium.12 E. coli Nissle 1917 (EcN) is a well-studied probiotic strain, commonly used to treat inflammatory bowel disease (IBD).13 Notably, EcN harbors the pks operon in its genome.2 Current evidence shows that EcN has diminished ability to cause DSBs compared to other pks+ strains.14 Additionally, a recent study using the HPRT gene assay indicates a mutagenic effect of EcN in CHO cells.15 However, no evidence for genome-wide EcN-induced mutations in primary human cells exists to date, and its relative mutagenicity to other pks+ E. coli strains is unknown. To address this, we determined the mutational consequences of a panel of pks+ E. coli strains, consisting of EcN and 3 CRC-derived strains, using the previously established human organoid co-culture system followed by WGS.8,16 Here, we develop 2 computational approaches, (1) relying on the colibactin DNA target motif and (2) a random forest model, to improve the detection of individual colibactin-induced mutations.
pks+ E. coli co-culture screening reveals heterogeneous mutagenic activity by different strains
First, we established an intestinal organoid co-culture panel comprising EcN and 3 additional CRC-derived pks+ E. coli strains: CFF16-2F8 (2F8), CFF159-19H2 (19H2) (both from17), and the previously tested strain EcC18 (STAR Methods). All strains showed comparable growth dynamics in co-culture, although EcN displayed slightly reduced expansion potential (Figure S1A). EcN caused DNA damage in organoids exposed for 24 h, measured by the presence of nuclear γH2AX foci, a DSB marker (Figures S1B and S1C). While EcN did not induce the same level of DSBs as EcC, the DNA damage level was considerably increased over both negative controls, which were injected with dye or EcCΔclbQ, an EcC pks mutant strain unable to produce colibactin (FDR-adjusted p values, Wilcoxon test; EcN: 0.004, EcC: 0.004, EcCΔclbQ: 0.0313, dye: 0.25; Figures S1B and S1C).8,18
To characterize the mutagenic effects of the pks+ E. coli strains on co-cultured organoids, we performed single-cell WGS by primary template amplification (PTA) using the PTA analysis toolbox (PTATO, STAR Methods)19,20 (Figure 1A; Table S1). SBS and indel numbers were similar across conditions in our experiments (Figures 1B and 1C). The SBS and ID mutational signature profiles of EcC- and 2F8-exposed organoids were similar to SBS88 and ID18, while EcN and 19H2 showed limited similarity (Figures 1D and 1E), evaluated by cosine similarity (Figure S2H). This is partially in line with the mutational signature refitting results (including colibactin-induced SBS88 and ID18 and in vitro signatures SBS1, SBS5, SBS18, ID1, and ID2), where only organoids exposed to EcC have a significant contribution of SBS88 (p value 0.031; Dunn’s test with FDR correction) (Figure 1F) and ID18 (p value 0.024; Dunn’s test with FDR correction) (Figure 1G). However, traces of the most characteristic SBS88 peaks are observable in EcN-, 19H2-, and especially 2F8-exposed organoids, suggesting that mutational signature refitting is not sensitive enough in samples with a low signal-to-noise ratio. We repeated the experiment using clonal expansion of organoids exposed to dye, EcC, or EcN with comparable results (Figures S2D–S2L; Table S1).
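The cosine similarity used to compare an observed mutational profile with a reference signature can be sketched as follows. This is a minimal illustration, not the analysis code of the study: the function name and the toy 4-channel vectors are assumptions made for the example, whereas real SBS profiles have 96 trinucleotide channels.

```python
import math


def cosine_similarity(profile, signature):
    """Cosine similarity between two equal-length, non-negative vectors.

    `profile` could hold mutation counts per channel and `signature` the
    reference probabilities per channel (both hypothetical inputs here).
    """
    if len(profile) != len(signature):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(p * s for p, s in zip(profile, signature))
    norm_p = math.sqrt(sum(p * p for p in profile))
    norm_s = math.sqrt(sum(s * s for s in signature))
    if norm_p == 0 or norm_s == 0:
        return 0.0  # an empty profile has no defined direction
    return dot / (norm_p * norm_s)


# Toy vectors: counts that are an exact multiple of the reference probabilities
observed = [12, 3, 0, 5]
reference = [0.6, 0.15, 0.0, 0.25]
similarity = cosine_similarity(observed, reference)
print(round(similarity, 3))
```

Because the measure depends only on the shape of the profile and not on the total mutation count, it suits comparisons between samples with different mutation loads, but it can stay low when a true signal is diluted by other mutational processes.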

Figure 1 Mutational profiles and signature contributions of intestinal organoids exposed to pks+ E. coli strains amplified by PTA
Motif filtering improves detection of colibactin-induced mutations
Since the presence of other mutational processes affects the detection of a given signature in mutational datasets,21 we used the extended DNA contexts of colibactin-induced mutations8 to optimize the detection. These contexts appeared in organoids exposed to each strain, and not in control (Figure 2A). The presence of two adenines 3 and 4 bases upstream of the T > N mutation (-3-4AA) was the most significantly enriched motif when comparing mutations from EcC and control (p value 4.40 × 10^−84, one-sided Fisher’s exact test) (Figure 2B). All exposed organoid genomes (to EcN, 19H2, and 2F8) presented a significant enrichment of colibactin-induced mutations with adenines at the -3-4AA positions (Figures 2C and 2D, p value EcN = 1.28 × 10^−11, 2F8 = 7.50 × 10^−70, 19H2 = 5.10 × 10^−12; one-sided Fisher’s exact test). Cosine similarity and Spearman correlation indicated similarity between T > N trinucleotide profiles of all pks+–exposed organoids and SBS88. This similarity was more pronounced when considering only mutations with the -3-4AA colibactin motif, including those from organoids exposed to EcN, 2F8, and 19H2 (Figure 2E).
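The motif test described above can be sketched in two steps: flag T > N mutations that carry adenines at positions −3 and −4, then compare exposed and control counts with a one-sided Fisher’s exact test. The sketch below is an illustration under stated assumptions: the 9-base context encoding (mutated base at index 4), the strand handling via reverse complement, the function names, and the toy sequences are all hypothetical and not taken from the study’s code.

```python
from math import comb

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def has_minus3_minus4_aa(context):
    """True if a T > N mutation has adenines at positions -4 and -3.

    `context` is assumed to be a 9-base reference sequence centred on the
    mutated base (index 4). A central A is read as a T on the other strand.
    """
    context = context.upper()
    if len(context) != 9:
        raise ValueError("expected a 9-base context")
    if context[4] == "A":
        context = context.translate(COMPLEMENT)[::-1]  # reverse complement
    if context[4] != "T":
        return False  # not a T > N substitution
    return context[0:2] == "AA"  # indices 0 and 1 are positions -4 and -3


def fisher_exact_greater(a, b, c, d):
    """One-sided p value for enrichment in the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, k_max + 1)
    )
    return numerator / comb(n, col1)


# Toy contexts (hypothetical): count motif-positive mutations per condition
exposed = ["AACGTACGT", "ACGTACGTT", "AAGCTTGCA", "CCGGTCCGG"]
control = ["CCGGTCCGG", "GCGCTAGCG", "TTACTGGAC", "AATGTCCAT"]
exp_pos = sum(has_minus3_minus4_aa(s) for s in exposed)
ctl_pos = sum(has_minus3_minus4_aa(s) for s in control)
p_value = fisher_exact_greater(
    exp_pos, len(exposed) - exp_pos, ctl_pos, len(control) - ctl_pos
)
print(exp_pos, ctl_pos, p_value)
```

The hypergeometric sum uses exact integer arithmetic and divides only once, which avoids floating-point underflow in the intermediate terms for large mutation counts.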

Figure 2 A colibactin-specific -3-4AA mutational motif is enriched in organoids co-cultured with four pks+ E. coli strains, including EcN
We further optimized the number of T > N trinucleotides used to best distinguish colibactin mutations (STAR Methods). Using mutations occurring at the 17 most frequent SBS88 trinucleotides resulted in the most significant enrichment of -3-4AA presence against the control (Figure 2F, p value = 3.67 × 10^−78 for -3-4AA enrichment in EcC-treated organoids compared to control, 2.03 × 10^−13 for EcN compared to control, one-sided Fisher’s exact test). Finally, by generating a sampled range of mutations from control and EcC-exposed organoids, we estimated the -3-4AA motif fraction of all strains relative to EcC (STAR Methods). In our organoid co-culture system, EcN had an estimated 32.9% (95% confidence interval between 21.2% and 62.9%) -3-4AA motif fraction relative to the EcC strain (Figure 2G). Additionally, 19H2 and 2F8 induced 33.7% (95% confidence interval between 21.7% and 64.4%) and 112.0% (95% confidence interval between 62.1% and 238.3%) of the -3-4AA motif fraction of EcC, respectively (Figure 2G). This variability in motif enrichment across strains was also demonstrated by the relative fraction of T > N substitutions with -3-4AA in exposed organoid cells, and enrichments remained stable in resampling-based analyses (Figures S3H and S3I). Additionally, all motif-based enrichments were similar to those obtained using data from clonally expanded organoids exposed to EcC and EcN (Figures S2A–S2G, S2J, and S2K).
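One simplified way to think about a motif fraction "relative to EcC" is to place a strain’s -3-4AA fraction on a scale running from the control background (0%) to EcC (100%). The sketch below uses this linear rescaling as a stand-in for the sampling-based estimate described in the STAR Methods, which is not reproduced here; the function name and the input fractions are hypothetical and are not values from the study.

```python
def relative_motif_fraction(strain_frac, control_frac, ecc_frac):
    """Rescale a strain's motif fraction so that control = 0 and EcC = 1.

    All three inputs are fractions of T > N mutations carrying the -3-4AA
    motif. This linear rescaling is an assumption made for illustration.
    """
    if ecc_frac <= control_frac:
        raise ValueError("EcC fraction must exceed the control background")
    return (strain_frac - control_frac) / (ecc_frac - control_frac)


# Hypothetical fractions, chosen only to show the arithmetic
relative = relative_motif_fraction(strain_frac=0.10, control_frac=0.05, ecc_frac=0.20)
print(f"{relative:.1%}")
```

Under this reading, a value above 100% means that the strain’s motif fraction exceeds that of EcC, and the width of the confidence interval reflects how few motif-positive mutations separate a weakly mutagenic strain from background.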
Genetic differences between the pks island of genotoxic E. coli strains
To investigate if this divergent mutagenicity could be linked to sequence differences in the pks island, we compared the pks island genetic sequences of each strain. In line with a recent report on pks island diversity,11 we found only a few variants, of which most were single base changes (Figures S3A and S3B; Table S2). While some coding changes could influence colibactin production and secretion, most occurred in the self-protection gene clbS of the pks operon. In EcC, most of these mutations occurred with an allele frequency of roughly 0.5 and the coverage of clbS was increased compared to neighboring regions, suggesting an allele duplication (Figure S3C). Nevertheless, the small overall differences suggested causes other than pks island mutations as the source of mutagenic heterogeneity.
Detection of -3-4AA colibactin-mutations in cancer sequencing datasets
To test if this analytical framework could improve the detection of colibactin mutagenesis, we studied a WGS cohort consisting of more than 4,800 metastatic cancers (Hartwig Medical Foundation dataset; HMF).22 119 out of 4,858 samples (2.4%) displayed a significant (p value < 0.001, Fisher’s exact test, one-sided) enrichment for the colibactin motif with a -3-4AA fraction higher than 0.22. We set this cutoff (Figure S4A; ROC curve, Youden index, optimal cut-off: 0.16, STAR Methods) to exclude potential false positive samples from tissues with implausible colibactin exposure, such as brain and bone tumors (Figure 3A, STAR Methods). The cohort contained 656 CRC samples, of which 105 were classified as colibactin motif positive (16%) (Figures 3A and 3B). In addition, colibactin mutagenesis was detected in 2 out of 22 rectal (9.1%), 2 out of 66 small intestine (3%), 5 out of 191 urothelial tract (2.6%), 1 out of 73 head and neck (1.3%), and 1 out of 622 lung (0.16%) samples from the HMF cohort (Figure 3A). Next, we compared the motif classification method to signature refitting of SBS88 and ID18 (Figures S4B–S4D). The motif-based method allowed detection of samples with lower levels of SBS88 and ID18 mutational signatures or with high contribution of other mutational processes (Figures S4C and S4D), highlighting the advantage of using the -3-4AA motif to detect colibactin mutagenesis.
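Choosing a cutoff by the Youden index means scanning candidate thresholds along the ROC curve and keeping the one that maximizes J = sensitivity + specificity − 1. The following is a minimal sketch of that selection, assuming each sample has a -3-4AA fraction and a binary reference label; the function name, the labels, and the toy scores are hypothetical, and no claim is made that this mirrors the software used in the study.

```python
def youden_optimal_cutoff(scores, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A sample is called positive when its score is strictly greater than
    the cutoff. `labels` holds 1 for reference positives and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s > cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if not y and s <= cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:  # keep the lowest cutoff among ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy motif fractions (hypothetical): four reference negatives, three positives
scores = [0.05, 0.08, 0.12, 0.15, 0.25, 0.30, 0.40]
labels = [0, 0, 0, 0, 1, 1, 1]
cutoff, j_stat = youden_optimal_cutoff(scores, labels)
print(cutoff, j_stat)
```

The Youden optimum weighs sensitivity and specificity equally, so a stricter cutoff than the optimum trades some sensitivity for fewer false positive calls.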

Figure 3 Motif-based and RF-based classification improves the detection of colibactin mutations in WGS cancer data
The motif-based analysis revealed a cluster of four -3-4AA colibactin motif-positive samples characterized by a high mutational load, of which three harbored POLE hotspot mutations (POLEmut) (Figure S5A). POLE encodes the catalytic subunit of DNA polymerase epsilon, and hotspot driver mutations are known to result in a hypermutator phenotype.23,24 POLEmut samples are associated with mutational signatures SBS10a, SBS10b, and SBS28, the latter of which is marked by T>G mutations at T[T>G]T.25 POLEmut-associated T > N mutations displayed -3-4AA enrichment (Figure S5B),9 but only at T[T>G]T mutations (Figures S5C and S5D), and were more similar to SBS28 than SBS88 (cosine similarity, Figure S5B). Thus, POLEmut SBS28-enriched samples can be classified as false positives because of the presence of -3-4AA enrichment in T[T>G]T substitutions. Further, we assessed the specificity of colibactin-induced mutation detection by Pearson correlation between the number of -3-4AA mutations and the contributions of COSMIC mutational signatures. SBS88 was the only signature showing a clear correlation (R^2 = 0.88, p < 2.2 × 10^−16) (Figures S5E and S5F).
A random forest model for colibactin-linked mutation detection
To further investigate the specific mutations caused by colibactin, we employed a random forest (RF) model that predicts the probability that a mutation was caused by colibactin. We trained two models, one on WGS data of EcC-exposed organoids and one on WGS data of CRC patients (Figure S6A, STAR Methods). Both models place particular importance on the −3 and −4 positions, in concordance with the motif analysis (Figure S6B). For the final probability, we multiply the posterior probabilities of the two models. When classifying the CRC samples included in the HMF dataset, we observed a near perfect correlation between the relative contribution of SBS88 and the fraction of colibactin-induced mutations above the 10% threshold (Pearson’s correlation = 0.92, p value < 2 × 10^−16, Figure S6C), and a far lower correlation below that threshold, albeit still significant (Pearson’s correlation = 0.44, p value < 2 × 10^−14). Any sample with more than 10% contribution is considered positive. This RF prediction correlates with the -3-4AA enrichment found using the motif-based method (Figure S6D). The fraction of colibactin-induced mutations did not correlate meaningfully with any signature other than SBS88 (Figures S5E and S5F).
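The combination step can be illustrated with a short sketch: each mutation receives one posterior probability from the organoid-trained model and one from the CRC-trained model, and the two are multiplied. The per-mutation cutoff of 0.5 used to count a mutation as colibactin-attributed is an assumption for this example, as are the function names and the toy probabilities; the trained random forests themselves are not reproduced.

```python
def combined_probabilities(p_organoid, p_crc):
    """Multiply the per-mutation posteriors of two models element-wise."""
    if len(p_organoid) != len(p_crc):
        raise ValueError("both models must score the same mutations")
    return [a * b for a, b in zip(p_organoid, p_crc)]


def attributed_fraction(p_organoid, p_crc, mutation_cutoff=0.5):
    """Fraction of mutations whose combined probability exceeds the cutoff.

    The default cutoff of 0.5 is an illustrative assumption.
    """
    combined = combined_probabilities(p_organoid, p_crc)
    if not combined:
        return 0.0
    return sum(p > mutation_cutoff for p in combined) / len(combined)


# Toy posteriors for ten mutations in one sample (hypothetical values)
p_org = [0.9, 0.8, 0.2, 0.6, 0.1, 0.3, 0.95, 0.4, 0.05, 0.5]
p_crc = [0.9, 0.5, 0.9, 0.9, 0.2, 0.3, 0.9, 0.4, 0.5, 0.5]
fraction = attributed_fraction(p_org, p_crc)
print(fraction)
```

Multiplying the two posteriors is conservative: a mutation scores highly only when both models agree, which fits the stated aim of reducing false positive attributions.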
We re-classified all samples in the HMF dataset with the RF model and found 12.3% of CRC samples to be positive (Figure 3C). In total, 27 out of 635 samples were called positive by the motif method but negative by the RF method (Figure 3D). Given that the RF method is designed to eliminate false positives, its classification could be more accurate.
Large WES datasets may yield additional information on the prevalence and timing of colibactin-induced mutagenicity. However, for signature analysis, WGS data are preferred.26 We assessed the performance of both the -3-4AA motif and the RF model on WES data (down-sampled from WGS; WGS classification considered as reference). Only considering mutations in exonic regions, the RF score showed a near perfect correlation between WES and WGS (Pearson correlation = 0.99, p value = 2.2 × 10^−16, Figure S8E), while the -3-4AA counting showed more spread (Figure S6F). We thus used the RF model to classify a large WES cohort consisting of 2,825 cancer WES genomes of The Cancer Genome Atlas (TCGA) (Figure 3E). Using an adapted WES threshold (Figure S6E), we showed that in total 12.5% of CRC samples are positive, in line with the estimations in the WGS HMF cohort (Figure 3C). We also observed a clear enrichment for positive cases in the rectal cancer samples, which was in line with earlier findings.9
CRC driver mutation analysis
Despite the detection of colibactin-induced mutagenesis by the RF, even in positive samples most mutations (93%, SD = 3.4%) have been caused by other processes (Figure 3F). To test whether colibactin-induced mutagenesis can contribute to oncogenesis, we classified mutations in known CRC driver genes. We selected genes with mutations in more than 5% of the samples in the IntOGen database27 and in at least 5 CRC cases in the HMF dataset. This resulted in 10 CRC driver genes: APC, TP53, KRAS, BRAF, PIK3CA, SMAD4, FBXW7, TCF7L2, FAT4, and ATM (Figure 4A). We found that in randomly selected genes the mean difference in probability that mutations are caused by colibactin between the colibactin-positive and negative classes was 0.116 (SD = 0.21). The CRC genes APC, SMAD4, BRAF, FBXW7, ATM, and TCF7L2 showed a larger difference in probability, ranging from 0.157 to 0.213 (Figure 4A). Of these, APC and BRAF mutations were significantly more likely to be induced by colibactin in colibactin-positive samples after correction for multiple testing (p value = 0.011 and 0.041, respectively; Student’s t test).

Figure 4 Colibactin-motif enables reliable detection of colibactin-induced mutagenesis in WES cohorts
TP53 was significantly less likely to harbor colibactin-induced mutations compared to random genes in colibactin-positive CRC samples (p value = 0.0029, Student t test). When we classified all positions within TP53, we found that only 378 positions (3.9% of total positions) had a posterior probability to be mutated by colibactin above 0.5, whereas APC contains 5778 (8.5%) such positions. The probability distribution of APC mutations displayed an enrichment in mutations with high probability (Figure 4B), indicating specific colibactin-induced damage. The randomly selected genes showed the background distribution of colibactin damage, with a long tail in the positive class. This was absent in the probability distribution of TP53, showing the depletion of colibactin-induced damage.
Interestingly, we found a high probability mutation hotspot at c.835-8A>G in APC. This hotspot was strongly enriched in colibactin-positive CRC, with 7.7% of the patients harboring this specific mutation versus 2.2% of negative patients (p value = 0.01, one-sided Fisher’s exact test). The hotspot is in intron 8 of APC, has a predicted pathogenicity score by FATHMM28 of 0.92 and an RF probability of 0.7592, and leads to a premature stop codon.29 It has been reported in patients with both familial adenomatous polyposis and unexplained colorectal polyposis.30 This suggests a role for this specific mutation in the development of CRC.
Colibactin-linked mutations correlate with earlier CRC onset
APC mutations in CRC are predominantly explained by the aging signature SBS1.31 However, the enrichment of mutations with high colibactin probability in APC could imply that colibactin-induced mutagenesis might accelerate the development of CRC in colibactin-positive individuals. For the HMF dataset, at tumor metastasis sampling, colibactin-positive CRC patients were significantly younger than colibactin-negative patients (mean age: 58.48 versus 63.51 years, respectively; Wilcoxon test, p value = 0.004) (Figure 4C). In the TCGA cohort, which consists of primary cancers, positive patients had a mean age of 62.96 at diagnosis vs. 67.96 in the negative set (Wilcoxon test, p value = 7 × 10^−5) (Figure 4D). We also investigated a CRC screening cohort,32 where healthy colon crypts were sequenced from patients undergoing a colonoscopy. Here we found that in the patient group where CRC was diagnosed, the mean age at diagnosis was lower for the colibactin-positive compared to negative patients, albeit not significantly (63.43 vs. 67.38 years, respectively, p value = 0.1, Wilcoxon test, Figure 4E). There is no correlation between the fraction of colibactin-induced damage, age of the patient, and whether CRC was diagnosed (Figure 4F). Taken together, these results suggested that while colibactin can prime cells for transformation early during life, additional hits that are caused by other mutational processes are necessary for tumorigenesis.
Discussion
Both the motif-based and random forest-based classifications distinguish a larger group of tumors with enrichment of colibactin-linked mutations than mutational signature refitting does. The colibactin-positive samples detected were mostly CRC, amounting to more than 12% of CRC cases. All other positive samples originated from organs harboring a microbiota, like the urinary tract, head and neck, lung, rectum, or small intestine. The absence of tumors from organs without a microbiota is indicative of preserved specificity. In addition, in healthy colorectal crypts, 21% of patients were reported using signature analysis to contain SBS88 and ID18-positive crypts.32,33 The RF model enables interrogation of WES cohorts with a much lower false positive rate compared to signature refitting. WES signature refitting resulted in 30 samples being falsely classified as colibactin linked. However, the motif-based approach and RF model enable reliable detection of true positive samples. This opens the door to systematically interrogating WES datasets for colibactin and, potentially, other mutational patterns.
This study adds to evidence on EcN’s DNA-damaging and mutagenic properties in relationship to its probiotic role,15,34 yet the variance in DNA damage among colibactin producers11,14 remains unexplained. We explored the pks island sequences of all strains used in this study and were able to detect a small number of genomic variants across strains. Beyond these, differences in production levels of rate limiting components of the pks enzymatic machinery, differences in how the toxin is exported and reaches the eukaryotic nuclei, as well as strain differences in metabolism of iron, spermidine, glucose, or inulin, which have been proposed to affect colibactin production ability,35,36,37,38,39 could explain the differential mutagenic capacity. Finally, the relatively lower expansion speed of EcN compared to the cancer-derived strains in our experimental setup (Figure S1A) may potentially lead to an underestimation of its mutagenic capacity in vivo. Overall, the lack of correlation between intra-organoid expansion and mutagenicity across the whole strain panel suggests further factors influencing relative genotoxicity of strains. The human gut with a complete microbiota, mature mucus layer, and immune system, inter-individual differences in DNA repair efficiency and the duration of the exposure could further influence the mutagenic potential of pks+ bacteria, including EcN. Whether cell-intrinsic or -extrinsic, the factors regulating colibactin production could be of clinical interest to target and reduce the mutagenic ability of pks+ bacteria. Given that healthy colon cells accumulate only ∼40 SBS mutations each year,31,32 prolonged exposure of the human gut to even lowly mutagenic pks+ strains could result in a markedly increased mutation load.
While earlier studies report hotspot mutations resulting in truncated APC,8,14,30 in vivo evidence of colibactin-induced mutagenesis leading to transformation is lacking. Comparison of EcN with other pks+ bacterial strains in such in vivo studies will help to elucidate the relative mutagenicity and specific risk caused by this probiotic strain. As EcN is used as a probiotic in conditions of varying severity and even in young patient groups, a careful assessment of its potential long-term mutagenicity in relation to clinical benefits is warranted for each of these use cases. Assessment of EcN-linked mutations in animal models and patients treated with EcN is required to determine the safety of this commonly prescribed probiotic. The framework presented in this manuscript is expected to translate well to in vivo datasets and could thereby contribute to future clinical assessment of EcN mutagenicity.
Limitations of the study
The mutational features of the frameworks used are not exclusively present in colibactin-induced mutations. A low background level of mutations within a -3-4AA motif or classified by the RF is present in all mutation catalogs, including our control organoid dataset, which was completely devoid of any colibactin exposure. Therefore, a minimal threshold of colibactin-induced mutations is needed to classify a sample as exposed to pks+ E. coli. In our study, we did not observe any specific differences in the mutational profiles induced by the different pks+ E. coli strains. Thus, strain-specific classification, or even determining the influence of any probiotic treatment by assessing mutation characteristics, is not possible. Although the RF model predicts that a particular driver mutation was likely to have been caused by colibactin-induced mutagenesis, this study does not allow us to causally link pks+ E. coli exposure to the induction of cancer. Dedicated epidemiological studies, coupled with WGS/WES to determine past mutagenic activity of pks+ E. coli, may help address whether exposure to pks+ E. coli, including Nissle, increases the risk of cancer onset.
References
1.
Arthur, J.C. ∙ Perez-Chanona, E. ∙ Mühlbauer, M. …
Intestinal inflammation targets cancer-inducing activity of the microbiota
Science (New York, N.Y.). 2012; 338:120-123
2.
Nougayrède, J.P. ∙ Homburg, S. ∙ Taieb, F. …
Escherichia coli Induces DNA Double-Strand Breaks in Eukaryotic Cells
Science. 2006; 313:848-851
3.
Wirbel, J. ∙ Pyl, P.T. ∙ Kartal, E. …
Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal cancer
Nat. Med. 2019; 25:679-689
4.
Yachida, S. ∙ Mizutani, S. ∙ Shiroma, H. …
Metagenomic and metabolomic analyses reveal distinct stage-specific phenotypes of the gut microbiota in colorectal cancer
Nat. Med. 2019; 25:968-976
5.
Pleguezuelos-Manzano, C. ∙ Puschhof, J. ∙ Clevers, H.
Gut Microbiota in Colorectal Cancer: Associations, Mechanisms, and Clinical Approaches
Annu. Rev. Cancer Biol. 2022; 6:65-84
6.
Xue, M. ∙ Kim, C.S. ∙ Healy, A.R. …
Structure Elucidation of Colibactin and its DNA Cross-Links.
Science (New York, N.Y.). 2019; 365:aax2685
7.
Wilson, M.R. ∙ Jiang, Y. ∙ Villalta, P.W. …
The Human Gut Bacterial Genotoxin Colibactin Alkylates DNA
Science (New York, N.Y.). 2019; 363:aar7785
8.
Pleguezuelos-Manzano, C. ∙ Puschhof, J. ∙ Rosendahl Huber, A. …
Mutational signature in colorectal cancer caused by genotoxic pks+ E. coli
Nature. 2020; 580:269-273
9.
Dziubańska-Kusibab, P.J. ∙ Berger, H. ∙ Battistini, F. …
Colibactin DNA-damage signature indicates mutational impact in colorectal cancer
Nat. Med. 2020; 26:1063-1069
10.
Alexandrov, L.B. ∙ Kim, J. ∙ Haradhvala, N.J. …
The repertoire of mutational signatures in human cancer
Nature. 2020; 578:94-101
11.
Auvray, F. ∙ Perrat, A. ∙ Arimizu, Y. …
Insights into the acquisition of the pks island and production of colibactin in the Escherichia coli population
Microb. Genom. 2021; 7, 000579
12.
Bossuet-Greif, N. ∙ Vignard, J. ∙ Taieb, F. …
The Colibactin Genotoxin Generates DNA Interstrand Cross-Links in Infected Cells
mBio. 2018; 9, e02393-17
13.
Schultz, M.
Clinical use of E. coli Nissle 1917 in inflammatory bowel disease
Inflamm. Bowel Dis. 2008; 14:1012-1018
14.
Iftekhar, A. ∙ Berger, H. ∙ Bouznad, N. …
Genomic aberrations after short-term exposure to colibactin-producing E. coli transform primary colon epithelial cells
Nat. Commun. 2021; 12:1003
15.
Nougayrède, J.P. ∙ Chagneau, C.V. ∙ Motta, J.P. …
A Toxic Friend: Genotoxic and Mutagenic Activity of the Probiotic Strain Escherichia coli Nissle 1917
mSphere. 2021; 6, e0062421
16.
Puschhof, J. ∙ Pleguezuelos-Manzano, C. ∙ Martinez-Silgado, A. …
Intestinal organoid cocultures with microbes
Nat. Protoc. 2021; 16:4633-4649
17.
Buc, E. ∙ Dubois, D. ∙ Sauvanet, P. …
High Prevalence of Mucosa-Associated E. coli Producing Cyclomodulin and Genotoxin in Colon Cancer
PLoS One. 2013; 8, e56964
18.
Cougnoux, A. ∙ Dalmasso, G. ∙ Martinez, R. …
Bacterial genotoxin colibactin promotes colon tumour growth by inducing a senescence-associated secretory phenotype
19.
Gonzalez-Pena, V. ∙ Natarajan, S. ∙ Xia, Y. …
Accurate genomic variant detection in single cells with primary template-directed amplification
Proc. Natl. Acad. Sci. USA. 2021; 118, e2024176118
20.
Middelkamp, S. ∙ Manders, F. ∙ Peci, F. …
Comprehensive Single-Cell Genome Analysis at Nucleotide Resolution Using the PTA Analysis Toolbox
Preprint at bioRxiv. 2023;
21.
Kucab, J.E. ∙ Zou, X. ∙ Morganella, S. …
A Compendium of Mutational Signatures of Environmental Agents
Cell. 2019; 177:821-836.e16
22.
Priestley, P. ∙ Baber, J. ∙ Lolkema, M.P. …
Pan-cancer whole-genome analyses of metastatic solid tumours
Nature. 2019; 575:210-216
23.
Campbell, B.B. ∙ Light, N. ∙ Fabrizio, D. …
Comprehensive Analysis of Hypermutation in Human Cancer
Cell. 2017; 171:1042-1056.e10
24.
Rayner, E. ∙ van Gool, I.C. ∙ Palles, C. …
A panoply of errors: polymerase proofreading domain mutations in cancer
Nat. Rev. Cancer. 2016; 16:71-81
25.
Hodel, K.P. ∙ Sun, M.J.S. ∙ Ungerleider, N. …
POLE Mutation Spectra Are Shaped by the Mutant Allele Identity, Its Abundance, and Mismatch Repair Status
Mol. Cell. 2020; 78:1166-1177.e6
26.
Koh, G. ∙ Degasperi, A. ∙ Zou, X. …
Mutational signatures: emerging concepts, caveats and clinical applications
Nat. Rev. Cancer. 2021; 21:619-637
27.
Gonzalez-Perez, A. ∙ Perez-Llamas, C. ∙ Deu-Pons, J. …
IntOGen-mutations identifies cancer drivers across tumor types
Nat. Methods. 2013; 10:1081-1082
28.
Rogers, M.F. ∙ Shihab, H.A. ∙ Mort, M. …
FATHMM-XF: Accurate prediction of pathogenic point mutations via extended features
Bioinformatics. Published online. 2018;
29.
Fostira, F. ∙ Thodi, G. ∙ Sandaltzopoulos, R. …
Mutational spectrum of APC and genotype-phenotype correlations in Greek FAP patients
BMC Cancer. 2010; 10:389
30.
Terlouw, D. ∙ Suerink, M. ∙ Boot, A. …
Recurrent APC Splice Variant c.835-8A>G in Patients With Unexplained Colorectal Polyposis Fulfilling the Colibactin Mutational Signature
Gastroenterology. 2020; 159:1612-1614.e5
31.
Blokzijl, F. ∙ de Ligt, J. ∙ Jager, M. …
Tissue-specific mutation accumulation in human adult stem cells during life
Nature. 2016; 538:260-264
32.
Lee-Six, H. ∙ Olafsson, S. ∙ Ellis, P. …
The landscape of somatic mutation in normal colorectal epithelial cells
Nature. 2019; 574:532-537
33.
Cagan, A. ∙ Baez-Ortega, A. ∙ Brzozowska, N. …
Somatic mutation rates scale with lifespan across mammals
Nature. 2022; 604:517-524
34.
Massip, C. ∙ Branchu, P. ∙ Bossuet-Greif, N. …
Deciphering the interplay between the genotoxic and probiotic activities of Escherichia coli Nissle 1917
PLoS Pathog. 2019; 15, e1008029
35.
Oliero, M. ∙ Calvé, A. ∙ Fragoso, G. …
Oligosaccharides increase the genotoxic effect of colibactin produced by pks+ Escherichia coli strains
BMC Cancer. 2021; 21:172
36.
Wallenstein, A. ∙ Rehm, N. ∙ Brinkmann, M. …
ClbR Is the Key Transcriptional Activator of Colibactin Gene Expression in Escherichia coli
mSphere. 2020; 5, e00591-20
37.
Tronnet, S. ∙ Garcie, C. ∙ Rehm, N. …
Iron Homeostasis Regulates the Genotoxicity of Escherichia coli That Produces Colibactin
Infect. Immun. 2016; 84:3358-3368
38.
Chagneau, C.V. ∙ Garcie, C. ∙ Bossuet-Greif, N. …
The Polyamine Spermidine Modulates the Production of the Bacterial Genotoxin Colibactin
mSphere. 2019; 4:e00414-19
39.
Dougherty, M.W. ∙ Jobin, C.
Shining a Light on Colibactin Biology
Toxins. 2021; 13, 346
40.
Buc, E. ∙ Dubois, D. ∙ Sauvanet, P. …
High prevalence of mucosa-associated E. coli producing cyclomodulin and genotoxin in colon cancer
PLoS One. 2013; 8:e56964
41.
Wickham, H.
ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York, 2016
42.
Robin, X. ∙ Turck, N. ∙ Hainard, A. …
pROC: an open-source package for R and S+ to analyze and compare ROC curves
BMC Bioinf. 2011; 12:77
43.
Manders, F. ∙ Brandsma, A.M. ∙ de Kanter, J. …
MutationalPatterns: the one stop shop for the analysis of mutational processes
BMC Genom. 2022; 23:134
44.
Sato, T. ∙ Stange, D.E. ∙ Ferrante, M. …
Long-term expansion of epithelial organoids from human colon, adenoma, adenocarcinoma, and Barrett’s epithelium
Gastroenterology. 2011; 141:1762-1772
45.
Bartfeld, S. ∙ Bayram, T. ∙ van de Wetering, M. …
In vitro expansion of human gastric epithelial stem cells and their responses to bacterial infection
Gastroenterology. 2015; 148:126-136.e6
46.
Osorio, F.G. ∙ Rosendahl Huber, A. ∙ Oka, R. …
Somatic Mutations Reveal Lineage Relationships and Age-Related Mutagenesis in Human Hematopoiesis
Cell Rep. 2018; 25:2308-2316.e4
47.
Alexandrov, L.B. ∙ Jones, P.H. ∙ Wedge, D.C. …
Clock-like mutational processes in human somatic cells
Nat. Genet. 2015; 47:1402-1407
48.
Cancer Genome Atlas Research Network
Comprehensive molecular characterization of gastric adenocarcinoma
Nature. 2014; 513:202-209
49.
Cancer Genome Atlas Network
Comprehensive molecular characterization of human colon and rectal cancer
Nature. 2012; 487:330-337
50.
Cancer Genome Atlas Research Network
Comprehensive molecular profiling of lung adenocarcinoma
Nature. 2014; 511:543-550
51.
Cancer Genome Atlas Research Network
Comprehensive genomic characterization of squamous cell lung cancers
Nature. 2012; 489:519-525
52.
Cancer Genome Atlas Network
Comprehensive genomic characterization of head and neck squamous cell carcinomas
Nature. 2015; 517:576-582
53.
Benjamin, D. ∙ Sato, T. ∙ Cibulskis, K. …
Calling Somatic SNVs and Indels with Mutect2
Preprint at bioRxiv. 2019;