A 1.5-Mb continuous endogenous viral region in the arbuscular mycorrhizal fungus Rhizophagus irregularis

Zhao, Hongda; Zhang, Ruixuan; Wu, Junyi; Meng, Lingjie; Okazaki, Yusuke; Hikida, Hiroyuki; Ogata, Hiroyuki. Kyoto University. DOI:10.1093/ve/vead064



Most fungal viruses are RNA viruses, and no double-stranded DNA (dsDNA) virus that infects fungi is known to date. A recent study detected DNA polymerase genes that originated from large dsDNA viruses in the genomes of basal fungi, suggesting the existence of dsDNA viruses capable of infecting fungi. In this study, we searched for viral infection signatures in chromosome-level genome assemblies of the arbuscular mycorrhizal fungus Rhizophagus irregularis. We identified a continuous 1.5-Mb putative viral region on a chromosome in R. irregularis strain 4401. Phylogenetic analyses revealed that the viral region is related to viruses in the family Asfarviridae of the phylum Nucleocytoviricota. This viral region was absent in the genomes of four other R. irregularis strains and had fewer signals of fungal transposable elements than the other genomic regions, suggesting a recent and single insertion of a large dsDNA viral genome in the genome of this fungal strain. We also incidentally identified viral-like sequences in the genome assembly of the sea slug Elysia marginata that are evolutionarily close to the 1.5-Mb putative viral region. In conclusion, our findings provide strong evidence of the recent infection of the fungus by a dsDNA virus.


Downloaded from https://academic.oup.com/ve/article/9/2/vead064/7334491 by Kyoto University user on 07 March 2024

orientation and with the same database hits. Then, we compared the nucleotide sequences covering these ORFs with the amino acid sequences of the reference sequences in RefSeq by using Dotter v4.22 (Sonnhammer and Durbin 1995) to make dot-plots. On the one hand, some of these neighboring ORFs were identified as candidate genes with introns when the ORFs aligned continuously with the reference sequences at the protein sequence level. These regions were further examined with FGENESH v2.6 (Solovyev et al. 2006) to predict exons. When the program predicted two or more exons, the regions were assumed to contain genes with introns. On the other hand, many other cases were identified as pseudogenes. To maximize the gene annotation, we also incorporated the original fungal gene annotations (Yildirir et al. 2022) for some GEVE regions, that is, the regions with either (1) genes with introns as identified above or (2) no predicted genes or pseudogenes by the above procedure. In the case of (1), we overwrote the above gene annotations with the original fungal annotations. In the case of (2), we added the original fungal


The ORFs were annotated using BLASTp in Diamond (E-value < 10^-5). Because previous studies may have annotated viral insertions in fungal genomes as fungal genes, we used the NR database excluding the sequences from the fungal class Glomeromycetes (NCBI: txid214506, which includes R. irregularis). In annotating the sea slug viral regions, we used the NR database and excluded all sequences of E. marginata (NCBI: txid1093978). The best match for each ORF was used to determine the taxonomic distribution of the ORFs (i.e. eukaryote, prokaryote, and virus). For ORFs with a eukaryotic best hit (excluding E. marginata), we performed BLASTp searches against the other predicted ORFs in strain 4401 outside the 1.5-Mb viral region. Functional annotations were retrieved using eggNOG-mapper v2.1.9 (Cantalapiedra et al.
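
The best-hit step described above can be illustrated with a minimal Python sketch. It assumes that the Diamond hits have already been parsed into tuples and that each subject has already been assigned to one of the three categories (eukaryote, prokaryote, or virus) by a separate taxonomy lookup. The tuple layout, the function names, and the use of the highest bit score to define the "best match" are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def best_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep one best hit per ORF.

    hits: iterable of (orf_id, subject_id, evalue, bitscore, category).
    Assumption: "best" means the highest bit score among hits below the cutoff.
    """
    best = {}
    for orf_id, subject_id, evalue, bitscore, category in hits:
        if evalue >= cutoff:
            continue  # hit does not pass the E-value threshold
        current = best.get(orf_id)
        if current is None or bitscore > current[2]:
            best[orf_id] = (subject_id, evalue, bitscore, category)
    return best


def taxonomic_distribution(best):
    """Count ORFs per category of their best hit."""
    return Counter(category for (_, _, _, category) in best.values())


# Hypothetical hits, for illustration only.
example_hits = [
    ("orf1", "virA", 1e-30, 150.0, "virus"),
    ("orf1", "eukA", 1e-10, 80.0, "eukaryote"),
    ("orf2", "bacA", 1e-8, 60.0, "prokaryote"),
    ("orf3", "eukB", 1e-3, 30.0, "eukaryote"),  # fails the cutoff
    ("orf4", "eukC", 1e-20, 110.0, "eukaryote"),
]

best = best_hits(example_hits)
distribution = taxonomic_distribution(best)
print(distribution)
```

ORFs whose best hit is eukaryotic (orf4 in this toy input) would then be the ones passed to the follow-up search against the remaining ORFs of the genome.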


To identify traces of genes, we extracted the genomic regions between neighboring predicted ORFs and eliminated the regions shorter than 100 bp before performing a BLASTx search using Diamond (E-value < 10^-5). The NR database and all predicted proteins on viral regions identified by ViralRecall were used as the reference, and ‘--ultra-sensitive’ was selected as the parameter. We also used Tandem Repeats Finder v4.09 (Benson 1999) to identify the tandem repeats in the GEVE region, with ‘2 7 7 80 10 30 2000 -f -d -m’ selected as the parameters.
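
As an illustration of the intergenic-region extraction described above, the following Python sketch collects the gaps between consecutive ORFs and discards those shorter than 100 bp. The coordinate convention (1-based, inclusive), the function name, and the handling of overlapping ORFs are assumptions made for this example; the text does not specify them.

```python
MIN_LENGTH = 100  # bp; regions shorter than this are discarded, as in the text


def intergenic_regions(orfs, min_length=MIN_LENGTH):
    """Return (start, end) gaps between consecutive ORFs on one sequence.

    orfs: list of (start, end) pairs, 1-based and inclusive (assumed convention).
    Overlapping ORFs produce no gap.
    """
    regions = []
    ordered = sorted(orfs)
    if not ordered:
        return regions
    prev_end = ordered[0][1]
    for start, end in ordered[1:]:
        gap_start, gap_end = prev_end + 1, start - 1
        if gap_end - gap_start + 1 >= min_length:
            regions.append((gap_start, gap_end))
        prev_end = max(prev_end, end)  # track the furthest ORF end seen so far
    return regions


# Hypothetical ORF coordinates, for illustration only.
example_orfs = [(1, 300), (350, 900), (1100, 1500), (1400, 2000), (2100, 2400)]
regions = intergenic_regions(example_orfs)
print(regions)
```

The retained intervals would then be cut from the genome sequence and written to a FASTA file to serve as queries for the BLASTx search.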

H. Zhao et al.


