An ancient retroviral RNA element hidden in mammalian genomes and its involvement in co-opted retroviral gene regulation

Kitao, Koichi



Kitao et al. Retrovirology (2021) 18:36

Open Access


Koichi Kitao1, So Nakagawa2 and Takayuki Miyazawa1*   

Background:  Retroviruses utilize multiple unique RNA elements to control RNA processing and translation. However,
it is unclear what functional RNA elements are present in endogenous retroviruses (ERVs). Gene co-option from ERVs
sometimes entails the conservation of viral cis-elements required for gene expression, which might reveal the RNA
regulation in ERVs.
Results:  Here, we characterized an RNA element found in ERVs consisting of three specific sequence motifs, called
SPRE. The SPRE-like elements were found in different ERV families but not in any exogenous viral sequences examined. We observed more than a thousand copies of the SPRE-like elements in several mammalian genomes; in
human and marmoset genomes, they overlapped with lineage-specific ERVs. SPRE was originally found in human
syncytin-1 and syncytin-2. Indeed, several mammalian syncytin genes (mac-syncytin-3 of macaque, syncytin-Ten1 of
tenrec, and syncytin-Car1 of Carnivora) contained the SPRE-like elements. A reporter assay revealed that the enhancement of gene expression by SPRE depended on the reporter genes. Mutation of SPRE impaired the expression of wild-type syncytin-2, whereas the same mutation did not affect codon-optimized syncytin-2, suggesting that SPRE activity
depends on the coding sequence.
Conclusions:  These results indicate multiple independent invasions of various mammalian genomes by retroviruses
harboring SPRE-like elements. Functional SPRE-like elements are found in several syncytin genes derived from these
retroviruses. This element may facilitate the expression of viral genes, which were suppressed due to inefficient codon
frequency or repressive elements within the coding sequences. These findings provide new insights into the long-term evolution of RNA elements and molecular mechanisms of gene expression in retroviruses.
Keywords:  Endogenous retroviruses, RNA regulatory element, Syncytin, Mammalian genomes
*Correspondence: takavet@infront.kyoto-u.ac.jp
Laboratory of Virus‑Host Coevolution, Institute for Frontier Life
and Medical Sciences, Kyoto University, 53 Shogoin‑Kawaharacho,
Sakyo‑ku, Kyoto 606‑8507, Japan
Full list of author information is available at the end of the article

Just as the traces of ancient organisms remain as fossils, traces of ancient retroviruses remain as DNA
sequences in host genomes. They are called endogenous
retroviruses (ERVs), remnants of ancient retroviruses
incorporated into the genome through infection of host
germ cells. ERVs are not mere fossil records, as some
are still active as protein-coding genes or regulatory elements in the host genome. Their functional features have
been inherited from ancestral viruses: for example, the
placental fusogenic Syncytin proteins from the fusogenic
envelope protein [1], and lineage-specific host enhancers/promoters from the long terminal repeat (LTR), a
viral multifunctional regulatory element [2]. On the
other hand, much is unknown about the role of RNA elements in ERVs. Recent studies reported that several host
RNA-binding proteins interact with ERV transcripts [3].
However, it is unclear which unique RNA elements are present in ERVs and what their biological significance is for the hosts.
RNA elements provide a layer of post-transcriptional
regulation to balance the gene expression of the viral proteins. Retroviruses have three genes: the gag gene, encoding
the major structural protein; the pol gene, encoding RNase H,
reverse transcriptase, and integrase; and the env gene, encoding the envelope protein. Some retroviruses have additional
RNA-binding proteins involved in post-transcriptional
regulation. For example, human immunodeficiency virus
(HIV)-1 belonging to the Genus Lentivirus encodes the
regulatory protein termed Rev that binds to the Rev-responsive element (RRE) in the env region [4, 5]. The
binding of Rev to RRE facilitates the export of unspliced
viral RNA to the cytoplasm with the host factor CRM1/
XPO1 [6] as well as the translation of Env and regulatory and accessory proteins [7]. Similarly, Rex of human
T-lymphotropic virus 1 belonging to the Genus
Deltaretrovirus [8] and Rem of mouse mammary tumor
virus belonging to the Genus Betaretrovirus [9] are regulatory proteins that bind to their viral RNAs, allowing
efficient viral replication. Mason-Pfizer monkey virus
(MPMV), belonging to the Genus Betaretrovirus, does
not have regulatory proteins but has an RNA element
called the constitutive transport element (CTE). Bray
et  al. [10] initially reported that CTE could compensate
for Rev-deficient HIV-1 replication. Then, it was revealed
that the binding of the host protein TAP/NXF1 to CTE
promotes nuclear transport and the translation of unspliced viral RNA [11, 12]. Similarly, binding of NXF1 to
the cytoplasmic accumulation element (CAE) in the RNA of murine
leukemia virus, which belongs to the Genus Gammaretrovirus, also promotes the expression of viral proteins
[13]. Recent comprehensive mutagenesis approaches
revealed that HIV-1 transcripts contain many undefined
RNA elements required for efficient viral replication [14].
Thus, retroviruses have complex RNA elements in their
short genomes that allow them to replicate efficiently.
Identification of such RNA elements from ERVs is challenging because accumulated mutations may have disrupted such elements. Exceptionally, HERV-K, which
is a young ERV family, retains intact viral ORFs and
shows polymorphic loci in the human genome [15, 16].
The post-transcriptional roles of its RNA-binding regulatory protein, Rec, and its binding RNA element have
been demonstrated [17, 18]. Co-opted viral genes may
provide important clues to investigate ancient viral RNA
elements, given that these elements might have been
similarly conserved to regulate the expression of co-opted genes. Syncytin-1 is an env gene of ERVWE1 and
contributes to cell fusion to differentiate multinucleated
syncytiotrophoblasts in the human placenta [19, 20]. We
previously reported that an RNA element located in the
3′ end of the protein-coding sequence and 3′ untranslated region (3′ UTR) of human syncytin-1 is important
for its protein expression and named it the syncytin post-transcriptional regulatory element (SPRE) [21]. Indeed,
human syncytin-2, another syncytin gene derived from an
env gene of ERVFRDE1 [22], also contains a functional
element in its 3′ UTR [21], although we have not examined it in detail. Such RNA elements would enable us
to examine the RNA regulatory mechanisms of ancient retroviruses.
In this study, a hidden Markov model (HMM)-based
sequence search in an ERV database revealed the core
motifs of SPRE. We found that the defined SPRE-like elements were widespread in 378 distinct ERV families but
not in extant viruses. We also detected the SPRE-like elements in three non-human syncytin genes. A reporter
assay verified their functionality and revealed a unique
feature: the protein-coding sequence of the target gene affects SPRE activity. These results provide
new insights into ancient retroviral post-transcriptional
regulation as well as its involvement in the co-opted
genes from ERVs.

The SPRE‑core motif is functionally essential for SPRE

Previously, we reported that a partial sequence in the 3′
end of the protein-coding sequence and subsequent 3′
UTR of human syncytin-1 (68 nt) and a partial sequence
in the 3′ UTR of syncytin-2 (400 nt) increase protein expression when inserted into the 3′ UTR of an HIV-1 Gag expression plasmid [21] (Fig. 1a). We hypothesized that these
sequences share functional RNA motifs. To explore
the essential motif(s), the two regulatory sequences were
aligned and compared. We found that a 17-nt common
sequence (5′-TCAGCAGGAAGCAGTTA-3′) is shared
by syncytin-1 and syncytin-2 (Fig. 1b). Next, we examined whether the common sequence is essential for the
expression of Syncytin-1 and Syncytin-2. We generated
expression plasmids by cloning syncytin-1 and syncytin-2
with their 3′ UTRs and introduced mutations (11 nucleotides) into the 17-nt common sequence (Fig. 1c). Since
this common sequence overlaps with the syncytin-1 coding sequence, we designed the mutations to avoid any amino
acid substitutions in Syncytin-1. Syncytin expression levels were evaluated by a cell fusion-dependent luciferase
assay, exploiting the fact that both Syncytin-1 and
Syncytin-2 induce strong cell fusion. Mutations in the common sequence markedly reduced the cell
fusion activities of Syncytin-1 and Syncytin-2 (Fig. 1d). ...
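Two computational ideas in the paragraph above can be illustrated with a minimal Python sketch: finding the longest stretch shared by two regulatory sequences, and checking that a mutated coding sequence leaves the encoded protein unchanged. This is not the procedure used in the study, which aligned the real syncytin-1 and syncytin-2 sequences. The flanking sequences, the toy coding sequences, and the function names below are invented for illustration; only the 17-nt common sequence is taken from the text. The reading frame of the real overlap with syncytin-1 is not given here, so the synonymous-mutation check is run on an unrelated toy sequence.

```python
from itertools import product

# 17-nt common sequence reported in the text.
CORE = "TCAGCAGGAAGCAGTTA"

# Hypothetical flanks (NOT real syncytin sequences), chosen so that the
# shared stretch does not extend beyond the core.
toy_a = "GGATCC" + CORE + "TTTAAA"
toy_b = "ACGTAG" + CORE + "CCCGGG"


def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous stretch present in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous position pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def is_synonymous(wild_type: str, mutant: str) -> bool:
    """True if both sequences have the same length and encode the same protein."""
    return len(wild_type) == len(mutant) and translate(wild_type) == translate(mutant)


shared = longest_common_substring(toy_a, toy_b)
silent = is_synonymous("CTTAGTGCC", "TTGTCAGCG")   # toy CDS, six nucleotides changed
changed = is_synonymous("CTTAGTGCC", "CTTAGTGAC")  # toy CDS, last codon altered
```

A longest-common-substring search only finds exact, ungapped matches; real regulatory sequences with substitutions or indels would call for a proper pairwise alignment instead.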



