Interrogating the viral dark matter of the rumen ecosystem with a global virome database

Jun 18, 2023

Nature Communications volume 14, Article number: 5254 (2023)

The diverse rumen virome can modulate the rumen microbiome, but it remains largely unexplored. Here, we mine 975 published rumen metagenomes for viral sequences, create a global rumen virome database (RVD), and analyze the rumen virome for diversity, virus-host linkages, and potential roles in affecting rumen functions. Containing 397,180 species-level viral operational taxonomic units (vOTUs), RVD substantially increases the detection rate of rumen viruses from metagenomes compared with IMG/VR V3. Most of the classified vOTUs belong to Caudovirales, differing from those found in the human gut. The rumen virome is predicted to infect the core rumen microbiome, including fiber degraders and methanogens, to carry diverse auxiliary metabolic genes, and thus likely to impact the rumen ecosystem in both a top-down and a bottom-up manner. RVD and the findings provide useful resources and a baseline framework for future research to investigate how viruses may impact the rumen ecosystem and digestive physiology.

A flurry of recent virus-focused metagenomic studies has generated very large viral genome catalogs and databases for several ecosystems, including the oceans1,2, the human gut3,4,5, and soil6. They have revealed highly diverse viromes, identified numerous auxiliary metabolic genes, and shed new light on the ecological impact of viruses. In addition, studies focused on model systems have begun to reveal how viruses can reprogram the metabolism of their prokaryotic hosts, forming distinct virocells that alter the ecological fitness and metabolism of the hosts7. Emerging evidence supports the potential impacts of viruses on ocean biogeochemistry1,8, human physiology4, and disease states9. No similar studies on the rumen virome, and no rumen-specific virome database, are available.

The rumen hosts a diverse multi-kingdom ecosystem containing bacteria, archaea, fungi, protozoa, and viruses. Collectively, the rumen microbiome digests and ferments otherwise indigestible feed and provides most of the energy (in the form of volatile fatty acids) and metabolizable nitrogen (in the form of microbial protein) that ruminants need to grow and produce meat and milk. Strong associations of rumen bacteria, archaea, and protozoa with feed efficiency, methane (CH4) emissions, and animal health have been documented10, but rumen viruses, though abundant, remain poorly understood despite virus-focused studies that have contributed to the characterization of the rumen virome11,12. Early studies using electron microscopy documented morphologically diverse bacteriophages and revealed the predominance of tailed phages. Early culture-dependent studies found bacteriophages that could infect a wide range of species or strains of rumen bacteria, including prevalent species of Prevotella, Ruminococcus, and Streptococcus, and classified most of these phages, based on their morphology, into the families Myoviridae, Siphoviridae, Podoviridae, and Inoviridae (reviewed by Gilbert and Klieve15). Although these studies provided valuable information on rumen viruses, simple phage morphologies do not allow reliable taxonomic classification, and the International Committee on Taxonomy of Viruses (ICTV: https://ictv.global/taxonomy) therefore no longer recognizes morphology-based virus classification.

Genomics, metagenomics, and metatranscriptomics have become the primary technologies for studying viromes, including the rumen virome. Recent culture-dependent whole-genome sequencing identified 10 phages that infect Prevotella ruminicola, Ruminococcus albus, Streptococcus bovis, and Butyrivibrio fibrisolvens16,17, which play important roles in feed digestion and fermentation. These phage genomes exhibit modular genome organization, conserved viral genes, and the potential to be lytic and lysogenic. Rumen viruses have also been studied using metagenomes of virus-like particles (VLPs) (reviewed in ref. 11). However, the reference genome databases that have been used underrepresent rumen viruses, thus limiting the identification and classification of rumen viruses and the prediction of their hosts. For example, rumen viruses with diverse genotypes have been found, but most of them could not be classified because they lacked matches to reference viral sequences18,19,20. Miller et al.18 found clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) elements in some rumen microbial genomes and metagenomes but found few spacer sequences matching rumen viral sequences for host prediction. Therefore, it has been difficult to characterize the rumen virome, especially with respect to novel viruses.

[...] 12-fold) and IMG/VR V3 and improving the identification of viral sequences based on rumen metagenomics, RVD will be useful as a new community resource and will provide new insights for future studies on the rumen virome and its implications for feed digestion, microbial protein synthesis, feed efficiency, and CH4 emissions.

[...] >5 kb each and clustered them into 411,125 vOTUs. After validation with VIBRANT23, we constructed a rumen virome database (RVD, download available at https://zenodo.org/record/7412085#.ZDsE2XbMK5c) representing 397,180 vOTUs (Supplementary Fig. 1), with 193,327 vOTUs of >10 kb. Checking with CheckV21 revealed 4400 complete vOTUs, 4396 high-quality vOTUs, and 32,942 medium-quality vOTUs. The completeness and quality of the RVD vOTUs were probably underestimated because CheckV is database dependent, and the databases used are primarily derived from other ecosystems. All the vOTUs in RVD meet the Minimum Information about an Uncultivated Virus Genome (MIUViG) standards25.

[...] >50% completeness of the current study and the two largest human gut virome databases (MGV4 and GPD5). For better visualization, only one representative vOTU (the longest and most complete) was included for each genus-level vOTU (714 in total). The branches were color-coded: green, the Caudovirales lineages exclusively found in the human virome; red, the lineages exclusively found in the rumen virome of the current study; blue, the lineages found in both the rumen and the human viromes. Lysogeny rates (proportion) were calculated with VIBRANT and shown as the inner ring. The number of vOTUs representing each lineage was shown as a bar plot (red for rumen viruses, and black for human viruses). d Proportion of lineages of Caudovirales viruses unique to the human intestine, the rumen, and shared. e A rarefaction curve of the vOTUs identified in the rumen virome. The upward trend of the rarefaction curve indicates that more rumen viruses remain to be identified at the species level.

[...] >1 phage per host genome.
The percentage of lysogenic viruses varied among the host genera, and it was low for most host genera (Fig. 3c). Most ciliate single-cell amplified genomes (SAGs) presented multiple endogenous viral elements (EVEs), among which all five SAGs of Isotricha sp. YL-2021b and Dasytricha ruminantium presented the greatest number (>50) of EVEs per SAG (Supplementary Fig. 5). Little is known about viruses infecting ciliates, and no EVEs have been reported even for model ciliate species (e.g., Tetrahymena thermophila). However, EVEs have recently been found in Entamoeba and Giardia in human stool metagenomes32. Therefore, rumen ciliates probably carry EVEs. The large number of EVEs per ciliate SAG may correspond to the high polyploidy and the enormous numbers of chromosomes found in many rumen ciliates (e.g., >10,000 in Entodinium caudatum33).

[...] >12-fold). Based on the gene-sharing network, most rumen vOTUs were clustered into four groups (Fig. 3b). Groups I (the largest) and IV (the smallest) contained more classified vOTUs than groups II and III. Groups I and IV had a broader host range among bacterial phyla, including both gram-positive and gram-negative bacteria with different niches and capacities, but few of their genera or families were predominant in the rumen. Groups II and III mainly infected Bacteroidota and Methanobacteriota, respectively (Fig. 3c), and most viruses of these two groups could not be classified with any of the current virome databases; thus, they represent new viral lineages. The narrow host range (a single phylum) of groups II and III supports the notion that phages with a high degree of gene sharing generally infect phylogenetically related hosts.

[...] >2400) and bacteriophages (>40,000) down to the species level, and many of the host species are known to play important roles in feed digestion, fermentation, and methane emissions. Advancement in the prediction of hosts and virus-host linkages will aid in understanding the ecological roles of rumen viruses.
Such information will be especially useful when both the rumen metagenome and virome are investigated for their association with major rumen functions. Among the rumen vOTUs with a predicted host match, 99.5% were inferred to infect prokaryotes primarily found in the rumen, even though most of the reference prokaryote genomes that were used came from prokaryotes in other environments, demonstrating the rigor and low false positive rate of our host prediction pipeline.

[...] >5 kb were verified using VirSorter222 (option: --min-score 0.5), and the contigs that passed the verification procedure were input to CheckV21 to trim off host sequences flanking prophages. We only chose viral contigs >5 kb because the currently available bioinformatics tools show a relatively high false positive rate when identifying viral contigs <5 kb30. Only the contigs falling into categories Keep1 and Keep2 were retained as putative viral contigs (708,580 in total) for further analyses.

[...] >10 kb to genus-level viral taxa based on a gene-sharing network using vConTACT226, which uses NCBI RefSeq Viral (release 88) as reference genomes. The vOTUs that could be clustered with the reference genomes of a viral genus were assigned to that genus according to the vConTACT2 workflow. We assigned the vOTUs that failed to be assigned to a viral genus and those <10 kb to family-level viral taxa using the majority rule, as applied previously4. Briefly, we predicted the ORFs of each vOTU using Prodigal56 and then aligned the ORF sequences with those of NCBI RefSeq Viral using BLASTp with a bit score of ≥50. The vOTUs that were aligned with the NCBI RefSeq Viral genomes of a viral family with >50% of their protein sequences were assigned to that family.
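The majority rule for family-level assignment can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input shape (one best BLASTp hit per ORF, given as a family name and a bit score) and the function name `assign_family` are hypothetical, and the sketch assumes that the ">50% of their protein sequences" criterion uses all predicted ORFs of the vOTU as the denominator.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# One entry per predicted ORF: (family of the best RefSeq hit, bit score),
# or None when the ORF has no hit at all. This input shape is hypothetical.
Hit = Optional[Tuple[str, float]]


def assign_family(
    best_hits: Sequence[Hit],
    min_bit_score: float = 50.0,  # bit-score cutoff quoted in the text
    min_fraction: float = 0.5,    # ">50% of their protein sequences"
) -> Optional[str]:
    """Return the family supported by a strict majority of ORFs, else None."""
    n_orfs = len(best_hits)
    if n_orfs == 0:
        return None

    # Count ORFs per family, ignoring missing hits and hits below the cutoff.
    counts = Counter(
        hit[0]
        for hit in best_hits
        if hit is not None and hit[1] >= min_bit_score
    )
    if not counts:
        return None

    family, n_supporting = counts.most_common(1)[0]
    # Assumption: the denominator is every ORF of the vOTU, not only ORFs with hits.
    return family if n_supporting / n_orfs > min_fraction else None


example = [
    ("Siphoviridae", 120.0),
    ("Siphoviridae", 80.0),
    ("Siphoviridae", 55.0),
    None,
]
print(assign_family(example))  # Siphoviridae (3 of 4 ORFs support it)
```

A vOTU with exactly half of its ORFs matching one family returns `None` here, because the threshold is treated as strictly greater than 50%.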
We identified crAss-like phages using BLASTn against 2,478 crAss-like phage genomes identified from previous studies57,58,59, with a threshold of ≥80% sequence identity along ≥50% of the length of previously identified crAss-like vOTUs.

[...] >50% were included in the search. We then aligned each of the marker genes from the three databases using MAFFT62, removed the positions with >50% gaps using trimAl63, concatenated each aligned marker gene, and filled the gap where a marker gene was absent. Only the concatenated marker genes that each contained >3 marker genes and were found in >5% of all the aligned concatemers were retained, resulting in 10,203 Caudovirales marker gene concatemers, each with 13,573 alignment columns. These marker gene concatemers were clustered into genus-level vOTUs as described previously5, where benchmarking was performed to achieve high taxonomic homogeneity using NCBI RefSeq Viral genomes. We built a phylogenetic tree of Caudovirales viruses using FastTree v.2.1.9 (option: -mlacc 2 -slownni -wag)64 and aligned the concatenated marker genes of the representative vOTU sequences of all the genus-level vOTUs with genome completeness >50% (based on CheckV analysis). The Caudovirales tree was visualized using iTOL65. The vOTUs identified as prophages or encoding an integrase were considered lysogenic. The lysogenic rate (%) was calculated based on the VIBRANT results as the percentage of lysogenic viruses among all the viruses for each genus of their probable hosts.

[...] >2,500 bp of a host genome or MAG matched a vOTU sequence at >90% sequence identity over 75% of the vOTU sequence length4. We predicted probable protozoal hosts of the rumen viruses by searching the 52 high-quality ciliate SAGs68 for EVEs using BLASTn and the above criteria.

[...] >10 kb (5912 in total) for AMG identification using the criteria recommended in a benchmarking paper30.
The selected vMAGs were then subjected to AMG identification and genome annotation using DRAMv72 after processing with VirSorter2 with the option "--prep-for-dramv" applied. Second, the AMG-carrying vMAGs were removed if the AMGs were at an end of the vMAGs or if the AMGs were not flanked by both one viral hallmark gene and one viral-like gene or by two viral hallmark genes (category 1 and category 2 as determined by DRAMv). Third, the remaining vMAGs were further manually curated based on the criteria specified in the VirSorter2 SOP (https://doi.org/10.17504/protocols.io.bwm5pc86; also see https://github.com/yan1365/RVD/blob/main/vmags_check_helper/readme.txt). We eventually obtained 1,880 vMAGs. To further minimize false identification, we manually checked the genomic context of these vMAGs and found that some of them were still possible genomic islands. Therefore, we filtered the 1,880 vMAGs based on the criteria established by Sun and Pratama et al. (unpublished data). Briefly, vMAGs with only integrases/transposases, tail fiber genes, or any nonviral genes were removed. The remaining vMAGs were filtered again to remove those that did not have at least one of the viral structural genes (i.e., capsid protein, portal protein, phage coat protein, baseplate, head protein, tail protein, virion structural protein, and terminase) and those containing genes encoding an endonuclease, plasmid stability protein, lipopolysaccharide biosynthesis enzyme, glycosyltransferase (GT) families 11 and 25, nucleotidyltransferase, carbohydrate kinase, or nucleotide sugar epimerase. We eventually obtained 504 vMAGs free of genomic islands. To benchmark our curation pipeline, 100 of the vMAGs were randomly selected for detailed manual curation based on their genomic context. According to the benchmarking results, we were confident that we retained only complete vMAGs for AMG prediction.
Detailed results of each curation step, the full annotation of the final vMAGs, and the annotation of the identified AMGs are presented in Supplementary Data 4. We compared the AMGs identified in the rumen virome to the previously identified AMGs from other viromes, which are available in an expert-curated AMG database (https://github.com/WrightonLabCSU/DRAM/blob/master/data/amg_database.tsv). For the newly identified AMGs, we double-checked the annotations and searched the literature to ensure that they were truly AMGs.

[...] >50% concentrate). First, we transformed the raw abundance table into a binary matrix (presence or absence). Then, the prevalence of each vOTU across the samples was calculated. A vOTU was included in the core rumen virome if its prevalence exceeded 50% within a given concentrate level or across all cattle. Based on prevalence, the vOTUs were categorized as individualized (observed in only one sample), one concentrate level (observed in more than 1 sample but exclusively from a single concentrate level), two concentrate levels (observed in animals from two concentrate levels), and three concentrate levels (observed in all three concentrate levels). The numbers of vOTUs shared by the core viromes among the three concentrate levels were visualized with a Venn diagram in R. We examined whether animals fed the same diet or of the same breed share more vOTUs than animals fed different diets or of different breeds using subsets of data from Stewart et al.78 and Li et al.79, respectively. The Kruskal–Wallis test was used to compare the numbers of shared vOTUs in different groups in R.

[...] >12 metagenomes were retained for the analysis. The number of vOTUs shared by two studies was compared for every study pair, and the results were subjected to hierarchical clustering. The hierarchical clustering results were visualized in R with the ComplexHeatmap package81 and annotated according to the metadata.
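The core-virome steps described in this section (binarize the abundance table, compute prevalence, apply the 50% cutoff, categorize by concentrate level) can be sketched in a few lines of Python. The study itself used R. The function names, the toy abundance table, and the level names `low`, `medium`, and `high` are hypothetical, and the sketch assumes that prevalence means the fraction of samples in a group in which a vOTU is present.

```python
from typing import Dict, Iterable, Set


def binarize(abundance: Dict[str, Dict[str, float]]) -> Dict[str, Set[str]]:
    """Map each vOTU to the set of samples where its abundance is above zero."""
    return {
        votu: {sample for sample, value in per_sample.items() if value > 0}
        for votu, per_sample in abundance.items()
    }


def core_virome(
    presence: Dict[str, Set[str]], samples: Iterable[str], threshold: float = 0.5
) -> Set[str]:
    """vOTUs present in more than `threshold` of the given group of samples."""
    group = set(samples)
    if not group:
        return set()
    return {
        votu
        for votu, present in presence.items()
        if len(present & group) / len(group) > threshold
    }


# Assumes exactly three concentrate levels, as in the text.
LEVEL_LABELS = {
    1: "one concentrate level",
    2: "two concentrate levels",
    3: "three concentrate levels",
}


def categorize(
    presence: Dict[str, Set[str]], sample_level: Dict[str, str]
) -> Dict[str, str]:
    """Label each observed vOTU by how widely it occurs across samples and levels."""
    labels = {}
    for votu, present in presence.items():
        if not present:
            continue  # never observed; no category applies
        if len(present) == 1:
            labels[votu] = "individualized"
        else:
            n_levels = len({sample_level[s] for s in present})
            labels[votu] = LEVEL_LABELS[n_levels]
    return labels


# Toy data (hypothetical): rows are vOTUs, columns are samples.
abundance = {
    "vOTU_1": {"s1": 3.0, "s2": 1.0, "s3": 0.0, "s4": 2.0},
    "vOTU_2": {"s1": 0.0, "s2": 0.0, "s3": 5.0, "s4": 0.0},
    "vOTU_3": {"s1": 1.0, "s2": 4.0, "s3": 0.0, "s4": 0.0},
}
sample_level = {"s1": "low", "s2": "low", "s3": "medium", "s4": "high"}

presence = binarize(abundance)
core_all = core_virome(presence, sample_level)          # across all cattle
core_low = core_virome(presence, ["s1", "s2"])          # within one level
categories = categorize(presence, sample_level)
```

In this toy table, `vOTU_3` is present in exactly half of all samples and is therefore excluded from the all-cattle core, while it belongs to the core of the `low` group.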