Image source: www.anatomybox.com
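March 14, 2016 /生物谷BIOON/ – In a study recently published in the international journal BMC Bioinformatics, a group of researchers from the innovation center of the Moscow Institute of Physics and Technology describe a new method for comparing metagenomes, that is, the combined DNA sequences of all organisms in a sample of biological material. The method may help solve the sample-comparison problem quickly and efficiently, and it can be easily integrated into data analysis pipelines for metagenomic research.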
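Bacteria are a good subject for metagenomic research. Whether the goal is to assess the number of bacteria in the body or to carry out other work such as antibody studies, the importance of the metagenome should not be underestimated. Several global research programs, such as the Human Microbiome Project, have shown how the composition of the body's bacterial communities affects disease risk.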
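The traditional approach to metagenomic analysis compares samples by their taxonomic composition, that is, the proportion of each microorganism. To determine a sample's composition, its genetic sequences are compared against a database of known bacterial genomes, commonly called the reference set. This approach has several drawbacks. First, reference genomes are often inaccurate. Second, not every organism is represented in the reference set (viruses, for example), so part of a sample's sequences is simply left out of the analysis. Comparison methods based on k-mer frequencies, by contrast, do not depend on a reference set or on prior information about the organisms under study.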
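The method developed by the researchers is based on representing the organisms' genome sequences as k-mers, short subsequences of k nucleotides. Because each organism's genome is a distinctive sequence of genetic letters, and the frequencies of those letters differ widely from one organism to another, the set of k-mers obtained from a metagenome should be treated as a single data set for analysis. This allows researchers to analyze differences in bacterial composition between samples.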
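As a rough illustration of the k-mer representation described above, the sketch below counts every overlapping k-mer in a set of reads and normalizes the counts to relative frequencies. The function name `kmer_spectrum`, the toy reads, and the decision to skip k-mers containing characters other than A, C, G and T are assumptions made for illustration, not details taken from the study. Real pipelines often also merge each k-mer with its reverse complement; that step is omitted here.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(reads, k):
    """Return {k-mer: relative frequency} over all reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        # Slide a window of length k across the read, one base at a time.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Assumption: windows with ambiguous bases (e.g. N) are ignored.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    # Report all 4**k possible k-mers in a fixed order so that spectra
    # from different samples can be compared position by position.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


# Toy input, not real sequencing data.
toy_reads = ["ACGTAC", "GTACGN"]
spectrum = kmer_spectrum(toy_reads, 2)
```

For k = 2 the spectrum has 16 entries; the range used in the paper's abstract (5≤k≤11) gives between 1,024 and roughly 4.2 million entries per sample, which is still compact compared with the raw reads.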
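It is well known that the composition of gut bacterial communities differs markedly between individuals, and the new algorithm may help reveal those differences. Compared with the traditional approach of mapping reads to a reference set, comparing the k-mers of the data sets themselves can give better results. Moreover, on real data, mismatches between the k-mer-based results for gut communities and the results of traditional methods can help detect another important component of the gut metagenome, the bacteriophage crAssphage. Researcher Dmitri Alexeev said that such specific genes can be regarded not only as fragments of DNA but also analyzed as ordinary information, and that this information can help identify new DNA fragments.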
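Finally, the researchers say the newly developed technique will help find differences between the metagenomes of diverse bacterial communities more efficiently and accurately, which offers new leads and ideas for studying and developing novel approaches to diagnosing and treating human diseases.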
Assessment of k-mer spectrum applicability for metagenomic dissimilarity analysis
Veronika B. Dubinkina, Dmitry S. Ischenko, Vladimir I. Ulyantsev, Alexander V. Tyakht and Dmitry G. Alexeev
Background: A rapidly increasing flow of genomic data requires the development of efficient methods for obtaining its compact representation. Feature extraction facilitates classification, clustering and model analysis for testing and refining biological hypotheses. “Shotgun” metagenome is an analytically challenging type of genomic data – containing sequences of all genes from the totality of a complex microbial community. Recently, researchers started to analyze metagenomes using reference-free methods based on the analysis of oligonucleotides (k-mers) frequency spectrum previously applied to isolated genomes. However, little is known about their correlation with the existing approaches for metagenomic feature extraction, as well as the limits of applicability. Here we evaluated a metagenomic pairwise dissimilarity measure based on short k-mer spectrum using the example of human gut microbiota, a biomedically significant object of study.
Results: We developed a method for calculating pairwise dissimilarity (beta-diversity) of “shotgun” metagenomes based on short k-mer spectra (5≤k≤11). The method was validated on simulated metagenomes and further applied to a large collection of human gut metagenomes from the populations of the world (n=281). The k-mer spectrum-based measure was found to behave similarly to one based on mapping to a reference gene catalog, but different from one using a genome catalog. This difference turned out to be associated with a significant presence of viral reads in a number of metagenomes. Simulations showed limited impact of bacterial genetic variability as well as sequencing errors on k-mer spectra. Specific differences between the datasets from individual populations were identified.
Conclusions: Our approach allows rapid estimation of pairwise dissimilarity between metagenomes. Though we applied this technique to gut microbiota, it should be useful for arbitrary metagenomes, even metagenomes with novel microbiota.
Dissimilarity measure based on k-mer spectrum provides a wider perspective in comparison with the ones based on the alignment against reference sequence sets. It helps not to miss possible outstanding features of metagenomic composition, particularly related to the presence of an unknown bacteria, virus or eukaryote, as well as to technical artifacts (sample contamination, reads of non-biological origin, etc.) at the early stages of bioinformatic analysis. Our method is complementary to reference-based approaches and can be easily integrated into metagenomic analysis pipelines.
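To make the idea of a pairwise dissimilarity between k-mer spectra concrete, here is a minimal sketch that builds a normalized spectrum for each sample and fills a sample-by-sample matrix. The abstract does not say which distance is used, so Bray-Curtis is an assumption here, chosen because it is a common beta-diversity measure; the helper names, the toy reads and k = 3 are likewise illustrative only (the abstract uses 5≤k≤11).

```python
from collections import Counter
from itertools import product


def spectrum_vector(reads, k):
    """Normalized k-mer frequency vector in a fixed ACGT ordering (hypothetical helper)."""
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Only k-mers made of A, C, G, T contribute to the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    denom = sum(a + b for a, b in zip(u, v))
    return sum(abs(a - b) for a, b in zip(u, v)) / denom if denom else 0.0


def dissimilarity_matrix(samples, k):
    """Pairwise dissimilarity between samples, each given as a list of reads."""
    vectors = [spectrum_vector(reads, k) for reads in samples]
    n = len(vectors)
    return [[bray_curtis(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


# Toy samples, not real metagenomes: the first two share most 3-mers,
# the third shares none with them.
samples = [
    ["ACGTACGT", "CGTACGTA"],
    ["ACGTACGA", "CGTACGTT"],
    ["GGGGCCCC", "CCCCGGGG"],
]
matrix = dissimilarity_matrix(samples, 3)
```

No reference genomes appear anywhere in this computation, which is why reads from an organism absent from a catalog (a virus, for example) still contribute to the result. A matrix like this can be passed to standard clustering or ordination tools alongside a reference-based one.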