Protein-protein network inference with regularized output and input kernel methods
Course URL: http://videolectures.net/mlsb2010_dalche_buc_ppn/
Lecturer: Florence d'Alché-Buc
Host: VideoLectures.NET
Date: not available
Language: English
Abstract: Prediction of a physical interaction between two proteins has been addressed in the context of supervised learning, unsupervised learning and, more recently, semi-supervised learning using various sources of information (genomic, phylogenetic, protein localization and function). The problem can be seen as a kernel matrix completion task if one defines a kernel that encodes similarity between proteins as nodes in a graph or, alternatively, as a binary supervised classification task where the inputs are pairs of proteins. In this talk, we first review existing work (matrix completion, SVM for pairs, metric learning, training set expansion), identifying the relevant features of each approach. We then define the framework of output kernel regression (OKR), which uses the kernel trick in the output feature space. After recalling the results obtained so far with tree-based output kernel regression methods, we develop a new family of methods based on kernel ridge regression that benefit from the use of kernels in both the input feature space and the output feature space. The main interest of such methods is that imposing various regularization constraints still leads to closed-form solutions. In particular, we show how such an approach makes it possible to handle unlabeled data in a transductive setting of the network inference problem, and multiple networks in a multi-task-like inference problem. New results on simulated data and yeast data illustrate the talk.
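Keywords: kernel methods; support vector machines; machine learning; structured data; computer science; bioinformatics; computational systems biology
Source: VideoLectures.NET open courses
Date added: 2016-07-21
Last reviewed: 2016-07-21 (cmh)
Views: 2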
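
Illustration: the abstract's remark that kernel ridge regression with kernels on both inputs and outputs keeps a closed-form solution can be sketched as follows. This is a minimal sketch under stated assumptions, not the speaker's implementation: the function names, the Gaussian input kernel and the toy data are all hypothetical. With training input Gram matrix K_x, training output Gram matrix K_y and ridge parameter lam, the estimated output-kernel value between two new proteins x and x' is k(x)^T (K_x + lam I)^{-1} K_y (K_x + lam I)^{-1} k(x'), where k(x) holds the input-kernel values between x and the training proteins.

```python
import numpy as np

def rbf_gram(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def okr_ridge_fit(K_x, lam):
    """Closed-form ridge operator C = (K_x + lam * I)^{-1} (hypothetical helper)."""
    n = K_x.shape[0]
    return np.linalg.solve(K_x + lam * np.eye(n), np.eye(n))

def okr_predict_output_kernel(C, K_y, k_new):
    """Estimated output-kernel matrix between new proteins.

    k_new has shape (n_train, n_new): input-kernel values between
    training proteins (rows) and new proteins (columns).
    """
    A = C @ k_new            # ridge coefficients, one column per new protein
    return A.T @ K_y @ A     # inner products of predictions in output feature space

# Toy data (assumed, for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 5))      # input features of training proteins
Y_latent = rng.normal(size=(8, 2))     # stand-in for an output embedding
K_y = Y_latent @ Y_latent.T            # output Gram matrix on training proteins
X_new = rng.normal(size=(3, 5))        # input features of new proteins

C = okr_ridge_fit(rbf_gram(X_train, X_train), lam=0.1)
K_hat = okr_predict_output_kernel(C, K_y, rbf_gram(X_train, X_new))
print(K_hat.shape)
```

In a network inference setting, one would typically threshold the estimated output-kernel values to decide which pairs of proteins are predicted to interact; the choice of output kernel and threshold is left open here.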