Support vector machines: relevance feedback and information retrieval
Institution: 1. Faculty of Health and Life Sciences, York St John University, Lord Mayor's Walk, York YO31 7EX, England, UK; 2. Health Education Yorkshire and the Humber, Ground Floor, Blenheim House, Duncombe Street, Leeds LS1 4PL, England, UK; 3. Department of Health Sciences, Seebohm Rowntree Building, University of York, York, YO10 5DD, England, UK; 1. Department of Pediatrics, The Jikei University School of Medicine, Tokyo, Japan; 2. Division of Gene Therapy, Research Center for Medical Sciences, The Jikei University School of Medicine, Tokyo, Japan; 3. Advanced Clinical Research Center, Institute of Neurological Disorders, Kanagawa, Japan
Abstract: We compare support vector machines (SVMs) with the Rocchio, Ide regular and Ide dec-hi algorithms for information retrieval (IR) of text documents using relevance feedback. We assume that a preliminary search finds a set of documents, which the user marks as relevant or not relevant, and that feedback iterations then commence. Particular attention is paid to IR searches in which the number of relevant documents in the database is low and the preliminary set of documents used to start the search contains few relevant documents. Experiments show that if inverse document frequency (IDF) weighting is not used, because one is unwilling to pay the time penalty needed to obtain these features, then SVMs perform better with either term-frequency (TF) or binary weighting. SVM performance is marginally better than Ide dec-hi if TF-IDF weighting is used and a reasonable number of relevant documents is found in the preliminary search. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then the SVM is preferred.
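The three baseline feedback rules named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook forms of the Rocchio, Ide regular and Ide dec-hi query updates; it is not taken from the paper itself. The function names, the plain-list vector representation, and the default weights `alpha`, `beta` and `gamma` are assumptions chosen for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def _add(a: Sequence[float], b: Sequence[float], scale: float = 1.0) -> Vector:
    """Return a + scale * b, element by element."""
    return [x + scale * y for x, y in zip(a, b)]


def _sum(vectors: Sequence[Sequence[float]], dim: int) -> Vector:
    """Sum a collection of vectors; an empty collection gives the zero vector."""
    total = [0.0] * dim
    for v in vectors:
        total = _add(total, v)
    return total


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio: move toward the relevant centroid and away from the
    non-relevant centroid. The default weights are illustrative assumptions."""
    dim = len(query)
    new_q = [alpha * x for x in query]
    if relevant:
        new_q = _add(new_q, _sum(relevant, dim), beta / len(relevant))
    if nonrelevant:
        new_q = _add(new_q, _sum(nonrelevant, dim), -gamma / len(nonrelevant))
    return new_q


def ide_regular(query, relevant, nonrelevant):
    """Ide regular: add every relevant vector, subtract every non-relevant one."""
    dim = len(query)
    new_q = _add(list(query), _sum(relevant, dim))
    return _add(new_q, _sum(nonrelevant, dim), -1.0)


def ide_dec_hi(query, relevant, ranked_nonrelevant):
    """Ide dec-hi: add every relevant vector, subtract only the
    highest-ranked non-relevant one (first element of ranked_nonrelevant)."""
    dim = len(query)
    new_q = _add(list(query), _sum(relevant, dim))
    if ranked_nonrelevant:
        new_q = _add(new_q, ranked_nonrelevant[0], -1.0)
    return new_q


def score(query, doc) -> float:
    """Inner-product score used to re-rank documents after each update."""
    return sum(q * d for q, d in zip(query, doc))


# Toy three-term vocabulary with binary weights (hypothetical data).
query = [1.0, 0.0, 0.0]
relevant = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
nonrelevant = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # ordered by current rank

print(rocchio(query, relevant, nonrelevant, alpha=1.0, beta=1.0, gamma=1.0))
print(ide_regular(query, relevant, nonrelevant))
print(ide_dec_hi(query, relevant, nonrelevant))
```

In each feedback iteration the updated query would be used to re-score the unjudged documents with `score`, and the top-ranked ones shown to the user for new judgments. An SVM-based variant would instead train a classifier on the judged documents and rank the rest by their signed distance from the decision boundary.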
This article is indexed in ScienceDirect and other databases.