Similar Literature
 Found 20 similar articles; search took 31 ms
1.
Transaction logs of NAVER, a major Korean Web search engine, were analyzed to track the information-seeking behavior of Korean Web users. These transaction logs include more than 40 million queries collected over 1 week. This study examines current transaction log analysis methodologies and proposes a method for log cleaning, session definition, and query classification. A term definition method, which is necessary for Korean transaction log analysis, is also discussed. The results of this study show that users behave in a simple way: they type in short queries with few query terms, seldom use advanced features, and view few result pages. Users also behave in a passive way: they seldom change search environments set by the system. Notably, users tend to replace their queries entirely rather than add or delete terms to modify previous queries. The results of this study might contribute to the development of more efficient and effective Web search engines and services.

2.
Past studies of citation coverage of Web of Science, Scopus, and Google Scholar do not demonstrate a consistent pattern that can be applied to the interdisciplinary mix of resources used in social work research. To determine the utility of these tools to social work researchers, an analysis of citing references to well-known social work journals was conducted. Web of Science had the fewest citing references and almost no variety in source format. Scopus provided higher citation counts, but the pattern of coverage was similar to Web of Science. Google Scholar provided substantially more citing references, but only a relatively small percentage of them were unique scholarly journal articles. The patterns of database coverage were replicated when the citations were broken out for each journal separately. The results of this analysis demonstrate the need to determine what resources constitute scholarly research and reflect the need for future researchers to consider the merits of each database before undertaking their research. This study will be of interest to scholars in library and information science as well as social work, as it facilitates a greater understanding of the strengths and limitations of each database and brings to light important considerations for conducting future research.

3.
Transaction logs from online search engines are valuable for two reasons: First, they provide insight into human information-seeking behavior. Second, log data can be used to train user models, which can then be applied to improve retrieval systems. This article presents a study of logs from PubMed®, the public gateway to the MEDLINE® database of bibliographic records from the medical and biomedical primary literature. Unlike most previous studies on general Web search, our work examines user activities with a highly specialized search engine. We encode user actions as string sequences and model these sequences using n-gram language models. The models are evaluated in terms of perplexity and in a sequence prediction task. They help us better understand how PubMed users search for information and provide a basis for improving users’ search experience.
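The modeling idea in this abstract (action sequences encoded as strings, scored by an n-gram model and compared by perplexity) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' setup: the action codes (Q for a query, C for a click), the choice of a bigram model, the add-one smoothing, and the function names `train_bigram` and `perplexity`.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count bigrams over action strings padded with start (^) and end ($) symbols."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = "^" + seq + "$"
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1          # how often `prev` is followed by anything
            bigrams[(prev, cur)] += 1    # how often `prev` is followed by `cur`
    return unigrams, bigrams, vocab

def perplexity(seq, model):
    """Per-symbol perplexity of one action string under an add-one smoothed bigram model."""
    unigrams, bigrams, vocab = model
    padded = "^" + seq + "$"
    log_prob, steps = 0.0, 0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one smoothing keeps unseen transitions from having zero probability.
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + len(vocab))
        log_prob += math.log2(p)
        steps += 1
    return 2 ** (-log_prob / steps)

# Hypothetical sessions: Q = query issued, C = result clicked.
model = train_bigram(["QQC", "QC", "QQQC"])
typical = perplexity("QC", model)   # query then click, as seen in training
unusual = perplexity("CQ", model)   # click before any query, never seen
```

Lower perplexity means the model finds the sequence less surprising, so under these toy sessions the familiar pattern scores lower than the unseen one.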

4.
ABSTRACT

Although there is a proliferation of information available on the Web, and law professors, students, and other users have a variety of channels to locate information and complete their research activities, the law library catalog still remains an important source for offering users access to information that has been evaluated and cataloged by experts. The usability of the catalog needs to be effectively measured before any necessary improvements can be made. This study was undertaken to investigate the information retrieval patterns of users of the Rutgers Law Library Online Public Access Catalog and to develop the catalog into a more effective search tool for these users. This study used an experimental approach to measure the usability of our catalog by analyzing the transaction logs from the OPAC system and the results from Google Analytics. The findings provided not only important information on user demographics and their computer systems, but also more insight on the search behaviors of users. The specific findings included the following:
  1. As a Web-analytic tool Google Analytics provided extensive information on the OPAC and the navigational behaviors of users.

  2. Fifty-eight percent of our users visited the Web site regularly.

  3. The most popular search method, which was employed by 37% of our users, was by title.

  4. Most patrons used computer systems with a high resolution and color depth monitor and visited the catalog Web site with a high-speed Internet connection.

  5. Suggestions were made by the authors to improve the users’ search experience of the catalog Web site.

This study is significant to libraries with Web catalogs because it demonstrates the potential value of using Google Analytics as a Web analytics tool in combination with the OPAC transaction logs to measure catalog usability.

5.
6.
The following study analyzes user search behavior using a tabbed-search interface. For this study, a transaction log was used to collect information about user searches and included tab used; search terms; date, time, and location of search (on campus or off campus); as well as a unique ID to identify the user session and another ID to identify each transaction. This article explains the process for examining 4,300 search queries conducted on the library homepage during an academic semester and presents findings from the analysis. The article also details enhancements that were made to the tabbed-search interface as a result of the transaction log analysis. Additionally, the article discusses the merits of using a transaction log as a method of ongoing assessment of a library Web site's search interface.

7.
This study investigates the information-seeking behavior of general Korean Web users. The data from transaction logs of selected dates from August 2006 to August 2007 were used to examine characteristics of Web queries and to analyze click logs that consist of a collection of documents that users clicked and viewed for each query. Changes in search topics are explored for NAVER users from 2003/2004 to 2006/2007. Patterns involving spelling errors and queries in foreign languages are also investigated. Search behaviors of Korean Web users are compared to those of the United States and other countries. The results show that entertainment is the top-ranked category, followed by shopping, education, games, and computer/Internet. Search topics changed from computer/Internet to entertainment and shopping from 2003/2004 to 2006/2007 in Korea. The ratios of both spelling errors and queries in foreign languages are low. This study reveals differences for search topics among different regions of the world. The results suggest that the analysis of click logs allows for the reduction of unknown or unidentifiable queries by providing actual data on user behaviors and their probable underlying information needs. The implications for system designers and Web content providers are discussed.

8.
The collective feedback of the users of an Information Retrieval (IR) system has been shown to provide semantic information that, while hard to extract using standard IR techniques, can be useful in Web mining tasks. In the last few years, several approaches have been proposed to process the logs stored by Internet Service Providers (ISP), Intranet proxies or Web search engines. However, the solutions proposed in the literature only partially represent the information available in the Web logs. In this paper, we propose to use a richer data structure, which is able to preserve most of the information available in the Web logs. This data structure consists of three groups of entities: users, documents and queries, which are connected in a network of relations. Query refinements correspond to separate transitions between the corresponding query nodes in the graph, while users are linked to the queries they have issued and to the documents they have selected. The classical query/document transitions, which connect a query to the documents selected by the users in the returned result page, are also considered. The resulting data structure is a complete representation of the collective search activity performed by the users of a search engine or of an Intranet. The experimental results show that this more powerful representation can be successfully used in several Web mining tasks like discovering semantically relevant query suggestions and Web page categorization by topic.
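The three-entity structure described in this abstract (users, documents and queries linked by issue, selection and refinement relations) can be sketched as a small in-memory graph. This is only an illustrative reading of the abstract: the class name `SearchLogGraph`, its fields, the session input format and the frequency-based `suggest` method are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

class SearchLogGraph:
    """Toy graph over users, queries and documents built from search sessions."""

    def __init__(self):
        self.user_queries = defaultdict(set)   # user -> queries the user issued
        self.user_docs = defaultdict(set)      # user -> documents the user selected
        # query -> document -> number of selections from that query's results
        self.query_docs = defaultdict(lambda: defaultdict(int))
        # query -> following query -> number of observed refinements
        self.refinements = defaultdict(lambda: defaultdict(int))

    def add_session(self, user, events):
        """events: ordered list of (query, clicked_documents) pairs from one session."""
        prev_query = None
        for query, clicked in events:
            self.user_queries[user].add(query)
            for doc in clicked:
                self.user_docs[user].add(doc)
                self.query_docs[query][doc] += 1
            # A change of query inside a session is recorded as a refinement edge.
            if prev_query is not None and prev_query != query:
                self.refinements[prev_query][query] += 1
            prev_query = query

    def suggest(self, query, k=3):
        """Return up to k refinements of `query`, most frequent first (ties alphabetical)."""
        following = self.refinements.get(query, {})
        return sorted(following, key=lambda q: (-following[q], q))[:k]

graph = SearchLogGraph()
graph.add_session("u1", [("jaguar", []), ("jaguar car", ["d1"])])
graph.add_session("u2", [("jaguar", ["d2"]), ("jaguar car", ["d1"])])
graph.add_session("u3", [("jaguar", []), ("jaguar animal", ["d3"])])
suggestions = graph.suggest("jaguar")
```

Ranking suggestions by raw refinement counts is the simplest possible use of the refinement edges; the paper's own suggestion and categorization methods are not specified in the abstract.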

9.
10.
In 1997, the Central Medical Library (CMK) at the Faculty of Medicine, University of Ljubljana, Slovenia, started to build a library Website that included a guide to library services and resources. The evaluation of Website usage plays an important role in its maintenance and development. Analyzing and exploring regularities in the visitors' behavior can be used to enhance the quality and facilitate delivery of information services, identify visitors' interests, and improve the server's performance. The analysis of the CMK Website users' navigational behavior was carried out by analyzing the Web server log files. These files contained information on all user accesses to the Website and provided a great opportunity to learn more about the behavior of visitors to the Website. The majority of the available tools for Web log file analysis provide a predefined set of reports showing the access count and the transferred bytes grouped along several dimensions. In addition to the reports mentioned above, the authors wanted to be able to perform interactive exploration and ad hoc analysis and discover trends in a user-friendly way. Because of that, we developed our own solution for exploring and analyzing the Web logs based on data warehousing and online analytical processing technologies. The analytical solution we developed proved successful, so it may find further application in the field of Web log file analysis. We will apply the findings of the analysis to restructuring the CMK Website.

11.
Comprehensive yet efficient search methods are essential for any systematic or scoping review. This article outlines the stages of development of a systematic search methodology for a scoping review within the library and information science (LIS) literature. The effectiveness of the database search strategies (LISTA, LISA, ERIC, Scopus, Web of Science) and supplemental search techniques is measured through a retrospective analysis of performance metrics. Findings show that for research topics limited to the library setting, it may be more effective to search fewer databases (LISTA and Scopus only) for peer-reviewed journal articles and allot more time to alternate search techniques such as web searching to identify non-journal literature. The article provides an evidence-based, methodological approach to developing a systematic search plan, unique to LIS researchers, that accounts for time and resource needs.

12.
In today's fast-paced world, anecdotal evidence suggests that information tends to inundate people, and users of information systems want to find information quickly and conveniently. Empirical evidence for convenience as a critical factor is explored in the data from two multi-year, user study projects funded by the Institute of Museum and Library Services. The theoretical framework for this understanding is founded in the concepts of bounded rationality and rational choice theory, with Savolainen's (2006) concept of time as a context in information seeking, as well as gratification theory, informing the emphasis on the seekers' time horizons. Convenience is a situational criterion in people's choices and actions during all stages of the information-seeking process. The concept of convenience can include their choice of an information source, their satisfaction with the source and its ease of use, and their time horizon in information seeking. The centrality of convenience is especially prevalent among the younger subjects (“millennials”) in both studies, but also holds across all demographic categories: age, gender, academic role, and use or non-use of virtual reference services. These two studies further indicate that convenience is a factor for making choices in a variety of situations, including both academic information seeking and everyday-life information seeking, although it plays different roles in different situations.

13.
Web search queries are often ambiguous or faceted, and the task of identifying the major underlying senses and facets of queries has received much attention in recent years. We refer to this task as query subtopic mining. In this paper, we propose to use surrounding text of query terms in top retrieved documents to mine subtopics and rank them. We first extract text fragments containing query terms from different parts of documents. Then we group similar text fragments into clusters and generate a readable subtopic for each cluster. Based on the cluster and the language model trained from a query log, we calculate three features and combine them into a relevance score for each subtopic. Subtopics are finally ranked by balancing relevance and novelty. Our evaluation experiments with the NTCIR-9 INTENT Chinese Subtopic Mining test collection show that our method significantly outperforms a query-log-based method proposed by Radlinski et al. (2010) and a method based on search result clustering proposed by Zeng et al. (2004) in terms of precision, I-rec, D-nDCG and D#-nDCG, the official evaluation metrics used at the NTCIR-9 INTENT task. Moreover, our generated subtopics are significantly more readable than those generated by the search result clustering method.
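The final step, ranking subtopics by balancing relevance and novelty, is not spelled out in the abstract. One common way to realize such a trade-off is a greedy selection in the style of maximal marginal relevance, sketched below as a stand-in rather than as the authors' formula. The Jaccard word-overlap similarity, the weight `lam`, the function names and the example scores are all assumptions.

```python
def jaccard(a, b):
    """Word-overlap similarity between two subtopic strings, in [0, 1]."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def rank_subtopics(relevance, lam=0.7):
    """Greedy ranking: repeatedly pick the subtopic with the best
    lam * relevance - (1 - lam) * (max similarity to subtopics already ranked)."""
    remaining = dict(relevance)
    ranked = []
    while remaining:
        def score(subtopic):
            redundancy = max((jaccard(subtopic, r) for r in ranked), default=0.0)
            return lam * remaining[subtopic] - (1 - lam) * redundancy
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(remaining), key=score)
        ranked.append(best)
        del remaining[best]
    return ranked

# Hypothetical relevance scores for subtopics of the query "apple".
scores = {
    "apple iphone price": 0.9,
    "apple iphone prices": 0.85,
    "apple fruit nutrition": 0.6,
}
ranking = rank_subtopics(scores, lam=0.5)
```

With equal weight on both terms, the near-duplicate "apple iphone prices" is pushed below the less relevant but novel "apple fruit nutrition".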

14.
Abstract

Digital facsimile maps of the Netherlands viewable over the Internet or available on CD-ROMs, as an example of what may be available for other countries, are the focus of this article. The author gives an overview of selected Web sites in the Netherlands that provide access to digitized maps and atlases, outlining strengths and weaknesses for each, and detailing future directions for these types of Web sites.

15.
Measuring Search Engine Quality   Total citations: 12, self-citations: 3, citations by others: 9
The effectiveness of twenty public search engines is evaluated using TREC-inspired methods and a set of 54 queries taken from real Web search logs. The World Wide Web is taken as the test collection and a combination of crawler and text retrieval system is evaluated. The engines are compared on a range of measures derivable from binary relevance judgments of the first seven live results returned. Statistical testing reveals a significant difference between engines and high intercorrelations between measures. Surprisingly, given the dynamic nature of the Web and the time elapsed, there is also a high correlation between results of this study and a previous study by Gordon and Pathak. For nearly all engines, there is a gradual decline in precision at increasing cutoff after some initial fluctuation. Performance of the engines as a group is found to be inferior to the group of participants in the TREC-8 Large Web task, although the best engines approach the median of those systems. Shortcomings of current Web search evaluation methodology are identified and recommendations are made for future improvements. In particular, the present study and its predecessors deal with queries which are assumed to derive from a need to find a selection of documents relevant to a topic. By contrast, real Web search reflects a range of other information need types which require different judging and different measures.
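Precision at a cutoff, one of the measures derivable from binary relevance judgments of the first seven results, can be computed as in the short sketch below. The function names and the example judgments are made up for illustration, and counting missing results as non-relevant (dividing by k even when fewer than k results were returned) is an assumption, not necessarily the study's convention.

```python
def precision_at_k(judgments, k):
    """Fraction of the top k results judged relevant.

    judgments: list of 0/1 relevance labels in ranked order.
    Results missing below rank len(judgments) count as non-relevant.
    """
    if k <= 0:
        raise ValueError("cutoff k must be positive")
    return sum(judgments[:k]) / k

def mean_precision_at_k(runs, k):
    """Average precision at cutoff k over the judged result lists of several queries."""
    return sum(precision_at_k(j, k) for j in runs) / len(runs)

# Hypothetical judgments for the first seven results of one query.
one_query = [1, 0, 1, 1, 0, 0, 1]
curve = [precision_at_k(one_query, k) for k in range(1, 8)]
```

Computing the value at every cutoff from 1 to 7, as in `curve`, gives the kind of precision-versus-cutoff profile the abstract refers to.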

16.
The Web of Science is no longer the only database which offers citation indexing of the social sciences. Scopus, CSA Illumina and Google Scholar are new entrants in this market. The holdings and citation records of these four databases were assessed against two sets of data, one drawn from the 2001 Research Assessment Exercise and the other from the International Bibliography of the Social Sciences. Initially, CSA Illumina's coverage at journal title level appeared to be the most comprehensive. But when recall and average citation count were tested at article level and rankings extrapolated by submission frequency to individual journal titles, Scopus was ranked first. When issues of functionality, the quality of record processing and depth of coverage are taken into account, Scopus and Web of Science have a significant advantage over the other two databases. From this analysis, Scopus offers the best coverage from amongst these databases and could be used as an alternative to the Web of Science as a tool to evaluate the research impact in the social sciences.

17.
18.
ABSTRACT

The importance to catalogers of having local procedures that are readily available, current, and accurate cannot be overemphasized. By using Web technology, procedures are easily updated, broadly available, searchable through powerful Web search engines, and capable of linking directly to related resources. Catalogers' skills in organization and classification provide a good foundation for learning the basics of Web creation. This article presents some guidelines dealing with the logical organization of procedures on the Web, along with the use of appropriate language and consistent design.

19.
Web search algorithms that rank Web pages by examining the link structure of the Web are attractive from both theoretical and practical aspects. Today's prevailing link-based ranking algorithms rank Web pages by using the dominant eigenvector of certain matrices, such as the co-citation matrix or variations thereof. Recent analyses of ranking algorithms have focused attention on the case where the corresponding matrices are irreducible, thus avoiding singularities of reducible matrices. Consequently, rank analysis has been concentrated on authority connected graphs, which are graphs whose co-citation matrix is irreducible (after deleting zero rows and columns). Such graphs conceptually correspond to thematically related collections, in which most pages pertain to a single, dominant topic of interest. A link-based search algorithm A is rank-stable if minor changes in the link structure of the input graph, which is usually a subgraph of the Web, do not affect the ranking it produces; algorithms A, B are rank-similar if they produce similar rankings. These concepts were introduced and studied recently for various existing search algorithms. This paper studies the rank-stability and rank-similarity of three link-based ranking algorithms (PageRank, HITS and SALSA) in authority connected graphs. For this class of graphs, we show that neither HITS nor PageRank is rank-stable. We then show that HITS and PageRank are not rank-similar on this class, nor is either of them rank-similar to SALSA. This research was supported by the Fund for the Promotion of Research at the Technion, and by the Barnard Elkin Chair in Computer Science.
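The phrase "dominant eigenvector of the co-citation matrix" can be made concrete with a small sketch. For a link matrix A, the co-citation matrix is A^T A, and HITS authority scores are its dominant eigenvector, which power iteration approximates. The function name, the edge-list input, the sum normalization and the four-page example graph are illustrative assumptions; the paper's own constructions and proofs are not reproduced here.

```python
def hits_authorities(edges, n, iterations=50):
    """Approximate HITS authority scores by power iteration on A^T A.

    edges: directed links (i, j), meaning page i links to page j.
    n: number of pages, labeled 0 .. n-1.
    Returns scores normalized to sum to 1.
    """
    auth = [1.0] * n
    for _ in range(iterations):
        # Hub step, h = A a: a page's hub score sums the authority of pages it links to.
        hub = [0.0] * n
        for i, j in edges:
            hub[i] += auth[j]
        # Authority step, a = A^T h: a page's authority sums the hub scores linking to it.
        new_auth = [0.0] * n
        for i, j in edges:
            new_auth[j] += hub[i]
        total = sum(new_auth) or 1.0   # guard against a graph with no edges
        auth = [x / total for x in new_auth]
    return auth

# Pages 0 and 1 both cite page 2; page 0 also cites page 3.
# Pages 2 and 3 are co-cited by page 0, so the co-citation matrix
# restricted to them, [[2, 1], [1, 1]], is irreducible.
scores = hits_authorities([(0, 2), (1, 2), (0, 3)], n=4)
```

On this toy graph the scores of pages 2 and 3 converge to roughly 0.618 and 0.382, the normalized dominant eigenvector of [[2, 1], [1, 1]], while pages with no in-links stay at zero.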

20.
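A Study of the Characteristics of Language Use in Web Search   Total citations: 1, self-citations: 0, citations by others: 1
Taking the characteristics of language use in Web search as its object of study, this paper undertakes an exploratory investigation of the syntactic and semantic issues of queries in Web search. It relies mainly on mining search engine query logs, supplemented by a comparative analysis against the conclusions of an online questionnaire survey, and derives characteristics of queries in terms of syntax, lexical categories, auxiliary words, and subject words.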
