Crime base: Towards building a knowledge base for crime entities and their relationships from online news papers
Institution:1. School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN, USA;2. Department of Computing and Mathematics University of São Paulo, Ribeirão Preto, SP, Brazil;3. Institute of Science and Technology Federal University of São Paulo, São José dos Campos, SP, Brazil;4. Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada;1. University of North Carolina at Chapel Hill, United States;2. Data Science, Facebook 1 Hacker Way Menlo Park, CA 94025, United States;3. User Experience and Customer Insights, NetBrain 15 Network Drive Burlington, MA, 01803, United States;1. Department of Computer Science and Software Engineering, International Islamic University Islamabad, Pakistan;2. Department of Informatics and Telematics, Harokopio University of Athens, Greece;3. Department of Computer Science, National University of Modern Languages, Islamabad, Pakistan;1. School of Data and Computer Science, Sun Yat-sen University, China;2. School of Computing Science, University of Glasgow, Glasgow, UK;3. School of Computer Science, The University of Adelaide, Adelaide, Australia
Abstract:In the current internet era, crime-related information is scattered across many sources, including news media, social networks, blogs, and video repositories. Crime reports published in online newspapers are often considered more reliable than crowdsourced data such as social media, and they contain crime information not only as unstructured text but also as images. Given the volume and availability of crime-related information in online newspapers, gathering and integrating crime entities from multiple modalities and representing them as a machine-readable knowledge base would help law enforcement agencies analyze and prevent criminal activity. Existing research on generating crime knowledge bases does not address the extraction of all non-redundant entities from the text and image data present in multiple newspapers. Hence, this work proposes Crime Base, an entity-relationship-based system that extracts and integrates crime-related text and image data from online newspapers, with a focus on reducing duplication and loss of information in the knowledge base. The proposed system uses a rule-based approach to extract entities from text and image captions. Entities extracted from text are correlated using contextual as well as semantic similarity measures, and image entities are correlated using low-level and high-level image features. The system also presents an integrated view of these entities and their relations as a knowledge base expressed in OWL. The system is tested on a collection of crime-related articles from popular Indian online newspapers.
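The abstract does not give the actual extraction rules or similarity measures, so the following is only a minimal sketch of the general idea: extract entities from each report with simple rules, then merge reports whose entity sets overlap so that the same incident is not stored twice. Everything in it is an assumption made for illustration: the `WEAPON_TERMS` list, the `LOCATION_RULE` pattern, the function names, and the 0.5 threshold are hypothetical, and plain Jaccard overlap stands in for the contextual and semantic similarity measures the paper mentions.

```python
import re

# Hypothetical rule: a small weapon keyword list (not taken from the paper).
WEAPON_TERMS = {"knife", "pistol", "gun", "rifle"}

# Hypothetical rule: a capitalized phrase following "in" or "at" is a location.
LOCATION_RULE = re.compile(r"\b(?:in|at)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)")


def extract_entities(text):
    """Return a set of (entity_type, value) pairs found by the toy rules."""
    entities = set()
    for match in LOCATION_RULE.finditer(text):
        entities.add(("location", match.group(1)))
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in WEAPON_TERMS:
            entities.add(("weapon", token))
    return entities


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_reports(reports, threshold=0.5):
    """Greedily group reports whose entity sets overlap enough.

    Each group is the union of the entities of its member reports, so a
    duplicate report adds no new entries to the result.
    """
    groups = []
    for text in reports:
        entities = extract_entities(text)
        for group in groups:
            if jaccard(entities, group) >= threshold:
                group |= entities  # same incident: merge in place
                break
        else:
            groups.append(set(entities))  # new incident
    return groups


reports = [
    "A man was stabbed with a knife in New Delhi on Monday.",
    "Police said the knife attack took place in New Delhi.",
    "A shopkeeper was threatened with a pistol at Chennai.",
]
groups = merge_reports(reports)
```

Here the first two reports yield the same entity set and collapse into one group, while the third forms its own. A real system would need far richer rules, image features, and an export step to OWL, none of which is attempted here.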
This article is indexed in ScienceDirect and other databases.