
Automatic Acquisition of English-Chinese Parallel Pairs

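Cite this article:	Wang Dongbo, Xie Jing. Automatic Acquisition of English-Chinese Parallel Pairs[J]. Library and Information Service, 2010, 54(17): 108-112.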

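Authors:	Wang Dongbo, Xie Jing

Affiliation:	Department of Information Management, Nanjing University

Funding:	Nanjing University Graduate Research and Innovation Fund; Humanities and Social Sciences Research Fund of the Ministry of Education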

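Abstract:	First, a crawl seed list (the "words library") is drawn up on the basis of corpus statistics and linguistic knowledge gained through manual introspection, and the crawling tool Wget is used to fetch web pages that contain English-Chinese parallel pairs. Next, a program extracts the English-Chinese parallel pairs from the fetched pages, and the extracted pairs are post-processed, for example by deduplication and format conversion. Finally, the English-Chinese parallel pairs are stored in a database.
Keywords:	English-Chinese parallel pairs; Wget; words library; MySQL database
Received:	2010-03-29
Revised:	2010-06-02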
Automatic Acquisition of English-Chinese Parallel Pairs

Wang Dongbo, Xie Jing. Automatic Acquisition of English-Chinese Parallel Pairs[J]. Library and Information Service, 2010, 54(17): 108-112.

Authors:	Wang Dongbo, Xie Jing

Institution:	Department of Information Management, Nanjing University

Abstract:	First, a words library, the seed list that drives the crawl, is compiled from corpus-based statistics and linguistic knowledge gained through manual introspection, and web pages containing English-Chinese parallel pairs are crawled with the tool Wget. Second, a program extracts the English-Chinese parallel pairs from the crawled pages, and the extracted pairs are post-processed, for example by deduplication and format conversion. Finally, the English-Chinese parallel pairs are stored in a database.

Keywords:	English-Chinese parallel pairs; Wget; words library; MySQL database
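
The pipeline outlined in the abstract (fetch pages with Wget, extract pairs, deduplicate, store) can be illustrated with a minimal Python sketch. It is not the authors' program: the line pattern for a pair (an English phrase followed by a Chinese gloss), the table name parallel_pairs, and all function names are assumptions made for illustration, and the standard-library sqlite3 module stands in for the MySQL database named in the keywords so that the example runs on its own. The Wget command is only built, not executed.

```python
import html
import re
import sqlite3

TAG_RE = re.compile(r"<[^>]+>")
# Assumed pair layout: English phrase, optional colon, then a Chinese gloss.
PAIR_RE = re.compile(r"([A-Za-z][A-Za-z' -]*[A-Za-z])\s*[:：]?\s*([\u4e00-\u9fff]+)")


def build_wget_command(url, out_dir):
    """Build (but do not run) a Wget call that saves one page into out_dir."""
    return ["wget", "--quiet", "--directory-prefix=" + out_dir, url]


def extract_pairs(page):
    """Return (english, chinese) tuples found between the tags of an HTML page."""
    pairs = []
    for chunk in TAG_RE.split(page):
        match = PAIR_RE.fullmatch(html.unescape(chunk).strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def deduplicate(pairs):
    """Drop repeats, ignoring case and extra spaces on the English side."""
    seen = set()
    unique = []
    for english, chinese in pairs:
        key = (" ".join(english.lower().split()), chinese)
        if key not in seen:
            seen.add(key)
            unique.append((english, chinese))
    return unique


def store_pairs(conn, pairs):
    """Insert pairs into a table; the UNIQUE constraint skips exact repeats."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parallel_pairs ("
        "english TEXT NOT NULL, chinese TEXT NOT NULL, "
        "UNIQUE (english, chinese))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO parallel_pairs (english, chinese) VALUES (?, ?)",
        pairs,
    )
    conn.commit()


if __name__ == "__main__":
    sample = (
        "<ul><li>parallel corpus 平行语料库</li>"
        "<li>Parallel Corpus 平行语料库</li>"
        "<li>web page: 网页</li><li>no gloss here</li></ul>"
    )
    connection = sqlite3.connect(":memory:")
    store_pairs(connection, deduplicate(extract_pairs(sample)))
    for row in connection.execute("SELECT english, chinese FROM parallel_pairs"):
        print(row)
```

With a real MySQL server, the same three steps would apply, but the connection object and the SQL dialect (for example the duplicate-skipping insert) would differ from this SQLite stand-in.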
This article is indexed in Wanfang Data and other databases.