
Construction and compression of Dwarf
Authors: Xiang Long-gang  Feng Yu-cai  Gui Hao
Affiliation: School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, China
Funding: Project (No. 20030487032) supported by the Specialized Research Fund for the Doctoral Program of Higher Education, China
Abstract (excerpt from the Introduction): The CUBE BY operator (Gray et al., 1996) is an essential facility for data warehousing and OLAP. It is a multidimensional extension of the standard GROUP BY operator, computing all possible combinations of the grouping attributes in the CUBE BY clause. A CUBE BY with N grouping attributes computes 2^N group-bys. In the real world, a fact table is often very large and sparse. In such cases, the size of a group-by may be close to the size of the fact table. So th…
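
The claim that N grouping attributes yield 2^N group-bys, and that a group-by over a sparse fact table can be nearly as large as the table itself, can be illustrated with a minimal Python sketch. This is a naive enumeration for illustration only, not the Dwarf or PID construction; the function name `cube_by`, the SUM aggregate, and the four-row `fact_table` with its column names are all assumptions introduced here.

```python
from collections import defaultdict
from itertools import combinations


def cube_by(rows, dims, measure):
    """Naively compute every group-by over all subsets of `dims`.

    Returns {attrs: {group_key: sum_of_measure}}, one entry per subset
    of the grouping attributes, so 2^N entries for N attributes.
    """
    result = {}
    for k in range(len(dims) + 1):
        for attrs in combinations(dims, k):
            groups = defaultdict(int)
            for row in rows:
                key = tuple(row[a] for a in attrs)
                groups[key] += row[measure]
            result[attrs] = dict(groups)
    return result


# Hypothetical sparse fact table: 4 rows, 3 grouping attributes.
fact_table = [
    {"store": "S1", "customer": "C2", "product": "P2", "price": 70},
    {"store": "S1", "customer": "C3", "product": "P1", "price": 40},
    {"store": "S2", "customer": "C1", "product": "P1", "price": 90},
    {"store": "S2", "customer": "C1", "product": "P2", "price": 50},
]

cube = cube_by(fact_table, ("store", "customer", "product"), "price")

# One group-by per subset of the 3 attributes: 2^3 = 8.
print(len(cube))
# Number of groups in each group-by, compared with the 4 fact rows.
for attrs, groups in cube.items():
    print(attrs, len(groups))
```

In this toy table the group-by on `("customer", "product")` already has as many groups as the fact table has rows, which is the sparsity effect the passage describes: storing each group-by separately repeats nearly the whole table many times over.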

Keywords: database  compression software  file suffix  prefix path  data partitioning
Received: 20 April 2004
Revised: 1 August 2004

Xiang Long-gang, Feng Yu-cai, Gui Hao. Construction and compression of Dwarf[J]. Journal of Zhejiang University Science, 2005, 6(6): 519-527.
Authors:Xiang Long-gang  Feng Yu-cai  Gui Hao
Institution: School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, China
Abstract: The original algorithm for constructing Dwarf has an inherent difficulty that prevents it from building true Dwarfs. We explained when and why it introduces suffix redundancies into the Dwarf structure. To solve this problem, we proposed a completely new algorithm called PID. It computes partitions of a fact table bottom-up and inserts them into the Dwarf structure: if a partition is an MSV partition, PID coalesces its sub-Dwarf; otherwise it creates the necessary nodes and cells. Our performance study showed that PID is efficient. To condense Dwarf further, we proposed Condensed Dwarf, a more compressed structure that combines the strengths of Dwarf and Condensed Cube. By eliminating unnecessary storage of "ALL" cells from the Dwarf structure, Condensed Dwarf can effectively reduce the size of a Dwarf, especially for real-world Dwarfs, as our experiments illustrated. Query processing remains simple, and only two minor modifications to PID are required to construct a Condensed Dwarf.
Keywords:Data cube  Dwarf  Suffix coalescing  Prefix path  MSV partition  Condensed Dwarf
This article is indexed in CNKI, Wanfang Data, SpringerLink, and other databases.