3.
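Enterprise mail systems generate large volumes of logs every day. Analyzing these logs can yield information about users, devices, and even potential security risks, but because of the log volume, traditional log analysis methods can no longer meet enterprise needs. As the Hadoop platform has matured, big data technology has made it possible to analyze logs at this scale. Taking user access logs from a mail system as an example, this paper builds a Hadoop platform and uses Hive to analyze the logs, which helps uncover potential security risks in the mail system and keeps the system running stably.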
4.
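Geng Xinglong, Wang Li. 《河北软件职业技术学院学报》 (Journal of Hebei Software Institute), 2016, (1): 44-47
With the wide application of information technology and Internet of Things technology in urban transportation, urban traffic flow data now shows many characteristics of big data. Analyzing traffic big data with traditional information processing techniques inevitably runs into performance bottlenecks. A Hadoop-based traffic flow statistics and analysis system can compute statistics on this data and analyze it well. This paper studies methods for processing traffic flow information on a Hadoop-based platform, designs a traffic flow statistics and analysis system, presents the corresponding research data, and finally simulates the system to verify its feasibility and effectiveness.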
5.
File semantics have proven effective in optimizing large-scale distributed file systems. Because of the elaborate and
rich I/O interfaces between upper-layer applications and file systems, a file system can provide useful and insightful
semantic information. Hence, file semantic mining has become an increasingly important practice in both the engineering
and research communities. Unfortunately, exploiting file semantic knowledge is challenging because a variety of factors
can affect the information exploration process. Worse, the intricate interdependencies among these factors exacerbate the
challenge and make it difficult to fully exploit the potentially important correlations among different kinds of semantic
knowledge. This article proposes a file access correlation mining and evaluation reference (FARMER) model, in which a file
is represented in a multivariate vector space and each item within the vector corresponds to a separate factor of the
given file. The selection of factors depends on the application; examples of factors are file path, creator, and executing
program. If a particular factor occurs in both files, its value is non-zero. The extent of inter-file relationships can
therefore be measured by the likeness of the factor values in the semantic vectors. With this model, FARMER represents
files as structured vectors of identifiers, and basic vector operations can be used to quantify the correlation between
two file vectors. The FARMER model uses linear regression to estimate the strength of the relationship between file
correlation and a set of influencing factors so that the "bad knowledge" can be filtered out. To demonstrate its
capability, FARMER is incorporated into a real large-scale object-based storage system as a case study to dynamically
infer file correlations. In addition, FARMER-enabled optimizations of the metadata prefetching algorithm and the object
data layout algorithm are implemented. Experimental results show that the FARMER-enabled prefetching algorithm reduces
metadata operation latency by approximately 30% to 40% compared with a state-of-the-art metadata prefetching algorithm
and a commonly used replacement policy.
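The vector idea in this abstract can be illustrated with a minimal Python sketch. It is not the FARMER formula from the article, which the abstract does not give: the factor names, the 0/1 encoding, and the weighted-average score are all assumptions made for illustration, and the optional weights only stand in for the factor strengths that the article estimates by linear regression.

```python
# Hypothetical factor set; the abstract names path, creator and executing
# program as examples but does not fix a list.
FACTORS = ("path_dir", "creator", "program")


def semantic_vector(file_a, file_b, factors=FACTORS):
    """Return one entry per factor: 1.0 if both files share its value, else 0.0."""
    return [
        1.0 if file_a.get(f) is not None and file_a.get(f) == file_b.get(f) else 0.0
        for f in factors
    ]


def correlation(file_a, file_b, weights=None, factors=FACTORS):
    """Weighted share of matching factors, in [0, 1] (an assumed scoring rule)."""
    vec = semantic_vector(file_a, file_b, factors)
    if weights is None:
        weights = [1.0] * len(factors)  # assumption: all factors count equally
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * v for w, v in zip(weights, vec)) / total


# Illustrative records, not data from the article.
file_a = {"path_dir": "/home/alice/proj", "creator": "alice", "program": "gcc"}
file_b = {"path_dir": "/home/alice/proj", "creator": "alice", "program": "vim"}
file_c = {"path_dir": "/var/log", "creator": "root", "program": "syslogd"}

score_ab = correlation(file_a, file_b)  # two of three factors match
score_ac = correlation(file_a, file_c)  # no factor matches
```

Pairs whose score falls below some threshold could then be dropped as weak correlations; where that threshold sits is again a choice the abstract does not specify.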
7.
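Hadoop is currently the most widely used distributed framework, and job scheduling is one of its key components, directly affecting cluster performance and resource utilization. This paper studies the job scheduling workflow and job scheduling policy modes, and discusses the design points and configuration methods of the three schedulers that ship with Hadoop.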
8.
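This article introduces the concept, architecture, and key technologies of cloud storage, and analyzes Hadoop's core component technologies, including the distributed file system HDFS, the parallel computing framework MapReduce, and the distributed database HBase. Based on an analysis of the long-term preservation of electronic archives and of cloud storage platforms, it designs the architecture, functional modules, database, and runtime environment of a cloud storage service system, and builds a cloud storage service system on Hadoop's three core technologies, providing a solution to the problem of storing massive, large-scale data.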
10.
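《实验室研究与探索》 (Research and Exploration in Laboratory), 2015, (11): 77-81
With the development of Internet technology, data volumes are growing explosively, and a single machine can hardly store, organize, and analyze such massive data. Given this, building a distributed computing platform is of great significance for future research work and laboratory teaching. This paper explains in detail how to build a distributed computing platform in a laboratory environment and compares the performance of Hadoop and Spark, covering the installation and deployment of Hadoop and Spark clusters, the setup of a Spark integrated development environment, and a comparison of the time taken to run K-means clustering on the same dataset on both platforms. It offers some guidance for building distributed computing platforms.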
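The timing comparison described in this abstract can be pictured with a small single-machine sketch. It is neither Hadoop nor Spark code and does not reproduce the paper's experiment: the `kmeans` and `timed` helpers, the seeding rule, and the sample points are assumptions for illustration. It only shows the shape of the measurement, which is to run the same K-means job on the same data and record wall-clock time.

```python
import time


def kmeans(points, k, iters=10):
    """Plain Lloyd's algorithm on 2-D points; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the nearest centroid (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                updated.append((mean_x, mean_y))
            else:
                updated.append(centroids[i])  # keep the centroid of an empty cluster
        if updated == centroids:  # converged: assignments no longer move centroids
            break
        centroids = updated
    return centroids


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Illustrative data: two well-separated groups of three points each.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centroids, seconds = timed(kmeans, points, 2)
```

On a real cluster the `kmeans` call would be replaced by a job submitted to each platform, with the same `timed` wrapper around both so that the two measurements stay comparable.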