Renuka Nilaya, Shiva Mookambika Nagar, 6th Cross, Uppharahally, TUMKUR-02
Mobile: 8904892715 Email:
Website:
BIG DATA/HADOOP IEEE PROJECT TITLES - 2015
S.No / Title
1 / FastRAQ: A Fast Approach to Range-Aggregate Queries in Big Data Environments
Cloud Computing, Apr/Jun 2015
2 / Privacy-Preserving Ciphertext Multi-Sharing Control for Big Data Storage
Information Forensics and Security, Aug 2015
3 / BDCaM: Big Data for Context-aware Monitoring - A Personalized Knowledge Discovery Framework for Assisted Healthcare
Cloud Computing, June 2015
4 / Self-Adjusting Slot Configurations for Homogeneous and Heterogeneous Hadoop Clusters
Cloud Computing, March 2015
5 / An Incremental and Distributed Inference Method for Large-Scale Ontologies Based on MapReduce Paradigm
Systems, Man, and Cybernetics Society, Jan 2015
IEEE 2015 TITLES WITH ABSTRACTS – HADOOP
S.No / Title
1 / FastRAQ: A Fast Approach to Range-Aggregate Queries in Big Data Environments
Cloud Computing, Apr/Jun 2015
Range-aggregate queries apply an aggregate function to all tuples within given query ranges. Existing approaches to range-aggregate queries are insufficient to quickly provide accurate results in big data environments. In this paper, we propose FastRAQ, a fast approach to range-aggregate queries in big data environments. FastRAQ first divides big data into independent partitions with a balanced partitioning algorithm, and then generates a local estimation sketch for each partition. When a range-aggregate query request arrives, FastRAQ obtains the result directly by summarizing local estimates from all partitions. FastRAQ has O(1) time complexity for data updates and O(N/P×B) time complexity for range-aggregate queries, where N is the number of distinct tuples across all dimensions, P is the number of partitions, and B is the number of buckets in the histogram. We implement FastRAQ on the Linux platform and evaluate its performance with about 10 billion data records. Experimental results demonstrate that FastRAQ returns range-aggregate query results within a time period two orders of magnitude lower than that of Hive, while the relative error is less than 3 percent within the given confidence interval.
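The partition-then-summarize idea in the abstract can be sketched in a few lines. This is a hypothetical simplification, not the paper's algorithm: round-robin stands in for its balanced partitioning, and an equi-width histogram stands in for its estimation sketch; all names and the bucket width are assumptions.

```python
BUCKET_WIDTH = 10  # histogram bucket width (an assumed parameter)

def build_partitions(records, num_partitions):
    """Round-robin split (a stand-in for the paper's balanced partitioning)."""
    parts = [[] for _ in range(num_partitions)]
    for i, r in enumerate(records):
        parts[i % num_partitions].append(r)
    return parts

def local_sketch(partition):
    """Local 'estimation sketch': equi-width histogram, bucket index -> count."""
    hist = {}
    for v in partition:
        b = v // BUCKET_WIDTH
        hist[b] = hist.get(b, 0) + 1
    return hist

def range_count(sketches, lo, hi):
    """Estimate COUNT(*) over [lo, hi) by summing counts of buckets that lie
    fully inside the range (exact when lo and hi are bucket-aligned)."""
    total = 0
    for hist in sketches:
        for b, c in hist.items():
            if lo <= b * BUCKET_WIDTH and (b + 1) * BUCKET_WIDTH <= hi:
                total += c
    return total

data = list(range(100))            # 100 toy records: 0..99
parts = build_partitions(data, 4)  # P = 4 independent partitions
sketches = [local_sketch(p) for p in parts]
print(range_count(sketches, 20, 60))  # aligned range -> exact answer: 40
```

The query never rescans the raw data: each partition answers from its local sketch, and the coordinator only sums the per-partition estimates, which is where the speed-up over a full scan comes from.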
2 / Privacy-Preserving Ciphertext Multi-Sharing Control for Big Data Storage
Information Forensics and Security, Aug 2015
The need for a secure big data storage service is greater than ever. The basic requirement of such a service is to guarantee the confidentiality of the data. However, the anonymity of the service clients, one of the most essential aspects of privacy, should be considered simultaneously. Moreover, the service should also provide practical and fine-grained encrypted data sharing, such that a data owner is allowed to share a ciphertext of data with others under some specified conditions. This paper, for the first time, proposes a privacy-preserving ciphertext multi-sharing mechanism that achieves the above properties. It combines the merits of proxy re-encryption with an anonymity technique, so that a ciphertext can be securely and conditionally shared multiple times without leaking either the underlying message or the identity information of the ciphertext senders and recipients. Furthermore, this paper shows that the new primitive is secure against chosen-ciphertext attacks in the standard model.
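To see what "proxy re-encryption" means mechanically, here is a textbook BBS98-style ElGamal toy, not the paper's construction: it uses tiny parameters and provides neither anonymity nor CCA security, which the actual scheme adds. All key names and numbers are illustrative assumptions.

```python
# Toy proxy re-encryption over the order-q subgroup of Z_p* (p = 2q + 1).
q = 11          # prime order of the subgroup
p = 2 * q + 1   # safe prime 23
g = 4           # generator of the order-q subgroup

a, b = 3, 7                 # Alice's and Bob's secret keys
pk_a = pow(g, a, p)         # Alice's public key g^a

def encrypt(m, pk, r):
    """ElGamal variant: c = (m * g^r, pk^r) = (m * g^r, g^(a*r))."""
    return (m * pow(g, r, p) % p, pow(pk, r, p))

def re_key(sk_from, sk_to):
    """Re-encryption key rk = b / a mod q, handed to the proxy."""
    return sk_to * pow(sk_from, -1, q) % q

def re_encrypt(c, rk):
    """Proxy turns g^(a*r) into g^(b*r) without ever seeing m."""
    c1, c2 = c
    return (c1, pow(c2, rk, p))

def decrypt(c, sk):
    """Recover g^r = c2^(1/sk), then m = c1 / g^r."""
    c1, c2 = c
    gr = pow(c2, pow(sk, -1, q), p)
    return c1 * pow(gr, p - 2, p) % p

m = pow(g, 5, p)                    # message encoded as a group element
c = encrypt(m, pk_a, r=6)           # ciphertext addressed to Alice
c_b = re_encrypt(c, re_key(a, b))   # proxy re-targets it to Bob
print(decrypt(c_b, b) == m)         # True: Bob decrypts the re-encrypted copy
```

The point the abstract relies on: the proxy holds only rk and the ciphertext, never the message or either secret key, so sharing can be delegated repeatedly without decryption in between.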
3 / BDCaM: Big Data for Context-aware Monitoring - A Personalized Knowledge Discovery Framework for Assisted Healthcare
Cloud Computing, June 2015
Context-aware monitoring is an emerging technology that provides real-time personalised healthcare services, and it is a rich area of big data application. In this paper, we propose a knowledge discovery-based approach that allows a context-aware system to adapt its behaviour at runtime by analysing the large amounts of data generated in ambient assisted living (AAL) systems and stored in cloud repositories. The proposed BDCaM model facilitates the analysis of big data inside a cloud environment. It first mines the trends and patterns in the data of an individual patient, with associated probabilities, and then utilizes that knowledge to learn the patient's true abnormal conditions. The outcomes of this learning are then applied in context-aware decision-making processes for the patient. A use case is implemented to illustrate the applicability of the framework: it discovers classification knowledge to identify the true abnormal conditions of patients with variations in blood pressure (BP) and heart rate (HR). The evaluation shows much better detection of genuinely anomalous situations for different types of patients. The accuracy and efficiency obtained in the implemented case study demonstrate the effectiveness of the proposed model.
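A minimal sketch of the "personalized" idea, assuming a much simpler learner than the paper's knowledge-discovery framework: fit each patient's own BP/HR baseline from history, then flag only readings that deviate from that individual baseline. The data and threshold are invented for illustration.

```python
import statistics

def learn_baseline(history):
    """Per-patient mean and standard deviation for each vital sign."""
    return {sig: (statistics.mean(vs), statistics.stdev(vs))
            for sig, vs in history.items()}

def is_abnormal(baseline, reading, k=3.0):
    """Flag a reading more than k standard deviations from the baseline."""
    return any(abs(reading[sig] - mu) > k * sd
               for sig, (mu, sd) in baseline.items())

# Hypothetical patient with naturally high but stable blood pressure.
history = {"bp": [150, 152, 148, 151, 149, 150],
           "hr": [72, 75, 70, 74, 73, 71]}
base = learn_baseline(history)
print(is_abnormal(base, {"bp": 151, "hr": 73}))  # False: normal for THIS patient
print(is_abnormal(base, {"bp": 120, "hr": 73}))  # True: a drop from the baseline
```

This is why a population-wide threshold (e.g. BP > 140 is hypertensive) would misclassify this patient constantly, while the per-patient model catches the genuinely anomalous drop to 120.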
4 / Self-Adjusting Slot Configurations for Homogeneous and Heterogeneous Hadoop Clusters
Cloud Computing, March 2015
The MapReduce framework and its open-source implementation Hadoop have become the de facto platform for scalable analysis of large data sets in recent years. One of the primary concerns in Hadoop is how to minimize the completion length (i.e., makespan) of a set of MapReduce jobs. Current Hadoop only allows a static slot configuration, i.e., fixed numbers of map slots and reduce slots throughout the lifetime of a cluster. However, we found that such a static configuration may lead to low system resource utilization as well as long completion length. Motivated by this, we propose simple yet effective schemes that use the slot ratio between map and reduce tasks as a tunable knob for reducing the makespan of a given set of jobs. By leveraging the workload information of recently completed jobs, our schemes dynamically allocate resources (or slots) to map and reduce tasks. We implemented the presented schemes in Hadoop V0.20.2 and evaluated them with representative MapReduce benchmarks on Amazon EC2. The experimental results demonstrate the effectiveness and robustness of our schemes under both simple workloads and more complex mixed workloads.
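A back-of-the-envelope sketch of the tunable slot-ratio idea, assuming a cruder policy than the paper's schemes: using the workload of recently completed jobs, split the cluster's slots between the map and reduce phases in proportion to each phase's total work, so neither phase idles while the other is the bottleneck. The function name and workload numbers are assumptions.

```python
def split_slots(total_slots, map_tasks, avg_map_time,
                reduce_tasks, avg_reduce_time):
    """Return (map_slots, reduce_slots) proportional to each phase's workload,
    estimated from recently completed jobs."""
    map_work = map_tasks * avg_map_time          # total map seconds
    reduce_work = reduce_tasks * avg_reduce_time  # total reduce seconds
    map_slots = round(total_slots * map_work / (map_work + reduce_work))
    map_slots = min(max(map_slots, 1), total_slots - 1)  # keep both phases alive
    return map_slots, total_slots - map_slots

# Map-heavy workload: 80 map tasks x 10 s vs. 10 reduce tasks x 20 s.
print(split_slots(20, 80, 10, 10, 20))  # (16, 4)
```

Under a static 50/50 split, the 800 seconds of map work would queue behind 10 busy map slots while most reduce slots sit idle; re-weighting the ratio toward the heavier phase is exactly the "tunable knob" the abstract describes.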
5 / An Incremental and Distributed Inference Method for Large-Scale Ontologies Based on MapReduce Paradigm
Systems, Man, and Cybernetics Society, Jan 2015
With the coming deluge of semantic data, the fast growth of ontology bases has brought significant challenges to performing efficient and scalable reasoning. Traditional centralized reasoning methods are not sufficient to process large ontologies. Distributed reasoning methods are thus required to improve the scalability and performance of inference. This paper proposes an incremental and distributed inference method for large-scale ontologies using MapReduce, which realizes high-performance reasoning and runtime searching, especially for incremental knowledge bases. By constructing a transfer inference forest and effective assertional triples, storage is largely reduced and the reasoning process is simplified and accelerated. Finally, a prototype system is implemented on the Hadoop framework, and the experimental results validate the usability and effectiveness of the proposed approach.
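To make the MapReduce-based inference concrete, here is a toy map/reduce round for rdfs:subClassOf transitivity, a simplified stand-in for the paper's transfer-inference-forest approach (the in-memory dictionaries below imitate what Hadoop would do with keyed shuffles across machines; all names are assumptions).

```python
def map_phase(triples):
    """Emit each subClassOf edge twice, keyed by the class where two edges
    could join: by its object (as a left edge) and its subject (as a right edge)."""
    kv = {}
    for a, b in triples:  # triple (a, b) means: a rdfs:subClassOf b
        kv.setdefault(b, {"left": [], "right": []})["left"].append(a)
        kv.setdefault(a, {"left": [], "right": []})["right"].append(b)
    return kv

def reduce_phase(kv):
    """For every join key B, combine A -> B with B -> C into the derived A -> C."""
    derived = set()
    for groups in kv.values():
        for a in groups["left"]:
            for c in groups["right"]:
                if a != c:
                    derived.add((a, c))
    return derived

def closure(triples):
    """Iterate map/reduce rounds until no new triples are inferred (fixpoint)."""
    known = set(triples)
    while True:
        new = reduce_phase(map_phase(known)) - known
        if not new:
            return known
        known |= new

facts = {("Dog", "Mammal"), ("Mammal", "Animal"), ("Animal", "LivingThing")}
print(("Dog", "LivingThing") in closure(facts))  # True: inferred, not asserted
```

The incremental angle in the abstract corresponds to feeding only newly asserted triples into further rounds instead of recomputing the closure from scratch; the paper's forest structure is what keeps that cheap.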