SANE: SEMANTIC-AWARE NAMESPACE IN ULTRA-LARGE-SCALE FILE SYSTEMS

ABSTRACT:

The explosive growth in data volume and complexity imposes great challenges for file systems. To address these challenges, an innovative namespace management scheme is urgently needed to provide both ease and efficiency of data access. In almost all of today’s file systems, namespace management is based on hierarchical directory trees. This tree-based namespace scheme is prone to severe performance bottlenecks and often fails to provide real-time response to complex data lookups. This paper proposes a Semantic-Aware Namespace scheme, called SANE, which provides dynamic and adaptive namespace management for ultra-large storage systems with billions of files. SANE introduces a new naming methodology based on the notion of a semantic-aware per-file namespace, which exploits semantic correlations among files to dynamically aggregate correlated files into small, flat but readily manageable groups to achieve fast and accurate lookups. SANE is implemented as middleware in conventional file systems and works orthogonally with hierarchical directory trees. The semantic correlations and file groups identified in SANE can also be used to facilitate file prefetching and data de-duplication, among other system-level optimizations. Extensive trace-driven experiments on our prototype implementation validate the efficacy and efficiency of SANE.

EXISTING SYSTEM:

According to a recent survey of 1,780 data center managers in 26 countries, over 36 percent of respondents faced two critical challenges: efficiently supporting a flood of emerging applications and handling the sharply increased data management complexity. This reflects a reality in which we are generating and storing much more data than ever, and this trend continues at an accelerated pace. This data volume explosion has imposed great challenges on storage systems, particularly on the metadata management of file systems. For example, many systems are required to perform hundreds of thousands of metadata operations per second, and their performance is severely restricted by the hierarchical directory-tree based metadata management scheme used in almost all file systems today.

The most important functions of namespace management are file identification and lookup. The file system namespace, as an information-organizing infrastructure, is fundamental to a system’s quality of service, including performance, scalability, and ease of use. Unfortunately, almost all current file systems are based on hierarchical directory trees.

DISADVANTAGES OF EXISTING SYSTEM:

Limited system scalability.

Reliance on end users to organize and look up data.

Lack of metadata-semantics exploration.

PROPOSED SYSTEM:

We propose a new namespace management scheme, called SANE, which provides a flat but small, manageable, and efficient namespace for each file. SANE introduces the notion of a semantic-aware per-file namespace, in which a file is represented by its semantic correlations to other files instead of by a conventional static file name. Our goal is not to replace conventional directory-tree management, which already has a large user base. Instead, we aim to provide another metadata overlay that is orthogonal to directory trees. SANE runs concurrently with the conventional file system that integrates it and takes over the responsibilities of file search and semantic file grouping from the file system when necessary. Moreover, while providing the same functionalities, SANE makes use of a new naming scheme that requires only constant-scale complexity to identify and aggregate semantically correlated files. SANE extracts the semantic correlation information from a hierarchical tree.
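The notion of a per-file namespace can be illustrated with a minimal sketch. It assumes that each file is described by a small, pre-normalized attribute vector (size, creation time, access count), that cosine similarity stands in for the semantic correlation measure, and that the namespace of a file is simply its k most correlated files. The data, the names `FILES`, `cosine`, and `per_file_namespace`, and the choice of measure are all hypothetical and are not taken from the SANE paper.

```python
import math

# Hypothetical, pre-normalized metadata vectors:
# (file size, creation time, access count). Illustrative data only.
FILES = {
    "report_q1.doc": (0.20, 0.10, 0.30),
    "report_q2.doc": (0.22, 0.12, 0.28),
    "trace_a.log":   (0.90, 0.80, 0.10),
    "trace_b.log":   (0.88, 0.82, 0.12),
    "photo.jpg":     (0.50, 0.05, 0.95),
}

def cosine(u, v):
    """Cosine similarity, used here as an assumed stand-in for semantic correlation."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def per_file_namespace(name, files, k=2):
    """Return the k files most correlated with `name`, most correlated first."""
    target = files[name]
    scored = [
        (cosine(target, vec), other)
        for other, vec in files.items()
        if other != name
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]

if __name__ == "__main__":
    for name in FILES:
        print(name, "->", per_file_namespace(name, FILES))
```

In this sketch a file is identified by its neighbors rather than by a path, so the namespace stays flat and its size is bounded by k regardless of how many files the system holds. The linear scan over all files is only for readability; it does not reflect the constant-scale complexity that SANE claims.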

ADVANTAGES OF PROPOSED SYSTEM:

The metadata of files that are strongly correlated is automatically aggregated and then stored together in SANE.

SANE is implemented as a transparent middleware that can be deployed/embedded in most existing file systems without modifying the kernels or applications.
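The aggregation of strongly correlated metadata can be sketched as follows. This is an illustrative assumption rather than the grouping algorithm of SANE: two files are treated as strongly correlated when the Euclidean distance between their (hypothetical, pre-normalized) attribute vectors is at most a threshold, and correlated files are merged into groups with a simple union-find structure. The names `METADATA` and `group_correlated` and the threshold value are invented for the example.

```python
import math
from itertools import combinations

# Hypothetical, pre-normalized metadata vectors: (file size, creation time).
METADATA = {
    "a.log": (0.90, 0.80),
    "b.log": (0.88, 0.82),
    "x.doc": (0.20, 0.10),
    "y.doc": (0.22, 0.12),
    "z.jpg": (0.50, 0.95),
}

def group_correlated(metadata, threshold=0.1):
    """Group files whose attribute vectors lie within `threshold` of each other.

    Returns a list of dicts; each dict holds the metadata of one group,
    so that correlated entries are stored together.
    """
    parent = {name: name for name in metadata}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every pair of files that is close enough (assumed correlation test).
    for a, b in combinations(metadata, 2):
        if math.dist(metadata[a], metadata[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name, vec in metadata.items():
        groups.setdefault(find(name), {})[name] = vec
    return list(groups.values())

if __name__ == "__main__":
    for group in group_correlated(METADATA):
        print(sorted(group))
```

Each returned dictionary plays the role of one small, flat group whose metadata is kept in one place, so a lookup that lands in a group only has to examine that group's members. The pairwise comparison is quadratic and is used only to keep the sketch short.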

SYSTEM CONFIGURATION:

HARDWARE REQUIREMENTS:

Processor: Pentium IV

Speed: 1.1 GHz

RAM: 512 MB (min)

Hard Disk: 40 GB

Keyboard: Standard Windows Keyboard

Mouse: Two or Three Button Mouse

Monitor: LCD/LED

SOFTWARE REQUIREMENTS:

Operating System: Windows XP

Coding Language: .NET

Database: SQL Server 2005

Tool: Visual Studio 2008

REFERENCE:

Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng, and Lei Xu, “SANE: Semantic-Aware Namespace in Ultra-Large-Scale File Systems,” IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 5, MAY 2014.