Nara Sr. Big Data/Hadoop/Tableau/ETL - Architect/Lead

Professional Summary

-14+ years of work experience in the IT industry in analysis, design, development, and maintenance of various software applications, mainly in Hadoop (Cloudera, Hortonworks), Oracle Business Intelligence, Oracle (SQL, PL/SQL), and database administration, in UNIX and Windows environments, across industry verticals such as Banking, Financial Services, Pharmacies, Financial Assets, Fixed Income, Equities, Telecom & Health Insurance.

-6+ years of experience in Hadoop 1.x/2.x, CDH3U6, HDFS, HBase, Spark, Sqoop 2.x, Hive, YARN, Kafka, Flume, Elasticsearch, AWS, Java, Linux, Eclipse Juno, Kerberos security, NiFi, Ambari, Ansible, Impala, XML, JSON, Maven, SVN, Amazon Redshift, and Azure.

-5+ years of experience in Tableau 9.3 (Online, Desktop, Public, Vizable), Oracle Business Intelligence Enterprise Edition (OBIEE) 11g/10.1.3.3.0, Oracle Exalytics, and Siebel Analytics 7.7.1/7.8.5.

-5+ years of ETL experience using Informatica PowerCenter 8.2/7.2 (Repository Admin, Repository Manager, Designer, Workflow Manager & Workflow Monitor), Informatica PowerAnalyzer, PowerExchange, Talend 6.x, SQL Server 2005 Integration Services (SSIS), SSAS & SSRS.

-3+ years of database administration experience with Oracle 9i/10g/11g databases, including schema setup, backup and recovery, database environment setup, installation/configuration, and physical & logical data modeling.

-2+ years of data modeling & data analysis experience using dimensional and relational data modeling, star schema/snowflake modeling (ROLAP, MOLAP & HOLAP), fact & dimension tables, and physical & logical data modeling.

-Experience in developing applications using Java and related technologies under Waterfall and Agile (Scrum) methodologies.

-Solid experience in Agile Methodology - stories, sprints, Kanban, Scrum & tasks

-Good Knowledge of Core Java, JSP, Servlets, JDBC, SQL.

TECHNICAL SKILLS

Big Data Technology: Apache Hadoop, Hadoop clusters, Spark, Spark Streaming, Hadoop Common, Hadoop Distributed File System, Kerberos security, YARN, Ambari, Ansible, replication, Cloudera clusters, Pig, MapReduce, Cassandra, Kafka, Flume, Amazon servers (AWS), MongoDB, Mongoose, NiFi, Tableau 8.2/9.x, Predixion Insight, Informatica, relational/hierarchical/graph databases, distributed data file systems, data federation and query optimization

Business Intelligence: Tableau 9.3 (Online, Desktop, Public, Vizable), Oracle Business Intelligence Enterprise Edition (OBIEE) 11g/10.1.3.4, BI Apps (OBIA), Noetix 5.8, Siebel Business Analytics

RDBMS: Oracle 11g/10g, DB2 8.0/7.0 & MS SQL Server 2005

Data Modeling: Dimensional Data Modeling, Star Join Schema Modeling, Snowflake Modeling, Fact and Dimension Tables, Physical and Logical Data Modeling, Erwin 3.5.2/3.x & Toad

Programming: UNIX Shell Scripting, SQL, PL/SQL, VB & C.

Operating Systems: Windows 2000, UNIX AIX.

Work Experience:

United States PTO – DC, USA Nov’15 to date

Sr. Big Data/Hadoop/Tableau -Architect/Lead

Job Responsibilities:

-Installing and configuring Spark, Kafka, and Python on CentOS.

-Installing and configuring Hive and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisting with performance tuning and monitoring.

-Used Spark Streaming APIs to perform on-the-fly transformations and actions for building the common learner data model, which receives data from Kafka in near real time.
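
For illustration, a minimal PySpark Streaming sketch of this pattern, reading JSON events from Kafka in micro-batches; the broker address, topic, field names, and output path are assumptions, not the actual project values:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils
    import json

    sc = SparkContext(appName="LearnerModelIngest")
    ssc = StreamingContext(sc, 10)  # 10-second micro-batches

    # Direct stream from Kafka; each record arrives as a (key, value) pair
    stream = KafkaUtils.createDirectStream(
        ssc, ["learner-events"], {"metadata.broker.list": "broker1:9092"})

    # Parse the JSON payload and keep only the fields needed for the learner model
    parsed = (stream.map(lambda kv: json.loads(kv[1]))
                    .map(lambda e: (e["learner_id"], e["event_type"], e["ts"])))

    # Persist each micro-batch under a timestamped HDFS prefix (path is illustrative)
    parsed.saveAsTextFiles("/data/learner/raw/events")

    ssc.start()
    ssc.awaitTermination()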

-Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.

-Developing Spark scripts in Python (PySpark shell) as per the requirements to read/write JSON files.

-Optimizing existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs to read/write JSON files.
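
A minimal PySpark sketch of the JSON read/write pattern described above; the file paths, view name, and columns are illustrative assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("JsonReadWrite").getOrCreate()

    # Read JSON into a DataFrame, aggregate with Spark SQL, write the result back as JSON
    df = spark.read.json("/data/in/events.json")
    df.createOrReplaceTempView("events")
    daily = spark.sql(
        "SELECT event_date, COUNT(*) AS cnt FROM events GROUP BY event_date")
    daily.write.mode("overwrite").json("/data/out/daily_counts")

    # The same aggregation expressed with a pair RDD: (event_date, 1) reduced by key
    pairs = df.rdd.map(lambda r: (r["event_date"], 1)).reduceByKey(lambda a, b: a + b)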

-Developing a Spark ingest process to extract 1 TB of data on a daily basis.

-Installing, configuring, and monitoring multiple Hadoop cluster environments; monitoring workload, job performance, and capacity planning using Hortonworks Ambari.

-Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze logs produced by the Spark cluster on Hortonworks.

-Creating HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.

-Invoking and scheduling Spark Python scripts from NiFi.

-Writing Python scripts to download JSONL files and convert JSONL to XML and XML to JSONL.
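
A small standard-library Python sketch of the JSONL/XML conversion; it assumes flat records whose keys are valid XML tag names (the actual file layouts are not given in the original):

    import json
    import xml.etree.ElementTree as ET

    def jsonl_to_xml(jsonl_path, xml_path):
        """Convert a JSON Lines file into a flat <records><record>...</record></records> XML document."""
        root = ET.Element("records")
        with open(jsonl_path) as fh:
            for line in fh:
                record = json.loads(line)
                node = ET.SubElement(root, "record")
                for key, value in record.items():
                    ET.SubElement(node, key).text = str(value)
        ET.ElementTree(root).write(xml_path, encoding="utf-8", xml_declaration=True)

    def xml_to_jsonl(xml_path, jsonl_path):
        """Convert the same flat XML layout back into JSON Lines."""
        root = ET.parse(xml_path).getroot()
        with open(jsonl_path, "w") as out:
            for node in root:
                out.write(json.dumps({child.tag: child.text for child in node}) + "\n")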

-Supporting code/design analysis, strategy development and project planning.

-Creating reports for the BI team, using Sqoop to import data into HDFS and Hive.

-Working in large-scale data environments such as Hadoop and MapReduce, with a working understanding of Hadoop clusters, nodes, and the Hadoop Distributed File System (HDFS).

-Configuring and installing Hadoop and Hadoop ecosystem components (Hive/Pig/HBase/Sqoop/Flume).

-Designing and implementing a distributed data storage system based on HBase and HDFS.

-Importing and exporting data into HDFS and Hive.

-Commissioning and decommissioning Hadoop nodes and re-balancing data.

-Loading data into Parquet files by applying transformations using Impala.

-Loaded data into Spark RDDs and performed in-memory computation to generate output responses.

-Experienced in handling large datasets during the ingestion process itself using partitioning, Spark in-memory capabilities, broadcast variables, and effective and efficient joins and transformations. Created and ran Sqoop jobs with incremental loads to populate Hive external tables.
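
As a sketch of the join/partitioning pattern during ingestion (table names, paths, and the load_date column are assumptions), broadcasting the small dimension side avoids shuffling the large dataset:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("IngestJoins").enableHiveSupport().getOrCreate()

    # Large ingested dataset, assumed to carry a load_date column
    events = spark.read.parquet("/data/ingest/events")

    # Small dimension table; broadcast() ships it to every executor so the
    # large side is joined without a shuffle
    portfolios = spark.table("dim_portfolio")
    enriched = events.join(broadcast(portfolios), "portfolio_id")

    # Write partitioned output so downstream Hive queries can prune by load_date
    enriched.write.mode("append").partitionBy("load_date").parquet("/data/curated/events")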

-Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.

-Developed Hive (version 0.10) scripts for end user / analyst requirements to perform ad hoc analysis

-Working on SAP HANA views, optimizing data loads using SLT, and minimizing SAP HANA space usage.

-Reading data from SAP HANA into the Big Data Reservoir (BDR).

-Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
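
A sketch of the partitioned external table and bucketed managed table pattern, issued through PySpark with Hive support (database, column, and path names are illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # External, partitioned table: dropping it leaves the files in place, and queries
    # filtering on load_date scan only the matching partitions
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS curated.events (
            customer_id STRING,
            event_type  STRING,
            amount      DOUBLE
        )
        PARTITIONED BY (load_date STRING)
        STORED AS PARQUET
        LOCATION '/data/curated/events'
    """)

    # Managed, bucketed table written through the DataFrame API: rows are hashed on
    # customer_id into 32 buckets, which speeds up joins and sampling on that key
    df = spark.read.parquet("/data/staging/events")
    (df.write.mode("overwrite")
        .bucketBy(32, "customer_id").sortBy("customer_id")
        .saveAsTable("curated.events_bucketed"))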

-Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate into MapReduce jobs.

-Developed Oozie workflow for scheduling and orchestrating the ETL process

-Implemented authentication using Kerberos and authorization using Apache Sentry.

-Creating scheduled jobs/workflows in Oozie and ZooKeeper.

-HiveQL scripts to analyze customer data and determine patients' health patterns.

-HiveQL scripts to create, load, and query tables in Hive.

-HiveQL scripts to perform sentiment analysis (analyzing customers' comments and product ratings).
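
A hedged HiveQL sketch of such a sentiment query, run here through spark.sql for consistency with the other examples; the comments and sentiment_dict tables and their columns are assumptions for illustration:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Tokenize each comment, look the words up in a small sentiment dictionary,
    # and average the word scores per product
    spark.sql("""
        SELECT w.product_id,
               AVG(COALESCE(d.score, 0)) AS sentiment_score
        FROM (
            SELECT c.product_id, t.word
            FROM comments c
            LATERAL VIEW explode(split(lower(c.comment_text), ' ')) t AS word
        ) w
        LEFT JOIN sentiment_dict d ON d.word = w.word
        GROUP BY w.product_id
    """).show()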

-Set up and administer Amazon servers (AWS, Linux, Apache, MySQL, Python/Django, ElasticSearch, tripwire, fail2ban, ssh, sendmail, sudo, etc.)

-Supported tuple processing and writing data with Storm by providing Storm-Kafka connectors.

-As a contracted consultant, optimized the configuration of Amazon Redshift clusters, data distribution, and data processing to meet the company’s current and future needs.

-As Sr. Tableau/ETL architect & lead, interacted with business users to gather requirements and the desired look and feel of the applications to be developed.

-Developed various dashboards; used context filters and sets while dealing with huge volumes of data.

-Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.

-Building, publishing customized interactive reports and dashboards, report scheduling using Tableau server.

-Effectively used the data blending feature in Tableau.

-Defined best practices for Tableau report development.

-Administered users, user groups, and scheduled instances for reports in Tableau.

-Executed and tested required queries and reports before publishing.

-Mastered the ability to design and deploy rich graphic visualizations with drill-down and drop-down menu options and parameters using Tableau.

-Worked in the Tableau environment to create weekly, monthly, and daily dashboards and reports using Tableau Desktop and publish them to the server.

-Deploying and managing applications in the datacenter, in virtual environments, and on the Azure platform.

-Integrated Kafka and Storm using Avro for serializing and deserializing the data, along with Kafka producers and consumers.
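
A minimal Python sketch of the producer side of such an integration, assuming the kafka-python and fastavro packages are available; the schema, topic, and broker address are made up for illustration:

    import io
    from fastavro import parse_schema, schemaless_writer
    from kafka import KafkaProducer  # kafka-python

    # Hypothetical Avro schema for the messages consumed downstream by Storm
    schema = parse_schema({
        "type": "record", "name": "Event",
        "fields": [{"name": "id", "type": "string"},
                   {"name": "amount", "type": "double"}],
    })

    def serialize(record):
        """Serialize a dict to Avro binary bytes using the schema above."""
        buf = io.BytesIO()
        schemaless_writer(buf, schema, record)
        return buf.getvalue()

    producer = KafkaProducer(bootstrap_servers="broker1:9092", value_serializer=serialize)
    producer.send("events", {"id": "e-1001", "amount": 42.5})
    producer.flush()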

Environment:

Hadoop 2.x MR1, CDH3U6, HDFS, Hortonworks, HBase 2.x/0.90.x, Java, Amazon Redshift, Spark, Flume 0.9.3, Kafka, Python, Azure, Ansible blueprints, NiFi, Ambari, Elasticsearch, Impala, Kerberos security, Sqoop 2.x, Hive 0.7.1, Tableau 9.3 (Online, Desktop, Public, Vizable).

C&S Wholesale Grocers Inc – NJ, USA Jan 2013 to Nov 2015

Sr. Big Data/Hadoop/ Tableau - Architect/Lead

Job Responsibilities:

-Worked on loading log data directly into HDFS using Flume in Cloudera.

-Involved in loading data from the Linux file system to HDFS in Cloudera.

-Responsible for managing data from multiple sources.

-Experienced in running Hadoop Streaming jobs to process terabytes of XML-format data.
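
For illustration, a minimal Hadoop Streaming mapper in Python; the XML element and attribute names, and the submit command in the comment, are assumptions:

    #!/usr/bin/env python
    # Emits one count per <record type="..."> element found in the XML input;
    # a companion reducer would sum the counts per type. Submitted roughly as:
    #   hadoop jar hadoop-streaming.jar -input /data/xml -output /data/type_counts \
    #       -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py
    import re
    import sys

    TYPE_RE = re.compile(r'<record\s+type="([^"]+)"')

    for line in sys.stdin:
        for record_type in TYPE_RE.findall(line):
            print("%s\t1" % record_type)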

-Assisted in exporting analyzed data to relational databases using Sqoop.

-Experienced in importing and exporting data into HDFS; assisted in exporting analyzed data to an RDBMS using Sqoop in Cloudera.
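
A sketch of the kind of Sqoop import that pulls an RDBMS table into HDFS/Hive, wrapped in Python for consistency with the other examples; the connection string, credential file, and table names are assumptions:

    import subprocess

    # Pull a MySQL table into HDFS and register it as a Hive table in one pass
    subprocess.check_call([
        "sqoop", "import",
        "--connect", "jdbc:mysql://dbhost/sales",
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop.pwd",
        "--table", "orders",
        "--hive-import",
        "--hive-table", "staging.orders",
        "--num-mappers", "4",
    ])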

-Installed and configured MapReduce, Hive, and HDFS.

-Developing Spark scripts in Java, as per requirements, to read/write JSON files.

-Worked on Importing and exporting data into HDFS and Hive using Sqoop

-Worked on Hadoop administration, development, and NoSQL in Cloudera.

-Defined the big data load strategy.

-Loaded and transformed large sets of structured, semi-structured, and unstructured data.

-Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.

-Automated all the jobs for pulling data from the FTP server and loading it into Hive tables, using Oozie workflows.
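
An illustrative Python pull step of the kind an Oozie workflow could schedule: download new extracts from the FTP server and stage them in HDFS for the Hive load (host, credentials, and paths are assumptions):

    import ftplib
    import subprocess

    ftp = ftplib.FTP("ftp.example.com")
    ftp.login("etl_user", "secret")
    ftp.cwd("/outbound/daily")

    for name in ftp.nlst():
        local_path = "/tmp/%s" % name
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR %s" % name, fh.write)
        # Stage the file in the HDFS landing zone that the Hive tables read from
        subprocess.check_call(["hdfs", "dfs", "-put", "-f", local_path, "/data/landing/ftp/"])

    ftp.quit()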

-Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.

-Supported code/design analysis, strategy development and project planning.

-Created reports for the BI team, using Sqoop to import data into HDFS and Hive.

-Configured and installed Hadoop and Hadoop ecosystem components (Hive/Pig/HBase/Sqoop/Flume).

-Designed and implemented a distributed data storage system based on HBase and HDFS.

-Importing and exporting data into HDFS and Hive.

-Worked on the development of Dashboard reports for the Key Performance Indicators for the top management.

-Experience in creating different visualizations using bars, lines, pies, maps, scatter plots, Gantt charts, and bubbles.

-Designed & Implemented Data Warehouse creating facts and dimension tables and loading them using Informatica Power Center Tools fetching data from the OLTP system to the Analytics Data Warehouse.

-Work closely with various levels of individuals to coordinate and prioritize multiple projects. Estimate, schedule and track BI projects throughout SDLC.

-Coordinating with business users to gather new requirements and working on existing issues.

-Tableau: Provided support for Tableau-developed objects and tool administration.

-Tableau: Worked in the Tableau environment to create weekly, monthly, and daily dashboards and reports using Tableau Desktop and publish them to the server.

-Tableau: Created Custom Hierarchies to meet the Business requirement in Tableau.

-Tableau: Worked on various requests to create views for Oracle Discoverer for the Order Management and Finance areas.

-Tableau: Created side by side bars, Scatter Plots, Stacked Bars, Heat Maps, Filled Maps and Symbol Maps according to deliverable specifications.

-Tableau: Consistently attended meetings with the Client subject matter experts to acquire functional business requirements in order to build SQL queries that would be used in dashboards to satisfy the business's needs.

-Design and development of integration APIs using various data structure concepts and the Java Collections Framework, with exception handling, to return responses within 500 ms; used Java threads to handle concurrent requests.

Environment:

Hadoop 1.x/2.x MR1, Cloudera CDH3U6, HDFS, HBase 0.90.x, Flume 0.9.3, Java, Sqoop 2.x, Hive 0.7.1, Tableau (Online, Desktop, Public, Vizable)

Magellan Medicaid Administration, VA, USA Feb’11 to Jan 2013

Sr. Big Data/Hadoop/OBIEE/Tableau - Architect/Lead

Job Responsibilities:

-Install and configure Hue.

-Importing data from a MySQL database into Hive using Sqoop.

-Develop, validate and maintain HiveQL queries.

-Running reports using Pig and Hive queries.

-Analyzing data with Hive, Pig.

-Designed Hive tables to load data to and from external files.

-Wrote and implemented Apache Pig scripts to load data from and store data into Hive.

-Monitoring clusters and providing reporting using Solr.

-Designed business models using SpagoBI, an analytic platform.

-Installed and configured Pig, wrote Pig Latin scripts, and wrote MapReduce jobs using Pig Latin.

-Developed workflow using Oozie for running MapReduce jobs and Hive Queries.

-Worked on Cluster coordination services through Zookeeper.

-Interacted with Client Services Management business representatives to gather Reports and Dashboards requirements and to define business and functional specifications in OBIEE.

-Data Model Design and Documentation (Facts and Dimensions, Business Users & Security Permissions, Stars with Hierarchies, Business Model Names for Source Columns, Reports and Dashboard Layouts)

-Responsible for monitoring, analyzing, designing, and implementing the Marketing and Sales dashboards based on key performance metrics such as data aging and average resolution time, aimed at increasing product sales and marketing effectiveness and reducing costs.

-OBIEE metadata development; configured the repository across all three layers (Physical, Business Model and Mapping, and Presentation).

-Different levels of Security set-up for the entities of the application (on both Repository file and Web catalog file) in OBIEE

-Worked on Security management for users, groups and web-groups using the various authentication systems such as LDAP, OS, Database and Database table authentication using Session Variable features, as well as Dashboard / Report-level security in OBIEE.

-Developed and Debugged Interactive Dashboards and Reports relating to Customer Feedback Analytics, with different Analytic Views (Guided Navigation, Pivot Table, Chart, View Selector, Column Selector, Filters, and Dashboard Prompts) using Presentation Services in OBIEE.

-Implemented Object Level Security for the objects like Dashboards and Reports, and Data Level Security for Region and Product dimensions, using Session Variables.

-Developed Performance Tuning Strategies, for optimized performance, by implementing Aggregate Navigation, Cache Management, re-organizing data, and making some changes to Reports.

-Performed the detailed design of Reports / Dashboards, Analytical Data Model (rpd) and performed Fit/Gap Analysis between existing reports and new requirements in OBIEE.

-Financial Analytics / HR Analytics: Experience in mapping Oracle GL natural accounts to group account numbers.

-Created several new SDE’s and SIL’s Mappings.

-Modified existing SDE’s and SIL’s Mappings as per the requirements.

-Created and customized DAC tables, tasks, subject areas, and execution plans.

-ETL refresh performance improvement using DAC and Informatica.

-Configured and modified DAC, Informatica SDE, SIL, and PLP mappings, the RPD, and dashboards for Financial Analytics, HR Analytics, and Asset and Serialization.

-Knowledge of Oracle Procurement & Spend Analytics, Oracle Supply Chain & Order Management Analytics, Oracle Service Analytics, and Oracle Financial Analytics

-Implemented the delivery of iBots using Oracle BI Delivers to alert the associated teams and Agency personnel; also used iBots for cache seeding and cache purging.

-Performed Unit Testing, Integration Testing, and User Acceptance Testing (UAT), to validate reports, and played an active role in product rollouts.

Environment: Oracle Business Intelligence EE 10.1.3.3.0, Oracle 10g, DAC, Oracle Apps R12, LDAP, SQL Server 2005, MapViewer, MySQL, IMS, PL/SQL Gateway, SQL Developer, SQL and PL/SQL.

(OBIA): Financial Analytics, HR Analytics, Informatica PowerCenter 8.6 (ETL), LDAP, SQL Server 2005, MapViewer, Neon Shadow 6.0 (IMS), OViD, SQL Developer, SQL and PL/SQL.

Deluxe Corp, MN, USA Mar’08 to Feb’11

Sr. ETL/OBIEE – Architect/Lead

Job Responsibilities:

-Interacted with Client Services Management business representatives to gather Reports and Dashboards requirements and to define business and functional specifications.

-Data Model Design and Documentation (Facts and Dimensions, Business Users & Security Permissions, Stars with Hierarchies, Business Model Names for Source Columns, Reports and Dashboard Layouts)

-Responsible for monitoring, analyzing, designing, and implementing the Marketing and Sales dashboards based on key performance metrics such as data aging and average resolution time, aimed at increasing product sales and marketing effectiveness and reducing costs.

-OBIEE metadata development; configured the repository across all three layers (Physical, Business Model and Mapping, and Presentation).

-Set up the Multi-User Development Environment (MUDE), integrating OBIEE and BI Publisher.

-Different levels of Security set-up for the entities of the application (on both Repository file and Web catalog file)

-Developed and Debugged Interactive Dashboards and Reports relating to Customer Feedback Analytics, with different Analytic Views (Guided Navigation, Pivot Table, Chart, View Selector, Column Selector, Filters, and Dashboard Prompts) using Presentation Services.

-Implemented Object Level Security for the objects like Dashboards and Reports, and Data Level Security for Region and Product dimensions, using Session Variables.

-Developed Performance Tuning Strategies, for optimized performance, by implementing Aggregate Navigation, Cache Management, re-organizing data, and making some changes to Reports.

-Implemented the delivery of iBots using Oracle BI Delivers to alert the associated teams and Agency personnel; also used iBots for cache seeding and cache purging.

-Performed Unit Testing, Integration Testing, and User Acceptance Testing (UAT), to validate reports, and played an active role in product rollouts.

-Created a comprehensive, easy-to-understand Oracle Answers & Dashboards user guide and provided intensive training to business users.

-Developed many dashboards/reports providing analytical information using Oracle BI Answers and Dashboards. Created and used Oracle BI Delivers and iBots extensively.