Data Migration Assessment Guide

5-Nov-18

Version 3.0 Final

Prepared by


Table of Contents

1 Introduction

1.1 Overview

1.2 Who Should Read This Guide

1.3 How to Use This Guide

2 Pre-Requisites

2.1 Enable Power Pivot

2.2 Verify the Data Migration Assessment - Jump Start - Analysis Workbook

3 Collect Data from the Target File Servers

3.1 Execute PowerShell Script or VBScript on File Servers

3.2 Executing the Data Migration Assessment Scripts

3.3 Prepare Data for Analysis

4 Analyze the Data from the Target File Servers

4.1 Refresh the Data Connection in Power Pivot

4.2 Analyze the Collected Data

4.3 Document and Discuss Assessment Results


1 Introduction

1.1 Overview

This document provides information on implementing the Data Migration Assessment process delivered as part of the Data Migration Assessment Jump Start Engagement.

1.2 Who Should Read This Guide

This document is designed to assist the Delivery Organization consultants and architects responsible for delivering the Data Migration Assessment Jump Start. The document explains how to use the technical IP included in the Offering to deliver the Data Migration Assessment.

1.3 How to Use This Guide

This document provides consultants with step-by-step procedures for using the PowerShell- and Excel-based solution to assess the customer's file servers and data, and to develop recommendations for which volumes can be migrated to StorSimple.

2 Pre-Requisites

The following pre-requisites must be met to perform the Data Migration Assessment:

  1. The delivery consultant must install (if not already installed) Office 2013 Professional Plus on their laptop or identify a customer workstation running Office 2013 Professional Plus (only Excel is actually required).
  2. Enable Power Pivot.

2.1 Enable Power Pivot

Power Pivot must be enabled using the following steps:

  1. Go to File > Options > Add-Ins.
  2. In the Manage box, select COM Add-ins, and then click Go.
  3. Check the Microsoft Office Power Pivot in Microsoft Excel 2013 box, and then click OK. If you have other versions of the Power Pivot add-in installed, those versions are also listed in the COM Add-ins list. Be sure to select the Power Pivot add-in for Excel 2013.


The ribbon now has a Power Pivot tab.

Open the Power Pivot window

  1. Click Power Pivot.

This is the tab where you work with Power Pivot PivotTables, calculated fields, key performance indicators (KPIs), and linked tables. A KPI is a predefined measure used to track performance against a strategic goal, objective, plan, initiative, or business process; a visual cue is often used to communicate performance against the measure.


  2. Click Manage.


Now you’re in the Power Pivot window. Here you can click Get External Data to use the Table Import Wizard to filter data as you add it to your file, create relationships between tables, enrich the data with calculations and expressions, and then use this data to create PivotTables and PivotCharts.

  3. Now that you have enabled Power Pivot, close Excel.

2.2 Verify the Data Migration Assessment - Jump Start - Analysis Workbook

  1. Open the Data Migration Assessment – Jump Start – Analysis Workbook.xlsx file included as part of the engagement materials.
  2. Ensure the StorageAssessmentInputFile.csv file is located in the same directory as the Excel workbook.
  3. The workbook should open and display the sample data.

Note: If asked to allow access to external connections, select Yes.

  4. Click through the Charts, Tables, and Files to Flag sheets in the workbook and make sure the sample data is displayed correctly. You should see data like these screenshots:

3 Collect Data from the Target File Servers

In this section, the target file servers identified during Envisioning and the Solution Alignment Workshop will be scanned to collect information about the files, types, and ages on each of the servers. A PowerShell script will be executed on each target server which will generate a CSV output text file.

3.1 Execute PowerShell Script or VBScript on File Servers

The FileScan.ps1 and FileScan.vbs scripts included as part of the technical IP for the Jump Start can be copied to each file server and executed locally or remotely under a user account that has read and file scan access to all of the data on the server (typically this would be a customer executing the script using an administrator account).

Note: The PowerShell and VBScript versions of the script provide the same functionality. Choose the version (PS or VBS) that suits the customer environment.


  1. Download the script and review the contents with your customer technical point of contact. The script code is illustrated here:

Note that the PowerShell script is essentially 77 lines of code executing three primary PowerShell cmdlets: Get-Volume, Get-ChildItem, and Export-CSV. The script does not modify any data other than creating a single output file.
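The shipped script's exact contents are not reproduced here, but the core logic described above — enumerating fixed volumes with Get-Volume, walking files with Get-ChildItem, and exporting the metadata columns with Export-CSV — can be sketched roughly as follows. This is a hedged approximation, not the actual FileScan.ps1; the output path and the exact property expressions are assumptions, with only the CSV column names taken from this guide.

```powershell
# Rough sketch of the scan logic described above -- NOT the shipped FileScan.ps1.
# Enumerates lettered fixed volumes, collects per-file metadata, exports to CSV.
$hostName = $env:COMPUTERNAME

$results = foreach ($vol in Get-Volume | Where-Object { $_.DriveType -eq 'Fixed' -and $_.DriveLetter }) {
    Get-ChildItem -Path "$($vol.DriveLetter):\" -Recurse -File -ErrorAction SilentlyContinue |
        Select-Object @{ n = 'HostName';  e = { $hostName } },
                      @{ n = 'PSDrive';   e = { [string]$vol.DriveLetter } },
                      @{ n = 'Directory'; e = { $_.DirectoryName } },
                      Name, Extension, Length,
                      CreationTime, LastAccessTime, LastWriteTime
}

# The shipped script's output begins with a '#TYPE ...' line (the "output bug"
# noted in section 3.3); adding -NoTypeInformation here would suppress it.
$results | Export-Csv -Path '.\Main-StorageAssessmentInput.csv'
```

Note that Get-Volume requires Windows 8 / Windows Server 2012 or later; the shipped script may enumerate drives differently on older systems.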


Note that the VBScript is essentially 290 lines of code executing various methods and exporting a CSV. The script does not modify any data other than creating a single output file.

  2. Ensure the customer technical point of contact (POC) is comfortable with the script and able to execute it on the servers.

Important: Be sure to read the "Script Implementation Steps" document below before executing the script, as it describes how to configure the script for testing and how to change the output paths if needed.

  3. On each file server, have the technical POC log in as an administrator, copy the script to the server, then open PowerShell or a command prompt as administrator and execute the script (for example, .\FileScan.ps1 or cscript Filescan.vbs). The script can also run remotely (without logging in to each individual server) to scan multiple remote computers; refer to the "Script Implementation Steps" document.

Important: The script will put some load on the file servers, primarily in terms of CPU, as all of the files get scanned. The script may take 5 – 240 minutes to complete depending on how many files the server and its storage contain and how fast the server is. The customer may want to run this process after hours or during low-demand periods.

  4. When the script has finished executing, locate the output CSV file, which should be named Main-StorageAssessmentInput.csv and will by default be in the c:\users\<logged on user>\Assessment-<Current Date and Time> folder. Copy the CSV file to the laptop or workstation running Excel and Power Pivot. For remote servers, all CSV files will be consolidated into Main-StorageAssessmentInput.csv.
  5. Repeat steps 3 and 4 for each of the in-scope file servers.
  6. At this point, the analysis workstation should have CSV files for each of the servers in scope for assessment.
  7. The output CSV files will look like this:

3.2 Executing the Data Migration Assessment Scripts

As described in the previous section, the FileScan.ps1 and FileScan.vbs scripts can be executed locally on each file server or remotely under a user account that has read and file scan access to all of the data on the server. The supported invocation modes are listed below.

Note: The PowerShell and VBScript versions of the script provide the same functionality. Choose the version (PS or VBS) that suits the customer environment.

  1. To scan a single drive: Filescan.ps1 –drive c: (PowerShell) or cscript Filescan.vbs c (VBScript).
  2. To scan multiple drives (this is the default): Filescan.ps1 or cscript filescan.vbs.
  3. To scan a list of servers remotely: Filescan.ps1 –serverlist <Server.txt> or cscript Filescan.vbs <server.txt> (the input text file can have any name).
  4. The output file, Main-StorageAssessmentInput.csv, is generated in the user's home directory under an Assessment-<Current Date and Time> folder every time the script runs, for example C:\users\<user>\assessment09262014-T163116.
  5. Scanning multiple servers generates a CSV for each server, named <Computer Name>-Assessment.csv. The script then consolidates all of these CSVs into a single file, Main-StorageAssessmentInput.csv.
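As a concrete illustration of the remote mode above, a run might look like the following; the server names are hypothetical examples, not part of the engagement materials.

```powershell
# Hypothetical remote-scan run. Server.txt lists one host name per line, e.g.:
#   FILESRV01
#   FILESRV02
.\Filescan.ps1 -serverlist .\Server.txt

# Expected outputs, per the consolidation behavior described above:
#   FILESRV01-Assessment.csv, FILESRV02-Assessment.csv
#   Main-StorageAssessmentInput.csv  (the consolidated file)
```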

IMPORTANT: The CSV output files may be very large (tens to hundreds of megabytes) depending on the size of the volumes. If you are running these scripts locally on the file servers, be sure to select a volume for the output that has a large amount of free space (typically not the C: or OS drive on the file server).

3.3 Prepare Data for Analysis

In this section, the data collected in the previous section will be assembled into a single file for analysis.

  1. In each of the CSV files created in the previous section, remove the first two lines of the files:

#TYPE Selected.System.IO.FileInfo

"HostName","PSDrive","Directory","Name","Extension","Length","CreationTime","LastAccessTime","LastWriteTime"

Important: The first line is an output bug. The second line is the column headings for the CSV files. We will be pasting the data into the existing sample file which already has the heading.

  1. Open the Main-StorageAssessmentInput.csv sample file that is included as part of the Offering technical IP.
  2. Delete all of the sample data but retain the first row (the column headings).
  3. Open each modified CSV file from each of the in-scope file servers and paste the data into the StorageAssessmentInput.csv file.
  4. This file will likely be extremely large (millions of rows). Unless the file servers are enormous, this amount of data should be fine since we are using Power Pivot with Excel.
  5. Save the file, which now contains all of the data from the in-scope file servers.
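The strip-and-paste steps above can be tedious when many servers are in scope. A small helper script along the following lines could automate them; this is a sketch, not part of the shipped IP, and it assumes the per-server files sit in the current directory and use the documented header row.

```powershell
# Hypothetical helper (not part of the shipped IP): merge the per-server CSVs
# into one file, keeping a single header row and dropping each file's
# '#TYPE ...' line and repeated header.
$header = '"HostName","PSDrive","Directory","Name","Extension","Length","CreationTime","LastAccessTime","LastWriteTime"'
$merged = '.\StorageAssessmentInput.csv'

Set-Content -Path $merged -Value $header
Get-ChildItem -Path . -Filter '*-Assessment.csv' | ForEach-Object {
    Get-Content -Path $_.FullName |
        Where-Object { $_ -and $_ -notmatch '^#TYPE' -and $_ -ne $header } |
        Add-Content -Path $merged
}
```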

4 Analyze the Data from the Target File Servers

In this section you will use Excel and Power Pivot to analyze the collected data. For reference purposes, Power Pivot is a significant enhancement to Excel that enables it to analyze extremely large data sets with high performance.

4.1 Refresh the Data Connection in Power Pivot

  1. Open the Data Migration Assessment - Jump Start - Analysis Workbook.xlsx workbook.
  2. Select the Power Pivot tab in the ribbon, then click the Manage button.
  3. The Power Pivot window should now open.
  4. Initially this will show the sample data. To refresh the data using the updated StorageAssessmentInput.csv file, select the Existing Connections button on the ribbon.
  5. In the dialog box that appears, select the Text data item at the top, then click Edit.
  6. In the Edit dialog box, click Browse and locate the updated version of the StorageAssessmentInput.csv file.
  7. Ensure the Use First Row as Column Headers checkbox is selected.
  8. Click Save. This will return you to the Existing Connections dialog box. Now click the Refresh button.
  9. This will trigger the import of all of the data from the Main-StorageAssessmentInput.csv file. Using vertical compression and other optimizations, this huge amount of CSV data will result in only a moderately large Excel file.
  10. When the import is complete, all of the Pivot Charts and Pivot Tables should be updated.

4.2 Analyze the Collected Data

  1. Begin analyzing the data by selecting the Charts worksheet.
  2. By default these charts summarize all data from all servers and volumes.
  3. This initial summary view will help you determine how old, in general, all of the files are in terms of creation date, last-accessed date, and last-written date.
  4. Use the filter buttons in each chart as needed to isolate individual servers or volumes for analysis.
  5. The purpose is to get a feel for each server and volume in terms of how “stale” the data is.
  6. Next evaluate the Percentage of Storage by Top 10 File Extensions chart. This indicates the types of files that are consuming the most space across all of the volumes assessed.
  7. Use the filter buttons in the chart as needed to isolate individual servers or volumes for analysis.
  8. Continue analyzing the data by selecting the Tables worksheet.
  9. This worksheet includes a Pivot Table summarizing the entire set of data from Power Pivot (which could be millions of rows).
  10. The instructions for using this Pivot Table are documented in the worksheet itself.
  11. To summarize, what you will do here is, for each volume on each host, check whether the sum of the percentages for Age3 and Age4 is greater than 50%; if so, that volume will be flagged as a potential candidate for migrating to StorSimple.
  12. By default, the Pivot Table is using last accessed (read) date for the calculations. It can easily be changed to last written date or creation date.
  13. The default rule documented is that if 50% or more of the storage on the volume has not been accessed in the last 180 days, it should be a candidate for migration to StorSimple. You can discuss this rule with your customer and modify it using your judgment. The final calculation is done manually, as this was difficult to do in Excel, but it only adds about 10 minutes of work. For each volume, add up the Age3 and Age4 percentages and, if greater than 50%, enter “Migrate” in Column D next to the Volume row for each relevant volume.
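The manual Age3 + Age4 check could also be scripted if the summarized pivot data were exported to CSV. The sketch below assumes a hypothetical export named VolumeAgeSummary.csv with HostName, Volume, Age3Pct, and Age4Pct columns; these names are illustrative and are not part of the workbook.

```powershell
# Hypothetical automation of the Age3 + Age4 >= 50% rule. The input file and
# its column names are assumptions, not taken from the actual workbook.
Import-Csv -Path '.\VolumeAgeSummary.csv' | ForEach-Object {
    $stalePct = [double]$_.Age3Pct + [double]$_.Age4Pct
    if ($stalePct -ge 50) {
        '{0} {1}: Migrate ({2:N1}% stale)' -f $_.HostName, $_.Volume, $stalePct
    }
}
```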
  14. One file analysis step remains: evaluate each volume for file types that need to be flagged. Select the File Types to Flag worksheet.
  15. This worksheet contains another Pivot Table, and the instructions for using it are documented in the worksheet.
  16. In summary, the table includes files by volume and file extension. What we want to do here is filter the table to show only the file extensions we care about. In this case, those are extensions which are typically not recommended for use with StorSimple, such as high-IO active database files, active virtual machine files, etc.
  17. In row 16, select the filter button in Column B.
  18. Clear the filter so it does not select all, then use the search field to enter the file extensions you want to include in the filter. Rows 9 – 14 show some examples. Search for those you want to filter for and click Add current selection to filter.
  19. Once your filter is configured, click OK and the Pivot Table should update to show only files with the extensions you selected. It will summarize the files by size and by volume.
  20. At this point, you need to identify volumes that have large numbers or sizes of flagged files. These may not be good candidates for migration to StorSimple; for example, if a volume contains many active VHDX files, it is likely not a candidate (unless it is just a VHDX library, meaning non-active VHDX files).
  21. You must now compare your list of migration candidates from steps 1 – 13 with those from steps 14 – 20 to see if any of the original candidates need to be eliminated.
  22. The remaining volumes are those preliminarily recommended for migration to StorSimple.
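As a cross-check on the worksheet, a by-extension storage summary can also be computed directly from the consolidated CSV. This sketch uses the documented Extension and Length columns; the file path is an assumption.

```powershell
# Sketch: total storage per file extension from the consolidated CSV,
# mirroring the 'Percentage of Storage by Top 10 File Extensions' chart.
Import-Csv -Path '.\Main-StorageAssessmentInput.csv' |
    Group-Object -Property Extension |
    ForEach-Object {
        $bytes = ($_.Group | ForEach-Object { [double]$_.Length } |
                  Measure-Object -Sum).Sum
        [pscustomobject]@{
            Extension = $_.Name
            TotalGB   = [math]::Round($bytes / 1GB, 2)
        }
    } |
    Sort-Object -Property TotalGB -Descending |
    Select-Object -First 10
```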

4.3 Document and Discuss Assessment Results

In V1 of this engagement, there is no template for assessment results. To create the final deliverable, use the filtering capability described above for the Charts and Tables worksheets to create chart and table images to include in a Word document. For each of the volumes you are recommending for migration to StorSimple, capture in the Word document the host name, volume name, and the relevant filtered chart and pivot table data from the Charts and Tables tabs.

Summarize the final data by calculating the total amount of storage being recommended for migration to StorSimple.

Review the preliminary recommendations with your customer technical POC to further refine the list based on their knowledge of their environment and data.

Edit the list based on any customer feedback and finalize the document as the engagement deliverable.
