D2.2: MOBISERV Validation Plan – Issue 3


MOBISERV – FP7 248434
An Integrated Intelligent Home Environment for the Provision of Health, Nutrition and Mobility Services to the Elderly
Final Deliverable
D2.2: MOBISERV Validation Plan (Issue 3)
Date of delivery: Sept 30th, 2011 (updated Dec 16th, 2011)
Contributing Partners: UWE, ST, SMARTEX, CSEM, ROBS, AUTH, SMH
Date Issued: 20th Dec 2011 / Version: Issue 3 v6.0


Document Control

Title: / D2.2: MOBISERV Validation Plan
Project: / MOBISERV (FP7 248434)
Nature: / Report / Dissemination Level: Restricted
Authors: / UWE, SMH, SYSTEMA, AUTH, CSEM, SMARTEX, ROBOSOFT
Origin: / UWE
Doc ID: / MOBISERV D2.2vol3_v6.0

Amendment History

Version / Date / Author / Description/Comments
v0.1 / 2011-09-12 / UWE / First Version
v1.1 / 2011-09-21 / UWE / Second Version
v2.0 / 2011-09-26 / UWE, SMH / Third Version – Requesting Technical Partner input
v3.0 / 2011-10-17 / UWE, SMH, ST, CSEM and Smartex / Input from Systema, CSEM and Smartex
v4.0 / 2011-12-04 / UWE, SMH, ST, CSEM, Smartex, ROBS, AUTH / Input from Robosoft and Thessaloniki
v5.0 / 2011-12-16 / UWE, SMH, ST, CSEM, Smartex, ROBS, AUTH / Final review
v6.0 / 2011-12-20 / UWE, SMH, ST, CSEM, Smartex, ROBS, AUTH / Responses to Internal Moderation (LUT) incorporated

Table of contents

Executive Summary......

1 Introduction......

1.1 System and scope......

1.2 Objectives and constraints......

1.3 Intended Audience......

2 Qualification and Validation of MOBISERV system......

2.1 Validation approach......

2.2 WP 4 Component Validation (Nutrition Support System)......

2.2.1 KPIs and Validation Plan......

2.2.1.1 KPIs to be used......

2.2.1.2 Benchmarking Tests to be carried out – Methodology......

2.3 WP 5 Component Validation (Data Logger)......

2.3.1 KPIs and Validation Plan......

2.3.1.1 KPIs to be used......

2.3.1.2 Benchmarking Tests to be carried out – Methodology......

2.4 WP 5 Component Validation (Smart Garments)......

2.4.1 KPIs and Validation Plan......

2.4.1.1 KPIs to be used......

2.4.1.2 Benchmarking Tests to be carried out – Methodology......

2.4.1.3 Operational safety criteria and testing process......

2.5 WP 6 Component Validation (Information and Coordination and Communication support system)......

2.5.1 KPIs and Validation Plan......

2.5.1.1 KPIs to be used......

2.5.1.2 Benchmarking Tests to be carried out – Methodology......

2.5.1.3 Operational safety criteria and testing process......

2.6 WP 7 Component Validation (Robotic Platform)......

2.6.1.1 KPIs to be used......

2.6.1.2 Benchmarking Tests to be carried out – Methodology......

2.6.1.3 Operational safety criteria and testing process......

3 Usability Evaluation Plan (Field Trials at User Sites)......

3.1 Scope of the first prototype evaluation......

3.2 Scope of the second prototype evaluation......

3.3 Key performance indicators......

3.4 Part A Expert evaluation......

3.4.1 Participation......

3.4.2 Technical requirements......

3.4.3 Aim......

3.4.4 Activities......

3.4.5 Outcome......

3.5 Part B Small focus groups......

3.5.1 Participation......

3.5.2 Technical requirements......

3.5.3 Aim......

3.5.4 Activities......

3.5.5 Outcome......

3.6 Part C Field trials of individual components with users......

3.6.1 Study Aims......

3.6.2 Participants......

3.6.3 Timescales......

3.6.4 Functions to be evaluated......

3.6.5 Session 1 (50 minutes) – at individual participants’ homes......

3.6.5.1 Informed consent and explanation of session (10 minutes)......

3.6.5.2 Pre-test interview (10 mins)......

3.6.5.3 Voice training (30 mins)......

3.6.6 Session 2 (50 minutes) – 2 focus groups (older people and carers)......

3.6.6.1 Explanation of session (10 minutes)......

3.6.6.2 Demonstration of PRU (30 minutes)......

3.6.6.3 Demonstration of WHSU (UK only) (10 minutes)......

3.6.7 Session 3 (90 minutes) – with individual participants at UK/NL test locations......

3.6.7.1 Test scenarios (45 mins)......

3.6.7.1.1 Primary users......

3.6.7.1.2 Secondary users......

3.6.7.2 Break (5 mins)......

3.6.8 Post-session discussion (40 mins)......

3.6.8.1 Level of user satisfaction......

3.6.8.2 Level of ease of use (ergonomic) of the input devices for the function......

3.6.8.3 User rating of function output/feedback (in relation to quality, utility and comprehensibility)......

3.6.8.4 Acceptance criteria......

3.6.9 Analysis Plan......

3.6.9.1 Data analysis after sessions......

3.6.9.1.1 Level of function usage by the user......

3.6.9.1.2 System response time for the component/function to user input (voice and touch)......

3.6.9.1.3 Success of system to adapt to change in environment (e.g. background noises, lighting)......

3.6.9.2 Video analysis after the sessions......

Ease of configurability of function settings......

3.6.9.3 Outcome......

3.7 Part D Hazard Analysis......

4 Responsibilities and Documentation......

4.1 WP leader responsibilities......

5 References......

6 Appendix......

6.1 MOBISERV Project Objectives......

6.2 Smart home environment......

6.3 Heuristics for Expert Evaluation......

6.4 Specification of the Evaluation Wizard......

6.4.1 Introduction......

6.4.2 Requirements for Wizard......

6.4.2.1 Global navigation functions......

6.4.2.2 Global HMI functions......

6.4.3 Specific functions......

6.4.3.1 Drinking......

6.4.3.2 Eating......

6.4.3.3 Front door......

6.4.3.4 Exercise......

6.4.3.5 Voice/video......

6.4.4 Proposed design of the Wizard interface......

6.4.5 Other issues......

Table of Figures

Figure 1: Relation between project deliverables......

Figure 2 USE and Validation as part of system development process......

Figure 3 Validation curves for a continuous video sequence from MOBISERV-AIIA database.

Figure 3 2D Layout - Smartest Home of The Netherlands in Eindhoven......

Figure 4 Possible wizard control panel layout

List of Tables

Table 1 Functional and Non-functional requirements......

Table 2 KPI Table WP4 Components......

Table 3 Confusion matrix for an LODO cross-validation experiment with 40 dynemes and classification accuracy rate 79.82%.

Table 4 KPI Measurement Table WP4 Components......

Table 5 KPI Table WP5 Components (Data Logger)......

Table 6 KPI Measurement Table WP5 Components (Data Logger)......

Table 7 KPI Table WP5 Components (Smart Garments)......

Table 8 KPI Measurement Table WP5 Components (Smart Garments)......

Table 9 KPI Table WP6 Components......

Table 10 KPI Measurement Table WP6 Components......

Table 11 KPI Table WP7 Components......

Table 12 KPI Measurement Table WP7 Components......

Table 13 Technical Characteristics of Kompai......

Table 14 Kompai Control Panel Components......

Table 15 Scope of focus groups......

Table 16 Target participants for field trials......

Table 17 Field trial timeline......

Table 18 Project Objectives with Targets......

Table 19 Heuristics for Expert evaluation......

Glossary

Term / Explanation
MOBISERV / An Integrated Intelligent Home Environment for the Provision of Health, Nutrition and Mobility Services to the Elderly
KPI / Key Performance Indicator
USE / Usability Evaluation
ILAEXP / Independent Living & Ageing & cross-industrial committee of experts
PRU / Physical Robotic Unit
WHSU / Wearable Health Support Unit

Executive Summary

This document, D2.2: MOBISERV Validation Plan - Issue 3, builds on the previous versions of D2.2 and D2.4 (available from the MOBISERV project website) by defining in more detail the VALIDATION PLAN for the MOBISERV system, its components and required processes. The results will be documented in protocols and separate reports that clearly specify whether the component, the process or the entire MOBISERV system meets the pre-defined project functional and non-functional requirements.

In the present approach, User Evaluation and Validation will be considered against identified qualitative and quantitative key performance indicators (KPIs) covering the functional and non-functional requirements identified at the start of the project as part of the user needs assessment (see document D2.3: Initial System Requirements Specification, Vols I and II) and the main MOBISERV project objectives outlined in the Appendix, section 6.1.

Controlled experiments for both laboratory and field-testing at user sites will support this validation process to measure the acceptability and usability of components and system functions. A complete list of criteria and methods is given in D2.4, along with descriptions in KPI tables in this document.

This version further updates the identified KPIs for the specific functional requirements that the ILAEXP committee deemed high priority in relation to the core MOBISERV aims and objectives. It also includes qualitative KPIs and specific KPIs for MOBISERV components. The third update of the plan will complete this list when the first context-aware and nutrition support prototype, the first coordination and communication system prototype and the robotic platform first prototype are available to Usability Evaluation and Validation experts.

In summary, the objectives of the Usability Evaluation and Validation process in MOBISERV are as follows:

  1. Create evidence-based documents that clearly specify whether the component, the process or the entire MOBISERV system meets the pre-defined project requirements (see D2.3 vols I and II).
  2. Identify possible hazards in the system (e.g. through a hazard analysis).
  3. Identify appropriate and measurable KPI for each of the MOBISERV system components (WP4, WP5 and WP6) as a stand-alone component and within the integrated overall MOBISERV system.
  4. Verify that the MOBISERV equipment and their integration meet the functional and non-functional requirements identified in D2.3 vols I and II.
  5. Show that MOBISERV can achieve its overall objective, which is to develop and use up-to-date technology in a coordinated, intelligent and easy-to-use way to support independent living of older persons.

1 Introduction

1.1 System and scope

This document presents the validation and field evaluation plans for the MOBISERV platform, an Integrated Intelligent Home Environment for the Provision of Health, Nutrition and Mobility Services to the Elderly. The system consists of various complex technical components and interacting processes that will be integrated into an automated system in a so-called ‘smart home environment’ in Eindhoven, NL (Appendix Figure 3), which will be one of the user evaluation sites; the other evaluations will be conducted in Bristol, UK. The main components of this system are:

  • an autonomous mobile personal robotic unit (PRU) with an interactive GUI
  • a multi-sensor system embedded in textiles for vital signs and activity monitoring
  • a context-aware and nutrition support system (monitoring system) and
  • a coordination and communication system

The present Validation Plan D2.2 is created based on the EU guideline for Good Manufacturing Practice (GMP)[1]. Annex 15 in this guideline outlines the principles for the Qualification and Validation of equipment and processes.

The plan will be maintained as a live document that will be reviewed throughout the project. Two updates of the initial plan (which was submitted at the end of the first quarter of the first year) were planned at time periods when MOBISERV components including their specific technical component specifications became available to Usability Evaluation (USE) experts. This is the second update provided in calendar month M21 when the first context-aware and nutrition support prototype, the first coordination and communication system prototype and the robotic platform first prototype are available. Figure 1 outlines these three deliverables (D2.2 Issue 1, D2.2 Issue 2 and D2.2 Issue 3) with respect to other deliverables of the current project.

Figure 1: Relation between project deliverables

1.2 Objectives and constraints

WP 2 of the overall project seeks to develop and employ a user-centred development process, where the key focus is to ensure that at each stage of the product design process there is a rigorous analysis and evaluation of each component from the end-user’s perspective. This is done to ensure that, by the time each component is integrated into the overall MOBISERV system, it provides the necessary functionality and is considered usable by the end users.

The objectives of the USE (Usability Evaluation) and Validation process can be summarised as follows:

  1. Create evidence-based documents that clearly specify whether the component, the process or the entire MOBISERV system meets the pre-defined project requirements (see D2.3 vols I, II and III).
  2. Identify possible hazards in the system (e.g. through a hazard analysis).
  3. Identify appropriate and measurable Key Performance Indicators (KPIs) for each of the MOBISERV system components (WP4, WP5 and WP6) as a stand-alone component and within the integrated overall MOBISERV system.
  4. Verify that the MOBISERV equipment and their integration meet the functional and non-functional requirements identified in D2.3.
  5. Show that MOBISERV can achieve its overall objective, which is to develop and use up-to-date technology in a coordinated, intelligent and easy-to-use way to support independent living of older persons.

1.3 Intended Audience

The Validation and Usability Evaluation Plan provides a means of communication to everyone associated with the project. All project members will be able to use this document as a guide to implementing the functional and non-functional requirements identified as part of the user needs assessment. All technical WP leaders have provided information for this document on how their individual and integrated components will be validated.

The document will be reviewed by the technology manager (TM) and committee and maintained by the USE (Usability Evaluation) representatives and USE management to ensure conformance to the stated processes and procedures and hence delivery against the stated objectives.

2 Qualification and Validation of MOBISERV system

2.1 Validation approach

As already outlined in section 1.2, WP2 aims to employ a user centred development process where the key focus is to ensure that at each stage in the product design process there is a rigorous analysis and evaluation of each component from the end-user’s perspective. The diagram below outlines a general product development process including highlighted stages which are part of our USE and Validation approach.

validation framework

Figure 2 USE and Validation as part of system development process

(adapted from Roozenburg et al., 1995)

Usability Evaluation and Validation will be considered against qualitative and quantitative key performance indicators (KPIs)[2] in relation to the functional and non-functional requirements identified at the start of the project as part of the user needs assessment (see document D2.3 vols I and II) and the main MOBISERV project objectives outlined in the Appendix, section 6.1. Functional requirements describe the behaviours (functions or services) of the system that support user/project goals, tasks and activities (Malan and Bredemeyer, 2001). A complete list of the identified functional requirements is given in document D2.3 Vols I and II. The main functional system requirements of the MOBISERV system, selected by the ILAEXP committee, are given below in Table 1. Non-functional requirements, on the other hand, are important properties and characteristics of the system, i.e. qualities that the users care about and that will therefore affect their degree of satisfaction with the system. These non-functional requirements were also identified within WP 2 as part of the user assessment (D2.3 vol II and D2.4).

Table 1 Functional and Non-functional requirements

Main Functional system requirements selected by ILAEXP / Selected Non-functional system requirements
Reminder and Encouragement to eat / Customisability and Support
Reminder and Encouragement to drink / Effectiveness and Support
Telemedicine/self-check platform / Security and privacy
Games for social and cognitive stimulation / Engagement and effectiveness
A mobile screen connected to the front door / Reliability
Response to call for help from user / Effectiveness
Voice/Video/SMS via Robot – to communicate with friends and relatives, and for a social caregiver to access remotely / Ease of communication
Encouragement to exercise / Adaptability to user’s individual patterns of behaviour
Report and communication to health professionals / Privacy and reliability

2.2 WP 4 Component Validation (Nutrition Support System)

2.2.1 KPIs and Validation Plan

AUTH has focused its research on eating and drinking activity detection and recognition. Regarding the functionalities of the Nutrition Support System, the eating and drinking detection and recognition subsystem is related to the ‘Reminder and Encouragement to eat’ and ‘Reminder and Encouragement to drink’ components.

Eating/Drinking activity recognition is performed by a module named NutritionActivityDetection, which will interact with a Microsoft Robotics Developer Studio (MRDS) service named NutritionActivity. Monitoring will be performed at pre-specified time intervals. Information about these time intervals will be provided by the older person or a secondary user (through a graphical user interface), stored in a database and retrieved by the NutritionAgenda module. The InteractionManager will inform the NutritionActivity service that the NutritionActivityDetection module should be initialized and start monitoring. Furthermore, the InteractionManager will interact with the NutritionActivity service in order to obtain information about the Eating/Drinking activity of the person under consideration. The NutritionActivityDetection module should be able to respond to questions set by the NutritionActivity service, giving a measure of confidence.
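The interaction described above can be outlined as follows. This is an illustrative Python sketch only: the actual modules are Microsoft Robotics Developer Studio services, and all method names beyond the module and service names taken from the text are hypothetical.

```python
# Illustrative sketch of the monitoring flow described above. The real
# components are MRDS services; method names other than the module and
# service names mentioned in the text are hypothetical.

class NutritionActivityDetection:
    """Detects eating/drinking and reports a confidence value in [0, 1]."""
    def start_monitoring(self):
        self.active = True

    def query(self, activity):
        # A real implementation would run the detection algorithm here;
        # this stub returns a fixed (detected, confidence) pair.
        return True, 0.85


class NutritionActivityService:
    """MRDS-style service wrapping the detection module."""
    def __init__(self, detector):
        self.detector = detector

    def initialize(self):
        # Triggered by the InteractionManager at an interval retrieved
        # by the NutritionAgenda module.
        self.detector.start_monitoring()

    def ask(self, activity):
        # The InteractionManager polls for activity information; the
        # answer comes back with a measure of confidence.
        return self.detector.query(activity)


service = NutritionActivityService(NutritionActivityDetection())
service.initialize()
detected, confidence = service.ask("drinking")
```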

For facial expression recognition, the developed module classifies a provided facial image, depicting a subject performing a facial action, into one of seven facial expression classes, namely: Anger, Disgust, Fear, Happiness, Sadness, Surprise and Neutral. For details, the reader is referred to D4.1 and D4.2.

2.2.1.1 KPIs to be used

In the validation plan we identify KPIs to give a qualitative and quantitative assessment of the performance of the Eating and Drinking Detection and Tracking Algorithms. The performance indicators will be based on various measures described below.

WP / Sub ID / System/Sub-Component / Name of KPI / Type of Measure / Expected Target Value / Nature of Expected Result
4 / KPI4_1 / Eating and Drinking Detection Algorithm / Accuracy Rate / Accuracy / Maximized (e.g. >80%) / number
4 / KPI4_2 / Eating and Drinking Detection Algorithm / Confusion Matrices / Classification Accuracy / Balanced accuracy between the classes (e.g. difference < 10%) / N×N matrix, N = number of classes
4 / KPI4_3 / Eating and Drinking Detection Algorithm / True Positives / Statistical Result / Maximized (e.g. >80%) / number
4 / KPI4_4 / Eating and Drinking Detection Algorithm / False Positives / Statistical Result / Minimized (e.g. <20%) / number
4 / KPI4_5 / Tracking Algorithm / Average Tracking Accuracy / Tracking Accuracy / Maximized (e.g. >0.6) / number
4 / KPI4_6 / Eating and Drinking Detection Algorithm / Measure of Confidence / Confidence of the activity detection results / Low confidence for false positives (e.g. <0.4), high confidence for true positives (e.g. >0.6) / number
4 / KPI4_7 / Facial Expression Recognition Algorithm / Accuracy Rate / Classification Accuracy / Maximized (e.g. >80%) / number

Table 2 KPI Table WP4 Components
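As an illustration of how such targets could be checked automatically during validation, the sketch below encodes a few of the example thresholds from Table 2. The measured values, and the choice of expressing percentages as fractions, are assumptions made purely for the example.

```python
# Minimal sketch: checking measured results against the example target
# values from Table 2. Thresholds are expressed as fractions (80% -> 0.80)
# and the measured values below are invented for illustration.
KPI_TARGETS = {
    "KPI4_1": lambda v: v > 0.80,  # accuracy rate: maximized (e.g. >80%)
    "KPI4_4": lambda v: v < 0.20,  # false positives: minimized (e.g. <20%)
    "KPI4_5": lambda v: v > 0.60,  # average tracking accuracy (e.g. >0.6)
}

measured = {"KPI4_1": 0.82, "KPI4_4": 0.15, "KPI4_5": 0.65}  # hypothetical
results = {kpi: passed(measured[kpi]) for kpi, passed in KPI_TARGETS.items()}
# results maps each KPI to True/False depending on its target check
```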

Accuracy Rate

The classification accuracy rate is calculated as the number of samples classified into the class to which they actually belong, divided by the total number of samples and multiplied by 100. It is used in the various cross-validation experiments conducted to evaluate our algorithms.
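The definition above translates directly into code; the labels in this sketch are illustrative only.

```python
# Classification accuracy rate as defined above: correctly classified
# samples divided by the total number of samples, times 100.
def accuracy_rate(true_labels, predicted_labels):
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Illustrative labels: 4 of 5 samples land in their actual class.
acc = accuracy_rate(["eat", "drink", "eat", "eat", "drink"],
                    ["eat", "drink", "eat", "drink", "drink"])
```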

Confusion Matrices

The confusion matrix shows, for each elementary activity video, the class in which it should be classified and the class to which it was finally assigned. This gives a clearer picture of the classification results.

Actual Label \ Label Found / eat / drink / apraxia
eat / 304 / 45 / 21
drink / 74 / 218 / 0
apraxia / 32 / 10 / 198

Table 3 Confusion matrix for an LODO cross-validation experiment with 40 dynemes and classification accuracy rate 79.82%.
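The overall accuracy rate quoted in the caption of Table 3 can be reproduced directly from the matrix, since correct classifications lie on its diagonal:

```python
# Reproducing the accuracy rate of Table 3 from its confusion matrix:
# sum of the diagonal (correct classifications) over the sum of all entries.
matrix = [
    [304, 45, 21],   # actual: eat
    [74, 218, 0],    # actual: drink
    [32, 10, 198],   # actual: apraxia
]
diagonal = sum(matrix[i][i] for i in range(len(matrix)))
total = sum(sum(row) for row in matrix)
accuracy = 100.0 * diagonal / total  # ~79.82%, matching the caption
```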

True Positives

A sample is identified as belonging to a class (the primary class) to which it actually belongs.

False Positives

A sample is identified as belonging to a class to which it does not actually belong. For example, an elementary activity video that does not depict eating is nevertheless classified into the ‘eating activity’ class.

Average Tracking Accuracy

Average Tracking Accuracy (ATA) is calculated by averaging the Frame Detection Accuracy (FDA) measure over all video frames. FDA calculates the overlap area between the ground truth object G and the detected object D at a given video frame t. It takes values from 0 (when the object is lost) to 1 (when detection is accurate).
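As a sketch of how FDA and ATA could be computed, assuming axis-aligned bounding boxes given as (x1, y1, x2, y2) and an intersection-over-union overlap measure; the text above does not fix the exact overlap formula, so IoU is an assumption here.

```python
# Sketch of FDA/ATA under two assumptions: boxes are axis-aligned tuples
# (x1, y1, x2, y2), and the per-frame overlap is intersection-over-union.
def fda(g, d):
    """Frame Detection Accuracy: overlap of ground truth g and detection d.

    Returns 0.0 when the object is lost and 1.0 for a perfect detection.
    """
    ix = max(0, min(g[2], d[2]) - max(g[0], d[0]))
    iy = max(0, min(g[3], d[3]) - max(g[1], d[1]))
    inter = ix * iy
    if inter == 0:
        return 0.0  # object lost in this frame
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(g) + area(d) - inter)

def average_tracking_accuracy(gt_frames, det_frames):
    """ATA: the FDA measure averaged over all video frames."""
    scores = [fda(g, d) for g, d in zip(gt_frames, det_frames)]
    return sum(scores) / len(scores)

# Perfect overlap in frame 1, object lost in frame 2.
ata = average_tracking_accuracy(
    [(0, 0, 10, 10), (0, 0, 10, 10)],
    [(0, 0, 10, 10), (20, 20, 30, 30)],
)
```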