ICT, FET Open

LIFT ICT-FP7-255951

Using Local Inference in Massively Distributed Systems

Collaborative Project

D 6.4

Project Workshop

Contractual Date of Delivery: 31.03.2013

Actual Date of Delivery: 08.05.2013

Author(s): Minos Garofalakis (TUC), Antonis Deligiannakis (TUC), Michael May (FHG), Christine Kopp (FHG), Michael Kamp (FHG)

Institution: FHG

Workpackage: WP 6

Security: PU

Nature: R

Total number of pages: 6

Project coordinator name: Michael May

Revision: 2

Project coordinator organisation name:
Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS)
Schloss Birlinghoven, 53754 Sankt Augustin, Germany
URL:

Abstract:

This document is the LIFT deliverable of WP6 for the third review period (01.10.2012 – 30.09.2013). The document contains an overview of two project-related workshops to be held in August 2013.

Revision history

Administration Status
Project acronym: LIFT
ID: ICT-FP7-255951
Document identifier: D 6.4 Project Workshop (01.10.2012 – 30.09.2013)
Leading Partner: FHG
Report version: 2 (workshop program and details added)
Report preparation date: 10.12.2013
Classification: PU
Nature: REPORT
Author(s) and contributors: Minos Garofalakis (TUC), Antonis Deligiannakis (TUC), Michael May (FHG), Christine Kopp (FHG), Michael Kamp (FHG)
Status:
- Plan
- Draft
- Working
- Final
x Submitted

Copyright

This report is © LIFT Consortium 2013. Its duplication is restricted to personal use within the consortium and the European Commission.

D6.4 – Project Workshop

Minos Garofalakis, Antonis Deligiannakis, Michael May, Christine Kopp

1 Introduction

In year three of LIFT, two project-related workshops will take place, each co-located with a major scientific conference in the area of data management and data mining. The first workshop, Big Dynamic Distributed Data, is co-located with VLDB 2013 and is organized by TUC. It focuses on the core topic of LIFT, namely complex computations over partitions of massive streaming data. The second workshop, Ubiquitous Data Mining, is jointly organized by João Gama and FHG. It is the third workshop in its series and in 2013 has a special focus on distributed streaming data.

2 First International Workshop on Big Dynamic Distributed Data (BD³ 2013)

Date and Location

August 30th, 2013, Riva del Garda (Trento), Italy (in conjunction with VLDB 2013)

Chairs

  • General Chairs: Minos Garofalakis, Antonios Deligiannakis
  • Program Chairs: Graham Cormode, Assaf Schuster, Ke Yi

Website

Program

08:30 - 09:00 Session 1: Invited Talk

Title: Streaming Balanced Partitioning of Massive Scale Graphs

Speaker: Milan Vojnovic, MSR Cambridge

09:00 - 10:00 Session 1: Keynote Speech

Title: Coding Theory for Large-Scale Storage

Speaker: Alex Dimakis, UT Austin

10:00 - 10:30 Coffee break

10:30 - 12:00 Session 2: Distributed Monitoring

Safe-Zones for Monitoring Distributed Streams

Daniel Keren (Haifa), Guy Sagy (Technion), Amir Abboud (Technion), Izchak Sharfman (Technion), Assaf Schuster (Technion), David Ben-David (Technion)

Communication-Efficient Distributed Online Prediction using Dynamic Model Synchronizations

Mario Boley (Fraunhofer IAIS), Izchak Sharfman (Technion), Daniel Keren (Haifa), Michael Kamp (Fraunhofer IAIS), Assaf Schuster (Technion)

Communication-efficient Outlier Detection for Scale-out Systems

Moshe Gabel (Technion), Daniel Keren (Haifa), Assaf Schuster (Technion)

12:00 - 13:30 Lunch

13:30 - 15:30 Session 3: CEP and Graphs

Elastic Complex Event Processing under Varying Query Load

Thomas Heinze (SAP AG), Yuanzhen Ji (SAP AG), Yinying Pan (SAP AG), Franz Josef Grüneberger (SAP AG), Zbigniew Jerzak (SAP AG), Christof Fetzer (TU Dresden)

Adaptive Selective Replication for Complex Event Processing Systems

Franz Josef Grüneberger (SAP AG), Thomas Heinze (SAP AG), Pascal Felber (Université de Neuchâtel)

Dynamic Partitioning of Big Hierarchical Graphs

Vasilis Spyropoulos (Athens University of Economics and Business), Yannis Kotidis (Athens University of Economics and Business)

Scalable and Robust Management of Dynamic Graph Data

Alan Labouseur (SUNY Albany), Paul Olsen (SUNY Albany), Jeong-Hyon Hwang (SUNY Albany)

15:30 - 16:00 Coffee break

16:00 - 18:00 Session 4: Stream Processing

Towards Elastic Stream Processing: Patterns and Infrastructure

Kai-Uwe Sattler (TU Ilmenau), Felix Beier (TU Ilmenau)

Task Graphs of Stream Mining Algorithms

Sayaka Akioka (Meiji University)

Large-scale Online Mobility Monitoring with Exponential Histograms

Christine Kopp (Fraunhofer IAIS), Michael Mock (Fraunhofer IAIS), Odysseas Papapetrou (Technical Univ. of Crete), Michael May (Fraunhofer IAIS)

Multi-Stage Malicious Click Detection on Large Scale Web Advertising Data

Leyi Song (East China Normal University), Xueqing Gong (East China Normal University), Xiaofeng He (East China Normal University), Rong Zhang (East China Normal University), Aoying Zhou (East China Normal University)

18:00 - 18:10 Closing Remarks

Workshop Proceedings

Workshop Description

As the amount of streaming data produced by large-scale systems such as environmental monitoring, scientific experiments and communication networks grows rapidly, new approaches are needed to effectively process and analyze such data. There are several promising directions in the area of large-scale distributed computation, that is, where multiple computing entities work together over partitions of the massive, streaming data to perform complex computations. Two important paradigms in this realm are continuous distributed monitoring (i.e., continually maintaining an accurate estimate of a complex query), and distributed and cluster-based systems that allow the processing of big, streaming data (e.g., IBM System S, Apache S4, and Twitter Storm).

The aim of the BD3 workshop is to bring together computer scientists with interests in this field to present recent innovations, find topics of common interest, and stimulate further development of new approaches to deal with massive, dynamic, and distributed data.

Topics of interest include (but are not limited to):

  • Novel architectures for BD3
  • Extensions to existing models for BD3
  • Algorithms for mining and analytics for BD3
  • Query processing in BD3
  • Efficient communication protocols for BD3
  • Languages and structures for BD3
  • Theoretical basis and hardness for BD3
  • Engineering case-studies in BD3
  • Position papers on challenges and new directions in BD3
  • Privacy issues in BD3
  • Energy efficiency and reliability in BD3
  • Scheduling and provisioning issues in BD3

The 1st International Workshop on Big, Dynamic, Distributed Data (BD3) took place on Friday 30/8, in conjunction with VLDB'2013 in Riva del Garda, Italy. (VLDB is the top international research forum on data management systems.) The workshop received 15 submissions, of which 11 papers were selected for presentation. Four of the accepted papers were by LIFT authors, so LIFT work took center stage at the workshop. Professor Alex Dimakis from UT Austin delivered a very interesting keynote talk on novel applications of coding theory in robust distributed storage. VLDB handled workshop registrations centrally, so people registered for all workshops on that day; there were 8 concurrent workshops taking place on 30/8. The BD3 workshop was very well attended, with 30-40 participants in the workshop throughout the day, and several interesting technical discussions taking place during both the sessions and the coffee breaks.

3 Workshop on Ubiquitous Data Mining (UDM 2013)

Date and Location

August 3rd-5th, 2013, Beijing, China (in conjunction with IJCAI 2013)

Chairs

  • João Gama, Michael May, Nuno Marques, Paulo Cortez

Website

Program

09:00 – 10:00 Invited Talk

Title: Exploiting Label Relationship in Multi-Label Learning

Speaker: Zhi-Hua Zhou

10:00 – 12:25 Paper Session 1

Predicting Globally and Locally: A Comparison of Methods for Vehicle Trajectory

William Groves, Ernesto Nunes and Maria Gini

On Recommending Urban Hotspots to Find Our Next Passenger

Luis Moreira-Matias, Ricardo Fernandes, Joao Gama, Michel Ferreira, João Mendes-Moreira and Luis Damas

Road-quality classification and bump detection with bicycle-mounted smartphones

Marius Hoffmann, Michael Mock and Michael May

Cicerone: Design of a Real-Time Area Knowledge-Enhanced Venue Recommender

Daniel Villatoro, Jordi Aranda, Marc Planagumà, Rafael Gimenez and Marc Torrent-Moreno

Visual Scenes Clustering Using Variational Incremental Learning of Infinite Generalized Dirichlet Mixture

Wentao Fan and Nizar Bouguila

12:30 – 13:30 Lunch

13:30 – 14:30 Invited Talk

Title: NIM: Scalable Distributed Stream Processing System on Mobile Network Data

Speaker: Wei Fan

14:30 – 17:00 Paper Session 2

Learning Model Rules from High-Speed Data Streams

Ezilda Almeida, Carlos A. Ferreira and João Gama

Simultaneous segmentation and recognition of gestures for human-machine interaction

Harold Vasquez, Luis Enrique Sucar and Hugo Jair Escalante

Ubiquitous Self-Organizing Maps

Bruno Silva and Nuno C. Marques

Simulating Price Interactions by Mining Multivariate Financial Time Series

Bruno Silva, Luis Cavique and Nuno C. Marques

Trend template: mining trends with a semi-formal trend model

Olga Streibel

17:00 Closing

Workshop Proceedings

Workshop Description

Two major technological evolutions modify our relationship with our environment:

  • Widely available and cheap computer power. Simple objects that surround us are gaining sensors, computational power, and actuators, and are changing from static into adaptive and reactive systems.
  • The explosion of networks of all kinds offers new possibilities for the development and self-organization of communities.

The new characteristics of data reflect a World in Movement:

  • Time and space. The objects of analysis exist in time and space. Often they are able to move.
  • Dynamic environment. These objects exist in a dynamic and unstable environment, evolving incrementally over time.
  • Information processing capability. The objects are endowed with information processing capabilities.
  • Locality. The objects never see the global picture - they know only their local spatio-temporal environment.
  • Real-Time. The models have to evolve incrementally in correspondence with the evolving environment.
  • Distributed. The object will be able to exchange information with other objects, thus forming a truly distributed environment.

Ubiquitous Data Mining (UDM) uses Data Mining techniques to extract useful knowledge from data with these characteristics. The goal of this workshop is to convene researchers (from both academia and industry) who deal with techniques such as decision rules, decision trees, association rules, clustering, filtering, learning classifier systems, neural networks, support vector machines, preprocessing, postprocessing, feature selection, and visualization techniques for UDM and related themes.

Topics include but are not restricted to:

  • Adaptive Data Mining
  • Distributed Data Mining
  • Distributed Data Streams
  • Grid Data Mining
  • Learning in Ubiquitous Environments
  • Learning from Sensor Networks
  • Learning from Social Networks
  • Visualization Techniques for UDM
  • Incremental On-line Learning Algorithms
  • Single-Pass and Scalable Algorithms
  • Learning in Distributed Neural Network Systems
  • Real-Time and Real-World Applications
  • Resource-aware UDM
  • Theoretical Frameworks for UDM

The Workshop on Ubiquitous Data Mining took place on Saturday 3/8, in conjunction with IJCAI’2013 in Beijing, China. The workshop received 16 submissions, of which 8 papers were selected for presentation by a scientific committee. The UDM workshop was well attended, with around 25 participants in the workshop throughout the day, and several interesting technical discussions taking place during both the sessions and the breaks.
