This year, the STAF workshops are proud to feature the following outstanding keynote speakers:
- Mirco Musolesi: Mining Big (and Small) Mobile Data for Social Good
- Emanuela Merelli: The Topological Field Theory of Data: a program towards a novel strategy for data mining through data language
- Gul Agha: Probabilistic Programming, Estimation, and Euclidean Model Checking for Aggregate Behavior of Concurrent Systems
- Adelinde Uhrmacher: Modeling and simulation of collective adaptive systems: a case for self-adaptive software and domain specific languages
- Mirko Viroli: Resiliency once and for all with Aggregate Computing
- Michael Sedlmair: Human-centered Methods in Visualization Research
- Daniel Ratiu: Enabling Software Verification for Practicing Engineers with Domain Specific Languages
- Martin Gogolla: Continuing a Benchmark for UML and OCL Design and Analysis Tools
- Sumit Gulwani: Spreadsheet Programming Using Examples
Mining Big (and Small) Mobile Data for Social Good
Mirco Musolesi, University College London, UK
An increasing amount of data describing people’s behaviour is collected by means of applications running on smartphones or directly by mobile operators through their cellular infrastructure. This information is extremely valuable for marketing applications, but it also has incredible potential to benefit society as a whole, thanks to applications in a variety of fields, from healthcare to transportation, from geodemographics to national security. In particular, mobile data can be extremely valuable for developing and evaluating quantitative models of human behaviour, which can be used as a basis for the development of intelligent mobile systems. In this talk I will analyze the challenges and opportunities in using big (and small) data for applications of high societal and commercial impact, discussing the current work of my lab in the area of mobile data mining and anticipatory mobile computing. The scope of my talk will be broad, encompassing both modelling and systems-oriented issues.
Mirco Musolesi is a Reader in Data Science at the Department of Geography at University College London and a Faculty Fellow at the Alan Turing Institute, the UK National Institute for Data Science. He received a PhD in Computer Science from University College London and a Master's degree in Electronic Engineering from the University of Bologna. He has held research and teaching positions at Dartmouth College, Cambridge, St Andrews, and Birmingham. He is a computer scientist with a strong interest in sensing, modelling, understanding, and predicting human behaviour and dynamics in space and time, at different scales, using the “digital traces” we generate daily in our online and offline lives. He is interested in developing mathematical and computational models as well as implementing real-world systems based on them. This work has applications in a variety of domains, such as intelligent systems, ubiquitous computing, networked systems, healthcare, security & privacy, and data analytics for “social good”.
The Topological Field Theory of Data: a program towards a novel strategy for data mining through data language
Emanuela Merelli, University of Camerino, Italy
We aim to challenge the current thinking in IT about the “Big Data” question, proposing a program to construct an innovative methodology for data analytics that returns an automaton as a recognizer of the data language: a Field Theory of Data. We suggest building, directly out of probing the data space, a theoretical framework enabling us to extract the manifold hidden relations (patterns) that exist among data, as correlations depending on the semantics generated by the mining context. The program, which is grounded in recent innovative ways of integrating data into a topological setting, proposes the realization of a Topological Field Theory of Data: it transfers and generalizes to the space of data notions inspired by physical (topological) field theories, and it harnesses the theory of formal languages to define the potential semantics necessary to understand the emerging patterns.
Emanuela Merelli is full professor of Computer Science at the University of Camerino. She founded the BioShape laboratory, a multidisciplinary environment where young researchers and students work on conceiving new formal and computational methods to “decode” the behavior of complex biological systems. She has been the coordinator of the TOPDRIM FP7-FET project and a partner in the LITBIO (Laboratory for Interdisciplinary Technologies in Bioinformatics) Italian FIRB project. Currently, she is working to launch a new program on the topological field theory of data, with the aim of paving the way for synthesizing automata as recognizers of languages derived from topological data spaces and for understanding the complexity of some computing paradigms. She has published many papers in refereed international journals and is continuously involved in the organization of interdisciplinary events.
Probabilistic Programming, Estimation, and Euclidean Model Checking for Aggregate Behavior of Concurrent Systems
Gul Agha, University of Illinois at Urbana-Champaign, US
In many real-world applications, such as those involving biological systems or big data, only probabilistic “guarantees” of approximate behavior are feasible. Moreover, we are often interested in the aggregate behavior of such systems. The talk will describe probabilistic programming and methods for estimation which can guide the execution of probabilistic programs. Such methods are based on sampling the behavior of a concurrent system and are closely connected to statistical model checking. I will then present Euclidean model checking, a method for analyzing quantitative properties of systems. The approach will be illustrated with applications such as pharmacokinetics and sensor networks.
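The sampling-based estimation mentioned above can be illustrated with a minimal statistical-model-checking sketch. The model (a biased random walk standing in for a stochastic concurrent system), the property, and all names below are invented for illustration and are not from the talk:

```python
import math
import random

def sample_run(steps=50, p_up=0.55):
    """One bounded run of a toy stochastic model: a biased random walk."""
    x = 0
    for _ in range(steps):
        x += 1 if random.random() < p_up else -1
    return x

def satisfies(final_x, threshold=0):
    """Bounded-horizon property: the walk ends above the threshold."""
    return final_x > threshold

def estimate(epsilon=0.02, delta=0.01):
    """Hoeffding bound: n >= ln(2/delta) / (2*eps^2) samples suffice for an
    estimate within eps of the true probability with confidence >= 1 - delta."""
    n = math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))
    hits = sum(satisfies(sample_run()) for _ in range(n))
    return hits / n, n

random.seed(0)
p_hat, n = estimate()
print(f"P(property) ~ {p_hat:.3f} from {n} samples")
```

The Hoeffding-style sample-size bound is what makes the estimate a statistical "guarantee": the probability of the property is approximated to within a chosen error at a chosen confidence, without enumerating the state space.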
Gul Agha is Professor of Computer Science at the University of Illinois at Urbana-Champaign. Agha is a Fellow of the IEEE and a recipient of the IEEE Computer Society Meritorious Service Award. He served as Editor-in-Chief of IEEE Concurrency: Parallel, Distributed and Mobile Computing (1994-98), and of ACM Computing Surveys (2000-2007). While Agha is best known for his work on the formalization of the Actor Model and the development of actor languages, he has also led his research group in pioneering research on statistical model checking, concolic testing, computational learning for verification, and in developing a computational model for the energy complexity of parallel algorithms. Agha is the co-director of the interdisciplinary Illinois Structural Health Monitoring Project, which has developed sensor networks to monitor civil infrastructure, and a co-founder of Embedor Technologies, which is using actor-inspired distributed computing to enable continuous monitoring of bridges for improved maintenance and anomaly detection.
Modeling and simulation of collective adaptive systems: a case for self-adaptive software and domain specific languages
Adelinde Uhrmacher, University of Rostock, Germany
We interpret collective adaptive systems as systems that operate on different organizational levels and whose structures, in terms of components and interactions, change over time. These systems pose specific challenges for modeling as well as simulation methods. First, modeling languages have to be sufficiently expressive to capture the essentials of collective adaptive systems. We present the multi-level modeling languages ML-Rules, ML-Space, and ML3, and discuss the features that facilitate developing and maintaining compact models of collective adaptive systems. Second, experimentation languages like SESSL can facilitate specifying, executing, and documenting complex “in-silico” experiments, e.g., parameter scans, statistical model checking, or optimization experiments comprising multiple executions and analysis steps.
However, executing experiments with complex models comes at a price. For example, in the case of ML-Rules, allowing arbitrary functions not only on attributes and content but also for specifying reaction kinetics on the fly prevents us from taking shortcuts that simulators for less expressive languages can take. A family of execution algorithms, from which a suitable algorithm can be selected and configured on demand before or during simulation, enables efficient execution. Intelligent strategies for automatically selecting and configuring algorithms come in handy not only for executing models, but also for identifying and configuring suitable methods for the diverse analysis steps involved in simulation experiments. A prerequisite for this kind of self-adaptation is a clear separation of concerns and a component-based software design, as realized in the modeling and simulation framework James II.
Thus, just as domain-specific languages support users in developing and maintaining models of collective adaptive systems, and in specifying, executing, and documenting experiments, flexible, self-adaptive simulation environments contribute to more efficient and effective experimentation with these models.
Adelinde M. Uhrmacher is Professor at the University of Rostock, Germany, and head of the Modeling and Simulation Group. Her research is on discrete-event modeling and simulation methods and their applications in different areas, like demography and cell biology. Her research comprises the development of modeling languages and simulation algorithms, support for simulation experimentation, as well as the realization of simulation systems. She has been involved in the organization of various modeling and simulation conferences, e.g., as program chair of the Winter Simulation Conference 2012. She has been, and is, a member of the editorial boards of several simulation-related journals. She was editor-in-chief of SCS Simulation from 2000 until 2006. Since 2013 she has served as editor-in-chief of the ACM Transactions on Modeling and Computer Simulation (TOMACS).
Resiliency once and for all with Aggregate Computing
Mirko Viroli, University of Bologna, Italy
One of the difficulties in engineering collective adaptive systems is the challenge of simultaneously engineering the desired resilient behaviour of the collective, as well as the details of implementation on individual devices. Aggregate computing simplifies this problem by separating these aspects into different layers of abstraction by means of a unifying notion of computational field, and a computational model inspired by the functional paradigm. We review the state of the art in aggregate computing, discuss the many resiliency properties it supports, and develop a roadmap of foundational problems still needing to be addressed in the continued development of this emerging discipline.
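The notion of a computational field can be illustrated with the canonical "gradient" example from the aggregate-computing literature: every device repeatedly takes the minimum over its neighbours' field values plus the link distance, while sources hold zero. The sketch below is a hedged, synchronous simulation of that idea; the network and all names are invented for illustration:

```python
import math

def gradient(neighbours, sources, rounds=20):
    """Compute a gradient field over a network.

    neighbours: {node: [(neighbour, link_distance), ...]}
    sources: set of nodes holding value 0.
    """
    field = {n: 0.0 if n in sources else math.inf for n in neighbours}
    for _ in range(rounds):
        field = {
            n: 0.0 if n in sources else min(
                (field[m] + d for m, d in neighbours[n]), default=math.inf)
            for n in neighbours
        }
    return field

# A small line network a - b - c - d with unit-distance links.
net = {"a": [("b", 1)], "b": [("a", 1), ("c", 1)],
       "c": [("b", 1), ("d", 1)], "d": [("c", 1)]}
print(gradient(net, {"a"}))  # {'a': 0.0, 'b': 1.0, 'c': 2.0, 'd': 3.0}
```

The resiliency discussed in the talk shows up here as self-healing: if the source set or the topology changes, re-running the same local update rule makes the field reconverge to the correct distances without any device-specific code.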
Mirko Viroli is Associate Professor at the University of Bologna, Italy, where he received his M.Sc. and Ph.D. degrees in Computer Science Engineering in 1997 and 2003, respectively. He is an expert in programming languages, computational models, and the engineering of self-adaptive and self-organising systems, and has written over 200 articles on such topics. He is a member of the Editorial Board of the “Knowledge Engineering Review” (Cambridge University Press), and was program chair of ACM SAC 2008 and 2009, IEEE SASO 2014, and IFIP COORDINATION 2015.
Human-centered Methods in Visualization Research
Michael Sedlmair, University of Vienna, Austria
Visualization systems provide a unique way for people to interact with, explore, and better understand their data. Visualization systems might, for instance, help medical doctors to base tricky medication decisions on historic patient data; journalists to get a quick overview over large sets of documents; or economists to better understand stock trends. As visualization systems target human users, closely involving them in design and research processes is crucial. While this importance has been acknowledged many times, realizing such a human-centered focus in day-to-day research practices is challenging.
This talk will shed some light on existing human-centered methods used in visualization research, as well as the challenges that come with them. To illustrate these aspects, I will present case studies from several visualization research projects, such as my own 3.5-year collaboration with automotive engineers at BMW. The ultimate goal of the talk is to help build a bridge between the human-centered challenges faced in visualization research and those in formal methods.
Michael Sedlmair is an assistant professor at the University of Vienna, where he works at the intersection of human-computer interaction, visualization, and data analysis. His specific research interests focus on studying visual data analysis techniques, tools, and users in real-world settings, and innovating and refining human-centered research methods and methodologies for the area of data analysis. Previously, he worked at the University of British Columbia, University of Munich, and the BMW Group Research and Technology.
Enabling Software Verification for Practicing Engineers with Domain Specific Languages
Daniel Ratiu, Siemens AG, Germany
Despite the maturity of current verification techniques, formal verification tools have not yet found their way into daily practice. The main reason for the low adoption is related to pragmatic aspects such as usability or the cost of applying formal verification (e.g. specifying properties, running the analyses, interpreting the results). For a large majority of developers, formal verification techniques are seen as expert tools rather than engineering tools that can be used on a daily basis. This is mostly the case in the context of mainstream systems (e.g. automotive, medical, industrial automation), where pragmatics (e.g. personnel skills, cost structures, deadlines, existing processes) play a major role. I will present our approach and experience in tackling some of these challenges with the help of domain specific languages in the mbeddr project and its extensions at Siemens.
Daniel Ratiu works as a researcher, coach, and software engineer for Siemens Corporate Technology in Munich. His focus is on robust software development in the context of dependable systems with the help of domain specific languages and formal verification. Before joining Siemens, Daniel was a research group lead at the fortiss research institute in Munich. In 2009 he obtained a PhD degree from the Technische Universität München.
Continuing a Benchmark for UML and OCL Design and Analysis Tools
Martin Gogolla, University of Bremen, Germany
UML and OCL are frequently employed languages in model-based engineering. OCL is supported by a variety of design and analysis tools with different scopes, aims, and technological cornerstones. The spectrum ranges from formal proof techniques to testing approaches, from validation to verification, and from logic programming and rewriting to SAT-based technology. The talk presents steps towards a well-founded benchmark for assessing UML and OCL validation and verification techniques. It puts forward a set of UML and OCL models together with particular questions centered around OCL and the notions of consistency, independence, consequences, and reachability. Furthermore, aspects of integer arithmetic and aggregation functions (in the spirit of SQL functions such as COUNT or SUM) are discussed. The claim of the talk is not to present a complete benchmark; rather, the intention is to continue developing further UML and OCL models and accompanying questions within the modeling community, with the aim of arriving at a generally accepted benchmark.
Martin Gogolla is professor for Computer Science at University of Bremen, Germany and is the head of the Research Group Database Systems.
Spreadsheet Programming Using Examples
Sumit Gulwani, Microsoft Research
99% of spreadsheet users do not know programming and struggle with repetitive data-wrangling tasks such as extracting tabular data from text files, string transformations, and table re-formatting. Programming by Examples (PBE) can revolutionize this landscape by enabling users to synthesize intended programs from example-based specifications.
A key technical challenge in PBE is to search for programs that are consistent with the examples provided by the user. Our efficient search methodology is based on two key ideas: (i) restriction of the search space to an appropriate domain-specific language that offers balanced expressivity and readability, and (ii) a divide-and-conquer-based deductive search paradigm that inductively reduces the problem of synthesizing a program of a certain kind satisfying a given specification to sub-problems that refer to sub-programs or sub-specifications.
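The deductive reduction in (ii) can be sketched with a toy synthesizer over an invented three-atom DSL (initials, tokens, constants). This is only an illustration of the divide-and-conquer idea, not the actual PROSE/FlashFill algorithm or DSL; all names below are made up:

```python
def atoms(inp):
    """Enumerate atomic programs and the string each produces on `inp`."""
    toks = inp.split()
    progs = [("const('.')", ".")]
    for i, t in enumerate(toks):
        progs.append((f"initial({i})", t[0].upper()))  # upper-cased initial
        progs.append((f"token({i})", t))               # whole token
    return progs

def synthesize(examples, depth=6):
    """Deductively reduce 'produce the output' to 'produce a prefix, then the rest'."""
    if all(out == "" for _, out in examples):
        return []                      # nothing left to produce: empty program
    if depth == 0:
        return None                    # give up on this branch
    for name, _ in atoms(examples[0][0]):
        produced, ok = [], True
        for inp, out in examples:
            p = next((v for n, v in atoms(inp) if n == name), None)
            if p is None or p == "" or not out.startswith(p):
                ok = False             # atom must produce a prefix on EVERY example
                break
            produced.append(p)
        if ok:
            rest = [(inp, out[len(p):])
                    for (inp, out), p in zip(examples, produced)]
            tail = synthesize(rest, depth - 1)   # sub-problem on the residual spec
            if tail is not None:
                return [name] + tail
    return None

def run(program, inp):
    table = dict(atoms(inp))
    return "".join(table[name] for name in program)

prog = synthesize([("john smith", "J.S."), ("ada lovelace", "A.L.")])
print(prog)                       # ['initial(0)', "const('.')", 'initial(1)', "const('.')"]
print(run(prog, "grace hopper"))  # G.H.
```

Each recursive call is exactly the reduction described in the abstract: synthesizing a program satisfying the full specification is reduced to finding a sub-program for a prefix and a sub-program for the residual sub-specification.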
Another challenge in PBE is resolving the ambiguity in the example-based specification. We will discuss two complementary approaches: (a) machine-learning-based ranking techniques that can pick an intended program from among those that satisfy the specification, and (b) active-learning-based user interaction models.
The above concepts will be illustrated using various PBE technologies including FlashFill, FlashExtract, and FlashRelate. These technologies have been released inside various Microsoft products including Excel. The Microsoft PROSE SDK allows easy construction of such technologies.
Sumit Gulwani is a Research Manager and Principal Researcher at Microsoft, and an affiliate faculty member in the Computer Science Department at the University of Washington (USA). He has expertise in formal methods and automated program analysis and synthesis techniques. His recent research interests lie in the cross-disciplinary areas of automating end-user programming and building intelligent tutoring systems (for various subject domains including programming, math, logic, and automata). His programming-by-example work led to the Flash Fill feature of Microsoft Excel 2013, which is used by hundreds of millions of people. He was awarded the ACM SIGPLAN Robin Milner Young Researcher Award in 2014. He obtained his PhD in Computer Science from UC Berkeley in 2005, and was awarded the ACM SIGPLAN Outstanding Doctoral Dissertation Award. He obtained his BTech in Computer Science and Engineering from IIT Kanpur (India) in 2000, and was awarded the President’s Gold Medal.