Tutorials

Parosh Abdulla, Uppsala University

Small Models in Parameterized Verification

The behavior of many types of computer systems can be described using one or more parameters whose values vary over specified ranges. The goal of parameterized verification is to prove correctness of the system for all such values. The parameters may relate to the size or topology of a network, to data types over which a system is constructed, to initial values of variables, etc. There are numerous applications where parameterized systems appear naturally, such as hardware circuits, multithreaded programs, distributed systems and communication protocols. For instance, the specification of a mutual exclusion protocol may be parameterized by the number of processes participating in a given session of the protocol, the operation of a set of communicating processes may depend on the number of pending messages, and the behavior of a concurrent library may depend on the number of threads sharing the library. In all these cases, it is important to verify correctness independently of the size of the given parameter.

In this tutorial, we present a simple yet general “small model” approach to parameterized verification. The method typically inspects a small number of parameter values in order to show correctness of the whole system. It relies on an abstraction scheme that views the system from the perspective of a fixed number of components. The abstraction is used during the verification procedure in order to dynamically detect cut-off points beyond which the search of the state space need not continue. We show that the method is complete for a large class of well quasi-ordered systems including Petri nets. We also describe experimentation on a variety of benchmarks to demonstrate that the method is highly efficient and that it works well even for classes of systems with undecidable verification problems.
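The cut-off idea above can be sketched on a toy example. The system below (a hypothetical token-ring mutex, not taken from the tutorial) is analyzed from the perspective of two components: we grow the number of processes n until the set of reachable two-process "views" stops changing, which gives a cut-off beyond which larger instances add no new views.

```python
from itertools import combinations

# Toy parameterized system (illustrative only): a token-ring mutex with n
# processes.  Each process is 'idle' or 'crit'; the token index says which
# process may enter the critical section.
def reachable(n):
    init = (('idle',) * n, 0)           # all idle, token at process 0
    seen, frontier = {init}, [init]
    while frontier:
        locals_, tok = frontier.pop()
        if locals_[tok] == 'idle':      # token holder enters the critical section
            l = list(locals_); l[tok] = 'crit'
            succ = (tuple(l), tok)
        else:                           # token holder leaves and passes the token
            l = list(locals_); l[tok] = 'idle'
            succ = (tuple(l), (tok + 1) % n)
        if succ not in seen:
            seen.add(succ); frontier.append(succ)
    return seen

def views(n, k=2):
    """Project every reachable global state onto all k-subsets of processes."""
    vs = set()
    for locals_, _ in reachable(n):
        for idx in combinations(range(n), k):
            vs.add(tuple(locals_[i] for i in idx))
    return vs

# Grow n until the set of two-process views stabilizes: a cut-off point.
n = 2
while views(n) != views(n + 1):
    n += 1
print("cut-off at n =", n)
# No view is ('crit', 'crit'), so mutual exclusion holds for every size.
assert all(v != ('crit', 'crit') for v in views(n))
```

For this toy protocol the views stabilize immediately, so inspecting tiny instances suffices; the tutorial's method applies the same principle to far richer classes of systems.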

As an application of parameterized verification, we describe a framework for automated verification of linearizability for concurrent data structures that implement sets, stacks, and queues. To automate verification, we apply small model reasoning to bound the number of considered threads, and use a novel symbolic representation for unbounded heap structures that store data from an unbounded domain.

The tutorial does not assume any prior knowledge of parameterized verification.

The presentation is based on previous work with Giorgio Delzanno, Frederic Haziza, Lukas Holik, Bengt Jonsson, Ahmed Rezine, and Cong Quy Trinh.

Vitaly Chipounov, Cyberhaven and Zhenkun Yang, Intel

The S2E Platform: Design, Implementation, and Applications

In this tutorial, we will present several years of experience using the S2E platform in both academic and industrial environments. We will show a diverse set of tools that practitioners built on top of S2E, such as tools for finding security vulnerabilities in firmware and device drivers, testing filesystem code, verifying networking code, and analyzing code written in interpreted languages. You will see live demos showing S2E in action, from analyzing Microsoft Office on Windows to finding security holes in various IoT and system binaries.

S2E is a platform for in-vivo analysis of software systems that combines a virtual machine with symbolic execution. Users install and run in S2E any unmodified x86 or ARM software stack, including programs, libraries, the OS kernel, and drivers. Symbolic execution then automatically explores hundreds of thousands of paths through the system, allowing users to check desired properties even in corner-case situations and at any level of the software stack. S2E works on unmodified binaries and requires neither writing code in special languages nor modeling the environment.
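To give a flavor of the path exploration just described, here is a deliberately tiny sketch (not S2E's API): each feasible path through a small program is characterized by a path condition over its input, and a satisfying input is found for each one. A real engine such as S2E executes whole binary stacks and discharges path conditions with a constraint solver; this sketch simply brute-forces a witness over one symbolic byte.

```python
# Toy illustration of path conditions and test generation (hypothetical
# program; not S2E's interface).

def program(x):
    """The program under test: returns a label identifying the path taken."""
    if x > 100:
        if x % 2 == 0:
            return "big-even"
        return "big-odd"
    return "small"

def explore():
    # One path condition (a list of predicates over the input) per feasible
    # path; a symbolic executor would collect these by forking at branches.
    conds = [
        [lambda x: x > 100, lambda x: x % 2 == 0],
        [lambda x: x > 100, lambda x: x % 2 != 0],
        [lambda x: not (x > 100)],
    ]
    paths = []
    for pc in conds:
        # Stand-in for an SMT solver: brute-force a witness byte.
        witness = next((x for x in range(256)
                        if all(p(x) for p in pc)), None)
        if witness is not None:
            paths.append((witness, program(witness)))
    return paths

for witness, label in explore():
    print(f"input {witness:3d} drives path {label!r}")
```

Each generated input exercises one distinct path, which is how such exploration reaches corner cases that random testing tends to miss.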

We will discuss how S2E addresses scalability challenges that users face when building analysis tools. In particular, we will show how techniques such as fuzzing, concolic execution, state merging, function summaries, execution consistency models, static analysis, hardware acceleration, etc. can be used and combined in S2E in order to build analysis tools that scale to entire binary software stacks.

Paulo Tabuada, University of California at Los Angeles

Synthesizing Robust Cyber-Physical Systems

Programming Cyber-Physical Systems (CPS) is notoriously difficult due to the tight interactions between cyber and physical components. Yet, some of the most exciting technological developments in the last decade have resulted in fairly complex CPS such as self-driving cars or humanoid robots that are able to walk and run in unstructured environments. Rather than focusing on the verification of CPS, this tutorial will focus on the synthesis of software for complex CPS from high-level specifications. We will discuss how the differential equation models for physical components can be abstracted to finite-state models and how software synthesized for these abstractions can be refined to software acting on the physical components. We will also discuss how properties are preserved by this abstraction-based approach to synthesis so that we understand the extent to which specifications are enforced by the synthesized software. In addition to satisfying the specifications by construction, the synthesized software should satisfy them robustly: small deviations from the environment assumptions should lead only to small deviations from the specified behavior. We will discuss different ways of specifying and synthesizing software for robustness. The theoretical results will be illustrated with concrete case studies such as a bipedal robot for which we synthesized the software making it walk.
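The abstraction-based workflow above can be sketched in miniature (all dynamics, partitions, and parameters here are made up for illustration): scalar sampled dynamics x' = 0.5x + u are abstracted into a finite graph over interval cells, using interval arithmetic so every concrete successor is covered, and a controller keeping the state in a safe set is then synthesized on the finite abstraction.

```python
# A minimal sketch of abstraction-based controller synthesis; the dynamics
# x(t+1) = 0.5*x(t) + u with u in {-0.25, 0.0, 0.25} are hypothetical.

CELLS = [(i / 10, (i + 1) / 10) for i in range(10)]   # ten cells covering [0, 1]
INPUTS = [-0.25, 0.0, 0.25]

def safe_succ(cell, u):
    """Interval image of a cell under input u, as the set of cells it overlaps.
    Returns None if the image leaves the modeled domain [0, 1]."""
    lo, hi = cell
    nlo, nhi = 0.5 * lo + u, 0.5 * hi + u
    if nlo < 0 or nhi > 1:
        return None
    return {i for i, (clo, chi) in enumerate(CELLS)
            if nhi > clo and nlo < chi}

def controllable_with(c, u, ctrl):
    s = safe_succ(CELLS[c], u)
    return s is not None and s <= ctrl

# Synthesize, per cell, an input keeping the state inside the safe set
# [0.2, 0.8] (cells 2..7) forever: greatest fixed point of controllability.
SAFE = set(range(2, 8))
controllable = set(SAFE)
changed = True
while changed:
    changed = False
    for c in list(controllable):
        if not any(controllable_with(c, u, controllable) for u in INPUTS):
            controllable.discard(c)
            changed = True

controller = {c: next(u for u in INPUTS if controllable_with(c, u, controllable))
              for c in controllable}
print("controllable cells:", sorted(controllable))
```

Because the abstraction over-approximates the continuous successors, any behavior enforced on the finite model is preserved when the synthesized controller is refined back to the physical system, which is the property-preservation point made above.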

Martin Vechev and Pavol Bielik, ETH Zurich

Machine Learning for Programs

The increased availability of massive codebases, sometimes referred to as “Big Code”, creates a unique and emerging opportunity for new kinds of programming techniques and tools based on probabilistic models. Similar to how deep learning systems work for other domains (e.g., computer vision), these approaches will extract useful information from existing codebases and use that information to provide statistically likely solutions to problems that are difficult or impossible to solve with traditional techniques. An example of such a system is JSNice (JSNice.org) — a tool we recently released and one that can statistically de-minify JavaScript code and predict useful semantic properties of programs. By now, the system has become quite popular and is currently being used by more than 100,000 users worldwide. In this tutorial, we will discuss how to build such probabilistic programming systems. The tutorial is self-contained and will include both: (i) an introduction to several broadly applicable machine learning models suitable for learning from programs, such as probabilistic graphical models, deep neural nets, decision trees and probabilistic grammars (including ways to train and perform inference with these), and (ii) a hands-on session where participants will apply the theory to build a probabilistic programming system using the recently released Nice2Predict.org framework (upon which JSNice is built). In the process, we will highlight connections to classic program analysis and synthesis as well as the interplay with probabilistic programming, and discuss several new research directions. By the end of the tutorial, the participant should have: (i) learned the fundamentals, pros and cons of several machine learning models, (ii) learned how to combine these models with programming languages concepts, and (iii) learned how to build a statistical programming tool using the concepts in the tutorial.
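A toy flavor of the prediction task can be given in a few lines (all training data here is made up, and this is a crude count-based stand-in for the structured probabilistic models the tutorial covers): learn, from a tiny "codebase", which variable name is most likely given the operations the variable flows into, then predict names for minified identifiers.

```python
from collections import Counter, defaultdict

# Illustrative training pairs: (original variable name, features observed
# around it).  A real system extracts such features from millions of files.
training = [
    ("i",    ["length", "charAt"]),
    ("i",    ["length", "push"]),
    ("name", ["toUpperCase", "trim"]),
    ("name", ["trim", "charAt"]),
]

counts = defaultdict(Counter)          # feature -> Counter of names seen with it
for name, features in training:
    for f in features:
        counts[f][name] += 1

def predict(features):
    """Score each candidate name by summed feature co-occurrence counts."""
    scores = Counter()
    for f in features:
        scores.update(counts[f])
    return scores.most_common(1)[0][0]

print(predict(["length", "push"]))       # a minified 'a' used like a loop index
print(predict(["trim", "toUpperCase"]))  # a minified 'b' used like a string name
```

Systems like JSNice replace these raw counts with trained probabilistic graphical models that predict all names in a program jointly, which is what makes the predictions consistent across an entire file.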