Session 1: Distributed Transactional Memory Platforms
Title: When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Data Replication
Authors: Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, Luis Rodrigues
Abstract: In this talk we will address partial data replication in the context of distributed transactional systems by introducing GMU, a novel, genuine partial replication protocol that implements a scalable, distributed multi-versioning scheme. The protocol never aborts read-only transactions, and spares them from any distributed validation schemes. GMU is therefore particularly suited for workload profiles where read-only transactions play a significant role, as typical of a wide range of real-world applications.
In terms of consistency, our proposal guarantees Update Serializability (US) , , an isolation creation that represents a sweet spot in the consistency/performance tradeoff. US, is, in fact, sufficiently strong to ensure the correctness even for very demanding applications (such as TPC- C ), but is also weak enough to allow efficient and scalable implementations, such as GMU. Further, unlike several relaxed consistency models proposed in literature, US has a simple and intuitive semantic, thus being an attractive, scalable consistency model for ordinary programmers.
We have integrated GMU protocol in one of the most popular open source in-memory transactional data grids for cloud environments, Infinispan  by Red-Hat/JBoss. On the basis of a large scale experimental study performed on heterogeneous experimental platforms and using industry standard benchmarks (namely TPC-C  and YCSB ), we show that GMU achieves linear scalability and that it introduces negligible overheads (less than 10%) with respect to solutions ensuring non-serializable semantics in a wide range of workloads.
 A. Adya, “Weak consistency: A generalized theory and optimistic implementations for distributed transactions,” tech. rep., PhD Thesis, MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 1999.
 R. Hansdah and L. Patnaik, “Update serializability in locking,” vol. 243 of Lecture Notes in Computer Science, pp. 171–185, Springer Berlin / Heidelberg, 1986.
 TPC Council, “TPC-C Benchmark.” http://www.tpc.org/tpcc.
 Red-Hat/JBoss, “Infinispan.” http://www.jboss.org/infinispan.
 B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking cloud serving systems with ycsb,” in Proceedings of the 1st ACM symposium on Cloud computing, SoCC ’10, 2010.
Title: Gargamel: A Conflict-Aware Contention Resolution Policy for STM
Authors: Pierpaolo Cincilla, Sebastien Monnet, Marc Shapiro
Abstract: When a classical TM systems experiences an abort, a contention manager is in charge of managing the conflicting transactions [1,2]. Instead, we propose to preventively steer conflicting transactions away from each other. Gargamel schedules transactions in such a way that parallelism is maximized and transactions never abort.
Our conflict-aware contention resolution policy is based on an oracle that, for any pair of transactions, detects whether they are conflicting. Based on this information, the contention manager computes a schedule that serialises conflicting transactions only, and executes non-conflicting ones in parallel. Such a schedule is optimal in terms of throughput, since it does not abort transactions, and does not serialize when not needed.
We call *chain* a sequence of conflicting transactions produced by the Gargamel scheduler, based on the oracle's conflict prognosis. For some incoming transaction T, if T conflicts with some (scheduled or running) transaction T', Gargamel schedules T at the end of a chain containing T'; and similarly for all such conflicts. Chains can split or merge, according to their dependencies. A split divides a chain in two or more branches, a merge unifies two or more branches of one (or more) chains. Each branch executes sequentially, in a single thread.
Our oracle detects conflicts, either by static analysis of transaction SELECT statements, or checking parameters of methods invocations. A practical oracle cannot be perfect. A false positive (transactions declared to be conflicting, although at run-time no conflict occurs) causes unecessary serialization. (In our current work, we ignore false negatives, because an oracle can always avoid them by static analysis of objects involved in transactions, and/or by increasing the amount of false positives.)
Our simulations show that false positives impact throughput only under low transaction rates, or when resources are unbounded. Highly loaded systems with bounded resources, such as multicores or clusters, benefit greatly from Gargamel, since transactions never abort, and furthermore, false positives do not decrease throughput, because unnecessary serialization is absorbed by the high load. In systems with excess resources, the relative impact of false positives is larger, because parallelism was lost and because aborts do not cost throughput. In future work, we will explore the impact of false negatives and plan to use machine learning techniques to improve accuracy and coverage of the oracle.
 Tim Harris, James Larus, and Ravi Rajwar. Transactional Memory (2nd ed.). Morgan and Claypool Publishers, 2010.
 William N. Scherer III and Michael L. Scott. Advanced Contention Management for Dynamic Software Transactional Memory. In: Proc. 24th ACM Symp. on Principles of Distributed Computing (PODC). Las Vegas, NV. July 2005.
Title: DecentSTM - A fully decentralized STM algorithm
Authors: Annette Bieniusa, Thomas Fuhrmann
Abstract: DecentSTM  is a fully decentralized object-based STM algorithm.
It entirely avoids locking and centralized components during execution of the program. The shared memory system is given by the union of all globally accessible objects (GAOs). All operations on GAOs involve only message-based communication between the transactions and a decentralized memory system.
Delayed communication, e.g., caused by retransmissions in the transport layer of the network, only affects performance, but not consistency of the shared global state. The DecentSTM algorithm is thus well suited even for large scale distributed systems.
Further, it employs a multi-versioning scheme for GAOs where transactions work on local copies of shared data, which are later transferred back to the decentralized memory system. Transactions obtain lazily a consistent memory snapshot during their execution. The system keeps a version history list of committed versions for all shared data. When reading from a GAO, a transaction obtains a version which is consistent with the objects read so far.
DecentSTM implements several pruning strategies for keeping the memory overhead low, e.g. keeping a statically fixed number of versions per object or keeping the optimal number such that read requests never fail.
For committing updated object versions, DecentSTM employs a novel genuine distributed randomized consensus protocol. It is based on voting messages involving only the concurrently committing transactions and the nodes of the distributed memory system to which new versions are committed.
As consistency semantics, DecentSTM implements as default the weaker snapshot isolation semantics. Snapshot isolation is a popular isolation level in replicated database systems because it allows long-running read-only transactions to safely coexist with short transactions updating the state.
For compatibility with applications requiring serializability, the DecentSTM system allows either to switch to serializable snapshot isolation  or to revert to a classic two-phase commit protocol.
DecentSTM has been designed with several enhancements in mind, such as speculative execution or caching, replicating and migrating shared data to dynamically adapt to the available memory and processor resources.
 Annette Bieniusa and Thomas Fuhrmann. Consistency in hindsight: A fully decentralized STM algorithm. In 24th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010, Atlanta, Georgia, USA, 19-23 April 2010 - Conference Proceedings, pages 1-12. IEEE, 2010.
 Michael J. Cahill, Uwe Rohm, and Alan David Fekete. Serializable isolation for snapshot databases. ACM Trans. Database Syst., 34(4), 2009.
Session 2: Distributed Transactional Memory Platforms and Applications
Title: TM2C, Transactional Memory for Many-Cores
Authors: Vasileios Trigonakis, Rachid Guerraoui, Vincent Gramoli
Abstract: The last months we have been working on a TM system tailored for non-coherent many-cores, named TM2C. TM2C capitalizes the low latency of on-die message passing by using visible reads and being the first starvation-free DTM system. Read visibility allows the detection of conflict whenever a transaction attempts to override some data read by another transaction, hence anticipating the conflict resolution, otherwise deferred to the commit phase of the reading transaction. In contrast with many-cores, high latency systems require to pipeline asynchronous reads (inherently invisible) to achieve reasonable performance. Visible reads allow us to utilize contention management in a way similar to STMs, yet fully decentralized, to provide starvation-freedom. TM2C comes with FairCM, a companion distributed contention manager that ensures the termination of every transaction and the fair usage of the TM system by each core.
We exploit the large amount of cores by assigning the transactional application and the DTM services of TM2C to different partitions of the cores so that no more than a single task is allocated per core. More precisely, two disjoint groups of cores run each of these two services, respectively. This decoupling benefits the communication load by limiting message exchanges between cores of distinct groups only. In addition and for a particular workload, TM2C reduces communication further by trading read-access requests with a lightweight in-memory read validation to implement a weaker transactional model, namely elastic transactions [DISC '09].
We evaluated TM2C on the Intel's Single-chip Cloud Computer (SCC), a non-coherent message passing processor. The SCC is a 48-core experimental processor relying on a 6x4 two-dimensional mesh of tiles (two cores per tile) that is claimed to be "arbitrarily scalable" [SC '10]. On a hash table data structure and a MapReduce example application TM2C performs up to 20 and 27 times better than the corresponding bare (non-transactional) applications running on a single core. We also evaluated the importance of fair contention management by comparing our FairCM scheme with alternative ones on various workloads. Particularly, FairCM performs up to 9 times better than the others on a workload with a single core running long conflict prone transactions. Last but not least, we also report on an extension of TM2C implementing the elastic transactional model. We showed how on search structures such as linked lists, we can achieve up to 40-fold performance improvement over our classical transaction implementation.
Title: Transactional In-memory Storage for Extended MapReduce
Authors: Kim-Thomas Rehmann and Michael Schoettner
Abstract: The MapReduce programming model simplifies the design and implementation of some parallel algorithms. Recently, several work-groups have extended MapReduce's application domain to iterative and on-line data processing. In contrast to the original MapReduce model, software frameworks for extended MapReduce use message passing for propagating data updates. For example, Twister promotes data updates using publish-subscribe messaging, and Hadoop Online Prototype sends intermediate data directly from map to reduce workers using RPC. In order to benefit from aggregated main memories, fast data access and stronger data consistency, we propose to use transactional in-memory storage for extended MapReduce. In this presentation, we describe the design and implementation of EMR, a transactional in-memory framework for extended MapReduce.
The EMR framework bases on the ECRAM transactional in-memory storage. ECRAM replicates data adaptively among storage nodes. It implements a distributed transactional memory (DTM) for strong data consistency, while also supporting data accesses outside transactions if weaker consistency suffices.
The EMR framework implements job management for extended MapReduce on top of ECRAM. EMR stores work-queues, jobs and configuration data in ECRAM to take advantage of automatic replication and consistency. EMR uses CRAM‚Äôs built-in event propagation that blocks until a storage object fulfills a given condition in order to avoid polling. The contention on a global job queue would provoke high transaction conflict rates. Therefore, we have implemented per-worker job queues and a simple work-stealing approach for load balancing. If a worker detects that it is about to block on its empty queue, it scans the work-queues of its peers for jobs to steal from them. Given that work-queues are stored as shared objects, there is no danger of deadlocks or lost jobs.
To illustrate the usage and performance of the EMR framework, we present several extended MapReduce applications from diverse problem domains including iterative and on-line data processing. Experiments show that in-memory MapReduce scales well to at least 32 worker nodes. Designing and implementing the EMR framework has provided us with valuable insights about the application of distributed transactional memory.
Title: Practical considerations of DSTM systems development
Authors: Tadeusz Kobus, Maciej Kokociński, Paweł T. Wojciechowski
Abstract: The goal of our research is to devise tools for easy and efficient distributed programming. Our challenge was to design and develop from scratch a Distributed Software Transactional Memory (DSTM) system that would enable fault tolerance while providing high performance and easy to use API. For the past 2 years we have been working on our own DSTM system called Paxos STM. It leverages an optimistic concurrency control scheme with full object replication on each node. Object replication and maintaining strong data consistency is achieved by running the transaction processing layer as an actively replicated service on top of JPaxos, a fast, fault tolerant implementation of the MultiPaxos algorithm. Through the use of multiple optimizations in JPaxos e.g. request batching, multiple concurrent rounds of consensus) as well as in transaction processing layer, Paxos STM achieves high performance. Our experimental results show that in many scenarios, a service (i.e. a program and its objects) replicated using Paxos STM significantly outperforms the same service replicated using JPaxos, i.e. using the traditional state machine approach to service replication.
One of the unique features of JPaxos and Paxos STM is the ability to work in crash-recovery failure model. This is achieved by using one of numerous algorithms built within JPaxos based on either storing application snapshots in the stable storage or maintaining necessary data only in volatile memory. The overall performance of the system as well as the provided guarantees (e.g. in the respect of catastrophic failures) depends on the chosen algorithm. This feature is embraced by Paxos STM by providing the programmer with a convenient way of implementing recovery capabilities in the distributed application.
A lot of effort has been devoted to the API in order to make Paxos STM easy to use for the programmer. Our aim was to provide constructs for transactional semantics that do not introduce any extensions to the programming language.Instead our solution is based on automatic code instrumentation and requires minimal input from the programmer.
In order to evaluate our system we have implemented several benchmarks with hash table, bank, STMBench7 among others. The results indicate direct consequences of employed algorithms and protocols together with their properties and limitations. However, one has to also consider the impact of implementation specific design choices, such as distributed garbage collection, state sharing procedures, smart object serialization, etc. We would like to share the experiences that we gathered in the course of our ongoing work on Paxos STM research platform.
Session 3: Adaptive/Self-Tunning Systems
Title: Telex/IceCube: a distributed memory with tunable consistency
Authors: Marc Shapiro, Nuno Preguiça, Pierre Sutra, Pierpaolo Cincilla
Abstract: Telex/IceCube is a distributed object memory designed for interactive, co- operative applications [1, 3].
In contrast to classical TMs, which ensure strong consistency based on fine- grain memory tests, Telex is designed for scalability, availability and durability, even when the network is slow or disconnected. Telex transactions are durable but not strictly isolated, and Telex uses partial replication. Telex aims to adaptively use the cheapest consistency mechanism that is sufficient to eventually maintain application-level invariants.
We provide several means to express invariants. First, application prorammers classify pairs of operations as commutative (non-conflicting), non-commutative, mutual ly exclusive, or causal ly dependent. Furthermore, static analysis may mark non-commuting pairs as helping or hindering . Finally, application code checks for logical conflicts at run time.
The scheduler selects only one of a mutually-exclusive pair, executes operations in an order that satisfies dependencies, and attempts to satisfy helps/ hinders heuristics. If a conflict occurs at run time nonetheless, the scheduler rolls back, and uses machine learning techniques to avoid the same conflict in the future. Once a schedule succeeds, the system tries to get it adopted by other replicas, using a whole-schedule consensus  that is optimized for commutative operations .
In a well-connected environment, performance does not improve because of the overhead of this semantic and optimistic approach. However, it is highly desirable in disconnected mode, since data remains available both for reads and for writes. Furthermore, our experimental results show that, in a WAN environment, our approach can considerably improve throughput and latency .
 Lamia Benmouffok, Jean-Michel Busca, Joan Manuel Marquès, Marc Shapiro, Pierre Sutra, and Georgios Tsoukalas. Telex: A semantic platform for cooperative application development. In Conf. Française sur les Systèmes d’Exploitation (CFSE), Toulouse, France, September 2009.
 Nuno Preguiça, Marc Shapiro, and J. Legatheaux Martins. Automating semantics-based reconciliation for mobile transactions. In Conf. Française sur les Systèmes d’Exploitation (CFSE), pages 515–524, La-Colle-sur-Loup, France, October 2003.
 Nuno Preguiça, Marc Shapiro, and Caroline Matheson. Semantics-based reconciliation for collaborative and mobile environments. In Int. Conf. on Coop. Info. Sys. (CoopIS), volume 2888 of Lecture Notes in Comp. Sc., pages 38–55, Catania, Sicily, Italy, November 2003. Springer-Verlag GmbH.
 Pierre Sutra, João Barreto, and Marc Shapiro. Decentralised commit- ment for optimistic semantic replication. In Int. Conf. on Coop. Info. Sys. (CoopIS), Vilamoura, Algarve, Portugal, November 2007.
 Pierre Sutra and Marc Shapiro. Fast Genuine Generalized Consensus. In Symp. on Reliable Dist. Sys. (SRDS), Madrid, Spain, October 2011.
Title: PolyCert: Polymorphic Self-Optimizing Replication for In-Memory Transactional Grids
Authors: Maria Couceiro, Paolo Romano and Luis Rodrigues
Abstract: In-memory NoSQL transactional data grids are emerging as an attractive alternative to conventional relational distributed databases. In these platforms, replication plays a role of paramount importance, as it represents the key mechanism to ensure data durability. In this talk we focus on Atomic Broadcast (AB) based certification replication schemes, which have recently emerged as much more scalable alternative to classical replication protocols based on active replication or atomic commit protocols. We first show that, among the proposed AB-based certification protocols, no-one-fits-all solution exists that achieve optimal performance when considering heterogeneous workloads produced by complex transactional applications. Next, we present PolyCert, a polymorphic certification protocol that allows for the concurrent coexistence of different certification protocols and that relies on machine-learning techniques to determine the optimal certification scheme on a per transaction basis. We present the evaluation of two alternative oracles, based on parameter-free machine learning techniques that rely both on off-line and on-line training approaches. Our experimental results demonstrate the effectiveness of the proposed approach, highlighting that PolyCert is capable of achieving performance extremely close to those of an optimal non-adaptive certification protocol in presence of non heterogeneous workloads, and significantly outperform any non-adaptive protocol when used with realistic, complex applications that generate heterogeneous workloads.
Title: Transactional Auto Scaler: Elastic scaling of NoSQL transactional data grids
Authors: Diego Didona, Paolo Romano, Sebastiano Peluso, Francesco Quaglia
Abstract: In this talk we will present TAS (Transactional Auto Scaler), a system that relies on a novel hybrid analytical/machine-learning-based forecasting methodology in order to accurately predict the performance achievable by transactional applications executing on top of transactional in-memory data stores, in face of changes of the scale of the system.
Applications of TAS range from on-line self-optimization of in-production applications, to the automatic generation of QoS/cost driven elastic scaling policies, and support for what-if analysis on the scalability of transactional applications.
We demonstrate the accuracy and feasibility of TAS via an extensive experimental study based on a fully ﬂedged prototypal implementation integrated with one of the most popular open-source transactional data grids (JBoss Inﬁnispan) and industry-standard benchmarks generating a breadth of heterogeneous workloads.
Session 4: Theory of Distributed Transactional Memories
Title: A Wait-Free Disjoint Access Parallel Universal Construction
Authors: Faith Ellen, Panagiota Fatourou, Eleftherios Kosmas, Alessia Milani, Corentin Travers
Abstracts: It is desirable that implementations resulting from a universal construction or a transactional memory allow operations operating on disjoint parts of the data structure to progress concurrently.
Disjoint-access-parallelism formalizes this by requiring that different operations cannot both access the same base object unless they operate on some common part of the data structure.
Several definitions for disjoint-access-parallelism have being proposed in the literature. We show that all these definitions are provably incompatible with wait-freedom.Finally, we propose a new algorithm that is wait-free and disjoint-access parallel but that works only for data structures such that every sequential execution of an operation applied to the data structure accesses at most M data items for some constant M.
Title: Boosting STM Replication via Speculation
Authors: Roberto Palmieri, Paolo Romano, Francesco Quaglia, Luis Rodrigues
Abstract: Over the last years, several papers have addressed the issue of how to enhance both performance and depend- ability of Software Transactional Memory (STM) systems using replication techniques, e.g., [1, 3, 2, 13]. Since STM and Database Management Systems (DBMS) share the notion of atomic transaction, quite unsurprisingly, the vast literature on database replication, e.g. [9, 8, 4], has represented a natural source of inspiration also for the design of replicated STMs. However, in typical STM settings, the transaction execution time is up to two or three orders of magnitude smaller than in database environments [10, 5]. In fact, unlike in conventional DBMSs, STM transactions only access to in-memory data items avoiding to interact with any stable storage devices for manipulating data and logging operations (e.g., in case of recovery after a crash), hence not incurring in the latencies proper of disk accesses. Also, in STMs the overhead associated with SQL parsing and plan optimization are absent, which further contributes to shortening the transaction lifetime. It follows therefore that the relative overhead introduced by classical database replication schemes is correspondingly amplified, when these are straightforwardly employed in the context of STMs.
In order to address these efficiency issues, we have recently proposed a new approach to replicate STMs (but in in general for transactional systems), which we named speculative replication . The speculative replication algorithms that we have proposed so far [12, 6, 7] explore a number of different trade-offs, but all of them are based on two key common principles:
i minimizing replication overhead by overlapping the replica synchronization and transaction execution phases, and
ii maximizing robustness in presence of mispeculations by exploring multiple transaction serialization orders.
Selecting from our published works, in this talk we present two speculative protocols, namely AGGRO  and OSARE , tailored for replicated STM systems, in which a considerable effort has been spent in order to improve system performance smartly selecting one (or more) speculative serialization orders among all the possibles.
 Marcos K. Aguilera, Arif Merchant, Mehul Shah, Alistair Veitch, and Christos Karamanolis. Sinfonia: a new paradigm for building scalable distributed systems. In SOSP ’07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, pages 159–174, New York, NY, USA, 2007. ACM.
 Nuno Carvalho, Paolo Romano, and Lu ́ıs Rodrigues. Asynchronous lease-based replication of software transac- tional memory. In Middleware, pages 376–396, 2010.
 Maria Couceiro, Paolo Romano, Nuno Carvalho, and Luis Rodrigues. D2STM: Dependable Distributed Software Transactional Memory. In Proc. of PRDC. IEEE Computer Society Press, 2009.
 Bettina Kemme, Fernando Pedone, Gustavo Alonso, Andre Schiper, and Matthias Wiesmann. Using optimistic atomic broadcast in transaction processing systems. IEEE TKDE, 15(4):1018–1032, 2003.
 R. Palmieri, F. Quaglia, P. Romano, and N. Carvalho. Evaluating database-oriented replication schemes in software transactional memory systems. In Proc. of DPDNS, 2010.
 Roberto Palmieri, Francesco Quaglia, and Paolo Romano. Aggro: Boosting stm replication via aggressively optimistic transaction processing. In Proc. of NCA, pages 20–27, Los Alamitos, CA, USA, 2010. IEEE Computer Society.
 Roberto Palmieri, Francesco Quaglia, and Paolo Romano. Osare: Opportunistic speculation in actively replicated transactional systems. In SRDS, pages 59–64, 2011.
 Marta Patino-Mart ́ınez, Ricardo Jim enez-Peris, Bettina Kemme, and Gustavo Alonso. Scalable replication in database clusters. In Proc. of the 14th International Conference on Distributed Computing, pages 315–329, London, UK, 2000. Springer-Verlag.
 Fernando Pedone, Rachid Guerraoui, and Andr ́e Schiper. The database state machine approach. Distributed and Parallel Databases, 14(1):71–98, 2003.
 P. Romano, N. Carvalho, and L. Rodrigues. Towards distributed software transactional memory systems. In Proc. of the Workshop on Large-Scale Distributed Systems and Middleware, September 2008.
 Paolo Romano, Roberto Palmieri, Francesco Quaglia, Nuno Carvalho, and Luis Rodrigues. Brief announcement: On speculative replication of transactional systems. In Proc. of SPAA, 2010.
 Paolo Romano, Roberto Palmieri, Francesco Quaglia, Nuno Carvalho, and Luis Rodrigues. An optimal speculative transactional replication protocol. In Proc. of ISPA, volume 0, pages 449–457, 2010.
 M. M. Saad and B. Ravindran. Supporting stm in distributed systems: Mechanisms and a java framework. In Proc. of TRANSACT, 2011.
Title: Write-Write Conflict Detection for Distributed TM Systems
Authors: Ricardo J. Dias and João M. Lourenço
Abstract: Typical Distributed Transactional Memory (DTM) systems provide serializability by detecting read-write conflicts between accesses to shared data objects. Detecting read-write conflicts requires the bookkeeping of all the read and write accesses to the shared objects made inside the transaction, and the validation of all these accesses in the end of a transaction. For most DTM algorithms, the validation of a transaction requires the broadcast of its read-set to all the nodes of the distributed system during the commit phase. Since applications tend to read more than write, the overhead associated with the bookkeeping of read accesses, and the amount of network traffic generated during the commit, is a main performance problem in DTM systems.
Database systems frequently rely on weaker isolation models to improve performance. In particular, Snapshot Isolation (SI) is widely used in industry. An interesting aspect of SI is that only write-write conflicts are checked at commit time and considered for detecting conflicting transactions. As main result, a DTM using this isolation model does not need to keep track of the read accesses, considerably reducing the bookkeeping overhead, and also eliminates the need of broadcasting the read-set at transaction commit time, thus reducing the network traffic and considerably increase the scalability of the whole system.
By only detecting write-write conflicts, this isolation model allows a much higher commit rate, which comes at the expense of allowing some real conflicting transactions to commit. Thus, relaxing the isolation of a transactional program may lead previously correct programs to misbehave due to the anomalies resulting from malign data-races that are now allowed by the relaxed transactional run-time. These anomalies can be precisely characterized, and are often called in the literature as write-skew anomalies.
Write-skew anomalies can be avoided dynamically with the introduction of algorithms that verify the serializability of running transactions, which may result in a non-negligible overhead. Our approach is to avoid the run-time overhead by statically asserting that a DTM program will execute without generating write-skew anomalies at runtime, while only verifying write-write conflicts. Conflicting transactions can be made “safe” by code rewriting using well known techniques.
Our verification technique uses separation logic, a logic for verifying intensive heap manipulation programs, which was extended to construct abstract read- and write-sets for each transaction in the DTM program. This verification technique is implemented in the StarTM tool, which analyzes transactional Java bytecode programs. The transactions that cannot be verified by our tool are executed under serializable isolation (detecting read-write conflicts), while for the “safe” transactions we only detect write-write conflicts. The results of our experiments when verifying micro-benchmarks, such as linked lists and binary trees, and partially verifying some macro-benchmarks, such as the STAMP benchmark, strongly encourage the ongoing work for the application of this technique to DTM.