Selected Publications
Generation of a QEMU based instruction set simulator from a processor description in OpenVADL
  Appeared at SAMOS'25,
  
slides,
  
poster
  and
  
article.
Pattern Matching, Transformation and Code Replacement on a Polyhedral Representation of Nested Loops
  Appeared at CF'25,
  
slides
  and
  
article.
OpenVADL: An open source implementation of the Vienna Architecture Description Language
  Appeared at ARCS'25,
  
slides
  and
  
article.
The Vienna Architecture Description Language
  Available as
  
pdf, with source at
  
arXiv.
A pred-LL(*) Parsable Typed Higher-Order Macro System for Architecture Description Languages
  Appeared at GPCE'23,
  
slides
  and
  
article.
Instruction Code Selection
  Appeared in 2021 as
  
chapter 19
  in the book
  
SSA-based Compiler Design.
Fast and Flexible Instruction Selection with Constraints
  Appeared at CC'18,
  
slides
  and
  
article.
Vectorization in PyPy's Tracing Just-In-Time Compiler
  Appeared at SCOPES'16,
  
slides
  and
  
article.
vanHelsing: A Fast Theorem Prover for Debuggable Compiler Verification
  Appeared at SYNASC'15,
  
slides
  and
  
article.
PyPy's Number Crunching Optimization
  Appeared at KPS'15,
  
article.
CASM - Optimized Compilation of Abstract State Machines
  Appeared at LCTES'14,
  
slides
  and
  
article.
Computation of Alias Sets from Shape Graphs for Comparison of Shape Analysis Precision
  Appeared in IET Software, Vol. 8(3), 2014,
  
article.
Integrated Modulo Scheduling and Cluster Assignment for TI TMS320C64x+ Architecture
  Appeared at ODES'14,
  
slides
  and
  
article.
Correct Compilers for Correct Processors
  Keynote at HIPEAC'14,
  
slides.
DSP Instruction Set Simulation
  Appeared in Handbook of Signal Processing Systems, 2013,
  
article.
IR-Level Versus Machine-Level If-Conversion for Predicated Architectures
  Appeared at ODES'13,
  
slides
  and
  
article.
Optimal and Heuristic Global Code Motion for Minimal Spilling
  Appeared at CC'13,
  
slides
  and
  
article.
CASM: Implementing an Abstract State Machine based Programming Language
  Appeared at ATPS'13,
  
article.
Using the CASM Language for Simulator Synthesis and Model Verification
  Appeared at RAPIDO'13,
  
article.
Software De-Pipelining for Nested Loops
  Appeared at IJCSEE'13,
  
article.
A Unified Processor Model for Compiler Verification and Simulation Using ASM
  Appeared at ABVZ'12,
  
article.
Using Semantic Relatedness and Locality for Requirements Elicitation Guidance
  Appeared at SEKE'12,
  
article.
Automatic generation of compiler backends
  Appeared at SPE'12,
  
article.
Modeling Application-Specific Processors for Embedded Systems
  Appeared at Informatik'11,
  
article.
Computation of Alias Sets from Shape Graphs for Comparison of Shape Analysis Precision
  Appeared at SCAM'11,
  
article.
Ontology-driven guidance for requirements elicitation
  Appeared at ESWC'11,
  
article.
DODT: Increasing Requirements Formalism using Domain Ontologies for Improved Embedded Systems Development
  Appeared at DDECS'11,
  
article.
DSP Instruction Set Simulation
  Appeared in Handbook of Signal Processing Systems, 2010,
  
article.
Execution Models for Processors and Instructions
  Appeared at Norchip'10,
  
article.
Optimistic Integrated Instruction Scheduling and Register Allocation
  Appeared at CPC'10,
  
article.
Progressive Spill Code Placement
  Appeared at CASES'09,
  
article.
Stack Allocation of Objects in the Cacao Virtual Machine
  Appeared at PPPJ'09,
  
slides
  and
  
article.
Fast and Accurate Simulation using the LLVM Compiler Framework
  Appeared at RAPIDO'09,
  
article.
Generalized Instruction Selection using SSA-Graphs
  Appeared at LCTES'08,
  
slides
  and
  
article.
Leveraging Predicated Execution for Multimedia Processing
  Appeared at ESRTM'07,
  
article.
Compiler Generation from Structural Architecture Descriptions
  Appeared at CASES'07,
  
article.
Adaptive Inlining and On-Stack Replacement in the CACAO Virtual Machine
  Appeared at PPPJ'07,
  
slides
  and
  
article.
Instruction Set Encoding Optimization for Code Size Reduction
  Appeared at SAMOS'07,
  
article.
Compiler optimizations for processors with SIMD instructions
  Appeared at SPE'06,
  
article.
Effective Compiler Generation by Architecture Description
  Appeared at LCTES'06,
  
abstract
  and
  
article.
Superinstructions and Replication in the Cacao JVM Interpreter
  Appeared at 4th International Conference in Central Europe on .NET Technologies 2006,
  
abstract
  and
  
article.
Static Verification of Global Heap References in Java Native Libraries
  Appeared at SPACE'06,
  
abstract
  and
  
article.
Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures
  Appeared at SAMOS'05,
  
article.
Control flow graph reconstruction for assembly language programs with delayed instructions
  Appeared at SCAM'05,
  
article.
xDSPcore: A Compiler-Based Configurable Digital Signal Processor
  Appeared in IEEE Micro, 2004,
  
article.
FSEL - Selective Predicated Execution for a Configurable DSP Core
  Appeared at ISVLSI'04,
  
article.
DSPxPlore - Design Space Exploration Methodology for an Embedded DSP Core
  Appeared at SAC'04,
  
article.
Pointer Alignment Analysis for Processors with SIMD Instructions
  Appeared at MSP'03,
  
article.
VLIW Operation Refinement for Reducing Energy Consumption
  Appeared at SOC'03,
  
article.
Graph Coloring vs. Optimal Register Allocation for Optimizing Compilers
  Appeared at JMLC'03,
  
article.
Register Liveness Analysis for Optimizing Dynamic Binary Translation
  Appeared at WCRE'02,
  
abstract
  and
  
article.
Supporting Design by Contract in Java
  Appeared at TOOLS'02, revised version in JOT,
  
abstract
  and
  
article.
Vmgen - a generator of efficient virtual machine interpreters
  Appeared at SPE'02,
  
article.
Implementing an Efficient Java Interpreter
  Appeared at HPCN'01,
  
abstract
  and
  
article.
Java for Large-Scale Scientific Computations?
  Appeared at LSSC'01,
  
abstract
  and
  
article.
Compilation Techniques for Multimedia Processors
  Presented at Dagstuhl Seminar Instruction-Level Parallelism and Parallelizing Compilation,
  
abstract
  and
  
article.
  Full version appeared in
  International Journal of Parallel Programming 28(4), pp. 347-361, 2000.
Minimizing cost of local variables access for DSP-processors
  Appeared at LCTES'99,
  
abstract
  and
  
article.
Garbage Collection for Large Memory Java Applications
  Appeared at HPCN'99,
  
abstract
  and
  
article.
Efficient JavaVM Just-in-Time Compilation
  Appeared at PACT'98,
  
abstract
  and
  
article.
CACAO - Eine effiziente JavaVM Implementierung
  Appeared at JAVA und Eingebettete Systeme '98,
  
abstract
  and
  
article.
Monitors and Exceptions: How to implement Java efficiently
  Appeared at ACM 1998 Workshop on Java for High-Performance Network Computing,
  
abstract
  and
  
article.
  Full version appeared in
  Concurrency: Practice and Experience 10(8), 1998.
JavaVM Implementation: Compilers versus Hardware
  Appeared at ACAC'98,
  
abstract
  and
  
article.
Efficient Type Inclusion Tests
  Appeared at OOPSLA'97,
  
abstract
  and
  
article.
CACAO - A 64 bit JavaVM Just-in-Time Compiler
  Appeared at PPoPP'97 Workshop on Java for Science and Engineering Computation,
  
abstract
  and
  
article.
  Full version appeared in Concurrency: Practice and Experience 9(11), 1997.
Near Optimal Hierarchical Encoding of Types
  Appeared at ECOOP'97,
  
abstract
  and
  
article.
On Extending Java
  Appeared at JMLC'97,
  
abstract
  and
  
article.
Removing Anti Dependences by Repairing
  Appeared at CC'96,
  
abstract
  and
  
article.
Software Pipelining with Register Allocation and Spilling
  Short version appeared at MICRO'95,
  
abstract
  and
  
article.
Software Pipelining with Reduced Register Requirement
  Short version appeared at PACT'95,
  
abstract
  and
  
article.
Register Requirement for Exploiting Loops' Maximum Instruction-Level Parallelism
  Short version appeared at ICYCS'95,
  
abstract
  and
  
article.
Incremental Global Compilation of Prolog with the Vienna Abstract Machine
  Appeared at ICLP'95,
  
abstract
  and
  
article.
The VAMAai - an Abstract Machine for Incremental Global Dataflow Analysis of Prolog
  Appeared at ICLP'95 Workshop on Abstract Interpretation,
  
abstract
  and
  
article.
Incremental Flow Analysis
  Appeared at WFLP'94,
  
abstract
  and
  
article.
Improving Semi-static Branch Prediction by Code Replication
  Appeared at PLDI'94,
  
abstract
  and
  
article.
Dependence-Conscious Global Register Allocation
  Appeared at PLSA'94,
  
abstract
  and
  
article.
Delayed Exceptions - Speculative Execution of Trapping Instructions
  Appeared at CC'94,
  
abstract
  and
  
article.
Implementation Techniques for Prolog
  Appeared at WLP'94,
  
slides,
  
abstract
  and
  
article.
A Progress Report on Incremental Global Compilation of Prolog
  Appeared at the ILPS'94 Workshop on Implementation Techniques for Logic Programming Languages,
  
abstract
  and
  
article.
High level constraints over finite domains
  Appeared at CSAM'93,
  
article.
Instruction Scheduling for Complex Pipelines
  Appeared at CC'92,
  
abstract
  and
  
article.
Optimal instruction scheduling using constraint logic programming
  Appeared at PLILP'91,
  
article.
The Vienna Abstract Machine
  Appeared at PLILP'90,
  
abstract
  and
  
article.