Oliver Kennedy is a postdoctoral researcher at EPFL, Switzerland, working with Christoph Koch on stream processing and on distributed and in-memory databases. Oliver is a graduate of Cornell University, where he studied databases: distributed, probabilistic, toasted, and otherwise.
In his spare time, Oliver writes about himself in the third person, plots world domination through cooking, and enjoys western martial arts, photography, adventuring, and an assortment of coding projects in languages including OCaml, C++, Objective-C, Cappuccino, and Ruby.
Recent Research Projects
- DBToaster is a database compiler. Given a small set of (potentially parametrized) queries, DBToaster produces a database engine optimized for that specific workload. Queries are processed in a streaming manner so that answers are always available. Thanks to a novel recursive compilation technique, DBToaster is able to optimize a space vs. processing-time tradeoff, which allows it to achieve (for a large fragment of SQL) streaming time complexities equivalent to what a relational engine could achieve with batch processing.
- DBToaster's compilation technique reduces queries into primitive vector/matrix operations over tuple sets. This simple format - which we call K3 - is amenable both to functional-style optimizations and to being mapped to a broad range of platform-specific runtimes. Of particular interest: the set-at-a-time approach of K3 makes distribution both easy and efficient.
- My work on DBToaster focuses on the core DBToaster compiler, as well as synchronization and data-placement in its distributed runtimes.
- Jigsaw
- Probabilistic databases, in particular those that allow users to externally define probability distributions - so-called VG-Functions - are an ideal tool for constructing, simulating, and analyzing hypothetical business scenarios. Jigsaw is a probabilistic database-based analytics tool resulting from a collaboration between Microsoft Research and the Microsoft Server and Tools Division. Jigsaw focuses on parameter optimization, a goal hitherto unexplored by the probabilistic database community. The exponential relationship between the number of parameters and the size of the resulting parameter space makes evaluating queries across the entire range of possible parameter values impractical. Using a novel "fingerprinting" technique to re-use computations for different parameter value combinations, Jigsaw achieves speedups of as much as two orders of magnitude over the naive state-of-the-art for real-world business scenarios.
- Most historical probabilistic database research relies heavily on an assumption of discreteness in the probability distributions being managed. Several recent projects have demonstrated the possibility of implementing continuous probabilistic databases using a variety of brute-force techniques. Pip takes this idea several steps further and demonstrates how such databases can be made extremely efficient.
- Rumor Mongering
- Some of my early database research, exploring in-network aggregation in dynamic decentralized mobile wireless networks. The algorithms we developed make it possible to obtain continuously up-to-date estimates of the sum, count, and average aggregates applied to networks with high levels of churn.
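
As a rough illustration of what decentralized averaging can look like, here is a minimal sketch of push-sum gossip, a standard textbook technique. It is an assumption chosen for illustration and not necessarily the algorithm developed in the Rumor Mongering work; the function name `push_sum_average` and its parameters are hypothetical. Each node keeps a (sum, weight) pair, and in every round it keeps half of the pair and sends the other half to a random peer. The ratio sum/weight at every node converges to the network-wide average.

```python
import random


def push_sum_average(values, rounds=100, seed=0):
    """Estimate the average of `values` at every node using push-sum gossip.

    Illustrative sketch only: a fixed, fully connected set of nodes is
    simulated in synchronous rounds, with no churn or message loss.
    """
    rng = random.Random(seed)
    n = len(values)
    sums = [float(v) for v in values]  # per-node share of the total sum
    weights = [1.0] * n                # per-node share of the total weight

    for _ in range(rounds):
        new_sums = [0.0] * n
        new_weights = [0.0] * n
        for node in range(n):
            peer = rng.randrange(n)    # uniformly random peer (may be itself)
            half_sum = sums[node] / 2
            half_weight = weights[node] / 2
            # Keep one half locally ...
            new_sums[node] += half_sum
            new_weights[node] += half_weight
            # ... and push the other half to the chosen peer.
            new_sums[peer] += half_sum
            new_weights[peer] += half_weight
        sums, weights = new_sums, new_weights

    # Each node's local estimate of the global average.
    return [s / w for s, w in zip(sums, weights)]


if __name__ == "__main__":
    estimates = push_sum_average([1, 2, 3, 4, 5, 6, 7, 8])
    print(estimates)
```

Because the total sum and total weight are conserved in every round, every node's estimate approaches the true average without any central coordinator. Starting the weights differently (for example, weight 1 at a single node and 0 elsewhere) turns the same scheme into a sum estimate.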
Courses
- EPFL Advanced Databases
- Spring 2011 (TA)
- Cornell CS 4410/4411 (formerly CS 414/415)
- Spring 2006 (Lab Instructor)
- Fall 2006 (Lab Instructor)
- Fall 2008 (Lab Instructor)
Personal Projects
- Hugin is a shared whiteboard for "tabletop" gaming groups. Currently in early beta, Hugin allows the use of miniatures on top of a vector-based graphics system. Hugin is implemented in Cappuccino.
- Dawn is the latest incarnation of the HotCoca Hotline client. Featuring an extensible architecture, Dawn interfaces with the (now defunct) Hotline Connect server in order to provide chat, bulletin board, file transfer, and message board functionality. Dawn also includes a randomizer plugin designed to support "tabletop" RPGs.