TrapperKeeper: Using Virtualization to Add Type-Awareness to File Systems

Daniel Peek and Jason Flinn

 

Abstract

TrapperKeeper is a system that enables the development of type-aware file system functionality. In contrast to existing plug-in-based architectures, which require a software developer to write and maintain a separate code module for each new file type, TrapperKeeper requires no type-specific code. Instead, TrapperKeeper runs existing applications that can parse the desired file type inside virtual machines. It then uses accessibility APIs to control each application and extract the desired information from its graphical user interface. We have implemented metadata extraction and document preview features that use TrapperKeeper, and we have used TrapperKeeper to capture the type-specific knowledge of over 20 applications that collectively parse more than 100 distinct file types. Our experimental results show that TrapperKeeper can execute these two features on hundreds of files per hour, a pace that far exceeds the rate at which files are modified or created on the average desktop.
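As a rough illustration of the extraction step described in the abstract, the sketch below walks a tree of user-interface elements and collects labeled text values, much as an accessibility API exposes the widgets of an open dialog. It is not the authors' implementation: the `UiNode` class, the `extract_metadata` function, and the sample dialog are all hypothetical, and the tree is built in memory rather than read from a real application running inside a virtual machine.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UiNode:
    """Hypothetical stand-in for one element of an accessibility tree."""
    role: str                      # e.g. "dialog", "text", "button"
    name: str = ""                 # accessible label of the element
    value: str = ""                # text content, if any
    children: List["UiNode"] = field(default_factory=list)


def extract_metadata(root: UiNode) -> Dict[str, str]:
    """Collect label/value pairs from every labeled text element in the tree."""
    found: Dict[str, str] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        # Only text elements that carry both a label and a value are kept.
        if node.role == "text" and node.name and node.value:
            found[node.name] = node.value
        stack.extend(node.children)
    return found


# Simulated "Properties" dialog (assumed layout, for illustration only).
dialog = UiNode("dialog", "Properties", children=[
    UiNode("text", "Title", "Quarterly report"),
    UiNode("panel", children=[UiNode("text", "Author", "A. Writer")]),
    UiNode("button", "OK"),
])

metadata = extract_metadata(dialog)
```

The traversal itself contains no knowledge of any file format, which is the point the abstract makes: the type-specific parsing is done by the application that populated the dialog, and the extraction code only reads what the interface exposes.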