September 1 was the start date of an important new project for two faculty members in Computer Science and Engineering. CSE Prof. Yannis Papakonstantinou (near right) is principal investigator on the $1.1 million, three-year project funded by the National Science Foundation to build Plato, a model-based database for compressed, spatiotemporal sensor data. Co-PI on the project is CSE Prof. Yoav Freund (far right).
As it stands, analytics for sensor data is not as productive as analytics on non-sensor business intelligence platforms. The reason? Database technology and sensor data processing currently don't mix, at least not very well, in part because SQL databases of spatiotemporal sensor data lack the critical abstractions (real-world models) that capture the stochastic processes that generate measurements. This is particularly true when dealing with many types of sensor data, when mixing sensor data with metadata from conventional databases, or when many different types of analysis are required. Furthermore, anyone handling this type of data must simultaneously be an expert in signal processing, statistics, and the management of big data.
So the Plato system will allow analysts to quickly develop declarative queries that can be automatically optimized. "By doing so, the project will deliver the envisioned productivity gains," says Papakonstantinou. "Plato will also lower the technical sophistication required of users, therefore enabling many scientists and domain specialists to work with sensor-data analytics." While Papakonstantinou focuses on designing a model-aware data model and query language features that combine conventional SQL querying with statistical signal processing, co-PI Yoav Freund will develop algorithms that learn the model components of reduced-noise, additive model representations. Other algorithmic work will involve processing queries directly on compressed representations rather than on the original data, and semiautomated algorithms that further compress the model representations in light of dependencies between the models.
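To give a rough sense of what answering a query on a compressed representation can mean, here is a minimal Python sketch. It is not Plato's design or API: the `Segment` class, the `compress` and `average` functions, and the choice of a piecewise-constant model are all assumptions made purely for illustration. Each run of raw readings is replaced by a single mean value, and an average over the whole series is then computed from those segments alone, without going back to the raw samples.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    """Hypothetical compressed record: one constant value per run of samples."""
    start: int   # index of the first sample covered (inclusive)
    end: int     # index one past the last sample covered (exclusive)
    mean: float  # constant model value standing in for the raw samples


def compress(samples: Sequence[float], width: int) -> List[Segment]:
    """Replace each run of up to `width` samples with its mean."""
    segments = []
    for start in range(0, len(samples), width):
        chunk = samples[start:start + width]
        segments.append(Segment(start, start + len(chunk), sum(chunk) / len(chunk)))
    return segments


def average(segments: List[Segment]) -> float:
    """Answer an average query from the segments, weighting each by its length."""
    total = sum(s.mean * (s.end - s.start) for s in segments)
    count = sum(s.end - s.start for s in segments)
    return total / count


# Illustrative readings only, not data from the project.
readings = [20.0, 22.0, 21.0, 23.0, 30.0, 34.0]
segments = compress(readings, width=2)
print(len(segments), average(segments))
```

In this toy case the average computed from three segments matches the average of the six raw readings, because a length-weighted mean of segment means is exact for this particular query. Queries that depend on detail within a segment would only be approximated by such a simple model.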
The researchers are also planning to use the CSE-built UC San Diego Energy Dashboard (pictured at left) and the Qualcomm Institute-based Data E-platform Leveraged for Patient Empowerment and Population Health Improvement (DELPHI) as primary use cases for the new database system as it develops.