Knowing: A Generic Data Analysis Application

We got another demo accepted:

Knowing: A Generic Data Analysis Application

Thomas Bernecker, Franz Graf, Hans-Peter Kriegel, Nepomuk Seiler, Christoph Türmer, Dieter Dill
To appear at the 15th International Conference on Extending Database Technology (2012)
March 27-30, 2012, Berlin, Germany

Abstract:

Extracting knowledge from data is, in most cases, not restricted to the analysis itself but accompanied by preparation and post-processing steps. Handling data coming directly from the source, e.g. a sensor, often requires preconditioning like parsing and removing irrelevant information before data mining algorithms can be applied to analyze the data. Stand-alone data mining frameworks in general do not provide such components since they require a specified input data format. Furthermore, they are often restricted to the available algorithms or a rapid integration of new algorithms for the purpose of quick testing is not possible. To address this shortcoming, we present the data analysis framework Knowing, which is easily extendible with additional algorithms by using an OSGi compliant architecture. In this demonstration, we apply the Knowing framework to a medical monitoring system recording physical activity. We use the data of 3D accelerometers to detect activities and perform data mining techniques and motion detection to classify and evaluate the quality and amount of physical activities. In the presented use case, patients and physicians can analyze the daily activity processes and perform long term data analysis by using an aggregated view of the results of the data mining process. Developers can integrate and evaluate newly developed algorithms and methods for data mining on the recorded database.

BibTeX

@INPROCEEDINGS{BerGraKriSeietal12,
  AUTHOR     = {T. Bernecker and F. Graf and H.-P. Kriegel and N. Seiler and C. Tuermer and D. Dill},
  TITLE      = {Knowing: A Generic Data Analysis Application},
  BOOKTITLE  = {Proceedings of the 15th International Conference on Extending Database Technology (EDBT), Berlin, Germany},
  YEAR       = {2012}
}

More information will be published on the official publication site at the LMU.

Research Idea: Evaluation of Traffic Lane Detection with OpenStreetMap GPS Data

I am leaving university soon, so my time for pure research is nearly over. Unfortunately, I still have some ideas for possible research. I’ve tried getting them out of my head; as this has not worked out yet, I’ll try to write them down instead. Maybe someone finds them interesting enough for a Bachelor’s or Master’s thesis or something like that …

Introduction

OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them. The project was started because most maps you think of as free actually have legal or technical restrictions on their use, holding back people from using them in creative, productive, or unexpected ways. The OpenStreetMap approach is comparable to Wikipedia, where everyone can contribute content. In OpenStreetMap, registered users can edit the map directly by using different editors or indirectly by providing ground truth data in the form of GPS tracks following paths or roads. A recent study shows that the difference between OpenStreetMap’s street network coverage for car navigation in Germany and a comparable proprietary dataset was only 9% in June 2011.

In 2010, Yihua Chen and John Krumm published a paper at ACM GIS about “Probabilistic Modeling of Traffic Lanes from GPS Traces”. Chen and Krumm apply Gaussian mixture models (GMM) to a data set of 55 shuttle vehicles driving between the Microsoft corporate buildings in the Seattle area. The vehicles were tracked for an average of 12.7 days, resulting in about 20 million GPS points. By applying their algorithm to this data, they were able to infer lane structures from the given GPS tracks.

Adding and validating lane attributes completely manually is a rather tedious task for humans – especially in cases of data sets like OpenStreetMap. Therefore it should be evaluated if the proposed algorithm could be applied to OpenStreetMap data in order to infer and/or validate lane attributes on existing data in an automatic or semiautomatic way.
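To make the mixture-model idea a bit more concrete, here is a minimal sketch in Python. It assumes that the GPS points of one road segment have already been projected onto the lateral offset from the road centre line (in metres) and that the number of lanes is given. It fits a plain, unconstrained 1D Gaussian mixture with EM, which is not necessarily the exact model from the paper. The function names and the synthetic data are made up for illustration.

```python
import math
import random


def fit_lane_mixture(offsets, n_lanes, n_iter=50):
    """Fit a 1D Gaussian mixture to lateral offsets (metres) with plain EM.

    Assumption: offsets are already measured perpendicular to the road centre line.
    Returns a list of (weight, mean, std) tuples sorted by mean.
    """
    xs = sorted(offsets)
    n = len(xs)
    # Deterministic initialisation: place the means on evenly spread quantiles.
    means = [xs[int((i + 0.5) * n / n_lanes)] for i in range(n_lanes)]
    spread = (xs[-1] - xs[0]) / (2.0 * n_lanes) or 1.0
    stds = [spread] * n_lanes
    weights = [1.0 / n_lanes] * n_lanes

    for _ in range(n_iter):
        # E-step: responsibility of each lane for each point.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                for w, m, s in zip(weights, means, stds)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and std of each lane.
        for k in range(n_lanes):
            r_k = sum(r[k] for r in resp)
            if r_k < 1e-12:
                continue  # empty component: keep its previous parameters
            weights[k] = r_k / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / r_k
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / r_k
            stds[k] = max(math.sqrt(var), 1e-3)  # floor keeps a lane from collapsing
    return sorted(zip(weights, means, stds), key=lambda c: c[1])


def synthetic_offsets(centres, per_lane=500, sigma=0.4, seed=0):
    """Made-up test data: noisy lateral offsets around the given lane centres."""
    rng = random.Random(seed)
    return [rng.gauss(c, sigma) for c in centres for _ in range(per_lane)]
```

Calling `fit_lane_mixture(synthetic_offsets([-1.75, 1.75]), n_lanes=2)` should recover two components close to the two made-up lane centres. For real OpenStreetMap traces, the interesting part of the evaluation would be how well this holds up with far fewer and noisier points per road segment.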


SRTM Plugin for OpenStreetMap

One of the features of our TrafficMining Project at the LMU was to use additional attributes in routing queries. Standard routing queries usually just use time and distance for calculating the fastest/shortest routes. To have an additional attribute, we decided to use elevation data, as this might matter if you also want to take fuel cost into account or if you’re planning a bike tour (instead of crossing a hill, you might prefer a longer tour that avoids it).

The only problem was that nodes in OpenStreetMap are defined mostly by id, latitude and longitude, which is entirely sufficient for painting 2D maps and for standard routing queries. As elevation is not part of the OpenStreetMap data (and is not intended to be added to the OSM database either), a component was needed that parses both NASA SRTM data and OSM data files in order to combine them.
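As a rough illustration of what such a component has to do, here is a small Python sketch (not the plugin code). It assumes 3-arc-second SRTM tiles in the usual `.hgt` layout: 1201 × 1201 big-endian 16-bit samples, rows running from north to south, tiles named after their south-west corner, and -32768 marking missing data. The function names and the in-memory `tiles` dictionary are hypothetical.

```python
import math
import struct

SAMPLES = 1201  # samples per row and column in a 3-arc-second tile (assumption)
VOID = -32768   # marker for missing data in SRTM tiles


def tile_name(lat, lon):
    """Name of the .hgt tile covering (lat, lon), e.g. N47E011.hgt."""
    lat0, lon0 = math.floor(lat), math.floor(lon)
    ns = "N" if lat0 >= 0 else "S"
    ew = "E" if lon0 >= 0 else "W"
    return f"{ns}{abs(lat0):02d}{ew}{abs(lon0):03d}.hgt"


def elevation(tile_bytes, lat, lon):
    """Nearest-sample elevation in metres, or None where the tile has a void."""
    # Row 0 is the northern edge of the tile, so the latitude fraction is flipped.
    row = round((1 - (lat - math.floor(lat))) * (SAMPLES - 1))
    col = round((lon - math.floor(lon)) * (SAMPLES - 1))
    (value,) = struct.unpack_from(">h", tile_bytes, 2 * (row * SAMPLES + col))
    return None if value == VOID else value


def add_elevation(nodes, tiles):
    """Map node id -> elevation for (id, lat, lon) nodes.

    tiles is a hypothetical dict from tile name to the raw bytes of that tile.
    Nodes without a matching tile get None.
    """
    result = {}
    for node_id, lat, lon in nodes:
        data = tiles.get(tile_name(lat, lon))
        result[node_id] = elevation(data, lat, lon) if data is not None else None
    return result
```

Nearest-sample lookup keeps the sketch short. Interpolating between the four surrounding samples would give smoother height profiles along a route.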

In the first version, we parsed the SRTM data and added it directly to the nodes of the OSM graph. During one refactoring, we decided to integrate Osmosis into the project in order to be able to read PBF files (another OpenStreetMap file format). During this integration we moved the SRTM parsing into a separate module so that other people can make use of it as well. The plugin was open sourced some time ago at Google Code as the “osmosis-srtm-plugin” under an LGPL licence.

TrafficMining Project goes open source

Quite some time ago I wrote about a little demo that was published at SIGMOD 2010 and SSTD 2011 (see post1 and post2).

The TrafficMining project can be described briefly as:

An academic framework for routing algorithms based on OpenStreetMap data. This framework is not intended to replace current routing applications but to provide an easy-to-use GUI for testing and developing new routing algorithms on real OpenStreetMap data.

Well, what makes this worth a post is the fact that we finally switched development over to GoogleCode with a discussion group at Google Groups.
GoogleCode has the major advantages of a Mercurial repository, an issue tracker, easy code reviews and an improved way to contribute code. If you just want to follow the development, join the Google group or bookmark the project’s update feed.

By the way: the PAROS and MARiO downloads can be found there in the downloads section.

Finished my Posters for ICIP and MICCAI

Finally finished the posters for my publications:

F. Graf, H.-P. Kriegel, M. Schubert, S. Poelsterl, A. Cavallaro
2D Image Registration in CT Images using Radial Image Descriptors
In Medical Image Computing and Computer-Assisted Intervention (MICCAI), Toronto, Canada, 2011.

and

F. Graf, H.-P. Kriegel, M. Weiler
Robust Segmentation of Relevant Regions in Low Depth of Field Images
In Proceedings of the IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, 2011.

Maximum Gain Round Trips with Cost Constraints

The idea is the following: finding the shortest/fastest path from A to B is well explored. But suppose you start a hike knowing that you want to spend 4 hours and then be back at the starting point. Then the problem suddenly becomes rather complex (NP-hard, to be honest, if you do not add any constraints).

We propose a solution that makes this kind of search a bit more efficient, but don’t expect linear search time 😉 And, in contrast to quite some other research, we operate on REAL data obtained from OpenStreetMap.

Abstract:

Searching for optimal ways in a network is an important task in multiple application areas such as social networks, co-citation graphs or road networks. In the majority of applications, each edge in a network is associated with a certain cost and an optimal way minimizes the cost while fulfilling a certain property, e.g. connecting a start and a destination node. In this paper, we want to extend pure cost networks to so-called cost-gain networks. In this type of network, each edge is additionally associated with a certain gain. Thus, a way having a certain cost additionally provides a certain gain. In the following, we will discuss the problem of finding ways providing maximal gain while costing less than a certain budget. An application for this type of problem is the round trip problem of a traveler: Given a certain amount of time, which is the best round trip traversing the most scenic landscape or visiting the most important sights? In the following, we distinguish two cases of the problem. The first does not control any redundant edges and the second allows a more sophisticated handling of edges occurring more than once. To answer the maximum round trip queries on a given graph data set, we propose unidirectional and bidirectional search algorithms. Both types of algorithms are tested for the use case named above on real world spatial networks.
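To get a feeling for the problem statement, here is a naive Python sketch. It is an exhaustive depth-first search and not one of the unidirectional or bidirectional algorithms from the report. It assumes strictly positive edge costs (otherwise the search would not terminate) and, as one possible way of handling repeated edges, counts the gain of an edge only the first time it is traversed. The graph layout and all names are made up for illustration.

```python
def best_round_trip(graph, start, budget):
    """Exhaustively search for the round trip from start with maximal gain.

    graph: dict node -> list of (neighbour, cost, gain); undirected edges are
    listed in both directions. Costs must be strictly positive.
    Assumption: the gain of an edge counts only on its first traversal.
    Returns (gain, path); (0.0, [start]) means staying at home is best.
    """
    best = (0.0, [start])

    def dfs(node, cost, gain, path, used):
        nonlocal best
        # Every return to the start node is a candidate round trip.
        if node == start and len(path) > 1 and gain > best[0]:
            best = (gain, list(path))
        for nxt, c, g in graph.get(node, []):
            if cost + c > budget:
                continue  # this step would exceed the budget
            edge = frozenset((node, nxt))
            new = edge not in used
            if new:
                used.add(edge)
            path.append(nxt)
            dfs(nxt, cost + c, gain + (g if new else 0.0), path, used)
            path.pop()
            if new:
                used.remove(edge)

    dfs(start, 0.0, 0.0, [start], set())
    return best
```

The number of walks explored grows exponentially with the budget, which is exactly why smarter search strategies are needed on real road networks.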

Documents

At our project site you can find:

BibTeX

@TECHREPORT{GraKriSchu11,
  AUTHOR      = {F. Graf and H.-P. Kriegel and M. Schubert},
  TITLE       = {Maximum Gain Round Trips with Cost Constraints},
  INSTITUTION = {Institute for Informatics, Ludwig-Maximilians-University, Munich, Germany},
  YEAR        = {2011},
  LINK        = {http://arxiv.org/abs/1105.0830v1}
}

MARiO: Multi Attribute Routing in Open Street Map

Yeah, I got a new publication accepted at the Symposium on Spatial and Temporal Databases (SSTD) 2011. It deals with OpenStreetMap data (using the JXMapKit and JXMapViewer).

MARiO: Multi Attribute Routing in Open Street Map

Franz Graf, Hans-Peter Kriegel, Matthias Schubert, Matthias Renz

Published at Symposium on Spatial and Temporal Databases (SSTD) 2011
Conference Date: August 24th – 26th, 2011
Conference Location: Minneapolis, MN, USA.

Abstract:

In recent years, the Open Street Map (OSM) project collected a large repository of spatial network data containing a rich variety of information about traffic lights, road types, points of interest etc. Formally, this network can be described as a multi-attribute graph, i.e. a graph considering multiple attributes when describing the traversal of an edge. In this demo, we present our framework for Multi Attribute Routing in Open Street Map (MARiO). MARiO includes methods for preprocessing OSM data by deriving attribute information and integrating additional data from external sources. There are several routing algorithms already available and additional methods can be easily added by using a plugin mechanism. Since routing in a multi-attribute environment often results in large sets of potentially interesting routes, our graphical frontend allows various views to interactively explore query results.
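Why does multi-attribute routing produce whole sets of routes instead of a single answer? One common way to think about it is Pareto dominance: a route is only worth showing if no other route is at least as good in every attribute and strictly better in one. The following Python sketch shows this filter on complete routes. It is an illustration of the concept, not the MARiO code, and the route names and cost tuples are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(routes):
    """Keep the routes whose cost vectors are not dominated by any other route.

    routes: dict route name -> tuple of costs, smaller is better, for example
    (travel time in minutes, distance in km, number of traffic lights).
    """
    return {
        name: costs
        for name, costs in routes.items()
        if not any(dominates(other, costs) for other in routes.values())
    }
```

With three candidates such as `{"highway": (10, 15.0, 2), "city": (18, 9.0, 12), "detour": (20, 16.0, 12)}`, the detour is dropped because the highway beats it in every attribute, while highway and city both stay since each is better in at least one attribute.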

Documents:

BibTeX

@INPROCEEDINGS{GraKriRenSch11,
  AUTHOR      = {F. Graf and H.-P. Kriegel and M. Renz and M. Schubert},
  TITLE       = {{MARiO}: Multi Attribute Routing in Open Street Map},
  BOOKTITLE   = {Proceedings of the 12th International Symposium on Spatial and Temporal Databases (SSTD), Minneapolis, MN, USA},
  YEAR        = {2011}
}

Robust Segmentation of Relevant Regions in Low Depth of Field Images

Great, we got accepted (as a poster) at ICIP 2011 with the paper “Robust Segmentation of Relevant Regions in Low Depth of Field Images”:

Low depth of field (DOF) is an important technique to emphasize the object of interest (OOI) within an image. When viewing a low depth of field image, the viewer implicitly segments the image into region of interest and non regions of interest which has major impact on the perception of the image. Thus, robust algorithms for the detection of the OOI in low DOF images provide valuable information for subsequent image processing and image retrieval. In this paper we propose a robust and parameterless algorithm for the fully automatic segmentation of low depth of field images. We compare our method with three similar methods and show the superior robustness even though our algorithm does not require any parameters to be set by hand. The experiments are conducted on a real world data set with high and low depth of field images. (Abstract from the paper)

The work is a result of a collaboration with Michael Weiler. We extended his Diploma thesis and produced an improved segmentation algorithm for low depth of field images. Compared to the three competing algorithms, ours is a bit slower, but at least it works. The other algorithms turned out to be extremely unstable and/or sensitive to parameters.

On the project site you can find:

  • an online demo
  • the test images
  • the masks
  • the NetBeans project including the full Java source code for our algorithm and the reimplementation of the comparison partners (of course we had to reimplement them, as we didn’t even get binaries, as usual)

So if you plan to do some image segmentation, just go there, download the stuff, and cite our work 😉

Fully automatic detection of the vertebrae in 2D CT images – the Talk

Yeah, I finally gave the talk for my publication “Fully automatic detection of the vertebrae in 2D CT images” (paper 7962-11) at SPIE Medical Imaging 2011, Conference 7962 Image Processing (see index), in front of about 200 people.

Everything went fine. Just some nice questions right after the talk and some hints afterwards. Hey – some guys even remembered the talk 2 days later! 🙂

Thanks, SPIE Medical Imaging.

Bibliography Extension for MediaWiki

MediaWiki > Skins > Extension … a use case story

In mid-2009 we were asked to redesign our homepage to fit the corporate design of the LMU. During this time, I introduced and established MediaWiki as the content management system for our website, as it provided the flexibility, freedom and usability our group needed.

One major issue was the list of publications, which is very important for every researcher. In the old system, everyone maintained their own publication list manually. In parallel, all publications were maintained in a central BibTeX file stored in CVS. A nice example of ‘redundancy’. Hey, we’re a database group; there shouldn’t be redundancies. So I wrote an extension in my spare time that parses a bibliography file into a more convenient format. The result of such an automatically generated list can be seen, for example, on my publications list. Well, I finally just put it online. Maybe you are a researcher, or you maintain the website of some researcher(s), and already have a bib file of your publications that you want to put online.

What this Extension can do / Features

The extension in action

This extension allows processing a central bibliography file in BibTeX format in order to create personalized publication pages for authors, projects or keywords. The BibTeX data can be stored in a file in the filesystem or in a special wiki article.

Implemented Features

  • The bibliography can be stored in a filesystem file or in a separate article of the wiki.
  • Multiple authors can share a single BibTeX source and have individual publication lists on their personal pages (=articles)
  • Filtering can be done on all attributes of the bibtex file/article.
  • Filters can be combined: for example, all papers of year=2010 AND author=xy AND keyword=xyz
  • Supports optional BibTeX fields: “pages = 100–110” may be present in one entry but not in another. If the field is present, it is formatted as “p. 100-110”; if it is missing, the “p. ” prefix does not appear either.
  • Different bibliography types (article, book, inproceedings, …) use different styles according to their mandatory and optional fields.
  • @unpublished entries will be ignored
  • Supports @String replacement as done by BibTeX
  • Automatically adds a separator between entries of different years
  • Provides additional links for each BibTeX entry (like PDFs or links to articles with further information)
  • Author names can be linked automatically to predefined wiki articles
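
A few of the features above (combined filters, optional fields, ignoring @unpublished entries, year separators) can be sketched in a few lines of Python. This is not the code of the extension. It assumes the BibTeX source has already been parsed into one dictionary per entry, and the function names, the substring matching and the wiki-style year heading are choices made up for this illustration.

```python
def matches(entry, **filters):
    """True if every filter value occurs in the corresponding field (AND)."""
    return all(
        str(value).lower() in str(entry.get(field, "")).lower()
        for field, value in filters.items()
    )


def format_entry(entry):
    """Render one entry; optional fields only show up when they are present."""
    parts = [entry["author"], entry["title"]]
    if "booktitle" in entry:
        parts.append("In " + entry["booktitle"])
    if "pages" in entry:
        # Normalise BibTeX "--" and typographic dashes to a plain hyphen.
        pages = entry["pages"].replace("--", "-").replace("–", "-")
        parts.append("p. " + pages)
    parts.append(str(entry["year"]))
    return ", ".join(parts) + "."


def publication_list(entries, **filters):
    """Filter entries, skip @unpublished, sort newest first, separate by year."""
    selected = [
        e for e in entries
        if e.get("type") != "unpublished" and matches(e, **filters)
    ]
    selected.sort(key=lambda e: int(e["year"]), reverse=True)
    lines, last_year = [], None
    for e in selected:
        if e["year"] != last_year:
            lines.append(f"== {e['year']} ==")  # hypothetical year separator
            last_year = e["year"]
        lines.append(format_entry(e))
    return lines
```

A call like `publication_list(entries, author="xy", year=2010)` then corresponds to the combined filter year=2010 AND author=xy from the feature list.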

Download

Just visit my How-To page to see what the extension can do.
You can download the extension by using the link in the download section.

Maybe you can just drop me a line if you find the extension useful.