ACM MM 2011 Grand Challenge

In November 2011, we presented a response to one of the ACM MM Grand Challenges. The challenge we answered is described here; in short, it deals with a very large set of audio, video and sensor (accelerometer) recordings captured during a Salsa course, with one set of recordings per student. The data set was acquired in the context of the 3DLife EU project. More information on the data set can be found here.

Our proposal combined several topics: automatic resynchronization of multiple cameras, microphones and various sensors, quality rating of the performance, automatic choreography subtitling (names of the dance steps), skeleton extraction from a Kinect video, and interactive visualization of the entire data set. The visualization is done through an SVG Tiny 1.2 application, played by GPAC. The main features of the demo are:

Several video streams can be viewed synchronously, with frame-level accuracy.
Streams can be changed on the fly without breaking the synchronization.
Any of the 16 audio tracks can be selected on the fly.
Playback speed can be changed (slow motion / fast forward).
Performances can be compared by viewing two dancers at the same time.
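
The synchronization features listed above can be illustrated with a minimal sketch. It is not the code of the SVG application and does not use any GPAC API; it only assumes that every recording has a known start offset on a shared timeline and that all displayed streams follow one media clock. The names `Stream`, `SharedClock` and `frame_at`, as well as the frame rates and offsets, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stream:
    """One recording placed on the shared timeline (hypothetical model)."""
    name: str
    fps: int        # frames per second of this recording
    offset_ms: int  # when the recording starts on the shared timeline


class SharedClock:
    """Single media clock that every displayed stream follows."""

    def __init__(self) -> None:
        self.media_ms = 0   # current position on the shared timeline
        self.speed = 1.0    # 1.0 = normal, 0.5 = slow motion, 2.0 = fast forward

    def advance(self, wall_ms: int) -> None:
        # Elapsed wall-clock time is scaled by the playback speed.
        self.media_ms += int(wall_ms * self.speed)


def frame_at(stream: Stream, media_ms: int) -> Optional[int]:
    """Index of the frame of `stream` to show at shared time `media_ms`."""
    local_ms = media_ms - stream.offset_ms
    if local_ms < 0:
        return None  # this recording has not started yet
    return local_ms * stream.fps // 1000


clock = SharedClock()
cam_a = Stream("camera A", fps=25, offset_ms=480)  # assumed values
cam_b = Stream("camera B", fps=30, offset_ms=0)    # assumed values

clock.advance(2000)   # two seconds at normal speed
clock.speed = 0.5     # switch to slow motion
clock.advance(1000)   # one more wall-clock second, half a media second

# Switching from camera A to camera B only changes which stream is queried;
# the clock is untouched, so synchronization is preserved.
print(clock.media_ms, frame_at(cam_a, clock.media_ms), frame_at(cam_b, clock.media_ms))
# prints: 2500 50 75
```

Because the frame index is derived from the shared clock rather than stored per stream, changing the playback speed or swapping the displayed stream cannot make the streams drift apart in this model.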

In the first video, we show the layout of the microphones and cameras around the dance floor, and how videos can be previewed.

[youtube]http://www.youtube.com/watch?v=kfLm6A1pJRQ[/youtube]

In the second video, we show how audio tracks and videos are selected and can be switched on the fly.

[youtube]http://www.youtube.com/watch?v=t8d5fUPZWAw[/youtube]

The third video shows an enhanced visualization mode where step information is displayed synchronously and overdubbed onto the audio track.

[youtube]http://www.youtube.com/watch?v=iFNNJI49nb0[/youtube]

The last video shows how dancers can be selected, along with their performance ratings. It also shows how two dancers can be compared, in this case using slow motion to analyse the errors of the dancer on the left.

[youtube]http://www.youtube.com/watch?v=nGswS1ocbUA[/youtube]

The audio resynchronization tools were developed by the Telecom ParisTech Audio Group, while the SVG application was developed within the Telecom ParisTech MultiMedia Group.

The complete description of this work is available in the paper "Enhanced visualisation of dance performance from automatically synchronised multimodal recordings".
