Silverlight Video Pyramids


Fig 1 – Silverlight Video Tile Pyramid

Microsoft’s DeepZoom technology capitalizes on tile pyramids for MultiScaleImage elements. It is an impressive technology and is the foundation of Bing Maps Silverlight Control Navigation. I have wondered for some time why the DeepZoom researchers haven’t extended this concept a little. One possible extension that has intriguing possibilities is a MultiScaleVideo element.

The idea seems feasible: break each frame into a DeepZoom-style pyramid, then re-encode the tiles at each level as their own video files. Being impatient, I decided to take an afternoon and try some proof-of-concept experiments. Rather than tile frame by frame, I thought I’d see how a pyramid of WMV files could be synchronized as a Grid of MediaElements:

<Grid x:Name="VideoTiles" Background="{StaticResource OnTerraBackgroundBrush}"
      Width="426" Height="240"
      HorizontalAlignment="Center" VerticalAlignment="Center">
    <Grid.ColumnDefinitions>
        <ColumnDefinition/>
        <ColumnDefinition/>
    </Grid.ColumnDefinitions>
    <Grid.RowDefinitions>
        <RowDefinition/>
        <RowDefinition/>
    </Grid.RowDefinitions>
    <MediaElement x:Name="v00" Source="http://az1709.vo.msecnd.net/video/v00.wmv"
                  Grid.Column="0" Grid.Row="0" />
    <MediaElement x:Name="v10" Source="http://az1709.vo.msecnd.net/video/v10.wmv"
                  Grid.Column="1" Grid.Row="0" />
    <MediaElement x:Name="v11" Source="http://az1709.vo.msecnd.net/video/v11.wmv"
                  Grid.Column="1" Grid.Row="1" />
    <MediaElement x:Name="v01" Source="http://az1709.vo.msecnd.net/video/v01.wmv"
                  Grid.Column="0" Grid.Row="1" />
</Grid>

Ideally, to try out a video tile pyramid I would want something like 4096×4096, since it divides evenly into 256×256 tiles like the Bing Maps pyramid. However, video formats are all over the place and tend to cluster on 4:3 or 16:9 aspect ratios. Red 4K at 4520×2540 is the highest resolution out there, but I didn’t see any way to work with that format in Silverlight. The best-resolution sample clip I could find that would work in Silverlight was from the WMV HD 1440×1080 Content Showcase. Since I like the Sting background music, I decided on “The Living Sea” IMAX sample.

That is not enough resolution to get very far, but I am only looking at multi-tile syncing for now, and two levels will do. I ended up using Expression Encoder 3 to crop the original into smaller tiles.

Zoom Level 1:

00 10
01 11

Zoom Level 2:

0000 0010 1000 1010
0001 0011 1001 1011
0100 0110 1100 1110
0101 0111 1101 1111

I encoded Zoom Level 1 as 4 tiles at 640×480 and Zoom Level 2 as 16 tiles at 320×240. I then dropped all of these tiles into my Azure CDN video container. Again, this is not a streaming server, but I hoped it would be adequate to at least try this in a limited time frame. Now that I have a video pyramid with two zoom levels, I can start trying out some ideas.
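The naming scheme in the two layouts above can be generated rather than typed by hand. The following is only a sketch of how I read the scheme: each level appends a two-digit quad name (first digit column, second digit row) to the parent tile name. The class and method names (TileNames, ForLevel, GridPosition) are made up for illustration and are not part of the original experiment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: reproduces the tile names shown in the zoom level layouts.
static class TileNames
{
    // Quad names in reading order: first digit = column, second digit = row.
    static readonly string[] Quads = { "00", "10", "01", "11" };

    // Returns every tile name at a zoom level (4 names at level 1, 16 at level 2).
    // Names are grouped by parent tile, not listed row by row.
    public static List<string> ForLevel(int level)
    {
        var names = new List<string> { "" };
        for (int i = 0; i < level; i++)
        {
            var next = new List<string>();
            foreach (string parent in names)
                foreach (string quad in Quads)
                    next.Add(parent + quad);
            names = next;
        }
        return names;
    }

    // Converts a tile name into its column and row within the full grid of its level.
    // Each digit pair contributes one more bit to the column and to the row.
    public static void GridPosition(string name, out int column, out int row)
    {
        column = 0;
        row = 0;
        for (int i = 0; i + 1 < name.Length; i += 2)
        {
            column = column * 2 + (name[i] - '0');
            row = row * 2 + (name[i + 1] - '0');
        }
    }
}
```

For example, GridPosition("1001", ...) gives column 2, row 1, which matches where 1001 sits in the Zoom Level 2 layout.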


Fig 2 – Silverlight Video Tile Pyramid Zoom Level 1


Fig 3 – Silverlight Video Tile Pyramid Zoom Level 2

First, it is fairly difficult to keep the Grid from showing in the layout. Changing the sizes can alter the border, but a slight line generally remains visible, as can be seen in Fig 2. Even though you don’t see the lines in Fig 3, it is also made up of four tiles. This is set up just like a normal tile pyramid, with four tiles under each upper tile in a kind of quad tree arrangement. In this case it is very simple, with just the two levels.

I tied some events to the MediaElements. The main pyramid events are tied to MouseWheel events:

void Video_MouseWheel(object sender, MouseWheelEventArgs e)
{
    // Only react when the wheel is turned over one of the video tiles.
    MediaElement me = e.OriginalSource as MediaElement;
    if (me == null) return;

    if (e.Delta < 0)
    {
        // Zoom out: reload the four top-level tiles.
        // Assumes Zoom Level 1 is the top of the pyramid, so there is nothing above it.
        if (VideoZoomLevel <= 1) return;
        VideoZoomLevel = 1;
        VideoCnt = 0;
        currentPostion = me.Position; // remember where playback was
        v00.Source = new Uri("http://az1709.vo.msecnd.net/video/v00.wmv");
        v10.Source = new Uri("http://az1709.vo.msecnd.net/video/v10.wmv");
        v11.Source = new Uri("http://az1709.vo.msecnd.net/video/v11.wmv");
        v01.Source = new Uri("http://az1709.vo.msecnd.net/video/v01.wmv");
    }
    else if (e.Delta > 0)
    {
        // Zoom in: stop at the deepest encoded level.
        if (VideoZoomLevel >= maxVideoZoom) return;
        VideoZoomLevel++;
        VideoCnt = 0;
        currentPostion = me.Position; // remember where playback was

        // "/video/v00.wmv" becomes "/video/v00"; each child appends its quad name.
        string quad = me.Source.LocalPath.Substring(0, me.Source.LocalPath.IndexOf(".wmv"));

        v00.Source = new Uri("http://az1709.vo.msecnd.net" + quad + "00.wmv");
        v10.Source = new Uri("http://az1709.vo.msecnd.net" + quad + "10.wmv");
        v11.Source = new Uri("http://az1709.vo.msecnd.net" + quad + "11.wmv");
        v01.Source = new Uri("http://az1709.vo.msecnd.net" + quad + "01.wmv");
    }
}

I’m just checking the MouseWheel delta to determine whether to zoom in or out. Then, looking at the original source, I determine which quad the mouse is over and create the URIs for the new zoom level. This is not terribly sophisticated. Not surprisingly, buffering is the killer. There are MediaOpened and Loaded events, which I attempted to use; however, there were quite a few problems with synchronizing the four tiles.
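The URI arithmetic in the handler can be pulled out into a small helper, which also makes it possible to step out one level at a time instead of jumping straight back to the top. This is a sketch in plain .NET rather than the code I actually ran; TileUris, Children and Parent are hypothetical names, and it assumes tile files are named "v" plus two digits per zoom level, as in the URLs above.

```csharp
using System;

// Hypothetical helper: derives child and parent tile URIs from a tile URI.
static class TileUris
{
    // Same order as the v00, v10, v11, v01 assignments in the handler.
    static readonly string[] Quads = { "00", "10", "11", "01" };

    // ".../video/v00.wmv" -> ".../video/v0000.wmv", "v0010.wmv", "v0011.wmv", "v0001.wmv"
    public static Uri[] Children(Uri parent)
    {
        string path = parent.AbsolutePath; // e.g. "/video/v00.wmv"
        int dot = path.LastIndexOf(".wmv", StringComparison.OrdinalIgnoreCase);
        if (dot < 0) throw new ArgumentException("Expected a .wmv tile URI", "parent");

        string stem = parent.GetLeftPart(UriPartial.Authority) + path.Substring(0, dot);
        var children = new Uri[Quads.Length];
        for (int i = 0; i < Quads.Length; i++)
            children[i] = new Uri(stem + Quads[i] + ".wmv");
        return children;
    }

    // ".../video/v0010.wmv" -> ".../video/v00.wmv"; returns null for a top-level tile.
    public static Uri Parent(Uri tile)
    {
        string path = tile.AbsolutePath;
        int dot = path.LastIndexOf(".wmv", StringComparison.OrdinalIgnoreCase);
        int slash = path.LastIndexOf('/');
        int digits = dot - slash - 2; // characters between the leading 'v' and ".wmv"
        if (dot < 0 || digits < 4) return null;

        string stem = tile.GetLeftPart(UriPartial.Authority) + path.Substring(0, dot - 2);
        return new Uri(stem + ".wmv");
    }
}
```

With something like this, the zoom-in branch would reduce to assigning the four entries of Children(me.Source) to the four MediaElements.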

If you can patiently wait for the buffering, it does work after a fashion. Eventually the WMV files end up in the local cache, which helps. However, the whole affair is fragile and erratic.

I didn’t attempt to go any further with panning across Zoom Level 2. Buffering was the biggest problem. I’m not sure how much further I could get by moving to a streaming media server or by monitoring BufferingProgress with a timer thread.
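One approach I did not get working, but which seems worth sketching, is to hold all four tiles until every one of them reports that it is ready, and only then set Position and call Play on each. The bookkeeping for that is independent of Silverlight and could look like the class below. TileSyncGate and MarkReady are hypothetical names; wiring MarkReady to each MediaElement's MediaOpened event (or to a BufferingProgress check) is an assumption about how it would be used, not something I have tested.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: waits for a fixed set of tiles to report ready,
// then runs a callback exactly once.
sealed class TileSyncGate
{
    private readonly HashSet<string> pending;
    private readonly Action onAllReady;

    public TileSyncGate(IEnumerable<string> tileNames, Action onAllReady)
    {
        this.pending = new HashSet<string>(tileNames);
        this.onAllReady = onAllReady;
    }

    // Call from each tile's "ready" handler.
    // Returns true only on the call that released the gate.
    public bool MarkReady(string tileName)
    {
        if (!pending.Remove(tileName)) return false; // unknown tile or duplicate signal
        if (pending.Count > 0) return false;         // still waiting on other tiles
        onAllReady();
        return true;
    }
}
```

In the handler above, a new gate would be created each time the four Source values change, with a callback that restores the saved position on v00, v10, v11 and v01 and starts them together. Whether that is enough to keep four separately buffered WMV files in step is exactly what this experiment left open.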

The experiment may have been a failure, but the concept is nonetheless interesting. Perhaps some day a sophisticated codec will have such things built in.

The high altitude perspective

One aspect that makes MultiScaleVideo interesting is its additional dimension of interactivity. As film moves inexorably to internet streaming, there is more opportunity for viewer participation. In a pyramid approach, focus is in the viewer’s hands. The remote becomes a focus tool that moves in and out of magnification levels as well as panning across the 2D video surface.

In the business world this makes interfaces to continuous data collections even more useful. As in the video feed experiment of the previous post, interfaces can scan at low zoom levels and then zoom in for detailed examination of items of interest. The Streetside photos in the Streetside path navigation example already hint at this, using the run navigation to animate a short photo stream while also providing DeepZoom zoom and pan capability.

One of the potential pluses, from a distributor’s point of view, is repeat viewer engagement. Since the viewer is in control, every viewing is potentially unique, which discourages the view-and-discard habit common with film and video today. This adds value to potential advertising revenue.

The film producer also has some incentive with a whole new viewer axis to play with. Now focus and peripheral vision take on another dimension, and focal point clues can create more interaction or in some cases deceptive side trails in a plot. Easter eggs in films provide an avid fan base with even more reason to view a film repeatedly.

Finally, small form factor handheld viewers such as iPhone and Android enabled cell phones can benefit from some form of streaming that allows user focus navigation. The screen in these cases is small enough to warrant some navigation facility. Perhaps IMAX or even Red 4K on handhelds is unreasonable, but allowing navigation certainly makes even the more common HD formats more usable. A video pyramid of streaming sources could make a compelling difference in the handheld video market.

Summary

MultiScaleVideo is a way to enhance user interaction in a 2D video. It doesn’t approach the game-level interaction of true 3D scene graphs, but it does add another axis of interest. My primitive exercise was not successful. I am hoping that Microsoft Labs will someday make this feasible and add another type of navigation to the arsenal. Of course, you can imagine the ensuing remote controller wars if DeepZoom video ever becomes commonplace.

One more thing, check out the cool scaling animation attached to the expander button, courtesy of Silverlight Toolkit Nov drop.
