Monday, March 28, 2011

Hardware Accelerated Video Playback in Moonlight

David Reveman has just completed a series of optimizations in the Moonlight engine that allows Moonlight to take advantage of your GPU for the data intensive video rendering operations. This is in addition to the standard GPU hardware acceleration that we debuted a few weeks ago.
This is what the video rendering loop looks like in Moonlight:

mediaelement-pipeline.PNG
Every one of those steps is an expensive process as it has to crunch to a lot of data. For example, a 720p video which has a frame size of 1280x720, this turns out to be 921,600 pixels. This frame while stored in RGB format at 8 bits per channel takes 2,764,800 bytes of memory. If you are decoding video at 30 frames per second, you need to at least move from the encoded input to the video 82 megabytes per second. Things are worse because the data is transformed on every step in that pipeline. This is what each step does:

The video decoding is the step that decompresses your video frames. This is done one frame at a time, the input might be small, but the output will be the size of the original video.

The decoding process generates images in YUV format. This format is used to store images and videos but and with previous versions of Moonlight, we had to convert this YUV data into an in-memory bitmap encoded in RGB format.

The final step is to transfer this image to the graphics card. This typically involves copying the data from the system memory to the graphics card, and in Unix this goes through the user process to the X server process, which eventually moves the data to the graphics card.

New Hardware Accelerated Framework