A decision was made to use mpeg_play-2.3-patched as the origin of the MPEG playing code library. As an initial simplification, several of the available dithers were removed: some did not seem to work, while others offered no advantage over the remaining dither algorithms. Unused #ifdefs, and those used only for testing, were removed, along with program options and features of little or no interest when running inside a Web browser, such as the stand-alone control bar. The cleanup reduced the code size by several thousand lines.
Then came the time to extend the library. First, mpeg_play was written to read data from a file at its own pace, that is, ``pulling'' data from the source. The input routines had to be rewritten to let Navigator ``push'' the data as it becomes available. Second, mpeg_play spends some time decoding the data, especially if the MPEG stream flows slowly. Since Navigator executes plug-ins in its own thread, this could cause unfortunate ``hangs'', where user input goes unanswered. This problem calls for a separate thread, and with separate threads come the problems of parallelism.
One of the two threads handles decoding of an MPEG stream to produce video output. This thread may be classified as an MPEG decoding server. The original thread, running inside Netscape Navigator, may be viewed as a client, feeding commands and MPEG data to the server.
Real multi-threading, that is, several threads of execution running in the same address space, is currently not supported by all Unix systems, so from a portability point of view it is safer to rely on old-fashioned separate processes. This brings another benefit: the mpeg_play source code pays little attention to freeing memory. For each movie it allocates lots of memory, but the chunks are never freed. When a separate process is used for each film, the memory is automatically released when the process terminates. Using separate processes is much easier than tracking down every memory leak in the code.
In Unix, the system call fork is used for creating a new process. The new (child) process is an exact copy of the calling (parent) process. Both processes continue running after the call to fork. The return value from fork makes it possible to identify which process is the parent, and which is the child, to let the two processes perform different tasks. The plug-in will fork off a child to handle the MPEG decoding, while the parent, running in Navigator's address space, will handle user input.
The separate processes are supposed to cooperate, so we need a way to make them talk to each other through interprocess communication (IPC). The MPEG decoding process plays the role of a server, waiting for commands and data from the client process. The commands are initiated by user action, such as pressing buttons to start and stop the playback. Commands are also initiated by Navigator; for instance to quit the server when the user leaves a page, and when more MPEG data are available. All this traffic moves from the client to the server. We also need some information to go in the other direction: When Navigator works in a streamed manner, it will query the plug-in how many bytes it will accept (see NPP_WriteReady). The reception and buffering of incoming MPEG data is handled by the server, so we need to return that information through IPC.
Unix provides several methods for doing IPC (for an overview of these methods and more, see [1] and [75]).
Figure 5.1: The two processes involved in the MPEG plug-in.
Figure 5.1 illustrates the two processes cooperating in decoding and viewing an MPEG stream. The process on the left is the original Netscape Navigator process with the plug-in dynamically loaded. When the plug-in starts receiving a stream, it forks, creating an identical process. The Navigator part of the new process is never run, and in modern operating systems it does not even take up valuable memory. Two pipes are created, one for sending commands and data from the client to the server, and one for giving feedback from the server to the client. When the client receives parts of the MPEG stream from Netscape Navigator, it passes them on to the server and receives a status reply in return. The server handles all decoding, and displays the result in a subwindow of the browser.
This section describes the API that was written for the extended mpeg_play library to hide the details of parallelism and IPC, and thus simplify its use. The user of the library, in our case the MPEG plug-in, calls normal functions, two of which fork a new process, while the others perform the IPC. The two forking functions are the main entry points to the library. They differ in how they set up the handling of the MPEG stream.
Among the commands accepted are those used to push the MPEG stream to the decoder.
Another set of functions is involved with direct user action. In the plug-in, these are called whenever the user presses buttons like ``Play'', ``Rewind'', ``Stop'', etc.
The last functions deal with the window in which the movie is shown. They are typically called when the client receives certain events from the window system.
The original mpeg_play was written to read the MPEG stream from a file. To simplify the conversion to a more ``event-driven'' approach, a new set of functions was written, resembling the fread and fgetc functions used formerly. The new functions work against a buffer that is filled when the client sends MPEG data using mpSendBytes. If the buffer is empty when any of these functions is called, a tight loop similar to the main loop is entered, waiting for more data from the client. The reading function does not return until more data is available, or an mpEndOfStream is issued.
The main goal of the tight loop is to receive MPEG data for further decoding. What if the client sends other commands? Handling them may work for some, but fail for others. An example: the server runs out of data while decoding a frame, so it enters the loop to wait for more. The client sends a command ordering the server to skip to the next frame. The server obeys, calling the frame-decoding function a second time, recursively, and messing everything up. To get around this problem, queuing of commands was introduced. Every command not related to sending MPEG data or quitting is entered in a queue. This queue is later processed in the main loop of the server, before any further commands are read from the client.
Note that mpQueryWantedBytes must be responded to immediately, that is, not through the queue. Failure to do so would introduce a deadlock (more on deadlocks in [76]): the server is waiting for the client to send data. The client calls mpQueryWantedBytes and waits for a reply. The reply will never come, since the server just queues the command.
The X Window System is designed to be portable across a wide range of platforms, with an even wider range of supported displays. Unfortunately, displays differ too much in their characteristics to make this transparent to programmers.
The display hardware offers one or more bitplanes. The combination of corresponding bits from each plane yields a pixel value, controlling a single pixel on the screen by indexing into a colormap. The number of simultaneous colors or grayscales is thus 2^n, where n is the number of bitplanes; for example, 8 bitplanes give 2^8 = 256 simultaneous colors. Monochrome displays have a single bitplane. Color or grayscale displays typically have between 8 and 24 bitplanes.
Display hardware is typically capable of generating a much larger number of colors than may be displayed at once. To control which colors are currently displayable, colormaps are used. For color displays, a colormap entry describes the mixture of red, green and blue light used to produce the color in question. The pixel value from the bitplanes is used as an index into the current colormap.
Depending on the hardware, a colormap may be writable, or read-only. Writable colormaps let programs change the red, green and blue component to fit their needs. Read-only colormaps have preset values that may not be changed.
In X11, the characteristics of a colormap are described using a visual. The visual describes, among other things, the number of bitplanes, the size of the colormap, and a visual class. The visual class describes the features of the colormap: is it writable or read-only? Is it color, grayscale, or monochrome? Is the index into the colormap decomposed into separate indexes for the three color components? Table 5.9 sums this up for the six available visual classes:
Table 5.9: Comparison of Visual Classes. (From [77, section 7.3.4].)
The books [77] and [78] give thorough information on X11 and colors.
The MPEG plug-in has two approaches to the use of colors. Which one to use is decided by the user. If the hardware supports multiple colormaps, the plug-in may create its own map, coexisting with the one used by Navigator. The switching of colormaps is done by the window manager, so applications using multiple colormaps have to inform the window manager about this. The information is passed using window manager hints on the toplevel window of the application. When the plug-in is set up to use its own map, it searches its ancestors until the main window is found, and adds the appropriate hints to that window.
A problem may arise when only one hardware colormap is available. In that case the plug-in has to share the colormap with Navigator. For displays with more than eight bitplanes, the number of available colors suffices. When only eight bitplanes are available, which is the case for many X Window workstations, only a few colors are left for the plug-in, as Navigator allocates most of them for itself. In our implementation, the plug-in allocates as many colors as it can. When further allocation fails, it starts matching against the existing entries in the colormap to find close colors. Unfortunately, this yields unacceptable results.