A decision was made to use mpeg_play-2.3-patched as the origin of the MPEG playing code library. As an initial simplification, several of the available dithers were removed: some did not seem to work, while others offered no advantage over the remaining dither algorithms. Unused #ifdefs, and those used only for testing, were removed, along with program options and features of little or no interest when running inside a Web browser, such as the stand-alone control bar. The cleanup reduced the code size by several thousand lines.
Then came the time to extend the library. First, mpeg_play was written to read data from a file at its own pace, that is, ``pulling'' data from the source. The input routines had to be rewritten to let Navigator ``push'' the data as it becomes available. Second, mpeg_play spends some time decoding the data, especially if the MPEG stream flows slowly. Since Navigator executes plug-ins in its own thread, this could cause unfortunate ``hangs'', where user input goes unanswered. This problem calls for a separate thread, and with separate threads come the problems of parallelism.
One of the two threads handles decoding of an MPEG stream to produce video output. This thread may be classified as an MPEG decoding server. The original thread, running inside Netscape Navigator, may be viewed as a client, feeding commands and MPEG data to the server.
Real multi-threading, that is, several threads of execution running in the same address space, is currently not supported by all Unix systems, so from a portability point of view it is safer to rely on old-fashioned separate processes. This brings another benefit: the mpeg_play source code pays little attention to freeing memory. For each movie it allocates lots of memory, but the chunks are never freed. When a separate process is used for each film, the memory is automatically released when the process terminates. Using separate processes is much easier than tracking down every memory leak in the code.
In Unix, the system call fork is used for creating a new process. The new (child) process is an exact copy of the calling (parent) process. Both processes continue running after the call to fork. The return value from fork makes it possible to identify which process is the parent, and which is the child, to let the two processes perform different tasks. The plug-in will fork off a child to handle the MPEG decoding, while the parent, running in Navigator's address space, will handle user input.
The separate processes are supposed to cooperate, so we need a way to make them talk to each other through interprocess communication (IPC). The MPEG decoding process plays the role of a server, waiting for commands and data from the client process. The commands are initiated by user action, such as pressing buttons to start and stop the playback. Commands are also initiated by Navigator; for instance to quit the server when the user leaves a page, and when more MPEG data are available. All this traffic moves from the client to the server. We also need some information to go in the other direction: When Navigator works in a streamed manner, it will query the plug-in how many bytes it will accept (see NPP_WriteReady). The reception and buffering of incoming MPEG data is handled by the server, so we need to return that information through IPC.
Unix provides several methods for doing IPC (for an overview of these methods and more, see [1] and [75]).
Figure 5.1: The two processes involved in the MPEG plug-in.
Figure 5.1 illustrates the two processes cooperating in decoding and viewing an MPEG stream. The process on the left is the original Netscape Navigator process with the plug-in dynamically loaded. When the plug-in starts receiving a stream, it forks, creating an identical process. The Navigator part of the new process is never run, and in modern operating systems it does not even take up valuable memory. Two pipes are created, one for sending commands and data from the client to the server, and one for giving feedback from the server to the client. When the client receives parts of the MPEG stream from Netscape Navigator, it passes them on to the server and receives a status reply in return. The server handles all decoding, and displays the result in a subwindow of the browser.
This section describes the API that was written for the extended mpeg_play library to hide the details of parallelism and IPC, and thus simplify its use. The user of the library, in our case the MPEG plug-in, calls normal functions, two of which fork a new process, while the others perform the IPC. The two forking functions are the main entry points to the library. They differ in how they set up the handling of the MPEG stream.
Among the commands accepted are those used to push the MPEG stream to the decoder.
Another set of functions is involved with direct user action. In the plug-in, these are called whenever the user presses buttons like ``Play'', ``Rewind'', ``Stop'', etc.
The last functions deal with the window in which the movie is shown. They are typically called when the client receives certain events from the window system.
The original mpeg_play was written to read the MPEG stream from a file. To simplify the conversion to a more ``event-driven'' approach, a new set of functions was written, resembling the fread and fgetc functions used formerly. The new functions work against a buffer that is filled when the client sends MPEG data using mpSendBytes. If the buffer is empty when any of these functions is called, a tight loop similar to the main loop is entered, waiting for more data from the client. The reading function does not return until more data is available, or an mpEndOfStream is issued.
The main goal of the tight loop is to receive MPEG data for further decoding. What if the client sends other commands? Handling them may work for some, but fail for others. An example: the server runs out of data while decoding a frame, so it enters the loop to wait for more. The client sends a command ordering the server to skip to the next frame. The server obeys, calling the frame-decoding function a second time, recursively, and messing everything up. To get around this problem, queuing of commands was introduced. Every command not related to sending MPEG data or quitting is entered in a queue. This queue is later processed in the main loop of the server, before any further commands are read from the client.
Note that mpQueryWantedBytes must be responded to immediately, that is, not through the queue. Failure to do so would introduce a deadlock (more on deadlocks in [76]): the server is waiting for the client to send data. The client calls mpQueryWantedBytes and waits for a reply. The reply will never come, since the server just queues the command.
The X Window System is designed to be portable across a wide range of platforms, with an even wider range of supported displays. Unfortunately, displays differ too much in their characteristics to make this transparent to programmers.
The display hardware offers one or more bitplanes. The combination of corresponding bits from each plane yields a pixel value, controlling a single pixel on the screen by indexing into a colormap. The number of simultaneous colors or grayscales is thus 2^n, where n is the number of bitplanes; for example, 8 bitplanes give 2^8 = 256 simultaneous colors. Monochrome displays have a single bitplane. Color or grayscale displays typically have between 8 and 24 bitplanes.
Display hardware is typically capable of generating a much larger number of colors than may be displayed at once. To control which colors are currently displayable, colormaps are used. For color displays, a colormap entry describes the mixture of red, green and blue light used to produce the color in question. The pixel value from the bitplanes is used as an index into the current colormap.
Depending on the hardware, a colormap may be writable, or read-only. Writable colormaps let programs change the red, green and blue component to fit their needs. Read-only colormaps have preset values that may not be changed.
In X11, the characteristics of a colormap are described using a visual. The visual describes, among other things, the number of bitplanes, the size of the colormap, and a visual class. The visual class describes the features of the colormap: is it writable or read-only? Is it color, grayscale, or monochrome? Is the index into the colormap decomposed into separate indexes for the three color components? Table 5.9 sums this up for the six available visual classes:
Table 5.9: Comparison of Visual Classes. (From [77, section 7.3.4].)
The books [77] and [78] give thorough information on X11 and colors.
The MPEG plug-in has two approaches to the use of colors. Which one to use is decided by the user. If the hardware supports multiple colormaps, the plug-in may create its own map, coexisting with the one used by Navigator. The switching of colormaps is done by the window manager, so applications using multiple colormaps have to inform the window manager about this. The information is passed using window manager hints on the toplevel window of the application. When the plug-in is set up to use its own map, it searches its ancestors until the main window is found, and adds the appropriate hints to that window.
A problem may arise when only one hardware colormap is available. In that case the plug-in has to share the colormap with Navigator. For displays with more than eight bitplanes, the number of available colors suffices. When only eight bitplanes are available, which is the case for many X Window workstations, only a few colors are left for the plug-in, as Navigator allocates most of them for itself. In our implementation, the plug-in allocates as many colors as it can. When further allocation fails, it starts matching against the existing entries in the colormap to find close colors. Unfortunately, this yields unacceptable results.