SPEECH SYNTHESIS MODULE #3330
Since the Intellivision was based on a General Instruments chip set, it didn't take long for someone to realize that the General Instruments "Orator" speech-synthesis chip could be the basis for a nifty Intellivision add-on. Ron Carlson, an engineer in Design & Development, was put in charge of developing the Intellivoice module hardware; Ron Surratt, who would later manage the programming of M Network Atari 2600 games, was hired to write the software; Patrick Jost was in charge of analyzing and editing voice data.
They took advantage of the Orator's onboard 16K of ROM to build a set of generic words and phrases that could be used in any voice game to pad its vocabulary. These included, read in a male announcer's voice, numbers, "left," "right," "up," "down," and the familiar "Mattel Electronics Presents." These onboard phrases, called the RESROM (for Resident ROM), along with the voices for the first voice game, , were recorded and digitized in New York, at General Instruments' voice lab.
Ron Carlson and Patrick Jost supervised the sessions in New York and sent the data back to Ron Surratt in California, who loaded it into the Intellivoice prototype. But once switched on, all it could do was keep repeating "Auk yooo! Auk yooo!" which didn't go over well with the Mattel executives and marketing personnel. In several heated phone calls between Hawthorne and New York, Carlson blamed the problem on Surratt's software and Surratt blamed it on Carlson's hardware.
"I really didn't know what I was talking about," Surratt admits today, "but luckily it did turn out to be the hardware."
Eventually the bugs were ironed out and Mattel fully committed to developing voice games. A state-of-the-art voice lab was built in the Mattel Electronics building in Hawthorne; this heavily soundproofed facility was ideal for editing, as well as for after-hours (and occasional during-hours) hanky-panky.
Voice editing was crucial, since each game cartridge could only hold 4 to 8K of voice data. Words had to be digitized at the lowest possible sampling rate at which they could be understood; often, the sampling rate would be changed three or four times within the same word - lower for vowels, higher for consonants - to save space.
Despite these space-saving efforts, the number of words that could be fit into a voice game was extremely limited, which probably contributed to the Intellivoice's failure. While orders for the initial voice game releases were around 300,000 each, orders for the fourth game, , released later, hit only 90,000. A completed children's game, , was shelved.
A restyled Intellivoice, designed to match the Intellivision II, appeared in the January 1983 Mattel Electronics catalog; a working prototype, however, was never built. The module shown in the catalog was merely a carved and painted block of wood.
At least two prototypes were built, though, of an International Intellivoice module. The prototypes look like a regular Intellivoice, but they contain additional ROM with French, German and Italian versions of the RESROM. Foreign language versions of Space Spartans were produced, but neither they nor the International Intellivoice module itself were ever released.
In an attempt to recoup the Intellivoice investment, Mattel decided to include the Orator chip and RESROM in the Intellivision III master component; no add-on module would be needed to play the original or new Intellivoice games. Unfortunately, the Intellivision III never got off the drawing board.
On August 4, 1983, all Mattel Electronics personnel related to Intellivoice were laid off.
[The following material is excerpted from the Intellivoice (Model 3330) Product Engineering Specification by Thomas L. Randolph, project engineer, March 18, 1982, revised May 6, 1982; from the General Instruments Product Specification for the Orator Speech Processor; and from the Intellivoice Service Manual.]
The 3330 produces audio speech signals when used in conjunction with a and/or , and . Cartridges are deemed "Voice Compatible" if they make use of the 3330 speech facilities. Cartridges which are not voice compatible do not make use of the speech facilities, nor allow certain of the required speech production signal functions to be performed. Nonetheless, non-voice compatible game cartridges can be used with the 3330, but no voice gameplay enhancement is provided.
When used in gameplay, the Intellivoice unit "speaks" through the sound channel of the television. It uses the same audio channel as the sound generator in the Master Component.
A volume control on the 3330 allows variance of the voice loudness level. This control does not affect the normal game sounds - only voice.
The 3330 unit will be the base for future peripherals. Additional hardware was included to provide for controlled communication between the Master Component and those peripherals. Future peripherals will plug into the top-mounted connector of the 3330.
The 3330 consists of a VLSI speech synthesizer, an LSI buffer/interface chip, an active audio filter/amplifier section, and provisions for current assistance to the Master/Keyboard Component's +5V power supply.
The speech synthesizer is the General Instruments SP-0256 Orator. The SP-0256 incorporates four basic functions:
• A software programmable digital filter that can be made to model a VOCAL TRACT.
• A 16K ROM which stores both speech data (Resident ROM or RESROM) and instructions (the PROGRAM).
• A MICROCONTROLLER which controls the data flow from the ROM to the digital filter, the assembly of the "word strings" necessary for linking speech elements together, and the amplitude and pitch information to excite the digital filter.
• A PULSE WIDTH MODULATOR that creates a digital output which is converted to an analog signal when filtered by an external low pass filter.
The SP-0256 can also accept serial speech data from an external source.
For the 3330, the RESROM contains a variety of words and phrases that may be useful in video games. The PROGRAM consists of 17 different parameters used by the VOCAL TRACT model to imitate human speech patterns.
The buffer/interface chip (General Instruments SPB-640) contains logic required to interface the speech synthesizer to the Master/Keyboard Component cartridge bus.
Controlling input to the buffer/interface chip is primarily from the Master Component Microprocessor bus signals. Other controlling inputs are generated by the speech synthesizer during speech production.
The buffer/interface chip has three methods of transmitting data to the speech synthesizer and peripherals connected to the stacking connector.
The first speech-oriented data transference method causes the synthesizer to produce speech segments contained in its internal ROM (RESROM): the buffer/interface chip allows the address of the desired speech segment to pass onto an 8-bit peripheral data bus connecting the buffer/interface and synthesizer chips, and sets the proper control lines for the synthesizer to generate the segment.
The second method of moving speech data allows the Master/Keyboard Component to load custom speech data into the synthesizer: data from the game cartridge is loaded into the buffer/interface chip's 640-bit FIFO array and converted to serial data, and the buffer/interface chip sets the proper control lines for the synthesizer to read the serial data and convert it to speech.
Finally, the buffer/interface chip also allows moving data to and from peripherals through the top-mounted stacking connector: the buffer/interface chip sets the proper control lines for the peripheral bus to carry bi-directional microprocessor data.
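The second transfer method above (cartridge data loaded into a FIFO and shifted out serially) can be illustrated with a small model. This is only a sketch, not the SPB-640's actual logic: the 640-bit capacity comes from the specification, but the 10-bit word width, the least-significant-bit-first shift order, and the names SpeechFifo, write and serial_bits are assumptions made for illustration.

```python
from collections import deque

FIFO_BITS = 640          # capacity stated in the specification
WORD_BITS = 10           # assumption: 10-bit words, giving 64 entries
FIFO_WORDS = FIFO_BITS // WORD_BITS


class SpeechFifo:
    """Toy model of a parallel-in, serial-out speech data buffer."""

    def __init__(self):
        self.words = deque()

    def is_full(self):
        return len(self.words) >= FIFO_WORDS

    def write(self, word):
        """Load one word from the cartridge side; refuse it when full."""
        if self.is_full():
            return False
        self.words.append(word & ((1 << WORD_BITS) - 1))
        return True

    def serial_bits(self):
        """Drain the buffer as a bit stream, oldest word first.

        Bit order (least significant bit first) is an assumption.
        """
        while self.words:
            word = self.words.popleft()
            for i in range(WORD_BITS):
                yield (word >> i) & 1


fifo = SpeechFifo()
for w in (0x3FF, 0x001, 0x200):
    fifo.write(w)
bits = list(fifo.serial_bits())
```

In this model the writer must check the return value of write (or is_full) and hold off until the reader has drained some data, which mirrors the general reason a buffer sits between a fast processor bus and a slower serial consumer.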
ACTIVE AUDIO FILTER/AMPLIFIER SECTION
The output of the speech synthesizer is not conventional audio, but is a 40 kHz digital Pulse Width Modulated (PWM) signal. When viewed on an oscilloscope, this appears to be a square wave whose edges rapidly expand and contract as speech generation takes place.
A series of filters (an LM-324C Quad OP Amp and related components) converts the PWM signal to conventional audio which is then amplified (an LM-358C Dual OP Amp and related components, including volume control) and fed to the Master Component.
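How low-pass filtering turns a PWM stream back into an audio level can be shown with a short simulation. This is a simplified sketch, not a model of the actual op-amp circuit: the 40 kHz carrier is taken from the specification, while the single-pole filter, its 5 kHz corner, the 32 time slices per period, and the function names pwm_encode and low_pass are assumptions chosen for illustration.

```python
import math

PWM_HZ = 40_000        # carrier frequency given in the specification
STEPS = 32             # assumption: time slices simulated per PWM period
CUTOFF_HZ = 5_000      # assumption: filter corner at the top of the passband


def pwm_encode(levels):
    """Turn levels in 0.0..1.0 into a 0/1 stream, one PWM period per level."""
    out = []
    for level in levels:
        high = round(level * STEPS)
        out.extend([1.0] * high + [0.0] * (STEPS - high))
    return out


def low_pass(signal):
    """Single-pole RC low-pass filter (a stand-in for the op-amp stages)."""
    dt = 1.0 / (PWM_HZ * STEPS)
    rc = 1.0 / (2 * math.pi * CUTOFF_HZ)
    alpha = dt / (rc + dt)
    y, out = 0.0, []
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


# A constant 25% duty cycle should settle near 0.25 after filtering.
filtered = low_pass(pwm_encode([0.25] * 400))
tail = filtered[-STEPS:]               # one full PWM period at the end
average = sum(tail) / len(tail)
ripple = max(tail) - min(tail)         # residual carrier left by the filter
```

Once the filter has settled, the average output over one PWM period matches the duty cycle, so varying the pulse width from period to period traces out the speech waveform. A single pole leaves visible carrier ripple, which is one plausible reason a real design would use several filter stages.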
The effective passband for the speech signals is from 150 Hz to 5 kHz. Within this is also a 3 dB/octave bass pre-emphasis.
POWER SUPPLY BOOSTING/SUPPORT
The stacking connector has its connections arranged in such a way as to allow a future power supply to fill the 3330 and game cartridge power requirements, and boost the power capability of the Master/Keyboard Component's power supply.
Power supply boosting can be accomplished by allowing power input to pin six of the stacking connector. This unregulated voltage is applied through an 8.2 Ohm, 2W resistor to the Master/Keyboard Component Vcc on pin 43 of the cartridge port to supply an approximately 270 mA boost.
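The figures in this paragraph can be checked with Ohm's law. The resistor value, its rating, and the boost current come from the specification; the unregulated input voltage is not stated there, so the value computed below is only what those figures imply if Vcc sits at exactly +5V.

```python
R_OHMS = 8.2        # series resistor from the specification
R_RATED_W = 2.0     # resistor power rating from the specification
I_BOOST_A = 0.270   # approximate boost current from the specification
VCC_V = 5.0         # assumption: Vcc rail sits at exactly +5V

v_drop = I_BOOST_A * R_OHMS            # Ohm's law: V = I * R
p_resistor = I_BOOST_A ** 2 * R_OHMS   # dissipation: P = I^2 * R
v_input = VCC_V + v_drop               # implied unregulated input voltage

print(f"drop across resistor: {v_drop:.2f} V")
print(f"resistor dissipation: {p_resistor:.2f} W of {R_RATED_W:.0f} W rating")
print(f"implied input voltage: {v_input:.2f} V")
```

Under these assumptions the resistor drops roughly 2.2 V and dissipates about 0.6 W, comfortably inside its 2W rating, and the boost supply would need to deliver a little over 7 V at pin six.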