Sunday, 27 September 2015 17:05

Zhong Shuo: Technical Description

A modified telephone in the kiosk and a computer interface (both designed and built by Jim Sosnin) enable the computer to record speech and play voice prompts on the receiver. The interface connects to the games port of a sound card. Using a PIC chip, it produces control voltages in response to MIDI messages from the computer in order to control the ringer of the phone. Control voltages sent in the opposite direction, from the phone, are converted to MIDI messages by the interface and sent to the input of the games port. These messages allow software on the computer to detect whether the receiver is on-hook or off-hook. A second output channel of audio from the computer is connected to a mono amplifier which sends sound to the outdoor loudspeaker.
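The hook detection described above can be sketched in a few lines of Python. The source does not say which MIDI messages the interface sends, so everything specific here is an assumption: the sketch supposes a three-byte control-change message on a hypothetical controller number (`HOOK_CONTROLLER`), with values of 64 and above meaning off-hook, and a byte stream that contains only complete three-byte messages (no running status).

```python
HOOK_CONTROLLER = 20   # hypothetical controller number, not from the source
OFF_HOOK_THRESHOLD = 64  # assumed: values at or above this mean "off-hook"


def hook_events(midi_bytes):
    """Return 'off-hook' / 'on-hook' for each matching control-change message.

    Assumes the stream is a sequence of complete 3-byte MIDI messages.
    """
    data = bytes(midi_bytes)
    events = []
    for i in range(0, len(data) - 2, 3):
        status, controller, value = data[i:i + 3]
        # 0xB0-0xBF are control-change status bytes (one per MIDI channel).
        if status & 0xF0 == 0xB0 and controller == HOOK_CONTROLLER:
            events.append("off-hook" if value >= OFF_HOOK_THRESHOLD else "on-hook")
    return events
```

In the real installation this decision is made inside Pd; the function above only illustrates how a stream of messages maps to receiver states.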

The computer at each site uses the Linux operating system. Miller Puckette's graphical sound language “Pd” performs audio input and output tasks and a variety of other operations including MIDI processing. Custom extensions to Pd written in the C language interface with a database called PostgreSQL and perform editing operations. When the receiver is lifted, Pd begins playing a pre-recorded message on the earpiece channel, repeating the questions in Chinese and prompting the participant to speak after a beep. If the receiver remains off-hook, Pd begins to record the voice directly to the hard disk. Once the receiver is replaced, the recording is stopped and information about the recording, including the file name, date, time and duration, is written to the database in special tables corresponding to the local site (i.e. the Beijing installation writes this data to the “Beijing” tables). Recordings are compressed using the “speex” format. If the recording is less than seven seconds long, it is rejected. If a speaker talks for 9 minutes and 45 seconds (15 seconds less than the maximum length of 10 minutes), Pd sends a warning voice message to the earpiece informing the speaker that they have 15 seconds to finish. If the speaker continues for more than a further 15 seconds, a second message informs them that the recording has finished and asks them to hang up the receiver. All recordings longer than seven seconds are acknowledged by a voice message played over the garden speaker.
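The length rules in this paragraph can be summarised as a small decision function. The following Python sketch is illustrative only: the actual logic lives in Pd and its C extensions, and the names `classify_recording` and `RecordingOutcome` are made up for this example. It takes the maximum length to be 10 minutes with the warning 15 seconds before it, and it treats a recording of exactly seven seconds as accepted, which the text leaves open.

```python
from dataclasses import dataclass

MIN_SECONDS = 7         # shorter recordings are rejected
MAX_SECONDS = 10 * 60   # maximum recording length
WARNING_LEAD = 15       # the warning plays this many seconds before the maximum


@dataclass
class RecordingOutcome:
    accepted: bool         # kept and acknowledged over the garden speaker
    warned: bool           # the "15 seconds left" message was played
    truncated: bool        # the speaker ran past the maximum length
    stored_seconds: float  # duration that ends up in the database


def classify_recording(spoken_seconds: float) -> RecordingOutcome:
    """Apply the length rules to the time spent speaking after the beep."""
    warned = spoken_seconds >= MAX_SECONDS - WARNING_LEAD
    truncated = spoken_seconds > MAX_SECONDS
    stored = min(spoken_seconds, MAX_SECONDS)
    accepted = stored >= MIN_SECONDS
    return RecordingOutcome(accepted, warned, truncated, stored)
```

For example, `classify_recording(5)` is rejected, while `classify_recording(700)` is accepted, warned and cut off at 600 seconds.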

At all times, recorded story files are selected, edited and played back by the local computer in an interwoven fashion (the volume of this playback is, however, lowered slightly during recording to minimise leakage of other recordings). Selection and playback are performed by an extension to Pd that attempts to fill a particular length of time with edited stories. This is currently set to 15 minutes, and the number of stories/voices it plays back “simultaneously” in interwoven or cross-faded form is set to 3. Stories are fragmented into semi-random chunks, usually around 10-15 seconds in length, and are played in their original temporal sequence. These fragments are given additional “lead-in” and “lead-out” durations which are sounded during cross-fades and allow the listener to reacquaint themselves with a given speaker returning to his or her story. The playback process begins playing a given story and, as the end of the first fragment approaches, it begins to cross-fade the voice with that of a new speaker. It then does the same with a third voice (and so on if more than 3 simultaneous speakers have been specified) and returns to the first. This continues until a story is exhausted, whereupon a new story is added to the queue. Once the 15 minutes of stories have been played, the entire process begins again.
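The interweaving described above can be approximated by a simple scheduler. This Python sketch is not the Pd extension itself: it only works out the order of fragments and ignores the audio, the cross-fades and the lead-in/lead-out padding. The function names, the uniform random chunk length and the `seed` parameter are assumptions made for illustration.

```python
import random
from collections import deque


def fragment(duration, rng, lo=10.0, hi=15.0):
    """Split a story of `duration` seconds into consecutive (start, end) chunks."""
    if duration <= 0:
        raise ValueError("story duration must be positive")
    chunks, start = [], 0.0
    while start < duration:
        end = min(start + rng.uniform(lo, hi), duration)
        chunks.append((start, end))
        start = end
    return chunks


def build_schedule(stories, voices=3, fill_seconds=15 * 60, seed=0):
    """Return (story_id, start, end) fragments in playback order.

    `stories` maps a story id to its duration in seconds.
    """
    rng = random.Random(seed)
    waiting = deque(stories.items())
    active = deque()
    # Start with up to `voices` stories in rotation.
    while waiting and len(active) < voices:
        sid, dur = waiting.popleft()
        active.append((sid, deque(fragment(dur, rng))))
    schedule, total = [], 0.0
    while active and total < fill_seconds:
        sid, chunks = active.popleft()
        start, end = chunks.popleft()
        schedule.append((sid, start, end))
        total += end - start
        if chunks:
            active.append((sid, chunks))   # story rejoins the rotation
        elif waiting:
            nsid, ndur = waiting.popleft()  # exhausted: queue a new story
            active.append((nsid, deque(fragment(ndur, rng))))
    return schedule
```

Calling `build_schedule({"a": 40.0, "b": 25.0, "c": 33.0, "d": 20.0})` starts with one fragment each from `a`, `b` and `c`, and brings in `d` once one of the first three stories runs out. Note that the last fragment of a story may be shorter than 10 seconds in this sketch.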

As this project develops and new installations are created, the playback process will select stories from all sites established. This will be done with an even distribution of sites in the resulting playback.
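The text does not say how an even distribution of sites will be achieved. One simple possibility, sketched here in Python under that caveat, is to take one story from each site in turn. The function name and the site names other than Beijing are hypothetical.

```python
from itertools import zip_longest


def interleave_sites(stories_by_site):
    """Take one story from each site in turn so that no site dominates.

    `stories_by_site` maps a site name to its list of stories; sites that
    run out of stories are simply skipped in later rounds.
    """
    missing = object()  # sentinel marking "this site has no story left"
    ordered = []
    for round_ in zip_longest(*stories_by_site.values(), fillvalue=missing):
        ordered.extend(story for story in round_ if story is not missing)
    return ordered
```

With `{"Beijing": ["b1", "b2", "b3"], "site_b": ["m1"], "site_c": ["s1", "s2"]}` the result is `["b1", "m1", "s1", "b2", "s2", "b3"]`. A site with many more stories than the others still dominates the tail of the list, so a real implementation might instead draw a fixed number of stories per site.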

Each installation will have its own computer and the system on each will be almost identical. The Beijing machine is slightly different in that it streams its audio output (the edited stories and not the voice prompts) to the Internet. These streams can be accessed at www.reverberant.com/cw. Streaming runs for 24 hours a day at present, but in future there will be some downtime as file transfers take place between installations. Streaming is achieved with the “shoutcast~” Pd external by Olaf Matthes, which is configured to send an MP3 stream to a local (on the Beijing machine) “Icecast” server. The signal from this is then relayed to a commercial “Shoutcast” service to enable broader public access (more simultaneous connections).

The other difference with the Beijing machine is that it will act to initiate synchronisation of content between installations. As audio material, and database material associated with it, is shared between the machines, the Beijing machine will initiate a process of first collecting data from each machine, then redistributing to other machines that need it. This will be performed at night, once daily. Synchronisation will be achieved using a variety of “shell” scripts but principally the efficient UNIX data transfer tool “rsync” over a secure network connection (ssh).
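The collect-then-redistribute pattern can be illustrated by generating the rsync invocations the Beijing machine might run. This Python sketch only builds the command lines and does not execute them. The host names and the `/srv/stories/` directory are hypothetical, the choice of the `-a` (archive) and `-z` (compress) options is an assumption, and synchronising the database records is not covered.

```python
def sync_commands(remote_hosts, audio_dir="/srv/stories/"):
    """Return rsync argument lists: pull from every host first, then push back.

    The trailing slash on `audio_dir` makes rsync copy the directory's
    contents rather than the directory itself.
    """
    base = ["rsync", "-az", "-e", "ssh"]
    # Phase 1: collect new recordings from each installation.
    pulls = [base + [f"{host}:{audio_dir}", audio_dir] for host in remote_hosts]
    # Phase 2: redistribute the merged collection to each installation.
    pushes = [base + [audio_dir, f"{host}:{audio_dir}"] for host in remote_hosts]
    return pulls + pushes
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` from a script started once nightly, for instance by cron. Because rsync only transfers files that differ, the second phase sends each site just the recordings it lacks.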

Last modified on Sunday, 27 September 2015 17:17
