Max Neuhaus

1991
The Networks - The Broadcast Works and Audium Model

I would like to begin by talking about something I feel is quite amazing -- this built-in sound analyzer and source that we all have. One of the most amazing things about it is that we are largely unaware of it.

I am always finding myself in discussions with people in the eye world about ear and eye. Somehow people need to agree or take the position that one is more important than the other. I think it is obvious that they are complementary. The eye does things which the ear can't do; the ear does things which the eye can't do. It is not a question of one being better than the other: they fit together.

I am often tempted to smile during these discussions because as he derides hearing and I defend it, he doesn't realize that the battle is raging only in his ear. Right now I'm talking to all of you, but few of you realize you're actually hearing. You don't hear what I'm saying as sound; you are able to understand this small group of phoneme sounds directly as the English language. Your aural mind takes care of all the intricate steps in between, without bothering you.

I am also fascinated by the truly remarkable level of aural discrimination which we demonstrate through our use of language. If we look at our language sounds in the context of the total spectrum of sound possibilities that we are able to perceive, we can see that these sounds we communicate ideas and thoughts with occupy only a very, very minute part of that spectrum, and that the differences between them are very small -- so small that a non-native speaker has trouble distinguishing between many of them. Yet in our own language we go much further than simply distinguishing between its phoneme sounds. We can tell which part of the country someone was born in from small differences in the way these few sounds are pronounced. These differences are almost immeasurable, yet we are able to distinguish them quite easily, almost automatically.

Another thing most of us are not aware of when we speak is that we superimpose another language on top of our verbal one. It is a language we begin to develop at a very early age -- one that some say we are even born with. It is pan-cultural, and some speculate that it is even pan-species. It is also the information between the lines and the missing element when we transcribe the spoken into the written word.

It is not a discrete language made up of separate words like our verbal one, but a continuum of inflection and intonation as we speak those words. It is a very rich source of information about the person we are listening to and also a very accurate one: it is very hard for the speaker to manipulate convincingly. Often we use it as the final arbiter of the meaning of the words themselves.

This language has not received much attention from scientists and engineers. In fact, for many years telephone engineers denied its existence both theoretically and literally by limiting telephone bandwidth to the point where it was largely eliminated and only the words could be understood. Modern proposals in which the voice sounds in a telephone conversation would not actually be transmitted, but only enough information to resynthesize the words at the other end, also deny its existence.
 
It seems strange for science to ignore it, especially in the digital age, when they are trying to make computers feel more comfortable by teaching them to talk: it is the element missing from computer speech. But, among other things, intonation communicates the emotional state of the person speaking, and in the super-objective world of science, of course, emotion is taboo.

In the world of culture though, it is not.

I should also give some background about the ways I think about broadcasting and telephony. Radio and telephone may both seem like rather primitive technologies in this digital age at the end of the twentieth century, but in fact they are the most widely used forms of live communication technology we have, and will remain so for a long time to come.

The global telephone system at this time connects 500 million different places on the earth. It is an incredible machine. It is the biggest machine that we have ever made. This idea of a conversation between two people that can ignore geography... the quality of the line is good enough today that often when I call transatlantic I can convince the other person I'm in New York even though I'm sitting in Paris. The only time I'm caught is when a police car goes by and they hear the difference in the sirens... Max, you are not... where are you?

The telephone forms a two-way virtual space in the aural dimension: we function in it aurally as if we were in one real space, but this space doesn't exist. The radio, on the other hand, can give us a live ear view into a space which can be anywhere or nowhere -- it can also be completely artificial.

The fact that these are single-dimension virtual spaces has some interesting aspects. Unlike the multidimensional virtual realities we are dreaming of for the future, which many look forward to as better than real life, an aural virtual space can never be taken as a substitute for real life -- it will always be an extension. If one is dealing with sound, extending only in that single dimension reproportions focus. If we combine the public telephone network and radio broadcast, we can make a virtual aural space in which a large number of people can be at the same time.

This is what I did with Public Supply I. 

Looking back to 1966, it seems as though I began these broadcast pieces almost by accident. I was asked by a woman who was the music director at radio station WBAI in New York if she could interview me. At a certain moment while thinking about it I had this idea -- instead of talking, why not try to make a work for the radio itself. 

I was a performer at that time, but I was interested in trying to move beyond that and beyond being a composer, into the idea of being a catalyser of sound activity. 

I realized I could open a large door into the radio studio with the telephone -- if I installed telephone lines in the studio, anybody could aurally walk in from any telephone. At that time there were no live call-in shows. The idea of putting phone calls directly on the air rather than prerecording them was not greeted with open arms. The engineer insisted the station would lose its license and refused to have anything to do with it. His solution was to put a mike in the studio and pretend it was a strange kind of interview show.

I got the telephone company to install ten telephones in the studio by telling them they were for taking the responses to a fund-raising campaign. The engineer laughed and asked me how I was going to answer them all. I also had to find a way to get them on the air: he would only give me an hour of studio time just before the broadcast.

With a friend, I built this wonderful pre-answering-machine, ten-line answering machine. Each phone sat on a small platform and had a solenoid-controlled lever which fit under its receiver. A plastic cup with a microphone inside was fitted over the earpiece. The mikes and solenoids were connected to a box with switches controlling the solenoids and pots for the mike gains. The output went to an amp and a speaker. The studio engineer looked in a few minutes before air time expecting hopeless chaos. It was a bit strange, but not chaos: ten telephones on the floor with their handsets popping up and down and voices coming out of a speaker in front of his microphone. There wasn't much he could do; he flipped the switch and put us on the air.

The results were wonderfully unexpected. I had done a mailing which told people the time and the phone number, so there was no shortage of calls. In fact, because there were so many, entering into the work became a game of chance: your call had to coincide with another person hanging up.

I had told people they could phone in any sounds they wanted and asked them to leave their radio on while calling so that I would have some different feedbacks to work with. I saw myself as a sort of moderator.  I tried to form interesting combinations of callers on the air and counterbalance the extroverted with the introverted.

I think I was a little in shock after it was over. It wasn't an idea that I had thought out; it just came to me. I was realizing the scale of this thing. On the screen (illus. 2), the map at the bottom shows Manhattan Island; to the right we have Brooklyn and Queens, and above, the Bronx. I had made a virtual space which any one of the ten million people living there could enter by dialing a telephone number. It gave me a lot to think about.

I realize now that the reason I did it had to do with some of my ideas about music.

We don't know much about the history of the sound activities in societies of the past. We have some of the artifacts but none of the sounds: we only have recordings of the last sixty years. Our histories talk about other things: we have writings and drawings that go back thousands of years. 

Therefore we don't know very much about the music of the past either. What it really sounded like, who played it, and its role in society are all debatable questions when we step back only a short time in history.

Anthropologists looking at societies which have not yet had contact with modern man have often found whole communities making music together. Not one small group making music for the others to listen to, but music as a sound dialog between all the members of the community.

Although I was not able to articulate it in 1966, now, after having worked with this idea for a long time and talked about it and thought about it, it seems that what these works are really about is proposing to reinstate a kind of music which we have forgotten about and which is perhaps the origin of the impulse for music in man. Not making a musical product to be listened to, but forming a dialogue, a dialogue without language, a sound dialogue.

These pieces, then, are about taking ordinary people and somehow putting them in a situation where they can start this nonverbal dialog. They have the innate skills, as our ability with language demonstrates. The real problem, then, is finding a way to let them escape from their preconceptions of what music is. We now think of music as an aesthetic product. When you propose to a lay public that they make music together, they all try to imitate professional musicians making a musical product, badly. It only gets interesting when they lose their self-consciousness and become themselves.

The first thing I realized was that with a conventional hand mixer it was really hard to move ten things at the same time. I felt I had to find a way to use the skill that I had in my hands from being a musician to make it a more fluid situation. I built what I called a finger mixer. It was a flat plate with four photocells for each finger, arranged in the shape of my hand. Each caller had two photocells with which I could control his gain and stereo position. This meant that just by moving my hand very slightly and letting more or less light fall on different photocells, I could shape the gain and position of all ten callers simultaneously. I had very fine control, and it allowed me to move the mixing and grouping into something fast-moving and dynamic. I first used it in Toronto in 1968. Again, a map of the city covered by radio station CJRT (illus. 2).
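
Stated as a minimal sketch in code, the mapping is simple: each caller's pair of light readings becomes a gain and a stereo position, and all ten callers are summed into a stereo output. The Python below is only an illustration, assuming normalized readings between 0 and 1; the function name and the equal-power pan law are assumptions, not a description of the original analog circuit.

    def mix_callers(caller_signals, light_readings):
        """caller_signals: ten mono samples, one per caller.
        light_readings: ten (gain_cell, pan_cell) pairs, each in 0..1,
        taken from the two photocells assigned to that caller.
        Returns one (left, right) stereo sample."""
        left = right = 0.0
        for sample, (gain_cell, pan_cell) in zip(caller_signals, light_readings):
            gain = gain_cell                               # more light, louder caller
            pan = pan_cell                                 # 0 = hard left, 1 = hard right
            left += sample * gain * (1.0 - pan) ** 0.5     # equal-power panning
            right += sample * gain * pan ** 0.5
        return left, right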

By 1973, in Chicago at WFMT, there was no guerrilla warfare anymore; after seven years they were beginning to get the idea. Here I started exploring the concept of giving people a special instrument to play with their voice over the telephone. In this work I built a synthesis circuit for each caller. Rather simple -- oscillators where the pitch was determined by the energy of each call. The signals were integrated over a long period of time so that the result was a bank of slowly shifting pitches, forming a cluster which was constantly reforming according to what people were doing. The sounds that they were making rode along on top of this.
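
As a rough sketch of that behaviour, assuming block processing of telephone-bandwidth audio in Python: each caller's energy is integrated with a long time constant and mapped to the pitch of an oscillator, with the caller's own sound riding on top. The smoothing coefficient and the energy-to-pitch mapping are guesses for illustration, not the values of the original circuit.

    import math

    SAMPLE_RATE = 8000        # telephone-bandwidth audio
    SMOOTHING = 0.999         # long integration time constant, per block

    class CallerOscillator:
        def __init__(self):
            self.energy = 0.0  # slowly shifting energy estimate for this caller
            self.phase = 0.0

        def process(self, block):
            # Integrate the caller's energy over a long period.
            rms = math.sqrt(sum(x * x for x in block) / len(block))
            self.energy = SMOOTHING * self.energy + (1.0 - SMOOTHING) * rms
            # Map the integrated energy to a slowly drifting pitch.
            freq = 100.0 + 900.0 * min(self.energy, 1.0)
            out = []
            for x in block:
                self.phase += 2.0 * math.pi * freq / SAMPLE_RATE
                out.append(math.sin(self.phase) + x)   # the caller's sound rides on top
            return out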

In the same year I proposed to National Public Radio that we try to do not just one station but their whole network of two hundred stations spread across the country, with five cities where people could call in: New York, Dallas, Atlanta, Minneapolis and Los Angeles (illus. 3).

Having made this vocally played instrument for Chicago led me to think about having the callers also do the mixing and grouping for themselves. Obviously I could not be in these five places mixing and grouping at once, so I decided to remove myself completely from that process and implement it as an autonomous electronic system.

In 'Radio Net', the mixing was done with what we would call sound grains today. Although heard as a conventional mix of input signals, the output was actually being switched very quickly between inputs. The level of an input in the mix depended on how long the output lingered on it -- the length of the grain. The technique allowed automatic mixing according to an analysis of each signal. The criterion I used here was that the highest-pitched signal at any given instant won the output for that particular fraction of a second.
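
The switching logic can be sketched in a few lines of Python, with a zero-crossing count standing in for 'highest pitched' and a fixed grain length chosen only for illustration; the original self-mixers were analog circuits, so this is an analogy, not a reconstruction.

    GRAIN = 80  # samples per grain: a few milliseconds at telephone rates

    def zero_crossings(grain):
        # Crude pitch proxy: more sign changes per grain means higher pitch.
        return sum(1 for a, b in zip(grain, grain[1:]) if (a < 0) != (b < 0))

    def self_mix(inputs):
        """inputs: equal-length lists of samples, one list per caller.
        For each grain, the currently highest-pitched input wins the output."""
        output = []
        for start in range(0, len(inputs[0]), GRAIN):
            grains = [signal[start:start + GRAIN] for signal in inputs]
            output.extend(max(grains, key=zero_crossings))
        return output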

A week before the broadcast, I shipped these self-mixers to the engineers at the stations in each of the call-in cities and hooked them up and debugged them over the phone.

In those days radio programs on NPR were distributed by what they called a Round Robin: telephone lines connecting all two hundred stations into a large loop stretching across the country. Any station in the system could broadcast a program on all the others by opening the loop and feeding the program into the loop.

I saw that it was possible to make the loop itself into a sound transformation circuit and tried a few things with it in several preliminary studies in 1974 (illus. 4 and 5). For the broadcast I decided to configure it into five loops, one for each call-in city, all entering and leaving the NPR studios in Washington. Instead of being open loops as usual during a broadcast, though, I wanted to close them and insert a frequency shifter in each so that the sounds would circulate. It created a sound transformation 'box' that was literally fifteen hundred miles wide by three thousand miles long, with five ins and five outs emerging in Washington.
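
Each of these loops behaves like a long delay with a frequency shifter and a gain of less than one in its feedback path, so every cross-country pass comes back shifted and a little quieter. The Python sketch below only illustrates that structure: the ring modulator is a crude stand-in for the single-sideband frequency shifters actually used, and the delay, gain and shift values are invented for the example.

    import math

    SAMPLE_RATE = 8000
    LOOP_DELAY = SAMPLE_RATE * 2     # pretend one pass around the loop takes two seconds
    LOOP_GAIN = 0.7                  # each pass returns a little quieter
    SHIFT_HZ = 37.0                  # shift applied on every pass

    def run_loop(input_signal, total_samples):
        delay_line = [0.0] * LOOP_DELAY
        output = []
        for n in range(total_samples):
            incoming = input_signal[n] if n < len(input_signal) else 0.0
            recirculated = delay_line[n % LOOP_DELAY]
            sample = incoming + recirculated
            output.append(sample)
            # Shift (crudely, by ring modulation) and attenuate the sample
            # before it re-enters the loop for its next pass.
            shifted = sample * math.cos(2.0 * math.pi * SHIFT_HZ * n / SAMPLE_RATE)
            delay_line[n % LOOP_DELAY] = shifted * LOOP_GAIN
        return output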

We had a 'dress rehearsal' the day before the broadcast so I could get a feel for things. It is touchy when you put a wire that long in a loop, even if you do have a frequency shifter and gain control; each loop was in a sense a living thing -- they could get out of hand very quickly. During the broadcast I was on a conference call with five engineers and could listen to each loop and ask for changes in shift and gain at any time. My role was holding the balance of this big five-looped animal with as little motion as possible.

During the broadcast, the sounds phoned into each city passed through its self-mixer and started looping. With each cross-country pass, each sound made another layer, overlapping itself at different pitches until it gradually died away. It was quite a beautiful Sunday afternoon -- two hours over which ten thousand people made sounds.

This was 1977, and shortly after finishing it I began to develop an international project which I called 'Audium'. I was interested in including people with different native tongues in this nonverbal dialog. I also wanted to go further in removing myself from the actual process of the broadcasts -- this idea of implementing these spaces completely in an autonomous system. There were also some other new ideas which I will come to.

I think of an electronic system as a special kind of statement of idea. Writing something in words on a piece of paper or making a drawing are static statements of idea. If you program an idea into a computer system, though, you not only have the written statement of the idea but the system also realizes the idea -- a dynamic statement of idea. I wanted to implement 'Audium' in a system which would not only state the idea but execute it as well.

All the previous systems had been built with analog circuitry because that was the only technology available. Here, I wanted the freedom of moving into the digital world. Unfortunately, in 1980 the digital sound world was not there. I did find a very strange company in Massachusetts which made a digital signal processing box that weighed a couple of hundred pounds. They were very curious who I was, because their only other customer was the US Navy. Theoretically one could have done something with it, but it would have been starting from scratch -- a decade of writing assembly code routines. So throughout the eighties I concentrated on other things.

In the beginning of the nineties I noticed that the means to realize many of my digital dreams were sitting in boxes in the music store as sound processing and synthesis devices. There were also some computer languages around to control them in ways beyond what their manufacturers intended and envisaged. In 1991 I began collecting research material for a work called 'Audium Model'.

The most difficult thing about realizing a large new idea is explaining what it is to those who will provide the support to realize it. You can talk about it and write about it, but if it is a genuinely new idea there are, by definition, no references. You are asking them to imagine what you are imagining by hinting at it in a foreign tongue.

In addition to being a work in its own right, 'Audium Model' is also the first step in the aesthetic research for 'Audium' and a realization of its fundamental concepts.

It will be a special double phone booth for two people: two rooms, each with one transparent wall with a door in it. (illus. 6) Inside each room is a telephone handset mounted on the wall. To model the conditions of a phone call, the booths are arranged so that the occupants can't see each other. 

The handsets connect them with and through a third party  -- the computer system which comprises the work. The aural result of the sound activities between these three parties will emanate from speakers outside the booths.

So we have the elements of Audium -- the telephone handsets represent any telephone; the electronic system is the moderator; and the speakers outside the booths, the broadcast.

The electronic system has two roles. One, it engages in a dialog with each of the occupants of the booths and two, it acts as the instrument which they play on with their voices. 

This general form of the work has been fixed. I am now in the process of research which will define the rest of it. The block diagram (illus. 8) shows the current state of my ideas about the flows of information and sound. 

You can see that there is an arrow going back from the work into the earpiece of each person's telephone.  This is a new idea for the broadcast works -- what I am calling an active score -- a dialog between each person and the computer 'moderator'. 

When we speak, we have to constantly listen to the sound we are making and adjust our sound-producing muscles so that it matches the phoneme we are trying to pronounce. If we could not hear, we could no longer speak accurately: we need this constant feedback even though we have been doing it all our lives. I want to add another layer to this feedback.

In spite of science's general aversion to studying the language of inflection, there have been a number of researchers who have been interested in the question over the last fifty years. Most have been motivated by a quest to quantify emotion, many with the goal of lie detection or business advantage. As a result of all this, the basic acoustic parameters of intonation have emerged. Quantifying their meaning is another question, but of course that is not what I am interested in doing here.

The dialog between the work and the persons in the booths will be in the language of inflection. The work will recognize vocal phrases by inflection and continually generate sound responses for each person's earpiece -- a special sound feedback which is built for each person as they vocalize. I hope it will be a means of breaking away from the stereotyped ideas of what music is, and can guide them out of their self-consciousness and past their preconceptions.

The acoustic parameters of inflection are of course patterns of fundamental frequency -- frequency range and frequency mean -- and also of amplitude: range and mean. There are additional ones of formant and spectrum. So far I have built and am working with a system which can extract some of these parameters in real time from two people simultaneously. I have also implemented a neural network algorithm which allows one-pass categorization and mapping of analog vectors, also in real time. (1) It can be used to generalize: to make decisions through inference and extrapolation, and it learns immediately. It is not like a back-propagation neural net which has to be taught for a few hours -- this one takes only ten milliseconds to find or learn a category.
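
As an illustration of the first of these components, a minimal Python sketch can pull the four basic inflection parameters out of one vocal phrase: mean and range of fundamental frequency, and mean and range of amplitude. The zero-crossing pitch estimate, the frame size and the silence threshold are simplifying assumptions, not the method of the actual system, and the categorization stage (the Fuzzy ARTMAP network cited below) is not reproduced here.

    import math

    SAMPLE_RATE = 8000
    FRAME = 400   # 50 ms analysis frames

    def inflection_vector(phrase):
        """phrase: list of samples for one spoken phrase.
        Returns (pitch_mean, pitch_range, amp_mean, amp_range)."""
        pitches, amps = [], []
        for start in range(0, len(phrase) - FRAME, FRAME):
            frame = phrase[start:start + FRAME]
            amp = math.sqrt(sum(x * x for x in frame) / FRAME)
            if amp < 0.01:                      # skip silent frames
                continue
            crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
            pitches.append(crossings * SAMPLE_RATE / (2.0 * FRAME))   # crude f0 estimate
            amps.append(amp)
        if not pitches:
            return (0.0, 0.0, 0.0, 0.0)
        return (sum(pitches) / len(pitches), max(pitches) - min(pitches),
                sum(amps) / len(amps), max(amps) - min(amps))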

These are the components I will use to build the work's sense of each person's vocal activity and its sound response for the active scores.

The other part of the work -- again an instrument that can be played by the voice -- will generate the work's output sound. It will also use this sense of the person's vocal activity to adjust itself while being played. Currently I am experimenting with some imaginary string spaces: digital implementations of six separate strings whose characteristics can be modulated without glitches, in real time. Because I have all this information about frequency and amplitude coming in, I can apply not only a voice sound to the string, but I can also get the string to listen to what it is being touched with. I like the idea of being able to pluck or stroke a listening string with your tongue from a distance of 10,000 miles.
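
One way to sketch such a listening string digitally is a Karplus-Strong style delay-line string excited by the voice, with its damping following the voice's running amplitude. Karplus-Strong is used here only as a familiar stand-in; the string models actually built for the work are not specified above.

    SAMPLE_RATE = 8000

    def listening_string(voice, frequency=220.0):
        """voice: list of samples driving the string. Returns the string's output."""
        period = int(SAMPLE_RATE / frequency)   # string length in samples
        string = [0.0] * period
        amp = 0.0
        out = []
        for n, v in enumerate(voice):
            amp = 0.999 * amp + 0.001 * abs(v)              # the string listens to the voice
            damping = 0.95 + 0.049 * min(amp * 10.0, 1.0)   # louder voice, longer ring
            i, j = n % period, (n + 1) % period
            # Excite the string with the voice and average adjacent points,
            # the classic low-pass in the Karplus-Strong feedback path.
            string[i] = damping * 0.5 * (string[i] + string[j]) + v
            out.append(string[i])
        return out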

I hope to realize the first 'Audium Model' in the fall of 1995. The next step, of course, is to implement several 'Audium Models' in different language groups and interconnect them. This is not difficult once the first 'Audium Model' is made, and completes the work. This network also provides a clear model of 'Audium' itself.

Another new idea for these broadcast works, which I hope will be implemented with 'Audium', is that of a radio installation. All of the works so far have been radio events because that is the nature of radio in most people's minds: it has events -- radio shows. But one could also make a radio installation.

Although a radio event certainly gets attention and encourages people to enter into it, at the same time it makes it difficult to do so, as it generates congestion. In 'Radio Net', 10,000 people won and got their calls through. This probably means that 100,000 tried and weren't successful. There is no way to install enough lines to respond to a call-in request of this kind over the radio; the more lines you add, the more people are encouraged to call in. The radio event also discourages the development of a group dialogue; everyone knows they have only a certain amount of time and wants to get their say in.

But if it's always there, you can call in at any time, and you can stay in as long as you want; it allows a natural long-term evolution of this new kind of sound dialogue. It becomes an entity -- a virtual place.

Do I sense shivers of panic running up the spines of radio bureaucracy? 

Of course it is very expensive to run a radio station, and to dedicate it to one idea is unheard of. 

Or is it? 

In fact many radio stations are dedicated to one idea -- rock, news, sports, etc. 'Audium' is another idea of programming, and hopefully its live and unpredictable nature, its continuous evolution, and its pan-national character will combine to make it quite a bit more interesting than many others.

I hear them whispering 'but the band is so crowded, there aren't enough frequencies to allow another station for such a strange idea'. 

Right now the AM band and many of its transmitters are being abandoned -- deserted for the world of FM. Audium could live quite happily in all that territory, emanating from those poor unwanted transmitters. 

It would be much less expensive than other forms of programming. The major cost of a radio station is not the broadcasting equipment, nor the electricity to run it. It is the programming -- the making of radio shows. 

Audium is simply an electronic system with one side connected to the phone network and the other to the transmitter. 'Audium' programs itself, or more accurately, it is programmed by the people who will use it.

(1) Fuzzy ARTMAP: Gail Carpenter and Stephen Grossberg, Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems.