This page has 30 subjects devoted to cinema sound and is 1.65MB in size. Side navigation is provided for the 30 subjects and is repeated down the page. This enables faster access between subjects than re-loading separate pages. The history of cinema sound spans over 80 years and is a subject so vast that it is not possible for one person to be an expert in every facet. Some subjects are abridged to focus on essential points, and we welcome views and corrections. Many myths and memes are challenged, and hopefully critical points will be accepted in good humour. The intent is to evoke thought on each subject. 'Memes' are ideas and practices that spread within a culture and are rarely questioned. Further engineering details on technical subjects and standards will later be paralleled on www.sound.westhost.com
Cinema (5.1)
What you wanted to know about 5.1 but were too afraid to ask

5.1 refers to the protocol of speaker placement for the separate sound tracks.
(left - centre - right) (left-rear - right-rear) and (.1 for sub-bass)
5.1 does not refer to sound quality or realism. The majority of home cinema sound systems consist of 5 small loud-speakers, which is all that is necessary to hear the periodic novelty of sound from the 5 positions. The (.1) sub-bass box is a low cost solution for bass not obtainable from the 5 small speakers.
The history of surround sound formats for cinemas, including sound alignment procedures, is a mess to say the least. Mono has always been the standard and everything else became add-ons. There are large inconsistencies between films. Many B grade films are recorded in mono, with a post-production pseudo-stereo effect for left and right, and with rears and sub-woofer periodically turned on and off. Then there are the disjointed sub-bass effects, often consisting of sequenced bursts of filtered pink noise to generate a mindless emotive reaction, similar to the TV sitcom use of clap-trap.
A Full Fidelity Stereo gives greater realism than 5 small low fidelity speakers.
Imagine if you will, given the choice between the largest screen with the smallest low fidelity 5.1 sound system, or the smallest screen with the most magnificent stereo system money could buy.
For well produced films with excellent sound tracks, a large stereo full-fidelity music system will sound more realistic and enjoyable than five small low fidelity home-cinema speakers. However 5 large full-fidelity speakers can transform the home-cinema experience well beyond what is provided by the present sound systems in large cinemas.
Home Cinema Marketing. Sigmund Freud and Carl Jung would have loved home cinema; but if alive today they would have a great deal to say about the marketing. In the era before Television, home entertainment marketing often showed a family sitting around a radiogram, in a softly furnished fire-lit room. By contrast, modern marketing shows isolated people in a bright unfurnished reverberant room, reflecting a superficial lifestyle with little interest in music, except with iPods.
The modern home architecture of white un-furnished reverberant rooms has more in common with mausoleums that symbolise perpetuity of the dead, rather than the enjoyment of living in the present.
Excessively bright, reverberant rooms are the worst environments in which to experience home cinema or music. The majority of sound heard in modern homes, whether from people talking, general background clutter or sound systems, is annoying continuous reverberation and echo reflected from walls, floor and ceiling. Articulation for dialogue is diffused, making it difficult to understand the words being spoken. In these hard, minimally furnished rooms the surround speakers of home cinema systems make the reverberation problem worse.
Films with beautiful sound tracks and intellectual depth are dependent on sound quality and dialogue, and have no value when experienced in excessively bright, reverberant environments. It's easy to understand that the majority of home cinema marketing is aimed at the lowest intellectual level. Sports, football and motor racing understandably place no demand on sound fidelity or dialogue articulation. There is no point in discussing cinema sound for applications that have no demand for fidelity or that are experienced in excessively reverberant environments.
For Home Cinema to have its place, it will have to integrate (not dominate) within homes that are comfortable and softly furnished, where family, laughter and music are our primary experience. Approx 1 in 30 films are magnificently produced with full fidelity and continuous integrated surround sound. Over many decades this small percentage of excellently produced films has evolved into a large number. This percentage of intelligently crafted films generates a passionate following among those of us who enjoy the cinema experience in an acoustically absorbent environment with an excellent sound system and high quality vision.
Cinema of the Golden era
Going to the movies is a magical experience that the whole of humanity shares, and it belongs to us all. The history of movies began in the 1890s, and another 20 years would pass before the technology for sound was invented. Sound first arrived in cinemas in the late 1920s. The period from then until the late 1940s was described as cinema's Golden era.

Sound from the early optical sound track on the edge of the film stock struggled to reach 6kHz. Increasing the physical size of the optical track gave a larger signal but with fewer hi-frequencies. A smaller optical track gave a higher frequency response but with less signal level and greater noise. Noise hiss was also generated by the film reader and early valve pre-amplifiers. Putting a hi-frequency filter at the output of the pre-amplifier reduced the noise hiss, but further reduced hi-frequencies. The simplest solution, developed in the 1930s, was to boost the hi-frequencies when recording and allow the playback noise reduction filter to bring the response as close to flat as possible, providing some consistency between the recording on film and playback in the cinema.

Because almost every cinema rushed to install sound as soon as it arrived, it was difficult to take advantage of technological improvements in sound systems designs that followed shortly after. Only a limited number of mixing studios and cinemas could afford to keep continuously upgrading to the latest technology. The majority of cinemas had poor sound systems in comparison to later developments. High and low frequency response was almost non-existent.
Academy characteristic. In 1938 a study of the 'worst case' cinemas in the US showed an average energy (frequency) response of (-7 dB at 40 Hz) (flat 100 Hz - 1.6 kHz) (-10 dB at 5 kHz) (-18 dB at 8 kHz). This is said to be the Academy Curve.
Mixing studios could then apply equalization to simulate the poor response of the 'worst case' cinemas, so that the soundtrack could be crafted and evaluated as it would be heard by the majority of the audience. This practice became the foundation (trend) for how soundtracks were and are prepared. Many variations of the Academy curve followed and were later re-named the X-curve.
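As a rough illustration only, the figures quoted for the Academy Curve can be turned into a lookup that estimates the attenuation at any frequency. This is a minimal sketch: the straight-line interpolation on a logarithmic frequency axis, the clamping outside 40 Hz - 8 kHz, and the names `ACADEMY_POINTS` and `academy_response_db` are assumptions made for the example, not part of any published standard.

```python
import math

# (frequency in Hz, level in dB) taken from the figures quoted in the text.
ACADEMY_POINTS = [(40, -7.0), (100, 0.0), (1600, 0.0), (5000, -10.0), (8000, -18.0)]


def academy_response_db(freq_hz):
    """Estimate the quoted 'worst case' cinema response at freq_hz.

    Assumption: straight lines between the quoted points on a log-frequency
    axis, and the end values held constant outside 40 Hz - 8 kHz.
    """
    if freq_hz <= ACADEMY_POINTS[0][0]:
        return ACADEMY_POINTS[0][1]
    if freq_hz >= ACADEMY_POINTS[-1][0]:
        return ACADEMY_POINTS[-1][1]
    for (f0, d0), (f1, d1) in zip(ACADEMY_POINTS, ACADEMY_POINTS[1:]):
        if f0 <= freq_hz <= f1:
            # Fraction of the way from f0 to f1, measured in octaves/decades.
            t = (math.log10(freq_hz) - math.log10(f0)) / (math.log10(f1) - math.log10(f0))
            return d0 + t * (d1 - d0)


for f in (40, 1000, 5000, 8000):
    print(f, "Hz:", academy_response_db(f), "dB")
```

A mixing studio simulating a 'worst case' cinema would apply roughly this attenuation to its monitoring chain; the sketch only reports the numbers and does not filter any audio.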
The practice of modifying sound performance for the mass end of the cinema exhibition market has always been hotly debated, because it is based on assumptions and compromise. This practice also institutionalised lower performing sound systems as being the benchmark for the characteristic 'cinema sound' and later represented a blockade restricting hi-fidelity sound from being the adopted goal when technology allowed. This can be argued as being an economic rationalist solution, not a technical or scientific solution.
The unforeseen and unfortunate outcome of this compromised approach was that it became difficult to implement improvements beyond what was initially incorporated, and the situation basically remained for 35mm film until the 1970s. Even with these limitations, cinema sound from the optical format was better than home record players and radios until the early 1960s, when superior hi-fi stereos became available.
European film industry. It has been argued that pre WW2 European film technology was superior to that of the US. German cinema technology was said to have favoured a flat response which came close to achieving 8kHz. The expectation may have been that cinemas would aspire to full fidelity sound by installing the most advanced technology available, with no compromise for poor quality sound systems or acoustics.
The demise of the European film industry during WW2 is said to have resulted in the US dominating the world with its approach of a compromised alignment for the majority of cinemas which had poor quality sound systems and acoustics.
Early speaker systems
Early speaker systems achieved the best performance with the technology available, and most research on horn technology at that time has been un-surpassed. The aim was to achieve the highest efficiency and voice articulation which horns are ideally and uniquely suited to. Also valve amps of that time were approx 30 to 60 Watts. This resulted in the efficiency and size of the horn speaker systems being taken to the technological limit. The narrow bandwidth from the mono optical sound track (100Hz - 4kHz) set the precedent for the speaker systems only needing to be 2 way passive without tweeters. The 15in speakers and horn were and still are crossed over at approx 700Hz.

Most cinema speaker systems have remained principally the same as when first applied: two 15in speakers for the low frequencies; a compression driver and horn for hi-frequencies; and crossed over at approx 700Hz. Altec Lansing, Western Electric and other well known companies aspired to make the highest quality sound systems with the resources available.
But many early cinemas could not afford the best quality systems. The hi-frequency compression driver and horn are approx 20dB more efficient than the 15in speakers. Sometimes it was difficult to correctly equalize both components to sound flat, because sound measurement equipment was rare and expensive. Cheap crossovers consisted of a simple capacitor and resistor only.

Many sound systems had to be aligned by ear. Because of the limited power of early valve amps, the horn driver was sometimes kept close to full efficiency with little attenuation. This was done to obtain as much sound level as possible, especially for large cinemas. Cinema screen and air attenuation helped offset the higher efficiency of the horn, bringing the system back to a flatter response.

Cinema sound is recognised by the unique acoustical character (of the projected voice range) created by the horn systems used in all commercial cinemas throughout the world. Regardless of their limitations which will be discussed further on this page, large cinema horns are magnitudes greater in efficiency than domestic speaker systems.
Cinema horns have an acoustical directivity that projects the sound image forward into the audience, giving a feeling of dimension to the sound. Horns also increase the articulation of dialogue when correctly applied. The characteristic sound of the horn gives cinema its unique sound quality. It is not possible for small domestic cone speakers in home-cinema systems to replicate the experience of large horn speakers in commercial cinemas.
mass - average - performance - acceptance
The marketing philosophies of capitalist economic principles favour technology being reduced to achieve mass average acceptance, humorously described as McDonaldizing. Audio replication (fidelity) has been uniquely compromised, held back, limited and modified from delivering the highest performance possible throughout its history.
Un-resolvable mess. The principles behind the Academy and later X-curve were not intended to utilise the best performance technology is capable of for high quality sound reproduction in acoustically controlled venues. Over many generations, starting from the original Academy EQ and continuing through later recording and playback EQs for different optical and magnetic formats, noise filters, noise reduction systems, speaker alignment and cinema acoustics all became entangled into an un-resolvable mess.
These problems are also seen with pop music, which is extensively EQ modified, dynamically compressed and level boosted. This is done to achieve the maximum effect possible on the limited bandwidth of radio, TV and MP3, mostly heard through small low fidelity speakers (ghetto blasters etc) in excessively reverberant environments.
It is often stated that a percentage of the population has no interest in how 'sound sounds'. Therefore it can be argued "Why waste effort in modifying sound recording and reproduction for those who are not interested ?". Many audio enthusiasts state that sound recording and reproduction should only be focused on achieving the highest fidelity possible for an appreciative audience, regardless of what percentage this is.
The next chapter describes a brief period of cinema history in which the highest technical achievements were created without compromise to performance or cost, but which was to fail when confronted by television, economic rationalism and the superficial consumerism that followed.
70mm Widescreen
During the 1950s and 60s special 70mm films were produced and shown in cinemas equipped with super wide curved screens often exceeding 50 feet. A single cinema was as large as today's multiplex, requiring to show the same film for many months or years to break even or make a profit. Many of the 70mm films had 6 independent channels of full fidelity magnetic sound tracks on the film stock. The sound fidelity was superior to most of the digital formats of today.
The channel separation was so great that it was possible to independently record and play back different orchestral music on each track, with minimal loss or crosstalk. This separation enabled remarkable sound scaping that could emulate the experience of a real symphony orchestra on the stage.
Prior to digital recording the standard test for high quality analogue magnetic multi-track recorders was to repeatedly record and play back and record over and over until noticeable degradation appeared, which was approx 6 times. Only un-compressed digital recordings can achieve being repeatedly re-recorded with zero loss.
The cost of 70mm was more than 10 times greater compared to 35mm film. The process of applying the magnetic tracks to the 70mm film was extremely difficult, which was done at the end of the film stock's manufacturing process. The play back head on the projector had to be in perfect alignment, regularly inspected, meticulously cleaned and periodically de-magnetized and replaced when worn. This procedure was done with great pride and attention.
The only limitation was partial de-magnetization of the sound tracks if the film stock was accidentally placed against a transformer. But in all other respects the magnetic format was more robust and superior in performance to the optical format, including today's digital.

Many of the large early cinemas equipped for showing 70mm films in the 1950s were fitted with five speaker systems that were magnitudes greater in size than the majority of sound systems in cinemas today: (left) - (left-center) - (center) - (right-center) - (right). The five screen speaker systems were able to give precise sound positioning and even spatial movement across the screen. The single surround channel could be automated to left side - right side - rear - and overhead to simulate an aircraft flying around the room. However, switching the surround channel was more commonly done with Cinerama.

Stanley Kubrick's '2001 A Space Odyssey' had its world premiere on 2 April 1968 and became the most successful film in this format. But many people were unable to experience it as he intended, as most of the large wide screen cinemas with 70mm projectors were closing to make way for the 'economically rational' new multiplexes, equipped with improved 35mm projectors and improved anamorphic lenses that could also achieve wide screen formats. Sound fidelity was no longer considered important, as the economic rationalist belief was that the majority of the newly emerging consumer driven generation had become obsessed with image and brand identification.
There are only a limited number of 70mm films being produced today (IMAX is 70mm) and sound reproduction is now rationalised to the modern digital format and uses the external DTS system. Today's compressed digital formats printed on 35mm film stock are far more restricted in fidelity with limited channel separation in comparison to the 6 analogue magnetic tracks on the earlier 70mm film.
During the early 60s I was apprenticed to an ex senior engineer of Western Electric UK who had been involved in pioneering technology for magnetic multiple track recording and large scale sound systems for cinema and auditoriums. He and his fellow engineers had been with the cinema industry from the beginning of sound. In the 1950s they believed that the full fidelity magnetic format was the way of the future, and the optical low fidelity format would go the way of the Dodo, especially since Television had arrived. I think back in regret at how I could have paid more attention to all they tried to teach me. But as a 17yr old in the early 60s my primary interests were creating sound systems and mixing for hi-energy blues bands, surfing and chasing free love, which I never caught.


Cinerama was without doubt the greatest achievement in cinematic history, created by people of genius, madness and unlimited passion. Russia also had its own equivalent, Kino-panorama. This is what going to the movies is about: 'To be beamed up and blown away'. The opening roller coaster ride caused everyone to tightly grip their seats and hang on for their lives, screaming with excitement. Paper bags were also supplied for emergencies. I was 12 in 1957 when I first experienced Cinerama.
Cinerama consisted of three projectors covering a giant 146deg super wide curved screen that included peripheral vision, enabling a 3D experience to be achieved. Cinerama was spoken of as an experience separately from just being seen. The only limitation was the annoying joining lines, which were eventually minimised.



The sound was supplied from a separate 35mm 7 track magnetic tape machine that was sync locked to the 3 projectors. There were 5 giant speaker systems behind the screen as well as surround speakers, as shown in the pic below. After intermission there was a demonstration of the sound system effortlessly replicating a symphony orchestra as though it was actually present on stage. It was this effect of hearing an accurate replication of sound reality and detailed positioning of a symphony orchestra that resulted in electro-acoustic technology becoming the passion of my life's work.

Many early NASA astronauts and engineers, as well as John F Kennedy, described how they were inspired by the Cinerama experience. Both Cinerama and Kino-panorama influenced the generation of the 1950s, proving the impossible could be achieved, and gave extra motivation for the race to put a man into space (April 1961) and then on the moon (July 20, 1969).
Unfortunately these large magnificent cinemas and titanic film productions are out of place in a modern superficial digital world of consumerism, where corporate and political motivation is now driven by self interest and economic greed.
On behalf of all who are passionate about Cinerama technology we wish to express our gratitude to John Mitchell for his life work in recovering and restoring large quantities of the Cinerama stock and also David Culls for his entertaining historical accounts of the Cinerama era.
John Mitchell's backyard Cinerama pics
www.cineramaadventure.com
www.cineramaadventure.com Downloadable Videos describing the Cinerama experience.
www.cinerama.topcities.com
Copyright note. Some of these pics have been modified to represent the concepts. We thank and recommend the Wide-screen museum and other historical sites that allow these pictures to be available for education purposes.
www.widescreenmuseum.com
www.widescreen.org
www.in70mm.com
wikipedia.org/wiki/Widescreen
www.audioheritage.org
Cinema sound formats
35mm has been the standard film stock for the majority of movies throughout cinema history. There have always been variations of anamorphic lenses for 35mm film to achieve a range of wide screen aspect ratios. The most commonly known was CinemaScope. Over time the 35mm film stock and lens technology greatly improved. Economic rationalising favoured 35mm, and this caused the high cost, superior 70mm films with their 5 screen sound channels to be used less and less.
Dolby Stereo. Mono sound dominated the movie experience in the 35mm medium until 1976, when an international agreement allowed the Dolby two track optical format, which included Dolby A noise reduction, to become the new standard. A matrix technique enabled 4 channels of sound to be achieved from the 2 track optical format: (Left) - (Center) - (Right) - (Surround). The two tracks are described as Left total (Lt) and Right total (Rt).
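To give a feel for how four channels can share two tracks, here is a heavily simplified sketch of a 4:2:4 matrix. It assumes the commonly described arrangement in which the centre is added equally to both tracks at -3 dB and the surround is added at -3 dB with opposite polarity on each track. It leaves out the phase shifting, band limiting, noise reduction and steering logic of real cinema processors, and the function names are hypothetical.

```python
import math

G = math.sqrt(0.5)  # -3 dB gain applied to the shared channels


def encode_lt_rt(left, centre, right, surround):
    """Fold four sample values into two track values (simplified, no phase shift)."""
    lt = left + G * centre + G * surround
    rt = right + G * centre - G * surround
    return lt, rt


def decode_passive(lt, rt):
    """Recover four outputs by sum and difference (no steering logic)."""
    return {
        "left": lt,
        "right": rt,
        "centre": G * (lt + rt),    # in-phase content shows up in the centre
        "surround": G * (lt - rt),  # out-of-phase content shows up in the surround
    }


# A centre-only signal: equal on both tracks, nothing in the surround output.
print(decode_passive(*encode_lt_rt(0.0, 1.0, 0.0, 0.0)))

# A surround-only signal: opposite polarity on the tracks, so Lt + Rt cancels.
lt, rt = encode_lt_rt(0.0, 0.0, 0.0, 1.0)
print(lt + rt, decode_passive(lt, rt))
```

In this sketch a centre-only signal still leaks into the left and right outputs at -3 dB, which is the limited channel separation that a simple passive matrix cannot avoid.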

The matrix technique of obtaining four channels of sound from a dual format system had been used in the communications industry and with earlier quad vinyl recordings. However, the existing matrix techniques did not provide backwards compatibility for film stock to be read by existing mono-only projectors.
A clever and more complex matrix system was developed that enabled backwards compatibility. Over time electronic IC (integrated circuit) and component technology had improved, enabling higher fidelity with lower noise. Bass performance was also improved, referred to as OBE (Optical bass extension). Many cinemas started to add an independent bass extension speaker. By 1986 this final optical sound improvement became known as 'Dolby SR' (Spectral Recording) and remains in place to this day.
Digital sound formats. Between 1990 and 1993 four different and competing cinema digital sound formats were developed, of which two remain in popular use. The (.1) sub-bass LFE (low frequency effect) was added, which behaves as a separate channel limited to approx 250Hz. All systems automatically default to the analogue optical system as a security back up. The Dolby digital system dominated due to its simplicity of application and being more economical to manage.
Unfortunately many cinemas have limited fidelity 2-way passive speaker systems which are unseen and have remained basically as they were 50yrs ago. It can often take a trained ear to hear whether the sound is being taken from the old optical analogue or the new digital formats.

Dolby SR-D is the most common digital format. The digital information is stored between the sprocket holes. The small space between the sprocket holes can only contain a small amount of data, which is a major limitation. A common belief exists that this area of the film stock was chosen because it suffers the least wear and is the most reliable. However, there are many reviewers and projectionists who claim that the space between the sprocket holes suffers the most wear and is the least reliable place to put digital data.
DTS (Digital Theatre Systems) is said to be the preference of audiophiles. This view is also disputed. It uses a specially designed external CD player and requires 2 CDs per film. The CD player is sync locked to the SMPTE time code on the film stock. DTS is less preferred by cinema chains and film distributors because of the extra cost and effort, and the possibility of CDs being lost.
SDDS (Sony Dynamic Digital Sound) has the capacity for 8 channels. It is strongly argued that SDDS is the best performing format. The digital data is stored on the outer edges of the film stock. There are some reviews that claim the outer edges are vulnerable to wear and damage, and there are other reviews that state the opposite. SDDS can provide for 5 screen channels plus independent surrounds and sub bass. It appears SDDS is technically supported but not promoted.
Dolby 320Kb/s compression ratio 10:1 average 64Kb/s per channel for 5 channels.
DTS 1.04Mb/s compression ratio 4:1 average 240Kb/s per channel for 5 channels.
SDDS 2.46Mb/s compression ratio 5:1 average 307Kb/s per channel for 8 channels, but because SDDS has back up tracks the average may be similar to DTS.
The Dolby method is said to take advantage of momentary spare capacity in any one channel to increase the capacity of the other channels. Instead of the audio quality being inversely proportional to the number of channels, it is approximately inversely proportional to the √ of the number of channels (√5 = 2.24). DTS is said to include frequency domain sharing between sub woofer and surrounds, hence no bass under approx 160Hz in the surrounds. The SDDS system is said to have a fixed bit allocation per channel to maintain channel independence.
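The difference between the two scaling rules mentioned above can be seen with a few lines of arithmetic. This is only a sketch of the claim as stated: 'quality' is treated as an abstract relative figure with 1.0 meaning a single channel using the whole bit rate, and both function names are made up for the example.

```python
import math


def relative_quality_fixed(channels):
    """Each channel gets a fixed share of the bit rate: quality ~ 1 / N."""
    return 1.0 / channels


def relative_quality_shared(channels):
    """Channels borrow spare capacity from each other: quality ~ 1 / sqrt(N)."""
    return 1.0 / math.sqrt(channels)


for n in (2, 5, 8):
    print(n, "channels:",
          "fixed", round(relative_quality_fixed(n), 3),
          "shared", round(relative_quality_shared(n), 3))
```

For 5 channels the fixed share gives 1/5 = 0.2 of the single-channel figure, while the shared rule gives 1/2.24, a little under 0.45.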
Achieving 5 to 8 channels of sound from a limited bit rate was not a simple task. Research and development exceeded $20 million. Regardless of their limitations, the ability to approximate the performance of the analogue magnetic format is an astounding achievement to say the least.
www.sdds.com
www.dts.com
www.dolby.com
Fact or Fiction. The above statements on digital formats are a summary of different views from projectionists, web sites, periodicals and competing experts at a local wine bar. Vested interests behind digital technology tend to be secretive and are said to willingly waste unlimited resources on law suits. These competing formats were served up fait accompli, each one claiming to be the best.
The cinema going public had and have no say whatsoever, and nor do the majority of the world's computer engineers and scientists. What should have been done was an open collective approach to achieve a single digital loss-less format (compressed or un-compressed) that had the performance capacity and fidelity of the best analogue magnetic format.
What we got was something that fell far short of it. Whether the SDDS system is capable of this, who knows, as the majority of the public have not experienced sound in this format. I am sure we all would have no hesitation to pay a small % on the ticket price for an independent authority to represent the best outcome for cinematic experience.
Understanding Digital sound
An audio analogue signal on magnetic tape or vinyl records is infinite in detail, but suffers deterioration and increased noise from repeated use and mass transfer. Analogue sound quality is measured by frequency response. The narrower the frequency response, the lower the quality. Digital audio does not suffer from loss of frequency response, but with a low bit rate it suffers from quantising noise (random static) and smearing.
Digital reduces the infinite detail of an audio analogue signal to a representation of finite bits of 1's and 0's. Each 1 and 0 is absolute and is therefore simple to produce and mass transfer, and does not suffer deterioration or noise with repeated use. The maximum allowable number of bits (1's and 0's) per second (b/s) is the only limitation on digital technology providing full sound quality.
The 1's and 0's are grouped into 'words' of 16 bits (1000110001101010) for domestic CD, and 18 bit 'words' for Pro-audio (100011000110101001). A word can consist of any number of bits. The number of words per second is called the 'sampling rate'. Domestic CD sampling rate is 44.1K words per second. Pro audio sampling rate is 48K words per second.

Bits per word defines dynamic range.
Sampling rate defines frequency response and must be greater than twice the highest audio frequency.
dBFS (dB Full Scale): 0 dBFS is the highest level a word sample can represent.
Therefore lower audio levels will be negative dBFS numbers.
1111 1111 1111 1111 = 0 dBFS
0000 0000 0000 0001 = -96 dBFS.
16 bit word = 96 dB dynamic range.
20 bit word = 120 dB dynamic range.
24 bit word = 144 dB dynamic range.
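The dynamic range figures listed above follow from the number of levels a word can represent: each extra bit doubles the number of levels, which adds about 6 dB. The short sketch below reproduces the listed values and the 'twice the highest frequency' rule. The function names are chosen for this example only.

```python
import math


def dynamic_range_db(bits_per_word):
    """Ratio between the largest and smallest word values, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits_per_word)


def max_audio_frequency_hz(sampling_rate_hz):
    """Upper limit on audio frequency: the content must stay below half the sampling rate."""
    return sampling_rate_hz / 2


for bits in (16, 20, 24):
    print(bits, "bit word:", round(dynamic_range_db(bits)), "dB")

print("44.1K sampling rate: audio must stay below", max_audio_frequency_hz(44100), "Hz")
print("48K sampling rate: audio must stay below", max_audio_frequency_hz(48000), "Hz")
```

The exact figures are slightly above the rounded values in the list (16 bits gives about 96.3 dB), which is why 96, 120 and 144 dB are the numbers usually quoted.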
Note: Only a word group of exactly 8 bits is called a 'Byte', which refers to storage capacity. A CD can store 700 Mega Bytes (700MB). We must not confuse bits with Bytes. Bytes is upper case 'B'; bits is lower case 'b'.
The greater the total number of bits/second (b/s), the faster the sampling rate and/or the greater the number of bits in a word. But for a given number of bits per second (b/s) there can be a choice between a slow sampling rate with a large number of bits per word, or a fast sampling rate with fewer bits per word. The continuing explanations will simply refer to total bits/second (b/s) to represent audio quality.
Domestic CD is 1,411,200 bits per second (1.411Mb/s) for 2 channels. Therefore each channel is 705,600 bits per second (705.6Kb/s). For the majority of people this bit rate is high enough to make music fidelity indistinguishable from quality analogue formats (20Hz - 20kHz with 96dB dynamic range).
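The CD bit rate is simply sampling rate x bits per word x number of channels, which a few lines can confirm. The sketch also shows the trade-off described above for a fixed number of bits per second. The function name and the 29,400 / 24 bit combination are illustrative assumptions, not a real format.

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_word, channels):
    """Un-compressed bit rate in bits per second."""
    return sampling_rate_hz * bits_per_word * channels


cd_stereo = pcm_bit_rate(44100, 16, 2)
cd_per_channel = pcm_bit_rate(44100, 16, 1)
print("CD, 2 channels:", cd_stereo, "b/s")       # 1411200
print("CD, per channel:", cd_per_channel, "b/s")  # 705600

# Same bits per second, different split: slower sampling with longer words.
# (Hypothetical combination, used only to show the trade-off.)
print("29,400 words/s at 24 bits:", pcm_bit_rate(29400, 24, 1), "b/s")  # 705600
```

The last line carries the same 705,600 b/s as one CD channel but trades frequency response (a lower sampling rate) for dynamic range (longer words).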
Before CD players were available the only domestic digital recording medium was video tape recorders. The fractional number of 44.1K sampling rate (words per second) was the maximum allowable for high fidelity audio to be digitally recorded onto video tape formats, and the 44.1K sampling rate was retained when domestic CD arrived. Pro audio Digital Audio Tape (DAT) is 48K sampling rate.
Basic explanation
http://www.cs.columbia.edu/~hgs/audio/44.1.html
Audio demonstration of various sampling rates
http://www.cs.cf.ac.uk/Dave/Multimedia/node150.html
Quantization noise wikipedia.org/Quantization

Resolution. The primary reason to have the highest sampling rate possible is to obtain the best small signal resolution for fine harmonic detail and nuances within the music. The second reason for high sampling rates is to enable the signal to be digitally EQ modified and processed. It is therefore understandable that digital recording naturally favours increased level over fidelity. This will be discussed in further detail on www.sound.westhost.com
Digital recording. The trend is to increase the recorded sound to the maximum possible level. This can easily be achieved without incurring overload by over-modulation, something uniquely suited to digital recording in comparison to previous analogue recording. Modern productions are often made up of what appears to be an infinite number of competing 'sound grabs'. The dominant objective is to increase average level by adding more and more, including eliminating space within the music (easily achieved with modern software). Once taken to the highest level, the digital recording can then be compacted by excessive over-use of dynamic compression, enabling more to be added.
This childish behaviour in the mis-use of digital recording has rendered harmonic detail and nuances inaudible, thereby masking the fidelity in music. Possibly the worst perpetrators of this problem are the proliferating questionable audio recording schools and software merchants who promote these irresponsible practices to impressionable young people desperate to enter the recording and pop industries.
Digital compression
- Loss-less un-compressed format is for Pro-audio digital and domestic CD. Silence is recorded and played back at the full file size bit rate.
- Loss-less compressed format is similar to a ZIP file that reduces file size by not recording the silence. When played back it replaces the original silence at full file size.
- Lossy-compressed format is Smoke and Mirrors that evolved from psycho-acoustic research. Silence and selected detail within the music can be discarded without the average person being able to notice, obtaining a 30% to 90% reduction in file size. Discarded information cannot be retrieved.
MPEG (MP3). The Moving Picture Experts Group formats use a variety of techniques described as "perceptual noise shaping" or "perceptual sub-band transform coding". MP3 is used for internet music downloads where only very small file sizes can be used, and it allows various compression rates to be chosen. But how much information can be thrown away without noticeably deteriorating the quality of the sound?
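The loss-less compressed case in the list above can be demonstrated with Python's general purpose `zlib` module. This is a stand-in chosen for the example, not an audio codec: one second of digital silence shrinks to a tiny fraction of its size and is restored bit for bit on playback. The helper name `lossless_round_trip` is hypothetical.

```python
import zlib

SAMPLING_RATE = 44100
BYTES_PER_WORD = 2  # 16 bit words

# One second of mono digital silence: every sample word is zero.
silence = bytes(SAMPLING_RATE * BYTES_PER_WORD)


def lossless_round_trip(pcm_bytes):
    """Compress, then decompress; return the compressed size and the restored data."""
    packed = zlib.compress(pcm_bytes)
    restored = zlib.decompress(packed)
    return len(packed), restored


packed_size, restored = lossless_round_trip(silence)
print("original bytes:", len(silence))
print("compressed bytes:", packed_size)
print("restored identical to original:", restored == silence)
```

A lossy format cannot pass the final check, because whatever it discards is gone for good.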
256Kb/s 5:1 compression Music quality almost indistinguishable from the original CD.
192Kb/s 7:1 compression Popular choice for reasonable quality.
128Kb/s 11:1 compression Popular for internet download music and ipods.
96Kb/s 14:1 compression Easily discernible lower sound quality than original CD recording.
64Kb/s 22:1 compression Mono speech only; don't attempt music.
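The compression ratios in the list above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the reference is standard CD audio (44,100 samples per second, 16 bits, 2 channels); the function name is only illustrative. The ratios in the list appear to be rounded down.

```python
# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels.
CD_BIT_RATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kb/s


def compression_ratio(mp3_kbps: float) -> float:
    """Ratio of the CD bit rate to the chosen MP3 bit rate."""
    return CD_BIT_RATE_KBPS / mp3_kbps


if __name__ == "__main__":
    for rate in (256, 192, 128, 96, 64):
        # int() rounds down, which matches the figures quoted in the list.
        print(f"{rate:>3} kb/s  approx {int(compression_ratio(rate))}:1")
```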
The best test to hear a comparison of MP3 lossy compression to original un-compressed CD sound is to use white or pink noise, audience applause, rain on a tin roof, a bundle of keys thrown up in the air and caught, and worst of all a Harpsichord.
Achieving an acceptable performance from a very limited bit rate is a technological miracle greater than the biblical parable of the fishes and loaves. However when applied to multi-channel cinema sound we must not lose sight of marketing deception, as when 'white bread' is promoted under a brand image of being vitamin enriched. A product is stripped of nutrients (or necessary bit rate) and then sold as being enriched with artificial vitamins (or clever deletion algorithms) that enable it to taste or sound acceptable. Hopefully when MP99 arrives it will discard clap-trap, boring cliché dialogue and TV commercials as well.
www.digitalradiotech.co.uk/mpeg_coding mpeg coding
Vinyl Many who grew up in the 60s and 70s with a hi-quality sound system and large vinyl collection can clearly hear the degradation of music quality of most lossy-compressed digital formats. But the majority of the modern digital generation have grown up in an excessively noise polluted world where hearing fine detail in nature and music is often not possible.
Compressed - lossy Digital cinema sound
It is argued that much of what is recorded on an un-compressed loss-less CD format can not be heard. What cannot be heard, cannot be heard, therefore silence and any sounds below the threshold of hearing, or below the general ambient noise level of 40dBA can be deleted. Loud sounds mask softer sounds and the softer sounds can be deleted when louder sounds are being played. Some frequencies that are close together can mask each other, therefore the masked sounds can be deleted.
The perceived sound quality is dependent on the limitations of listener attention: being in a highly reverberant environment with ambient noise, and listening inattentively to a small, cheap, limited fidelity home cinema system while being distracted by vision. The final essential factor is that the listener's expectation has been influenced by marketing.
When all these external factors are combined they effectively mask the distortion anomalies created by the lossy compressed digital sound.

- Ambient noise of cinema is approx 40dBA, any sound below this level can be deleted.
- Dynamic range between background noise and maximum level is approx 40dB.
- Hearing sensitivity of low and high frequencies is limited, only 20dB dynamic range required.
- Sound outside of our ability to hear direction (chirping cricket) can be collapsed to mono.
- High level sounds psycho-acoustically mask similar sounds of lower level, which can be deleted.
Bit pool: To achieve these deletions, plus many more, requires a sample (every fraction of a second) of the total information to be stored in a bit pool for analysis. Instantaneous decisions are made of what information can be deleted. But when all channels are over-used (at the same time), beyond the capacity of the bit pool, some essential information may have to be dumped.
Depending on the % of deletion, unpredictable outcomes may occur. A frequency band in one channel may be deflected to another. Frequency bands that are similar in different channels may be deleted leaving only the loudest heard. Similar bands from different channels may appear in the center channel. Ringing or pre-echo of percussive or transient sounds may occur etc etc etc.
Psycho-acoustic masking These random artefacts occur within a fraction of a second and are averaged by our hearing and expectation. The majority of non-discerning audience in a cinema do not notice if all channels are collapsed to mono, or if the surrounds are on. Sight can influence what we believe is the direction of sound. Also none of these lossy compression techniques reduce frequency response. For most people hearing high frequencies of any type is thought of as high fidelity.

The above compressed right pic has colours, and some colours are exaggerated, and we clearly see the picture as distorted, this is because the picture remains static in time. However because audio constantly changes in time, we can be more easily fooled. Stand back from the computer screen approx 3 meters (10ft) slightly squint or de-focus the eyes and notice how similar the 2 pictures become. This effect is similar to psycho-acoustic masking.
When audio is digitally compressed the outcome is similar, and some hi-frequencies become exaggerated giving a false perception of fidelity. However an attentive listener can easily hear poor music resolution, smearing, image loss, reduced depth of field and chaotic imbalance between channels.
If the anomalies of lossy compression are not to be heard, it requires a high correlation of similar sounds between the channels. A simple test to hear limitations of lossy compressed 5.1 formats is to put different full fidelity music on each sound track.
- Left channel Mahler's 5th
- Center channel Beethoven's 4th
- Right channel Tchaikovsky's Nutcracker suite
- Left-rear channel Loud Rock music
- Right-rear channel Rap or Techno music
- .1 Sub-bass channel African drums
This test is unrealistic as this level of sound separation is not required for multi-channel sound in film production. But this test will reveal the limitations and demonstrate what can be achieved. The simplest test that can be achieved is people simultaneously speaking different languages, recorded and then played back, with consistent separation results from each channel. However if the people are rotated as if on a merry-go round, strange things can start to happen.
Demonstration trailers promoting digital sound often consist of loud impressive animated computer sounds with minimal transients and harmonics, heard through mostly limited fidelity speaker systems, where the limitations and anomalies of lossy compression are not heard. Also the majority of companies behind digital technology are secretive, aggressively protect their interests and do not openly disclose problems and limitations.
Basic mixing rules can be applied to achieve a consistent outcome in minimising lossy compression artefacts, but they do not need to be obeyed:
- Dialogue to center channel only
- Front left and right for low level background music
- Surrounds used sparingly, during minimal use of front channels
The future aim for when digital cinema is available, is that the audio will hopefully be available in loss-less compressed format replayed exactly as the original recording was made. Only then will the sound fidelity and channel separation match the best analogue magnetic format of the past without its limitations.
One idea is that the film will be delivered on large format CDs and loaded into high speed hard drives from which the film will then be shown. However these decisions are not yet finalised.
www.mkpe.com Has a good critical historical explanation of digital sound formats.
www.jpeg.org go to JPEG 2000 link for new digital cinema standards.
Digital fidelity "Multi-channel digital transmission (encoding and decoding) that utilizes low bandwidth optical recording techniques is a compromise that must be appreciated for what it represents as an alternative to previous linear formats. Data compression, bit errors and dropouts are inherent in all restricted bandwidth systems that favour dynamic range as opposed to bandwidth and linearity.
Work done in the field of digital telephony has resulted in improved speech intelligibility at the expense of fidelity. This is of little concern for a dialog or sound effects channel but is entirely another matter for music. As the majority of cinematic productions are a complex mix of all these ingredients it is little wonder that music fidelity is masked by competing sound components that can be 6 to 10dB higher in level.
The musical content of a modern production can thus be seen as a series of 'sound grabs' competing with every other component in the mix. This aesthetic masking has now become a method of increasing the average levels without incurring overload by over-modulation." Written by Keith McPherson (audio and telecommunications engineer)
A Chain B Chain
Cinema sound management is divided into 2 sections. A chain refers to the sound track on the film stock, projector sound reader and Dolby decoder including other competitive decoders. B chain refers to power amplifiers, crossovers and speakers, including cinema room EQ. Traditionally the decoders and amplifiers are mounted in a 19in rack in the projector room. The long speaker cables from the projector room to the screen and surround speakers are mostly hi-current low-resistance 110V or 240V power cables which are installed by electrical contractors when the cinema is made. Commercial cinemas are not attached to any form of audiophile nonsense and therefore do not use 'magical' speaker cables.

It can be questioned why the amplifiers are not placed with the speaker system with short speaker cables, and only requiring a long signal cable from the projector room. This approach is normal with live entertainment sound systems; the answer is simply historical convention.
B Chain Speaker positions
All cinemas follow basic rules of speaker placement to provide a consistent outcome, which when looked at in detail is the most pragmatic approach. To begin with, the speakers behind the screen have to be angled accurately at the points shown in the below pic. They are then re-adjusted by ear to achieve a stereo image that maintains maximum consistency throughout the room. This may require a larger toe-in of the left and right speakers to minimise reflections from the adjacent side walls. An increased toe-in also sometimes provides a better stereo image for sound buffs who prefer to sit at approx 1/3 distance from the screen. For those who wish to sit at the rear of a cinema, the stereo image will have no value.

Different cinema screens will attenuate high frequencies to varying degrees, and it is essential to place the high frequency horn as close as possible to the screen. Air also absorbs high frequencies over long distances which has to be adjusted for large cinemas. A variable 3dB per octave lift above 3kHz can easily compensate for all high frequency losses.
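The 3dB per octave lift above 3kHz can be put into numbers. This is a minimal sketch of an idealised lift curve only, not a real filter design; the function name and the default values are illustrative, taken from the figures in the paragraph above.

```python
import math


def hf_lift_db(freq_hz: float, corner_hz: float = 3000.0,
               slope_db_per_octave: float = 3.0) -> float:
    """Idealised lift: 0 dB up to the corner, then a fixed dB rise per octave."""
    if freq_hz <= corner_hz:
        return 0.0
    octaves_above_corner = math.log2(freq_hz / corner_hz)
    return slope_db_per_octave * octaves_above_corner


if __name__ == "__main__":
    for f in (1000, 3000, 6000, 12000):
        print(f"{f:>5} Hz  lift {hf_lift_db(f):.1f} dB")
```

Because the lift is described as variable, the slope is a parameter here and would in practice be set by measurement and by ear.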

The front speaker cabinets should also be placed in a large baffle behind the screen as shown with the JBL speaker system in the above pic. The baffle provides forward propagation for the lower frequencies. The baffle also stops sound from reverberating against the wall behind the screen. The improvement in sound system performance from having a screen baffle is very noticeable. However very few cinemas are prepared to incur this extra expense. Information on the left-center-right screen speakers is covered in more detail in the chapter 'Large Systems'.
Surround speakers
Live musical concerts including opera do not have surround sound. Sounds from other directions cause audience distraction. Reverberation and echoes reduce intelligibility, therefore favoured seats are closer to the stage. However the movie experience is enhanced by surround sound to create spatial environmental effects; providing it does not distract or conflict with the front system.
Home cinema marketing states that the surrounds should be the same speakers as the front. But this is never the case in commercial cinemas. A stated requirement for commercial cinemas is that the total sound energy from all the surrounds must be capable of matching one of the front screen speaker systems. However sound energy from the surrounds is almost never required to equal the loudness capacity of a front system. This is possibly stated to ensure that the surrounds are always capable of delivering what they should.

Surround speakers normally begin at approx 1/3 of cinema distance from the screen. The basic mounting positions above are a guide and the final positions should be adjusted by ear. The height and downward angle of the surround speakers is aimed at achieving the minimum loudness difference from wall to center seats (approx -3dB). Also the total surround level is calibrated with pink noise to be approx -3dB below the level of the main screen system.
Surround delay time A mandatory 20 milli-second delay is applied to maintain hearing attention to the front speakers. Delay time is then extended by the time sound takes to travel the cinema length, in 1 milli-second steps (1 milli-sec = 340mm 1.1ft). A cinema length of 34m (110ft) will require an extra 100 milli-second (1/10 second) delay. The Dolby processor automates this procedure. Be sure to follow the processor instructions carefully and double check the cinema measurements. The final adjustment is calibrated by ear and any error must be biased toward reducing surround level, not toward increasing surround level.
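The delay arithmetic in the paragraph above can be sketched as follows. This only follows the rule of thumb as stated here (20 milli-seconds plus 1 milli-second per 340mm of cinema length); it is not the procedure of any particular processor, and the function names are illustrative.

```python
MM_PER_MS = 340.0   # rule of thumb from the text: sound travels approx 340 mm per ms
BASE_DELAY_MS = 20  # fixed delay that keeps attention on the front speakers


def extra_delay_ms(cinema_length_m: float) -> int:
    """Distance-related delay, rounded to the nearest 1 ms step."""
    return round(cinema_length_m * 1000 / MM_PER_MS)


def total_surround_delay_ms(cinema_length_m: float) -> int:
    """Base delay plus the distance-related delay."""
    return BASE_DELAY_MS + extra_delay_ms(cinema_length_m)


if __name__ == "__main__":
    for length_m in (17, 34):
        print(f"{length_m} m cinema: extra {extra_delay_ms(length_m)} ms, "
              f"total {total_surround_delay_ms(length_m)} ms")
```

The result is a starting point only, since the final adjustment is still made by ear.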
The surround speakers should not be heard directly as point sources, as this causes distraction from the screen system. Surrounds are meant to disperse the sound in a diffused manner, similar to how we hear sound in the far field natural environment. But surround speakers are constrained by walls, that cause the sound to be heard from the nearest speaker only. This problem puts impossible constraints on the recording engineer.

Line source surround The perfect surround system would consist of large numbers of 8in speakers on each wall as in the above pic. The level from a line source falls by -3dB for each doubling of distance, not -6dB as from point sources. Also a horizontal line source behaves as a natural diffused sound field, similar to how sound disperses in nature from wind, rain, motorways, urban environments and crowds. The line length can be sub-sectioned into groups of 4 to 6 speakers. Each speaker group can be independently powered and delay sequenced to obtain moving spatial sound effects similar to chaser lights.
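The difference between the two dispersion laws is easy to tabulate. This is a minimal sketch assuming ideal sources in free space (a true point source and a very long line source); a real row of surround speakers in a room will only approximate the line source figures.

```python
import math


def point_source_loss_db(distance_m: float, ref_m: float = 1.0) -> float:
    """Inverse square law: approx 6 dB loss per doubling of distance."""
    return 20 * math.log10(distance_m / ref_m)


def line_source_loss_db(distance_m: float, ref_m: float = 1.0) -> float:
    """Ideal line source: approx 3 dB loss per doubling of distance."""
    return 10 * math.log10(distance_m / ref_m)


if __name__ == "__main__":
    for d in (1, 2, 4, 8, 16):
        print(f"{d:>2} m  point -{point_source_loss_db(d):4.1f} dB  "
              f"line -{line_source_loss_db(d):4.1f} dB")
```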
Using large numbers of speakers to achieve a line source and sequenced delays would not have been economically viable in the early days of cinema. A high quality 8in speaker in the west may be $50 - $100 , but these speakers are often shipped from China in packs of 200, for less than $10ea. The digital management for sequenced delay is simply achieved. All that is required is will, imagination and a passion for cinematic entertainment.
There are never ending debates about which surround speakers are best. In commercial cinemas it is inadvisable to use speakers with 70V line transformers, as applied in the ceilings of shopping centres. These speakers are often of the lowest quality and can easily be destroyed with a few Watts. Also line transformers restrict bandwidth and can saturate with dynamic range, especially at low frequencies. When a line transformer saturates it behaves similar to a short circuit and creates extreme distortion.
If possible wire each surround speaker independently to the projector room, and then connect the speakers from each row in series parallel arrangement at the amplifier rack. This procedure uses a lot of excess cable, however this enables simpler maintenance in the future, to isolate and replace a faulty speaker.

Parallel vs Series-parallel Connecting speakers in series parallel may be essential where only one amplifier is available. However, connecting speakers in series causes the distortion of the speakers to be reflected into each other. Also the impedance of a speaker at resonance and at hi-frequencies can be approx x4 greater than the rated impedance. This can cause chaotic behaviour including restricting the power and therefore the loudness the speakers can be heard at. The majority of amplifiers are designed for a minimum load impedance of approx 4R, which only allows 2 x 8R speakers to be connected in parallel to each channel. Therefore a stereo amp will enable 4 x 8R speakers to be wired to it. Parallel wiring of speakers requires extra amplifiers at higher cost, and is worth the effort.
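The load an amplifier sees from each wiring arrangement can be worked out as below. This is a minimal sketch that treats each speaker as a plain resistance at its rated impedance, which is a simplification, since the real impedance varies with frequency as described above. The function names are illustrative.

```python
def series(*impedances_ohm: float) -> float:
    """Total impedance of speakers wired in series."""
    return sum(impedances_ohm)


def parallel(*impedances_ohm: float) -> float:
    """Total impedance of speakers wired in parallel."""
    return 1.0 / sum(1.0 / z for z in impedances_ohm)


def power_w(volts_rms: float, impedance_ohm: float) -> float:
    """Power delivered into a load at a fixed amplifier output voltage."""
    return volts_rms ** 2 / impedance_ohm


if __name__ == "__main__":
    print("2 x 8R in parallel:", parallel(8, 8), "R")
    # Four 8R speakers as two series pairs, with the pairs in parallel.
    print("4 x 8R series-parallel:", parallel(series(8, 8), series(8, 8)), "R")
    # If the impedance rises x4 (8R to 32R), power at the same voltage falls to 1/4.
    print("20 V into 8R:", power_w(20, 8), "W;  into 32R:", power_w(20, 32), "W")
```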
Speaker phase Understandably the surround speakers must be in phase with each other. But because surrounds are collectively delayed in time, it is not possible to apply speaker phase that is matched to the front system. However it is correct practice to have the speaker phase + in reference to the amplifiers.
Bass - Sub-bass
The ability to provide deep bass had been understood since the 1930s. Some of the large wide screen cinema sound systems in the 1950s could easily manage 40Hz. Also during the 1950s some DIY (do it yourself) hi-fi enthusiasts made large brick speaker enclosures with 15in speakers in the corners of their living rooms easily achieving 20Hz. The only limitation was the needle jumping off vinyl records. Many Rock concerts in the 1960s used large folded horn bass bins.
The first mass audience experience in cinema of powerful separate sub-bass was created for the "Earthquake" movie in 1974 using very large Cerwin-Vega bass bins in a system described as 'Sensurround'.
Sound quality especially sub-bass was never seen as being important for 35mm main-stream cinema. The majority (but not all) cinema sound systems were and still are made as cheap as possible; 2 x 15in with a horn, as in the below pic. Bass frequencies of 40 - 60Hz are approx -6 to -12dB below voice range. Because cinemas are quiet environments the bass can still be heard at this lower level. This is the reason the problem had not been previously addressed.

By the late 80s Dolby SR became available providing sub-bass extension which was promoted vigorously. When digital arrived the sub-bass was available as a separate LFE (Low Frequency Effects) channel. Dolby's success was in understanding that the majority of cinema management would spend as little as possible to improve sound. The separate sub-bass extension is the simplest low cost solution, requiring only 1 extra amplifier with a single 18in bass speaker (approx $2,000) to be added to an existing cinema system.
Our ears are insensitive to bass frequencies, and the 18in speaker is less efficient than the speakers in the main system. Together these require the 18in to be equalised and driven with greater power to be heard at the same level as the voice frequencies. 1,000 Watt amplifiers are commonly used. This is the primary reason sub-bass is not included in the main front channels and has to be separate.
An average 18in bass speaker cone has a fundamental resonance (Fs) of approx 35Hz in free air. When the speaker is put in a 10 cubic ft (280 Litre) box the speaker resonance is made higher; approx 50 - 60Hz. To achieve deep bass a resonant port is put in the box and tuned to approx 30Hz, not 20Hz as often believed.
The port (Helmholtz resonator) will generate a resonant note only at the frequency it is tuned to, in reverse phase. The port resonance is also activated, at a lower level, by any movement of the cone at other frequencies. Many musicians who play electric bass guitar, including a % of discerning audiophiles, do not use bass boxes with ports. However ported boxes have a fanatical following singing their virtues.

Port problems For a port to be maximally effective it should be similar in diameter to the speaker and very long. But a large port similar to the size of the speaker is not practical as it would have to extend outside the box. So for the port to fit inside the box it is reduced to a smaller size. These smaller ports are less efficient and generate greater air velocity, which can create whistling sounds. Also any frequencies from the amplifier that are below the port resonance cause decoupling of air loading to the cone, generating excessive cone movement (excursion) that can easily destroy the speaker. Cinema projectionists often describe how common this problem is.
To deal with this problem many speaker manufacturers make the cone suspension tight, which raises the Fs, compromising bass performance for reliability and higher power rating. Also a filter is sometimes used to stop bass frequencies below port resonance getting to the speaker. All of this amounts to being a bag of worms that has to be managed. But ported boxes are the only way to achieve effective 'cheap' deep-bass. This is not right or wrong but simply physics.
Sealed box In a sealed box, a speaker cone must move 4 times the distance for each octave decrease to maintain the same acoustic output, providing the bass wavelength does not exceed x 10 speaker diameter. This is the reason a 4in speaker cannot deliver sub-bass at a reasonable level for home cinema regardless of how far the cone can be made to move. This can be simply demonstrated by waving one's finger in the air, compared to a large sheet of paper.
Many active sub-woofers for home cinema and vehicles use 10in or 12in speakers in small sealed boxes of approx 1 - 2 cubic ft (30 - 60 litres). The speaker resonance can be as high as 60 - 100Hz. Below system resonance the cone cannot increase excursion at 4 times the distance for each octave decrease, which is the minimum required to maintain constant acoustic output. Below system resonance the cone excursion is kept constant, by air compression in the small box. Bass efficiency rolls off at -12dB/octave below resonance. (-12dB = 1/16 of the power)

Sub-woofer EQ Many sub-bass amplifiers have special equalization (EQ) to boost amplifier power, compensating for the decrease in efficiency below resonance. If the speaker-box system resonance is 60Hz, and the system is required to be flat down to 30Hz, then the amplifier will have to deliver (+12dB) 16 times more power at 30Hz, to force the cone to move 4 times the distance. These speakers have heavy cones and are inefficient. Amplifier power for domestic application can be approx 300 Watts.
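The EQ example above (60Hz resonance, flat to 30Hz) can be worked through numerically. This is a minimal sketch assuming the ideal sealed box behaviour described in the text: excursion must rise x4 per octave, and the EQ must restore the -12dB/octave roll-off. The function names are illustrative.

```python
import math

ROLL_OFF_DB_PER_OCTAVE = 12.0  # sealed box roll-off below resonance, from the text


def excursion_ratio(resonance_hz: float, target_hz: float) -> float:
    """Cone excursion needed at target_hz relative to resonance_hz (x4 per octave)."""
    return (resonance_hz / target_hz) ** 2


def eq_boost_db(resonance_hz: float, target_hz: float) -> float:
    """EQ boost needed to stay flat down to target_hz."""
    octaves_below = math.log2(resonance_hz / target_hz)
    return ROLL_OFF_DB_PER_OCTAVE * octaves_below


def power_ratio(boost_db: float) -> float:
    """Amplifier power multiplier for a given boost in dB."""
    return 10 ** (boost_db / 10)


if __name__ == "__main__":
    boost = eq_boost_db(60, 30)
    print(f"Excursion x{excursion_ratio(60, 30):.0f}, boost +{boost:.0f} dB, "
          f"power x{power_ratio(boost):.1f}")
```

The power multiplier comes out at just under 16, which is the "16 times more power" quoted above.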
To understand the management of bass and sub-bass frequencies requires relating to the frequencies as their wavelengths. The velocity of sound is approx 344m (1125ft) per second.
- 100Hz wavelength = 3.4m (11.25ft)
- 42Hz wavelength = 8m (26ft) lowest note on double bass
- 20Hz wavelength = 17.2m (56.4ft) lowest sub-bass effect
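The wavelengths listed above follow directly from wavelength = velocity / frequency. This is a minimal sketch using the 344m per second figure from the text; the function name is illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 344.0  # approx velocity of sound quoted in the text
FEET_PER_METRE = 3.28084


def wavelength_m(freq_hz: float) -> float:
    """Wavelength in metres for a given frequency."""
    return SPEED_OF_SOUND_M_PER_S / freq_hz


if __name__ == "__main__":
    for f in (100, 42, 20):
        metres = wavelength_m(f)
        print(f"{f:>3} Hz  {metres:5.2f} m  ({metres * FEET_PER_METRE:5.1f} ft)")
```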

The imagined ideal is for the speaker diameter to be equal to the longest bass wavelength. But this would require the sub-woofer to be the approx size of the cinema screen.
Perfect cinema sub-bass. For the sub-bass to maintain the same directivity and efficiency as the 2 x 15in speakers in front main speaker boxes, the total area of the sub-bass speakers should be approx x4 the area of all the 15in speakers in the left-center-right boxes. This requires 6 x 30in speakers. Because 30in speakers are not normally available, 6 x 18in speakers is practical.
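The cone area comparison above can be checked with simple geometry. This is a rough sketch that assumes 2 x 15in speakers in each of the 3 front boxes (6 in total) and uses the nominal speaker diameter as the radiating diameter, which overstates the true cone area slightly. The function name is illustrative.

```python
import math


def total_cone_area_sq_in(diameter_in: float, count: int) -> float:
    """Combined area of `count` circular cones of the given nominal diameter."""
    return count * math.pi * (diameter_in / 2) ** 2


main_area = total_cone_area_sq_in(15, 6)   # 2 x 15in in each of 3 front boxes
target_area = 4 * main_area                # approx x4 the main system area
area_6x30 = total_cone_area_sq_in(30, 6)
area_6x18 = total_cone_area_sq_in(18, 6)

if __name__ == "__main__":
    print(f"Target area:  {target_area:7.0f} sq in")
    print(f"6 x 30in:     {area_6x30:7.0f} sq in")
    print(f"6 x 18in:     {area_6x18:7.0f} sq in "
          f"(x{area_6x18 / main_area:.2f} the main system)")
```

Doubling the diameter gives x4 the area, which is why 6 x 30in meets the target. On these assumptions 6 x 18in gives a smaller multiple, so it is the practical option, not an exact equivalent.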

In sealed boxes the 18in speakers will have to have a low Fs fundamental resonance (very loose suspension) of less than 25Hz. The boxes will need to be very large, approx 20 cubic ft (600 litres). The speakers will still require extension EQ to force them to reach down to 20Hz at equal power. The bass boxes will deliver more propagational energy if they are stacked together in one place. However to minimise excessive cancellation from standing waves throughout the cinema, it is best to separate the bass boxes as in the above pic.
High quality 18in speakers are expensive, and most are made with high Fs (stiff surrounds) to keep the voice coil centred, which unfortunately defeats its ability to produce deep bass in a sealed box. Most 18in speakers are designed to be used in ported boxes only.

A cost effective solution that performs as well (if not better) is 24 x 15in hi-fi speakers (2in voice coils approx $100ea). These speakers are readily available with soft surrounds that have a low Fs approx 25Hz. The cost is low because they are made in large quantities. The 15in speakers have to be in compound pairs, each pair acts as a single speaker. This arrangement does not look attractive, however it gives excellent linearity, low distortion, at very low frequencies. Each sealed box will still have to be approx 20 cubic ft (600 litres) and may require a small amount of extended bass EQ to achieve 20Hz at equal power.
Sub-woofer Phase It is not stated that sub-woofer phase is required to be matched to the front left-center-right speakers. This is because it is regarded as a separate channel with different information. But this is not a correct assumption because at times the same bass information can be applied to the sub-bass channel and front speakers. This should be checked independently with a signal generator to ensure both sub-woofer and front speakers are in phase at the crossover frequency. Do not attempt phase correction with pink noise measurement using microphones in the far field as this result can be randomly influenced by cancellations from standing waves.
Time alignment As an imaginary concept, if the whole frequency response (from each speaker) radiated from a single point then all the frequencies would be in time with each other. With large cinema sound systems the low and hi-frequency drivers are separated by small physical distances. With the old large Altec A4 system (below pic), the 15in speakers and horn driver were mechanically in line on axis, so there is zero time difference to the audience. But the vertical height difference between the drivers is approx 2m (6ft), which at 344m /second is a 6 milli-second time difference that can be considered audible for someone sitting underneath the speaker system in the front row. This is similar to being at the front or on stage when listening to a live band.

Time differences of less than 10 milli-seconds between sharp transient clicks can not be heard as separate, but as a single fattened partially muffled click. A common practice in pop recording is to add multiple short delays of less than 15 milli seconds to voices and instruments to give them a fatter sound. However time differences between transient clicks beginning from 11 milli-seconds gradually start to be heard, and become clearly separate when approaching 30 milli-seconds, a path difference of 10m (30ft).
The problem of separate double clicks was first noticed with very large speaker systems in the 1930s when Fred Astaire and Ginger Rogers tap danced. At first this was confusing because the distances between the low and hi-frequency drivers were not great enough for the distinct double clicks to be as extreme as they were.
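Path differences can be turned into time differences and compared with the thresholds quoted in this section. This is a minimal sketch: the 10 and 30 milli-second boundaries are the approximate figures from the text, not fixed psycho-acoustic constants, and the function names are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 344.0


def time_difference_ms(path_difference_m: float) -> float:
    """Arrival time difference for a given difference in path length."""
    return path_difference_m / SPEED_OF_SOUND_M_PER_S * 1000


def describe(ms: float) -> str:
    """Classify a time difference using the approximate thresholds in the text."""
    if ms < 10:
        return "heard as a single (fattened) click"
    if ms < 30:
        return "gradually starts to be heard as separate"
    return "clearly heard as separate clicks"


if __name__ == "__main__":
    for path_m in (0.1, 2.0, 5.0, 10.5):
        ms = time_difference_ms(path_m)
        print(f"{path_m:>5} m  {ms:5.1f} ms  {describe(ms)}")
```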
The early multi-cell horns were designed to have a wide dispersion to provide sound to the upper and side balconies. But these wider dispersion horns caused slap echoes from the ceiling and side walls to be greatly increased. Modern cinema horns have a narrow controlled directivity compared to the earlier multi-cell wide directivity horns.
Time alignment correction including phase correction should be considered as good housekeeping, regardless if it is audible or not.
Time alignment myths Audiophiles with golden ears claim that very small differences between woofer and tweeter of approx 100mm (4in) have to be time align corrected. They believe that this small correction transforms the entire performance of a sound system, similar to using magical speaker cables. Time alignment differences of less than 1 milli-second are acoustically transparent as double blind ABX tests have proved. Many audiophiles will not buy any product including power cables if it does not have the words 'Time Aligned' printed on the packaging. Unfortunately there are many so called recording engineers that have the same mystical beliefs.
Large sound systems
Screen Image -V- Sound Image
In most cinemas the physical size of the sound system is small compared to the screen and therefore the sound experience does not match the experience of the picture. Small speaker systems disperse as a point source at the inverse square law from the screen, whereas the picture is radiating from the whole screen area. The general view of the majority of people is that sound is only 10% of the cinema experience.

Cinema is successful because the picture appears to have no boundary when viewed on the big wide screen. So why not remove the boundary of a small low fidelity sound system, and make the sound as big as the picture in terms of experience.
Imagine being in a rain forest and very attentive to the sounds around you, hearing subtlety of detail which extends into infinity. Record this, go home and play it back on a small stereo. It will only be recognisable because of remembering. Turning up the volume does not make it sound real. It is made real by removing the boundary of a small low fidelity sound system and stepping into large scale 4way active electro-acoustic technology.
Inverse square law refers to energy radiating from a point source. When the reflected light (image) radiates from the screen, the inverse square law does not begin at the screen, but from the projector, at the rear of the cinema, or the same distance from an imaginary point behind the screen.

The projected image from the screen always looks big wherever we sit in the cinema. This is the reason home cinema can never create the same experience as going to a large theatre. This is similar to why the energy from the sun or the sun's size does not appear to decrease by moving another 30m (100ft) further away from it at the distance of the earth.
We can all understand that using a small home cinema screen in a large cinema, and simply increasing the luminance (brightness) does not make the image appear larger. But this is precisely what most people (including those that are technical) falsely believe about sound systems. Putting a small home cinema sound system behind the screen and simply turning up the volume does not make it sound like a large cinema sound system. It will only sound like a small sound system with the volume turned up. This point is made so as not to confuse loudness with sound stage or sound image, which refer to propagation directivity and radiating area.
Wavelength propagation The wavelengths of the various colours of light differ by less than 1 octave, and are approx 1/2 micro meter. We can assume that all light from the screen radiates at a similar wavelength which is very small compared to the radiating area (screen size). Therefore the image will appear consistent in reference to screen size and observed distance.
However this is not so with sound. Sound is kinetic energy (air vibration) at very low velocity of 344meters /second. Wavelength at 100Hz = 3.4m (11ft) Wavelength at 10kHz = 34mm (1.4in) a large ratio difference of 100:1
For the sound image (dispersion and propagation) to be consistent over the frequency and dynamic range, the acoustic radiating diameter should approx = wavelength, increasing x 2 for each octave decrease. This imaginary ideal is impossible because the speaker would have to be as large as the screen at 20Hz wavelength = 17m (56ft).
The average home cinema speakers are approx 4in - 8in. The lower frequency wavelengths are approx x10 to x100 larger than the speaker diameter. The higher frequency wavelengths are approx 1/10 to 1/2 speaker diameter. Therefore the sound image (dispersion) will be inconsistent across the frequency spectrum, also the spectral balance of low to high frequencies will change with power.
The average commercial cinema speaker system has 2 x 15in speakers which are small compared to bass wavelengths, whereas the high frequency horn is matched to the higher frequencies and has a different behaviour to the front loaded 15in speakers. The lower frequency energy from the front loaded cone speakers lags behind as the power increases.

Horns Theoretically the sound waves at the horn mouth are not as curved as if coming from a front loaded cone speaker. The sound waves from the horn appear slightly straighter as if radiating from a larger surface area, similar to the screen but on a smaller scale. This is why the hi-frequency horn appears to have a more forward projected sound image and greater voice clarity compared to front loaded cone speakers.

As most cinema sound systems are 2 way passive with 15in front loaded speakers, it is best to get as much of the sound spectrum as possible to come from the hi-frequency horn and take advantage of its increased directivity. Manufacturers of most horn 2in compression drivers recommend a crossover frequency of no lower than 800Hz at specified power.
500Hz crossover Providing the designers of the 2 way speaker systems are willing to limit the power to the horn compression driver to less than 1/2 of its specified power, it is possible to lower the crossover to 500Hz before the driver diaphragm is at risk of being damaged. This also requires the horn length and mouth be made larger than normal, sometimes described as long-throw or constant directivity horns.

Many large sound systems of the past were 2 way passive and had very large extended horn shaped baffles for the low-frequency 15in speakers. This helped obtain a similar directivity and efficiency to the hi-frequency horn on top. Early valve amplifiers were approx 30 to 60 Watts which provided a strong motivation for design engineers to make the speaker system as efficient as possible. The overall sound quality appeared consistent over the frequency range, compared to the majority of today's smaller cinema sound systems.

The aim is for the sound to match the picture so they both appear in the same proportion everywhere in the cinema. This requires the sound spectrum to be divided into 4 sections. The below pic shows the large variation of wavelengths over the sound spectrum that have to be managed by the speakers in each section.
Above 700Hz compression drivers + horns are ideally suited to the smaller wavelengths of higher frequencies. Below 700Hz in larger 300+ seat cinemas the 15in speakers should be placed in horn shaped cabinets to increase the effective radiating area and directivity to match the high frequency horns. Below 100Hz the wavelengths are so long that the floor and walls are able to act as an extension of the horn cabinets, therefore maintaining their directivity.

Active All large sound systems require separate amplification for each of the frequency bands to achieve maximum performance with minimal error and distortion. Active management also eliminates any possibility of inter-modulation between the different frequency sections. The Lenard K4 system uses a separate amplifier for each speaker component. Electronic crossovers should be 4th order (24dB/octave) and have time alignment correction available.
Jokes are often made by technical people that all sound system problems could be solved if everyone in the cinema wore headphones. Sound fidelity is highest at close proximity to a sound system that radiates from a point source. Monitor systems in recording studios for mixing cinema sound use smaller sound systems similar to the Opal speakers in the pic below on left.

Diffusion Radiating sound from large areas with multiple speaker components increases the possibility of similar frequencies interacting, causing diffusion. The extreme of this problem is heard with large concert PA systems. The radiating area of an average cinema sound system is approx 0.75m² (8ft²). The Lenard K4 radiating area is approx 5m² (50ft²). Each of the large cinema sound systems in the below pics has different advantages. All approaches are excellent in performance and are designed to minimise the problems of diffusion when dispersing sound from large areas.

The HPS-4000 is a 4 way passive system that follows the best traditional convention of horn designs. Because the horns are forward facing they require a larger distance behind the screen which is not available in many cinemas that put screens very close to the wall. Depth 1,117mm (44in)
www.hps4000.com
The JBL 5674 is JBL's largest 3 way cinema system which reflects the traditional approach of favouring the upper voice range, typical of what we recognise cinema sound to be. The system is designed to be placed in a large baffle wall to increase the directivity of the front loaded 15in speakers. Depth 813mm (32in)
www.jblpro/pages/cinema.com
The Lenard K4 is a 4 way fully active system that requires minimum stage distance behind the screen and does not require the added expense of a baffle wall. Depth 457mm (18in) This unique rotational approach owes its origin to the industrial designer Michael Dixon who stated that increasing lower voice radiating area (100Hz - 800Hz) to improve propagation would add extra realism to the cinematic sound experience, which proved to be correct.
www.ddd.net.au
Conventional cinema systems require the (.1) information to be sent to an independent amplifier and sub-woofer speaker. With the Lenard K4 system the (.1) information is electronically remixed with its own gain control to the (left-centre-right). The .1 sub-bass is delivered through all the folded horn bass bins generating close to a plane wave front, providing vastly greater impact.

Projected acoustic center The aim of all large scale systems is to create the experience that the acoustic center is projected forward (approx 6dB) into the audience, giving a feeling that the sound appears to come from half the distance that it actually is. Providing the cinema is acoustically treated to be as close to anechoic as possible, the projected acoustic center gives us the experience of Synesthesia.
Synesthesia
Synesthesia is the experience we have when one sensory system affects another. A passing scent can evoke a strong memory; visually we can describe a colour as warm red or cold blue. Music can influence our feelings and how we experience seeing a picture. Synesthesia stems from the function of our imagination carrying sensory experience into altered states. Capturing one of our senses (sound) into 3D experience carries the other senses with it, giving the picture a larger 3D perception and enhancing our enjoyment of the movie.
Approx 1 in 30 films are created by people who have this understanding and skill. The dialogue leads with visual form edited to the music score 'Form follows content'. This procedure is fully thought out in pre-production. Often the (left-center-right) speakers are utilised as a tri stereo field giving a full auditory depth of field to the picture.
Synesthesia also has an equally negative effect on reducing our enjoyment of the film. Limitations in recording skills often result in the trend to use excessive dynamic compression. Limitations of bandwidth and dynamic range, imposed in the recording process and the projector decoding A Chain, combined with limited fidelity of the B chain speaker system, including poor acoustics of the cinema, accumulate in having an overriding negative effect on the experience of a film.
- Reverberation from cinema walls, distances us from the film as flatness.
- Harsh sound alignment causes listening fatigue and contributes to physical discomfort.
- Excessive dynamic compression limits the illusion of distance or perspective in the picture.
- Poor fidelity speakers colour the sound, unconsciously affecting our visual experience of colour vibrancy.
3D Spatial realism
Stereo image was discovered early in the recording industry. Two microphones, spaced similarly to our ears, were recorded onto separate tracks and played back through headphones. The result is a three dimensional (3D) experience of the music, similar to the visual photographic 3D stereo image.
Production techniques for 3D stereo imaging have improved over time, but are dependent on the fidelity of the sound system and minimal reverberation of the room. 3D stereo imaging cannot be heard from small low fidelity speaker systems, or in reverberant cinemas.

3D spatial realism through speakers requires a minimum of 2 stereo fields. A single field from only 2 speakers enables one part of the 3D experience to be obtained. As in the above pic three speakers can create three stereo fields from which fields 1 and 2 create spatial localisation. Sound fields can be positioned and maintained into a relatively stable left-center-right correlation over a 60deg listening angle.
Surround speakers are for diffused environmental effects, also the 5.1 protocol for home cinema puts the rear speakers at too great an angle to be correlated into stereo fields.
The objective is to match the sound with the moving image, enabling the picture to come alive. This requires sound to carry detail represented in the picture. If done well, the sound carries the moving image from 2 to 3 dimensions. This effect is called Synesthesia.
Cinema acoustics
Before sound systems existed the large grand Cinemas of the past were modelled on Opera houses with an orchestra pit in front of the screen which was later replaced by an organ that came up through the floor. Opera houses evolved to make use of reverberation to increase sound level to the audience. Many composers including Mozart hated the excessive reverberation of large concert halls which restricted their music. Mozart often preferred to perform outside where the detail of the music could be heard as he intended.
Increasing sound level by reverberation comes at the cost of intelligibility. What evolved was an imagined ideal of correct reverberation, that is, reflective area around the opera stage of short distances (short path-lengths) opening up to larger areas of longer distances (long path-lengths).
It can be argued that specific pieces of classical music suit different characteristics of reverberation. But there is no such thing as one type of reverberation that suits all classical music and all acoustic instruments. When sound systems evolved, Cinemas became trapped with the problems associated with the excessive reverberation of large auditoriums.
The subject of architecture is obsessed with status and visual form, often with little or no interest in what cannot be seen (acoustics). Many architects believe that city environment including our homes and especially auditoriums and cinemas should be as reverberant as possible. Sound absorbed is negatively described as 'room loss'. Inadequate and false understanding of acoustics has directly contributed to the excessive noise pollution of our cities.
Many early large cinemas had little or no acoustical absorption. Decorative ceilings may provide sound diffusion but little absorption. No cinema goer is interested in looking at an ornate foyer or the inside of the cinema when watching a film. The money wasted would have been better spent on properly acoustically treating the cinema.
Modern multiplexes often have red pleated curtains on walls which are mostly decorative and will absorb high frequencies, but may have little effect on absorbing low-frequencies. Often there is no acoustical absorption on ceilings except for standard acoustical tiles used in most office buildings.
George Augspurger, a previous technical director of JBL and also an excellent educator, stated that the 3 Rs of acoustics are -
- Room resonance
- Early reflections
- Reverberation
audioheritage.org Profile of George Augspurger
www.artusaindustries.us/university.html Acoustic A-Z definitions
Slap echoes and Reverberation
When surround sound evolved, the Academy directive stated that the story must not be dependent on the added sound tracks. An audience attending a mono only cinema must be able to understand the story. Because excessive slap echoes and reverberation destroy intelligibility and stereo imaging, the added sound tracks have limited effect. Many films are produced with dialogue from the center speaker only, left-right and surrounds minimally used similar to the percentages shown in the below pic.

A simple hand clap at the front of the cinema will reveal the slap echoes across the walls floor and ceiling which evolve into reverberation, and will be clearly heard. As a general rule the first early reflections within 30 milli-seconds are heard as being part of the direct sound. Whereas after 30 milli-seconds the later slap echoes and reverberation are heard as being from the room.
But most often the sound from the speakers will be continuous and constantly generating slap echoes and reverberation throughout the cinema room, which becomes included within the following information from the speakers, continuing the cycle.
Regardless of those who argue that reverberation provides an agreeable aesthetic experience, it will in varying degrees mask and contaminate the new incoming sound and directly interfere with speech intelligibility, destructively changing what we are hearing from what the director intended, potentially undermining the story line of the film.
The wall curtains, seats and audience absorb sound differently at all frequencies. The time for reverberation (at any one frequency) to diminish to -60dB (1/1,000,000 of its original energy) is called Reverberation Time (RT or T60).

The majority of modern cinemas are concrete constructed which limits sound escaping from the room. The curtain material on walls absorbs high frequencies but has little effect on absorbing low frequencies as in the pic below. If pleated curtain material absorbs 50% (-3dB) of sound energy at 500Hz the sound would have to strike the curtain 20 times to be reduced to 0.0001% (-60dB). -3dB is only heard as a slight reduction.
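The "20 times" figure can be checked with a few lines of arithmetic. This is a minimal sketch that treats every strike as removing the same fraction of energy, which is a simplification; the function names are illustrative only.

```python
import math


def reflection_loss_db(absorbed_fraction):
    """Level change in dB for one strike that absorbs the given fraction of energy."""
    return 10.0 * math.log10(1.0 - absorbed_fraction)


def strikes_to_decay(absorbed_fraction, target_db=-60.0):
    """Whole number of strikes needed to fall to target_db or below."""
    return math.ceil(target_db / reflection_loss_db(absorbed_fraction))


def energy_ratio(level_db):
    """Convert a level in dB to an energy ratio."""
    return 10.0 ** (level_db / 10.0)


print(f"Loss per strike at 50% absorption: {reflection_loss_db(0.5):.2f} dB")
print(f"Strikes needed to reach -60dB: {strikes_to_decay(0.5)}")
print(f"Energy remaining after 20 strikes: {0.5 ** 20:.2e}")
print(f"-60dB as an energy ratio: {energy_ratio(-60.0):.0e}")
```

Each strike removes only about 3dB, so reaching -60dB (one millionth of the energy) needs about 20 strikes.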

Compare a small and large cinema made from the same building materials, and fire a cap gun. Sound travels at approx 344m (1125ft) per second. Each time sound strikes a wall, a % is absorbed and a % is reflected, and so on. The average distance sound travels between walls, floor and ceiling is described as the 'Mean free path', and the average time between reflections as the 'Mean free time'.
In a small cinema the average distance between walls is closer compared to a larger cinema. Sound will be absorbed and reflected from walls more often (in the same period of time). In a large older cinema without curtain material on walls, the hi-frequency reverberant energy will be less absorbed and appear brighter, except for air attenuation. However in a smaller modern multiplex cinema with curtain material on the walls the hi-frequencies will be readily absorbed but not the bass frequencies, therefore the reverberant bass energy will appear to be greater.
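The small versus large comparison can be sketched numerically. The estimate used here (mean free path = 4 x volume / surface area) is the usual statistical room acoustics approximation and is not taken from this text. The room sizes and the 50% absorption per strike are hypothetical values chosen only for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 344.0  # value used in this text


def mean_free_path_m(length_m, width_m, height_m):
    """Estimate mean free path as 4V/S for a rectangular room (assumed formula)."""
    volume = length_m * width_m * height_m
    surface = 2.0 * (length_m * width_m + length_m * height_m + width_m * height_m)
    return 4.0 * volume / surface


def mean_free_time_s(length_m, width_m, height_m):
    """Average time between reflections."""
    return mean_free_path_m(length_m, width_m, height_m) / SPEED_OF_SOUND_M_S


def decay_time_s(length_m, width_m, height_m, absorbed_fraction, target_db=-60.0):
    """Time to fall to target_db if every strike absorbs the same fraction."""
    loss_per_strike_db = 10.0 * math.log10(1.0 - absorbed_fraction)
    strikes = target_db / loss_per_strike_db
    return strikes * mean_free_time_s(length_m, width_m, height_m)


# Hypothetical rooms: (length, width, height) in metres.
rooms = {"small cinema": (10.0, 8.0, 5.0), "large cinema": (40.0, 30.0, 12.0)}
for name, dims in rooms.items():
    print(
        f"{name}: mean free path {mean_free_path_m(*dims):5.2f} m, "
        f"mean free time {mean_free_time_s(*dims) * 1000:5.1f} ms, "
        f"decay to -60dB {decay_time_s(*dims, 0.5):4.2f} s"
    )
```

With the same wall material, sound in the smaller room strikes a surface more often each second, so the same absorption per strike removes energy sooner.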

Bass frequencies have long wavelengths and due to propagation directivity, the inverse square law does not apply in a small cinema. This is similar to sub-bass in vehicles.
Due to the tradition of how cinema sound is calibrated in most multiplexes, the direct on-axis energy from the speakers is aligned to sound harsh or trebly, relying on the reverberant propagation bass energy, to fill in the difference. This traditional alignment approach is critically questioned further in this text.
In a large cinema the bass energy will have less propagation from the walls, and the inverse square law will apply to some degree. At two-thirds of the way back in a large cinema the bass energy will be less by comparison. Overall there is still insufficient absorption for low frequencies. This causes the lower end of the sound spectrum to be muddled. Action films depend on bass energy for effect, which is recent in film history, and most cinemas were designed before this trend came about.
Critical Distance
The management of acoustics for entertainment venues requires understanding Critical distance. Critical Distance is where the direct sound from the speakers and reverberant sound energies are equal. The Critical Distance is different at all frequencies. The more reverberant a room, the closer is Critical Distance. The more absorbent a room, the further is Critical Distance.

Direct sound from the speaker system diminishes in level as a function of the distance (inverse square law) whereas reverberation constantly spreads throughout the cinema room. Because there is always new sound from the speakers, reverberation keeps building up until the incoming sound equals the sound absorbed (steady-state). At speech frequencies the direct sound energy may be equal to the reverberant, but at the lower frequencies the reverberant energy may be 4 to 10 times greater than the direct.
If reverberant energy becomes 12dB or more above the direct sound from the speakers, toward the rear of the cinema, intelligibility becomes lost. The simplest way to find 'Critical Distance' is to play compressed pop music through the sound system. Walk around the room, and you will be surprised how easy it is to identify the critical distance.
Good acoustic design in all entertainment and cinema venues requires the Critical Distance to be as far as possible from the sound source, and any resultant reverberation minimal and even at all frequencies. Un-even reverberation imposes annoying colouration in the sound. It is understandable that 100% anechoic is not physically possible in a large public environment but the closer to achieving it the better.
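The link between reverberation and Critical Distance can be sketched with a commonly quoted approximation, Dc = 0.057 x sqrt(Q x V / RT60), with V in cubic metres and Dc in metres. This formula is not taken from this text, and the room volume, directivity factor Q and reverberation times below are hypothetical values used only to show the trend.

```python
import math


def critical_distance_m(volume_m3, rt60_s, directivity_q=1.0):
    """Approximate critical distance in metres (commonly quoted estimate, assumed here)."""
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)


# Hypothetical 5000 cubic metre cinema with a horn of directivity factor Q = 10.
VOLUME_M3 = 5000.0
HORN_Q = 10.0
for rt60 in (2.0, 1.0, 0.5, 0.25):
    distance = critical_distance_m(VOLUME_M3, rt60, HORN_Q)
    print(f"RT60 {rt60:4.2f} s: critical distance approx {distance:5.1f} m")
```

Halving the reverberation time pushes the critical distance out by a factor of about 1.4, and a more directional source pushes it out further.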
This text maintains that there is no such thing as one characteristic of reverberation that suits all applications and music. Those who argue for cinema room reverberation providing an aesthetically pleasing experience are describing personal preference only related to themselves, similar to chocolate versus vanilla. The correct reverberation is the decision of the film director (not the cinema) heard from an excellent sound system, especially from excellent surrounds, preferably as a horizontal line source.
Acoustical absorption
Furnishing and curtain fabrics against walls readily absorb high frequencies, but have limited absorption at low frequencies. The further curtain fabrics are placed away from walls, the more the absorption extends to lower frequencies. The amount of sound energy absorbed depends on type of material, weight and pleating width. Rock wool and fibreglass have the highest absorption capacity, converting air movement to heat at molecular level. Both consist of minute sharp fibres that are irritant and need to be contained within fabric.

The 1/4 wave-length rule. Acoustically absorbent material must be placed away from walls and ceiling, at a distance of 1/4 wavelength of the lowest frequency to be absorbed. This will include all higher frequencies if the absorbent material is soft furnishing or fibreglass. Please note that the ceiling should also be included.
Understandably, placing acoustical absorbent material 2m (6ft) from all walls may be thought of as impractical but the closer to achieving it the better. The slight reduction in the visual size of the cinema, will only be noticed when the lights are on. Acoustically the cinema will sound and feel LARGER. Also an acoustic absorbent environment is relaxing and calming and greatly enhances the enjoyment of the film.
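The 1/4 wavelength rule reduces to one line of arithmetic. A minimal sketch, assuming the 344 m/s speed of sound used in this text (function names are illustrative only):

```python
SPEED_OF_SOUND_M_S = 344.0  # value used in this text


def quarter_wave_spacing_m(lowest_frequency_hz):
    """Spacing from the wall equal to 1/4 wavelength of the lowest frequency."""
    return SPEED_OF_SOUND_M_S / (4.0 * lowest_frequency_hz)


def lowest_frequency_hz(spacing_m):
    """Lowest frequency whose 1/4 wavelength fits within the given spacing."""
    return SPEED_OF_SOUND_M_S / (4.0 * spacing_m)


for frequency in (50, 100, 200, 500):
    print(f"{frequency:>4} Hz needs approx {quarter_wave_spacing_m(frequency):4.2f} m spacing")

print(f"2 m spacing reaches down to approx {lowest_frequency_hz(2.0):.0f} Hz")
```

A 2m spacing corresponds to roughly 43Hz, while even 0.43m already reaches down to 200Hz, which is why partial spacing is still worthwhile.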
Pirates. A film of a pirate ship on the open seas is in contradiction to the echoes and reverberation we are hearing from the cinema. Visually this is similar to putting mirrors on cinema walls, or watching a film with the lights on.
Only if the cinema approaches being anechoic at all frequencies does the auditory experience of being on the open seas with the pirates appear real. The pirates are caught and locked in a cell. The recorded reverberation of the cell should be heard from the surround speakers only. This gives us a feeling of being in the cell with the pirates trying to escape.
The human mind evolved to un-consciously obtain intelligence from direct sound, by psychologically separating the reverberant sound to give us an accurate acoustical sense of distance and the size of a cave or room. The open ocean, forest or field, give little or zero reverberation or echo.
www.acoustics.salford.ac.uk
Anechoic myth There is a myth that being in a 100% anechoic environment is an emotionally negative experience. This myth needs to be well and truly busted. The myth possibly originated from experiments on university students during the 1950s, who were locked in a tiny anechoic test chamber, to represent being buried alive in a small airless coffin, with no possibility of escape, to find out how they would emotionally react.
Social experiments have proved that when living in a noise polluted city, being able to find solace in a silent, softly furnished, acoustically absorbent environment, with free access to come and go as we please, lets all of us, that is, everyone without exception, relax and let go of anxiety and stressful thoughts.
When the lights are dimmed with beautiful music 'particularly classical' the experience becomes almost magical. The most wonderful enhancement before any film is a prologue of the music for approx 5 minutes. If only cinema management fully understood this, what a magical experience going to the movies would be.
Any cinema can be made close to anechoic at low cost, by approaching the solution in small steps. Fibreglass or Rockwool is inexpensive and can be placed into open low weight frames. There is possibly nothing else that will work as well, except asbestos. The fibreglass or Rockwool needs to be contained in fabric that stops irritant fibres getting loose. Polyester fibre has some acoustic absorption at high-frequencies but is ineffective at low frequencies, so don't use it.
The containing fabric should have its own fire retardant specification to comply with public building approvals. The installation procedure can be done all at once, or spread out over time in small steps so as not to interfere with session times. Check that the installed procedure is compliant with correct fire and building regulations, including insurance.
'C' weighting is the flat acoustic measurement response applied to sound systems and entertainment venues including cinemas and recording studios. 'A' weighting measurement is for environmental and industrial noise measurement and includes building materials. 'A' weighting is not a flat response measurement because it is insensitive to bass frequencies, similar to our hearing at a low loudness level.
'A' weighting All building materials are quoted in 'A' weighting measurement and therefore these measurements cannot be used in reference to bass frequency acoustical absorption and transmission reduction when applied to cinemas and entertainment venues. However it is simple to do cross calculations to the 'C' weighting measurement to obtain a flat frequency response reference to include the lower frequencies.
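The difference between the two weightings at bass frequencies can be sketched from the standard analogue weighting formulas (as published in IEC 61672; they are not given in this text). The correction shown is per frequency band and is an illustration only; it cannot be applied to a single overall 'A' weighted figure.

```python
import math


def a_weighting_db(frequency_hz):
    """'A' weighting in dB relative to 1kHz (standard analogue formula)."""
    f2 = frequency_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00


def c_weighting_db(frequency_hz):
    """'C' weighting in dB relative to 1kHz (standard analogue formula)."""
    f2 = frequency_hz ** 2
    rc = (12194.0 ** 2 * f2) / ((f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(rc) + 0.06


def a_to_c_correction_db(frequency_hz):
    """dB to add to an 'A' weighted band level to express it as 'C' weighted."""
    return c_weighting_db(frequency_hz) - a_weighting_db(frequency_hz)


for frequency in (31.5, 63, 125, 250, 500, 1000):
    print(
        f"{frequency:>6} Hz  A {a_weighting_db(frequency):6.1f} dB  "
        f"C {c_weighting_db(frequency):5.1f} dB  "
        f"correction {a_to_c_correction_db(frequency):5.1f} dB"
    )
```

At 100Hz the 'A' curve is down by about 19dB while the 'C' curve is within about 0.3dB of flat, which is the bass insensitivity described above.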
Many companies that supply acoustic absorbent material that complies with fire regulations will advise correctly. Excellent information is available on their sites.
www.rockwool.com
www.ecophon.com
www.acoustilog.com
www.acousticalsurfaces.com
Many architectural companies that specialise in acoustics do exceptionally good work and need to be commended. But there are also questionable companies that market themselves as acoustic specialists. Many entertainment venues and cinemas have been caught out by building approvals that were for the 'A' weighting measurement only, and were later closed down because of excessive bass noise leakage.
Questionable practices The worst practices are from kickbacks made from specifying expensive materials falsely described as absorbent that may be polyester fibre or a grey hardened variation of it. A 1.2m x 1.8m (4ft x 6ft) sheet, or its equivalent size in large rolls may be approx $5 from the importer, but when sold through high mark-up distributors that re-label the material it can be on-sold for approx x10 its original price with a high % kickback to the specifying expert.
Acoustics is not rocket science, and there is not one thing about the subject that can not be explained to a primary school child. Things to be suspicious of.
- Telling you something you can not understand.
- Doing expensive analysis with elaborate test equipment.
- Discounting the 1/4 wavelength rule.
- Insisting on using absorbent material that looks like polyester fibre.
- Not willing to volunteer demonstration against fibreglass or Rockwool.
- Smooth sales talk, bragging and name dropping.
- Representing a logo of a licensing company with a large yearly fee.
Repeat Do not accept any material as being acoustically absorbent sold by a so called expert or specification sheet, unless it has been proved to be so. Always insist on a direct comparison to Rockwool or fibreglass, by simply holding it up to your ear and doing an A B check.
Pink noise measurement
White noise is random noise we hear as hiss over the entire frequency spectrum created by all electronic circuits. White noise is also created in nature as wind rustling through leaves or surf at the beach. White noise has equal energy per cycle. As the frequency (cycles) doubles (for each octave), so does the noise energy (+3dB/octave), resulting in white noise sounding trebly as hiss.

Pink noise is white noise filtered so that each octave has equal energy. It therefore gives a flat response per octave, is similar to music, and is useful for acoustic measurements and approximations (only) for sound system alignment.
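The per-octave behaviour of the two noise types can be checked without generating any audio. This sketch integrates idealised spectra over octave bands: constant energy per Hz for white noise and energy proportional to 1/f for pink noise. The band edges are arbitrary.

```python
import math


def octave_bands(start_hz=31.25, count=9):
    """Return (low, high) band edges for consecutive octaves."""
    return [(start_hz * 2 ** i, start_hz * 2 ** (i + 1)) for i in range(count)]


def white_band_energy(low_hz, high_hz):
    """Energy in a band for a flat (per Hz) spectrum, in arbitrary units."""
    return high_hz - low_hz


def pink_band_energy(low_hz, high_hz):
    """Energy in a band for a 1/f spectrum, in arbitrary units."""
    return math.log(high_hz / low_hz)


def relative_db(energy, reference):
    """Energy ratio expressed in dB."""
    return 10.0 * math.log10(energy / reference)


bands = octave_bands()
white_ref = white_band_energy(*bands[0])
pink_ref = pink_band_energy(*bands[0])
for low, high in bands:
    print(
        f"{low:8.2f} - {high:8.2f} Hz  "
        f"white {relative_db(white_band_energy(low, high), white_ref):5.1f} dB  "
        f"pink {relative_db(pink_band_energy(low, high), pink_ref):5.1f} dB"
    )
```

The white noise column rises by about 3dB for each octave while the pink noise column stays at 0dB.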
Cinema sound alignment
Traditional method
The cinema industry evolved before Radio and TV, therefore the later broadcast engineering standards have not applied. Some cinemas do the best they can to provide the highest quality vision and sound possible. But there is no requirement for a cinema to have identical left-center-right speakers, or for the amplifiers to be similar, or have calibrated gains. Components can also be hotch-potched from disused cinemas.
The Dolby processor alignment approach is based on adapting to the way the cinema industry evolved. The Dolby processor enables the whole result, from the AB chain, heard by the audience, to be calibrated from within the processor decoder. The recorded sound on the film stock, projector sound reader, decoders, noise reduction systems, EQs, amplifiers, speakers, and acoustics are managed as a single entity.
This grouped procedure of alignment is unique to cinemas. It was introduced to simplify the procedure for non-technical people. The majority of audio and broadcast engineers and critical observers outside of the cinema industry do not agree with traditional cinema grouped alignment method. Professional recording studios and the broadcast industry align each item independently with absolute accuracy, usually done by electronic engineers.
Grouping the whole A and B chain as a single entity is error prone and often results in chaotic outcomes with large differences between cinemas. One of the many limitations of the combined alignment procedure is, if any one item within the AB chain is modified or changed without an external 0VU reference, then the entire Dolby processor calibration should be repeated; which is rarely done. Further problems of this alignment procedure will be discussed through the next 2 chapters.
The basic procedure is to put 3 or 5 omni Microphones spread out at two-thirds toward the rear in the cinema in the middle of the reverberant field. The mics are placed in a non-symmetrical random pattern so as not to pick up standing waves. Pink noise from the Dolby processor is put through the sound system and the result is aligned so that the direct and reverberant sound energies combined give the required response.

85dB SPL Reference. ('C' weighting) The most important reference for all cinemas from the pink noise sound level measurement is for the Dolby processor volume control to be set at No 7 at 85dB SPL in the cinema room. The surround level is then calibrated to -3dB as 82dB SPL. This reference is applied at the original recording, ensuring the audience will hear the sound at the correct level the film director intended, with the processor level control set at No 7.
This general loudness alignment for the 85dB reference inclusive of direct and reverberant sound is universally accepted as being practical, but calibration for fidelity and articulation using this combined method are the issues that are hotly debated.
Alignment problems The near field response of the speaker system (within 3 meters) is neglected, as the traditional alignment procedure only looks at the combined direct and reverberant sound energies toward the rear of the cinema. Depending on the acoustics of the cinema, the direct on-axis energy from the speakers is aligned to sound harsh or trebly, relying on the reverberant and added propagation bass energy from walls floor and ceiling, to fill in the difference.

Because there is no reference in the traditional alignment procedure to address the difference between the direct and reverberant sound, then this combined procedure of alignment, that includes the chaotic behaviour of the reverberant field, must seriously be brought into question.
Discerning music lovers and audiophiles only accept direct sound from a flat speaker system as having integrity. Recorded reverberation suited to the music is accepted. Unwanted reverberation from the listening environment (cinema) is regarded as chaotic noise. Using room reverberation (noise) as a filler reduces fidelity and intelligibility.
Division of opinions Many within the cinema industry are content with the traditional procedure. But just as many discerning people inside and outside of the cinema industry are dissatisfied. They believe that the fidelity of cinema sound falls far short of what can be achieved. It is argued that the majority of the public are not interested in how cinema 'sound sounds'. But there appears to be no un-biased research on the public discernment of the cinema experience.
Old JBL documentation of sound system application and alignment, states a sound level difference between the front to the rear of a cinema of less than 6dB. Considering that the direct sound energy would decrease approx 24dB for the inverse square law, for an average large cinema, and the directivity of the hi-frequency horn would give an approx 6dB improvement, that means approx 12dB of reverberant energy is accepted. It is difficult to understand how any intelligibility is heard with this level of reverberation at the rear of a cinema.
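The 12dB figure above is a simple level budget. A minimal sketch of the arithmetic, where the 2m and 32m distances are hypothetical values chosen only because their ratio gives approx 24dB of inverse square loss:

```python
import math


def inverse_square_drop_db(near_m, far_m):
    """Drop in direct sound level between two distances from the source."""
    return 20.0 * math.log10(far_m / near_m)


def reverberant_fill_db(direct_drop_db, horn_gain_db, measured_drop_db):
    """Level that must be made up by reverberant energy at the rear."""
    return direct_drop_db - horn_gain_db - measured_drop_db


drop = inverse_square_drop_db(2.0, 32.0)
print(f"Inverse square drop from 2 m to 32 m: {drop:.1f} dB")
print(f"Distance ratio for exactly 24dB: {10 ** (24.0 / 20.0):.1f} : 1")
print(f"Reverberant fill implied: {reverberant_fill_db(24.0, 6.0, 6.0):.0f} dB")
```

A 24dB direct drop, less 6dB recovered by horn directivity, less the 6dB front to rear difference that is actually measured, leaves 12dB supplied by the reverberant field.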
'X-curve'
All cinemas are meant to be aligned to the creed of the 'X-Curve' without question. The X-curve was originally created to provide consistency to the traditional alignment method. The X-curve is a high frequency roll-off beginning at 2kHz at -3dB/octave, then -6dB/octave from 10kHz. Various correction factors are given for different size cinemas. ISO Bulletin 2969 'Curve X' is also described in ANSI PH22.202M-1984 / SMPTE 202M.
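The roll-off as described above can be written as a small function. This is a sketch of that description only: it treats the response as flat below 2kHz and does not model the cinema size correction factors or any low frequency shaping defined in the standards.

```python
import math


def x_curve_db(frequency_hz):
    """Target level in dB relative to the flat region, per the description above."""
    if frequency_hz <= 2000.0:
        return 0.0
    if frequency_hz <= 10000.0:
        # -3dB per octave above 2kHz
        return -3.0 * math.log2(frequency_hz / 2000.0)
    # -6dB per octave above 10kHz, continuing from the level reached at 10kHz
    level_at_10k = -3.0 * math.log2(10000.0 / 2000.0)
    return level_at_10k - 6.0 * math.log2(frequency_hz / 10000.0)


for frequency in (1000, 2000, 4000, 8000, 10000, 16000, 20000):
    print(f"{frequency:>6} Hz  {x_curve_db(frequency):6.1f} dB")
```

On this description the target is about 7dB down at 10kHz and about 13dB down at 20kHz.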

There appears to have been an examination of many cinemas over 30yrs ago, and the X-curve is said to have resulted from pink noise measurements of an assumed average cinema. The observed result was that the low-frequency reverberation was greater than the hi-frequency reverberation. This is understandable because concrete constructed cinemas with curtains on walls will readily absorb hi-frequencies but have less effect on absorbing low-frequencies.
- Measurements were done at 2/3 toward the rear of the cinema to include the reverberant field.
- Equalization was then applied to the cinema speakers so the result would approximate the energy spectrum of recording studios monitors, heard at the position of the mixing console.
- The shape of the equalization applied to the cinema speakers is said to be the X-curve.
There are many who argue that the X-curve has no beneficial effect, and has further complicated the original problem. Depending on the acoustics of the cinema, the direct on-axis energy from the speakers is aligned to sound harsh or trebly, relying on the cinema reverberation, to fill in the difference. Critics say that the X-curve does not discriminate for fidelity and articulation as it makes no difference if the reverberant sound energy was replaced with 100 monkeys randomly bashing tin cans. All that matters is that the combined energy response heard by the cinema audience (regardless of what it consists of) is similar to the energy response of the original recording heard at the mixing console.
The detailed rationale and arguments behind the X-curve have not been fully explained to scientific and engineering people on the outside of the cinema industry, who are mostly un-aware that the X-curve exists. 'Why is this so ?' Please read these links.
www.micasamm.com A History of the X Curve by Tomlinson Holman.
www.hometheaterhifi.com Cinema Sound and EQ by Brian Florian.
Questioning alignment procedures that have been practised without question for many years (generations) provokes emotional reactions similar to those met when questioning religion without being able to mention God, or questioning God without being able to mention religion.
External reference A possible reason for the lack of clarity about the X-curve is that it appears to have no external reference outside of itself. An external reference is essential for logical assessment, otherwise misconceptions occur. One example of a logical discrepancy is the implication that the larger the cinema, the greater the hi-frequency reverberation. This implies that an infinitely large cinema would have infinite hi-frequency reverberation. If the X-curve roll-off is meant to compensate for this, then an infinite hi-frequency roll-off would be required. (In fact an infinitely large cinema would have zero reverberation, because it would be a free field.)
In reality the same sound arriving at different times, after being reflected from walls and ceiling, can result in a loss of approx 10dB of high-frequency energy toward the rear of an excessively reverberant auditorium or cinema. These cancellations are described as a comb filter, simplified in the picture below.
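The comb filter effect can also be sketched numerically. The example below models only the direct sound plus one delayed reflection, which is a large simplification of a real auditorium; the 1ms delay, the reflection strength and the function names are all assumptions chosen for illustration, and the sketch does not attempt to reproduce the approx 10dB figure quoted above.

```python
import cmath
import math

def comb_gain_db(freq_hz, delay_s, reflection_gain=1.0):
    """Level of direct sound plus one delayed copy, in dB re direct alone."""
    # The reflection is the same signal, delayed, so its phase lags by
    # 2*pi*f*delay relative to the direct sound.
    phase = -2.0 * math.pi * freq_hz * delay_s
    total = 1.0 + reflection_gain * cmath.exp(1j * phase)
    magnitude = abs(total)
    if magnitude < 1e-12:
        return float("-inf")  # complete cancellation
    return 20.0 * math.log10(magnitude)

def notch_frequencies(delay_s, max_hz):
    """Frequencies up to max_hz where the reflection arrives in anti-phase."""
    notches = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return notches

if __name__ == "__main__":
    delay = 0.001  # hypothetical 1 ms extra path delay for the reflection
    print("notches (Hz):", notches := notch_frequencies(delay, 3000.0))
    for f in (500.0, 1000.0, 1500.0, 2000.0):
        print(f"{f:6.0f} Hz  {comb_gain_db(f, delay, 0.5):6.2f} dB")
```

With a reflection at half the amplitude of the direct sound, the response swings between roughly +3.5dB and -6dB, and the notches repeat at regular intervals up the spectrum like the teeth of a comb.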

False assumption It is easy to assume that the reverse could be applied. That is, the amount of reverberation does not matter, and the sound system behind the screen does not even need to have a flat frequency response.
This is because, by placing a microphone in the reverberant field and simply adjusting the controls of a 1/3 octave graphic equaliser, the result can be made to align to the X-curve as if there were a flat speaker system behind the screen similar to the recording monitor. It therefore appears that a 1/3 octave graphic equaliser (or its equivalent) solves all problems.

Nobody is 100% sure of this because it is not directly stated. Different authoritative sources have different interpretations of this description. But this reverse procedure based on assumption regardless of how it is described appears to have become the X-curve alignment.
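The weakness of the reverse procedure described above can be shown with a small sketch. All band levels here are made-up numbers for illustration only, not measurements, and adding speaker and room contributions in dB is a deliberate simplification. The helper names (`x_curve_db`, `measured`, `eq_settings`) are hypothetical.

```python
import math

def x_curve_db(freq_hz, knee_hz=2000.0, second_knee_hz=10000.0):
    """Simplified X-curve target: flat to 2 kHz, -3 dB/oct, then -6 dB/oct."""
    if freq_hz <= knee_hz:
        return 0.0
    if freq_hz <= second_knee_hz:
        return -3.0 * math.log2(freq_hz / knee_hz)
    at_second_knee = -3.0 * math.log2(second_knee_hz / knee_hz)
    return at_second_knee - 6.0 * math.log2(freq_hz / second_knee_hz)

def measured(speaker_db, room_db):
    """What the microphone sees: speaker and room effects lumped together."""
    return {f: speaker_db[f] + room_db[f] for f in speaker_db}

def eq_settings(measured_db, target_fn):
    """Per-band equaliser gain (dB) that moves the measurement onto the target."""
    return {f: round(target_fn(f) - level, 2) for f, level in measured_db.items()}

# Hypothetical cinema A: flat speaker, room that loses treble.
speaker_a = {2000: 0.0, 4000: 0.0, 8000: 0.0}
room_a = {2000: 0.0, 4000: -2.0, 8000: -4.0}

# Hypothetical cinema B: dull speaker, acoustically neutral room.
speaker_b = {2000: 0.0, 4000: -2.0, 8000: -4.0}
room_b = {2000: 0.0, 4000: 0.0, 8000: 0.0}

eq_a = eq_settings(measured(speaker_a, room_a), x_curve_db)
eq_b = eq_settings(measured(speaker_b, room_b), x_curve_db)

if __name__ == "__main__":
    print("cinema A EQ:", eq_a)
    print("cinema B EQ:", eq_b)
    print("identical settings:", eq_a == eq_b)
```

Both cinemas end up with the same equaliser settings, because the microphone only reports the combined energy and cannot tell whether the treble was lost in the speaker or in the room.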
wikipedia.org/Socrates Socrates would have had difficulty accepting the logic of this approach had cinemas existed in his time. Obscure and subjective forms of acoustical alignment possibly existed with amphitheatres in Roman times and definitely existed amongst organ pipe tuners. Piano tuners often tell humorous stories of this history.
Is it possible that many years ago, a committee established to address the sound alignment problem, decided to align sound systems to comply with the apparent roll off above 2kHz at -3dB/octave (heard at 2/3 of the distance from the screen) as a simple means to provide consistency between older cinemas that did not have acoustical absorption on walls ?

Or was the X-curve procedure possibly intended to provide a uniform energy response to compensate for reverberation in an assumed average 500 seat cinema, to which cinemas of all other sizes can then be adjusted or somehow made to conform ? Or was the X-curve procedure supposed to provide a common base-line between mixing rooms and cinemas ?
A simple solution is to put a blanket over every cinema so all cinemas will sound the same. This is no more effective than if everyone put cotton wool in their ears; the outcome would be the same.
The objective behind this statement is not to negate the positive intention of those who originally initiated this idea many years ago, when circumstances were very different to today, but to challenge those who un-questioningly continue to practice this procedure by providing scientifically proven data to justify why it should be sustained.
Therefore three rhetorical questions need to be asked.
- Generalised assumption.
How specifically does the X-curve provide a uniform energy response or common base-line between mixing rooms and cinemas ?
- Deletion of proof.
How or where is the evidence for a uniform energy response or a common base-line obtained ?
- Distortion of logic.
How can changing reverberation time (RT) with EQ be made logically comprehensible ?
Most modern multiplexes have pleated curtains on walls which effectively absorb hi-frequencies. Add hi-frequency air absorption and screen attenuation, plus the fact that most front speakers are 2-way without tweeters, and the combined result already approximates the X-curve. Applying the compulsory X-curve alignment procedure on top of this does not make logical sense and now appears as a contradiction.
The home cinema 're-EQ' for small room acoustics in THX approved systems is supposed to be a variation (whatever that means) of the commercial cinema X-curve. But there does not appear to be a clear un-ambiguous definition of sound alignment procedures in recording studios for the X-curve to be refe