Does USB alleviate the sloppy nature of MIDI?

A developer of the USB MIDI spec talks about jitter and latency. Skip down to the posts by SynMike.

If I send only 1 Note-On message every second then a classic MIDI connection will be better than a USB connection in both latency and jitter. But USB has the capability to send 100 notes or more simultaneously all in one USB packet – no delay between the 1st and the 100th. USB can send thousands of messages every millisecond. So the more dense your MIDI data, the better USB compares against a classic MIDI connection. The overall jitter of a crowded collection of messages is much lower on USB than on a classic MIDI connection.
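To put rough numbers on the classic MIDI side of that comparison, here is a minimal sketch. It assumes the standard DIN MIDI line rate of 31,250 baud, 10 bits on the wire per byte (start bit, 8 data bits, stop bit), and 3-byte Note-On messages. The function name `serial_midi_delay_ms` is my own, not from the thread, and USB timing is deliberately left out.

```python
MIDI_BAUD = 31_250      # classic (DIN) MIDI line rate in bits per second
BITS_PER_BYTE = 10      # 1 start bit + 8 data bits + 1 stop bit


def serial_midi_delay_ms(n_notes, running_status=False):
    """Milliseconds until the last byte of the n-th Note-On has been sent.

    Without running status every Note-On is 3 bytes (status, note, velocity).
    With running status only the first message carries the status byte,
    so each following Note-On is 2 bytes.
    """
    if n_notes <= 0:
        return 0.0
    if running_status:
        n_bytes = 3 + 2 * (n_notes - 1)
    else:
        n_bytes = 3 * n_notes
    return n_bytes * BITS_PER_BYTE / MIDI_BAUD * 1000.0


if __name__ == "__main__":
    for n in (1, 10, 100):
        plain = serial_midi_delay_ms(n)
        packed = serial_midi_delay_ms(n, running_status=True)
        print(f"{n:3d} notes: {plain:6.2f} ms, {packed:6.2f} ms with running status")
```

So a single note takes just under a millisecond on a classic connection, but the 100th note of a dense chord arrives roughly a tenth of a second after the first byte went out, which is the spread SynMike is talking about.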


Aspect Ratio Problems and Solutions

Generally speaking, a video file consists of a video stream inside a container. The video stream is made up of a grid of pixels, whose width-to-height ratio is the Frame Aspect Ratio or FAR. The pixels themselves have their own aspect ratio, the Sample Aspect Ratio or SAR (also called the Pixel Aspect Ratio or PAR, which is the name MP4Box uses). The ultimate Display Aspect Ratio or DAR is what you get when you multiply the FAR and SAR together:

DAR = FAR × SAR

So when I need to fix my 720 x 480 recordings to a 16:9 aspect ratio, I set the SAR to 40:33, which gives a DAR of (720/480)*(40/33), or about 1.818. Technically the SAR should be 32:27 to get exactly 16:9 (1.778), but 40:33 is the legal ITU-R BT.601 or MPEG4 value.
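The arithmetic above is easy to check with exact fractions. This is just a sketch of the multiplication described here; the helper name `display_aspect_ratio` is made up for illustration.

```python
from fractions import Fraction


def display_aspect_ratio(width, height, sar):
    """DAR = FAR * SAR, where FAR is the pixel grid's width/height."""
    far = Fraction(width, height)
    return far * sar


# 720 x 480 frame with the "legal" 40:33 pixel ratio
dar_legal = display_aspect_ratio(720, 480, Fraction(40, 33))
# Same frame with the mathematically exact 32:27 pixel ratio
dar_exact = display_aspect_ratio(720, 480, Fraction(32, 27))

if __name__ == "__main__":
    print(dar_legal, round(float(dar_legal), 3))
    print(dar_exact, round(float(dar_exact), 3))
```

Keeping the values as fractions makes it obvious that 32:27 lands exactly on 16:9, while 40:33 lands slightly wide at 20:11.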

To fix this sort of thing without re-encoding, use MP4Box:

MP4Box -add input.mp4#video:par=40:33 -add input.mp4#audio output.mp4

This assumes that input.mp4 contains both a video and an audio track. It also works with MPEG-2 streams.


Posted in Video

Reference Recordings

Chemical Brothers track Orange Wedge from Surrender

Talking Heads Speaking in Tongues

Beck Guero

Ray Anderson Alligatory Band Don’t Mow Your Lawn

Miles Davis Kind of Blue

Peter Gabriel So

Fresh Aire I-V

Pink Floyd The Wall

Jazz at the Philharmonic Buddy Rich Gene Krupa Drum Battle

Billy Cobham Warning

Dave Brubeck Time Out and Time Further Out and 25th Anniversary Reunion

Bach -choose an organ one

Led Zeppelin -choose one

Michael Jackson Thriller

Tracy Chapman Fast Car

Harman How to Listen tracks

Burnin’ for Buddy

From the iZotope interview with Adam Ayan:

  • Foo Fighters The Colour and the Shape
  • Sheryl Crow’s self-titled album
  • Madonna Erotica
  • Garbage Version 2.0
  • Rascal Flatts Me and My Gang
  • Sarah McLachlan Laws of Illusion
  • Carrie Underwood Play On and Blown Away
  • Augustana Augustana

Bob Katz’ CD Honor Roll

Posted in Audio Production

Speakers and Room Correction

Sean Olive’s blog on Evaluation of Room Correction Products

Commentary that Audyssey is either no better than or worse than no EQ: Post #284

Trained listeners prefer the same as untrained listeners, but can reach the conclusion faster and more consistently: Post #274

The next post (#275), regarding phase of filters:

I have seen no experimental evidence in the scientific literature to indicate that FIR filters have any psychoacoustic benefit over IIR filters in their application to loudspeakers and room correction. My personal view, which is shared by many respected audio scientists including Stanley Lipshitz and Floyd Toole, is that FIR filters only add cost and unnecessary complexity to loudspeaker/room correction, with no apparent gain in performance. They also give marketing and sales departments something to talk about (hence the confusion that you suffer from).

A flat anechoic speaker response results in an in-room response that tilts down. Audyssey corrects toward a flat in-room response, so it sounds bass-thin.

Same page, post #298

The frequency responses Sean posted is the In Room response. The room reinforces the lower frequencies while dampening the higher ones. If we removed the effects of the room Toole, Harman et al., have shown the resulting Anechoic response (on axis) is very flat and smooth, with the typical sloping in level as we move off axis. This has been Harman’s position for a long time. So really the requirement of flat and smooth -anechoically- has not changed. Now how much slope there is to the off axis responses is important (and also contributes to the in room response having a downward slope), and thankfully Sean has published the slope (for a specified measurement condition) of the best 10% of loudspeakers reviewed by Harman in an AES paper number 6190. Well worth the $20 if you are not a member.

Posted in Audio Production

AAC and Perceptual Encoding

Start here, and read the posts by j_j, who is James Johnston. He also has a blog.

the audio production community went anti-science rather before the republicans.

Regarding the Wikipedia article on Joint Stereo:

To say the least, it’s incomplete, and completely lacking in understanding of human hearing.

Joint stereo, done correctly ( which is impossible in MP3 due to the standard), allows the encoder to correct for a number of otherwise obvious imaging and noise-audibility problems.

Further down, post #164

“CD quality”, to me, is when the subject can not reliably recognize the reference.

Also, many tests were in headphones, and there are very different artifacts one hears in speakers vs. headphones.

There are also different classes of artifacts that different people hear. Some (like me) can not stand imaging artifacts. Some people are hypersensitive to pre-echo, but don’t give two whits about imaging artifacts. The test listeners were by and large people who dislike pre-echo, not people who notice imaging artifacts.

Don’t blame me, I was the one with the codec that didn’t have imaging artifacts.

[from Cheebs Goat] If the encoder detects information that is center-panned, it only keeps one “channel” of it and “tags” it as center data for the decoder instead of keeping identical information in both the left and right channel. [end Cheebs Goat]

That’s what the encoder for MPEG-1 Layer 2 does. (MP2)

MP3 does M/S in addition, but not in a good way. Either you do the whole spectrum as M/S or you do it all as L/R. This does not deal with time delay or more than one stereo image in the music. ’nuff said?

AAC does M/S, intensity (i.e. what is described above) both, and can do so differently in different frequency bands. It’s the minimum necessary to avoid imaging artifacts at high rates. The encoder required to drive it is a bit more complex, but would do 8 channels in real time on an old 1GHz pentium.

Hmm, can one attach .wav files? Yeah. If I can get the time this afternoon I’ll post a couple of wav files to the thread (not music, just noise and tones) that show why you must have joint stereo coding. Basically, it’s the Suzanne Vega problem, known in the science as “Binaural Masking Level Depression”.

Post #171 has some test tones (and code to generate them) to test phase issues and masking, post #190 explains what’s in them.
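The M/S idea that keeps coming up in the quotes above can be sketched in a few lines. This is my own illustration, not code from the thread: it uses the textbook sum/difference form, and real codecs scale the channels differently and decide per block (or, in AAC, per frequency band) whether to use it. The names `ms_encode` and `ms_decode` are hypothetical.

```python
def ms_encode(left, right):
    """Convert L/R sample lists to mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # Center-panned material: identical samples in both channels.
    center = [0.5, -0.25, 0.125]
    mid, side = ms_encode(center, center)
    print("center-panned side channel:", side)

    # Hard-left material: nothing in the right channel.
    mid, side = ms_encode([0.5], [0.0])
    print("hard-left mid/side:", mid, side)
```

For center-panned material the side channel is all zeros, so it costs almost nothing to code. Hard-panned material puts as much energy in side as in mid, which is why forcing one choice on the whole spectrum, as described for MP3 above, is a poor fit for mixes with more than one stereo image.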

The missing fundamental problem, post #334

Originally Posted by SweetLossy

[j_j]: This does work for lousy loudspeakers, it’s not the same as the real thing at all, but first you have to have the original bass frequencies in order to do this. If somebody has removed them already.

Post #391 has the buz.wav test tone, for testing a codec for various things. It is not for listening.

Post #452 has buz.wav as run through various encoders and decoded back to .wav.

Posts #458 and #459 have the stereo yech.wav and various encoded/decoded versions of it.

And from another thread:

I should add: In AAC, the problems with “joint stereo coding” are corrected, the encoder does whatever requires less bits/sample, always, taking into account the relevant part of M/S psychoacoustics, called “Binaural Masking Level Depression”, also known as the ‘Suzanne Vega Effect’ to some. (it’s only one of the things going on with Vega, though)

A killer sample:

If you find yourself confronted with it, take Track 11 of Ry Cooder’s album “Jazz” and put the first 30 or so seconds through most any audio encoder.

Tough clip, that.

Posted in Audio Production

Reverb Tips

A terrific reverb tutorial. Even though it’s talking about the Sonnox Reverb, it’s applicable to all ‘verbs.

Sonnox Oxford Reverb – On Vocals (Musikmesse – Fab)

Posted in Audio Production

Better Stereo

From Moulton Labs:

Arrange your speakers in a pentagon. This puts LCR at -72, 0, +72 degrees (which is very close to the 7.1 suggestion of 60-70 degrees for the side speakers). Set your stereo pans at +/- 50 between L/R vs. Center, with Divergence at 100 and Center% at 100.

What is Divergence, you might ask? Go here: which explains how ProTools/Neyrinck Mix51 surround terminology works.

Essentially, 100 Divergence means don’t spread it around. 100% Center means send L+R to center.

Posted in Audio Production