I was talking to my friend Ian Stewart (http://ianstewartmusic.us/blog) recently and we discovered that we had both been playing around with the same idea: finding a way to take Replay Gain measurements and use them to predict how much (and in which direction) Spotify would adjust the loudness of a song.
The "word on the street" (meaning online: forums and Facebook groups) is that "all streaming services use LUFS values to normalize to". As of this writing, that is incorrect. I say that with confidence even though I've written about it myself and helped Ian Shepherd crunch some numbers on YouTube loudness, for instance. We know for a fact that Spotify uses Replay Gain.
So we only needed to find a tool that measures Replay Gain.
I knew Foobar does it, but I'm on a Mac, so getting that to work would have been too much of a workaround. I had heard that Audacity could do it but couldn't find the option; Ian Stewart found the plugin that does it... and off we went to test it.
Despite the recent "Spotify lowers its loudness target by 3 dB" discussions, they are still adding +3 dB to the Replay Gain value... they were apparently adding +6 dB before! So we had to take that into consideration in our tests.
So here's what you do (Mac):
1. Go and download Audacity and install it.
2. Go to http://wiki.librivox.org/index.php/Measuring_Volume_within_Audacity and download the ReplayGain.ny plugin.
3. Open the Applications folder, right-click on Audacity and select "Show Package Contents". Open the Contents folder, then the "plug-ins" folder, and place the ReplayGain.ny plugin there.
4. Open Audacity and you should find the Replay Gain plugin in the Effect menu. (If you don’t see it right away, select “Add / Remove Plug-ins…” in the Effect menu, find and select ReplayGain, and click Enable.)
5. Open the file of the song you want to measure in Audacity and select the Replay Gain plugin.
6. When the plugin is open you get the chance to "Normalize or Analyze" and you choose "Analyze" and press OK.
7. Take the number it gives you and add +3 dB to it and then you should have the amount of gain change that Spotify will apply!
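The arithmetic in step 7 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the +3 dB offset is the value observed in this post, not an official Spotify constant.

```python
# Assumption: Spotify's gain change = Replay Gain reading + 3 dB (as observed in this post).
SPOTIFY_OFFSET_DB = 3.0


def predict_spotify_gain(replaygain_db, offset_db=SPOTIFY_OFFSET_DB):
    """Return the predicted gain change in dB (negative means turned down)."""
    return replaygain_db + offset_db


def describe(gain_db):
    """Turn a signed gain change into a readable sentence."""
    direction = "down" if gain_db < 0 else "up"
    return f"Spotify should turn it {direction} by {abs(gain_db):.1f} dB"


# Example: the plugin's Analyze mode reports -4.3 dB.
print(describe(predict_spotify_gain(-4.3)))
# prints "Spotify should turn it down by 1.3 dB"
```

Keeping the offset as a parameter makes it easy to rerun the prediction if Spotify changes the amount it adds.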
Naturally, we wanted to test this to make sure it was sound advice. My method and results are below, followed by Ian’s method and results.
1. I took a master file and loaded it in RX (from iZotope) and ran the Waveform Statistics and got the sample peak and integrated LUFS number.
2. Opened the file in Audacity and ran the ReplayGain calculations and added +3 dB to estimate what Spotify would do.
3. Opened menuBus and added Insight (a metering plugin from iZotope) to measure the Spotify stream of the same song I had measured in RX and Audacity. I also love being able to put Dynameter on there and measure with it, but I didn't need it for this experiment.
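For anyone who wants to repeat the comparison, here is a rough Python sketch of how the numbers from the three steps combine. The function names are made up for illustration, and the PLR formula (peak minus integrated loudness) is my assumption; it does agree with the rounded PLR values listed in the results. Note that the peak comparison is approximate, since RX reports sample peak here and Insight reports true peak.

```python
def loudness_change_lu(master_lufs, stream_lufs):
    """Loudness change applied by the stream, in LU (negative = turned down)."""
    return stream_lufs - master_lufs


def peak_change_db(master_peak_dbfs, stream_peak_dbtp):
    """Peak change in dB; approximate, because it mixes sample peak and true peak."""
    return stream_peak_dbtp - master_peak_dbfs


def plr(peak_dbfs, integrated_lufs):
    """Peak-to-loudness ratio (assumed here to be peak minus integrated loudness)."""
    return peak_dbfs - integrated_lufs


# Values for the first song in the results (RX master vs. Insight on the stream).
master_lufs, master_peak = -13.0, -1.0
stream_lufs, stream_peak = -14.6, -2.4

print(round(loudness_change_lu(master_lufs, stream_lufs), 1))  # -1.6
print(round(peak_change_db(master_peak, stream_peak), 1))      # -1.4
print(round(plr(master_peak, master_lufs)))                    # 12
```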
Here are the results from the tests I did (bear in mind that Insight measures True Peaks):
Song: Punch Drunk Love by Védís Hervör
RX: -13 LUFS, Sample peak -1 dBFS, (PLR 12)
Audacity RG level: -4.3 dB (prediction: Spotify will turn it down by -1.3 dB)
Insight measurement via menuBus: Spotify lowers it by -1.4 dB / 1.6 LU (-14.6 LUFS, -2.4 dBTP) = peak difference of 0.1 from the estimate.
Song: Burning Bright Heart by The Shady
RX: -12.6 LUFS, Sample peak -0.53 dBFS (PLR 12)
Audacity RG level: -5.4 dB (prediction: Spotify will turn it down by -2.4 dB)
Insight measurement via menuBus: Spotify lowers it by -2.2 dB/LU (-14.8 LUFS, -2.7 dBTP) = peak difference of 0.2 from the estimate.
Song: 3 AM by The Nick Tann Trio
RX: -15.2 LUFS, Sample peak -1.03 dBFS (PLR 14)
Audacity RG level: -1.9 dB (prediction: Spotify will turn it UP by +1.1 dB)
Insight measurement via menubus: Spotify turns it UP by +1 dB / 1.0 LU (-14.2 LUFS, -0.4 dBTP) (strange peak level though, maybe due to the signal hitting Spotify's limiter?) = LU difference of 0.1 from the estimate. EDIT / UPDATE: Spotify has changed the ceiling of their limiter to -1 dBFS as of May 2017.
Song: Next Stop by Milkhouse
RX: -14.0 LUFS, Sample peak -1.02 dBFS (PLR 13)
Audacity RG level: -2.9 dB (prediction: Spotify will turn it UP by +0.1 dB)
Insight measurement via menuBus: Spotify turns it down by -0.4 dB / 0.5 LU (-14.5 LUFS, -1.4 dBTP) = LU difference of 0.5 from the estimate.
Song: Electric Distain by Daði
RX: -9.3 LUFS, Sample peak -0.09 dBFS (PLR 9)
Audacity RG level: -7.9 dB (prediction: Spotify will turn it down by -4.9 dB)
Insight measurement via menuBus: Spotify turns it down by -4.8 dB / 5.0 LU (-14.3 LUFS, -4.8 dBTP) = LU difference of 0.0 from the estimate, 0.1 peak difference.
Song: Værð, náð, sátt by Íkorni
RX: -14.9 LUFS, Sample peak -3.05 dBFS (PLR 12)
Audacity RG level: -1.5 dB (prediction: Spotify will turn it UP by +1.5 dB)
Insight measurement via menuBus: Spotify turns it UP by +1 dB / 1.0 LU (-14.0 LUFS, -2.0 dBTP) = LU difference of 0.0 (approximated) from the estimate, 0.45 peak difference.
I'll probably add some more songs to the test / results when time allows.
Ian took a slightly different approach. Rather than compare against original masters, he wanted to see if songs streamed from Spotify (with loudness normalization on) were consistently 3 dB louder than the ReplayGain plug-in would suggest.
To do this he recorded the top 10 songs from the Global Top 50 playlist and then analyzed them with the ReplayGain plug-in in Audacity. If all went according to plan, the plug-in should suggest a -3.0 dB adjustment for each song.
He found that across those top 10 songs, ReplayGain recommended a -2.9 dB adjustment on average. That’s pretty close! Further, all measurements were within 0.5 dB of our predicted value. In comparison, the LUFS measurements for the same songs averaged -14.1 LUFS, but individual songs deviated by as much as 1.1 LU (dB) from the commonly cited -14 LUFS “target”. So, is this method perfect? No. But it's pretty much spot on regardless. And it is significantly more accurate than using a -14 LUFS target if you REALLY want to know what Spotify will do to your tunes.
It's also worth mentioning some factors that could be responsible for the "margin of error" we are seeing (in both Ian's and Sigurdór's tests).
1. Lossy coding. Spotify uses the Ogg Vorbis format, and although we tested at the highest bit rate available, it is still lossy coding. Lossy coding can and will create new peaks and slight (at best) distortions that alter the peak levels (and create intersample peaks).
2. Slight difference in frequency weighting between Spotify's algorithm and the ReplayGain plugin.
3. Songs that were turned up could have hit Spotify's limiter and that by itself will slightly alter the integrated loudness. How much depends on the material.
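To get a feel for the third factor, here is a small Python sketch that flags songs whose peaks would end up above the limiter ceiling after a positive gain change. It is a rough check under stated assumptions: the function names are hypothetical, the -1 dBFS ceiling is the value mentioned in the May 2017 update above, and using the sample peak ignores intersample peaks and any peaks added by the lossy encoding.

```python
# Assumption: limiter ceiling of -1 dBFS, as noted in the May 2017 update in this post.
LIMITER_CEILING_DBFS = -1.0


def peak_after_gain(peak_dbfs, gain_db):
    """Peak level after a plain gain change, with no limiting applied."""
    return peak_dbfs + gain_db


def may_hit_limiter(peak_dbfs, gain_db, ceiling_dbfs=LIMITER_CEILING_DBFS):
    """True if a song that is turned UP would peak above the ceiling."""
    return gain_db > 0 and peak_after_gain(peak_dbfs, gain_db) > ceiling_dbfs


# "3 AM": sample peak -1.03 dBFS, predicted gain +1.1 dB.
print(may_hit_limiter(-1.03, 1.1))   # True
# "Electric Distain": sample peak -0.09 dBFS, predicted gain -4.9 dB.
print(may_hit_limiter(-0.09, -4.9))  # False
```

A song that gets flagged will not necessarily sound different, but its measured loudness on the stream may land slightly below the plain prediction.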
So that's it for now!
We hope you find this method useful and that it helps you gain (pun possibly intended) a better insight (... no pun...) into how Spotify's loudness normalization works and how you can easily predict how your song / mix / master will be normalized.
EDIT/UPDATE: After I published this post, Ian Stewart tweaked the ReplayGain plugin we used, and you can now use it to analyze what Spotify will do without having to add +3 dB to the results. You can read his take on this and download the plugin from his site: http://ianstewartmusic.us/blog/2017/9/3/level-check-spotifys-normalization-target