Personal Use: How do you “takeout” your Spotify data?

Like many, I’ve built dozens of playlists in Spotify and truly enjoyed how the algorithm knows me. It was almost surreal finding Discover Weekly playlists where every single song fit my musical taste du jour.

However, probably also like many, I eventually found it overwhelming to hear thousands of new songs a year – to the point where I could no longer remember when I first heard a song, or whether it was a remix or the original. I was happy with the static selection in my playlists, but couldn’t justify paying $10/month for what amounts to a CDN and .m3u host…

Subscription services are often inherently designed around user retention. How do you export your Spotify data?

  • You can copy-paste from the desktop client to get a list of Spotify URLs – not useful without a subscription.
  • You can pay for questionable “Spotify to MP3” utilities online.
  • You keep paying.

I won’t get into DMCA, DRM, fair use, or other concepts here. Instead, let’s take a look at a cool script…

Overview

TL;DR: a script records your sound card to mp3 while it plays Spotify, and creates a new file for each song.

The blurry cam photo shows more or less what happens:

  1. User opens the Spotify web client in Firefox (also works with YT; the old script used the desktop client).
  2. User runs the script, which waits for playback to start.
  3. User starts a playlist at the beginning.
  4. The script reads the song artist and title from playerctl (the old script queried D-Bus directly).
  5. The filename for the current song is in the form “# – Artist – Title.mp3”, where # is the playback sequence count.
  6. The script pipes the sound card output from pw-record (parec in the old script) into the lame encoder.
  7. When the song title changes, the script ends the current file and starts a new one with a new filename.
  8. Repeats until playback status is not “Playing”.

The conversion happens in real-time, which means it can take many days.
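
Because recording runs in real time, the total run time is roughly the playlist length. Here is a minimal sketch of that arithmetic. The function name and the default average track length of 210 seconds (3.5 minutes) are my own assumptions, so substitute your own numbers.

```bash
#!/bin/bash
# estimate_hours TRACKS [AVG_SECONDS]
# Prints the approximate number of hours needed to record TRACKS songs in
# real time, rounded up to the next whole hour.
# AVG_SECONDS defaults to 210 (3.5 minutes), which is only an assumed average.
estimate_hours() {
    local tracks=$1
    local avg_seconds=${2:-210}
    # Integer arithmetic: add 3599 before dividing so the result rounds up
    echo $(( (tracks * avg_seconds + 3599) / 3600 ))
}
```

For example, `estimate_hours 2000` prints 117, which is a little under five days of continuous playback.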

Prerequisites

You will need to install a few packages.

sudo apt install spotify-client
sudo apt install bluez-tools
sudo apt install lame
sudo apt install playerctl
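
Before a multi-day recording run, it may be worth confirming that the commands the scripts call are actually on your PATH. The helper below is a hypothetical sketch (the name check_deps is mine); it only reports what is missing and does not install anything.

```bash
#!/bin/bash
# check_deps CMD...
# Prints "missing: CMD" for every command that is not on PATH and
# returns non-zero if at least one is missing.
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

For the pipewire script you would run `check_deps playerctl lame pw-record`, and for the old pulseaudio script `check_deps dbus-send lame parec`.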

Script (for use with pipewire)

#!/bin/bash

# Obtains current song info
getSong() {
    artist=$(playerctl metadata artist | sed -e 's/[<>:"/\\|?*]/_/g')
    title=$(playerctl metadata title | sed -e 's/[<>:"/\\|?*]/_/g')

    filename="$artist - $title"
}

# Obtains playback status
getPlaybackStatus() {
    status=$(playerctl status)
}

# Global variables
previousSong=""
previousPID=0
songID=0

# TODO: query user to specify a playlist/directory name and create it

# Begin
echo "Running (wait for start)..."

# Wait until user starts playback
while [[ "$status" != "Playing" ]]; do
    getPlaybackStatus
    sleep 0.2  # short pause so the wait loop does not spin the CPU
done

# Main loop
while true; do
    getSong
    getPlaybackStatus
    if [ "$status" == "Playing" ]
    then
        if [ "$previousSong" != "$filename" ]
        then
            #ensure title and artist sync
            getSong

            # stop recording previous song
            if [ $previousPID != 0 ]
            then
                echo "stopping song '$previousSong'... "
                killall lame
                previousPID=0
                previousSong=0
            fi

            # start new recording using currentSong as filename
            ((songID=songID+1))
            echo "starting song '$filename'..."
            
            # Update this with browser node name
            recording=$(pw-record --target "Firefox" - | lame -s 48000 -r -b 320 - "$songID - $filename.mp3") &
            currentPID=$!

            previousSong=$filename
            previousPID=$currentPID
        fi
    else
        # stop recording - playback is ended
        if [ $previousPID != 0 ]
        then
            echo "stopping song '$previousSong'... "
            # this is not elegant but the PID returned was for the recorder, not lame
            killall lame
            previousPID=0
            previousSong=0
        fi
    fi
done

Script (Old – worked in 2022 with pulseaudio)

#!/bin/bash

# Obtains current song info from the Spotify DBUS entry
getSong() {
  artist=$(dbus-send --print-reply --dest=org.mpris.MediaPlayer2.spotify /org/mpris/MediaPlayer2 org.freedesktop.DBus.Properties.Get string:org.mpris.MediaPlayer2.Player string:Metadata | sed -n '/albumArtist/{n;n;p}' | cut -d '"' -f 2)
  title=$(dbus-send --print-reply --dest=org.mpris.MediaPlayer2.spotify /org/mpris/MediaPlayer2 org.freedesktop.DBus.Properties.Get string:org.mpris.MediaPlayer2.Player string:Metadata | sed -n '/title/{n;p}' | cut -d '"' -f 2)

  filename="$artist - $title"
  
  # TODO: clean filename to only use valid characters
}

# Obtains playback status from the Spotify DBUS entry
getPlaybackStatus() {
  status=$(dbus-send --print-reply --dest=org.mpris.MediaPlayer2.spotify /org/mpris/MediaPlayer2 org.freedesktop.DBus.Properties.Get string:org.mpris.MediaPlayer2.Player string:PlaybackStatus | sed -n '/variant/{p}' | cut -d '"' -f 2)
}

# Global variables
previousSong=""
previousPID=0
songID=0

# TODO: query user to specify a playlist/directory name and create it

# Begin
echo "Running (wait for start)..."

# Wait until user starts Spotify playback
while [[ "$status" != "Playing" ]]; do
  getPlaybackStatus
done

# Main loop
while [[ true ]]; do
  getSong
  getPlaybackStatus
  if [ "$status" == "Playing" ]
  then
    if [ "$previousSong" != "$filename" ]
    then
	  #ensure title and artist sync
      getSong

      # stop recording
      if [ $previousPID != 0 ]
      then
        echo "stopping song '$previousSong'... "
        killall lame
      fi

      # start new recording using currentSong as filename
      ((songID=songID+1))
      echo "starting song '$filename'..."
	  
      # Update this with your specific sound device
      # (-r = raw PCM input; parec defaults to 44.1 kHz, so -s is given an explicit value)
      recording=$(parec -d alsa_output.pci-0000_00_1b.0.analog-stereo.monitor | lame -r -s 44.1 -b 320 - "$songID - $filename.mp3") &
      currentPID=$!

      previousSong=$filename
      previousPID=$currentPID
    fi
  else
    # stop recording - playback is ended
    if [ $previousPID != 0 ]
    then
      echo "stopping song '$previousSong'... "
	  # this is not elegant but the PID returned was for parec not lame
      killall lame
    fi
    echo "Playback ended, program exiting!"
    exit
  fi
done

Pre-Flight Checklist

For the old pulseaudio script, make sure you update the sound device passed to parec (the -d argument) to the monitor of your own output device.

For the new pipewire script, you can change the pw-record --target value to your application's node name, which you can find using wpctl status.
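
If you are unsure what your monitor device is called, one way to find it is to filter the output of `pactl list short sources`. The sketch below assumes that output is tab-separated with the source name in the second column; the function name monitor_sources is hypothetical. It reads from stdin so you can try it on saved output too.

```bash
#!/bin/bash
# monitor_sources
# Reads `pactl list short sources` output on stdin and prints only the
# source names (second tab-separated column) that end in ".monitor".
monitor_sources() {
    awk -F '\t' '$2 ~ /\.monitor$/ { print $2 }'
}
```

Typical use would be `pactl list short sources | monitor_sources`, then copying the printed name into the parec -d argument.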

Operation

  1. Open the player in your browser.
  2. Run the script.
  3. Start playlist playback.

The script will save the files in the current directory.

Tips for Success

For best results, you may want to do the following in the Spotify settings:

  • Set streaming quality to maximum.
  • Disable Autoplay.
  • Disable crossfading and smooth transitions.

Your computer volume needs to be non-zero and unmuted – but in Ubuntu you can easily divert the sound to your headphone jack with nothing plugged in 😉

Post-Production

Once your playlist is done playing and the script stops, you may need to do a few things:

  1. Group files in a playlist folder.
  2. Remove invalid characters from filenames (e.g. ? ” / \ and so on); the pipewire script already replaces these with underscores.
  3. Use an MP3 tagging tool to update ID3 tags from the filenames.
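
Since the files are named “# - Artist - Title.mp3”, the tag fields can be recovered from the filename with plain bash parameter expansion. This is a rough sketch with a hypothetical function name (parse_track); it assumes the artist does not itself contain " - ", and treats everything after the second separator as the title.

```bash
#!/bin/bash
# parse_track FILE
# Splits "N - Artist - Title.mp3" into the variables track, artist and title.
# Assumption: the artist contains no " - "; a title may contain it.
parse_track() {
    local base=${1##*/}       # drop any directory part
    base=${base%.mp3}         # drop the extension
    track=${base%% - *}       # text before the first " - "
    local rest=${base#* - }   # text after the first " - "
    artist=${rest%% - *}      # text before the next " - "
    title=${rest#* - }        # everything after that separator
}
```

After calling `parse_track "3 - Daft Punk - Around the World.mp3"`, the variables hold 3, Daft Punk and Around the World, ready to pass to whichever tagging tool you prefer.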

TODO

There are a bunch of ways it could be improved, but this was functional enough for me.

  • Remove invalid characters automatically (the pipewire script now does this; the old one does not)
  • Prompt user for a playlist name, to create a new folder
  • Properly end the background recording process for each song.
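
The playlist folder item could look something like the sketch below. The function name make_playlist_dir is hypothetical, and it reuses the same character set that the pipewire script replaces in song names; it is one possible approach, not something the scripts above currently do.

```bash
#!/bin/bash
# make_playlist_dir NAME
# Replaces characters that are invalid in common filesystems with "_",
# creates the directory (no error if it already exists), and prints the
# cleaned name so the caller can cd into it.
make_playlist_dir() {
    local name
    name=$(printf '%s' "$1" | sed -e 's#[<>:"/\\|?*]#_#g')
    mkdir -p -- "$name" && printf '%s\n' "$name"
}
```

In a script you might prompt with `read -r -p "Playlist name: " playlist` and then run `dir=$(make_playlist_dir "$playlist") && cd "$dir"` before the main loop, so the recordings land in that folder.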

Let’s see how long this works for.


Monday, January 10th, 2022 computers, music, software
