Audio Samples of Jacobian Disentangled Sequential Autoencoder

These are random source-target pairs from the validation set. The Griffin-Lim algorithm is used for mel spectrogram inversion. v-swap translates instrument timbre and preserves pitch; z-swap preserves instrument timbre and translates pitch.
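The two swaps can be sketched as a recombination of latents before decoding. This is only a minimal illustration, not the models' actual code: the `Latents` container and the `v_swap` / `z_swap` helpers are hypothetical names, and it assumes that `v` is a single per-clip latent carrying timbre and `z` is a per-frame latent sequence carrying pitch. The encoder and decoder are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Latents:
    """Hypothetical container for the latents of one audio clip."""
    v: List[float]        # assumed: one vector per clip (timbre)
    z: List[List[float]]  # assumed: one vector per frame (pitch)


def v_swap(source: Latents, target: Latents) -> Latents:
    """Translate timbre, preserve pitch: take v from target, keep z from source."""
    return Latents(v=target.v, z=source.z)


def z_swap(source: Latents, target: Latents) -> Latents:
    """Preserve timbre, translate pitch: keep v from source, take z from target."""
    return Latents(v=source.v, z=target.z)


# Toy latents standing in for encoder outputs.
source = Latents(v=[1.0], z=[[0.1], [0.2]])
target = Latents(v=[2.0], z=[[0.3], [0.4]])

timbre_translated = v_swap(source, target)  # target timbre, source pitch
pitch_translated = z_swap(source, target)   # source timbre, target pitch
```

In this sketch, decoding `timbre_translated` would correspond to the v-swap sample and decoding `pitch_translated` to the z-swap sample.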

For each source-target pair, the baseline TS-DSAE comes first, followed by the proposed J-DSAE.

NOTE: Start at a low volume when using headphones.

source recon. v-swap z-swap recon. target