Entries by yhyang

Lyrics-free Singing Voice Generation

The conventional approach to generating singing voices is through singing voice synthesis (SVS) techniques. A human user feeds lyrics and a MIDI score (a sequence of notes) to a well-trained SVS model, and the model generates audio recordings that follow the given lyrics and score faithfully. Such synthesis models have little freedom in deciding what to “sing.” In […]
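The pipeline described above boils down to a simple interface: lyrics plus a note sequence in, an audio waveform out. Here is a minimal sketch of that interface; the names `Note` and `synthesize` are hypothetical stand-ins for a trained SVS model, not part of any actual library.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Note:
    pitch: int       # MIDI pitch number, e.g. 60 = C4
    onset: float     # onset time in seconds
    duration: float  # duration in seconds

def synthesize(lyrics: List[str], score: List[Note]) -> List[float]:
    """Placeholder for a trained SVS model: one lyric syllable per note,
    returning a waveform (here just silence) at a 22.05 kHz sample rate."""
    assert len(lyrics) == len(score), "one syllable per note"
    total = max(n.onset + n.duration for n in score)
    return [0.0] * int(total * 22050)

# The model "sings" exactly the lyrics and notes it is given.
audio = synthesize(["twin", "kle"], [Note(60, 0.0, 0.5), Note(60, 0.5, 0.5)])
```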

MuseMorphose: Music Style Transfer with A Transformer VAE

At Taiwan AI Labs, we are constantly pushing the frontier of deep music generation models. In the past year, we have rolled out the Guitar Transformer (blog), which can compose human-readable guitar tabs with plausible fingerings, and the Compound Word Transformer (blog), which vastly accelerated model training and inference thanks to a carefully re-engineered music representation. Today, proudly making its debut […]
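The speedup claimed for the Compound Word Transformer comes from the representation itself: attributes that would each occupy one token in a flat event sequence are grouped into a single "compound word," so the Transformer processes far fewer timesteps. The sketch below illustrates the idea with made-up token names; it is not the exact vocabulary of the linked model.

```python
# Flat, one-token-per-attribute encoding of two notes: 8 timesteps.
flat = [
    "Position_1/16", "Pitch_60", "Duration_4", "Velocity_20",
    "Position_5/16", "Pitch_64", "Duration_4", "Velocity_20",
]

# The same two notes with each note's attributes grouped into one
# compound word: 2 timesteps. Shorter sequences mean faster training
# and inference for a Transformer, whose cost grows with sequence length.
compound = [
    ("Position_1/16", "Pitch_60", "Duration_4", "Velocity_20"),
    ("Position_5/16", "Pitch_64", "Duration_4", "Velocity_20"),
]

print(len(flat), "->", len(compound), "timesteps")
```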

Guitar Transformer and Jazz Transformer

At the Yating Music Team of Taiwan AI Labs, we are developing new music-composing AI models that extend our previous Pop Music Transformer model (see the previous blog). In October 2020, we are going to present two full papers documenting some of our latest results at the International Society for Music Information Retrieval […]

Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions

Paper (ACM Multimedia 2020): https://arxiv.org/abs/2002.00212 (pre-print)
Code (GitHub): https://github.com/YatingMusic/remi

We’ve developed Pop Music Transformer, a deep learning model that can generate pieces of expressive Pop piano music several minutes long. Unlike existing models for music composition, our model learns to compose music over a metrical structure defined in terms of bars, beats, and sub-beats. […]
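The bar/beat/sub-beat structure is encoded directly in the event vocabulary the model reads and writes: bar markers and within-bar position tokens anchor every note to the metrical grid, and the Transformer is trained to predict the next event like a language model predicts the next word. Below is a minimal sketch of such a beat-based token sequence; the token names are illustrative, and the actual vocabulary lives in the linked REMI repository.

```python
# One bar of music as a flat event sequence: a Bar marker, then a
# Position token (sub-beat within the bar) anchoring each note's
# velocity, pitch, and duration tokens.
tokens = [
    "Bar",
    "Position_1/16", "Note-Velocity_20", "Note-On_60", "Note-Duration_4",
    "Position_5/16", "Note-Velocity_20", "Note-On_64", "Note-Duration_4",
    "Position_9/16", "Note-Velocity_20", "Note-On_67", "Note-Duration_8",
]

# Map tokens to integer IDs so a Transformer can model the sequence
# autoregressively, one event token at a time.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
ids = [vocab[t] for t in tokens]
print(ids)
```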