Proposal

Summary

We will generate video-game music using a Markov-model-based algorithm, with parallel computation on GPUs for real-time music output. We will implement parallel algorithms and analyze speedup for two portions of the project:

  1. Matrix generation: building the Markov matrices that will be used for music generation
  2. Music generation: parallel synthesis of multiple melodic lines using the trained matrices from matrix generation

Background

Through the forms and rules outlined in music theory, humans already follow many rigid guidelines when composing music. Therefore, it seems natural to extend this technique to using an algorithmic composer. In particular, the genre of video game music, which is characterized by the overlapping of multiple melodic lines and a repetitive and structured format, seems ideal as a starting point for parallel algorithmic composers. The many melodic lines offer ample avenues for parallelization, allowing for parallel synthesis of different parts. In addition, the rigid structure makes such music easier to predict and generate with a Markov Model — resulting in higher quality and more harmonious music.

The Markov Model of music generation was chosen for its popularity, good musical results, and ease of implementation, understanding, and parallelization. The model is straightforward to implement, as it is based on using probability matrices to determine the next notes of a melodic line. These matrices are easy to store and manipulate in many different programming contexts, including the GPUs we wish to use. Other music generation algorithms introduce too many new challenges, such as the recursive nature of Grammars (which may be difficult, as it is not known beforehand how many recursions/threads we would need to spawn) or the poor musical output of Structure and Probability techniques (Dannenberg, 2018). Finally, it has been shown that Markov Models produce good musical results: computer compositions and human compositions received similar rankings in formal listening tests (Elowsson and Friberg, 2012). Therefore, a Markov Model is the best choice for our project to produce music with integrity and quality.

In a Markov Model, multiple matrices are built from training data: files of music that fit the patterns and genre of the music we wish to create. Most commonly, this takes the form of counting the number of transitions from one note to another and storing each count in a matrix. These matrices are then normalized into probability matrices. For each melodic line, the algorithm uses the current note (and perhaps some number of past notes) to select the correct matrix and matrix row. It then samples the next note from the probabilities in that row. Because the matrices are trained on music of the same structure and genre, the resulting melodies are likely to follow the same structure and sound similar to the training pieces. The resulting melodies are then combined and output through MIDI.

Implementation

[Figure: system overview]

Benefits of Parallelism per section:

  1. Matrix generation:
    • Most matrix entries can be computed in parallel, as there are few dependencies between entries. Since most entries hold counts of transitions between two notes (determined by the row and column), these counts are independent whenever the note pairs differ.
    • There will be many matrices for varying melodic lines — bass, melody, counter-melody, etc. These matrices can be computed in parallel.
    • Generating weights based on varying levels of context (number of previous notes) requires multiple data lookups that could perhaps be done in parallel.
    • Multiple training files can be read and processed in parallel, followed by an overall reduction/sum, in order to increase training performance.
  2. Music generation:
    • Multiple melodic lines can be synthesized in parallel, allowing for faster generation of music. If the speedup is significant, it may also allow for more complex music or an increased tempo.

Challenge

Although there are avenues for parallelization as seen above, music has many inherent synchronization requirements that make it difficult to parallelize. For example, even if melodic lines can be computed in parallel, they need to sound harmonious when played together. Video game music often has theme changes that require coordinated operations: all melodic lines must react in tandem to changes in theme, tempo, volume, etc. A parallel, faster algorithm could generate more complex, fast-paced music, but the most basic requirement that music sound pleasing to the ear forces a natural element of synchronization, and therefore difficulty. We hope to explore this balance between the requirements of structure and harmony and the performance, speedup, and added complexity possible with parallelism, and to determine how this balance affects the music we create.

Resources

In order to perform the basic parallel computations on matrices and music generation, we want to use GPUs; therefore, we need access to machines with GPUs. Our code base is essentially starting from scratch, with some guidance from 15418 asst2, where we used CUDA to process images. We will need to produce our own implementations of the methods and data structures for Markov models and for training and generating music. We are using past computer music papers as references for building our Markov model (Dannenberg, 2018; Elowsson and Friberg, 2012). In addition, we need some sort of device to play the MIDI output. We will most likely use personal laptops, but we need to figure out how to write a program that can receive web packets and play MIDI in real time.

In terms of programming languages, we will be using Thrust, a C++ template library for CUDA, in order to gain access to both GPU computation and MIDI (via a C++ MIDI player such as adlmidi: https://bisqwit.iki.fi/source/adlmidi.html). This gives us a familiar language that satisfies both the GPU computation and MIDI requirements.

Goals and Deliverables

We plan to create a video game music generator that produces music in real time. The generator will first train multiple probability matrices in parallel over a large collection of files in the video game music genre, with significant speedup over sequential algorithms. Although the details depend on the final model used, this training will most likely be based heavily on matrix sum operations and parallel updates to matrix entries. Therefore, we expect speedup goals similar to those of general parallel matrix sum algorithms. For the second component, music synthesis, we plan to see significant speedup in the generation of multiple melodic lines. In the best case, this speedup would be linear in the number of melodic lines, although the many dependencies noted above make that unlikely. Overall, we will produce a harmonious-sounding parallel algorithmic composer for video game music that shows significant speedup in both training and synthesis over sequential versions.

We hope to create a Markov model that is more specialized in generating good video game music. Since we are focused on this one genre, we may make our algorithm more specialized in order to increase performance (either in runtime or musical quality). We would also like our application to be more widely accessible, which would require setting up a server and communicating to clients, most likely through a web browser.

Another stretch goal would be to demonstrate the feasibility of using parallel computation for video games in industry. For example, many soundtracks in video games are static and built-in, and thus don’t have much variation. Since video games already use GPUs, it would be interesting to see if music can have a role in using the GPUs in addition to the graphics which currently use them. We could explore the performance of using the GPUs to produce music with our algorithm in relation to the current method of producing music (static, built-in, sequential).

The demo will include speedup graphs that showcase the performance gained by utilizing GPUs over sequential code. In addition, we will generate music live during the demo to show that our fast parallel algorithm still produces a harmonious sound, demonstrating that we found solutions to the synchronization challenges noted above. The combination of melodious music and speedup graphs shows our understanding of the balance between the dependencies in music and our parallel algorithm.

For the analysis component of our project, we will demonstrate the significant speedup gained from utilizing GPUs versus sequential code. This speedup may allow for more complex or faster-paced music generation. In addition, we hope to characterize the balance between synchronizing melodic lines through a conductor and synthesizing them in parallel. Questions we hope to answer include how much communication is necessary to produce music-theoretically correct chords and harmonious-sounding music, and how much synchronization and constraint a conductor places on music generation. Specifically, we hope to find quantitative ways to describe the balance between parallel synthesis of melodic lines and their interactions with each other and with the conductor.

Platform Choice

We are planning to use the Thrust library on top of CUDA C++ in order to write more efficient code. Thrust offers significant performance benefits over sequential C++ and is easier to develop in than raw CUDA, resulting in more productive development. In addition, the availability of MIDI APIs in C++ allows us to use the same language in the final music output code.

Using GPUs to perform our computation is a good choice, as both training and Markov Model synthesis rely on large, regular matrix operations that map well to the data-parallel execution model of GPUs.

Schedule

Dates  | Goal                      | Description
Nov 4  | Lit Review + Model        | Learn how MIDI is generated and survey the various modifications to Markov Models and similar algorithms. Other research can include video game music theory, to properly determine the structure and rules we wish to follow. We want to decide on our model, determine why it works well, and choose the algorithm we will use to generate our music.
Nov 11 | Basic Parallel Alg.       | Implement a CUDA program that generates the matrices for our model.
Nov 18 | Music Generation          | Implement the music generation component so we can play generated MIDI files using the matrices trained the previous week.
Nov 19 | Checkpoint                | Summarize progress on the project so far.
Dec 1  | Optimize Music Generation | Optimize the music generation algorithm. Explore the balance between parallelizing melodic lines and harmonious output.
Dec 7  | Optimize Matrix Generation | Optimize matrix generation to speed up training time. Graph and analyze performance.
Dec 14 | Wrap-up                   | Complete the final report.
Dec 15 | Presentation              | Practice the presentation.

Works Cited