Introduction:
In Winter 2019, I took a Deep Learning course taught by Prof. Honglak Lee of the Computer Science and Engineering department at the University of Michigan. I worked with Kunyi Lu, Mingyu Yang, Congrong Zhou, and Jeffrey Dominic on using deep learning to classify Magnetoencephalography (MEG) data of the brains of musicians and non-musicians. We worked on this project with Dr. Kanad Mandke, a research associate at the University of Cambridge in the UK. A detailed paper is being written, as this is the first known attempt at classifying resting-state brain scans of musicians and non-musicians.
Magnetoencephalography Data

1D time-series data of a 5-minute MEG scan.
Classical analysis of MEG data typically relies on two assumptions:
- A linear model is assumed.
- Noise is assumed to be uncorrelated.
When studying the brain connectivity profiles of musicians and non-musicians, these assumptions may not be appropriate for classifying between the two groups. We do not have a linear model with which to extract features for the brain connectivity profiles. In addition, there may be distributed activity in the brain that would require a multivariate analysis approach. As a result, the subtle differences that may exist between the connectivity profiles of musicians and non-musicians could easily be overlooked by a classical feature selection approach.
To capture these subtle differences, we chose to implement a CNN for the classification task, as it requires no model assumptions and can capture neural activity across different regions of the brain. Furthermore, CNN-based classification is more robust to noise in the data than univariate analysis.
Adjacency Matrix
Since the original time-series data is long (5 × 30 × 600 = 90000 samples) and not convenient for a CNN, we also generated an adjacency matrix from the time-series data to represent the connectivity of all 78 regions. An adjacency matrix is a square matrix used to represent a finite graph; its elements indicate whether pairs of vertices are adjacent in the graph. In our special case of a finite graph, the adjacency matrix has zeros on its diagonal. To obtain it, we stack the time-series data of all 78 regions per subject and compute pairwise linear correlation coefficients between each pair of columns of the preprocessed data. This transforms a 5 × 90000 input into a 5 × 78 × 78 tensor, which is better suited to a CNN.
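Below is a minimal sketch of this construction, assuming the per-band, per-region time series are available as a NumPy array; the variable and function names are illustrative, not from our actual codebase.

```python
import numpy as np

def build_adjacency(band_signals: np.ndarray) -> np.ndarray:
    """Turn per-band region time series of shape (5 bands, 78 regions, T)
    into correlation-based adjacency matrices of shape (5, 78, 78)."""
    n_bands, n_regions, _ = band_signals.shape
    adj = np.empty((n_bands, n_regions, n_regions))
    for b in range(n_bands):
        # Pairwise Pearson correlation between the 78 region time series.
        adj[b] = np.corrcoef(band_signals[b])
        # Zero the diagonal, matching our graph representation.
        np.fill_diagonal(adj[b], 0.0)
    return adj
```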

Adjacency matrix obtained using the last 2 minutes of the MEG scan
Graph Based Deep Learning Model:

Our graph-based model
To distinguish musicians from non-musicians, we formulated the task as a graph classification problem. The brain connectivity profile of a musician differs from that of a non-musician, which can be leveraged as a key feature for this classification. We therefore built a graph-based deep learning network.
The input of the model is a graph representation of a brain: the adjacency matrix. First, we used an atlas that divides the brain into 78 regions and computed the correlation between each pair of regions. To represent the brain as a graph, we treated each region as a node and the correlation as the weight of an edge. Given the adjacency matrix, we used the Node2Vec algorithm to embed each node of the graph into a vector, deriving a 78 × d_{embedding} feature map from a 78 × 78 adjacency matrix, where d_{embedding} denotes the dimension of the embedding vector.
For data augmentation, we applied filters to the time-series signals: each frequency band of one person's brain can be used to compute an adjacency matrix, so each brain yields as many adjacency matrices as there are filters (we have 5 frequency bands in our project). To integrate the information across all frequency bands, we applied an average pooling layer along the frequency-band dimension of the feature maps from one person's brain. Finally, we passed the pooled feature map through one convolution layer, two fully connected layers, and a Sigmoid layer. The output is the probability that the subject is a musician: if the probability is larger than 0.5, the subject is classified as a musician; otherwise, as a non-musician.
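The PyTorch sketch below shows one way the classifier head described above could be wired together. The layer widths, kernel size, and d_embedding = 64 are illustrative assumptions rather than our exact settings, and the Node2Vec embeddings are assumed to be precomputed into a (5, 78, d_embedding) tensor per subject.

```python
import torch
import torch.nn as nn

class GraphClassifier(nn.Module):
    """Average-pool over frequency bands, then conv + 2 FC layers + Sigmoid."""
    def __init__(self, n_regions: int = 78, d_embedding: int = 64):
        super().__init__()
        # Treat the 78 regions as channels; convolve along the embedding axis.
        self.conv = nn.Conv1d(n_regions, 32, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(32 * d_embedding, 128)
        self.fc2 = nn.Linear(128, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 5 bands, 78 regions, d_embedding)
        x = x.mean(dim=1)             # average pooling over the band dimension
        x = torch.relu(self.conv(x))  # (batch, 32, d_embedding)
        x = torch.relu(self.fc1(x.flatten(start_dim=1)))
        return torch.sigmoid(self.fc2(x))  # probability of "musician"
```

A returned probability above 0.5 would then be read as "musician", matching the decision rule above.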
Time Series Classification
Although an adjacency matrix presents useful connectivity information for our task, it is inherently static. Our brains are highly dynamic systems, and as a result, the oscillations of brain regions are also highly dynamic. After a discussion with Dr. Mandke, we learned that two different brain regions generally oscillate out of phase with each other; however, when these regions communicate, their oscillations rapidly synchronize and become in phase. Such changes in neural oscillations are extremely important to account for. Consequently, what may look like noise in an MEG signal is actually very important information that could greatly help with classification.
Since the raw time-series data captures these changes, we experimented with using it directly for classification, employing the well-known EEGNet model.
Each 5-minute MEG signal is split into five frequency bands: delta, theta, alpha, beta, and gamma. We downsampled the time-series data at different sampling rates to produce inputs of equal length. The data is formatted as a 5 × 78 × T tensor, where T is the length of the data, and passed to EEGNet.
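A hedged sketch of this preprocessing step is below. The band edges, the assumed 300 Hz sampling rate, and the single decimation factor q are illustrative choices; as described above, the original pipeline used different sampling rates to equalize input lengths.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, decimate

FS = 300  # assumed sampling rate (Hz); 5 min * 60 s * 300 Hz = 90000 samples
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 48)}  # assumed band edges (Hz)

def band_split(signal: np.ndarray, q: int = 2) -> np.ndarray:
    """signal: (78 regions, 90000 samples) -> (5, 78, T) tensor for EEGNet."""
    out = []
    for lo, hi in BANDS.values():
        sos = butter(4, (lo, hi), btype="band", fs=FS, output="sos")
        filtered = sosfiltfilt(sos, signal, axis=-1)
        # Downsample (with anti-alias filtering) so all inputs share length T.
        out.append(decimate(filtered, q, axis=-1))
    return np.stack(out)  # shape (5, 78, T), with T = 90000 // q
```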
Results
The following are the accuracies we obtained for the different models. A paper is currently being written that will include a thorough analysis of these results.