Space-Time-Frequency Bag of Words Models for Capturing EEG Variability: A Comprehensive Study
By Kyung Min Su and Kay A. Robbins
University of Texas at San Antonio
January 2, 2015 (updated April 19, 2015)
UTSA CS Technical Report TR
Contents

1. Introduction
2. Bag-of-words models for EEG data
  2.1 Extracting descriptors
    2.1.1 Key frames
    2.1.2 Gradient descriptors
    2.1.3 Normalized frame vectors
    2.1.4 Peak-frame repeat method
  2.2 Building a dictionary
  2.3 Representing EEG data using a dictionary
    2.3.1 Matching words for a frame
    2.3.2 Histogram of words in an epoch
  2.4 Improving the quality of a descriptor
    2.4.1 Sub-intervals along time axes
    2.4.2 Frequency sub-bands
3. Evaluation of parameters
  3.1 Gradient descriptor versus normalized frame descriptor
  3.2 Evaluation of the dictionary sizes
  3.3 Evaluation of clustering methods
  3.4 Evaluation of sub-intervals
  3.5 Evaluation of sub-bands
  3.6 Cross subject transfer learning
4. Discussion
Acknowledgments
References
1. Introduction

EEG data are multi-channel time-series recorded from the scalp surface to provide spatial and temporal measurements of brain signals. Because EEG headsets have varying channel numbers, channel placements, and sampling rates, EEG data may have different dimensions depending on the type of headset used for signal acquisition. These differences make it difficult to combine datasets for large-scale machine learning or data mining applications. Many traditional EEG features, including the raw signal, are channel-specific [1] and not appropriate for processing multi-headset data of various channel configurations. Frame-based EEG features, which extract values from a field topography [2], [3], are less channel-specific. However, they usually assume that all EEG datasets are from the same headset. To represent EEG data regardless of headset configuration, we have investigated several variations of the classical bag-of-words (BOW) model, a widely used technique for extracting features from images for applications such as retrieval [4]. Images come in different sizes, shapes, and orientations, and BOW approaches are effective in mapping such data to common feature sets. Traditional BOW models use a dictionary of local features based on key points and then construct a histogram of the occurrences of these features in an image. A disadvantage of BOW features is that they lose information about the global spatial relationships of the key points in the image. However, this loss also makes the features robust to variations in scale and orientation. In this document, we describe several BOW approaches for EEG data that retain some frequency, spatial, and temporal relationships in the EEG data. The proposed descriptors are relatively insensitive to the number of channels, channel placement, sampling rate, signal range, and subject response time.
As a result, we can process EEG datasets of various configurations using a common dictionary of features. We have experimentally compared various approaches and parameters to provide an empirical basis for choosing optimal conditions. Section 2 describes the ideas behind configuration-independent EEG features based on BOW models, and Section 3 explains the implementation details and test results. Section 4 briefly discusses the implications of the results for EEG analysis.

2. Bag-of-words models for EEG data

In this section, we describe the proposed EEG descriptors based on bag-of-words (BOW) models that provide a relatively uniform representation of EEG data across different data collections. Figure 1 shows an overview of the processing steps for the proposed method. Processing consists of three parts: extracting descriptors from EEG data, building a dictionary from the extracted descriptors, and representing the data using the dictionary. The details of each part are explained in the following subsections.
Figure 1 Overview of the processing steps for the proposed methods.

2.1 Extracting descriptors

EEG data is multi-channel and includes both spatial and temporal information. Figure 2 shows an example of this data in channel-time space. Each frame contains the values recorded at the headset channel positions at a particular time and captures spatial relationships among channels. To represent spatial information in BOW, we tried two descriptors: a gradient descriptor and a normalized frame vector. The gradient descriptor method of Section 2.1.2 uses spatial gradients in the same way as the scale-invariant feature transform (SIFT) [5] features of traditional image processing applications. The second method, explained in Section 2.1.3, directly uses the frame vectors of the EEG recording.

Figure 2 EEG data in channel-time space. Each column corresponds to a time sample.

2.1.1 Key frames

Following BOW approaches described in the literature for images, we initially tried representing each frame using a traditional BOW gradient descriptor and forming histograms using key frames defined in time. The method is analogous to the traditional SIFT methods [5] for finding key points and representing the key points using a gradient descriptor. Our EEG gradient descriptor method also finds the key frames, at which overall power is a local maximum, and then calculates gradient vectors from the interpolated topographic map of each key frame to capture the relative activity of the brain. A configuration-invariant gradient descriptor is generated by appending the histograms of the gradient vectors in the four quadrants of each key frame. The following sub-sections explain how to apply the gradient descriptor key frames method to EEG data in more detail.

EEG key frames are defined as the frames that correspond to the peaks of the global field power (GFP). According to Lehmann [6], brain states can be estimated from stable patterns at the power peaks.
Even though the topographic maps between power peaks change continuously, the patterns at the peaks have (by necessity) a zero time derivative and hence retain the same pattern for a few time instants. Lehmann and others have hypothesized that these patterns reflect brain status. Regardless of whether this is the case, the power peaks form dominant patterns that are analogous to image key points.

For an L × N array, representing L channels and N frames, define the i-th frame, Y_i, as:

Y_i = [y_i1, y_i2, ..., y_iL]^T

where y_ij is the element at the i-th time frame and j-th channel. The GFP is the standard deviation of the values within each frame. The GFP of the i-th frame, GFP_i, is defined as:

GFP_i = sqrt( (1/L) Σ_{j=1}^{L} ( y_ij − (1/L) Σ_{k=1}^{L} y_ik )^2 )

Low GFP means there is little variation between the channels and very little spatial information in the frame. Therefore, we use only the peak frames of the GFP, which have a large variation and a clear spatial distinction among channels.

To visualize the effect of key frames, we simulated EEG signals using the DipoleSimulator of BESA [7]. The locations and orientations of the simulated dipoles are shown in Figure 3(a). Figure 3(b) shows the topographic maps corresponding to the first 150 frames of a simulated EEG signal for the dipole simulation shown in Figure 3(a). In this simulation, two of the four dipoles are active at each time. Figure 3(c) shows the corresponding power curve for these frames. There are 10 peak frames in the first 150 frames, and these 10 peak frames have steady topographic patterns that change according to the dipole status (see Figure 3(d)).

(a) Dipole behavior for simulation of EEG signals (b) Topographic maps of 150 frames
(c) Global field power (GFP) curve (d) Topographic maps of the frames corresponding to GFP peaks.

Figure 3 Topographic maps and GFP curves of simulated EEG data.

Figure 4 shows an example of actual EEG data that has been band-pass filtered in the alpha frequency band of 8 Hz to 12 Hz. This data is taken from the publicly available 109-subject BCI-2000 dataset [8], [9]. Figure 4(a) shows the topographic maps of 90 consecutive frames. Figure 4(b) shows the power curve corresponding to these frames, and Figure 4(c) shows the topographic maps at the power peak frames. The topographic patterns at the power peaks usually retain their spatial characteristics for several frames.

(a) Topographic maps of 90 frames (b) Global field power (GFP) curve (c) Topographic maps of peak frames

Figure 4 Topographic map and GFP curve for a BCI-2000 dataset [8], [9] (Subject 1, Task 1, alpha band-pass filtered).
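The GFP computation and key-frame selection described above can be sketched in a few lines of numpy. This is a simplified illustration, not the report's own code; the function names are ours.

```python
import numpy as np

def gfp(frames):
    """Global field power: the per-frame standard deviation across channels.
    frames: (N, L) array of N time frames by L channels."""
    return frames.std(axis=1)

def peak_frames(frames):
    """Indices of the local maxima of the GFP curve (the key frames)."""
    g = gfp(frames)
    # interior points strictly larger than both neighbors
    return np.where((g[1:-1] > g[:-2]) & (g[1:-1] > g[2:]))[0] + 1
```

With frames whose channel spread rises and falls, `peak_frames` picks out exactly the frames where the spatial variation peaks, mirroring the GFP-peak selection in the text.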
Figure 5 GFP curves from three types of headsets in eight frequency sub-bands: (a) 0~4 Hz, (b) 4~8 Hz, (c) 8~12 Hz, (d) 12~16 Hz, (e) 16~20 Hz, (f) 20~24 Hz, (g) 24~28 Hz, (h) 28 Hz and above.
One difficulty with configuration-independent descriptors based on GFP is that different headset configurations are likely to have a different amount of total power depending on coverage of the scalp and on frequency band. To compare GFP curves for different headset configurations, we generated virtual EEG data from Biosemi 64-channel EEG to simulate other headsets by choosing only those channels that closely mapped the positions in the simulated headsets [10]. Figure 5 shows the power peaks of three test headsets and their correlations. In the figure, the blue solid line represents the power curve of the original Biosemi 64-channel data, the green dashed line is the power curve of an Emotiv 14-channel simulated headset, and the red dash-dot line is the power curve of an ABM 9-channel simulated headset. The correlation between the power curves of the simulated headset data and the Biosemi data is shown in the legend area. The Biosemi headset covers the entire head area evenly, while the Emotiv headset does not have any detectors on the top of the head and the ABM has only 9 channels placed at the top of the head. Headsets with different channel configurations have highly correlated GFPs even with low numbers of detectors and biased placement.

2.1.2 Gradient descriptors

Gradient descriptors provide low-dimensional vector representations of topographic maps and are generated through a SIFT (scale-invariant feature transform)-like method. Given the spatial location of each channel, we generate an interpolated topographic map on a spatial grid for each key frame in a multi-channel EEG data set. Because we interpolate EEG data into a topographic map of common dimensions, we can process all EEG data in the same way regardless of the physical channel configuration, provided there are sufficient channels to perform an interpolation. Figure 6 shows each step of the generation of the gradient descriptor from a topographic map.
A topographic map of size (24 × 24), shown in Figure 6(a), is divided into a sample grid of size (8 × 8), as shown in Figure 6(b). Each grid cell corresponds to a (3 × 3) block of pixels from the topographic map. Figure 6(c) shows the gradient information, which is extracted from the (3 × 3) pixels of each cell. We use a histogram to represent the distribution of gradients in each quadrant area of the gradient map, as shown in Figure 6(d). To reduce the boundary effects between the histogram bins, the smoothing process of the SIFT method is also applied to each histogram. Finally, the four histograms are concatenated and normalized to unit length. This method follows the SIFT technique for image retrieval quite closely. However, traditional SIFT uses a global orientation to adjust the direction of the gradient vectors. We do not measure the global orientation of the gradient map because all gradient maps of EEG data already have the same orientation. Because we use relative gradient information on a topographic map, the features are invariant to the scale of the EEG data.

(a) Topographic map (24 × 24) (b) Sample grid (8 × 8) (c) Gradient map (8 × 8) (d) Gradient descriptor (2 × 2)

Figure 6 Gradient descriptor for a frame (EEGLAB sample dataset, 32 channels).

The gradient descriptors are the EEG counterpart of the SIFT descriptors of image processing. These descriptors form montage-independent EEG descriptors. However, the method requires a sufficient number of EEG channels distributed over the entire head surface in order to interpolate a topographic map with sufficient accuracy.
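A much-simplified sketch of the quadrant-histogram construction is shown below. It computes per-pixel gradients on an interpolated (24 × 24) map and builds one magnitude-weighted orientation histogram per quadrant; the (8 × 8) cell aggregation and the SIFT bin smoothing described above are omitted for brevity, and the function name is ours.

```python
import numpy as np

def gradient_descriptor(topo_map, n_bins=8):
    """SIFT-like sketch: orientation histograms of the spatial gradient,
    one per quadrant, concatenated and L2-normalized.
    topo_map: interpolated (24 x 24) scalp map."""
    gy, gx = np.gradient(topo_map)          # gradients along rows, columns
    mag = np.hypot(gx, gy)                  # gradient magnitude
    ang = np.arctan2(gy, gx)                # gradient orientation in [-pi, pi]
    h, w = topo_map.shape
    hists = []
    for rows in (slice(0, h // 2), slice(h // 2, h)):
        for cols in (slice(0, w // 2), slice(w // 2, w)):
            # magnitude-weighted orientation histogram for this quadrant
            hist, _ = np.histogram(ang[rows, cols], bins=n_bins,
                                   range=(-np.pi, np.pi),
                                   weights=mag[rows, cols])
            hists.append(hist)
    d = np.concatenate(hists)
    norm = np.linalg.norm(d)
    return d / norm if norm > 0 else d
```

With the default 8 orientation bins, the descriptor has length 4 × 8 = 32 and unit norm, matching the "concatenate the four quadrant histograms and normalize" recipe in the text.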
Figure 7 shows a case where the gradient descriptor method fails to generate the same (or similar) descriptors for the same brain pattern on different headsets. The ABM headset has only nine channels, and the Emotiv headset places its 14 detectors only around the headband area. To simulate EEG data of the same brain pattern as detected by these other headsets, we chose the closest 9 and 14 channels of the Biosemi data, respectively. If we compute the gradient descriptors from each headset, as shown in Figure 7, the interpolated maps and gradient descriptors are not similar to each other, even for the same subject. In order to accommodate low-density headsets, we developed another method called normalized frame vectors, which we describe in the next section.

Figure 7 Channel placements, interpolated maps, and gradient descriptors from three different types of headsets (Biosemi 64 channels, ABM 9 channels, Emotiv 14 channels) for the same brain pattern.
2.1.3 Normalized frame vectors

This section presents an alternative scale-independent spatiotemporal representation of EEG that uses normalization to remove the scale differences. After normalizing each channel to have zero mean and unit standard deviation, we normalize each frame so that it has unit length:

normalized frame_i = Y_i / ||Y_i||

where Y_i is the i-th frame and ||Y_i|| is the norm of Y_i. After normalization, each normalized frame vector has the same length and represents the relative spatial relationship between channels. These normalized frame vectors are less sensitive to noise than the respective gradients. They can be used with or without temporal key frames to form a model.

2.1.4 Peak-frame repeat method

The peak frame method has the advantage of capturing patterns that are stationary and hence more likely to contain signal rather than noise. However, sometimes a signal will have a very low density of peak frames, particularly when the signal is restricted to frequency sub-bands. Normalized frame descriptors, on the other hand, have a word for each frame, so the bin densities are much better; however, many of the normalized frames may contain a significant amount of noise. We also examined a hybrid approach in which we replace each frame by the word corresponding to the closest peak frame that occurred at a time less than or equal to the current frame. We refer to this method as the peak-frame repeat method. The appendix compares the performance of the three methods for converting frames to words.

2.2 Building a dictionary

In a bag-of-words model, a word in the dictionary represents a typical pattern of data. After extracting frame descriptors representing the relations between channels, we cluster the frame descriptors to find the typical relationships between channels. K-means clustering is the most popular clustering method, but it tends to assign more clusters to the dense areas of dominant classes.
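The normalized-frame and peak-frame-repeat preparation steps described above can be sketched in numpy. This is an illustrative sketch with our own function names; in particular, frames that precede the first peak borrow the first peak's word, a detail the text leaves open.

```python
import numpy as np

def normalized_frames(data):
    """Two-step normalization: z-score each channel over time, then scale
    each frame (row) to unit length.  data: (N, L) array, N frames by L
    channels."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / np.where(norms > 0, norms, 1)

def peak_frame_repeat(words, peak_idx):
    """Replace each frame's word with the word of the most recent peak
    frame at or before it.  words: one word id per frame; peak_idx:
    sorted indices of the peak frames."""
    words = np.asarray(words)
    peak_idx = np.asarray(peak_idx)
    # index of the most recent peak at or before each frame; frames before
    # the first peak are clipped to the first peak (a sketch-level choice)
    pos = np.searchsorted(peak_idx, np.arange(len(words)), side="right") - 1
    return words[peak_idx[np.clip(pos, 0, None)]]
```

For example, with per-frame words `[10, 11, 12, 13, 14]` and peaks at frames 1 and 3, the repeated sequence becomes `[11, 11, 11, 13, 13]`.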
If the data is unbalanced, k-means clustering will assign more clusters to the dominant groups and fewer clusters to the minority groups. If we build a dictionary using k-means clustering, we may therefore have less resolution among samples of the minority groups. However, rare patterns are often important in EEG data. To find the best clustering method, we tried the five clustering methods given in Table 1.

TABLE 1 CLUSTERING METHODS.

k-means (k-means++): K-means tries to reduce the sum of errors, where an error is the distance between a sample and its cluster center. K-means tends to place more cluster centers in dense areas. K-means++ is basically the same as k-means, but uses a revised seeding method that results in higher accuracies: while k-means clustering selects seeds randomly, k-means++ uses random seeding but gives more weight to samples far from the already chosen seeds [11].

k-centers: K-centers decomposes the data into segments of similar volume. After picking a seed centroid randomly, it picks the next centroid from the remaining data so that it is as far as possible from all previous centroids. The method repeats the centroid assignment until the required number of centroids is reached [12]. K-centers differs from k-means++ in that it picks the next centroid deterministically.

k-medoids: After assigning data into clusters using randomly picked data points as seeds, k-medoids randomly picks a new candidate data point for each cluster from among the cluster members and computes the within-cluster score. If the score improves, the candidate becomes the new medoid. The method repeats over all medoids until no changes occur [13].

subtractive (radius-based): Subtractive clustering segments samples into groups of similar radius by removing all samples in the vicinity of the cluster centers. Subtractive clustering is less sensitive to the density of samples [14].

affinity propagation: Affinity propagation simultaneously considers all data points as potential exemplars, and messages are exchanged between data points to find the cluster members [15].

The clustering methods in Table 1 have different characteristics. We use a low-dimensional example to illustrate their differences. Test samples were generated from five clusters in two dimensions. In Figure 8(a), the cluster on the lower right has more samples than the other clusters; all other clusters have similar densities. As shown in Figures 8(b)~(e), the k-means and k-means++ methods assign more clusters to the dense lower-right area, while the subtractive clustering and affinity propagation methods assign clusters more evenly throughout the entire space.

Figure 8 Voronoi diagrams of (a) the test samples and of the clusterings produced by (b) k-means, (c) k-means++, (d) subtractive clustering, and (e) affinity propagation.

In addition to choosing a clustering method, we also need to decide the number of clusters. These values are determined experimentally as explained in Section 3.
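To make the dictionary-building step concrete, the sketch below implements a minimal Lloyd's k-means over frame descriptors. It omits the k-means++ seeding and the replicates discussed later, and the function name is ours; in practice a library implementation would be used.

```python
import numpy as np

def build_dictionary(descriptors, n_words=100, n_iter=50, seed=0):
    """Minimal Lloyd's k-means: cluster frame descriptors into a
    dictionary of n_words exemplar spatial patterns (the report's
    default dictionary size is 100 words)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(descriptors, float)
    # seed the centers with randomly chosen descriptors
    centers = X[rng.choice(len(X), n_words, replace=False)]
    for _ in range(n_iter):
        # assign each descriptor to its nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # move each center to the mean of its members
        for k in range(n_words):
            members = X[labels == k]
            if len(members):
                centers[k] = members.mean(0)
    return centers
```

On two well-separated blobs of descriptors, a two-word dictionary converges to the two blob means, which is exactly the "typical pattern per cluster" behavior the dictionary relies on.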
2.3 Representing EEG data using a dictionary

A dictionary of exemplar spatial patterns is formed by clustering. These patterns are either the gradient descriptors described in Section 2.1.2 or the normalized frame vectors described in Section 2.1.3. A bag-of-words model represents an epoch by a histogram of matching words, which are the best matches of the spatial pattern at each frame or at key frames. The following sections explain the selection of matching words and the experimental parameters. We assume that we have built the dictionary with a base headset, in our case the 64-channel Biosemi.

2.3.1 Matching words for a frame

To represent data, we need to find matching words for each frame by comparing corresponding channels between the dictionary and the data. Many EEG headsets follow the international 10-20 system [16] for placing electrodes, so electrode positions for small headsets are subsets of the positions for larger headsets. For example, suppose the dictionary has been built from Biosemi 64-channel data. To represent data from an ABM 9-channel headset, we compare the ABM data to the POz (30th), Fz (38th), Cz (48th), C3 (13th), C4 (50th), F3 (5th), F4 (40th), P3 (21st), and P4 (58th) channels of the base (Biosemi 64-channel) headset. According to the 10-20 system, these nine channels of the Biosemi headset exactly overlap the nine channels of the ABM headset, assuming that the experimenters have not manually measured head positions. The matching procedure is as follows. If a new headset has more channels than the headset of the dictionary, we use only the overlapping channels between them when matching words. If a new headset has channels that do not spatially overlap with the headset corresponding to the dictionary, we can use interpolation or nearest-neighbor selection to generate matching channels for the new headset. Once we have matched channels, we need to determine how to select the corresponding words.
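The channel-matching step reduces to an index lookup between montages. The sketch below uses hypothetical label lists for illustration (real montages come from the headset definitions); non-overlapping channels would need interpolation or nearest-neighbor selection instead, as described above.

```python
# Hypothetical 10-20 channel-label lists for illustration only.
BIOSEMI_LABELS = ["POz", "Fz", "Cz", "C3", "C4", "F3", "F4", "P3", "P4", "Oz"]
ABM_LABELS = ["POz", "Fz", "Cz", "C3", "C4", "F3", "F4", "P3", "P4"]

def overlap_indices(base_labels, new_labels):
    """Indices into the base (dictionary) montage for each channel of the
    new headset that shares a position; channels of the new headset with
    no counterpart in the base montage are simply skipped."""
    pos = {name: i for i, name in enumerate(base_labels)}
    return [pos[name] for name in new_labels if name in pos]
```

Matching an ABM-style montage against the Biosemi-style base list returns the nine shared channel indices; matching in the other direction drops the channel (`Oz` here) that the smaller headset lacks, which is the "use only overlapping channels" rule.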
When we assign a sample to the matching words, we can use two approaches: hard assignment and soft assignment [17]. Hard assignment associates a sample with the nearest word, while soft assignment associates a sample with words based on a probability. Hard assignment is fast, but it has uncertainty and implausibility issues. Uncertainty happens when two or more words are relevant to a sample; in this case, we cannot claim one matching word for the sample because all candidate words are close enough to the sample. Implausibility happens when all words are too far from a sample; in this case, we cannot claim that the sample and its matching word are similar to each other. In this work, we use soft assignment to represent a sample as the matching probabilities of words, which are estimated based on the distance between a sample and a word in the dictionary. The soft assignment calculates a weight by multiplying a Gaussian density function and the Euclidean distance function [17]. On the assumption that all words have the same distributions of samples, we use the same smoothing parameter σ for all words. In our tests, the parameter σ was tuned empirically. The word weights for a frame are normalized so that they sum to 1. That is, if a dictionary has N words, each frame is represented as a weight vector of length N whose elements sum to 1.

2.3.2 Histogram of words in an epoch

After representing the raw EEG data as probability vectors of matching words, we represent an epoch as a histogram of words by adding the probability vectors into a histogram and normalizing the histogram to a unit vector. If we use all frames in the epoch, the epoch feature vector will be the average of the probability vectors for the individual frames. If we use only power peak frames, the epoch feature will be the average of the probability vectors corresponding to the power peak frames. No matter which method is used, the size of the epoch feature vector is always the size of the dictionary, and feature vectors can be processed in the same way as long as the same dictionary is used. Another advantage of this approach is that the histogram is invariant with respect to the sampling rate. Hence, we can easily share information across different datasets without regard to their configurations if one universal dictionary is used for all datasets.

2.4 Improving the quality of a descriptor

Because a bag-of-words model is a dimension-reduction technique, it comes with some information loss and performance degradation. To improve the quality of the descriptor, we apply two additional strategies: sub-intervals and sub-band filtering.

2.4.1 Sub-intervals along time axes

The proposed EEG descriptor uses a histogram to represent an epoch. Because a histogram is an orderless feature, it does not capture temporal relationships between frames. To compensate for the loss of temporal-order information, we divide an epoch into small sub-intervals and represent each sub-interval separately using a histogram. We then represent the epoch by concatenating the histograms of all sub-intervals. This allows the BOW features to retain some temporal-ordering information but does not require perfect time-locking to capture features.

2.4.2 Frequency sub-bands

It is generally known that neural oscillations in certain frequency bands have specific biological meanings [18]. Table 2 lists some common frequency bands and example locations on the scalp. Many studies have shown that power in specific bands such as theta or alpha is associated with processes such as fatigue or attention shifting. Because the goal of this work is to develop general-purpose EEG features, we use a series of band-pass filtered signals covering the low-frequency spectrum, instead of exclusively choosing a specific band.
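The soft assignment, epoch histogram, and sub-interval concatenation described in Sections 2.3 and 2.4.1 chain together naturally. The sketch below uses a Gaussian of the Euclidean distance as the soft-assignment weight, one plausible reading of the weighting in [17]; function names and defaults are ours.

```python
import numpy as np

def soft_assign(frame, dictionary, sigma=1.0):
    """Soft assignment: weight each dictionary word by a Gaussian of its
    Euclidean distance to the frame, normalized to sum to 1.  sigma is
    the empirically tuned smoothing parameter."""
    d = np.linalg.norm(dictionary - frame, axis=1)
    w = np.exp(-d ** 2 / (2 * sigma ** 2))
    return w / w.sum()

def epoch_feature(frames, dictionary, n_sub=1, sigma=1.0):
    """Histogram-of-words epoch feature: average the per-frame word
    probabilities, optionally over n_sub consecutive sub-intervals whose
    unit-length histograms are concatenated to keep coarse temporal
    order."""
    probs = np.array([soft_assign(f, dictionary, sigma) for f in frames])
    hists = []
    for part in np.array_split(probs, n_sub):
        h = part.mean(axis=0)
        hists.append(h / np.linalg.norm(h))
    return np.concatenate(hists)
```

With `n_sub=1` this reproduces the plain epoch histogram of Section 2.3.2 (length equal to the dictionary size); with `n_sub>1` the feature length grows to `n_sub` times the dictionary size, trading dimensionality for temporal resolution.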
If we use M bands for filtering, we obtain M word lists from the M filtered data bands. The feature is the concatenation of these M word lists. We also create a separate dictionary for each frequency band.

TABLE 2 PHYSICALLY RELEVANT EEG FREQUENCY BANDS [18].

Band  | Frequency  | Location
Delta | 0 ~ 4 Hz   | Frontally in adults, posteriorly in children
Theta | 4 ~ 7 Hz   | Lateralized or diffuse
Alpha | 8 ~ 15 Hz  | Posterior regions of the head, both sides; central sites (C3-C4) at rest
Beta  | 16 ~ 31 Hz | Both sides, symmetrical distribution
Gamma | 32 Hz ~    | Somatosensory cortex
Mu    | 8 ~ 12 Hz  | Sensorimotor cortex
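The M-band split can be illustrated with a simple FFT-masking decomposition. The report itself uses EEGLAB FIR filters (pop_eegfiltnew); this numpy-only stand-in just shows the sub-band structure, with band edges matching the eight sub-bands of Figure 5 (the last band runs from 28 Hz up).

```python
import numpy as np

def band_split(signal, fs, edges=(0, 4, 8, 12, 16, 20, 24, 28)):
    """Split a 1-D signal into len(edges) sub-bands by zeroing all FFT
    bins outside each band and inverting.  fs: sampling rate in Hz."""
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    spec = np.fft.rfft(signal)
    lows = edges
    highs = edges[1:] + (fs / 2 + 1,)   # last band: 28 Hz up to Nyquist
    bands = []
    for lo, hi in zip(lows, highs):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(spec * mask, n=len(signal)))
    return bands
```

Because the masks partition the frequency axis, the sub-band signals sum back to the original signal, a useful sanity check on any band decomposition.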
3. Evaluation of parameters

The proposed BOW descriptors have many parameters, including the size of the dictionary, the type of clustering method, the number of sub-intervals, and the ranges of the frequency sub-bands. This section describes the test datasets and empirically evaluates some of the parameter choices. The tests in this section use the Visual Evoked Potential (VEP) oddball task dataset from the Army Research Laboratory (ARL) [10], [19]. The dataset records EEG signals while two types of images were presented: an image of an enemy combatant (target) and an image of a U.S. soldier (non-target). As shown in Table 3, the data is not balanced, and target samples are uncommon relative to non-target samples. The ratio of target to non-target images is about 1:7. The ABM and Emotiv datasets have a similar ratio of target to non-target samples.

TABLE 3 PERCENTAGE OF EACH CLASS IN BIOSEMI DATASET.

Class | Sample number | Percentage
Non-target (label 34) |  | %
Target (label 35) |  | %
Other |  | %
Total | 13, | %

The images were presented in random order at a frequency of 0.5 Hz, and subjects were instructed to identify each image with a button press. The same subjects performed the test using the three different headsets specified in Table 4.

TABLE 4 TEST HEADSETS.

Headset | Channels (EEG + External) | Sampling rate
ABM     | 10 (9 + 1)  | 256 Hz
Biosemi | 68 (64 + 4) | 512 Hz
Emotiv  | 14 (14 + 0) | 128 Hz

Before extracting feature vectors and performing the classification tests, we used the fully automated PREP preprocessing pipeline to remove experimental artifacts from the data. The data is high-pass FIR filtered at 1 Hz, and line noise is removed using a multispectral tapering technique. Bad channels are identified based on four criteria: extreme amplitudes (deviation criterion), lack of correlation with any other channel (correlation criterion), lack of predictability by other channels (predictability criterion), and unusually high frequency noise (noisiness criterion) [20].
After interpolating the removed bad channels, we removed the remaining noise and subject-generated artifacts using the Artifact Subspace Reconstruction (ASR) method [21], implemented as part of the clean_rawdata 0.31 EEGLAB plug-in [22]. We ran ASR with the default parameters except for a burst criterion parameter of 20, in order to retain as much signal as possible. We sub-band-pass filtered the cleaned data using the EEGLAB pop_eegfiltnew function and then normalized the data in two steps. In the first step, we normalized each channel to have zero mean and unit standard deviation so that all channels have the same scale. In this study, we use feature vectors extracted from each frame, so we additionally normalized each frame to remove scale differences between frames. After normalization, each one-second EEG epoch is represented using the proposed BOW histogram features. Finally, we apply PCA to the extracted features to reduce the dimension.
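The final PCA step can be sketched with an SVD-based projection. This is a generic stand-in; the report does not state how many components were retained, so `n_components` here is a hypothetical parameter.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Reduce BOW feature vectors with PCA via SVD: center the data and
    project it onto the top right singular vectors (principal axes)."""
    X = np.asarray(features, float)
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores on the top axes
```

The returned components are ordered by decreasing variance, so truncating to `n_components` keeps the directions along which the epoch features vary most.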
To test various parameter options for generating histogram features, we compared classification results using the parameters summarized in Table 5. Because of the complexity of the parameter choices, we chose to vary each parameter independently. With the exception of Section 3.6, the following sections report within-subject tests, in which the training samples and test samples are from the same subject. To pick balanced samples, we randomly pick the same number of samples from each class. The measures for the within-subject tests are the average classification accuracies over 14 test subjects. Other details of the tests are explained in the following sub-sections.

TABLE 5 PARAMETERS FOR EVALUATION.

Parameter | Options | Default
Clustering methods | k-means, k-centers, k-medoids, subtractive clustering, affinity propagation | k-means
Size of a dictionary | 50, 100, 150, 200, 250, 300 words | 100
Number of sub-intervals | 1 ~ 16 | 1
Number of sub-bands | 1, 2, 4, 8, 12, 16 | 1
Classifier | Linear discriminant analysis (LDA), ARRLS [23] | LDA
Frames | Only peak frames; peak frames with gaps filled; entire frames | Only peak frames
Descriptor | Normalized frames, gradient descriptor | Normalized frames

3.1 Gradient descriptor versus normalized frame descriptor

To describe EEG frames, we compared two different features: the gradient descriptor and the normalized frame descriptor. Table 6 shows the classification performance when these two features were used in a bag-of-words model. To compare the two features, we built a bag-of-words model using each descriptor with the Biosemi 64-channel data and ran a classification test in which the training set and the test set were from different headsets. The overall goal is to find robust features that transfer across headsets, subjects, and paradigms. As explained in Section 2.1, gradient descriptors are susceptible to differences in channel configurations.
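The balanced sampling used throughout the within-subject tests, picking the same number of samples per class, can be sketched as follows (a generic sketch; the function name is ours):

```python
import numpy as np

def balanced_subsample(labels, seed=0):
    """Return sorted indices of an equal number of samples per class,
    namely the size of the smallest class, chosen at random."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    n = counts.min()
    idx = [rng.choice(np.where(labels == c)[0], n, replace=False)
           for c in classes]
    return np.sort(np.concatenate(idx))
```

For unbalanced data like the roughly 1:7 target/non-target split in Table 3, this discards most of the majority class so that the classifier sees both classes equally often.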
Table 6 compares the performance for different training headsets when the test headset is the 64-channel Biosemi headset. For this test, we used the default parameters of Table 5 (k-means clustering, 100-word dictionary, one sub-interval, one sub-band, LDA classifier) and performed the within-subject classification test, in which training samples and test samples are from the same subject. This is a difficult test, as neither the ABM nor the Emotiv headset has good coverage of the scalp. If the training and test headsets are both the Biosemi headset, the gradient descriptor and the normalized frame descriptor have similar performance. However, if the training headset is the Emotiv, the performance of the gradient descriptor degrades more than that of the normalized frame descriptor.
TABLE 6 DESCRIPTOR COMPARISON FOR WITHIN SUBJECT CLASSIFICATION (TEST HEADSET: BIOSEMI).

Training headset | Gradient descriptor (Accuracy / Drop) | Normalized frame descriptor (Accuracy / Drop)
Biosemi |  |
ABM |  |
Emotiv |  |

In both cases, the accuracy drops considerably for training headsets with low numbers of channels, although the gradient descriptor appears to do slightly worse. We use the normalized frame descriptor for representing the various EEG data in the remainder of the tests.

3.2 Evaluation of the dictionary sizes

To find the best size for the dictionary, we ran the within-subject classification test on the VEP datasets with dictionary sizes ranging from 50 to 300. The within-subject classification test uses training samples and test samples from the same headset and the same subject. These tests used the dictionary built from the 64-channel Biosemi data with no sub-bands and no sub-intervals. The VEP dataset includes the three types of headsets listed in Table 4; we assume that training and test samples are from the same headset. The test uses the default parameters in Table 5, with dictionary sizes varying from 50 to 300. Table 7 shows the average accuracies of 10 runs for the various dictionary sizes. As shown in Table 7, the classification performance does not appear to be very sensitive to the size of the dictionary. We chose 100 words as the default dictionary size.

TABLE 7 COMPARISON OF VARIOUS DICTIONARY SIZES FOR WITHIN SUBJECT CLASSIFICATION.

Word number | ABM | Biosemi | Emotiv | Average

3.3 Evaluation of clustering methods

Table 3 shows the percentage of each class in our test EEG data. According to Table 3, the data is not balanced, and the interesting target samples are uncommon relative to the less-interesting non-target samples. If the clustering method used to build the dictionary is sensitive to the bias of the data, the dictionary built from the data could be biased toward these unimportant but dominant samples.
The low-dimensional example of Figure 8 shows that k-means and k-means++ assign more clusters to the dominant (dense) areas, while subtractive and affinity clustering assign clusters more evenly throughout the sample space. However, it is not clear how well these observations apply to very high-dimensional data. We therefore performed a more direct test of how sensitive the overall feature representation is to the choice of clustering procedure: a within-subject classification test with five different dictionaries built using five different clustering methods. Table 8 shows the average accuracies over 10 runs. Although affinity propagation is slightly better than the other clustering methods for unbalanced data, the difference in accuracy does not appear to be sufficient to warrant the significant increase in execution time required to compute the affinity clustering results. The k-means clustering method is also available in a GPU implementation, making computation much faster [24].
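As an illustration of the dictionary-building step, the sketch below clusters normalized frame vectors into a 100-word dictionary with k-means and a small number of replicates. It is a minimal Python example using scikit-learn rather than the GPU library of [24]; the function name `build_dictionary` and the random "frame" data are hypothetical, not from the report.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_dictionary(frames, n_words=100, n_replicates=3, seed=0):
    """Cluster normalized frame vectors into a BOW dictionary.

    frames: (n_frames, n_features) array of frame descriptors.
    Returns the (n_words, n_features) cluster centers (the "words").
    """
    km = KMeans(n_clusters=n_words, n_init=n_replicates, random_state=seed)
    km.fit(frames)
    return km.cluster_centers_

# Stand-in data: 5000 random "frames" of dimension 64 (one value per channel)
rng = np.random.default_rng(0)
frames = rng.standard_normal((5000, 64))
dictionary = build_dictionary(frames)
print(dictionary.shape)  # (100, 64)
```

Each of the `n_replicates` runs restarts k-means from different random seeds and the best clustering (lowest inertia) is kept, matching the two-to-three-replicate recommendation below.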
TABLE 8 ACCURACY FOR WITHIN SUBJECT CLASSIFICATION WITH BOW DICTIONARIES BUILT USING DIFFERENT CLUSTERING METHODS. Columns: k-means, k-centers, k-medoids, affinity propagation, and subtractive clustering; rows: ABM, Biosemi, Emotiv, and Average accuracy (%).

For clustering methods that use random seeds, we usually repeat the clustering more than once to find the best seeds. Table 9 shows the classification accuracies of three clustering methods that use random seeds, each with five replicates. As shown in the table, there is very little difference among replicates. Based on these results, we suggest using k-means clustering with two or three replicates to build a suitable dictionary.

TABLE 9 WITHIN SUBJECT CLASSIFICATION ACCURACY FOR 5 CLUSTERING REPLICATES. Rows: each clustering method (k-means, k-centers, k-medoids) crossed with each headset (ABM, Biosemi, Emotiv); columns: the five replicates, their average, and their standard deviation.

3.4 Evaluation of sub-intervals
The previous sections showed the performance of bag-of-words models when the entire frequency band and the entire period of an epoch were used. The following sections show how to improve the quality of the descriptor by using sub-bands and sub-intervals as described in Section 2.4. To find the optimal number of sub-intervals, we performed within-subject classification on the VEP data with various numbers of sub-windows and a single frequency band. Table 10 shows a definite increase in performance across headsets as the number of sub-intervals increases to five. The performance increase levels off beyond eight.
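The sub-interval scheme can be sketched as follows: each epoch (channels x samples) is cut into equal-length time windows, and descriptors are then extracted per window. This is a minimal illustration; the helper name `split_subintervals` is our own, not from the report.

```python
import numpy as np

def split_subintervals(epoch, n_sub):
    """Split an epoch (channels x samples) into n_sub equal time windows.

    Trailing samples that do not divide evenly are dropped.
    """
    n_samples = epoch.shape[1]
    win = n_samples // n_sub
    return [epoch[:, i * win:(i + 1) * win] for i in range(n_sub)]

# A 1-second, 64-channel epoch sampled at 256 Hz, split into 8 sub-intervals
epoch = np.zeros((64, 256))
windows = split_subintervals(epoch, 8)
print(len(windows), windows[0].shape)  # 8 (64, 32)
```

With eight sub-intervals on a 1-second epoch, each window covers 125 ms, the value recommended later in the report.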
TABLE 10 WITHIN SUBJECT CLASSIFICATION ACCURACY FOR DIFFERENT NUMBERS OF SUB-INTERVALS. Columns: number of sub-intervals; rows: ABM, Biosemi, Emotiv, and Average.

3.5 Evaluation of sub-bands
To test the effect of frequency resolution on classification accuracy, we divided the frequency range from 0 Hz to 32 Hz into equal-sized sub-bands. For example, if the number of sub-bands is two, one band covers 0 Hz to 16 Hz and the other covers 16 Hz to 32 Hz. We use EEGLAB's pop_eegfiltnew() to extract data in specific frequency ranges [25]. To reduce the overlap between sub-bands, the transition bandwidth was set to 1 Hz. Table 11 shows the results of within-subject classification tests on the VEP datasets with various numbers of sub-bands. Except for the number of sub-bands, the tests use the default parameters: k-means clustering, a 100-word dictionary, one sub-interval, and the LDA classifier. There is a weak dependence on the number of frequency bands.

TABLE 11 WITHIN SUBJECT CLASSIFICATION ACCURACY FOR DIFFERENT NUMBERS OF FREQUENCY BANDS. Columns: number of sub-bands; rows: ABM, Biosemi, Emotiv, and Average.

When optimized independently, classification accuracy improves with an increasing number of sub-intervals and frequency bands, but the effects fall off as the time and frequency subdivisions increase beyond a certain point. Based on physical considerations and these results, a selection of eight frequency sub-bands of width 4 Hz and eight sub-intervals of length 125 ms appears to be a good choice. While we have not extensively optimized the two parameters jointly, this choice appears to be close to optimal in the cases we have examined.

3.6 Cross subject transfer learning
Because bag-of-words descriptors are relatively independent of the EEG headset configuration, these features can be combined to improve classification across different configurations. To highlight this advantage of the BOW descriptor, we performed cross-subject transfer learning tests on the VEP multi-headset data.
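The equal-width sub-band partition of Section 3.5 can be sketched in Python. The report itself uses EEGLAB's pop_eegfiltnew(); here a zero-phase Butterworth filter from SciPy serves only as a rough stand-in, and the helper names (`subband_edges`, `bandpass`) are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def subband_edges(f_max=32.0, n_bands=8):
    """Equal-width sub-band edges over [0, f_max] Hz."""
    e = np.linspace(0.0, f_max, n_bands + 1)
    return [(float(lo), float(hi)) for lo, hi in zip(e[:-1], e[1:])]

def bandpass(data, lo, hi, fs):
    """Zero-phase Butterworth band-pass; a rough stand-in for pop_eegfiltnew().

    A 0 Hz low edge is clipped to 0.5 Hz, since a band-pass filter cannot
    include DC.
    """
    sos = butter(4, [max(lo, 0.5), hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)

edges = subband_edges()  # 8 bands of width 4 Hz: (0, 4), (4, 8), ..., (28, 32)
epoch = np.random.default_rng(0).standard_normal((64, 256))  # 64 ch, 2 s at 128 Hz
theta = bandpass(epoch, *edges[1], fs=128.0)  # the 4-8 Hz component
```

With eight bands the edges fall every 4 Hz, matching the recommended configuration; pop_eegfiltnew() instead applies a windowed-sinc FIR filter with the 1 Hz transition band described above.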
As described in Table 4, the VEP test headsets have 10, 14, or 68 channels and sampling rates varying from 128 Hz to 512 Hz. To share information between the different datasets, we use the ARRLS transfer learning classifier, which tries to minimize three factors simultaneously: the structural risk, the joint distribution matching error, and the manifold consistency error [23].
For comparison, we also tested RAW features. Because each headset has different parameters, we cannot combine the datasets directly. For the RAW features, we resampled all data to 128 Hz, used the channels common to the test headset and the training headset, and took the raw signals of those common channels in an epoch as the feature vector. The BOW tests use eight sub-intervals and eight sub-bands, with the default parameters of Table 5 for the remaining parameters. For the RAW features, the ABM-to-Emotiv and Emotiv-to-ABM conditions are not available, since these headsets have no detector overlap. The BOW features are based on a dictionary built from 64-channel Biosemi EEG. Frames from each headset are matched to the best words based on the closest channels, so BOW is available for all test conditions. In our experiments, test samples and training samples may come from different headsets of the VEP multi-headset data, as shown in Table 12. In each run, we randomly pick samples from each headset and perform a leave-one-subject-out classification test: one subject serves as the test subject, and the remaining subjects serve as training subjects for predicting the unknown labels of the test subject's samples. The previous sections used the within-subject classification test, in which training and test samples come from the same subject, so their performance cannot be directly compared to the results in Table 12, which uses training and test samples from different subjects. Table 12 shows accuracy averaged over five runs. In each run, we randomly pick balanced samples from a training subject and a test subject; to get balanced samples from an imbalanced dataset, we randomly pick the same number of samples from each class with replacement. As shown in Table 12, the BOW features can be used to combine data from different headsets and show better performance than the RAW features in all cases.
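The balanced sampling step described above (the same number of samples drawn from each class, with replacement) might be sketched as follows; the helper name `balanced_sample` is illustrative, not from the report.

```python
import numpy as np

def balanced_sample(labels, n_per_class, rng=None):
    """Pick the same number of indices from each class, with replacement.

    Mirrors the report's balancing step for imbalanced target/non-target data.
    """
    rng = np.random.default_rng(rng)
    idx = []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        idx.append(rng.choice(members, size=n_per_class, replace=True))
    return np.concatenate(idx)

# 5% targets (label 1) versus 95% non-targets (label 0)
labels = np.array([1] * 50 + [0] * 950)
picked = balanced_sample(labels, n_per_class=100, rng=0)
print(len(picked))  # 200
```

Sampling with replacement is what allows the rare target class (50 samples here) to contribute the same 100 samples per run as the dominant class.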
TABLE 12 ACCURACY OF CROSS SUBJECT TRANSFER LEARNING TEST FOR THE VEP COLLECTION. Rows: each test headset (ABM, Biosemi, Emotiv) crossed with each training headset (ABM, Biosemi, Emotiv); columns: RAW* and bag-of-words accuracy. The RAW feature is not available for the ABM/Emotiv pairs.
* Uses the channels common to the test headset and the training headset.
4. Discussion
This technical report evaluates the performance of the bag-of-words descriptor for EEG data. Our comparison of the gradient descriptor and direct frame representations showed that frame representations had equivalent or better performance and were less sensitive to headset configuration. Gradient descriptors require a somewhat uniform distribution of detectors across the head to obtain an accurate estimate of the gradient, while frame descriptors can always be calculated. The sub-interval and sub-band approaches are useful for increasing the quality of descriptors, with dictionary sizes of 100 words for each subcase. However, these approaches can introduce boundary effects when used with peak frames. To mitigate boundary effects, we propose the peak-frame repeat method. As shown in the appendix, this method shows performance comparable to the approach using entire frames, while reducing the amount of resources required to process frames. In particular, the method reduces the clustering time needed to build a dictionary and also reduces the memory needed to store extracted frame representations, because it stores only peak frames. Based on our test results, we recommend using frequency bands of width 4 Hz and sub-intervals on the order of 100 ms. Intuitively, this choice gives temporal and frequency resolution at scales that are physically meaningful. Such a BOW representation of 1-second epochs gives features of length 6400, which is comparable in size to a raw representation of 1-second epochs of 64-channel EEG sampled at 128 Hz. An advantage of BOW is a common representation across headset configurations and sampling frequencies. Another important advantage is that BOW removes the exact time-locking requirements present in raw feature representations.
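The feature-length comparison in the discussion above is simple arithmetic: one 100-word histogram per (sub-band, sub-interval) cell versus a raw 64-channel, 128 Hz, 1-second epoch.

```python
# BOW: one 100-word histogram for each of 8 sub-bands x 8 sub-intervals
n_words, n_bands, n_intervals = 100, 8, 8
bow_len = n_words * n_bands * n_intervals
print(bow_len)  # 6400

# Raw baseline: a 1-second epoch, 64 channels, sampled at 128 Hz
raw_len = 64 * 128
print(raw_len)  # 8192
```

Unlike the raw vector, the 6400-dimensional BOW vector has the same length for every headset, regardless of channel count or sampling rate.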
Our results have not shown a strong dependence on the details of the actual dictionary, in terms of either the clustering method used or the number of replicates needed to find an optimal clustering. These results suggest that it is possible to use a single dictionary across headsets and experimental paradigms without losing much resolution. This is an important finding, as it allows the use of efficient GPU k-means clustering to produce dictionaries that are broadly applicable across headsets and paradigms.

5. Acknowledgments
The authors gratefully acknowledge the use of the data from the VEP headset comparison study and thank their collaborators at ARL, particularly David Hairston, Scott Kerick, and Vernon Lawhern. This research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. Computational support for this project was provided by the Computational System Biology Core, funded by the National Institute on Minority Health and Health Disparities (G12MD007591) of the National Institutes of Health.
References
[1] F. S. Bao, X. Liu, and C. Zhang, "PyEEG: An Open Source Python Module for EEG/MEG Feature Extraction," Comput. Intell. Neurosci., vol. 2011, pp. 1-7, 2011.
[2] M. M. Murray, D. Brunet, and C. M. Michel, "Topographic ERP Analyses: A Step-by-Step Tutorial Review," Brain Topogr., vol. 20, no. 4, Jun. 2008.
[3] D. Brunet, M. M. Murray, and C. M. Michel, "Spatiotemporal Analysis of Multichannel EEG: CARTOOL," Comput. Intell. Neurosci., vol. 2011, Jan. 2011.
[4] G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Workshop on Statistical Learning in Computer Vision, ECCV, 2004.
[5] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Int. J. Comput. Vis., vol. 60, no. 2, Nov. 2004.
[6] D. Lehmann, H. Ozaki, and I. Pal, "EEG alpha map series: brain micro-states by space-oriented adaptive segmentation," Electroencephalogr. Clin. Neurophysiol., vol. 67, no. 3, Sep. 1987.
[7] BESA Brain Electrical Source Analysis. [Online]. [Accessed: 19-Nov-2013].
[8] G. Schalk, D. J. McFarland, T. Hinterberger, N. Birbaumer, and J. R. Wolpaw, "BCI2000: A General-purpose Brain-Computer Interface (BCI) System," IEEE Trans. Biomed. Eng., vol. 51, no. 6, Jun. 2004.
[9] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, "PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals," Circulation, vol. 101, no. 23, pp. e215-e220, Jun. 2000.
[10] W. D. Hairston, K. W. Whitaker, A. J. Ries, J. M. Vettel, J. C. Bradford, S. E. Kerick, and K. McDowell, "Usability of four commercially-oriented EEG systems," J. Neural Eng., vol. 11, no. 4, Aug. 2014.
[11] D. Arthur and S. Vassilvitskii, "K-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, 2007.
[12] S. Dasgupta and P. M. Long, "Performance guarantees for hierarchical clustering," J.
Comput. Syst. Sci., vol. 70, no. 4, Jun. 2005.
[13] H.-S. Park and C.-H. Jun, "A simple and fast algorithm for K-medoids clustering," Expert Syst. Appl., vol. 36, no. 2, Part 2, Mar. 2009.
[14] S. L. Chiu, "Fuzzy model identification based on cluster estimation," J. Intell. Fuzzy Syst., vol. 2, 1994.
[15] B. J. Frey and D. Dueck, "Clustering by Passing Messages Between Data Points," Science, vol. 315, no. 5814, Feb. 2007.
[16] R. Oostenveld and P. Praamstra, "The five percent electrode system for high-resolution EEG and ERP measurements," Clin. Neurophysiol., vol. 112, no. 4, Apr. 2001.
[17] J. C. van Gemert, C. J. Veenman, A. W. M. Smeulders, and J.-M. Geusebroek, "Visual Word Ambiguity," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 7, Jul. 2010.
[18] "Electroencephalography," Wikipedia, the free encyclopedia. 03-Feb.
[19] A. J. Ries, "A Comparison of Electroencephalography Signals Acquired from Conventional and Mobile Systems," J. Neurosci. Neuroengineering, vol. 3, no. 1, 2014.
[20] R. Kay, "EEG-Clean-Tools," GitHub. [Online]. [Accessed: 03-Apr-2015].
[21] T. Mullen, C. Kothe, Y. M. Chi, A. Ojeda, T. Kerth, S. Makeig, G. Cauwenberghs, and T.-P. Jung, "Real-time modeling and 3D visualization of source dynamics and connectivity using wearable EEG," in 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2013.
[22] "Plugin list process," SCCN. [Online]. [Accessed: 17-Apr-2015].
[23] M. Long, J. Wang, G. Ding, S. J. Pan, and P. S. Yu, "Adaptation Regularization: A General Framework for Transfer Learning," IEEE Trans. Knowl. Data Eng., vol. 26, no. 5, May 2014.
[24] K. J. Kohlhoff, M. H. Sosnick, W. T. Hsu, V. S. Pande, and R. B. Altman, "CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms," Bioinformatics, vol. 27, no. 16, Aug. 2011.
[25] A. Delorme and S. Makeig, "EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis," J. Neurosci. Methods, vol. 134, no. 1, pp. 9-21, Mar. 2004.
Appendix
Within subject classification accuracy using only peak frames. For each headset (ABM, Biosemi, and Emotiv), the table rows correspond to the number of sub-bands and the columns to the number of sub-intervals.
* Error during LDA: pooled variance of the training data is not positive.
More informationAnalysis, Synthesis, and Perception of Musical Sounds
Analysis, Synthesis, and Perception of Musical Sounds The Sound of Music James W. Beauchamp Editor University of Illinois at Urbana, USA 4y Springer Contents Preface Acknowledgments vii xv 1. Analysis
More informationRobust Transmission of H.264/AVC Video Using 64-QAM and Unequal Error Protection
Robust Transmission of H.264/AVC Video Using 64-QAM and Unequal Error Protection Ahmed B. Abdurrhman, Michael E. Woodward, and Vasileios Theodorakopoulos School of Informatics, Department of Computing,
More informationHeart Rate Variability Preparing Data for Analysis Using AcqKnowledge
APPLICATION NOTE 42 Aero Camino, Goleta, CA 93117 Tel (805) 685-0066 Fax (805) 685-0067 info@biopac.com www.biopac.com 01.06.2016 Application Note 233 Heart Rate Variability Preparing Data for Analysis
More informationPredicting Performance of PESQ in Case of Single Frame Losses
Predicting Performance of PESQ in Case of Single Frame Losses Christian Hoene, Enhtuya Dulamsuren-Lalla Technical University of Berlin, Germany Fax: +49 30 31423819 Email: hoene@ieee.org Abstract ITU s
More informationUC San Diego UC San Diego Previously Published Works
UC San Diego UC San Diego Previously Published Works Title Classification of MPEG-2 Transport Stream Packet Loss Visibility Permalink https://escholarship.org/uc/item/9wk791h Authors Shin, J Cosman, P
More informationWhite Paper. Uniform Luminance Technology. What s inside? What is non-uniformity and noise in LCDs? Why is it a problem? How is it solved?
White Paper Uniform Luminance Technology What s inside? What is non-uniformity and noise in LCDs? Why is it a problem? How is it solved? Tom Kimpe Manager Technology & Innovation Group Barco Medical Imaging
More informationTorsional vibration analysis in ArtemiS SUITE 1
02/18 in ArtemiS SUITE 1 Introduction 1 Revolution speed information as a separate analog channel 1 Revolution speed information as a digital pulse channel 2 Proceeding and general notes 3 Application
More informationAPPLICATIONS OF DIGITAL IMAGE ENHANCEMENT TECHNIQUES FOR IMPROVED
APPLICATIONS OF DIGITAL IMAGE ENHANCEMENT TECHNIQUES FOR IMPROVED ULTRASONIC IMAGING OF DEFECTS IN COMPOSITE MATERIALS Brian G. Frock and Richard W. Martin University of Dayton Research Institute Dayton,
More informationAUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION
AUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION Halfdan Rump, Shigeki Miyabe, Emiru Tsunoo, Nobukata Ono, Shigeki Sagama The University of Tokyo, Graduate
More informationDAT335 Music Perception and Cognition Cogswell Polytechnical College Spring Week 6 Class Notes
DAT335 Music Perception and Cognition Cogswell Polytechnical College Spring 2009 Week 6 Class Notes Pitch Perception Introduction Pitch may be described as that attribute of auditory sensation in terms
More informationHUMAN PERCEPTION AND COMPUTER EXTRACTION OF MUSICAL BEAT STRENGTH
Proc. of the th Int. Conference on Digital Audio Effects (DAFx-), Hamburg, Germany, September -8, HUMAN PERCEPTION AND COMPUTER EXTRACTION OF MUSICAL BEAT STRENGTH George Tzanetakis, Georg Essl Computer
More informationRobust 3-D Video System Based on Modified Prediction Coding and Adaptive Selection Mode Error Concealment Algorithm
International Journal of Signal Processing Systems Vol. 2, No. 2, December 2014 Robust 3-D Video System Based on Modified Prediction Coding and Adaptive Selection Mode Error Concealment Algorithm Walid
More informationAutomatic Laughter Detection
Automatic Laughter Detection Mary Knox Final Project (EECS 94) knoxm@eecs.berkeley.edu December 1, 006 1 Introduction Laughter is a powerful cue in communication. It communicates to listeners the emotional
More informationAutomatic Piano Music Transcription
Automatic Piano Music Transcription Jianyu Fan Qiuhan Wang Xin Li Jianyu.Fan.Gr@dartmouth.edu Qiuhan.Wang.Gr@dartmouth.edu Xi.Li.Gr@dartmouth.edu 1. Introduction Writing down the score while listening
More informationDevelopment of 16-channels Compact EEG System Using Real-time High-speed Wireless Transmission
Engineering, 2013, 5, 93-97 doi:10.4236/eng.2013.55b019 Published Online May 2013 (http://www.scirp.org/journal/eng) Development of 16-channels Compact EEG System Using Real-time High-speed Wireless Transmission
More informationEffects of lag and frame rate on various tracking tasks
This document was created with FrameMaker 4. Effects of lag and frame rate on various tracking tasks Steve Bryson Computer Sciences Corporation Applied Research Branch, Numerical Aerodynamics Simulation
More informationHigh Quality Digital Video Processing: Technology and Methods
High Quality Digital Video Processing: Technology and Methods IEEE Computer Society Invited Presentation Dr. Jorge E. Caviedes Principal Engineer Digital Home Group Intel Corporation LEGAL INFORMATION
More informationHands-on session on timing analysis
Amsterdam 2010 Hands-on session on timing analysis Introduction During this session, we ll approach some basic tasks in timing analysis of x-ray time series, with particular emphasis on the typical signals
More informationChord Classification of an Audio Signal using Artificial Neural Network
Chord Classification of an Audio Signal using Artificial Neural Network Ronesh Shrestha Student, Department of Electrical and Electronic Engineering, Kathmandu University, Dhulikhel, Nepal ---------------------------------------------------------------------***---------------------------------------------------------------------
More informationType-2 Fuzzy Logic Sensor Fusion for Fire Detection Robots
Proceedings of the 2 nd International Conference of Control, Dynamic Systems, and Robotics Ottawa, Ontario, Canada, May 7 8, 2015 Paper No. 187 Type-2 Fuzzy Logic Sensor Fusion for Fire Detection Robots
More informationCompleting Cooperative Task by Utilizing EEGbased Brain Computer Interface
Washington University in St. Louis Washington University Open Scholarship Engineering and Applied Science Theses & Dissertations Engineering and Applied Science Spring 5-18-2018 Completing Cooperative
More informationA PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES
12th International Society for Music Information Retrieval Conference (ISMIR 2011) A PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES Erdem Unal 1 Elaine Chew 2 Panayiotis Georgiou
More informationUniversity of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005.
Wang, D., Canagarajah, CN., & Bull, DR. (2005). S frame design for multiple description video coding. In IEEE International Symposium on Circuits and Systems (ISCAS) Kobe, Japan (Vol. 3, pp. 19 - ). Institute
More informationAn Efficient Reduction of Area in Multistandard Transform Core
An Efficient Reduction of Area in Multistandard Transform Core A. Shanmuga Priya 1, Dr. T. K. Shanthi 2 1 PG scholar, Applied Electronics, Department of ECE, 2 Assosiate Professor, Department of ECE Thanthai
More informationPython Quick-Look Utilities for Ground WFC3 Images
Instrument Science Report WFC3 2008-002 Python Quick-Look Utilities for Ground WFC3 Images A.R. Martel January 25, 2008 ABSTRACT A Python module to process and manipulate ground WFC3 UVIS and IR images
More informationThe Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng
The Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng S. Zhu, P. Ji, W. Kuang and J. Yang Institute of Acoustics, CAS, O.21, Bei-Si-huan-Xi Road, 100190 Beijing,
More informationGetting Started. Connect green audio output of SpikerBox/SpikerShield using green cable to your headphones input on iphone/ipad.
Getting Started First thing you should do is to connect your iphone or ipad to SpikerBox with a green smartphone cable. Green cable comes with designators on each end of the cable ( Smartphone and SpikerBox
More informationTERRESTRIAL broadcasting of digital television (DTV)
IEEE TRANSACTIONS ON BROADCASTING, VOL 51, NO 1, MARCH 2005 133 Fast Initialization of Equalizers for VSB-Based DTV Transceivers in Multipath Channel Jong-Moon Kim and Yong-Hwan Lee Abstract This paper
More informationOptimized Color Based Compression
Optimized Color Based Compression 1 K.P.SONIA FENCY, 2 C.FELSY 1 PG Student, Department Of Computer Science Ponjesly College Of Engineering Nagercoil,Tamilnadu, India 2 Asst. Professor, Department Of Computer
More informationCS 1674: Intro to Computer Vision. Face Detection. Prof. Adriana Kovashka University of Pittsburgh November 7, 2016
CS 1674: Intro to Computer Vision Face Detection Prof. Adriana Kovashka University of Pittsburgh November 7, 2016 Today Window-based generic object detection basic pipeline boosting classifiers face detection
More informationDeepID: Deep Learning for Face Recognition. Department of Electronic Engineering,
DeepID: Deep Learning for Face Recognition Xiaogang Wang Department of Electronic Engineering, The Chinese University i of Hong Kong Machine Learning with Big Data Machine learning with small data: overfitting,
More informationMusic Recommendation from Song Sets
Music Recommendation from Song Sets Beth Logan Cambridge Research Laboratory HP Laboratories Cambridge HPL-2004-148 August 30, 2004* E-mail: Beth.Logan@hp.com music analysis, information retrieval, multimedia
More informationResearch Article Design and Analysis of a High Secure Video Encryption Algorithm with Integrated Compression and Denoising Block
Research Journal of Applied Sciences, Engineering and Technology 11(6): 603-609, 2015 DOI: 10.19026/rjaset.11.2019 ISSN: 2040-7459; e-issn: 2040-7467 2015 Maxwell Scientific Publication Corp. Submitted:
More informationA Framework for Segmentation of Interview Videos
A Framework for Segmentation of Interview Videos Omar Javed, Sohaib Khan, Zeeshan Rasheed, Mubarak Shah Computer Vision Lab School of Electrical Engineering and Computer Science University of Central Florida
More information