In this paper, we propose and evaluate two parallel
implementations of Multi-dimensional Ensemble Empirical
Mode Decomposition (MEEMD) for multi-core (CPU) and
many-core (GPU) architectures. Relative to a sequential C
implementation, our double-precision GPU implementation,
using the CUDA programming model, achieves up to 48.6x
speedup on an NVIDIA Tesla C2050. Our multi-core CPU
implementation, using the OpenMP programming model,
achieves up to 11.3x speedup on two eight-core Intel Xeon
X7550 CPUs.