We present an acceleration strategy for a domain-decomposition iterative solver based on local Schur complements with respect to the skeleton of the computational domain. In the finite element tearing and interconnecting (FETI) method, this requires applying a large number of dense matrices in every iteration. To offload these kernels to Intel® Xeon Phi coprocessors, we use the Heterogeneous Active Messages (HAM) library. We also present a simple load-balancing strategy that makes efficient use of both the host CPU and the available coprocessors during each iteration.
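The abstract does not state the load-balancing rule itself. As an illustration only, one simple scheme of this kind splits the dense-matrix applications among the host and the coprocessors in proportion to the throughput each device achieved in the previous iteration. The sketch below assumes that scheme. The function name `rebalance`, its parameters, and the convention that device 0 is the host are all hypothetical, and it does not use the HAM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the paper or from HAM): distribute `total`
// dense-matrix applications among devices in proportion to the throughput
// (matrices per second) each device reached in the previous iteration.
// Assumption: index 0 is the host CPU, the remaining indices are coprocessors.
std::vector<std::size_t> rebalance(std::size_t total,
                                   const std::vector<std::size_t>& prev_counts,
                                   const std::vector<double>& prev_seconds)
{
    assert(!prev_counts.empty());
    assert(prev_counts.size() == prev_seconds.size());

    const std::size_t n = prev_counts.size();
    std::vector<double> rate(n);
    double rate_sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        assert(prev_seconds[i] > 0.0);
        rate[i] = static_cast<double>(prev_counts[i]) / prev_seconds[i];
        rate_sum += rate[i];
    }
    assert(rate_sum > 0.0);

    std::vector<std::size_t> next(n);
    std::size_t assigned = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Round each proportional share down to a whole number of matrices.
        const double share = static_cast<double>(total) * rate[i] / rate_sum;
        next[i] = static_cast<std::size_t>(share);
        assigned += next[i];
    }
    assert(assigned <= total);

    // Rounding down can leave a few matrices unassigned; give them to the host.
    next[0] += total - assigned;
    return next;
}
```

For example, if the host and one coprocessor each processed 100 matrices in 2 s and 1 s respectively, `rebalance(300, {100, 100}, {2.0, 1.0})` assigns 100 matrices to the host and 200 to the coprocessor for the next iteration. A real solver would likely also need to account for transfer costs and for matrices of different sizes, which this sketch ignores.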
