A GCC/libstdc++ implementation of the C++17 Parallel STL 'par_offload' execution policy, which can offload execution to HSA Full Profile devices.
This repository contains an experimental proof-of-concept of a
heterogeneous offload-capable Parallel STL. It introduces a new execution
policy which attempts to offload the execution to a device that supports HSA 1.0 Full Profile.
When the new ‘par_offload’ policy is used, the implementation tries to offload
the algorithm launch through the HSA runtime, and if it fails for
some reason, it gracefully falls back to host (CPU) execution.
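The try-then-fall-back behaviour described above can be pictured with a small sketch. This is not the project's actual implementation: try_offload_transform is a hypothetical stub that always reports failure (a real offload attempt would go through the HSA runtime), so the host path is what runs on any machine. The function names are made up for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an offload attempt. A real implementation would
// launch the algorithm through the HSA runtime; this stub always reports
// failure so that the host path below is exercised everywhere.
template <class It, class Out, class Fn>
bool try_offload_transform(It, It, Out, Fn) {
    return false;  // pretend no usable HSA device was found
}

// Try the device first; if that fails, run the same algorithm on the host.
// Returns true when the work was offloaded, false when the host did it.
template <class It, class Out, class Fn>
bool transform_with_fallback(It first, It last, Out out, Fn fn) {
    if (try_offload_transform(first, last, out, fn))
        return true;                       // offloaded to the device
    std::transform(first, last, out, fn);  // host (CPU) fallback
    return false;
}

// Small demonstration helper: doubles every element of the input.
// If `offloaded` is non-null, it receives which path was taken.
std::vector<int> doubled(const std::vector<int>& in, bool* offloaded = nullptr) {
    std::vector<int> out(in.size());
    bool on_device = transform_with_fallback(in.begin(), in.end(), out.begin(),
                                             [](int x) { return x * 2; });
    assert(out.size() == in.size());
    if (offloaded) *offloaded = on_device;
    return out;
}
```

The caller sees the same result either way; only the returned flag tells the two paths apart, which mirrors the "silent" fallback the policy aims for.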
NOTE: this project is in a preview stage and has received practically
no proper testing outside the included test suite. It is also not yet
optimized for any particular target.
Add #include <experimental/par_offload> to your sources, after the include
for the PSTL implementation you want to fall back to.
Then build your program with the ‘-fpar_offload’ switch:
g++ --std=c++17 -fpar_offload program.cc -o program
And run it normally:
./program
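A minimal sketch of what such a program might contain is shown below. Several details are assumptions rather than facts taken from this README: the spelling of the policy object (std::experimental::execution::par_offload), the use of <execution> as the PSTL implementation to fall back to, and the choice of std::for_each as the algorithm. Check the installed header for the real names. The sketch is guarded so that it still builds as plain host code on a toolchain without the par_offload header.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: <experimental/par_offload> exists only on a toolchain built
// from this repository. Elsewhere the plain host version is compiled.
#if defined(__has_include)
#  if __has_include(<experimental/par_offload>)
#    include <execution>                 // assumed PSTL implementation to fall back to
#    include <experimental/par_offload>  // included after the PSTL header
#    define HAVE_PAR_OFFLOAD 1
#  endif
#endif

// Adds 1 to every element, requesting offload when the policy is available.
void increment_all(std::vector<int>& v) {
#ifdef HAVE_PAR_OFFLOAD
    // Assumed spelling of the policy object; verify it against the header.
    std::for_each(std::experimental::execution::par_offload,
                  v.begin(), v.end(), [](int& x) { x += 1; });
#else
    // No offload support detected at compile time: ordinary host execution.
    std::for_each(v.begin(), v.end(), [](int& x) { x += 1; });
#endif
}
```

Calling increment_all on a vector holding 1, 2, 3 leaves it holding 2, 3, 4 regardless of where the work ran. Whether a given algorithm is actually offloaded depends on the project's list of supported algorithms.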
If you have a working HSA 1.0 Full Profile environment installed, the program
may offload the algorithm’s execution silently. To verify that offloading actually
happens, set the environment variable HSA_DEBUG to 1 to get verbose output.
Currently the following algorithms can be offloaded:
An incomplete list of improvements/fixes to do:
par_offload was developed by Parmance engineers
for General Processor Technologies
who published the code to the open source community in May 2018.
It is heavily based on the excellent GCC HSA offloading work mostly done by
Martin Jambor and Martin Liška of SUSE.