Description
I am testing ASL on a Mac mini late 2014 with Intel Iris graphics hardware running macOS 10.12.6.
./asl-flow --devices outputs:
flow 1.0
Default computation device:
platform = Apple�
device = Intel(R) Core(TM) i5-4278U CPU @ 2.60GHz�
List of all available platforms and their devices:
Platform: Apple�
Number of devices: 2
Intel(R) Core(TM) i5-4278U CPU @ 2.60GHz�
Iris�
Invoking the flow example with ./asl-flow --device Iris then gives:
ASL WARNING: Requested combination of platform(Apple�) and device(Iris) not found! Using:
platform = Apple�
device = Intel(R) Core(TM) i5-4278U CPU @ 2.60GHz�.
Data initialization... Finished
Numerics initialization... Finished
Computing...Finished
Computation statistic:
Real Time = 26.8742; Processor Time = 98.7167; Processor Load = 367.329%
I found that the platform/device information strings contain a trailing '\0' character, while the parameter strings delivered by boost do not. As a result, the comparison (device == getDeviceName(queues[i])) on line 106 in aclHardware.cxx fails, since "Iris\0" is different from "Iris".
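To illustrate why the comparison fails, here is a minimal sketch. std::string compares by length and content, so an embedded trailing NUL makes the two strings unequal even though they print identically. The helper names withTrailingNul and differOnlyByTrailingNul are hypothetical and not part of ASL; that the queried name carries the terminator because the reported length includes it is an assumption.

```cpp
#include <string>

// Hypothetical helper: mimics a device name whose stored length
// includes the terminating NUL (assumed to be what happens in ASL).
std::string withTrailingNul(const std::string& name)
{
    return name + '\0';
}

// Returns true when `queried` equals `requested` followed by exactly one NUL,
// i.e. the two strings look the same when printed but compare unequal.
bool differOnlyByTrailingNul(const std::string& queried,
                             const std::string& requested)
{
    return queried.size() == requested.size() + 1
        && queried.back() == '\0'
        && queried.compare(0, requested.size(), requested) == 0;
}
```

With these helpers, withTrailingNul("Iris") == std::string("Iris") evaluates to false, which matches the behaviour seen on line 106.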
I changed lines 501/502 in aslParametersManager.cxx to (with a guard so that back() is never called on an empty string):
string pf = vm["platform"].as<string>(); if (!pf.empty() && pf.back() != '\0') pf += '\0';
string dv = vm["device" ].as<string>(); if (!dv.empty() && dv.back() != '\0') dv += '\0';
acl::hardware.setDefaultQueue(pf, dv);
With that in place, the ASL simulation runs on the GPU:
Data initialization... Finished
Numerics initialization... Finished
Computing...Finished
Computation statistic:
Real Time = 13.3571; Processor Time = 1.27338; Processor Load = 9.53341%
EDIT: Now that I see this issue message, I guess it would be better to remove the trailing '\0' characters from the platform/device information strings instead, since these NULs quite obviously made it into the diagnostic output of the simulation tools.
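A minimal sketch of that alternative, assuming the names are held in std::string: strip any trailing NULs once, where the platform/device name is obtained, so that both the comparison and the diagnostic output see the clean string. The function name stripTrailingNuls is hypothetical and does not exist in ASL.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `s` with all trailing '\0'
// characters removed. NULs in the middle of the string are left alone.
std::string stripTrailingNuls(std::string s)
{
    const std::string::size_type last = s.find_last_not_of('\0');
    if (last == std::string::npos)
        s.clear();          // empty, or made up of NULs only
    else
        s.erase(last + 1);  // drop everything after the last non-NUL char
    return s;
}
```

Applied to the value returned by getDeviceName (and its platform counterpart), this would make the workaround in aslParametersManager.cxx unnecessary; where exactly to apply it in ASL is left open here.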
BTW: The locomotive example with the default parameters 0.08/1.0/10001 crashes when run on the GPU, while it finishes on the CPU. If I change dx to 0.09, it finishes on the GPU as well, after 1200 s; the CPU run took 20 times longer at a load of 400 %.