More than async #14

Closed
@Atry

We are implementing asynchronous computing in DeepLearning.scala 2.0.
However, to maximize throughput, we need an on-device computing graph instead of CPU-driven asynchronous computing.

In DeepLearning.scala 3.0, we will implement an applicative-based computing graph, avoiding flatMap and map. We will keep a suitable number of kernels, e.g. 3, in the on-device command queue. Most on-CPU Futures will then wait for the command queue to become available instead of waiting for a result.
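
A minimal sketch of why an applicative-style graph helps, assuming hypothetical types `Graph`, `Input` and `Kernel2` that are not part of DeepLearning.scala. Because no node is built from a `flatMap` continuation, the whole graph is plain data, so the list of kernels to launch can be computed before anything runs on the device:

```scala
// Hypothetical illustration only; these types are not the DeepLearning.scala API.
sealed trait Graph[A]

// A leaf: some value already resident on the device.
final case class Input[A](name: String) extends Graph[A]

// A binary kernel whose two operands are themselves graph nodes.
// There is no function from a runtime value to the next node (no flatMap),
// so the structure is fully known up front.
final case class Kernel2[A, B, C](name: String, left: Graph[A], right: Graph[B])
    extends Graph[C]

// Walk the static graph and list kernel names in dependency order.
def kernels(g: Graph[_]): List[String] = g match {
  case Input(_)                  => Nil
  case Kernel2(name, left, right) => kernels(left) ++ kernels(right) :+ name
}

// x * w + b, described without executing anything.
val example: Graph[Float] =
  Kernel2[Float, Float, Float](
    "add",
    Kernel2[Float, Float, Float]("mul", Input[Float]("x"), Input[Float]("w")),
    Input[Float]("b")
  )
```

With a monadic `flatMap`, the second kernel would be hidden inside a closure that only runs after the first result reaches the CPU, which is the round trip this design tries to avoid.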

  • Count the load of the command queue
  • Make tuples of Buffer and Event
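
The two tasks above could be sketched as follows. This is a CPU-side simulation under stated assumptions, not the planned implementation: `Buffer`, `Event` and `BoundedCommandQueue` are hypothetical stand-ins, and no real OpenCL calls are made. The queue counts its load, admits at most `capacity` kernels, and hands callers a `Future` that completes when a slot is free rather than when a result is ready. Each enqueued kernel returns its output buffer paired with the event that signals completion, so a downstream kernel can depend on the event without the CPU waiting for it.

```scala
import scala.collection.mutable
import scala.concurrent.{Future, Promise}

// Hypothetical stand-ins for device objects; not real OpenCL or DeepLearning.scala types.
final case class Event(id: Int)
final case class Buffer(name: String)

final class BoundedCommandQueue(capacity: Int) {
  private var load = 0          // kernels currently occupying the queue
  private var nextEventId = 0
  private val waiting = mutable.Queue.empty[Promise[Unit]]

  def currentLoad: Int = synchronized(load)

  // Completes once a slot is free; the caller waits for the queue, not for a result.
  def acquireSlot(): Future[Unit] = synchronized {
    if (load < capacity) {
      load += 1
      Future.successful(())
    } else {
      val slot = Promise[Unit]()
      waiting.enqueue(slot)
      slot.future
    }
  }

  // To be called when the device reports that a kernel has finished.
  def kernelFinished(): Unit = synchronized {
    if (waiting.nonEmpty) waiting.dequeue().success(()) // pass the slot to the next waiter
    else load -= 1
  }

  // Simulated enqueue: `inputs` carry the events the kernel must wait on;
  // the result pairs the output buffer with a fresh completion event.
  def enqueue(output: String, inputs: Seq[(Buffer, Event)]): (Buffer, Event) =
    synchronized {
      nextEventId += 1
      (Buffer(output), Event(nextEventId))
    }
}
```

With `capacity = 3`, the fourth call to `acquireSlot()` returns a pending `Future` that completes as soon as `kernelFinished()` is called once.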
