Release notes from komputation (https://github.com/sekwiatkowski/komputation/releases)

v0.12.5 (2018-01-15)
<ul>
<li>Switched CUDA C development to CLion</li>
<li>Used the <strong>JETBRAINS_IDE</strong> macro to declare CUDA's language extensions (see the sketch after this list)</li>
<li>Header include paths are now relative to the given source file</li>
<li>For real-time compilation with nvrtc, all include directives in the source code are replaced with a sequence of directives that use paths relative to the CUDA resource base directory.</li>
<li>Header files are now inferred from the source code and no longer have to be specified in kernel instructions.</li>
<li>Fixed comparisons in the binary testing kernel</li>
<li>Replaced double constants with floats</li>
<li>Removed the unused numberEntries parameter from the kernel that replaces NaNs</li>
<li>Removed unused parameter from functions used for backpropagation kernels of recurrent layers</li>
<li>Resolved a name conflict in the max-pooling kernel</li>
<li>Simplified the definition of the stack of convolutional layers in the embedding toy demo with two filter widths</li>
</ul>
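Defining CUDA's qualifiers away behind the JETBRAINS_IDE macro lets CLion's C++ parser accept kernel sources without tripping over the language extensions. A minimal sketch of the idea with assumed contents (the repository's actual header may differ); the macro is defined only for the IDE, never for nvcc or nvrtc, so compilation is unaffected:

```cuda
// Only CLion ever sees these fallback definitions: JETBRAINS_IDE is defined in
// the IDE's resolve context, not on the compiler command line.
#ifdef JETBRAINS_IDE
    #define __global__
    #define __device__
    #define __host__
    #define __shared__
    #define __constant__

    // Stand-ins for the built-in index variables so the IDE can resolve them:
    struct BuiltInIndex { unsigned int x, y, z; };
    extern BuiltInIndex threadIdx, blockIdx, blockDim, gridDim;

    void __syncthreads();
#endif
```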

v0.12.3 (2018-01-09)
<ul>
<li>Finished implementing experimental support for (fixed-length, left-to-right, vanilla) GPU-accelerated recurrent neural networks</li>
<li>Fixed the allocation of memory for the propagation result in CudaSquaredLoss</li>
<li>Added a helper function to access and print arrays on the device</li>
<li>Implemented a SumKernel to add up accumulated gradients for parameters that are used in each instance</li>
<li>Added CUDA helper functions to cooperatively copy an array and add up two arrays (see the sketch after this list)</li>
<li>Moved the entrywise CUDA activation functions to header files</li>
<li>Removed unused array fill kernels</li>
<li>Added a pointer to the maximum number of input columns in BaseCudaContinuation</li>
<li>The shared parameter is passed directly to the CPU-specific ParameterizedSeries instruction. This makes it possible to use the same entries for the CPU and CUDA implementations.</li>
<li>Removed the CUDA IDs from the ResultExtraction enumeration</li>
<li>Set the device activity function IDs to be constant</li>
<li>Added a CUDA version of the increment demo</li>
<li>Mentioned the demo in the README</li>
<li>Replaced kotlin-stdlib-jre8 with kotlin-stdlib-jdk8</li>
</ul>
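The cooperative helpers mentioned in the list above let all threads of a block share the work on a single array. A minimal sketch with assumed names and signatures (not the repository's actual helpers):

```cuda
// Each thread handles every blockDim.x-th entry; together the block covers the array.
__device__ void copyCooperatively(const float* source, float* destination, int numberEntries) {
    for (int index = threadIdx.x; index < numberEntries; index += blockDim.x) {
        destination[index] = source[index];
    }
}

// Block-cooperative element-wise addition of two arrays into a result array.
__device__ void addCooperatively(const float* a, const float* b, float* result, int numberEntries) {
    for (int index = threadIdx.x; index < numberEntries; index += blockDim.x) {
        result[index] = a[index] + b[index];
    }
}
```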

v0.12.2 (2018-01-07)
<p>Removed the projection of a zero initial state vector in the first step in CpuRecurrent</p>

v0.12.1 (2018-01-05)
<ul>
<li>The summation of gradients based on the parameter index in CudaLookup is now deterministic.</li>
<li>Removed the hash table kernel</li>
<li>Replaced the use of the hash table with a pointer to the parameter indices</li>
<li>Rewrote the group sum kernel based on information about the indices of the first occurrence of a parameter and its remaining occurrences</li>
<li>Added a kernel to add up two arrays (see the sketch after this list)</li>
<li>Fixed backward propagation in CudaStack by replacing the cuBLAS axpy operation with the use of the addition kernel</li>
<li>The input memory can now store information about duplicate occurrences.</li>
<li>Improved the name of the setters in InputMemory</li>
<li>The optimizer kernels now check if the count is strictly positive.</li>
<li>Moved reusable batch size and output entries members to BaseCudaEntryPoint</li>
<li>Increased the batch size to 16 and changed hyperparameters in the TREC demos with two filter widths.</li>
<li>Mentioned the CUDA TREC demo with two filters in the README</li>
</ul>
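The addition kernel referenced in the list above is, at its core, an element-wise sum over two arrays of equal length. A minimal sketch with assumed names (the repository's kernel and launch configuration may differ):

```cuda
// One thread per entry; the grid is sized so that every entry is covered.
__global__ void addKernel(int numberEntries, const float* a, const float* b, float* result) {
    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index < numberEntries) {
        result[index] = a[index] + b[index];
    }
}
```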

v0.12.0 (2017-12-24)
<ul>
<li>Simplified the specification of networks</li>
<li>The input dimensions over the continuations of the network are computed automatically.</li>
<li>Removed the Layer suffix from instruction factory functions</li>
<li>Overloaded the instruction factory function to simplify the specification of initialization strategies</li>
<li>Renamed Direction.Forward/Backward to Direction.LeftToRight/RightToLeft</li>
<li>Shortened "ActivationFunction" to "Activation" and "ActivationLayer" to "Activation"</li>
<li>Generalized BaseCudaEntrywiseActivationLayer to BaseCudaEntrywiseLayer</li>
<li>The specification of the minimum length is required in the lookup instruction and optional in the input instruction.</li>
<li>TREC categories are indexed based on all available training data.</li>
<li>Renamed "forward" layer to "continuation" and shortened "combination layer" to "combination"</li>
<li>Moved the architecture-specific interfaces from the general package to the respective architecture-specific packages</li>
<li>Improved the names used in SparseAccumulator and SparseUpdate</li>
<li>The series is passed on to the method of the ResultExtractionStrategy interface.</li>
<li>Introduced CpuCombinationSeries to implement the addition of the weighted previous state and the weighted current input.</li>
<li>Added the Cpu prefix to Series and ParameterizedSeries in preparation of the CUDA implementation of recurrent neural networks</li>
<li>Optimized the performance of the RNN implementation by adding the bias to the input rather than adding it at each step</li>
<li>Fixed the specification of the number of rows in CpuLogisticLoss</li>
<li>Renamed the "Negation" demo to "Not"</li>
<li>Stopped experimenting with dynamic parallelism</li>
<li>CudaIdentity now implements CudaActivation.</li>
<li>Introduced a base class for higher-order layers</li>
<li>Differentiated the CUDA continuation base class into one class for layers that change the number of columns and one class for layers that don't.</li>
<li>Reused the code for the computation of launch configurations in CudaHashing and CudaGroupSum</li>
<li>Fixed the sparse update in CudaLookup</li>
<li>Added a "copy" helper function that encapsulates System.arraycopy</li>
<li>Added a setter to InputMemory that caches all possible data</li>
<li>Clarified references to the hash table in CUDA optimizers</li>
<li>CUDA layers pass a pointer to the length of the input data and the maximum length within the batch.</li>
<li>Unified the activation instruction factory functions over the two architectures</li>
<li>Moved the concatenation layer to a separate package</li>
<li>Added an instruction for weightings with shared parameters that is separate from the instruction for the weighting layer that uses a dedicated parameter</li>
<li>The two weighting instructions inherit from the new BaseWeighting class.</li>
<li>Added instructions for the three series types: Series, ParameterizedSeries and CombinationSeries</li>
<li>Refactored the CPU RNN factory function based on the instructions</li>
<li>Continuation instructions implement HasOutputDimensions and CanSetInputDimensions, while entry point instructions only implement HasOutputDimensions.</li>
<li>Inlined some CUDA C helper functions</li>
<li>Moved the division by 2 in the squared loss function from the host to the device (see the sketch after this list)</li>
<li>Added the missing scaling of gradients in some of the optimization kernels</li>
<li>Refactored the for loops used to update entries in optimization kernels</li>
<li>Temporarily removed the CUDA forward layer tests</li>
<li>Updated the links in the README</li>
<li>Upgraded to Kotlin 1.2.10</li>
</ul>
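Applying the factor of 1/2 per entry on the device, as noted in the list above, avoids an extra host-side scaling of the reduced loss. A minimal sketch with assumed names (not the repository's actual kernel):

```cuda
// Computes 0.5 * (prediction - target)^2 per entry; a separate reduction sums the losses.
__global__ void squaredLossKernel(int numberEntries, const float* predictions, const float* targets, float* losses) {
    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index < numberEntries) {
        float difference = predictions[index] - targets[index];
        losses[index] = 0.5f * difference * difference;
    }
}
```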

v0.11.3 (2017-12-12)
<ul>
<li>Added an instruction for bidirectional recurrent layers</li>
<li>Rearranged the parameters in the factory functions of the recurrent layer and the dropout layer instruction</li>
<li>Overloaded the dropout layer instruction factory function for the case of vectorial input</li>
<li>Mentioned the bidirectional recurrent layer and the new running total demos in the README</li>
<li>Updated the TREC sample code in the README</li>
</ul>

v0.11.2 (2017-12-10)
<ul>
<li>The recurrent layer can now emit either all steps or the last step.</li>
<li>Added demos that compute the total of fixed-length and variable-length input</li>
<li>Mentioned the new recurrent layer implementation in the README</li>
<li>Included links to the demos in the README</li>
</ul>

v0.11.1 (2017-10-31)
<ul>
<li>Implemented testing support for multi-class and binary classification problems</li>
<li>Constructors of optimization instructions are now internal.</li>
<li>Removed AttentiveDecoder and the reverse demo based on that decoder</li>
<li>Removed the decoder's specific dependencies: column repetition, row summation and transposition</li>
</ul>

v0.11.0 (2017-10-27)
<ul>
<li>Implemented and tested Adam optimization for CUDA</li>
<li>Set a delta in the equality assertions of CUDA optimization tests</li>
</ul>

v0.10.6 (2017-10-27)
<ul>
<li>Fixed compilation errors in the kernels for SGD and Momentum</li>
<li>Implemented and tested Adadelta optimization for CUDA (see the sketch below)</li>
</ul>
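For reference, Adadelta keeps decaying averages of squared gradients and squared updates per parameter entry and scales each step by their ratio. A minimal sketch with assumed names and parameter layout (not the repository's actual kernel):

```cuda
// Adadelta update for one parameter entry:
//   E[g^2]  <- decay * E[g^2]  + (1 - decay) * g^2
//   step     = -sqrt(E[dx^2] + epsilon) / sqrt(E[g^2] + epsilon) * g
//   E[dx^2] <- decay * E[dx^2] + (1 - decay) * step^2
__global__ void adadeltaKernel(
    int numberEntries,
    float decay,
    float epsilon,
    float* parameters,
    const float* gradients,
    float* gradientAccumulation,
    float* updateAccumulation) {

    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index < numberEntries) {
        float gradient = gradients[index];

        float newGradientAccumulation = decay * gradientAccumulation[index] + (1.0f - decay) * gradient * gradient;
        gradientAccumulation[index] = newGradientAccumulation;

        float step = -sqrtf(updateAccumulation[index] + epsilon) / sqrtf(newGradientAccumulation + epsilon) * gradient;

        updateAccumulation[index] = decay * updateAccumulation[index] + (1.0f - decay) * step * step;

        parameters[index] += step;
    }
}
```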