float16 support for GPU als model #661
Both training and inference times are slightly faster with fp16, but not drastically so:

This is as expected, since we're computing results in float32 and only storing them in float16.
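The "store in fp16, compute in fp32" idea can be sketched with plain numpy (this is just an illustration, not the library's GPU kernel; the array names and sizes are made up):

```python
import numpy as np

# Hypothetical factor matrices, stored in float16 to halve memory use
rng = np.random.default_rng(0)
user_factors = rng.standard_normal((1000, 128)).astype(np.float16)
item_factors = rng.standard_normal((5000, 128)).astype(np.float16)

# Inference-style scoring: upcast to float32 so both the multiply and
# the accumulation happen in 32-bit, matching the scheme described above
scores = user_factors.astype(np.float32) @ item_factors.astype(np.float32).T

print(scores.dtype)   # float32 results from float16 storage
```

On the GPU this upcast-and-accumulate step is fused into the mixed-precision matrix multiply, so the fp32 copies never need to be materialized.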
Running some quick experiments with cross-validation, I got equivalent results with both fp16 and fp32 factors. This indicates that there isn't an accuracy hit from using fp16 factors in the learned model. Running a simple experiment on the lastfm dataset:

```python
from implicit.evaluation import precision_at_k, train_test_split
from implicit.datasets.lastfm import get_lastfm
from implicit.gpu.als import AlternatingLeastSquares

_, _, ratings = get_lastfm()
train, test = train_test_split(ratings.T.tocsr())

fp_16_model = AlternatingLeastSquares(factors=128, dtype="float16")
fp_16_model.fit(train)
p = precision_at_k(fp_16_model, train, test, K=10)
print("precision@10, fp16", p)

fp_32_model = AlternatingLeastSquares(factors=128, dtype="float32")
fp_32_model.fit(train)
p = precision_at_k(fp_32_model, train, test, K=10)
print("precision@10, fp32", p)
```

prints out
(Note this was with just default hyperparameters; the goal here is to show whether the results are equivalent between fp16 and fp32, rather than to get the best possible results on the lastfm dataset.)
This adds support for using float16 factors in the GPU version of the ALS model. This halves the memory needed for the model's embeddings, while providing a small speedup in training time and virtually no difference in the accuracy of the learned model.
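The memory saving follows directly from the element sizes: float16 stores 2 bytes per value versus 4 for float32. A quick back-of-the-envelope check (the user and factor counts below are made up for illustration):

```python
import numpy as np

users, factors = 1_000_000, 128

# Bytes needed for the user-factor embedding matrix at each precision,
# computed from dtype item sizes without allocating the arrays
fp32_bytes = users * factors * np.dtype(np.float32).itemsize
fp16_bytes = users * factors * np.dtype(np.float16).itemsize

print(fp32_bytes // fp16_bytes)  # → 2, i.e. fp16 uses half the memory
```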
All computations are still performed in float32, for both training and inference. During inference this is done with mixed-precision matrix multiplications: the fp16 factors are multiplied together, with results accumulated in fp32. During training, the factors are converted from fp16 to fp32, and updates are calculated in 32-bit before being stored back as fp16.
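The training-side round-trip can be sketched in numpy as well (a simplified stand-in for the actual ALS solve; the "update" here is a dummy placeholder, not the real least-squares step):

```python
import numpy as np

rng = np.random.default_rng(42)

# Factors live in float16 between iterations
item_factors = rng.standard_normal((100, 64)).astype(np.float16)

# One training step: upcast, update in 32-bit, store back in half precision
work = item_factors.astype(np.float32)                  # fp16 -> fp32
update = 0.01 * rng.standard_normal(work.shape)         # placeholder for the ALS update
work += update.astype(np.float32)                       # arithmetic done in fp32
item_factors = work.astype(np.float16)                  # fp32 -> fp16 for storage
```

Keeping the arithmetic in fp32 avoids the overflow and rounding issues that would come from accumulating directly in half precision.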