How to use ggml_conv_1d? #883

chavinlo · 2024-07-06T05:02:17Z

Hello, I am trying to implement a model that makes use of nn.Conv1d in PyTorch.
I don't have much experience with C++ but I've read the MNIST examples and part of stable-diffusion.cpp. However, I can't seem to find many examples of ggml_conv_1d.

I have tried this (the following is just simplified):

// ...
ggml_tensor * input = ggml_get_tensor(ggml_loader_ctx, "input"); // Has a "ne" (that's shape, right?) of {1, 1, 297328, 1}
ggml_tensor * weight = ggml_get_tensor(ggml_loader_ctx, "weight"); // Has a "ne" of {10, 1, 512, 1} - {kernel_size, input_dim, output_dim, ???}

int stride = 5;
int padding = 0;
int dilation = 1;

ggml_tensor* output = ggml_conv_1d(ctx, weight, input, stride, padding, dilation);

And this would be the pytorch equivalent:

import torch
import torch.nn as nn
import torch.nn.functional as F

conv0 = nn.Conv1d(1, 512, 10, 5, bias=False) # This is just the declaration but the weights are imported from an already trained model
input = torch.load("inp.pt") # Shape: 1, 1, 297328
out = conv0(input) # Shape: 1, 512, 59464

I got the weight shape order from this code: https://github.com/PABannier/encodec.cpp/blob/main/encodec.cpp#L755
However, the output is just a bunch of NaNs (I think, because it's all "Ï" in the Visual Studio memory debugger). I tried diving deeper, and I think the issue is in ggml_im2col, where const int64_t OW = ggml_calc_conv_output_size(b->ne[0], a->ne[0], s0, p0, d0); evaluates to 0 (which is wrong?). While debugging, I think this is what it was calculating:
((1 + 2 × 1 − 1 × (10 − 1) − 1) / 5 + 1) which equals -0.4?
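For reference, here is a minimal sketch of that output-length formula, written from the expression above. The function name is made up for illustration, and it assumes the division is C-style integer division (truncating toward zero), which is why the result comes out as 0 rather than a fraction. It also shows what the same formula gives when the first dimension is the full signal length.

```python
def conv_output_size(input_len, kernel_size, stride, padding, dilation):
    """Output length of a 1D convolution, following the expression quoted above."""
    numerator = input_len + 2 * padding - dilation * (kernel_size - 1) - 1
    # C integer division truncates toward zero, while Python's // floors,
    # so emulate truncation explicitly for negative numerators.
    quotient = abs(numerator) // stride
    if numerator < 0:
        quotient = -quotient
    return quotient + 1


# First dimension of length 1 (what ne[0] was in the snippet above).
print(conv_output_size(1, 10, 5, 0, 1))       # 0
# First dimension holding the full signal.
print(conv_output_size(297328, 10, 5, 0, 1))  # 59464
```

The second value matches the length PyTorch reports for the same kernel size and stride.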

I've tried finding a solution, but there's barely any documentation. I have already stared at the source code for hours and don't know what to do.

This is the code I use to save the weights into GGUF format if necessary:

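import gguf
import torch

gguf_writer = gguf.GGUFWriter("test.gguf", "test0")

z = torch.load("inp.pt")  # input tensor, shape (1, 1, 297328)
z = z.numpy()
z = z.transpose(2, 1, 0)

# `model` is the already-trained PyTorch model, loaded elsewhere
model = model.requires_grad_(False)
conv0_weight = model.conv0.weight.numpy()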

gguf_writer.add_tensor("model0.test0.weight", conv0_weight, conv0_weight.shape)
gguf_writer.add_tensor("input", z, z.shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()


balisujohn · 2024-07-06T19:20:08Z

At first glance, {1, 1, 297328, 1} should probably be {297328, 1, 1, 1}.

There are some examples of ggml_conv_1d in tortoise.cpp https://github.com/balisujohn/tortoise.cpp/blob/b9bb8771c3e3fa8ccb615d24b59b42edf15ca2fa/main.cpp#L3624
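To spell out the shape convention: ggml's ne lists dimensions fastest-varying first, so it is the PyTorch shape reversed and padded with 1s up to four dimensions. A minimal sketch (the helper name is hypothetical, not part of ggml or gguf):

```python
def torch_shape_to_ggml_ne(shape, max_dims=4):
    """Reverse a PyTorch-style shape and pad with 1s, as ggml's ne is laid out."""
    ne = list(reversed(shape))
    ne += [1] * (max_dims - len(ne))
    return ne


print(torch_shape_to_ggml_ne((1, 1, 297328)))  # [297328, 1, 1, 1]
print(torch_shape_to_ggml_ne((512, 1, 10)))    # [10, 1, 512, 1]
```

So a PyTorch input of shape (1, 1, 297328) should show up in ggml with the signal length in ne[0], and the (512, 1, 10) weight with the kernel size in ne[0].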

chavinlo · 2024-07-07T00:13:55Z

> at first glance {1, 1, 297328, 1} should probably be {297328,1,1,1}
>
> There are some examples of ggml_conv_1d in tortoise.cpp https://github.com/balisujohn/tortoise.cpp/blob/b9bb8771c3e3fa8ccb615d24b59b42edf15ca2fa/main.cpp#L3624

Thanks, now the im2col tensor has a shape of {10, 59464, 1, 1}, which is closer to PyTorch's output. But the ggml_mul_mat call (L6472) then returns an empty tensor (full of -431602080.). What exactly is the first tensor of mul_mat supposed to be? im2col initializes a tensor with the shape mentioned above but with no values inside it. And the a and b tensors are only stored in result->src[0] = a; and result->src[1] = b; respectively, right? (L6607)

TL;DR: tensor a of mul_mat's arguments inside ggml_conv_1d (the one returned from im2col) is empty, so the output of conv1d is wrong.
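On what the im2col tensor is supposed to contain: conceptually it holds one row of kernel_size input samples per output position, and the matrix multiply then takes the dot product of each row with each kernel. Below is a rough pure-Python sketch of that idea, not ggml's implementation; the function names are made up, and padding is left out to keep it short. As a side note, -431602080 is what the byte pattern 0xCDCDCDCD looks like when read as a float, which as far as I know is how MSVC debug builds fill uninitialized heap memory.

```python
def im2col_1d(signal, kernel_size, stride=1, dilation=1):
    """Gather one row of kernel_size samples for every output position."""
    out_len = (len(signal) - dilation * (kernel_size - 1) - 1) // stride + 1
    return [
        [signal[t * stride + k * dilation] for k in range(kernel_size)]
        for t in range(out_len)
    ]


def conv1d_via_im2col(signal, kernels, stride=1, dilation=1):
    """1D convolution (cross-correlation, as in PyTorch) as im2col + matmul."""
    rows = im2col_1d(signal, len(kernels[0]), stride, dilation)
    # One output channel per kernel, one value per gathered row.
    return [
        [sum(w * x for w, x in zip(kernel, row)) for row in rows]
        for kernel in kernels
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
print(im2col_1d(signal, kernel_size=3, stride=2))
# [[1, 2, 3], [3, 4, 5], [5, 6, 7]]
print(conv1d_via_im2col(signal, [[1, 0, -1]], stride=2))
# [[-2, -2, -2]]
```

With the tensors in this thread, the gathered matrix would have 59464 rows of 10 samples each, which lines up with the {10, 59464, 1, 1} shape reported above.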

balisujohn · 2024-07-07T00:30:46Z

What's the output shape of ggml_conv_1d with these arguments?

chavinlo · 2024-07-07T00:35:50Z

> What's the output shape of ggml_conv_1d with these arguments?

The output shape of ggml_conv_1d is {59464, 512, 1, 1}. However, the data of the tensor is empty.

balisujohn · 2024-07-07T01:44:44Z

It would be helpful if you can produce a reproducible example of the error; you should be able to fork ggml and modify https://github.com/ggerganov/ggml/blob/master/examples/simple/simple-backend.cpp so it does ggml_conv_1d on your inputs instead of the current operation.

chavinlo · 2024-07-07T05:33:54Z

> It would be helpful if you can produce a reproducible example of the error; you should be able to fork ggml and modify https://github.com/ggerganov/ggml/blob/master/examples/simple/simple-backend.cpp so it does ggml_conv_1d on your inputs instead of the current operation.

Sure, here's the tensors.pt file needed for the scripts below (it only contains the weight of a conv1d and a test input): https://huggingface.co/chavinlo/ggmltest/resolve/main/tensors.pt?download=true

Pytorch and save to GGUF:

import torch
import torch.nn as nn
import gguf

tensors = torch.load("tensors.pt")

conv = nn.Conv1d(1, 512, 10, 5, bias=False)

input_tensor = tensors["input"]
conv.weight = tensors["weight"]

x = conv(input_tensor)
print(x)
print(x.shape)
print(input_tensor.shape)
print(conv.weight.shape)

"""
Output should be:
tensor([[[-3.6160e-02, -2.8281e-02,  1.1107e-02,  ...,  3.0684e-02,
          -2.2186e-02, -6.3259e-03],
         [ 7.9663e-02,  3.0687e-02, -5.0905e-02,  ..., -1.9161e-02,
           3.1029e-02,  1.5162e-02],
         [ 2.3205e-01,  1.8175e-01, -1.0812e-01,  ..., -2.3548e-02,
           3.9255e-02,  1.1151e-01],
         ...,
         [ 8.6802e-04,  8.3316e-04,  3.2947e-04,  ..., -2.9786e-03,
           6.5938e-03,  1.1510e-02],
         [ 1.6648e-02,  2.3425e-02, -7.5188e-03,  ...,  8.7883e-03,
           4.2063e-03,  1.8971e-02],
         [-2.4058e-01,  3.3975e-01,  2.8910e-01,  ..., -1.3100e-01,
          -1.3514e-01,  1.4614e-01]]])
torch.Size([1, 512, 59464])
torch.Size([1, 1, 297328])
torch.Size([512, 1, 10])
"""

gguf_writer = gguf.GGUFWriter("tensors.gguf", "test0")
gguf_writer.add_tensor("weight", conv.weight.numpy(), conv.weight.numpy().shape)
gguf_writer.add_tensor("input", input_tensor.numpy(), input_tensor.numpy().shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()

GGML C++:

#include <cstdlib>
#include <iostream>
#include <string>
#include "ggml.h"

int main()
{
    std::string fname = "tensors.gguf";

    // ##### GGML Context #####
    static size_t buf_size = 1024 * 1024 * 128;
    static void* buf = malloc(buf_size);

    struct ggml_init_params ggml_params = {
        /*.mem_size   =*/ buf_size,
        /*.mem_buffer =*/ buf,
        /*.no_alloc   =*/ false,
    };

    struct ggml_context* ggml_ctx = ggml_init(ggml_params);
    // %%%%%%%%%%

    // ##### GGUF Model Loading Context #####
    struct ggml_context* ggml_loader_ctx;

    struct gguf_init_params gguf_params = {
        /*.no_alloc   =*/ false,
        /*.ctx        =*/ & ggml_loader_ctx,
    };

    gguf_context* gguf_ctx = gguf_init_from_file(fname.c_str(), gguf_params);
    // %%%%%%%%%%

    ggml_tensor* weight_tensor = ggml_get_tensor(ggml_loader_ctx, "weight");
    ggml_tensor* input_tensor = ggml_get_tensor(ggml_loader_ctx, "input");

    ggml_tensor* output = ggml_conv_1d(ggml_ctx, weight_tensor, input_tensor, 5, 0, 1);

    std::cout << &output;
}

I had to create two contexts because, when calling gguf_init_from_file, the buf_size would shrink to 1 MB.
The C++ code returns the same results as the script I was using before: same output shape {59464, 512, 1, 1}, yet no data.

Also about the ggml forking... you mean implementing ggml_conv_1d myself or...?

balisujohn · 2024-07-07T06:57:03Z

Here's an example of what I'm talking about: balisujohn/ggml-get-rows-error@acc0259, from when I created a reproducible example of an error with ggml_get_rows. You can also upload the .pt and .gguf files into the repository to make reproducing the error easier.

balisujohn · 2024-07-07T06:59:12Z

I didn't see you had included the tensors.pt (I'll see if I can reproduce the error)

balisujohn · 2024-07-07T07:05:35Z

I am not an expert in ggml runtime variants, but I find it extremely suspicious that you are not calling either ggml_build_forward_expand or any variant of ggml_graph_compute anywhere. It looks like you are just declaring the computational graph but not actually activating the computation.
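To illustrate the distinction between declaring a graph and computing it, here is a toy sketch of the general deferred-execution pattern. It is not ggml's API; every name in it is made up for illustration. Building an operation only records its inputs, and the result stays empty until a separate compute step walks the graph.

```python
class Node:
    """A graph node: either a constant with a value, or an op awaiting evaluation."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value


def constant(value):
    return Node("const", value=value)


def mul(a, b):
    # Only records the operation and its inputs; nothing is computed here.
    return Node("mul", inputs=(a, b))


def compute(node):
    """Evaluate the node and everything it depends on, caching the results."""
    if node.value is None:
        left, right = (compute(inp) for inp in node.inputs)
        node.value = left * right
    return node.value


out = mul(constant(3.0), constant(4.0))
print(out.value)  # None: the graph is declared but not evaluated
compute(out)
print(out.value)  # 12.0
```

Reading the output before the compute step is the toy equivalent of inspecting a result tensor's data before the graph has been run.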

chavinlo · 2024-07-07T16:22:27Z

> I am not an expert in ggml runtime variants, but I find it extremely suspicious that you are not calling either ggml_build_forward_expand or any variant of ggml_graph_compute anywhere. It looks like you are just declaring the computational graph but not actually activating the computation.

I added ggml_build_forward_expand and ggml_graph_compute_with_ctx and hit an error in the ggml_compute_forward_im2col_f16 function (L14383): the GGML_ASSERT(src0->type == GGML_TYPE_F16); check (L14390) failed.
So I changed the weight type from float32 to float16 before saving to GGUF, reloaded it, and it finally returned a non-empty tensor. Does this mean that the weights always have to be in float16 precision? Even the ggml_compute_forward_im2col_f32 function (L14306) makes the same check for the weights to be in float16.

Not that big of an error, but the accuracy of the output tensor is a little bit off: GGML output's first value is -0.0361655578, while torch gives -0.0361604653. The shape is correct though, {59464, 512, 1, 1}.
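As a rough sanity check on that discrepancy, the sketch below compares the relative difference between the two reported values with the rounding error of a single float16 value (unit roundoff 2^-11). It assumes the mismatch comes mainly from storing the weights in float16, and it uses Python's struct format 'e' (IEEE half precision) to show what rounding to float16 does to a value.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest float16 and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


ggml_value = -0.0361655578   # first output value reported for GGML
torch_value = -0.0361604653  # first output value reported for torch

relative_diff = abs(ggml_value - torch_value) / abs(torch_value)
half_unit_roundoff = 2.0 ** -11  # max relative rounding error of one float16 value

print(f"relative difference:   {relative_diff:.2e}")
print(f"float16 unit roundoff: {half_unit_roundoff:.2e}")
print(f"0.1 stored as float16: {to_float16(0.1)!r}")
```

The relative difference comes out around 1.4e-4, below the float16 unit roundoff of roughly 4.9e-4, so an error of this size is about what half-precision weights would be expected to produce.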

Here's the updated C++/PyTorch code if necessary:

#include <cstdlib>
#include <iostream>
#include <string>
#include "ggml.h"

int main()
{
    std::string fname = "tensors.gguf";

    // ##### GGML Context #####
    static size_t buf_size = 1024 * 1024 * 128;
    static void* buf = malloc(buf_size);

    struct ggml_init_params ggml_params = {
        /*.mem_size   =*/ buf_size,
        /*.mem_buffer =*/ buf,
        /*.no_alloc   =*/ false,
    };

    struct ggml_context* ggml_ctx = ggml_init(ggml_params);
    // %%%%% End of Inference GGML Context %%%%%

    // ##### GGUF Model Loading Context #####
    struct ggml_context* ggml_loader_ctx;

    struct gguf_init_params gguf_params = {
        /*.no_alloc   =*/ false,
        /*.ctx        =*/ & ggml_loader_ctx,
    };

    gguf_context* gguf_ctx = gguf_init_from_file(fname.c_str(), gguf_params);
    // %%%%% End of GGUF Model Loading Context %%%%%

    struct ggml_cgraph* gf = ggml_new_graph(ggml_ctx);

    ggml_tensor* weight_tensor = ggml_get_tensor(ggml_loader_ctx, "weight");
    ggml_tensor* input_tensor = ggml_get_tensor(ggml_loader_ctx, "input");

    ggml_tensor* output = ggml_conv_1d(ggml_ctx, weight_tensor, input_tensor, 5, 0, 1);

    ggml_build_forward_expand(gf, output);
    ggml_graph_compute_with_ctx(ggml_ctx, gf, 1);

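    // Print the first element of the result rather than the address of its data pointer.
    std::cout << ggml_get_f32_1d(output, 0) << std::endl;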
}

import torch
import torch.nn as nn
import gguf

tensors = torch.load("tensors.pt")
conv = nn.Conv1d(1, 512, 10, 5, bias=False)

input_tensor = tensors["input"]
conv.weight = tensors["weight"]

x = conv(input_tensor)
# to save
conv = conv.to(torch.float16)
print(x)
print(input_tensor)
print(conv.weight)
print("output shape:", x.shape)
print("input shape:", input_tensor.shape)
print("weight shape:", conv.weight.shape)
print("input dtype:", input_tensor.dtype)
print("weight dtype:", conv.weight.dtype)

"""
Output should be:

tensor([[[-3.6160e-02, -2.8281e-02,  1.1107e-02,  ...,  3.0684e-02,
          -2.2186e-02, -6.3259e-03],
         [ 7.9663e-02,  3.0687e-02, -5.0905e-02,  ..., -1.9161e-02,
           3.1029e-02,  1.5162e-02],
         [ 2.3205e-01,  1.8175e-01, -1.0812e-01,  ..., -2.3548e-02,
           3.9255e-02,  1.1151e-01],
         ...,
         [ 8.6802e-04,  8.3316e-04,  3.2947e-04,  ..., -2.9786e-03,
           6.5938e-03,  1.1510e-02],
         [ 1.6648e-02,  2.3425e-02, -7.5188e-03,  ...,  8.7883e-03,
           4.2063e-03,  1.8971e-02],
         [-2.4058e-01,  3.3975e-01,  2.8910e-01,  ..., -1.3100e-01,
          -1.3514e-01,  1.4614e-01]]])
tensor([[[ 0.3720,  0.3385,  0.2953,  ..., -0.0872, -0.1079, -0.1487]]])
Parameter containing:
tensor([[[-0.0186,  0.2178, -0.1289,  ..., -0.0457,  0.1654,  0.1256]],
        [[-0.1410,  0.2072,  0.1740,  ..., -0.0887, -0.0700, -0.0103]],
        [[ 0.2450,  0.2239,  0.1588,  ..., -0.1426, -0.1188,  0.1205]],
        ...,
        [[ 0.0122, -0.0559,  0.1382,  ..., -0.2484,  0.1337, -0.0367]],
        [[ 0.1155,  0.0986, -0.0650,  ..., -0.1870, -0.0693,  0.0563]],
        [[-0.1095, -0.1289, -0.2644,  ..., -0.2622, -0.0640, -0.0243]]],
       dtype=torch.float16)
output shape: torch.Size([1, 512, 59464])
input shape: torch.Size([1, 1, 297328])
weight shape: torch.Size([512, 1, 10])
input dtype: torch.float32
weight dtype: torch.float16
"""

gguf_writer = gguf.GGUFWriter("tensors.gguf", "test0")
gguf_writer.add_tensor("weight", conv.weight.numpy(), conv.weight.numpy().shape)
gguf_writer.add_tensor("input", input_tensor.numpy(), input_tensor.numpy().shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()

Also, thanks for your help, I really appreciate it.

balisujohn · 2024-07-07T16:58:30Z

The weights do always have to be float16. The slight numerical difference from torch is unsurprising. And no problem, happy to help.

chavinlo closed this as completed Jul 7, 2024

chavinlo mentioned this issue Aug 6, 2024

How much inaccuracy/difference from pytorch is to be expected? #915

Closed
