How to use ggml_conv_1d? #883

Closed
chavinlo opened this issue Jul 6, 2024 · 11 comments

Comments

@chavinlo

chavinlo commented Jul 6, 2024

Hello, I am trying to implement a model that makes use of nn.Conv1d in PyTorch.
I don't have much experience with C++ but I've read the MNIST examples and part of stable-diffusion.cpp. However, I can't seem to find many examples of ggml_conv_1d.

I have tried this (the following is just simplified):

// ...
ggml_tensor * input = ggml_get_tensor(ggml_loader_ctx, "input"); // Has a "ne" (that's shape, right?) of {1, 1, 297328, 1}
ggml_tensor * weight = ggml_get_tensor(ggml_loader_ctx, "weight"); // Has a "ne" of {10, 1, 512, 1} - {kernel_size, input_dim, output_dim, ???}

int stride = 5;
int padding = 0;
int dilation = 1;

ggml_tensor * output = ggml_conv_1d(ctx, weight, input, stride, padding, dilation);

And this would be the pytorch equivalent:

import torch
import torch.nn as nn
import torch.nn.functional as F

conv0 = nn.Conv1d(1, 512, 10, 5, bias=False) # This is just the declaration but the weights are imported from an already trained model
input = torch.load("inp.pt") # Shape: 1, 1, 297328
out = conv0(input) # Shape: 1, 512, 59464

I got the weight shape order from this code: https://github.com/PABannier/encodec.cpp/blob/main/encodec.cpp#L755
However, the output is just a bunch of NaNs (I think, because it's all "Ï" in the Visual Studio memory debugger). I tried diving deeper and I think the issue is in ggml_im2col, where const int64_t OW = ggml_calc_conv_output_size(b->ne[0], a->ne[0], s0, p0, d0); calculates a value of 0 (which is wrong?). While debugging, I think this is what it was calculating:
((1 + 2 × 1 − 1 × (10 − 1) − 1) / 5 + 1) which equals -0.4?
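
For reference, the output-length arithmetic can be checked outside ggml. The following is a minimal standalone sketch, assuming the usual 1D convolution output-size formula evaluated with C++ integer division. The function name conv_output_size is hypothetical and is not a ggml API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of ggml): the standard 1D convolution
// output-size formula, (in + 2*p - d*(k - 1) - 1) / s + 1, evaluated with
// integer division as C++ would do it.
int64_t conv_output_size(int64_t in, int64_t k, int s, int p, int d) {
    assert(s > 0);  // stride must be positive
    return (in + 2 * p - d * (k - 1) - 1) / s + 1;
}
```

With an input length of 1, a kernel of 10, stride 5, padding 0 and dilation 1, the numerator is -9; C++ truncates -9 / 5 to -1, so the result is 0. With an input length of 297328 the same formula gives 59464, which matches the PyTorch output length above.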

I've tried finding a solution, but there's barely any documentation. I have already stared at the source code for hours and don't know what to do.

This is the code I use to save the weights into GGUF format if necessary:

gguf_writer = gguf.GGUFWriter("test.gguf", "test0")

z = torch.load("inp.pt")
z = z.numpy()
z = z.transpose(2, 1, 0)

model = model.requires_grad_(False)
conv0_weight = model.conv0.weight.numpy()

gguf_writer.add_tensor("model0.test0.weight", conv0_weight, conv0_weight.shape)
gguf_writer.add_tensor("input", z, z.shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()
@balisujohn
Contributor

At first glance, {1, 1, 297328, 1} should probably be {297328, 1, 1, 1}.

There are some examples of ggml_conv_1d in tortoise.cpp https://github.com/balisujohn/tortoise.cpp/blob/b9bb8771c3e3fa8ccb615d24b59b42edf15ca2fa/main.cpp#L3624
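
The shapes in this thread suggest that ggml's "ne" lists dimensions in the reverse order of a PyTorch shape, innermost dimension first, padded with 1s up to four entries. A small sketch of that mapping follows; the helper name torch_shape_to_ne is hypothetical and is not a ggml function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ggml): convert a PyTorch/NumPy-style shape
// into a ggml-style 4-element "ne" array, where ne[0] is the innermost
// dimension and unused trailing dimensions are set to 1.
std::array<int64_t, 4> torch_shape_to_ne(const std::vector<int64_t>& shape) {
    assert(shape.size() <= 4);
    std::array<int64_t, 4> ne = {1, 1, 1, 1};
    for (std::size_t i = 0; i < shape.size(); ++i) {
        ne[i] = shape[shape.size() - 1 - i];  // reverse the dimension order
    }
    return ne;
}
```

Under this mapping the PyTorch input shape (1, 1, 297328) becomes {297328, 1, 1, 1}, and the weight shape (512, 1, 10) becomes {10, 1, 512, 1}.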

@chavinlo
Author

chavinlo commented Jul 7, 2024

at first glance {1, 1, 297328, 1} should probably be {297328,1,1,1}

There are some examples of ggml_conv_1d in tortoise.cpp https://github.com/balisujohn/tortoise.cpp/blob/b9bb8771c3e3fa8ccb615d24b59b42edf15ca2fa/main.cpp#L3624

Thanks, now the im2col tensor has a shape of {10, 59464, 1, 1}, which is closer to PyTorch's output. But the ggml_mul_mat call (L6472) then returns an empty tensor (full of -431602080.). What exactly is the first tensor of mul_mat supposed to be? im2col initializes a tensor with the shape mentioned above but with no values inside it. And the a and b tensors are only stored in result->src[0] = a; and result->src[1] = b; respectively, right? (L6607)

tl;dr: tensor a among mul_mat's arguments inside ggml_conv_1d (the one returned from im2col) is empty, hence the output of conv1d fails.
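
A note on the value -431602080: that is what the bit pattern 0xCDCDCDCD looks like when read as a 32-bit float. The MSVC debug runtime fills freshly allocated heap memory with 0xCD bytes, so this value suggests the buffer was never written rather than computed incorrectly. A small sketch to check the arithmetic follows; the helper name bits_to_float is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as an IEEE-754 float.
float bits_to_float(uint32_t bits) {
    static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);  // well-defined way to reinterpret bits
    return f;
}
```

Calling bits_to_float(0xCDCDCDCDu) yields exactly -431602080, the value seen in the debugger.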

@balisujohn
Contributor

balisujohn commented Jul 7, 2024

What's the output shape of ggml_conv_1d with these arguments?

@chavinlo
Author

chavinlo commented Jul 7, 2024

What's the output shape of ggml_conv_1d with these arguments?

The output shape of ggml_conv_1d is {59464, 512, 1, 1}. However, the data of the tensor is empty.

@balisujohn
Contributor

balisujohn commented Jul 7, 2024

It would be helpful if you can produce a reproducible example of the error; you should be able to fork ggml and modify https://github.com/ggerganov/ggml/blob/master/examples/simple/simple-backend.cpp so it does ggml_conv_1d on your inputs instead of the current operation.

@chavinlo
Author

chavinlo commented Jul 7, 2024

It would be helpful if you can produce a reproducible example of the error; you should be able to fork ggml and modify https://github.com/ggerganov/ggml/blob/master/examples/simple/simple-backend.cpp so it does ggml_conv_1d on your inputs instead of the current operation.

Sure, here's the tensors.pt file needed for the scripts below (it only contains the weight of a conv1d and a test input): https://huggingface.co/chavinlo/ggmltest/resolve/main/tensors.pt?download=true

PyTorch and save to GGUF:

import torch
import torch.nn as nn
import gguf

tensors = torch.load("tensors.pt")

conv = nn.Conv1d(1, 512, 10, 5, bias=False)

input_tensor = tensors["input"]
conv.weight = tensors["weight"]

x = conv(input_tensor)
print(x)
print(x.shape)
print(input_tensor.shape)
print(conv.weight.shape)

"""
Output should be:
tensor([[[-3.6160e-02, -2.8281e-02,  1.1107e-02,  ...,  3.0684e-02,
          -2.2186e-02, -6.3259e-03],
         [ 7.9663e-02,  3.0687e-02, -5.0905e-02,  ..., -1.9161e-02,
           3.1029e-02,  1.5162e-02],
         [ 2.3205e-01,  1.8175e-01, -1.0812e-01,  ..., -2.3548e-02,
           3.9255e-02,  1.1151e-01],
         ...,
         [ 8.6802e-04,  8.3316e-04,  3.2947e-04,  ..., -2.9786e-03,
           6.5938e-03,  1.1510e-02],
         [ 1.6648e-02,  2.3425e-02, -7.5188e-03,  ...,  8.7883e-03,
           4.2063e-03,  1.8971e-02],
         [-2.4058e-01,  3.3975e-01,  2.8910e-01,  ..., -1.3100e-01,
          -1.3514e-01,  1.4614e-01]]])
torch.Size([1, 512, 59464])
torch.Size([1, 1, 297328])
torch.Size([512, 1, 10])
"""

gguf_writer = gguf.GGUFWriter("tensors.gguf", "test0")
gguf_writer.add_tensor("weight", conv.weight.numpy(), conv.weight.numpy().shape)
gguf_writer.add_tensor("input", input_tensor.numpy(), input_tensor.numpy().shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()

GGML C++:

#include <cstdlib>   // malloc
#include <iostream>
#include <string>
#include "ggml.h"

int main()
{
    std::string fname = "tensors.gguf";

    // ##### GGML Context #####
    static size_t buf_size = 1024 * 1024 * 128;
    static void* buf = malloc(buf_size);

    struct ggml_init_params ggml_params = {
        /*.mem_size   =*/ buf_size,
        /*.mem_buffer =*/ buf,
        /*.no_alloc   =*/ false,
    };

    struct ggml_context* ggml_ctx = ggml_init(ggml_params);
    // %%%%%%%%%%

    // ##### GGUF Model Loading Context #####
    struct ggml_context* ggml_loader_ctx;

    struct gguf_init_params gguf_params = {
        /*.no_alloc   =*/ false,
        /*.ctx        =*/ & ggml_loader_ctx,
    };

    gguf_context* gguf_ctx = gguf_init_from_file(fname.c_str(), gguf_params);
    // %%%%%%%%%%

    ggml_tensor* weight_tensor = ggml_get_tensor(ggml_loader_ctx, "weight");
    ggml_tensor* input_tensor = ggml_get_tensor(ggml_loader_ctx, "input");

    ggml_tensor* output = ggml_conv_1d(ggml_ctx, weight_tensor, input_tensor, 5, 0, 1);

    std::cout << &output;
}

I had to create two contexts because when calling gguf_init_from_file the buf_size would shrink to 1 MB.
The C++ code returns the same results as the script I was using before: same output shape {59464, 512, 1, 1}, yet no data.

Also about the ggml forking... you mean implementing ggml_conv_1d myself or...?

@balisujohn
Contributor

balisujohn commented Jul 7, 2024

Here's an example of what I'm talking about: balisujohn/ggml-get-rows-error@acc0259, from when I created a reproducible example of an error with ggml_get_rows. You can also upload the .pt and .gguf files to the repository to make reproducing the error easier.

@balisujohn
Contributor

I didn't see that you had included the tensors.pt file. I'll see if I can reproduce the error.

@balisujohn
Contributor

I am not an expert in ggml runtime variants, but I find it extremely suspicious that you are not calling either ggml_build_forward_expand or any variant of ggml_graph_compute anywhere. It looks like you are just declaring the computational graph but not actually activating the computation.
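
To illustrate the declare-then-compute distinction, here is a toy model of deferred evaluation. It is not ggml's implementation, and every name in it (Node, make_leaf, declare_mul, compute) is hypothetical. Building a node only records its operands; its data stays empty until compute runs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for a graph node (NOT ggml's ggml_tensor): declaring an
// operation only records its operands; `data` stays empty until compute().
struct Node {
    const Node* src0 = nullptr;
    const Node* src1 = nullptr;
    std::vector<float> data;  // empty means "not computed yet"
};

// Leaf node holding concrete values.
Node make_leaf(std::vector<float> values) {
    Node n;
    n.data = std::move(values);
    return n;
}

// Declares an element-wise product; no arithmetic happens here.
Node declare_mul(const Node& a, const Node& b) {
    Node n;
    n.src0 = &a;
    n.src1 = &b;
    return n;
}

// Evaluates the node from its already-computed operands.
void compute(Node& n) {
    assert(n.src0 && n.src1 && n.src0->data.size() == n.src1->data.size());
    n.data.resize(n.src0->data.size());
    for (std::size_t i = 0; i < n.data.size(); ++i) {
        n.data[i] = n.src0->data[i] * n.src1->data[i];
    }
}
```

In this toy, the result of declare_mul has the right operands but no data until compute is called on it, which mirrors the symptom of a correctly shaped but empty output tensor. In ggml the analogous steps are the ggml_build_forward_expand and ggml_graph_compute calls mentioned above.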

@chavinlo
Author

chavinlo commented Jul 7, 2024

I am not an expert in ggml runtime variants, but I find it extremely suspicious that you are not calling either ggml_build_forward_expand or any variant of ggml_graph_compute anywhere. It looks like you are just declaring the computational graph but not actually activating the computation.

Added ggml_build_forward_expand and ggml_graph_compute_with_ctx and got an error on the ggml_compute_forward_im2col_f16 function (L14383) in the GGML_ASSERT(src0->type == GGML_TYPE_F16); (L14390) check.
So then I changed the weight type from float32 to float16 before saving to GGUF, reloaded it, and finally it returned a non-empty tensor. Does this mean that the weights always have to be in float16 precision? Even the ggml_compute_forward_im2col_f32 function (L14306) makes the same check for the weights to be in float16.

Not that big of an error, but the accuracy of the output tensor is a little bit off: GGML output's first value is -0.0361655578, while torch gives -0.0361604653. The shape is correct though, {59464, 512, 1, 1}.

Here's the updated C++/PyTorch code if necessary:

#include <cstdlib>   // malloc
#include <iostream>
#include <string>
#include "ggml.h"

int main()
{
    std::string fname = "tensors.gguf";

    // ##### GGML Context #####
    static size_t buf_size = 1024 * 1024 * 128;
    static void* buf = malloc(buf_size);

    struct ggml_init_params ggml_params = {
        /*.mem_size   =*/ buf_size,
        /*.mem_buffer =*/ buf,
        /*.no_alloc   =*/ false,
    };

    struct ggml_context* ggml_ctx = ggml_init(ggml_params);
    // %%%%% End of Inference GGML Context %%%%%

    // ##### GGUF Model Loading Context #####
    struct ggml_context* ggml_loader_ctx;

    struct gguf_init_params gguf_params = {
        /*.no_alloc   =*/ false,
        /*.ctx        =*/ & ggml_loader_ctx,
    };

    gguf_context* gguf_ctx = gguf_init_from_file(fname.c_str(), gguf_params);
    // %%%%% End of GGUF Model Loading Context %%%%%

    struct ggml_cgraph* gf = ggml_new_graph(ggml_ctx);

    ggml_tensor* weight_tensor = ggml_get_tensor(ggml_loader_ctx, "weight");
    ggml_tensor* input_tensor = ggml_get_tensor(ggml_loader_ctx, "input");

    ggml_tensor* output = ggml_conv_1d(ggml_ctx, weight_tensor, input_tensor, 5, 0, 1);

    ggml_build_forward_expand(gf, output);
    ggml_graph_compute_with_ctx(ggml_ctx, gf, 1);

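    // Print the first output value (assuming an F32 result tensor) rather than
    // the address of the data pointer.
    std::cout << ((float*) output->data)[0] << std::endl;
}

import torch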
import torch.nn as nn
import gguf

tensors = torch.load("tensors.pt")
conv = nn.Conv1d(1, 512, 10, 5, bias=False)

input_tensor = tensors["input"]
conv.weight = tensors["weight"]

x = conv(input_tensor)
# Cast the weights to float16 before saving: ggml's im2col asserts an F16 kernel.
conv = conv.to(torch.float16)
print(x)
print(input_tensor)
print(conv.weight)
print("output shape:", x.shape)
print("input shape:", input_tensor.shape)
print("weight shape:", conv.weight.shape)
print("input dtype:", input_tensor.dtype)
print("weight dtype:", conv.weight.dtype)

"""
Output should be:

tensor([[[-3.6160e-02, -2.8281e-02,  1.1107e-02,  ...,  3.0684e-02,
          -2.2186e-02, -6.3259e-03],
         [ 7.9663e-02,  3.0687e-02, -5.0905e-02,  ..., -1.9161e-02,
           3.1029e-02,  1.5162e-02],
         [ 2.3205e-01,  1.8175e-01, -1.0812e-01,  ..., -2.3548e-02,
           3.9255e-02,  1.1151e-01],
         ...,
         [ 8.6802e-04,  8.3316e-04,  3.2947e-04,  ..., -2.9786e-03,
           6.5938e-03,  1.1510e-02],
         [ 1.6648e-02,  2.3425e-02, -7.5188e-03,  ...,  8.7883e-03,
           4.2063e-03,  1.8971e-02],
         [-2.4058e-01,  3.3975e-01,  2.8910e-01,  ..., -1.3100e-01,
          -1.3514e-01,  1.4614e-01]]])
tensor([[[ 0.3720,  0.3385,  0.2953,  ..., -0.0872, -0.1079, -0.1487]]])
Parameter containing:
tensor([[[-0.0186,  0.2178, -0.1289,  ..., -0.0457,  0.1654,  0.1256]],
        [[-0.1410,  0.2072,  0.1740,  ..., -0.0887, -0.0700, -0.0103]],
        [[ 0.2450,  0.2239,  0.1588,  ..., -0.1426, -0.1188,  0.1205]],
        ...,
        [[ 0.0122, -0.0559,  0.1382,  ..., -0.2484,  0.1337, -0.0367]],
        [[ 0.1155,  0.0986, -0.0650,  ..., -0.1870, -0.0693,  0.0563]],
        [[-0.1095, -0.1289, -0.2644,  ..., -0.2622, -0.0640, -0.0243]]],
       dtype=torch.float16)
output shape: torch.Size([1, 512, 59464])
input shape: torch.Size([1, 1, 297328])
weight shape: torch.Size([512, 1, 10])
input dtype: torch.float32
weight dtype: torch.float16
"""

gguf_writer = gguf.GGUFWriter("tensors.gguf", "test0")
gguf_writer.add_tensor("weight", conv.weight.numpy(), conv.weight.numpy().shape)
gguf_writer.add_tensor("input", input_tensor.numpy(), input_tensor.numpy().shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()

Also, thanks for your help, I really appreciate it.

@balisujohn
Contributor

The weights do always have to be float16. The slight numerical difference from torch is unsurprising. And no problem, happy to help.
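
To put a number on "slight", the two first values reported above can be compared directly. This is a minimal sketch; the helper name relative_error is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative difference of `got` with respect to `ref`.
double relative_error(double got, double ref) {
    assert(ref != 0.0);
    return std::fabs(got - ref) / std::fabs(ref);
}
```

For the ggml value -0.0361655578 against the torch value -0.0361604653, the relative error is roughly 1.4e-4. That is below 2^-11 (about 4.9e-4), the relative rounding error of a single float16 value, so the gap is at least consistent with the weights having been rounded to half precision. This is a plausibility check only, not a proof of where the difference comes from.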
