How to copy a CSR matrix into a `sparsevec` column?

I have a sparse vector (a single-row CSR matrix), the result of applying `sklearn`'s [TfidfVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html):
```
<Compressed Sparse Row sparse matrix of dtype 'float64'
        with 4 stored elements and shape (1, 157541)>
  Coords        Values
  (0, 5051)     0.35521903059198523
  (0, 14956)    0.5566306658037382
  (0, 45152)    0.7328483894186835
  (0, 60738)    0.1640578566196061
```
which I want to copy into a table with a `sparsevec` column. As far as I understand from the documentation, the correct way to do this is the following:
```
with cur.copy(
    "COPY my_table FROM STDIN WITH (FORMAT BINARY)"
) as copy:
    copy.set_types(["sparsevec"])
    copy.write_row((SparseVector(the_sparse_vector),))
```
but this produces an error:
```
psycopg.errors.DataException: sparsevec indices must not contain duplicates
```

I investigated a bit and found [this line](https://github.com/pgvector/pgvector-python/blob/3f9e9a20b9f08033e7dc4e61ff4c43b34951d2ec/pgvector/sparsevec.py#L88), which uses `value.coords[0]` (rather than `value.coords[1]`, which holds the column indices for two-dimensional input). Is this a bug? What should I do?

*Additional information about the example:*
1. The code
```
print(the_sparse_vector)
the_sparse_vector = the_sparse_vector.tocoo()
print(the_sparse_vector.ndim, the_sparse_vector.shape)
print(the_sparse_vector.coords)
print(the_sparse_vector.data)
print(SparseVector(the_sparse_vector))
```
outputs:
```
<Compressed Sparse Row sparse matrix of dtype 'float64'
        with 4 stored elements and shape (1, 157541)>
  Coords        Values
  (0, 5051)     0.35521903059198523
  (0, 14956)    0.5566306658037382
  (0, 45152)    0.7328483894186835
  (0, 60738)    0.1640578566196061
2 (1, 157541)
(array([0, 0, 0, 0], dtype=int32), array([ 5051, 14956, 45152, 60738], dtype=int32))
[0.35521903 0.55663067 0.73284839 0.16405786]
SparseVector({0: 0.1640578566196061}, 157541)
```
2. I have the following package versions installed:
```
psycopg           3.2.6
psycopg-binary    3.2.6
pgvector          0.4.0
scipy             1.15.2
```
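
The index collapse seen in the output above can be reproduced without any of the libraries involved. The following is a minimal sketch in plain Python using the values printed above; `row_vector_to_dict` is a hypothetical helper (not part of pgvector), and it assumes the input is a single-row matrix whose column indices should become the `sparsevec` indices:

```python
# COO components copied from the printed output in this issue.
rows = [0, 0, 0, 0]                      # coords[0]: row indices
cols = [5051, 14956, 45152, 60738]       # coords[1]: column indices
data = [0.35521903059198523, 0.5566306658037382,
        0.7328483894186835, 0.1640578566196061]
shape = (1, 157541)


def row_vector_to_dict(rows, cols, data, shape):
    """Hypothetical helper: turn a 1 x N COO row vector into
    ({column_index: value}, N)."""
    if shape[0] != 1 or any(r != 0 for r in rows):
        raise ValueError("expected a matrix with exactly one row")
    return dict(zip(cols, data)), shape[1]


# Keying by row index: every entry has row 0, so the keys collide
# and only the last value survives.
by_row = dict(zip(rows, data))
print(by_row)  # {0: 0.1640578566196061}

# Keying by column index keeps all four entries.
elements, dim = row_vector_to_dict(rows, cols, data, shape)
print(len(elements), dim)  # 4 157541
```

If this reasoning is right, passing `SparseVector(elements, dim)` (the dict-plus-dimensions form that the repr above already prints) to `copy.write_row` might work around the error. With a real SciPy matrix, `cols` and `data` would presumably come from `the_sparse_vector.tocoo()`. Neither step has been tested against the database here.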
