[Feature]: Modify the collection schema once collection is created and is not empty #20405

jeet129 · 2022-11-08T09:46:32Z

Is there an existing issue for this?

I have searched the existing issues

Is your feature request related to a problem? Please describe.

Often, when we start with an ANN collection definition, we don't know the exhaustive list of fields the use case will need. We create the collection with the few known fields in its schema, and as the application evolves we need to add to or modify that schema to accommodate more attributes.

Without this, the only option is to recreate the collection and ingest the data afresh, which is not an easy choice given how long the ingestion pipeline takes for huge collections.

Describe the solution you'd like.

We need a way to add new fields (non-mandatory fields or fields with default values) and to drop existing (non-primary) fields from a collection.
This way the same collection can serve the different scenarios of a use case without creating a new collection and hydrating it with the data.

There should also be an option to update the values of such attributes for existing entities.

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

zyy20191 · 2022-11-16T03:27:57Z

The ability to add a field to an already created collection is really convenient, and I hope you can consider this requirement.

xiaofan-luan · 2022-12-09T03:35:43Z

Let's keep it.
Agreed, this is a very useful feature.
But it requires a lot of effort, so if anyone has time, please take it. Otherwise we will wait until the performance/stability issues are solved and then start working on it.

sskserk · 2024-09-04T11:09:15Z

Hey Milvus Developers and Community,

Would it be feasible to implement a feature that includes basic routines for renaming, adding, or dropping columns? Even a simple set of these functions could significantly enhance our capabilities.

The use case is straightforward but has a substantial impact:

We have a large collection containing many thousands of records.
A considerable amount of time has been invested in calculating the stored embeddings.
A new release of the functional logic now necessitates the inclusion of an additional field.

A major challenge we face is determining how to properly migrate the data.

Currently, we are compelled to recreate the entire collection from scratch whenever an additional or modified field is needed. Given the vast amounts of data involved, this process is exceedingly challenging.
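The recreate-and-reingest workaround described above can at least avoid recomputing the stored embeddings: read the existing entities in batches, attach the new field with a default value, and write each batch to the new collection. The following is a minimal sketch in plain Python under stated assumptions. `migrate`, `write_batch`, and the field names are hypothetical stand-ins for whatever read and insert calls your client exposes; they are not Milvus APIs.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `size` rows, preserving order."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def migrate(
    old_rows: Iterable[Row],
    write_batch: Callable[[List[Row]], None],  # hypothetical insert callback
    new_field: str,
    default: Any,
    batch_size: int = 1000,
) -> int:
    """Copy rows to a new collection, adding `new_field` with `default`.

    Existing values, including the embeddings, are copied unchanged,
    so nothing is recomputed. Returns the number of rows copied.
    """
    copied = 0
    for batch in batched(old_rows, batch_size):
        upgraded = [{**row, new_field: default} for row in batch]
        write_batch(upgraded)
        copied += len(upgraded)
    return copied
```

In practice `old_rows` would be an iterator over the old collection and `write_batch` would wrap the insert call for the new one; batching keeps memory bounded for large collections.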

Providing a command-line tool that could handle these modifications would offer significant relief and improve our efficiency.

I also suppose that physically modifying an existing collection may be a practically impossible task. It might require changing the vector data, which is a computational challenge.

P.S. I would be happy to cooperate with somebody or assist with a corresponding MR.

xiaofan-luan · 2024-09-04T22:32:37Z

This is for sure already on our roadmap.

@tedxu and @smellthemoon are actually working on it, so hopefully that will help.

@smellthemoon could you please follow up with @sskserk and see how it can work with our latest modify schema feature?

sskserk · 2024-09-05T08:23:55Z

@xiaofan-luan , @tedxu , @smellthemoon,

I am eager to test the new feature and am looking forward to receiving it. I'm ready to test a prerelease of this feature, just need to know when.

The implementation of this feature will undoubtedly mark a significant milestone. I anticipate that, as a result, a new Milvus-related product similar to "Flyway" might emerge in the future.

Your solution is already widely adopted by major companies, and this enhancement will further solidify its enterprise-grade capabilities.

Thank you for the positive update!

xiaofan-luan · 2024-09-05T23:35:04Z

Cool, let's ship it.

smellthemoon · 2024-09-06T08:27:10Z

In fact, the add field feature is included in our development plan. Users can add a new column through the add field operation. The values in this new column are all null. After the add field operation completes, the field data in insert/upsert requests needs to include data for the new column. I will keep you updated on any progress. @sskserk
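The planned behaviour described above can be illustrated with a toy in-memory model. This is only a sketch of the stated semantics, assuming that rows written before the add field operation read back as null for the new column and that later writes must carry the new column. `ToyCollection` and its methods are hypothetical and are not part of any Milvus SDK.

```python
from typing import Any, Dict, List, Optional


class ToyCollection:
    """In-memory stand-in for a collection (hypothetical, not a Milvus API)."""

    def __init__(self, fields: List[str]) -> None:
        self.fields = list(fields)
        self.rows: List[Dict[str, Any]] = []

    def add_field(self, name: str) -> None:
        """Extend the schema; stored rows are left untouched."""
        if name in self.fields:
            raise ValueError(f"field {name!r} already exists")
        self.fields.append(name)

    def insert(self, row: Dict[str, Any]) -> None:
        """Reject rows that omit any field of the current schema."""
        missing = [f for f in self.fields if f not in row]
        if missing:
            raise ValueError(f"row is missing fields: {missing}")
        self.rows.append(dict(row))

    def get(self, index: int) -> Dict[str, Optional[Any]]:
        """Read a row; fields added after it was written come back as None."""
        stored = self.rows[index]
        return {f: stored.get(f) for f in self.fields}
```

The design point is that adding a field only changes the schema: old rows are never rewritten, and the null is filled in at read time.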

smellthemoon · 2024-09-06T08:27:16Z

/assign

xiaofan-luan · 2024-10-28T22:38:59Z

On 2.0 we support null/default value.
The target for 3.0 is to support schema change.

iamkhalidbashir · 2024-10-30T10:29:33Z

Is there any null value for an embedding field?
With the JS lib, if we pass null for an embedding field, we get this error:

Error processing PDF: TypeError: Cannot read properties of null (reading 'length')
    at Function.concat (node:buffer:589:19)
    at /app/node_modules/@zilliz/milvus2-sdk-node/dist/milvus/grpc/Data.js:241:47
    at Array.map (<anonymous>)
    at MilvusClient.<anonymous> (/app/node_modules/@zilliz/milvus2-sdk-node/dist/milvus/grpc/Data.js:218:64)
    at Generator.next (<anonymous>)
    at fulfilled (/app/node_modules/@zilliz/milvus2-sdk-node/dist/milvus/grpc/Data.js:5:58)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

xiaofan-luan · 2024-10-30T18:34:29Z

Embeddings cannot be null.

A data field can be null only if nullable is enabled for it, in Milvus 2.5 and later.

@smellthemoon
Do we support altering a non-nullable field to nullable?
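Given the rule stated in this reply (vector fields are never null, scalar fields only when declared nullable), a client-side check before insert can surface a readable message instead of the `TypeError` from the stack trace above. The following is a minimal sketch in plain Python. `validate_rows` and the field-name sets are hypothetical helpers, not part of any Milvus SDK, and the caller is assumed to know which fields are vectors and which are nullable.

```python
from typing import Any, Dict, Iterable, List, Set


def validate_rows(
    rows: Iterable[Dict[str, Any]],
    vector_fields: Set[str],
    nullable_fields: Set[str],
) -> List[str]:
    """Return one message per null value that the stated rule forbids."""
    problems: List[str] = []
    for i, row in enumerate(rows):
        for name, value in row.items():
            if value is not None:
                continue
            if name in vector_fields:
                # Vector fields can never be null, nullable or not.
                problems.append(f"row {i}: vector field {name!r} must not be null")
            elif name not in nullable_fields:
                problems.append(f"row {i}: field {name!r} is null but not nullable")
    return problems
```

An empty result means no row violates the rule; otherwise the messages can be logged or raised before any request is sent.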

jeet129 added the kind/feature label (Issues related to feature request from users) Nov 8, 2022

jeet129 assigned xiaofan-luan Nov 8, 2022

xiaofan-luan added this to the 2.3 milestone Dec 9, 2022

bogdankostic mentioned this issue Dec 30, 2022

Milvus without SQLite - Milvus can CRUD vector with metadata for himself as of today deepset-ai/haystack#3594

Closed

sre-ci-robot assigned smellthemoon Sep 6, 2024

yanliang567 modified the milestones: 2.3, 2.5.0 Sep 29, 2024

xiaofan-luan modified the milestones: 2.5.0, 3.0 Oct 28, 2024
