-
Notifications
You must be signed in to change notification settings - Fork 303
Add new string types to tables section of docs #2699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR updates Markdoc documentation: it replaces Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/routes/docs/products/databases/tables/`+page.markdoc:
- Around line 1245-1249: Update the doc to stop showing deprecated `string` in
examples and clarify migration rules: change any example columns using `type:
'string'` to a recommended replacement such as `type: 'varchar'` (or
`text`/`mediumtext`/`longtext` as appropriate) and edit the "Migrating from
string columns" info block to state explicitly whether the migration chooses the
resulting text type based on the declared column size (e.g., the `size` option
you specify) versus the size of existing row data—if the implementation uses the
declared size, say "based on the declared size (`size` option)"; if it inspects
actual data, say "based on the maximum size of existing data"—so consumers and
examples (e.g., snippets that previously used `type: 'string'`) are consistent
with the guidance.
🧹 Nitpick comments (1)
src/routes/docs/products/databases/tables/+page.markdoc (1)
1227-1232: Clarify howvarcharvstextshould be chosen.Both rows show the same 16,383-character limit and near-identical descriptions (Line 1229–1232), which makes the selection criteria unclear. Consider adding a short note distinguishing intended usage (e.g.,
varcharrequires asize,textis for longer prose), or adjust limits if they differ.
…into new-string-types # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/routes/blog/post/new-string-types/`+page.markdoc:
- Line 23: Update the table row that currently reads "`varchar` | 16,383 | Fully
indexable if size < 768" to explicitly state the unit as bytes and note the
encoding/setting caveats: say something like "Fully indexable if size < 768
bytes (InnoDB index limit); with innodb_large_prefix or Barracuda this can be
larger; with utf8mb4 (4 bytes/char) the max indexable characters is ~192 chars."
Also fix the conflicting sentence that says "768 characters" so it instead
references "768 bytes" and mention the effective character limit varies by
character set (e.g., utf8mb4 = 4 bytes/char).
- Line 6: The frontmatter "cover" property in the Markdoc post (+page.markdoc)
points to the wrong image (/images/blog/announcing-screenshots-api/cover.png);
update the cover frontmatter to reference this post's dedicated cover image
(e.g., /images/blog/new-string-types/cover.png or the correct asset name) so the
file's cover property matches the post's images.
|
|
||
| | Type | Max characters | Indexing | | ||
| |--------------|-------------------|-----------------------------------------------| | ||
| | `varchar` | 16,383 | Fully indexable if size < 768 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify "if size < 768" — characters or bytes?
The table states varchar is "Fully indexable if size < 768" but doesn't specify the unit. Line 32 says "768 characters," but the underlying MariaDB/InnoDB index limit is 768 bytes (or 3072 bytes with innodb_large_prefix). With multi-byte encodings (e.g., utf8mb4 at 4 bytes/char), the effective character limit for full indexing would be much lower. Please verify the correct threshold and unit.
🤖 Prompt for AI Agents
In `@src/routes/blog/post/new-string-types/`+page.markdoc at line 23, Update the
table row that currently reads "`varchar` | 16,383 | Fully indexable if size <
768" to explicitly state the unit as bytes and note the encoding/setting
caveats: say something like "Fully indexable if size < 768 bytes (InnoDB index
limit); with innodb_large_prefix or Barracuda this can be larger; with utf8mb4
(4 bytes/char) the max indexable characters is ~192 chars." Also fix the
conflicting sentence that says "768 characters" so it instead references "768
bytes" and mention the effective character limit varies by character set (e.g.,
utf8mb4 = 4 bytes/char).
| featured: false | ||
| --- | ||
|
|
||
| Until now, Appwrite offered a single `string` column type that internally switched between four different storage types based on the size you specified. This meant you had no visibility into how your data was actually stored, how it could be indexed, or why certain size limits existed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use abstracted instead of switch or in addition
|
|
||
| # Why we made this change | ||
|
|
||
| We regularly dogfood Appwrite in our own projects to find areas we can improve. While building [Imagine](https://imagine.dev), our AI-powered builder for web apps, which uses Appwrite under the hood for out-of-the-box server functionality, we quickly noticed how hard it is for AI models to reason about the `string` column type. Here's what we found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can share the Imagine story, but I think this confusion came from humans engineers before agents struggled with it. So we made the fix for both - win win
|
|
||
| # Deprecation of the string type | ||
|
|
||
| The `string` column type is now deprecated. It will continue to work for existing columns, but we recommend using the explicit types for all new columns going forward. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make it clearer that we plan to continue backward support just to avoid any confusion even if its clear, we can make x2 clear.
|
|
||
| We saw this firsthand with Imagine. Imagine runs an Appwrite Cloud agent that provisions database resources on your behalf using the Appwrite CLI and `appwrite.json` as the schema. The agent is trained to build sophisticated architectures, and it would easily create just enough string columns to hit the row width limit. | ||
|
|
||
| We could have solved this with better prompting, but that would only fix Imagine. We wanted to improve the experience for everyone, whether you're using AI tooling, the Console, or the SDKs directly. With explicit types, an AI model now naturally picks `longtext` for something like blog content, which is stored off-page and doesn't consume row space. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm missing something about the lack of visibility of the varchar row size limit, I think this was the root issue with the first approach. Now we make things more visible (even on the console) and help devs and agents alike make better decisions around their desired data structures.
|
|
||
| We saw this firsthand with Imagine. Imagine runs an Appwrite Cloud agent that provisions database resources on your behalf using the Appwrite CLI and `appwrite.json` as the schema. The agent is trained to build sophisticated architectures, and it would easily create just enough string columns to hit the row width limit. | ||
|
|
||
| We could have solved this with better prompting, but that would only fix Imagine. We wanted to improve the experience for everyone, whether you're using AI tooling, the Console, or the SDKs directly. With explicit types, an AI model now naturally picks `longtext` for something like blog content, which is stored off-page and doesn't consume row space. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine use cases is great, but the human use case should get here equal attention. It should be 50:50
|
|
||
| # Why we made this change | ||
|
|
||
| The root issue with the legacy `string` type was a lack of visibility. It abstracted away important storage details, which meant developers had no way to make informed decisions about their data structures. This led to problems for both human engineers and AI agents alike. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would maybe add a bit more context to why we made this decision from the first place and be honest about us realizing that it compromised on important details like visibility and enough control to devs. We can also mention that finding the right abstraction balance is a difficult task, and from time to time based on user feedback and experience we have to adjust and respond which is what were doing now.
Summary by CodeRabbit