-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTREE indexes to be case sensitive and GIN are case insensitive #335
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I think we still need this. |
To be able to work on this we'd need to know the collumns that are most searched by for it to be efficent. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
There is a blocker here, some of the fields to be indexed are rather too long and the DB can not index them. Suggestions on how to handle this are much welcome @akmiller01 @dean-breed @bill-anderson |
You don't want BTREE indexes on titles and descriptions, do you? In my old money (of my particular old currency) we had cross-reference indexes for free text fields. Is this GiST in new money? |
That's right @bill-anderson. GiST and GIN are to be used for those extremely long text columns, but they too have a max string length (which is about 8000 characters) they can be applied to. Unfortunately, some of our data exceeds that length! |
So what I think we should do is move this into an ETL discussion to limit all free-text fields to, say, 5000 characters. I don't think that will affect the integrity of the data. But we should do a bit of research on the long culprits. |
@wakibi Would it be possible to get a list of reporting organisations and columns which would be truncated? |
@wakibi Can I get this list? |
@dean-breed sorry Chris is on leave ... let me stick this on Slack where he can't miss it. |
@dean-breed Find attached. I have added all those with more than 7500 (my reasoning is those approaching that figure could easily soon hit the 8000 ceiling as well). |
@wakibi - I could do something on this myself but to avoid mistake likelihood. Could you output the reporting org reference also? |
Attached. I had forgotten to format the first one. |
Hi Chris @wakibi, is it possible to output this list as part of the ETL process? I could then include the information in the data quality file? |
Hi @dean-breed, do you want it sent as an email attachment? |
That would be great if it could. Or just stored in a repo somewhere. Either works @wakibi |
By their nature, BTREE indexes are case sensitive and that should be included in our documentation. Removing the case sensitivity on BTREEs removes the performance gains they were used in the first place. Users need made be aware of this
The text was updated successfully, but these errors were encountered: