-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Fixed the problem that importing the specified database name does not take effect. #1930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughUpdated listTables in metadataService to filter non-DUCKDB tables by connectionInfo.database. It retrieves tables via ibisAdaptor.getTables and applies schema/database-name matching rules; DUCKDB behavior using wrenEngineAdaptor is unchanged. If no database is specified or tables lack schema info, results default to unfiltered or backward-compatible inclusion. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant MetadataService
participant IbisAdaptor
participant WrenEngineAdaptor
Client->>MetadataService: listTables(connectionInfo)
alt connectionInfo.engine == "DUCKDB"
MetadataService->>WrenEngineAdaptor: getTables(connectionInfo)
WrenEngineAdaptor-->>MetadataService: allTables
MetadataService-->>Client: allTables
else non-DUCKDB
MetadataService->>IbisAdaptor: getTables(connectionInfo)
IbisAdaptor-->>MetadataService: allTables
alt connectionInfo.database provided
Note over MetadataService: Filter rules:<br/>1) schema.table → schema==database<br/>2) properties.schema/database_name==database<br/>3) no schema info → keep
MetadataService-->>Client: filteredTables
else no database
MetadataService-->>Client: allTables
end
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
wren-ui/src/apollo/server/services/metadataService.ts (2)
75-80
: Consider server-side filtering via ibisAdaptor to reduce over-fetching.If IIbisAdaptor can accept a database/schema filter, pass it down to avoid fetching cross-database tables and then filtering in-memory. This cuts latency on large catalogs and reduces payload sizes. If not currently supported, consider extending the adaptor API in a follow-up.
102-105
: Exclude ambiguous tables when a database is set
listTables currently falls back toreturn true
for tables without anyschema
ordatabase_name
property even whenconnectionInfo.database
is specified, which can leak cross-database tables. Add a strict mode (feature flag or data-source predicate) to filter out tables lacking explicit schema/database info when a database is set, or emit metrics to monitor how often ambiguous tables are returned.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
wren-ui/src/apollo/server/services/metadataService.ts
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
wren-ui/src/apollo/server/services/metadataService.ts (1)
wren-ui/migrations/20240531085916_transfer_model_properties.js (2)
connectionInfo
(18-21)connectionInfo
(31-34)
// Filter tables by database/schema if database is specified in connectionInfo | ||
const database = connectionInfo?.database; | ||
if (database && allTables.length > 0) { | ||
// For data sources that use database as schema name, filter tables by schema | ||
return allTables.filter((table) => { | ||
// Check if table name contains schema information | ||
// Table names might be in format: schema.table or just table | ||
const tableName = table.name; | ||
const tableNameParts = tableName.split('.'); | ||
|
||
if (tableNameParts.length > 1) { | ||
// If table name has schema prefix, check if it matches the database | ||
const schemaName = tableNameParts[0]; | ||
return schemaName === database; | ||
} else { | ||
// If no schema prefix, check table properties for schema information | ||
const tableSchema = | ||
table.properties?.schema || table.properties?.database_name; | ||
if (tableSchema) { | ||
return tableSchema === database; | ||
} | ||
// If no schema information available, include the table | ||
// (this maintains backward compatibility) | ||
return true; | ||
} | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Normalize identifiers and handle multi-part names (e.g., BigQuery) to avoid false negatives.
Current logic assumes the schema is the first segment before a dot. For BigQuery, fully-qualified names can be project.dataset.table, where the dataset is the penultimate segment. Also, comparisons are case/quote sensitive, and properties may expose dataset/schema under different keys. Suggest normalized, data-source-aware matching with broader property fallback.
Apply this diff within the current block to make matching robust while preserving your backward-compat behavior:
- // Filter tables by database/schema if database is specified in connectionInfo
- const database = connectionInfo?.database;
- if (database && allTables.length > 0) {
- // For data sources that use database as schema name, filter tables by schema
- return allTables.filter((table) => {
- // Check if table name contains schema information
- // Table names might be in format: schema.table or just table
- const tableName = table.name;
- const tableNameParts = tableName.split('.');
-
- if (tableNameParts.length > 1) {
- // If table name has schema prefix, check if it matches the database
- const schemaName = tableNameParts[0];
- return schemaName === database;
- } else {
- // If no schema prefix, check table properties for schema information
- const tableSchema =
- table.properties?.schema || table.properties?.database_name;
- if (tableSchema) {
- return tableSchema === database;
- }
- // If no schema information available, include the table
- // (this maintains backward compatibility)
- return true;
- }
- });
- }
+ // Filter tables by database/schema if database is specified in connectionInfo
+ const database =
+ connectionInfo?.database ??
+ connectionInfo?.schema ??
+ connectionInfo?.dataset;
+
+ // helper: strip quotes/backticks/brackets, lowercase, trim
+ const normalizeIdent = (v?: string) =>
+ typeof v === 'string'
+ ? v.replace(/[`"'[\]]/g, '').trim().toLowerCase()
+ : v;
+
+ if (database && Array.isArray(allTables) && allTables.length > 0) {
+ const dbNorm = normalizeIdent(database);
+ return allTables.filter((table) => {
+ // Prefer explicit properties if present
+ const propSchemaRaw =
+ table.properties?.schema ??
+ table.properties?.schema_name ??
+ table.properties?.database_name ??
+ table.properties?.database ??
+ table.properties?.dataset ??
+ table.properties?.dataset_name;
+ const propSchema = normalizeIdent(propSchemaRaw);
+ if (propSchema) {
+ return propSchema === dbNorm;
+ }
+
+ // Fallback: parse from table.name
+ const parts = String(table.name ?? '')
+ .replace(/[`"']/g, '')
+ .split('.')
+ .filter(Boolean);
+
+ // Data-source-aware extraction of schema/dataset segment
+ let schemaFromName: string | undefined;
+ if (dataSource === DataSourceName.BIGQUERY) {
+ // project.dataset.table -> dataset is the penultimate segment
+ schemaFromName = parts.length >= 2 ? parts[parts.length - 2] : undefined;
+ } else {
+ // postgres etc.: schema.table -> schema is first segment
+ schemaFromName = parts.length >= 2 ? parts[0] : undefined;
+ }
+ if (schemaFromName) {
+ return normalizeIdent(schemaFromName) === dbNorm;
+ }
+
+ // If no schema information available, include the table
+ // (maintains backward compatibility)
+ return true;
+ });
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Filter tables by database/schema if database is specified in connectionInfo | |
const database = connectionInfo?.database; | |
if (database && allTables.length > 0) { | |
// For data sources that use database as schema name, filter tables by schema | |
return allTables.filter((table) => { | |
// Check if table name contains schema information | |
// Table names might be in format: schema.table or just table | |
const tableName = table.name; | |
const tableNameParts = tableName.split('.'); | |
if (tableNameParts.length > 1) { | |
// If table name has schema prefix, check if it matches the database | |
const schemaName = tableNameParts[0]; | |
return schemaName === database; | |
} else { | |
// If no schema prefix, check table properties for schema information | |
const tableSchema = | |
table.properties?.schema || table.properties?.database_name; | |
if (tableSchema) { | |
return tableSchema === database; | |
} | |
// If no schema information available, include the table | |
// (this maintains backward compatibility) | |
return true; | |
} | |
}); | |
} | |
// Filter tables by database/schema if database is specified in connectionInfo | |
const database = | |
connectionInfo?.database ?? | |
connectionInfo?.schema ?? | |
connectionInfo?.dataset; | |
// helper: strip quotes/backticks/brackets, lowercase, trim | |
const normalizeIdent = (v?: string) => | |
typeof v === 'string' | |
? v.replace(/[`"'[\]]/g, '').trim().toLowerCase() | |
: v; | |
if (database && Array.isArray(allTables) && allTables.length > 0) { | |
const dbNorm = normalizeIdent(database); | |
return allTables.filter((table) => { | |
// Prefer explicit properties if present | |
const propSchemaRaw = | |
table.properties?.schema ?? | |
table.properties?.schema_name ?? | |
table.properties?.database_name ?? | |
table.properties?.database ?? | |
table.properties?.dataset ?? | |
table.properties?.dataset_name; | |
const propSchema = normalizeIdent(propSchemaRaw); | |
if (propSchema) { | |
return propSchema === dbNorm; | |
} | |
// Fallback: parse from table.name | |
const parts = String(table.name ?? '') | |
.replace(/[`"']/g, '') | |
.split('.') | |
.filter(Boolean); | |
// Data-source-aware extraction of schema/dataset segment | |
let schemaFromName: string | undefined; | |
if (dataSource === DataSourceName.BIGQUERY) { | |
// project.dataset.table -> dataset is the penultimate segment | |
schemaFromName = parts.length >= 2 ? parts[parts.length - 2] : undefined; | |
} else { | |
// postgres etc.: schema.table -> schema is first segment | |
schemaFromName = parts.length >= 2 ? parts[0] : undefined; | |
} | |
if (schemaFromName) { | |
return normalizeIdent(schemaFromName) === dbNorm; | |
} | |
// If no schema information available, include the table | |
// (maintains backward compatibility) | |
return true; | |
}); | |
} |
🤖 Prompt for AI Agents
In wren-ui/src/apollo/server/services/metadataService.ts around lines 81 to 107,
the current filter assumes the schema is the first dot-segment and does
case/quote-sensitive exact matches which fails for multi-part names (e.g.,
project.dataset.table), quoted identifiers, and varying property keys; update
the filter to: split name into segments and treat the penultimate segment as
schema for names with 3+ parts (first segment for 2-part names), normalize
identifiers by trimming surrounding quotes and lowercasing both table-derived
segments and the connectionInfo.database, check multiple property keys for
schema/dataset (e.g., schema, database_name, dataset, dataset_id, datasetId),
compare using normalized values, and preserve the backward-compat behavior of
including tables when no schema info is available.
Problem presentation
This shows that there are two different local database names
ranch-vue
andranchvue
.Import the database configuration in WrenAI's Settings, as shown below, which shows that the database name is specified as
ranchvue
.As shown in the figure, the database name is specified as
ranchvue
, But when clickingNext
, the tables corresponding to bothranch-vue
andranchvue
databases will be loaded in. This is obviously wrong, because I have specified the database name asranchvue
, and the result is obviously that it is not loaded according to the specified database name.problem solving
Modify wren-ui -> metadataService.ts -> listTables, and add code in the listTables function to filter the specified database name.
Finally, this problem has been solved. As shown in the figure, only the tables corresponding to the database name
ranchvue
are displayed.Summary by CodeRabbit