Column types for Views #838
Replies: 6 comments 10 replies
-
I think these work and will solve most of the immediate problems users might have. I also think that we should have the ability to list or count items based on specific user criteria for list and count.
I had the same impression regarding count vs. maximum as separate types, and I'd go with function as a type in this case.
For convenience, I'm wondering if users would appreciate being able to extend types (save functions as custom types). |
Beta Was this translation helpful? Give feedback.
-
I have some thoughts. But: I'm a little worried that I don't understand or our type system well enough, or perhaps I'm misunderstanding the question. So this could be totally off-base.... Could we instead make Beyond "read-only", if there are other characteristics that distinguish views types from table types, I'd lean towards using a similar approach of making type properties based on the smallest possible kernel of that characteristic, conceptually. As a counter-example, if we think of "Maximum" as a type, I predict we're actually going to end up wanting separate types for "Maximum Number" and "Maximum Date" (and others too) because I should be able to do different things with a date than I can with a number. |
Beta Was this translation helpful? Give feedback.
-
I'm uncomfortable with overloading the concept of "type" in this way. The As for alternate representations: I think we should separate the data type of a column from the data source or generation method in the user's mind. I.e., you could have an integer column with data from input, or you could have an integer column with data from an aggregation. You could have a composite column with data from input, or you could have a composite column with data from a foreign table. Separating the source or generation of data from its type would be clearer, and would also remove some of the difficulties around the "read only" discussion. Types wouldn't control "read only", and the data source or generation method would make "read only" or not obvious.
I think we should have a few concepts (avoiding the terminology of types, but trying to capture the concepts you seem to be going for):
Totally agree. I think we should separate the list type from the method used to generate the list, however. Lists can come from aggregations, but they could also come from interpolating existing data, or even extrapolating from it. They could also be user input.
See my above comment; I think this is one of many tricky problems that would be avoided if we don't overload the concept of "type" to handle "where did the data come from".
I think we should provide common data generation methods, and ways to combine them.
We're going to have to be reaally careful how we present data generation options to users. There are some methods (e.g., window functions, CTEs, etc.) that are way more powerful than anything available in spreadsheets. However, there are some concepts that are just kind of not available in DB-land. For example, think of the "total" cell in a spreadsheet. You have a column representing per-sale profit that's generated from each row, and then a cell called "total profit" at the bottom of the profit column that's the sum of all cells in the profit column. This is obviously not something that is appropriate in a database table, or even a view.
My inclination would be to guide the user toward having a separate view that's just the I think we also need to make sure that it's clear that horizontal aggregation and vertical aggregation aren't the same, and they have different properties in a DB. Any vertical aggregations (e.g., We should also consider window functions for aggregations and formulas. You can do some really cool stuff (rolling averages, accumulated sums, etc.) that would be pretty useful in a number of situations. TL;DR: I think we should avoid lumping data generation, functions, and other calculation in with "types". We should instead make it a separate concept and handle it appropriately. We may even want to have different concepts for fundamentally different ways of generating data in the DB. |
Beta Was this translation helpful? Give feedback.
-
I agree with @seancolsen and @mathemancer. My first instinct was to just have the "derived" columns like sum, list or max (columns whose values are derived from other data in the database) be same old Mathesar types, but read-only, like Sean suggested. Like Brent pointed out, it's not that these columns contain new data types (they don't), but that their value is derived not from external input (like user input or some something else outside the database), but from data that's already on the database. A formalization of the concept of "value origin" and of a database function (e.g. sum, list, max), which would effectively be a type of value origin, is a good solution, I think. |
Beta Was this translation helpful? Give feedback.
-
It sounds good to me to have view specific data types.
I see the point made by @seancolsen and @mathemancer regarding having the same original mathesar type for the aggregation columns. It does make sense to me to have the user choose a data source for the column separate to the type as this would also solve other cases like representing auto-generated default-value for columns (such as for id). But the design for this is going to be tricky. Hiding it behind a data-source option on the UX might lead to the user not knowing if we support aggregations. |
Beta Was this translation helpful? Give feedback.
-
Thanks for the feedback, everyone. I think the simplest solution might be for all columns in Views to have both "type" and "source" attributes. The "source" can be another table or a function. I'll need to do some more thinking about how we can make the source attribute user friendly, I'll start another discussion when I have some thoughts about it. @ghislaineguerin This is going to affect the design for all the other Views issues as well, since now every column in Views will have a new attribute. |
Beta Was this translation helpful? Give feedback.
-
As I've been thinking about Views, I'm conceptualizing them as a superset of tables. Columns in Views can have the same data types as tables but they can also be data that doesn't directly map to table data types and is computed or summarized in some way. I think that we'll need to create additional "Mathesar types" that only apply to views.
As an example, let's take the view referenced in the Views wiki page. We would have the following data types:
INTEGER
VARCHAR
SELECT array( SELECT Name FROM Actors );
SELECT MAX(Date) FROM Movie Watch WHERE Movie ID == <id>
SELECT COUNT(Date) FROM Movie Watch WHERE Movie ID == <id>
VARCHAR
INTEGER
Here,
List
,Maximum
andCount
would be View-specific Mathesar data types mapping to a certain type of SQL query. We may not allow users to update data in these types, but we'd be able to recognize them, display them differently, and allow users to create new columns of these types. Please note that these are examples of the concept, I haven't finalized these names or specific behavior.Questions
Beta Was this translation helpful? Give feedback.
All reactions