Description
openedon May 27, 2023
When using owned JSON entities, we currently use JSON_VALUE to extract scalars out of them. Since JSON_VALUE always returns nvarchar(4000), we apply a cast to the results based on the model type. For example:
_ = await ctx.Blogs.Where(b => b.Details.Foo == 8).ToArrayAsync();
... translates to:
SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
WHERE CAST(JSON_VALUE([b].[Details],'$.Foo') AS int) = 8
There are issues with this translation:
- Binary data (varbinary) cannot be extracted in this way: the standard JSON representation for binary data is base64, but a regular relational CAST doesn't do that (see comment). OPENJSON with WITH does decode base64 data, since it applies a JSON-specific conversion. Binary data specifically is covered by #33435.
- On the other hand, spatial data cannot be converted with OPENJSON with WITH, only with a regular cast. The same may be true of hierarchyid.
- We need to go through all supported SQL Server types
- JSON_VALUE returns null for strings larger than 4000 (or can throw in strict mode) (see #29477).
- There's good reason to believe that this translation is inefficient compared to OPENJSON with WITH, where the query supplies more information to SQL Server.
The alternative translation would be to use OPENJSON with WITH instead:
SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
CROSS APPLY OPENJSON([b].[Details]) WITH ([Foo] int '$.Foo') AS [d]
WHERE [d].[Foo] = 8
In other words, every scalar access adds a CROSS APPLY to an OPENJSON invocation, applying the JSON conversion with WITH.
When the property being extracted is known to be a short string in the model (fitting in nvarchar(4000)), we could keep the current JSON_VALUE translation. When the type being extracted is incompatible with OPENJSON with WITH (e.g. geometry), we could do OPENJSON without WITH and then apply a relational cast (we should avoid JSON_VALUE still it truncates).
Note: consider the indexability of this technique, compared to JSON_VALUE (with computed columns)