Skip to content

Conversation

davidangb
Copy link
Contributor

@davidangb davidangb commented Aug 5, 2025

Ticket: CORE-650

  • Refactor CaseSensivitySpec to use Quicksilver
  • move queries away from ENTITY_KEYS to use ENTITY instead; this avoids any case-sensitivity issues with ENTITY_KEYS (this helps with CORE-618)
  • add a case-sensitive character set and collation to the metadata query that calculates attribute names

The biggest problem we had was the third bullet above: the listEntityKeysViaEntity methods ignored case on attribute names.

in my example workspace, here is the before/after of the entity type metadata response:

Before

{
  "MyType": {
    "attributeNames": [
      "CASE",
      "notes"
    ],
    "count": 3,
    "idName": "MyType_id"
  },
  "nametest": {
    "attributeNames": [
      "foo"
    ],
    "count": 1,
    "idName": "nametest_id"
  },
  "program": {
    "attributeNames": [
      "myNewColumn",
      "program_name"
    ],
    "count": 1,
    "idName": "program_id"
  }
}

After

{
 "MyType": {
   "attributeNames": [
     "CASE",
     "notes"
   ],
   "count": 2,
   "idName": "MyType_id"
 },
 "mytype": {
   "attributeNames": [
     "CASE",
     "case",
     "notes"
   ],
   "count": 1,
   "idName": "mytype_id"
 },
 "nametest": {
   "attributeNames": [
     "foo"
   ],
   "count": 1,
   "idName": "nametest_id"
 },
 "program": {
   "attributeNames": [
     "myNewColumn",
     "program_name"
   ],
   "count": 1,
   "idName": "program_id"
 }
}

sql"""SELECT distinct entity_type, attribute_key
FROM ENTITY_KEYS , JSON_TABLE(attribute_keys, '$$[*]' COLUMNS(attribute_key VARCHAR(256) PATH '$$')) t
where workspace_id=$workspaceId;""".as[EntityTypeAndAttributeKey]

Copy link
Contributor Author

@davidangb davidangb Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, and below, I removed use of the ENTITY_KEYS table. This resolved some issues with case-sensitivity of attribute types.

def listEntityKeysViaEntity(workspaceId: UUID): ReadAction[Seq[EntityTypeAndAttributeKey]] =
sql"""SELECT distinct entity_type, attribute_key
FROM ENTITY, JSON_TABLE(JSON_KEYS(attributes, $slickAttrsPath), '$$[*]' COLUMNS(attribute_key VARCHAR(256) PATH '$$')) t
FROM ENTITY, JSON_TABLE(JSON_KEYS(attributes, $slickAttrsPath), '$$[*]' COLUMNS(attribute_key VARCHAR(256) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin PATH '$$')) t
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, and below, I added the character set/collation. This resolved case-sensitivity issues for attribute names.

actual should not be empty
actual.get.attributeKeys.parseJson.convertTo[List[String]] should contain theSameElementsAs attributeNames
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed these instead of fixing them, since ENTITY_KEYS will go away

@@ -1,9 +1,9 @@
package org.broadinstitute.dsde.rawls.entities.local

import akka.actor.ActorSystem
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the change in this file is making it work with Quicksilver. I've added comments in places where some other change was needed.

val metadata = provider.entityTypeMetadata(false, testContext).futureValue
metadata("cat").attributeNames.size shouldEqual exemplarAttributeNames.size
val metadata = provider.entityTypeMetadata(useCache = false, testContext).futureValue
metadata("cat").attributeNames should contain theSameElementsAs exemplarAttributeNames
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the assertion on .size. The should contain theSameElementsAs assertion covers that, and outputs more helpful errors when the test fails.

exemplarAttributeNames.size shouldBe allAttributes.size
allAttributes.foreach(attr => assert(attr.deleted))
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test doesn't make sense in a quicksilver world, so I've deleted it

.getEntity(caseInsensitiveAttributeData.head.entityType, caseInsensitiveAttributeData.head.name, testContext)
.futureValue
.attributes
allAttributes.keys.map(AttributeName.toDelimitedName) should contain theSameElementsAs exemplarAttributeNames
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the assertion on .size. The should contain theSameElementsAs assertion covers that, and outputs more helpful errors when the test fails.

val cachedTypes = runAndWait(entityTypeStatisticsQuery.getAll(testWorkspace.workspace.workspaceIdAsUUID))
cachedTypes.keySet shouldBe exemplarTypes
val cachedTypes = cachedKeys.map(_.entityType)
cachedTypes should contain theSameElementsAs exemplarTypes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a subtle assertion change here from shouldBe to should contain theSameElementsAs because we're now asserting on a Seq instead of a Set

@davidangb davidangb marked this pull request as ready for review August 5, 2025 18:06
@davidangb davidangb requested a review from a team as a code owner August 5, 2025 18:06
@davidangb davidangb requested review from calypsomatic and kevinmarete and removed request for a team August 5, 2025 18:06
@davidangb davidangb merged commit 4eb691c into develop Aug 5, 2025
25 checks passed
@davidangb davidangb deleted the da_CORE-650_quicksilverCaseSensitivity branch August 5, 2025 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants