[Tagging] Implementation of Tag Validation #127

##  Background

While implementing tagging tests, we uncovered a number of bugs in the `slugify()` code and in tagging. In particular, issues #123 and #124 have started a discussion around what we want to allow or disallow in tags, how we want validation and error handling to work with tagging, and whether or not we want to include related tags from a particular endpoint (e.g. resources, hangouts, learning paths, discussions, projects, etc.). Since much of that discussion happened inside the PR for tagging tests, we decided to port it to its own issue as we work out the intended behavior and architecture.

The relevant section of PR #121 on the implementation of tag validation is below:

______


#### **[chris48s](https://github.com/chris48s)** [7 days ago](https://github.com/codebuddies/backend/pull/121#discussion_r399795637)

Author Member

In these two cases, the tests pass based on the values from the fixtures but it seems like these aren't generating particularly useful slugs (esp when you consider the objective of slugging is to make the text unique and URL-safe). What should we be doing here?
If a single emoji isn't a valid tag name, should we just be validating here so that trying to save a tag called "🐸" throws `ValidationError`?
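
To make the problem concrete, here is a minimal sketch of an ASCII-only slugify plus a validator that rejects any tag whose slug comes out empty. `ascii_slugify` is a stdlib approximation written for illustration and may not match what the project's `slugify()` does; `validate_tag_name` and the local `ValidationError` class are hypothetical stand-ins as well.

```python
import re
import unicodedata


class ValidationError(Exception):
    """Stand-in for the framework's ValidationError (assumption)."""


def ascii_slugify(value: str) -> str:
    """Approximate an ASCII-only slugify: drop non-ASCII, lowercase, hyphenate."""
    # Decompose accented characters, then discard anything that is not ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove everything except word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def validate_tag_name(name: str) -> str:
    """Return the slug for `name`, or raise if nothing usable survives."""
    slug = ascii_slugify(name)
    if not slug:
        raise ValidationError(f"Tag {name!r} does not produce a usable slug.")
    return slug
```

Under this approximation, "🐸" and a tag written entirely in Devanagari both reduce to an empty slug and are rejected, while "Javascript 🙂" still yields `javascript`.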


#### **[BethanyG](https://github.com/BethanyG)** [7 days ago](https://github.com/codebuddies/backend/pull/121#discussion_r399869794) 

Member

I think that if we can't get emojis to work correctly through slugifying, then yes - we should throw a validation error. I can log an issue about that...along with our issues around Hindi and Telugu. I really don't want to have to transliterate, but we may need to go that route if we can't find an alternative to what slugify is doing to those languages.


#### **[BethanyG](https://github.com/BethanyG)** [7 days ago](https://github.com/codebuddies/backend/pull/121#discussion_r399926078) 

Member

Logged issue [#123](https://github.com/codebuddies/backend/issues/123) for the script fail and [#124](https://github.com/codebuddies/backend/issues/124) for the emoji.


#### **[lpatmo](https://github.com/lpatmo)** [7 days ago](https://github.com/codebuddies/backend/pull/121#discussion_r399974490) 

Member

(fwiw from a product perspective, I think it's totally fine for us not to support emojis in tags. Validating against it sounds reasonable!)

👍 1


#### **[BethanyG](https://github.com/BethanyG)** [20 hours ago](https://github.com/codebuddies/backend/pull/121#discussion_r403644974) 

Member

So - I think adding in validators (and tossing errors) for emoji and picto-ascii would be a good thing. But we have some behaviors to think through for the api/app. I am seeing multiple scenarios - but also have questions:

1. Single or repeated unicode or ascii emoji/symbols as the sole content of a tag : reject out of hand.
   - validator for DRF should toss an error here. HOWEVER - does that mean the entire creation of the `resource` fails? I don't think so, since tags are not *technically* part of a `resource`....but that then creates some issues.
   - if a validation error **is** thrown, do we create the `resource`, but hand back a message to the effect that one or more of the `tags` was invalid? HTTP [**207**](https://httpstatuses.com/207) with an embedded 400 and a message of *"Your tag text contains one or more unsupported characters, and was not saved."*? This needs some detailed scoping.
   - if a validation error is thrown, do we create the `resource` but omit all the tags and hand back a message?
   - if only **one** of multiple tags has a problem, do we drop the "bad" tag, or do we fail the whole set of tags?
2. Single or repeated unicode or ascii emoji/symbols as ***part*** of a tag : do we drop these silently - creating the tag/slug without them, or do we throw a validation error for the entire tag? And if we do throw a validation error for the entire tag, which scenario from above do we apply when creating the `resource` ?
3. When we validate, will we be validating for both the name and the slug?
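
The two ways of handling a mixed list of tags from scenario 1 can be sketched side by side. This is only an illustration: `is_valid_tag` is a placeholder rule, and the function names and the local `ValidationError` class are hypothetical, not part of the codebase.

```python
from typing import Dict, List, Tuple


class ValidationError(Exception):
    """Stand-in for the framework's ValidationError (assumption)."""


def is_valid_tag(tag: str) -> bool:
    """Placeholder rule (assumption): letters, digits, spaces and hyphens only."""
    return bool(tag.strip()) and all(ch.isalnum() or ch in " -" for ch in tag)


def split_tags(tags: List[str]) -> Tuple[List[str], List[str]]:
    """Partition tags into (valid, invalid) while keeping input order."""
    valid = [t for t in tags if is_valid_tag(t)]
    invalid = [t for t in tags if not is_valid_tag(t)]
    return valid, invalid


def tags_fail_whole_set(tags: List[str]) -> List[str]:
    """Policy A: one bad tag rejects every tag in the request."""
    valid, invalid = split_tags(tags)
    if invalid:
        raise ValidationError(f"Unsupported characters in tags: {invalid}")
    return valid


def tags_drop_invalid(tags: List[str]) -> Dict[str, List[str]]:
    """Policy B: keep the good tags and report the dropped ones."""
    valid, invalid = split_tags(tags)
    return {"saved": valid, "dropped": invalid}
```

Policy A is simpler to explain to API consumers; Policy B needs a response shape that can carry partial success, which is where the 207 question comes from.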


#### **[chris48s](https://github.com/chris48s)** [12 hours ago](https://github.com/codebuddies/backend/pull/121#discussion_r403695805) 

Author Member

> Single or repeated unicode or ascii emoji/symbols as the sole content of a tag : reject out of hand.

Done.

> Single or repeated unicode or ascii emoji/symbols as part of a tag : do we drop these silently - creating the tag/slug without them, or do we throw a validation error for the entire tag?
> When we validate, will we be validating for both the name and the slug?

So far, the only validation I've done applies to the *slug*, so basically whichever variation claims the slug first gets it i.e: "Javascript 🙂" is a legal tag name, and if you create "Javascript 🙂" first, that tag now owns the slug `javascript`.

If you want to apply the same rules to the name, I wonder if it is actually useful for the tag name and slug to be different?
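
One possible reading of "whichever variation claims the slug first gets it" is sketched below with an in-memory registry. `TagRegistry` and `ascii_slug` are hypothetical stand-ins for the tag model and the project's `slugify()`; whether a later variation should resolve to the existing tag, as here, or be rejected is an assumption.

```python
import re
import unicodedata
from typing import Dict


def ascii_slug(name: str) -> str:
    """Approximate an ASCII-only slugify (assumption, not the project's code)."""
    value = unicodedata.normalize("NFKD", name)
    value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


class TagRegistry:
    """In-memory stand-in for a tag table with a unique slug column."""

    def __init__(self) -> None:
        self._by_slug: Dict[str, str] = {}

    def get_or_create(self, name: str) -> str:
        """Return the stored tag name that owns the slug for `name`."""
        slug = ascii_slug(name)
        if not slug:
            raise ValueError(f"{name!r} has no usable slug")
        # setdefault stores `name` only when the slug is still unclaimed.
        return self._by_slug.setdefault(slug, name)
```

With this sketch, creating "Javascript 🙂" first means later requests for "Javascript" or "javascript" resolve to the emoji-bearing name, which illustrates why validating the name as well as the slug may be worth considering.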

------

The other questions on API behaviour are probably best moved to another issue to avoid trying to boil the ocean in one PR as this is already getting long. Let's just focus on the model behaviour here.

My gut instinct though is that working out these behaviours would probably be much easier if the operations of:

- creating tags
- creating resources
- applying tags to resources

were 3 separate endpoints/operations. Then you can know if your tag is valid before you apply it to a resource. You can still tie those 3 things into what feels like a single 'page' on the frontend for a nice UX.
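
As a rough sketch of that split, the three operations could look like the in-memory service below. Every name here (`TaggingService`, `create_tag`, `create_resource`, `apply_tags`) is hypothetical, and the ASCII-only validation rule is a placeholder assumption rather than an agreed policy.

```python
import uuid
from typing import Dict, List, Set


class TaggingService:
    """In-memory sketch of three independent operations (all names hypothetical)."""

    def __init__(self) -> None:
        self.tags: Set[str] = set()
        self.resources: Dict[str, dict] = {}

    def create_tag(self, name: str) -> str:
        """Operation 1: validate and store a tag on its own."""
        cleaned = name.strip().lower()
        # Placeholder rule (assumption): ASCII letters, digits, spaces, hyphens.
        allowed = all(c.isascii() and (c.isalnum() or c in " -") for c in cleaned)
        if not cleaned or not allowed:
            raise ValueError(f"invalid tag: {name!r}")
        self.tags.add(cleaned)
        return cleaned

    def create_resource(self, title: str) -> str:
        """Operation 2: create a resource with no tags attached yet."""
        guid = str(uuid.uuid4())
        self.resources[guid] = {"title": title, "tags": []}
        return guid

    def apply_tags(self, guid: str, names: List[str]) -> List[str]:
        """Operation 3: attach already-created tags to an existing resource."""
        unknown = [n for n in names if n not in self.tags]
        if unknown:
            raise KeyError(f"unknown tags: {unknown}")
        self.resources[guid]["tags"].extend(names)
        return self.resources[guid]["tags"]
```

Because `create_tag` fails before any resource exists, a caller learns that a tag is invalid without the resource creation being affected.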


#### **[lpatmo](https://github.com/lpatmo)** [3 hours ago](https://github.com/codebuddies/backend/pull/121#discussion_r403755601) 

Member

Ohh, I see -- so after a user finishes typing a tag, we can fire off a POST request to a /tags endpoint to check whether that tag is valid. *Then* they submit the form.

> applying tags to resources

Can you say more about how this endpoint would work?

------

> HOWEVER - does that mean the entire creation of the resource fails?

From a naive user perspective, if I were submitting to create a resource and had a tag that was invalid in some way (e.g. had an emoji character), then having the entire form submission fail on submit (e.g. no tags or resources are created) is reasonable to me. I'd expect to see an error about some invalid character in one of the tags, change it appropriately, then re-submit the form.


#### **[BethanyG](https://github.com/BethanyG)** [#121 (comment)](https://github.com/codebuddies/backend/pull/121#discussion_r403776601)

Member

Humm.. as [@chris48s](https://github.com/chris48s) suggested -- we probably should take this to a separate issue. So I will make that shortly. Especially since this convo will be hidden once the PR is merged in.

I think the web UX could be fairly straightforward, and could include both a UI message (***the following characters are not allowed for tags***) & blocking the user from typing said emoji/ascii/character sets (a background form/field validation that dropped or highlighted any unicode or ascii characters in that range, similar to what I have seen with some set password fields). I don't think there is even a need there to fire off a POST - we just need to agree which characters/character sets are going to be off limits, and then **fail** them before they can get to the backend (theoretically).

But the API itself is a bit different (since we're building for more than the website, and can't give feedback to users in the same way) -- and in that case, for clarity, we probably do want to have a whole creation failure of an object with a message about why. (***"your tags contain invalid/out of bounds characters. Characters in the following ranges are not allowed. ..."***).

OR we have a separate tagging endpoint & action (*pass in the GUID of the object to be tagged, along with a list of tags and get back a verification that tagging for that object has succeeded or failed w/ a message about why*).

Now - **what** characters would cause that is still a bit open....

I would disallow emoji and picto ascii altogether in either tag **names** or **slugs**. Why go there? Doesn't seem to add significant value to our users at the moment.

On the other hand, having the ability to tag or note something in your native language **does** have value - so I still want to noodle on that problem.
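
One way to reject emoji and picto-ASCII while still accepting tags in any script is to validate by Unicode general category instead of by ASCII range. The sketch below is an assumption about what such a rule could look like, not an agreed policy; `validate_tag_characters`, the allow-list constants, and the local `ValidationError` class are all hypothetical.

```python
import unicodedata


class ValidationError(Exception):
    """Stand-in for the framework's ValidationError (assumption)."""


# Assumed allow-list: letters (L*), combining marks (M*), and numbers (N*)
# from any script, plus space and hyphen as separators.
ALLOWED_MAJOR_CATEGORIES = {"L", "M", "N"}
ALLOWED_SEPARATORS = {" ", "-"}


def validate_tag_characters(name: str) -> str:
    """Return `name` unchanged, or raise if it contains a disallowed character."""
    # Require at least one letter or number so "-" or " " alone is rejected.
    if not any(unicodedata.category(ch)[0] in {"L", "N"} for ch in name):
        raise ValidationError("A tag must contain at least one letter or number.")
    for ch in name:
        if ch in ALLOWED_SEPARATORS:
            continue
        if unicodedata.category(ch)[0] not in ALLOWED_MAJOR_CATEGORIES:
            raise ValidationError(
                f"Character {ch!r} (U+{ord(ch):04X}) is not allowed in tags."
            )
    return name
```

This accepts "हिन्दी" and "తెలుగు" (letters plus combining marks) and rejects "🐸", "Javascript 🙂", and ":-)". It has known gaps: it also rejects names such as "c++", and a category check alone does not catch every emoji sequence (a keycap emoji is a digit followed by combining marks), so the final rule would need more thought.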

#### **[lpatmo](https://github.com/lpatmo)** [22 minutes ago](https://github.com/codebuddies/backend/pull/121#discussion_r403802305)

Member

Sounds good! (Btw, to clarify my earlier statement, I was talking about the error message from an API call as well -- that is, even though we'd have front-end validation, I'd like to see an error message returned if I, say, make a POST request in Postman to create a resource with problematic tags included. Then I'd expect the entire request to fail, if that makes sense. :P)

[![image](https://user-images.githubusercontent.com/4512699/78518311-0101de00-7775-11ea-9a05-d918f3a5b138.png)](https://user-images.githubusercontent.com/4512699/78518311-0101de00-7775-11ea-9a05-d918f3a5b138.png)
--> error

_____

##  Decisions We've Made So Far:

1.  The backend/api will throw a `ValidationError` for any `tag` that contains what we consider "invalid" characters.
2.  A validation error will cause the POST request to create a new `resource` to fail with a return message along the lines of "One or more of your tags contain invalid/unsupported characters."
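
The two decisions above can be sketched as a single all-or-nothing create operation. This is a framework-free illustration: the `create_resource` function, the `invalid_tags` field, the placeholder `is_valid_tag` rule, and the 400/201 status codes are assumptions, not the agreed API contract.

```python
from typing import Dict, List, Tuple


def is_valid_tag(tag: str) -> bool:
    """Placeholder rule (assumption): letters, digits, spaces and hyphens only."""
    return bool(tag.strip()) and all(ch.isalnum() or ch in " -" for ch in tag)


def create_resource(payload: Dict, store: List[Dict]) -> Tuple[int, Dict]:
    """Return (status_code, body); nothing is saved when any tag is invalid."""
    bad = [t for t in payload.get("tags", []) if not is_valid_tag(t)]
    if bad:
        body = {
            "tags": [
                "One or more of your tags contain invalid/unsupported characters."
            ],
            "invalid_tags": bad,  # hypothetical field naming the offenders
        }
        return 400, body
    store.append(payload)
    return 201, payload
```

Returning the offending tags alongside the message lets an API client correct the request without guessing which tag failed.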


##  What We Still Need to Decide/Discuss

1.  What sets of characters will cause a `ValidationError`?
2.  Will both the **name** of the tag and the **slug** of a tag be validated?
3.  Will we continue having tags returned as part of a `resources` endpoint, or will we move toward having `tagging` as its own endpoint?
4.  If `tagging` becomes its own endpoint, what does that look like, and what is the expected flow/interaction?
