Language: add support for accessing ENUM display names #6169

Closed
@beccasaurus

Feature Request: add support for accessing the display names of enum values

For example:

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

client = language.LanguageServiceClient()
document = types.Document(content=text, type=enums.Document.Type.PLAIN_TEXT)
tokens = client.analyze_syntax(document).tokens

for token in tokens:
    print(token.part_of_speech.tag)  # <-- tag is an enum field.
                                     #     link below to RPC reference for Tag.

Expected result, based on all of the other languages' GAPIC clients:

NOUN
ADV
VERB
NOUN

Actual result:

6
3
11
6

I get the expected result when using GAPIC client libraries in all of the other languages.

I would characterize the current behavior as: surprising


The current developer experience, if you want to access the enum display names:

  1. Find the relevant enum, e.g. Natural Language Part of Speech Tag (noun, verb, pronoun, etc.)
  2. Copy the display names into an indexed list in your program:
    tag_names = ('UNKNOWN', 'ADJ', 'ADP', 'ADV', 'CONJ', 'DET', 'NOUN', 'NUM', 'PRON', 'PRT', 'PUNCT', 'VERB', 'X', 'AFFIX')
  3. Look up the display name by using the enum's integer value, available on the pb object, as the index
  4. Check back on the .proto occasionally in case new enum values have been appended

There may be a better way, but this is the process we use to author our Python samples.
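
One possible alternative to the hand-copied tuple, sketched here with only the standard library: declare the values as an `enum.IntEnum`, so that an integer converts to its display name through `.name`. The `Tag` class and the `tag_name` helper below are hypothetical local stand-ins that mirror the display names listed above. They are not the classes shipped in `google.cloud.language`, and whether the library's own `enums.PartOfSpeech.Tag` supports the same conversion is an assumption that would need checking.

```python
import enum


class Tag(enum.IntEnum):
    """Hypothetical local mirror of the part-of-speech tag values."""
    UNKNOWN = 0
    ADJ = 1
    ADP = 2
    ADV = 3
    CONJ = 4
    DET = 5
    NOUN = 6
    NUM = 7
    PRON = 8
    PRT = 9
    PUNCT = 10
    VERB = 11
    X = 12
    AFFIX = 13


def tag_name(value):
    """Return the display name for an integer tag value."""
    return Tag(value).name


# The integers from the "Actual result" above.
for value in (6, 3, 11, 6):
    print(tag_name(value))
```

Run against the four integers from the actual result, this prints the four names from the expected result. The mirror still has to be kept in sync with the .proto by hand, so it only tidies the lookup; it does not remove step 4.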

Example: sample for analyzing syntax of text (natural-language/docs/analyzing-syntax)

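import six

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def syntax_text(text):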
    """Detects syntax in the text."""
    client = language.LanguageServiceClient()

    if isinstance(text, six.binary_type):
        text = text.decode('utf-8')

    # Instantiates a plain text document.
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    # Detects syntax in the document. You can also analyze HTML with:
    #   document.type == enums.Document.Type.HTML
    tokens = client.analyze_syntax(document).tokens

    # part-of-speech tags from enums.PartOfSpeech.Tag
    pos_tag = ('UNKNOWN', 'ADJ', 'ADP', 'ADV', 'CONJ', 'DET', 'NOUN', 'NUM',
               'PRON', 'PRT', 'PUNCT', 'VERB', 'X', 'AFFIX')

    for token in tokens:
        print(u'{}: {}'.format(pos_tag[token.part_of_speech.tag],
                               token.text.content))
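
A side effect of indexing a tuple, as the sample above does: if the service returns a value that was appended to the .proto after the tuple was copied, the lookup raises `IndexError`. A minimal sketch of a more defensive lookup follows; `lookup_tag_name` and the `UNRECOGNIZED(n)` fallback text are hypothetical names chosen for illustration and are not part of any Google library.

```python
# Display names copied from the sample above, indexed by enum value.
POS_TAG = ('UNKNOWN', 'ADJ', 'ADP', 'ADV', 'CONJ', 'DET', 'NOUN', 'NUM',
           'PRON', 'PRT', 'PUNCT', 'VERB', 'X', 'AFFIX')


def lookup_tag_name(value, names=POS_TAG):
    """Return the display name for value, or a placeholder if it is unknown.

    The explicit range check also rejects negative numbers, which plain
    tuple indexing would silently resolve from the end of the tuple.
    """
    if 0 <= value < len(names):
        return names[value]
    return 'UNRECOGNIZED({})'.format(value)


print(lookup_tag_name(6))
print(lookup_tag_name(14))
```

The first call prints `NOUN`; the second prints `UNRECOGNIZED(14)` instead of raising, which makes a stale copy of the names visible in the output rather than crashing the sample.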

/cc @lukesneeringer

/fyi @vchudnov-g @pongad @theacodes

Labels

api: language: Issues related to the Cloud Natural Language API.
codegen
type: feature request: 'Nice-to-have' improvement, new feature or different behavior or design.
