Large comments , The aspects and sentiments are not getting extracted #318

Open
dexter16111989 opened this issue May 11, 2023 · 5 comments

Comments

@dexter16111989

dexter16111989 commented May 11, 2023

Below is my code. If the text is too long, the aspects and sentiments are not extracted. Please correct me if I am doing something wrong.

from pyabsa import AspectTermExtraction as ATEPC, available_checkpoints

# you can view all available checkpoints by calling available_checkpoints()
checkpoint_map = available_checkpoints()

aspect_extractor = ATEPC.AspectExtractor(
    'multilingual',
    auto_device=True,  # False means load model on CPU
    cal_perplexity=True,
)

# instance inference
atepc_result = aspect_extractor.predict(
    ["As per my previous review, I've had a few bumps along the road in the early days with Amazon. However after ordering frequently for over 2 years now, I can honestly say that either these issues have been resolved, or I'm always compensated in some way for any inconvenience."],
    save_result=True,
    print_result=True,  # print the result
    ignore_error=True,  # ignore the error when the model cannot predict the input
)


Output of atepc_result:

results_op

@dexter16111989 dexter16111989 changed the title For comments with large the aspects are not getting extracted Large comments , The aspects and sentiments are not getting extracted May 11, 2023
@yangheng95
Owner

There are two possible reasons:

  1. The text is too long; please try shortening it.
  2. Your data does not fit the training data well; you can try training the model on your own dataset.
    If this information does not help, please comment and @me. Thanks.
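One way to shorten the input, sketched here as an illustration rather than as the maintainer's recommended method, is to split the review into sentences and run inference on each sentence separately. The helper name `split_into_sentences` is hypothetical and not part of PyABSA; the regex is a simple heuristic that will mis-split abbreviations such as "e.g." or "Dr.".

```python
import re


def split_into_sentences(text):
    """Split text on whitespace that follows '.', '!' or '?'.

    Hypothetical helper, not part of PyABSA. This is a naive heuristic:
    abbreviations and decimal-like patterns followed by a space may be
    split incorrectly.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


review = (
    "As per my previous review, I've had a few bumps along the road in the "
    "early days with Amazon. However after ordering frequently for over 2 "
    "years now, I can honestly say that either these issues have been "
    "resolved, or I'm always compensated in some way for any inconvenience."
)

sentences = split_into_sentences(review)
for sentence in sentences:
    print(len(sentence.split()), "words:", sentence)
```

The resulting list can then be passed to `aspect_extractor.predict(...)` in place of the single long string, since the call in the original post already takes a list of texts.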

@SuryaprakashM

Hi,
Is there a specific limit on the length of the text?

@yangheng95
Owner

The limit is 80 words for the provided checkpoints, and at most 512 when you train your own model (set config.max_seq_len=512).
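To stay within the 80-word limit when using a provided checkpoint, a long text could be cut into fixed-size windows before inference. The following is a minimal sketch under the assumption that "words" means whitespace-separated tokens; the model's own tokenizer may count differently, so a smaller `max_words` may be safer. The function name `chunk_by_words` is hypothetical and not part of PyABSA.

```python
def chunk_by_words(text, max_words=80):
    """Cut text into consecutive chunks of at most max_words words.

    Hypothetical helper, not part of PyABSA. Words are assumed to be
    whitespace-separated; this is only an approximation of how the
    checkpoint measures length.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


# Synthetic 200-word input, used only to show the chunk sizes.
sample = " ".join(["word"] * 200)
chunks = chunk_by_words(sample)
print([len(chunk.split()) for chunk in chunks])
```

Fixed-size windows can separate an aspect term from the opinion words that describe it, so cutting at sentence boundaries is usually the gentler option when the sentences themselves are short enough.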

@SuryaprakashM

Thank you!

@combokang

> 80 words for the provided checkpoints and 512 maximum in your own training (set config.max_seq_len=512)

Is there any way to use your models with more than 512 tokens?
