Can I have more columns in the train set? #43

Open
shainaraza opened this issue Mar 13, 2020 · 10 comments

Comments

@shainaraza

Other than the format specified below, can I have more columns as features?
guid: An ID for the row.
label: The label for the row (should be an int).
alpha: A column of the same letter for all rows. Not used in classification but still expected by the DataProcessor.
text: The sentence or sequence of text.
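For reference, the four-column layout described above can be sketched as follows. This is a minimal illustration, not the repo's own code: the tab separator, the absence of a header row, and the example sentences are assumptions, and the data is written to an in-memory buffer rather than a real train.tsv file.

```python
import csv
import io

# Each row follows the column order from the list above:
# guid, label (int), alpha (same throwaway letter everywhere), text.
rows = [
    (0, 1, "a", "The film was a pleasant surprise."),
    (1, 0, "a", "I would not watch it again."),
]

# Assumption: tab-separated values with no header row.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)

tsv_text = buffer.getvalue()
print(tsv_text)
```

Any extra feature column added to this file would simply be ignored or rejected by a processor that expects exactly these four fields, which is why the question about the model class comes up.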

@ThilinaRajapakse
Owner

Not without creating your own model class. Transformer models only accept a sequence of text as their input.

@shainaraza
Author

Thank you very much for your reply.
Can I make some changes here?
def __init__(self, input_ids, input_mask, segment_ids, label_id):
    self.input_ids = input_ids
    self.input_mask = input_mask
    self.segment_ids = segment_ids
    self.label_id = label_id

@ThilinaRajapakse
Owner

I'm not sure where that piece of code is from. Essentially, you'll need to edit the BertForSequenceClassification class in the transformers library so that it can accept additional inputs. You'll also need to write the forward() function to handle the inputs.
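As a rough sketch of that idea (not the actual BertForSequenceClassification code), one option is a small wrapper around BertModel whose forward() concatenates the pooled text representation with the extra columns before the classification layer. The class name BertWithExtraFeatures, the extra_features argument, and the tiny randomly initialized config are all hypothetical and chosen only so the example runs without downloading weights.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel


class BertWithExtraFeatures(nn.Module):
    """Hypothetical classifier: BERT pooled output + extra numeric columns."""

    def __init__(self, config, num_extra_features, num_labels):
        super().__init__()
        self.bert = BertModel(config)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        # The classifier sees the text vector and the extra features side by side.
        self.classifier = nn.Linear(
            config.hidden_size + num_extra_features, num_labels
        )

    def forward(self, input_ids, attention_mask, extra_features):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = self.dropout(outputs[1])  # index 1 is the pooled [CLS] output
        combined = torch.cat([pooled, extra_features], dim=-1)
        return self.classifier(combined)


# Tiny random config (an assumption for illustration, not a pretrained model).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertWithExtraFeatures(config, num_extra_features=3, num_labels=2)
model.eval()

input_ids = torch.randint(0, 100, (4, 8))      # batch of 4 sequences, 8 tokens
attention_mask = torch.ones_like(input_ids)
extra_features = torch.rand(4, 3)              # 3 extra columns per row

with torch.no_grad():
    logits = model(input_ids, attention_mask, extra_features)
print(logits.shape)
```

In practice the data loading code would also need to carry the extra columns through to each batch, and the encoder would be loaded from pretrained weights rather than a random config.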

@shainaraza
Author

Thank you, ThilinaRajapakse, for your great work and timely responses. I am using this library and will definitely acknowledge and cite you in my upcoming work. All the best.

@ThilinaRajapakse
Owner

No problem!

Take a look at Simple Transformers as well. You may find it easier to work with compared to this repo.

@shainaraza
Author

Yes, I am using Simple Transformers too; it's super easy to use.
I am currently using Google Colab. Sometimes I get the error "RuntimeError: CUDA error: device-side assert triggered".
Which cloud GPU services do you suggest? My dataset is about 2 GB.
Thanks in advance.

@ThilinaRajapakse
Owner

That error normally happens when you have bad data in your dataset (invalid labels, special characters, etc.)
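A quick pre-training check along these lines can surface such rows before they reach the GPU. This is only a sketch: the function name find_bad_rows, the (text, label) row layout, and the specific checks are assumptions, and it covers only empty text and labels that are not integers in the range 0 to num_labels - 1.

```python
def find_bad_rows(rows, num_labels):
    """Return (index, reason) pairs for rows likely to break training."""
    problems = []
    for i, (text, label) in enumerate(rows):
        if not isinstance(text, str) or not text.strip():
            problems.append((i, "empty or non-string text"))
        elif isinstance(label, bool) or not isinstance(label, int):
            problems.append((i, "label is not an int"))
        elif not 0 <= label < num_labels:
            problems.append((i, "label outside 0..num_labels-1"))
    return problems


# Hypothetical rows for a 2-class task; the last three are deliberately bad.
rows = [
    ("good movie", 1),
    ("bad movie", 0),
    ("", 1),
    ("fine", 2),
    ("odd", "1"),
]
problems = find_bad_rows(rows, num_labels=2)
for index, reason in problems:
    print(index, reason)
```

Running one small batch on the CPU is another common way to narrow this down, since the same out-of-range label usually produces a more readable error there.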

I don't use cloud GPUs so I'm afraid I can't really recommend any.

@shainaraza
Author

Thanks, ThilinaRajapakse, for your timely response once again. I agree with you about the data. One last question for today: can I run the same Simple Transformers models on a CPU? I mean, when you built and tested all these models, did you use GPUs or just a CPU?

@ThilinaRajapakse
Owner

You can run them on either. However, running on CPU will be far too slow for it to be practical. I always train using a GPU.
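A small sketch of choosing between the two at runtime is shown below. The helper name pick_device is hypothetical, and the fallback to "cpu" when PyTorch is not installed is an assumption added so the snippet runs anywhere.

```python
def pick_device():
    """Return "cuda" when a GPU is usable through PyTorch, else "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch there is nothing to run on a GPU anyway.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# With Simple Transformers, the same decision can be passed through the
# use_cuda argument of the model constructor, for example:
# model = ClassificationModel("bert", "bert-base-cased", use_cuda=(device == "cuda"))
```

The commented-out line is left inactive because it would download pretrained weights; the model type and name in it are placeholders.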

@shainaraza
Author

Thanks, and best wishes to you.
