Help with Converting Spatio-Temporal Dataset for Consumption #19
For each time frame (one line in your training corpus), if it contains only 5 features, you could build an embedding model in which each time frame has its own unique ID. I just updated RNNSharp to support embedding models in raw text format, so you can use that format for training directly. Please replace WORDEMBEDDING_FILENAME with WORDEMBEDDING_RAW_FILENAME in the configuration file. For #2, yes, it looks good: each time frame has a corresponding label as its result.
Hello: I'm getting closer. I've since extracted all the time frames I want to train on into a single file: rawModel.txt. It has the format: \t\t\t\t\t I've also created a train.txt file, and it is in the format: \t Finally, I've also created a template.txt file. It looks like this: U01:%x[0,0] I've modified the BAT file to use the new files, but it's not working the way I had planned. 1.) How does RNNSharp (RNNSharpConsole) know when one spatio-temporal entity has ended and a new one begins? I'm mostly asking about the edge cases. I've tried separating the entities with a blank line, but an exception is thrown stating that the lengths are not the same.
Since you are going to use continuous values as features, template.txt should keep only one line: U01:%x[0,0]. All the other lines are used for discrete features only. In the training corpus, RNNSharp uses a blank line to separate two entities, but the embedding model (rawModel.txt in your example) does not need blank lines: it is just a set of key-value pairs. RNNSharp looks up the embedding model by keyword and gets the dense features from it for encoding or decoding. RNNSharp already supports embedding models in raw text format, so you can sync the latest code from the depot and use it. In your case, the configuration file looks like:
#The file name for template feature set
WORDEMBEDDING_RAW_FILENAME: rawModel.txt
I hope this information helps. As for the exception you mentioned, could you please share more details about it?
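As a rough illustration of the two files discussed here, the sketch below builds the lines of a key-value embedding file (one unique key per time frame, followed by its feature values) and of a training corpus (key and label per frame, with a blank line between entities). The function name `build_rnnsharp_inputs`, the `F0`, `F1`, ... key scheme, and the tab-separated layout are assumptions made for this example based on the formats mentioned in the thread; they are not confirmed against RNNSharp itself.

```python
from typing import List, Tuple

# One entity = (class label, list of time frames); each frame is a list of floats.
Entity = Tuple[str, List[List[float]]]


def build_rnnsharp_inputs(entities: List[Entity]) -> Tuple[List[str], List[str]]:
    """Return (embedding_lines, corpus_lines) for a list of entities.

    Hypothetical layout, assumed from the thread:
      embedding line: <key>\t<v1>\t<v2>...   (no blank lines, pure key-value)
      corpus line:    <key>\t<label>         (blank line between entities)
    """
    embedding_lines: List[str] = []
    corpus_lines: List[str] = []
    frame_no = 0
    for index, (label, frames) in enumerate(entities):
        if index > 0:
            corpus_lines.append("")  # blank line separates two entities
        for values in frames:
            key = f"F{frame_no}"  # unique id per time frame (assumed scheme)
            frame_no += 1
            embedding_lines.append("\t".join([key] + [str(v) for v in values]))
            corpus_lines.append(f"{key}\t{label}")
    return embedding_lines, corpus_lines


if __name__ == "__main__":
    demo = [
        ("ClassLabel1", [[0.923, 0.223], [0.92, 0.228]]),
        ("ClassLabel2", [[0.918, 0.733]]),
    ]
    emb, corpus = build_rnnsharp_inputs(demo)
    print("\n".join(emb))
    print("---")
    print("\n".join(corpus))
```

Writing `emb` to rawModel.txt and `corpus` to train.txt would give every key in the corpus exactly one matching entry in the embedding file, which is the property the key-value lookup described above relies on.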
Hello,
I have a spatio-temporal dataset that I have compiled. It's in a TSV format, and I'd like your RNNSharp to consume the input for classification as well as recognition. My features are continuous values in the range [0, 1]. My TSV file looks like the following:
ID1 0.923 0.223 0.573 0.235 0.111
ID1 0.920 0.228 0.353 0.213 0.098
ID1 0.901 0.677 0.235 0.551 0.121
...
ID1 0.853 0.383 0.301 0.618 0.132
ID1 0.918 0.733 0.622 0.222 0.238
ID1 0.985 0.682 0.793 0.221 0.465
...
ID1 0.953 0.788 0.912 0.228 0.539
ID2 0.918 0.733 0.622 0.222 0.238
ID2 0.985 0.682 0.793 0.221 0.465
...
ID2 0.953 0.788 0.912 0.228 0.539
Each line in my TSV is a snapshot at a specific moment in time. When all the snapshots are combined, they describe the spatio-temporal entity. These entities are separated by an EMPTY LINE. Therefore, the first instance of ID1 is all the lines until you reach the empty line. The second instance of ID1 is the next set of contiguous lines, and so on. Note that the first TSV value is just a class label and is not a feature. Also, I have 6 class labels for this spatio-temporal dataset.
1.) First, how can I transform my data into an "embedded feature" that is in the correct model format? I assume this is done with Txt2Vec?
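To make the layout concrete, here is a minimal Python sketch of how such a file could be read into entities, treating each blank line as the boundary between two entities and the first field of every line as the class label. The function name `split_entities` is made up for this example, and splitting on any whitespace (tabs or spaces) is an assumption about the file.

```python
from typing import Iterable, List, Optional, Tuple

# One entity = (class label, list of snapshots); each snapshot is a list of floats.
Entity = Tuple[str, List[List[float]]]


def split_entities(lines: Iterable[str]) -> List[Entity]:
    """Group snapshot lines into entities, using blank lines as separators."""
    entities: List[Entity] = []
    label: Optional[str] = None
    frames: List[List[float]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current entity (if there is one).
            if frames:
                entities.append((label, frames))
            label, frames = None, []
            continue
        fields = line.split()  # accepts tabs or spaces
        current_label = fields[0]
        if label is not None and current_label != label:
            raise ValueError(f"label changed inside an entity: {label} -> {current_label}")
        label = current_label
        frames.append([float(v) for v in fields[1:]])
    if frames:  # last entity may not be followed by a blank line
        entities.append((label, frames))
    return entities


if __name__ == "__main__":
    sample = [
        "ID1\t0.923\t0.223\t0.573\t0.235\t0.111",
        "ID1\t0.920\t0.228\t0.353\t0.213\t0.098",
        "",
        "ID2\t0.918\t0.733\t0.622\t0.222\t0.238",
    ]
    for entity_label, snapshots in split_entities(sample):
        print(entity_label, len(snapshots))
```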
2.) Additionally, I will have to create a corpus. Will the following work for the corpus?
ID1 ClassLabel1
ID2 ClassLabel2
ID3 ClassLabel3
ID4 ClassLabel4
ID5 ClassLabel5
ID6 ClassLabel6
3.) Additional steps or a walkthrough would be greatly appreciated. I hope this information helps all others who are trying to consume RNNSharp. When I finish, I hope to compile a walkthrough for others, so they can easily consume this great technology.
Thank you.