Extension to other languages #17
Comments
For other languages you would need a dataset similar to the original one. It should include the pen strokes and the corresponding text. The pen strokes are stored as consecutive points, so the letter above would be represented this way:
[[27, 18, 0],
[24, 16, 0],
[21, 16, 0],
[16, 19, 0],
[14, 25, 0],
[16, 31, 0],
[21, 32, 0],
[26, 28, 0],
[27, 23, 0],
[28, 18, 0],
[27, 28, 0],
[29, 31, 1]] # 1 means this is the end of a stroke
The first two numbers are the coordinates, and the third number says whether the point is the end of a stroke. Along with the above sequence there should also be a label. In the case of a single letter it would be just: For example |
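As a minimal sketch of the pairing described above, the point list and its label could be held together like this. The `char_to_index` dictionary, the `encode_label` helper, and the choice of "a" as the drawn letter are illustrative assumptions, not the project's actual mapping.

```python
import numpy as np

# Points for one letter as (x, y, end_of_stroke), copied from the comment above.
points = np.array([
    [27, 18, 0], [24, 16, 0], [21, 16, 0], [16, 19, 0],
    [14, 25, 0], [16, 31, 0], [21, 32, 0], [26, 28, 0],
    [27, 23, 0], [28, 18, 0], [27, 28, 0], [29, 31, 1],
], dtype=np.float32)

# Hypothetical character-to-index mapping (an assumption for illustration).
char_to_index = {' ': 1, 'a': 2, 'b': 3}


def encode_label(text, mapping):
    """Turn a text label into an array of integer indices."""
    return np.array([mapping[ch] for ch in text], dtype=np.int64)


# Assumes, purely for illustration, that the drawn letter is "a".
label = encode_label('a', char_to_index)

# Row indices of the points after which the pen is lifted.
stroke_ends = np.flatnonzero(points[:, 2] == 1)

print(points.shape, label, stroke_ends)
```

Each dataset entry would then be one such `(points, label)` pair, with the label usually spanning a whole word or sentence rather than a single letter.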
Thank you so much for your reply |
Hello, is there any software with which we can generate this sequence on our own? I cannot seem to find any. Also, can you elaborate a bit more on the structure of your .npy file? |
@youraveragesciencepal, unfortunately I don't know of any software that could be used here. As for the elaboration on the file structure:
import numpy as np
import matplotlib.pyplot as plt
import pickle
# load the preprocessed data from the file
data = np.load('data/dataset.npy', allow_pickle=True)
# let's look at its shape
print(data.shape)
# >>> (10867,)
# so it's more of a list of examples
# now if we look into some example
print(data[0].shape)
# >>> (855, 3)
# this array stores consecutive points that represent a pen stroke
# if you print first few points
print(data[0][:5])
# >>> [[ 0.82798475 -4.2939095 0. ]
# [ 0.7848605 -4.3370337 0. ]
# [ 0.8193599 -4.2852845 0. ]
# [ 0.81073505 -4.2939095 0. ]
# [ 0.8021102 -4.2939095 0. ]]
# you will see that we store (x, y, e) in each row
# x and y represent coordinates, and e holds special information on
# whether or not after that point we will "lift" the pen (and because it
# is lifted after that point, we wouldn't see the line between those points)
# let's plot first example ignoring `e` part for now
example = data[0]
plt.plot(example[:, 0], -example[:, 1]) # y coordinate is inverted,
# but that's not really important
plt.show()
# this should display a single example from the dataset, as you can see
# it looks like someone didn't lift a pen during writing
# now let's include information stored in `e`
lifts = np.where(example[:, 2] == 1.)[0] + 1 # we do +1 here because we want to
# split after lifted point
strokes = np.split(example, lifts)
for s in strokes:
    plt.plot(s[:, 0], -s[:, 1])
plt.show()
# this should display the same example, but without the lines
# that were drawn while the pen was "lifted"
# now let's move to labels and translation files
labels = np.load('data/labels.npy', allow_pickle=True)
with open('data/translation.pkl', 'rb') as f:
    translation = pickle.load(f)
# look at labels shape
print(labels.shape)
# >>> (10867,)
# it should be the same as `data` because we need a label for each
# example that's present in the dataset
print(labels[0])
# >>> [29, 78, 1, 47, 71, 58, 75, 68, 71, 1, 50, 62, 65, 65,
# 62, 54, 66, 72, 13, 1, 28, 1, 66, 68, 75, 58]
# each number in this array represents a letter which we can decode using
# the reversed translation dictionary (we need to reverse it, because it was
# created to be used during the generation, where we need to convert text into
# the numerical labels, but in this example we want to do the reverse)
reversed_translation = {v: k for k, v in translation.items()}
print(''.join(reversed_translation[x] for x in labels[0]))
# >>> "By Trevor Williams. A move"
# which should show the same text we could previously read on plots |
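For a new language, the labels and the translation dictionary would have to be built from your own transcriptions. The following is a rough sketch of that step, not the project's preprocessing code: `build_translation`, `encode`, and the sample sentences are made up for illustration, and starting the indices at 1 is an assumption (in the example above the space decodes from 1, but whether 0 is reserved in the original project is not stated here).

```python
import numpy as np


def build_translation(sentences, first_index=1):
    """Map every character seen in `sentences` to an integer index."""
    alphabet = sorted(set(''.join(sentences)))
    return {ch: i for i, ch in enumerate(alphabet, start=first_index)}


def encode(sentences, translation):
    """Encode each sentence as an array of indices, one entry per example."""
    labels = np.empty(len(sentences), dtype=object)
    for i, sentence in enumerate(sentences):
        labels[i] = np.array([translation[ch] for ch in sentence])
    return labels


# Made-up transcriptions; accented characters simply become extra entries.
sentences = ['el año', 'él come']
translation = build_translation(sentences)
labels = encode(sentences, translation)

# Round trip, mirroring the decoding shown above.
reversed_translation = {v: k for k, v in translation.items()}
decoded = [''.join(reversed_translation[x] for x in label) for label in labels]
print(decoded)
```

The object array keeps one variable-length label per example, which matches the `(10867,)` shape printed for the original `labels.npy`.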
Hello, after several months I am able to bring the data into the array format. In the example above you have shown that a 1 indicates the pen has been lifted. All 37 characters, with 77 variations each, now have their points as (x, y, z) coordinates with z = 0 or 1. So I do not need to do any further data preprocessing, right? Also, do you have any tips on how to convert it into the numpy format you have used? I have all the coordinates in CSV format. |
Hi @youraveragesciencepal, I am also researching how to create new datasets with different alphabets to train the model. How did you finally collect the variations? Did you create a tool or follow any guides? 🤔 |
@Grzego I see that your model generates the output in a Actually, in Spanish it has an accent mark on the first e, so it is an |
@youraveragesciencepal if your data is normalized then it should be fine. Otherwise, normalizing it similarly to what is done on lines L98-L115 of the preprocess.py file could be beneficial. For conversion from
Just one side note. The original dataset on which the model was trained contained whole sentences as sequences, meaning that the model could learn how to move smoothly from one letter to the next; for example, handwriting "at" is slightly different from "am". If I understand correctly, your current dataset contains only single letters, and in that case this model might not perform as expected. |
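One possible way to go from CSV rows to the array-of-arrays layout is sketched below. The column names (`sample_id`, `x`, `y`, `eos`) and the helper functions are assumptions about how such a CSV might look, and the centering and scaling step is only one plausible normalization, not a reproduction of what preprocess.py does.

```python
import csv
import io

import numpy as np


def load_examples(csv_file):
    """Group rows (sample_id, x, y, eos) into one (n, 3) array per sample."""
    groups = {}
    for row in csv.DictReader(csv_file):
        groups.setdefault(row['sample_id'], []).append(
            (float(row['x']), float(row['y']), float(row['eos'])))
    return [np.array(rows, dtype=np.float32) for rows in groups.values()]


def center(example):
    """Shift x and y so that each example has zero mean; keep the flag column."""
    out = example.copy()
    out[:, :2] -= out[:, :2].mean(axis=0)
    return out


def pack(examples):
    """Store variable-length examples in a 1-D object array."""
    data = np.empty(len(examples), dtype=object)
    for i, example in enumerate(examples):
        data[i] = example
    return data


# Tiny in-memory CSV standing in for a real file (hypothetical layout).
csv_text = """sample_id,x,y,eos
a_01,27,18,0
a_01,24,16,0
a_01,21,16,1
b_01,10,10,0
b_01,12,14,1
"""

examples = [center(ex) for ex in load_examples(io.StringIO(csv_text))]

# One global scale so that all examples share the same units.
scale = np.concatenate([ex[:, :2] for ex in examples]).std()
for ex in examples:
    ex[:, :2] /= scale

data = pack(examples)
# np.save('data/dataset.npy', data, allow_pickle=True)  # uncomment to write
print(data.shape, data[0].shape)
```

With a real file, `io.StringIO(csv_text)` would be replaced by `open(path, newline='')`. The result has the same outer shape convention as the original `dataset.npy`: a 1-D array with one `(n, 3)` array per example.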
Hi, @espetro. About your question on adding marks to the generated sequence: this is theoretically possible, but I would not recommend it. The way to achieve this would be to inspect
# somewhere in the for loop in lines 87-108 in generate.py
# record indices of all points related to a letter we want to deal with
# `special_letter_idx` is the index of our special letter in input sequence
if np.argmax(phi_data[-1]) == special_letter_idx:
    special.append(len(coords))
# ...
# somewhere after the previous for loop at line 118 in generate.py
# at this point `coords` actually holds deltas between consecutive points
# so we must "inject" differences to draw something in the correct place
special_coords = coords[special] # select coords related to our letter
cs_special = np.cumsum(special_coords, axis=0) # compute actual letter shape
min_x = np.min(cs_special[:, 0])
max_x = np.max(cs_special[:, 0])
max_y = np.max(cs_special[:, 1])
rightdiff = max_x - min_x
updiff = max_y - cs_special[-1, 1] + 0.05 # how high to move a pen
# create array of differences that would be injected into generated sequence
injection = np.array([
    [0, 0, 1], # just to lift a pen
    [-rightdiff, -updiff, 0],
    [rightdiff, 0, 1],
    [0, updiff, 1], # move pen back to starting position
])
coords = np.concatenate((
    coords[:special[-1]], # before injection
    injection,
    coords[special[-1]:], # put back everything after
), axis=0)
# ...
# proceed to plotting
This will add a horizontal line above
As I said previously, I do not recommend doing it that way. A better and probably more reliable way of implementing this would be to create a dataset in the language you want to generate (although I understand this can be rather hard). |
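The reason the injection above does not shift the rest of the text is that its offsets sum to zero. A small self-contained sketch on synthetic deltas shows this; `inject_net_zero` and the numbers are hypothetical, and only the four-row pattern is taken from the snippet above.

```python
import numpy as np


def inject_net_zero(deltas, position, width, height):
    """Insert a pen detour at `position` whose x and y offsets sum to zero."""
    injection = np.array([
        [0.0, 0.0, 1.0],         # stay in place and lift the pen
        [-width, -height, 0.0],  # move to where the mark starts
        [width, 0.0, 1.0],       # draw the horizontal mark, then lift
        [0.0, height, 1.0],      # return to the point we left (flag as in the snippet)
    ])
    return np.concatenate(
        (deltas[:position], injection, deltas[position:]), axis=0)


# Synthetic deltas (dx, dy, lift flag); not output of the real model.
deltas = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, -1.0, 0.0],
    [1.0, 0.0, 1.0],
])

injected = inject_net_zero(deltas, position=2, width=2.0, height=1.5)

# Absolute positions are the running sum of the deltas.
before = np.cumsum(deltas[:, :2], axis=0)
after = np.cumsum(injected[:, :2], axis=0)

print(before[-1], after[-1])
```

Because the detour is closed, every point after the injection lands where it would have landed without it; only the extra mark is added.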
Did you ever extend the dataset to full sentences and get reasonable output for special characters? Digits and consecutive capitals generally perform poorly. I thought about adding this as well, but I am not sure about the scope of generating the x, y, z XML format. Has anyone done this? Is there any tooling to make it easier? |
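If the strokes are available as XML, reading them into the `(x, y, e)` layout could look roughly like this. The `Stroke` and `Point` element names and the `x`/`y` attributes are an assumed layout for illustration; a real file may differ, and `strokes_from_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

import numpy as np


def strokes_from_xml(xml_text):
    """Flatten <Stroke>/<Point> elements into rows of (x, y, end_of_stroke)."""
    root = ET.fromstring(xml_text)
    rows = []
    for stroke in root.iter('Stroke'):
        points = [(float(p.get('x')), float(p.get('y')))
                  for p in stroke.iter('Point')]
        for i, (x, y) in enumerate(points):
            is_last = i == len(points) - 1
            rows.append((x, y, 1.0 if is_last else 0.0))
    return np.array(rows, dtype=np.float32).reshape(-1, 3)


# Made-up document following the assumed layout.
sample = """
<StrokeSet>
  <Stroke>
    <Point x="27" y="18"/>
    <Point x="24" y="16"/>
  </Stroke>
  <Stroke>
    <Point x="10" y="10"/>
    <Point x="12" y="14"/>
    <Point x="15" y="13"/>
  </Stroke>
</StrokeSet>
"""

example = strokes_from_xml(sample)
print(example.shape)
```

The last point of every stroke gets the flag 1, which is the convention described earlier in this thread.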
Hello, I am interested in training the model on other languages such as Spanish and Turkish, but I am not sure how to generate the stroke sequences. Any help will be highly appreciated.
Thanks