if freq / 1.0 / self.total_word_count > self.sub_sampling_t}
# freq to score
sub_sample_tbl = {item: sub_sampling(_freq) for item, _freq in sub_sample_tbl.items()}
# word to id
sub_sample_tbl = {self.word2id[i]: j for i, j in sub_sample_tbl.items() if j < 1}
return sub_sample_tbl
`
About line 9:
9. def sub_sampling(_freq): it looks like this returns p(w_i) = sqrt(sub_sampling / freq) + sub_sampling / freq, not p(w_i) = 1 - (sqrt(sub_sampling / freq) + sub_sampling / freq). Is that right?
Why is this line needed:
14. if freq / 1.0 / self.total_word_count > self.sub_sampling_t}
when we already filtered earlier with
4. if freq < self.min_count: in the def gen_vocab(self) function in the first part of the question?
What is the meaning of this line?
sub_sample_tbl = {self.word2id[i]: j for i, j in sub_sample_tbl.items() if j < 1}
thank you!
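For reference, here is a minimal sketch of the two formulas compared in the question about line 9. It assumes `t` is the subsampling threshold (`sub_sampling_t`) and `f` is a word's relative frequency (its count divided by the total word count). The function names and the toy frequencies are hypothetical and not taken from the repository. The word2vec paper states a discard probability of 1 - sqrt(t/f), while the original word2vec C code computes a keep score of sqrt(t/f) + t/f, which is the form `sub_sampling` appears to return.

```python
import math


def keep_score(f, t=1e-4):
    """Keep score in the style of the original word2vec C code: sqrt(t/f) + t/f."""
    ratio = t / f
    return math.sqrt(ratio) + ratio


def discard_probability(f, t=1e-4):
    """Discard probability as written in the word2vec paper: 1 - sqrt(t/f), clipped at 0."""
    return max(0.0, 1.0 - math.sqrt(t / f))


def build_subsample_table(rel_freqs, t=1e-4):
    """Store only words that can actually be dropped (keep score below 1)."""
    # First filter: words at or below the threshold always have a score >= 2.
    scores = {w: keep_score(f, t) for w, f in rel_freqs.items() if f > t}
    # Second filter: a score of 1 or more means "always keep", so skip it.
    return {w: s for w, s in scores.items() if s < 1}


# Hypothetical relative frequencies, for illustration only.
rel_freqs = {"the": 0.05, "model": 0.0002, "zebra": 0.00001}
table = build_subsample_table(rel_freqs)
print(table)
```

Under these assumptions only "the" ends up in the table: "zebra" fails the `f > t` test, and "model" passes it but has a score above 1. That would explain the `if j < 1` filter: a word with a score of 1 or more is never dropped, so it does not need an entry. How the table is consumed during training (for example, keeping a word when a uniform random draw falls below its score) is an assumption here, not something shown in the posted code.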
In def gen_vocab(self) we keep only the words whose frequency is >= self.min_count, like this:
Can you please clarify this function (def gen_subsample_table(self))?
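One way to see why both tests might exist is that they target opposite ends of the frequency range: `min_count` is an absolute cutoff that removes rare words from the vocabulary, while the relative-frequency test against `sub_sampling_t` selects very frequent words as candidates for subsampling. The sketch below is only an illustration under that reading. The helper name, the toy corpus, and the toy threshold of 0.25 are all hypothetical (real thresholds are typically far smaller), and the way the total count is computed may differ from the repository.

```python
from collections import Counter


def split_by_filters(tokens, min_count=2, t=0.25):
    """Apply a min_count vocabulary filter, then a relative-frequency test."""
    counts = Counter(tokens)
    total = sum(counts.values())  # assumption: total over all tokens
    # Rare words (count below min_count) are dropped from the vocabulary.
    vocab = {w: c for w, c in counts.items() if c >= min_count}
    # Among the remaining words, only very frequent ones are subsampling candidates.
    frequent = {w for w, c in vocab.items() if c / total > t}
    return vocab, frequent


tokens = "the cat sat on the mat the cat ran".split()
vocab, frequent = split_by_filters(tokens)
print(vocab)
print(frequent)
```

In this toy corpus "cat" survives the `min_count` filter but is not frequent enough to be a subsampling candidate, while "the" passes both tests, so the second filter is not redundant with the first.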