Feature pipeline changes + DEV language extractor #4
To be continued.
return self.clf.predict(samples)

def __map_input__(self, samples):
Why two underscores? From what I've read, __ is supposed to be "more private" than _, but I've never seen it used like this, and what you read is that it is better not to use it.
No idea, I haven't looked into Python conventions :D
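For reference, a minimal sketch of how Python treats the three naming styles discussed here. The class `Demo` and its method names are hypothetical and exist only for illustration. The point it shows: only names with two leading underscores and no two trailing underscores are name-mangled, so a name like `__map_input__` is not made "more private" at all; it just looks like a special method.

```python
class Demo:
    """Hypothetical class, used only to illustrate the naming styles."""

    def _single(self):
        # One leading underscore: "internal" by convention only.
        return "single"

    def __double(self):
        # Two leading underscores, no trailing ones: name-mangled
        # by the compiler to _Demo__double.
        return "double"

    def __dunder__(self):
        # Two leading AND two trailing underscores: not mangled.
        # This style is conventionally reserved for Python's own
        # special methods such as __init__ or __len__.
        return "dunder"


demo = Demo()

# Collect the attribute names as they are actually stored on the class.
names = [n for n in dir(Demo) if n.endswith(("single", "double", "dunder__"))]
print(names)
```

If the intent is simply "internal helper", a single leading underscore (for example `_map_input`) is the usual convention.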
"""
The languages returned from GitHub are mapped to the kb size of usage.
E.g. {'Python': 98564, 'R': 4914}
"""
Really?
for language in self.__get_relevant_languages__():
    if language in languages:
        relevant_size += languages[language]
I don't think this is case insensitive
Not case insensitive -> case sensitive? It is, yes. But the available languages are predefined anyway. We could also make everything defensive.
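If we do want the defensive variant, a rough sketch could look like the following. It assumes the `languages` dict has the shape from the docstring, and the function name `relevant_size` and the parameter `relevant_languages` (standing in for whatever `self.__get_relevant_languages__()` returns) are hypothetical. Both sides are normalised with `str.casefold()` before the lookup, so `'python'` and `'Python'` match.

```python
def relevant_size(languages, relevant_languages):
    """Sum the sizes of all relevant languages, ignoring case.

    languages: dict mapping language name to size,
               e.g. {'Python': 98564, 'R': 4914}
    relevant_languages: iterable of language names to count
                        (hypothetical stand-in for the extractor's list)
    """
    # Normalise the keys once; merge sizes if two keys differ only by case.
    sizes_by_folded_name = {}
    for name, size in languages.items():
        key = name.casefold()
        sizes_by_folded_name[key] = sizes_by_folded_name.get(key, 0) + size

    # Languages that are missing from the dict contribute 0.
    return sum(
        sizes_by_folded_name.get(name.casefold(), 0)
        for name in relevant_languages
    )


total = relevant_size({'Python': 98564, 'R': 4914}, ['python', 'r'])
print(total)
```

Since the language names are predefined on GitHub's side, the plain `in` check is probably fine as long as our own list uses the same spelling; the normalisation only guards against typos in that list.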
Will probably be merged into one FeatureExtractor
This one should not be used with the normal decision tree, because this extractor alone produces a total of 223 features.
Add categories to the feature extraction in order to split feature "b…
No description provided.