Corrected norwegian bokmal stopwords and removed nynorsk words #293
@@ -90,7 +90,10 @@ public enum Language
 Polish = 12,
 Czech = 13,
 Arabic = 14,
-Japanese = 15
+Japanese = 15,
+
+[HideEnumValue]
+Norwegian_Bokmal_v1 = 256
Shouldn't this be v2, as the original should remain as "Norwegian_Bokmal"? #Closed

Hi @glebuk. The code as it stands is correct here. Usage of

So should we move the previous enum id to v1 and assign a new id to the new Norwegian_Bokmal enum label then? Otherwise how would the old model be compatible?
 }

 public sealed class Column : OneToOneColumn
@@ -198,6 +201,11 @@ public ColInfoEx(ModelLoadContext ctx, ISchema input)
 // int: the id of languages column name
 Lang = (Language)ctx.Reader.ReadInt32();
 Contracts.CheckDecode(Enum.IsDefined(typeof(Language), Lang));
+if(Lang == Language.Norwegian_Bokmal
+&& ctx.Header.ModelVerWritten == 0x00010001)
+{
+Lang = Language.Norwegian_Bokmal_v1;
+}
 _langsColName = ctx.LoadStringOrNull();
 if (_langsColName != null)
 {
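The version check in this hunk can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the library's code: `LanguageCompat` and `Resolve` are hypothetical names, and the numeric value of `Norwegian_Bokmal` is assumed because the diff does not show it. The idea is that a model written at version 0x00010001 stored the id of the old stopword list, so that id is redirected to the hidden `_v1` value on load, while newer models get the corrected list.

```csharp
using System;

// Illustrative stand-in for the Language enum; only the values needed here.
public enum Language
{
    Norwegian_Bokmal = 11,     // numeric value assumed; the hunk does not show it
    Norwegian_Bokmal_v1 = 256  // legacy stopword list, as added in the diff
}

// Hypothetical helper, named for this sketch only.
public static class LanguageCompat
{
    // Version written by models that predate the stopword correction (from the diff).
    public const uint VerInitial = 0x00010001;

    // Maps the id stored in a model file to the stopword list to use at run time.
    public static Language Resolve(int storedId, uint modelVerWritten)
    {
        // Reject ids that do not correspond to any enum member.
        if (!Enum.IsDefined(typeof(Language), storedId))
            throw new FormatException($"Unknown language id {storedId}");

        var lang = (Language)storedId;

        // Old models keep the old list; everything else is returned unchanged.
        if (lang == Language.Norwegian_Bokmal && modelVerWritten == VerInitial)
            return Language.Norwegian_Bokmal_v1;
        return lang;
    }
}
```

With this sketch, `LanguageCompat.Resolve(11, 0x00010001)` yields `Norwegian_Bokmal_v1`, while the same id read from a model written at 0x00010002 stays `Norwegian_Bokmal`. Keeping the public enum label on the corrected list means new users get the fix by default, at the cost of a hidden value that only the loader assigns.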
@@ -229,8 +237,8 @@ private static VersionInfo GetVersionInfo()
 {
 return new VersionInfo(
 modelSignature: "STOPWRDR",
-verWrittenCur: 0x00010001, // Initial
-verReadableCur: 0x00010001,
+verWrittenCur: 0x00010002, // Initial
Instead of just

verWrittenCur: 0x00010002, // Initial

I would prefer something more along these lines.

//verWrittenCur: 0x00010001, // Initial
verWrittenCur: 0x00010002, // Corrected Norwegian Bokmål stopwords.

The idea is that as more versions are added, you have a little running catalog of why each version bump was necessary. The most extreme example of this is in our machinelearning/src/Microsoft.ML.Data/DataLoadSave/Text/TextLoader.cs, lines 909 to 919 in fc7286c.
+verReadableCur: 0x00010002,
 verWeCanReadBack: 0x00010001,
 loaderSignature: LoaderSignature);
 }
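How the three version fields in this hunk interact can be sketched as below. This is a simplified stand-in written for illustration: `SimpleVersionInfo` and `CanLoad` are hypothetical names, and the compatibility rule in `CanLoad` is an assumption about the intended semantics (a model is loadable when its readable version is not newer than what the code writes, and its written version is not older than the oldest version the code can read back), not a copy of the library's check.

```csharp
// Hypothetical, simplified stand-in for the version fields passed to VersionInfo.
public sealed class SimpleVersionInfo
{
    public uint VerWrittenCur { get; }
    public uint VerReadableCur { get; }
    public uint VerWeCanReadBack { get; }

    public SimpleVersionInfo(uint verWrittenCur, uint verReadableCur, uint verWeCanReadBack)
    {
        VerWrittenCur = verWrittenCur;
        VerReadableCur = verReadableCur;
        VerWeCanReadBack = verWeCanReadBack;
    }

    // Assumed rule: reject models that need newer code than this,
    // and models older than the oldest format this code still understands.
    public bool CanLoad(uint modelVerWritten, uint modelVerReadable)
        => modelVerReadable <= VerWrittenCur && modelVerWritten >= VerWeCanReadBack;
}
```

Under that assumed rule, code carrying the new values (0x00010002, 0x00010002, 0x00010001) still loads a model written at 0x00010001, which is what makes the enum remapping on load reachable. Code carrying the old values (all 0x00010001) would refuse a model whose readable version is 0x00010002, so an older build cannot silently apply the wrong stopword list to a new model.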
There seems to be a DEBUG-only path in the general PropertyGroup section. The build type and platform should be replaced with appropriate variables. As it stands, the path is incorrect for the release build.