-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace DvText with .NET Standard type. #705
Conversation
…to dvtext # Conflicts: # test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj # test/Microsoft.ML.Tests/Scenarios/Api/Visibility.cs
scan.Span = DvText.NA; | ||
else if (_sb.Length == 0) | ||
scan.Span = DvText.Empty; | ||
if (scan.QuotingError || _sb.Length == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scan.QuotingError [](start = 28, length = 17)
I wonder if throwing an exception is more appropriate here
labelNames = new VBuffer<DvText>(2, new[] { new DvText("positive"), new DvText("negative") }); | ||
DvText[] names = new DvText[2]; | ||
labelNames = new VBuffer<ReadOnlyMemory<char>>(2, new[] { "positive".AsMemory(), "negative".AsMemory() }); | ||
ReadOnlyMemory<char>[] names = new ReadOnlyMemory<char>[2]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadOnlyMemory[] names = new ReadOnlyMemory[2]; [](start = 12, length = 59)
indent #Resolved
Hi @codemzs! Do note that our usual practice for "personal" code reviews is that they happen in a personal fork. That is, there is nothing preventing one from requesting a "pull request" into master in ones own fork, where one can review the changes in convenience as much as they like. Indeed this is our usual practice. #Resolved |
var stratVals = foldCol >= 0 ? new[] { DvText.NA, DvText.NA } : new[] { DvText.NA }; | ||
//REVIEW: Not sure if empty string makes sense here. | ||
|
||
var stratVals = foldCol >= 0 ? new[] { "".AsMemory(),"".AsMemory() } : new[] { "".AsMemory() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to check on this.
/// </summary> | ||
public static string GetRawUnderlyingBufferInfo(out int ichMin, out int ichLim, ReadOnlyMemory<char> memory) | ||
{ | ||
MemoryMarshal.TryGetString(memory, out string outerBuffer, out ichMin, out int length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My attitude towards this is either we support ReadOnlyMemory<char>
or we don't. If we do, then our methods on top of it should work, even if this fails. Fortunately it seems like this was only introduced because we wanted to just have a light "shim" on top of existing DvText
methods, which is not really something we want to do. #Resolved
/// </summary> | ||
public static bool Equals(ReadOnlyMemory<char> b, ReadOnlyMemory<char> memory) | ||
{ | ||
if (memory.Length != b.Length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method and the one below could probably efficiently enough be done over direct indices over the ReadOnlyMemory<char>
structure, at least, so I would hope. This would simplify the method considerably. #Resolved
} | ||
|
||
/// <summary> | ||
/// Does not propagate NA values. Returns true if both are NA (same as a.Equals(b)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Returns true if both are NA" suggests this is copy-pasted from somewhere, since of course now we will no longer have the notion of an NA string. This suggests an incomplete conversion, and is an opportunity for code simplification that we should do right now. #Resolved
/// Returns a text span with leading and trailing spaces trimmed. Note that this | ||
/// will remove only spaces, not any form of whitespace. | ||
/// </summary> | ||
public static ReadOnlyMemory<char> Trim(ReadOnlyMemory<char> memory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two major motivations for moving from DvText
to ROM<char>
is that we get to avoid declaring our own special type for text, and also get to exploit the functionality that the .NET framework gives us. It seems like we did the first, and not the second: we have implementations of some things that appear to have relatively straightforward close implementations in System.MemoryExtensions
. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These extension methods are part of ReadOnlySpan and they return ROS as well which you cannot assign to ROM or create a ROM out of it and that effectively makes them useless for someone that is working with ROM. I'm starting a thread with the guys that wrote these methods to make sure I'm not missing anything.
In reply to: 212748423 [](ancestors = 212748423)
/// <summary> | ||
/// This produces zero for an empty string. | ||
/// </summary> | ||
public static bool TryParse(out Single value, ReadOnlyMemory<char> memory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, now that .NET has it, is float.TryParse(ReadOnlyMemory<char>, ...)
unusable? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well. Unfortunately, these methods are only available in .netstandard 2.1, not in 2.0 and our product code targets 2.0. We cannot target 2.1 because then product code won't run on desktop framework anymore because these methods are not available there.
reference: https://docs.microsoft.com/en-us/dotnet/api/system.text.stringbuilder.append?view=netcore-2.1
https://docs.microsoft.com/en-us/dotnet/api/system.text.stringbuilder.append?view=netframework-4.7.2
It would have been really nice if they were though...
In reply to: 212759741 [](ancestors = 212759741,212748642)
public static NormStr FindInPool(NormStr.Pool pool, ReadOnlyMemory<char> memory) | ||
{ | ||
Contracts.CheckValue(pool, nameof(pool)); | ||
MemoryMarshal.TryGetString(memory, out string outerBuffer, out int ichMin, out int length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as before. You're either supporting ReadOnlyMemor<char>
or you're not. Also of course as with every other usage of this function we are ignoring whether this succeeded or not. #Resolved
You will of course have to handle mapping the empty string to 0, which things like In reply to: 415887757 [](ancestors = 415887757,415887609) Refers to: src/Microsoft.ML.Data/Data/Conversion.cs:1265 in ef32b4a. [](commit_id = ef32b4a, deletion_comment = False) |
Please see my previous reply on the other comment similar to this. In reply to: 415892086 [](ancestors = 415892086,415887757,415887609) Refers to: src/Microsoft.ML.Data/Data/Conversion.cs:1265 in ef32b4a. [](commit_id = ef32b4a, deletion_comment = False) |
…to dvtext # Conflicts: # src/Microsoft.ML.Data/DataLoadSave/Text/TextLoader.cs # src/Microsoft.ML.Data/Evaluators/EvaluatorUtils.cs # src/Microsoft.ML.Data/Transforms/TermTransform.cs # src/Microsoft.ML.Data/Transforms/TermTransformImpl.cs # src/Microsoft.ML.ImageAnalytics/ImageLoaderTransform.cs # test/Microsoft.ML.TestFramework/DataPipe/TestDataPipeBase.cs
@@ -237,7 +237,7 @@ public class QuoteInput | |||
public float ID; | |||
|
|||
[Column("1")] | |||
public string Text; | |||
public ReadOnlyMemory<char> Text; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still support regular string
types, right? I think that would be bad if we didn't. #Resolved
|
||
[Column("1")] | ||
public float Number_1; | ||
public ReadOnlyMemory<char> Number_1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong - it was float
before, and the property is called Number_1
. #Resolved
@@ -33,6 +33,7 @@ | |||
</ItemGroup> | |||
|
|||
<ItemGroup> | |||
<PackageReference Include="System.Memory" Version="4.5.1" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary. All our test projects are netcoreapp2.1
, and so Span and ReadOnlyMemory should be available by default. #Resolved
@@ -18,4 +18,8 @@ | |||
<NativeAssemblyReference Include="LdaNative" /> | |||
<NativeAssemblyReference Include="FastTreeNative" /> | |||
</ItemGroup> | |||
|
|||
<ItemGroup> | |||
<PackageReference Include="System.Memory" Version="4.5.1" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary. All our test projects are netcoreapp2.1, and so Span and ReadOnlyMemory should be available by default. #Resolved
public float? fFloat; | ||
public double? fDouble; | ||
public bool? fBool; | ||
public string fString; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should still support string
types, and as such, string
can be null
. So we should have some tests for null strings. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…to dvtext # Conflicts: # src/Microsoft.ML.Data/Microsoft.ML.Data.csproj
This change was committed via #863 |
fixes #673