-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle non-UTF8 files #228
Conversation
This is the plan for this PR:
@archseer You said this, #18 (comment)
AFAIK, |
Haha, oops. The buffer seems glitched right now. I think I may need to drop the |
By the way, there's also https://docs.rs/encoding_rs_io/0.1.7/encoding_rs_io/ by burntsushi. Let's look at how that's doing the reads internally, there might be some patterns we're able to copy. |
I took a quick look, but I don't think there's anything from it that would improve the code here significantly. |
I removed the files for the benchmark for now so it's easier to review. |
Last thing left to do is commands and the status bar. |
I think the encoding had this thing where it'll change encoding on the fly. Can someone double check that? If so, do we want to disable that if encoding is manually selected? |
Got this error while trying to set encoding of
|
When trying to insert a character into UTF-16. Probably also need to encode on insert.
|
|
The in-memory representation is always UTF-8 s and we always want to insert UTF-8 encoded text. I don't think you're finding encoding specific behavior, but rather bugs with our UTF-8 handling. |
To clarify: the only valid rope representation is UTF-8. You need to decode from other encodings to UTF-8 on file load, and re-encode to other encodings on file save. So |
I'll be moving the |
At some point, we should find the cause for this error: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor comments this round, great work 👍
@kirawi Your latest commit message says "wip", but after your explanation this looks feature complete to me. I still have minor concerns about code readability, but I don't want to block merging on it. What is it that you're still working on? Edit: nevermind, brain fart. Didn't check to see if there were new commits since the last time I looked. Ha ha. |
There appears to be a bug with line indentation in |
I believe I've fixed the bug that was causing the merge commit tests to fail. It was totally my fault, from a different PR. So hopefully all that's left is to switch back to the nested loop construction (if you still want to--I won't block merging on that), and fix the small merge conflict that my own fix introduced. |
a2a3f0f
to
b701215
Compare
7cc7cf8
to
e26b5b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
Not sure what's causing the newlines, I'm guessing it's because of
\n
within the PDF. Definitely still buggy overall (see below).Resolves #18