Skip to content

Conversation

Emerentius
Copy link
Contributor

@Emerentius Emerentius commented Sep 4, 2018

This PR isn't finished, but I'd like to get some feedback already. Would this be a good addition at all? Are any major design changes necessary?

This adds a new exercise that is specifically about the nitty gritty details of unsafe. There is a linked-list exercise in problem-specifications, but this is not supposed to be it (I will have to rename it).

It's expanded in two ways:

  1. The exercise requires the ability to insert or remove elements from any point in O(1)
  2. implemented with unsafe {}

Doubly linked lists are infamous for tripping up newcomers to Rust who choose it as their first trial project and are hit with the full force of the borrow checker before they had a chance to learn about it.

Because they are an owning collection, students will have to learn about how to manually manage memory via Box::into_raw and Box::from_raw and expose a safe interface. Because it's an owning collection, it also touches on Send, Sync and covariance over the contained items (these should probably behind a feature flag). There's also some interaction with drop check, but I am yet to understand exactly what the rules are.

I've tried to keep the exercise simple, yet not too far removed from actual use cases by including the cursor. Without it, you may as well use VecDeque. There aren't actually that many cases where unsafe needs to be explicitly written, but a lot of potential for use-after-free, double-free, accidental leaks, etc, if you don't manage the pointers correctly.
Overall, it came out a little long. I'd love for the required code to be reduced to <200 lines.

Each node in a doubly linked list contains data and pointers to the next
and previous node, if they exist.

New nodes can be efficiently added at any point in the list, if one already has
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth mentioning practical applications, such as the dancing links algorithm.

@coriolinus
Copy link
Member

This is a really great start for an exercise! Thanks, @Emerentius!

There are a lot of small points, and a bit of architectural reshuffling, but the core is strong, and this will be a really useful exercise for advanced students.

Agree that you need to rename this. Suggest just "doubly-linked-list".

@coriolinus
Copy link
Member

You've caught me at an inopportune moment: I'm about to head on vacation, so can't promise to review this within the next two weeks or so.

I will comment that while this exercise requires a fair amount of implementation, at least per the examples, it's not the most verbose exercise in the track:

$ fd example.rs --exec wc -l {} \; | sort -nr | head -n3
Thu Sep 13 09:46:04 DST 2018
342 exercises/poker/example.rs
245 exercises/react/example.rs
187 exercises/decimal/example.rs

Therefore, I wouldn't worry too much about the length.

@Emerentius
Copy link
Contributor Author

Emerentius commented Sep 13, 2018

No rush. I still have to figure out what the deal with dropck is and overhaul the tests so that they have a clear thread.

@coriolinus
Copy link
Member

Looks like filtering the repo on student download is a blacklist system, not a whitelist. What this means is that pre_implemented.rs will properly be copied to student machines.

ignore_pattern

@coriolinus
Copy link
Member

The reason Travis is failing is that the README is other than it expects. Try regenerating that?

@coriolinus
Copy link
Member

Of interest to the cursor implementation: https://github.com/4e554c4c/rfcs/blob/cursors/text/0000-linked-list-cursors.md

Are you still working on this, @Emerentius? I still think this will be a cool addition to the track, but I haven't seen much movement recently.

@Emerentius
Copy link
Contributor Author

There's one safety related issue that I have been blocked on because I don't yet understand it: the interaction with dropck.
In several places it is mentioned that if you have an owning raw pointer, it's important to use PhantomData to show this ownership to dropck.

I haven't yet been able to understand what problems can potentially arise from not doing so and whether the danger is always present or whether you have to opt-in to it somehow.
If the unsafety always exists, then it must be part of the exercise and we should somehow test for it. It would be a great example of the subtleties that you encounter with unsafe. If it's opt-in with an obvious potential for unsafety (e.g. from the dropck eyepatch), then we could delegate that to later and make it part of the advanced tests like Send/Sync and the lifetime subtyping.

The nomicon doesn't have an example demonstrating the unsafety. This issue says that the chapter is outdated anyway and I don't know where the authoritative definition is located right now.

@coriolinus
Copy link
Member

My understanding is that the dropck concern is not relevant to this exercise, because we don't have any structs which own raw pointers. We likewise don't require or request that students implement structs which do.

I'd love to get someone experienced with unsafe code to give a firm answer, but I don't have contact with many people who are experienced with unsafe code. @petertseng do you have thoughts on this?

In conclusion: I'm about 80% confident that there is no dropck concern for this exercise, and therefore you should consider yourself unblocked.

@coriolinus
Copy link
Member

Doing some reading, I believe that I have a little better understanding:

  • Vec<T> would normally require that all T have lifetimes exceeding that of the vector
  • this is unergonomic, becuase in this model, vec![0;10] doesn't work
  • the vector uses some unsafe attributes to enable it to contain instances of T whose lifetime equals its own
  • these unsafe attributes can confuse dropck, so PhantomData<T> is used to un-confuse it

If all of these points are correct, then we can officially declare that this issue is Out Of Scope for this PR. Our LinkedList<T> is fine requiring that all owned data outlives it.

@petertseng
Copy link
Member

I will caution that I actually don't fit the description of "experienced with unsafe code", but when the time is right I would attempt to solve this exercise from the POV of a student and report what I see

@Emerentius Emerentius force-pushed the linked_list branch 2 times, most recently from 869eb8a to 4e899ea Compare January 1, 2019 03:40
@Emerentius
Copy link
Contributor Author

Emerentius commented Jan 1, 2019

I can't reproduce the travis failure locally

Edit: I found and fixed it after rebasing. The travis failure now is because of a race condition in the tests.

@Emerentius
Copy link
Contributor Author

@petertseng I've changed everything I wanted to. If anything major is amiss, I'm lacking the fresh perspective. If you could try your hand at it, that would be cool.

@petertseng
Copy link
Member

I wanted to but I found myself unprepared to do unsafe. So it kept getting pushed back to a day when I can learn that.

@coriolinus
Copy link
Member

After far too long, I've made the time to perform a student implementation of this exercise. You can find the sum of my efforts here.

coriolinus@668e146 fixes some deprecation warnings; rustfmt made some additional changes which I assume are just style updates. This commit should probably be merged into this branch.

I took notes as I went through and implemented this; the full file is here, but don't take any of the early stuff to heart: I was more or less just logging my thoughts as I went, and I ended up fixing a lot of the stuff the early notes complain about. The highlights/final form are as follows:

  • it took a very long time to go from starting the exercise to get even the first test passing; I think it ended up being over two hours, even though I'd (a few months ago) done the necessary research to know that I wanted to orient my code around the NNMut<T> type (an alias for Option<NonNull<*mut Node<T>>>).

  • however, the more tests passed, the less time it took to implement enough that subsequent tests passed

  • I was overall very happy with the test suite; it took me through a very reasonable progression of complexity.

  • The biggest issue I had, initially, was figuring out what Drop was supposed to do; the code wouldn't compile without it, but particularly at first, I had no idea what I wanted it to do. In the end, I just lifted the implementation from the standard library.

    This led to implementing several methods earlier than I otherwise would have needed to, which contributed to the sense that the first steps were the hardest for this exercise.

    I have not yet tested it, but I'd consider removing the Drop implementation from the stub file entirely. This will cause student implementations to leak memory on each test run for the duration of the test runner, which is acceptable; as soon as the tester exits, all that memory will be freed by the OS anyway. I think that this is acceptable, as there exist several leak tests, and Step 4 explicitly instructs students to implement Drop.

  • Though the README instructs us to minimize duplication, my efforts in that direction didn't pan out. They're encapsulated in coriolinus@a45f346, which I immediately reverted because it caused several tests to fail. For a savings of two lines of code, I didn't think it was worth the debug hassle.

  • The bottom line: I felt like I learned a lot about unsafe code from implementing this exercise from the student perspective; I have a better understanding of what unsafe does and how to use it properly. That was the goal of the exercise, and I believe that it was successful.

I apologize that it took me so long to get to this, but this really was the first opportunity I've had. I believe that these are the only items waiting before merge:

  • cherry-pick coriolinus@668e146 or otherwise apply equivalent changes
  • potentially remove the Drop trait impl from src/lib.rs, if you agree that it is a good idea

@coriolinus
Copy link
Member

If there are no comments before then, I intend to do the following on Wednesday the 7th:

  • cherry-pick coriolinus@668e146 into this PR
  • remove the Drop trait impl from src/lib.rs
  • fix any Travis errors produced
  • merge this PR

While this has become a low-velocity issue, it is in nobody's interest that it remain one.

@Emerentius
Copy link
Contributor Author

unimplemented!() in the Drop impl was a mistake, yep. I'd leave the Drop impl but replace the unimplemented!() with a comment, but removing it entirely is also fine.

I also noticed, that the phase 1 tests start off with push_back() and pop_front(). In the interest of a gradual progression, I'd like to split that one and change the order to:

  • empty (keep as is)
  • back
    • push / pop single element (new from split)
    • push / pop multiple elements
  • front
    • push / pop single element (new from split)
    • push / pop multiple elements
  • front + back tests (keep as is)

Emerentius and others added 23 commits August 4, 2019 03:01
simplify and slightly extend tests using this functionality
Outside the  README, no mention of nodes is made anymore,
only of positions.
This exercise is track-specific and there's therefore no
equivalent in problem-specifications from which a version
could be taken.
rename and group by their dependence on the steps in the README
removed tests for pre-implemented methods
added tests for remaining cursor methods
add memory leakage tests
doc comment-ing pre_implemented module
comment for PhantomData marker
by separating them into their own files
there is no Cargo.toml key to do the same as passing --test-threads=1
This is easier than telling students to pass the argument and adapting
the CI
Crashing drop affects everyone right from the start. Delay until step 4.
group tests that only push/pop at back,
then tests working only at the front and only then mix
not needed for tests
all functions where the caller has to maintain invariants are marked
unsafe and all unsafe blocks and impls are commented as to why they
are safe.
In the process of doing so, it became clear that most methods in
NodePtrHelper and a few others were unsafe. Furthermore, many were
only used once after previous refactorings. Those now obsolete
abstractions were removed and the code  refactored to minimize
the amount of unsafe.
@Emerentius
Copy link
Contributor Author

I've done all the changes, rebased and regenerated the README.

Adding the invariant documentation helped me clean up the example a lot. There were several functions that were unsafe but not marked as such.

In your solution, I noticed that you implemented Send and Sync without T: Send | T: Sync bounds. That is unsound and I've added tests against them.
Tip: Getting a reference from a raw pointer is a common operation and you can write unsafe { &*ptr } instead of Box::leak( unsafe { Box::from_raw(ptr) }.

@Emerentius Emerentius changed the title WIP: Add exercise for doubly linked lists with unsafe Add exercise for doubly linked lists with unsafe Aug 4, 2019
@coriolinus
Copy link
Member

Hah, good; those extra tests would have helped me! That's a good tip, as well.

I think this looks great. I'm still going to wait until Wednesday to merge, just to ensure that we've left enough time for other maintainer comments, but at this point, my intent is to merge on Wednesday.

@coriolinus coriolinus merged commit 0df41e2 into exercism:master Aug 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants