Skip to content

Cleanup Vec aliases #100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 8, 2020
Merged

Cleanup Vec aliases #100

merged 2 commits into from
Oct 8, 2020

Conversation

juliohm
Copy link
Member

@juliohm juliohm commented Oct 7, 2020

This is a continuation of #93 . It supersedes #97. Addresses #91.

Now we have a single type for representing vectors in geometrical spaces, and it happens to be an alias for SVector (implementation detail). Additionally, we have type aliases for end users that often wish to materialize vectors with specific coordinate type and dimension. Here, we avoid partial instantiation. The aliases Vec2 and Vec3 are concrete vector types in 2D and 3D using double precision, and the aliases Vec2f and Vec3f are concrete vector types using single precision.

We deprecate unit in favor of euclidbasis, which is a more accurate name for the returned vectors. We add constant as a helper function to construct vectors with repeated coordinates. The name constant isn't ideal, please let me know if you have a better suggestion.

@knuesel
Copy link
Contributor

knuesel commented Oct 7, 2020

I'd be happy to help with this refactoring, but I'm not quite clear what was the main problem with the vector types (and the discussion is spread out in other PRs). @juliohm could you expand a bit your summary with more context, maybe with a couple of before/after examples, to highlight the main goals? I think a complete list of all proposed changes (and their justifications) would be great to evaluate how much work it will take to adapt dependent packages. It would also be great to get feedback from the community (e.g. posting this "roadmap" to Discourse).

Here are some ideas/remarks regarding the names:

  • I prefer unit over euclidbasis: the name euclidbasis suggests that the return value is a basis, which is not the case. And the common name for these vectors is "standard basis vector" or "standard unit vector".
  • constant is definitely wrong. I'll try to find a better name...
  • The plan is to change boundbox to boundingbox? That would be nice.
  • What about renaming FRect->Rectf, FRect3D->Rect3f, etc.? It would be consistent with Vec3f, etc.

@SimonDanisch
Copy link
Member

constant is definitely wrong. I'll try to find a better name...

fill is what base uses...We could scope it to apply to return Vecs/Points:
Points.fill(0f0, 3)::Point3f
This should even type infer with constant prop

@SimonDanisch
Copy link
Member

Although I really feel, like nothing beats Vec3f(0) and I don't 100% get what would be confusing about it...

@juliohm
Copy link
Member Author

juliohm commented Oct 7, 2020

Hi @knuesel , thanks for considering joining the effort. I am happy to elaborate on the plans, and share some of the motivations behind the refactoring of the code.

The first set of changes was initially discussed in #91. In that issue, you will find linked PRs that addressed it partially, including this one. The main problem before this set of changes is that Vectors and Points were being interchanged in the code incorrectly, and there were multiple vector types. By just sticking to const Vec = SVector as a single type for the concept of vectors in an embedding space, I was able to fix 3 bugs in GeometryBasics.jl alone. This is a big deal. We cannot move forward with these types of bugs. They are hard to track and a nightmare to fix.

The next PR in this set of changes will fix the Point type. Currently the Point type supports all kinds of operations that are only defined for vectors in the embedding space. We cannot add two points for example.

Regarding the Rectangle types, it will require another set of changes. The constructors are being abused, and all this fuss with single versus double precision is compromising the generality of the implementations. Rectangle are just a polygon with vertices. The type of the coordinates is naturally inherited from the vertices. So all we need is to have a single Rectangle type with the correct vertices upon construction. If we don't fix this pattern in the codebase, we will end up with FSphere, FLineString, FPyramid, ... and similar noise. The only place where one should specify the type of the coordinates is in the constructor of vectors or points. Everything else should follow.

Regarding your comments:

I prefer unit over euclidbasis: the name euclidbasis suggests that the return value is a basis, which is not the case. And the common name for these vectors is "standard basis vector" or "standard unit vector".

I can see that unit is a more elegant name. Maybe we can just call it unit as suggested and provide a clear docstring explaining that these are actually the versors of the Euclidean basis. I am aiming to support general manifolds in GeoStats.jl, so this helper function for constructing a global Euclidean basis is not always helpful. Ultimately, we would like to support multiple coordinate reference systems, and compute coordinates relatively as opposed to globally.

constant is definitely wrong. I'll try to find a better name...

Yes, I was really stuck on this one. Couldn't find a cleaner name. It should something that reminds us of the fact that we are repeating the same value for all the coordinates. I thought of using nval initially. Do you think it is a better option?

The plan is to change boundbox to boundingbox? That would be nice.

Sure, do you mind preparing a PR to the cleanup branch with this change? I was planning to address it after this more serious issue with vectors and points is solved.

What about renaming FRect->Rectf, FRect3D->Rect3f, etc.? It would be consistent with Vec3f, etc.

As I explained above, I have the opinion that we shouldn't be playing with specific precision all over the place. We should inherit the precision from the points or vectors like: Rect(rand(Vec3f), rand(Vec3f)) would construct what we call today FRect3D or something. Also, the name Rect isn't very good. Perhaps we should rename it to a name that translates the concept of a parallepiped: https://en.wikipedia.org/wiki/Parallelepiped This is important if one is considering transformations to the meshes that break the right angle assumption. I think we should keep a general parallepiped type, and add traits that check if the angles are 90 deg. Something like isrectangle(p::Parallepiped) = # check the vertices. We could even add a helper function that constructs parallepipeds satisfying the trait, and these would coincide with the current Rect assumption.

What do you think of this plan? I hope it makes sense to you all. I am doing my best to put the efforts here and work together, but if you feel that these changes aren't welcome, please let me know and I can continue working in a separate project.

@juliohm
Copy link
Member Author

juliohm commented Oct 7, 2020

I really like the fill suggestion @SimonDanisch , just noticed it now after typing a long comment. Will update the PR accordingly.

@juliohm
Copy link
Member Author

juliohm commented Oct 7, 2020

For consistency, and to distinguish with future point utility functions, I've renamed to vunit and vfill where "v" means that this is a vector utility function.

@juliohm
Copy link
Member Author

juliohm commented Oct 7, 2020

I will merge this one by the end of the day in case there are no additional suggestions. 👍

@juliohm
Copy link
Member Author

juliohm commented Oct 7, 2020

Although I really feel, like nothing beats Vec3f(0) and I don't 100% get what would be confusing about it...

It is mainly for consistency and clarity of intention. When we introduce vunit for unit vector and vfill for repeated coordinates, we signal maintainers about what we are trying to achieve very explicitly. Given that currently const Vec = SVector, we also save ourselves from reimplementing the full StaticArrays.jl behaviour manually, just for adding a new constructor with a single entry. The benefit is that things will work out of the box when devs from other packages plug their SVector vectors. There is no need to create a new vector type when StaticArrays.jl is so awesome, and keeps getting better.

@juliohm juliohm merged commit 90a7245 into cleanup Oct 8, 2020
@juliohm juliohm deleted the vector-aliases branch October 8, 2020 09:37
@knuesel
Copy link
Contributor

knuesel commented Oct 10, 2020

@juliohm thanks for the explanation, it's very helpful. Have consistent types, and keeping the implementation as generic as possible sound very good! However this is moving a bit fast for me (waiting less than a day for suggestions... no offense! but I can only work intermittently on this).

Anyway, here are my two cents 😊:

Regarding the type aliases for 2D, 3D, single and double precision: I think it's important to define these aliases because

  • They make the package more user-friendly (probably most users prefer to write Rect2f than Rect{2,Float32}).
  • They make the user code more readable.
  • They prevent unnecessary fragmentation in dependent packages: if packages that use GeometryBasics have to define their own aliases for convenience, the ecosystem will end up with different names for the same thing, and it won't be obvious that XXX.Rect3f and YYY.FRect3D are actually the same type!

I see no downside to these aliases (assuming we keep the GeometryBasics methods as general as possible as you say, so most methods should not use these specific aliases).

Good point on the parallelepipeds!

For the Euclidean basis: I think you actually mean "standard basis". It's the vector space that is (or isn't) Euclidean... I see a couple of results on Google for "euclidean basis" (well, more like 11'000) but AFAIK that's non-standard terminology. People could use Vec{5,Float64} for a non-Euclidean space, and [1,0,0,0,0] would still be the first vector of the standard basis.

I also like fill! But using different function names (vfill, pfill) seems unnecessary: fill for all types is more idiomatic and what's being filled is clear from the parameters...

@juliohm
Copy link
Member Author

juliohm commented Oct 10, 2020

Thank you @knuesel , below are some follow up comments.

Regarding the type aliases for 2D, 3D, single and double precision: I think it's important to define these aliases because

  • They make the package more user-friendly (probably most users prefer to write Rect2f than Rect{2,Float32}).

Users won't write Rect{2,Float32}, they will create rectangles (or paralellepid) with actual points, so it will read like Rect(Point2f(...), Point2f(...)).

  • They make the user code more readable.

As pointed out, code becomes less readable with multiple aliases for each geometry. We are talking Rectangle here, imagine this pattern for all primitives. Take a look at the rectangle source code as it is currently, and you will realize the huge number of constructors just to provide these aliases:

Rect() = Rect{2,Float32}()
TRect{T}() where {T} = Rect{2,T}()
Rect{N}() where {N} = Rect{N,Float32}()
function Rect{N,T}() where {T,N}
# empty constructor such that update will always include the first point
return Rect{N,T}(Vec{N,T}(typemax(T)), Vec{N,T}(typemin(T)))
end
# conversion from other Rects
function Rect{N,T1}(a::Rect{N,T2}) where {N,T1,T2}
return Rect(Vec{N,T1}(minimum(a)), Vec{N,T1}(widths(a)))
end
function Rect(v1::Vec{N,T1}, v2::Vec{N,T2}) where {N,T1,T2}
T = promote_type(T1, T2)
return Rect{N,T}(Vec{N,T}(v1), Vec{N,T}(v2))
end
function TRect{T}(v1::VecTypes{N,T1}, v2::VecTypes{N,T2}) where {N,T,T1,T2}
return if T <: Integer
Rect{N,T}(round.(T, v1), round.(T, v2))
else
return Rect{N,T}(Vec{N,T}(v1), Vec{N,T}(v2))
end
end
function Rect{N}(v1::VecTypes{N,T1}, v2::VecTypes{N,T2}) where {N,T1,T2}
T = promote_type(T1, T2)
return Rect{N,T}(Vec{N,T}(v1), Vec{N,T}(v2))
end
"""
Rect(vals::Number...)
```
Rect(vals::Number...)
```
Rect constructor for indidually specified intervals.
e.g. Rect(0,0,1,2) has origin == Vec(0,0) and
width == Vec(1,2)
"""
@generated function Rect(vals::Number...)
# Generated so we get goodish codegen on each signature
n = length(vals)
@assert iseven(n)
mid = div(n, 2)
v1 = Expr(:call, :Vec)
v2 = Expr(:call, :Vec)
# TODO this can be inbounds
append!(v1.args, [:(vals[$i]) for i in 1:mid])
append!(v2.args, [:(vals[$i]) for i in (mid + 1):length(vals)])
return Expr(:call, :Rect, v1, v2)
end
Rect3D(a::Vararg{Number,6}) = Rect3D(Vec{3}(a[1], a[2], a[3]), Vec{3}(a[4], a[5], a[6]))
Rect3D(args::Vararg{Number,4}) = Rect3D(Rect{2}(args...))
#=
From different args
=#
function (Rect)(args::Vararg{Number,4})
args_prom = promote(args...)
return Rect2D{typeof(args_prom[1])}(args_prom...)
end
function (Rect2D)(args::Vararg{Number,4})
args_prom = promote(args...)
return Rect2D{typeof(args_prom[1])}(args_prom...)
end
function (Rect{2,T})(args::Vararg{Number,4}) where {T}
x, y, w, h = T <: Integer ? round.(T, args) : args
return Rect2D{T}(Vec{2,T}(x, y), Vec{2,T}(w, h))
end
function TRect{T}(args::Vararg{Number,4}) where {T}
x, y, w, h = T <: Integer ? round.(T, args) : args
return Rect2D{T}(Vec{2,T}(x, y), Vec{2,T}(w, h))
end
function FRect3D(x::Rect2D{T}) where {T}
return Rect{3,T}(Vec{3,T}(minimum(x)..., 0), Vec{3,T}(widths(x)..., 0.0))
end
function Rect2D{T}(a::Rect2D) where {T}
return Rect2D{T}(minimum(a), widths(a))
end
function TRect{T}(a::Rect2D) where {T}
return Rect2D{T}(minimum(a), widths(a))
end
function Rect{N,T}(a::GeometryPrimitive) where {N,T}
return Rect{N,T}(Vec{N,T}(minimum(a)), Vec{N,T}(widths(a)))
end
function Rect2D(xy::VecTypes{2}, w::Number, h::Number)
return Rect2D(xy..., w, h)
end
function Rect2D(x::Number, y::Number, wh::VecTypes{2})
return Rect2D(x, y, wh...)
end
function TRect{T}(xy::VecTypes{2}, w::Number, h::Number) where {T}
return Rect2D{T}(xy..., w, h)
end
function TRect{T}(x::Number, y::Number, wh::VecTypes{2}) where {T}
return Rect2D{T}(x, y, wh...)
end
# TODO These are kinda silly
function Rect2D(xy::NamedTuple{(:x, :y)}, wh::NamedTuple{(:width, :height)})
return Rect2D(xy.x, xy.y, wh.width, wh.height)
end
function FRect3D(x::Tuple{Tuple{<:Number,<:Number},Tuple{<:Number,<:Number}})
return FRect3D(Vec3f0(x[1]..., 0), Vec3f0(x[2]..., 0))
end
function FRect3D(x::Tuple{Tuple{<:Number,<:Number,<:Number},
Tuple{<:Number,<:Number,<:Number}})
return FRect3D(Vec3f0(x[1]...), Vec3f0(x[2]...))
end

The plan is to cleanup this source of variation in a future PR, after #101 is ready. It is been a good exercise to fix these issues. I could already solve at least a couple dozen bugs in this last PR.

  • They prevent unnecessary fragmentation in dependent packages: if packages that use GeometryBasics have to define their own aliases for convenience, the ecosystem will end up with different names for the same thing, and it won't be obvious that XXX.Rect3f and YYY.FRect3D are actually the same type!

As I mentioned, there is no need for aliases after we get the point type aliases.

For the Euclidean basis: I think you actually mean "standard basis". It's the vector space that is (or isn't) Euclidean... I see a couple of results on Google for "euclidean basis" (well, more like 11'000) but AFAIK that's non-standard terminology. People could use Vec{5,Float64} for a non-Euclidean space, and [1,0,0,0,0] would still be the first vector of the standard basis.

"Standard" is a function of the research community you are in. This basis may not be standard in other contexts. The better name for it is Cartesian. It is Euclidean because every point is obtained by simple addition and subtraction of coordinates, and it is Cartesian because of the orthonormality property.

I also like fill! But using different function names (vfill, pfill) seems unnecessary: fill for all types is more idiomatic and what's being filled is clear from the parameters...

Introducing new names is a good practice in general. After writing a lot of code in the GeoStats.jl stack and in Julia, I realized that life is much easier when we have our own names for a new functionality. The language doesn't provide a trait system and so extending behavior with Base names is impossible without macro hacks. The v prefix will turn out useful when we start having equivalents for the point type with a p prefix. It also helps maintainers to write a coherent API, and to read code later on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants