
Transform2D, Basis2D, Node2D (also 3D): clarify, organize, make mathematically correct and uniformize #2738

Open
@andre-caldas

Description


Describe the project you are working on

I am just learning Godot. I want to use Godot to produce animations like in manim... in such a way that I can really play the animation. Like a game! :-)

Describe the problem or limitation you are having in your project

When manipulating Node2D objects, I feel that Node2D, Transform2D and the non-existent Basis2D are not well integrated with one another. Transform2D has many errors (like godotengine/godot#48712), and Node2D does a lot that should instead be done by Transform2D. Node2D also keeps track of a lot of redundant information, like angle, scale, position and skew.

Many methods in Transform2D are incorrect in the sense that they rely on assumptions that are not always true. Because of this, the documentation is very imprecise.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Transform2D, Basis2D (which does not exist yet) and Node2D should be refactored to better integrate them. Each one must have its own role, and those roles should not overlap. Implemented methods would have a precise definition/documentation describing their geometric and arithmetic meaning.

The GDScript API would also change to make it cleaner. The API would be kept easy and simple for non-advanced users, but also powerful (through integration with Transform2D and Basis2D) for advanced users.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

I will describe it after answering all the mandatory questions. :-)

If this enhancement will not be used often, can it be worked around with a few lines of script?

It will be used very often. Of course, anyone can live without it, though.

Is there a reason why this should be core and not an add-on in the asset library?

Because it is a refactoring of some of the most basic native types: Node2D and Transform2D.

Very long description of the proposal follows...

This is a long report on my own impressions of Basis, Transform/Transform2D, Vector2, Node2D, etc. I have only known Godot for a few weeks, and I have just learned how to make an app that draws a polygon. So, I am sorry if I got many things wrong! :-)

If this proposal gains interest, maybe it should become a wiki, because it is just a sketch, it is too big, and it needs collaboration to reach consensus and be finished.

Mathematical classes, like Transform2D and Transform3D, need to be very well defined and mathematically correct. It seems that there is a lot of cargo-cult thinking involved in the use of these mathematical tools. I think that precise definitions and consistent behaviour can prevent this from happening. "Why do I do it this way and not the other way? Because I have tested it, and it gave me the result I was expecting." The user tends to adjust parameters in some sort of trial-and-error approach. I catch myself doing this a lot! :-)

I was very surprised when I realized that Node2D, instead of simply using a Transform2D, uses pos (position), angle, _scale, skew and a Transform2D _mat variable! This is a lot of redundant information to deal with.

I will talk about Transform2D, since the 3D case is analogous. The first thing that needs to be stated clearly is the fact that Transform2D represents an affine transformation of a plane.

How to interpret it / motivation

This section is not the proposal. Most of what is here is already implemented. It is just an introduction to describe what it is all about, in particular, what the roles of Basis and Transform are in Godot. It is a long introduction... feel free to skip it!

Imagine we start with an empty canvas, or an empty sheet of paper if you prefer. In abstract language, we have a plane. Let's call it P. I want to instruct you what to draw. We shall use a system of coordinates. So, we agree to pick an arbitrary point and call it the origin. Then, we choose two axes passing through this origin, and call them the x-axis and the y-axis. Each axis has a scale. Now, with two real numbers (a,b), we can specify a point in the plane. We can specify many points and "draw" on this plane.

(Figure: choosing a coordinate system)

With a coordinate system, we can now use pairs of numbers (Vector2D) to refer to points in this plane P. If you like math, what we have is a function

f: R² --> P

that gives us a point in P for each pair of real numbers (x,y). This is what Transform2D is when seen "from inside": you give a pair of coordinates to a Transform2D object, and it tells you "where" the point should be drawn on the canvas.

Now, that we have coordinates, I can instruct you to draw a triangle:

Join the points (0,0), (1,0) and (0,1) with segments of lines.

But then, I want to do this same triangle, but rotated 90 degrees counterclockwise about the origin. I can recalculate the points: (0,0), (0,1) and (-1,0).

But imagine that we are not talking about three points. We are talking about a very complex structure made of many points. So, instead of rotating the points and giving a new array of points to you, I could simply instruct you to rotate the axes!

(Figure: rotating and moving the coordinate system... not the local drawing)

If we rotate the axes, the Transform2D as seen "from inside" does not change. But the outside, that is, how it is presented to the canvas, does change. By the way, we have not yet addressed the problem of how to specify either the origin or the x and y axes. Even if we don't know yet how to specify this information, it is clear that we can move our drawing around just by changing the value of origin. We can scale our drawing (about the origin) by "stretching" the x and y axes.

(Figure: transforming the coordinate system in many ways)

But how do we specify the origin, the x-axis and the y-axis? Well, if the canvas already has a coordinate system, it is easy.

(Figure: specifying a coordinate system inside another one)

Even if the canvas already has a coordinate system, it is still worth specifying a new one: a coordinate system inside the global coordinate system. A local coordinate system allows us to design objects in a "standard" coordinate system and then stick them to a canvas through a Transform2D object. Ultimately, we need to transform those coordinates into coordinates that can be used by the hardware that will present the data to us. This is the reason why Transform2D has its name. If we specify the point origin and the vectors e1 and e2 using global coordinates, as shown in the drawing below, we have a formula that takes coordinates (a,b) in the local system and converts them to coordinates in the canvas coordinate system:

Vector2D Transform2D::apply(const Vector2D &v) const {
	return this->origin + v.x * this->e1 + v.y * this->e2;
}

This is called an affine transform. The part v.x * this->e1 + v.y * this->e2 is the "linear part" of the transform, while this->origin is sometimes called "origin" and sometimes called "translation". From the local perspective, it makes sense to call it "origin". And from the outer perspective, it makes sense to call it "translation", because it is the amount you have to translate from the global origin.

Now, if we want to transform our drawing, instead of changing the point coordinates, we can change the origin and the axis. If we translate the origin point to some other place, it is like translating the whole drawing. If we stretch some axis, it is just like stretching the drawing. If we rotate the axis about the origin, it is just like doing the same with the drawing. Also, you might want to do a rotation about some other point, different from the origin.

This is exactly what happens when we have nested Node2Ds. You can specify the node position and two vectors x and y that determine the axes. And to specify those, you can use the Node2D's parent coordinate system. So, Node2D should have its coordinate system specified by means of an instance of Transform2D. It could be called Node2D::transform or Node2D::coordinate_system.

If A is a Node2D, B is a Node2D nested inside A, and C is a Node2D inside B, then, in order to convert a Vector2D v representing coordinates in C to coordinates in A, all we have to do is calculate

B.transform.apply(C.transform.apply(v))

Instead of calculating two transformations each time (imagine we have millions of vectors v), it would be better to do

Transform2D composition = B.transform * C.transform
composition.apply(v)

Usually, the Godot programmer does not need to calculate these compositions. The Godot engine will automatically do this for nested Node2Ds.
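
For reference, here is a minimal sketch of what such a composition could look like, assuming the Transform2D layout proposed further below (a Basis2D linear_part plus a Vector2D origin, with Basis2D providing apply() and operator*). It is written so that (B * C).apply(v) equals B.apply(C.apply(v)):

Transform2D Transform2D::operator* (const Transform2D& child) const {
	Transform2D result;
	// The child's origin is a point, so it goes through the full affine map.
	result.origin = this->origin + this->linear_part.apply(child.origin);
	// The linear parts simply compose.
	result.linear_part = this->linear_part * child.linear_part;
	return result;
}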

The basic operations a user might be interested in doing with a Node2D are scaling, rotation and translation.

It is also important to notice that access to the linear part of Transform2D matters. When we deal with "points", we use the full Transform2D. But when we deal with things that are relative, like "distance vectors" or "speed", we use only the linear part. It is just like conversion of temperatures: 0°C corresponds to 32°F, but a difference of 0°C corresponds to a difference of 0°F, not 32°F.
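
As a small sketch of the difference (using only the names proposed further below; none of this exists yet): positions go through the whole affine map, while relative quantities like velocities go only through the linear part.

// Sketch: 'transform' is a Transform2D as proposed below; the variable names are illustrative only.
Vector2D local_position(1, 2);
Vector2D local_velocity(0, 5); // a relative quantity, like speed

Vector2D global_position = transform.apply(local_position);             // origin is added
Vector2D global_velocity = transform.linear_part.apply(local_velocity); // origin must NOT be added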

Some considerations

About Vector2D (and Vector3D) class.

Vector2 should be called Vector2D. I know it "sucks"! But this is the time to make the change!

About Basis2D (and Basis3D) class.

There should be a Basis2D native object type. It represents an invertible linear transform (a 2x2 matrix with non-zero determinant).

The ultimate usefulness of a Basis2D object is to be applied to a Vector2D object:

var ex = Vector2D(1, 0)
var ey = Vector2D(1, 2)

var T = Basis2D(ex, ey)
var v = Vector2D(2, 1)

# This should print the result of 2 * ex + 1 * ey.
print(T.apply(v))

In terms of matrices, Basis2D(ex, ey) is equivalent to

[ex.x ey.x]
[ex.y ey.y]

and T.apply(v) is equivalent to

[ex.x ey.x] [v.x]
[ex.y ey.y] [v.y]

This class could have static methods to provide common transforms. For example, Basis2D::rotation(theta) should return a Basis2D instance corresponding to the matrix

[cos(theta) -sin(theta)]
[sin(theta) cos(theta) ]
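
A minimal sketch of such a method, assuming the ex/ey member layout suggested further below (Math::cos, Math::sin and real_t as used elsewhere in the engine):

Basis2D Basis2D::rotation(real_t theta) {
	Basis2D result;
	// Columns of the rotation matrix: where (1,0) and (0,1) land after rotating by theta.
	result.ex = Vector2D(Math::cos(theta), Math::sin(theta));
	result.ey = Vector2D(-Math::sin(theta), Math::cos(theta));
	return result;
}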

This class would have unary methods that modify the instance in place: invert(), transpose(), rotate(), scale(), etc. It could also have "const" methods that return a modified copy: inverse() const, transposed() const, rotated() const, etc. Also, determinant() const.

The method rotate() must be defined/documented with care. Some might understand two different things by T.rotated(angle).apply(v):

  1. It should be equivalent to rotate v and then apply T to the result.
  2. It should be equivalent to apply T and then "rotate" the result.

In the current implementation, assumptions are being made about some specific nature of T, making the code correct only in special cases. For example, when T is conformal (made only of uniform scales and rotations), both orders listed above give the same result; in that case, the order does not matter. But for, say, T = Basis2D(Vector2D(1,1), Vector2D(0,1)), the order does matter, as shown below.
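
To make the ambiguity concrete, here is a small sketch (using the proposed apply() and Basis2D::rotation(), which do not exist yet) with the shear above and a 90-degree rotation; the two readings give different results for v = (1, 0):

Basis2D T = Basis2D(Vector2D(1, 1), Vector2D(0, 1)); // a shear: not conformal
Basis2D R = Basis2D::rotation(Math_PI / 2.0);        // 90 degrees counterclockwise
Vector2D v = Vector2D(1, 0);

Vector2D option1 = T.apply(R.apply(v)); // rotate v first, then apply T: gives (0, 1)
Vector2D option2 = R.apply(T.apply(v)); // apply T first, then rotate:   gives (-1, 1)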

IMHO, if T already represents the resultant transformation up to "now" and you want to further rotate things afterwards, then the natural choice is option 2. Therefore, Basis2D R = T.rotated(angle) should be equivalent to

	Basis2D R;
	R.e1 = T.e1.rotated(angle);
	R.e2 = T.e2.rotated(angle);

Or, if we define the operator* as composition: Basis2D R = Basis2D::rotation(angle) * T. The definition of operator* should be something like

Basis2D Basis2D::operator* (const Basis2D& S) const {
	Basis2D result;
	result.e1 = this->apply(S.e1);
	result.e2 = this->apply(S.e2);
	return result;
}

For two Basis2D instances, F and G, F * G shall result in a Basis2D such that (F * G).apply(v) gives the same result as F.apply(G.apply(v)). :-)

The operator* method corresponds to matrix multiplication. But, although I have written "matrix this, matrix that...", I do not think that matrices should be emphasized. In particular, I do not think method names and variable names should refer to "rows" and "columns". Nor should they refer to "right multiplication" or "left multiplication". If we choose to represent Basis2D by putting Basis.e1 and Basis.e2 in rows instead of columns, we get exactly the same Basis2D "concept", but the roles of "rows/columns" and "left/right" become swapped:

[v.x v.y] [ex.x ex.y]
          [ey.x ey.y]

Documentation can use matrices to illustrate how Basis2D works. But Basis2D should be independent of "matrices". It is a linear transform and is, therefore, independent of your choice of thinking of vectors as "rows" or "columns". Especially when dealing with Transform2D and Transform3D, here is an example of how it can be confusing: #1336.

Internally, Basis2D can consist of just two Vector2D variables: ex and ey. We could have aliases (getters/setters?) so people could refer to ex by other names like x, e1 or i; and to ey by y, e2 or j.
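
A minimal sketch of such a layout, using only the names suggested above (aliases and most methods omitted):

struct Basis2D {
	Vector2D ex; // image of (1, 0); aliases could be x, e1 or i
	Vector2D ey; // image of (0, 1); aliases could be y, e2 or j

	Vector2D apply(const Vector2D& v) const {
		return v.x * ex + v.y * ey;
	}

	real_t determinant() const {
		return ex.x * ey.y - ex.y * ey.x;
	}

	Basis2D operator* (const Basis2D& other) const; // composition, as defined above
	static Basis2D rotation(real_t theta);
};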

About Transform2D (and Transform3D) structure.

Transform2D should be formed by two variables: one Basis2D linear_part and one Vector2D origin. There could be aliases for origin: position and translation. The variable linear_part could also be called axis. Also, there could be aliases for linear_part.x and linear_part.y: just x and y.
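
In code, this could look like the following sketch (just the data layout and the apply() method already discussed; aliases omitted):

struct Transform2D {
	Basis2D linear_part; // alias: axis; linear_part.ex / linear_part.ey exposed as x / y
	Vector2D origin;     // aliases: position, translation

	Vector2D apply(const Vector2D& v) const {
		return origin + linear_part.apply(v);
	}
};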

Operations with Transform2D are a bit more complicated and can be very poorly defined if we are not careful. For example, if t is a Transform2D instance, what do we mean by var r = t.translated(Vector2D(1,1))?

  1. r.origin = t.origin + Vector2D(1,1)?
  2. r.origin = t.origin + t.linear_part.apply(Vector2D(1,1))?

From the point of view of the "outside" of t, that is, from the point of view of someone manipulating t, item 1 makes more sense. Remember that t.origin is specified in terms of the outer coordinate system, that is, the canvas coordinate system in the analogy mentioned at the beginning of this document. However, from the point of view of t itself, a "translation by v" would make more sense if v is a vector specified in t's own coordinates. In that case, item 2 would make sense.

  1. Should we have one local_translation and one translation (or some other name indicating it is not local)?
  2. Should we have just the local version and call it translation? If someone wants a non-local translation s/he can simply make t.origin += v.
  3. Should we have just the non-local version?
  4. Should we have none of them and get the programmer to manipulate t.origin manually?

The local version might be a little harmful, because t.linear_part.apply(v) is recalculated on every call without the caller being aware of it. Maybe the caller should calculate it once and always use the non-local version. For example, if v is the "speed", then a * v would be the translated amount after a units of time. Then, instead of calling t.locally_translated(a*v) every millisecond and unknowingly computing t.linear_part.apply(a*v) every time, you could simply use t.translated(a*w), where w = t.linear_part.apply(v). I don't know anything about FPUs and GPUs, so I don't know whether hardware would make this a non-issue.
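
For reference, here are the two candidates as sketches (the names translated/locally_translated are only the ones used in this discussion):

// Non-local version: v is given in the outer (parent/canvas) coordinate system.
Transform2D Transform2D::translated(const Vector2D& v) const {
	Transform2D result = *this;
	result.origin += v;
	return result;
}

// Local version: v is given in the transform's own coordinate system,
// so it must go through the linear part on every call.
Transform2D Transform2D::locally_translated(const Vector2D& v) const {
	Transform2D result = *this;
	result.origin += linear_part.apply(v);
	return result;
}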

Item 4 would not be a bad option if the "dot syntax" was not so useful: t.translated(v).rotated(theta).translated(v)....

By the way, what do we mean by var r = t.rotated(theta)?

When dealing with linear transforms, a rotation is necessarily a rotation about the origin; otherwise, the result is not a linear transform. But in the context of Transform2D (affine transforms), a "rotation" can be about any point. There is one very natural point someone might be interested in rotating an object about: the local origin. The local origin can be interpreted as the "position" of the object. If someone says "rotate the object", s/he probably means "rotate the object without changing its position". This is the easiest to accomplish:

Transform2D& Transform2D::rotate(real_t angle) {
	Basis2D rotation = Basis2D::rotation(angle);
	ex = rotation.apply(ex);
	ey = rotation.apply(ey);
	return *this;
}

It is also very easy to rotate the object about the non-local origin. I don't think "rotation about the non-local origin" should be an implemented method. From the point of view of the object being rotated, it is just an "arbitrary point" of the plane. But it is easy to execute: all you have to do is rotate origin as well as ex and ey.

One might as well be interested in rotating the Transform2D about some arbitrary point p. The first step is to simply rotate locally. That is, first we rotate ex and ey. Then, we just need to rotate origin about the point p. It is easiest if p is given in non-local coordinates, because so is origin:

  1. Pull everything back so that p goes to (0,0): origin -= p.
  2. Rotate origin about the origin.
  3. Push everything back: origin += p.

Transform2D& Transform2D::rotate_about(real_t angle, const Vector2D& about_point) {
	Basis2D rotation = Basis2D::rotation(angle);
	ex = rotation.apply(ex);
	ey = rotation.apply(ey);
	origin = rotation.apply(origin - about_point) + about_point;
	return *this;
}

Scaling is completely analogous to rotation! Scaling is not well defined if we do not provide a little more information. For example, we can choose a point to "scale about": everything is stretched, but this point is kept fixed. To stretch locally and not move the object's position, one just needs to scale the axes:

Transform2D& Transform2D::scale(real_t scale) {
	ex *= scale;
	ey *= scale;
	return *this;
}

To do a non-local scaling about an arbitrary point p (in non-local coordinates):

Transform2D& Transform2D::scale_about(real_t scale, const Vector2D& about_point) {
	ex *= scale;
	ey *= scale;
	origin = scale * (origin - about_point) + about_point;
	return *this;
}

Actually, we could even generalize this and apply any linear transform (Basis2D) about an arbitrary point p (in non-local coordinates):

Transform2D& Transform2D::apply_about(const Basis2D& transform, const Vector2D& about_point) {
	ex = transform.apply(ex);
	ey = transform.apply(ey);
	origin = transform.apply(origin - about_point) + about_point;
	return *this;
}

But I do not think we should have such a method.

About Node2D (and Node3D) structure.

Unfortunately, Node2D's current implementation does not make proper use of Transform2D. A Transform2D _mat variable is "kept in sync" with the other variables pos, angle, _scale and skew. Instead, everything should be done in terms of Transform2D.

I would just suggest that Node2D be a (protected) subclass of Transform2D. Since GDScript does not allow multiple inheritance, instead of calling ClassDB::bind_method for every Transform2D operation, we should simply implement a cast (for usage in GDScript):

// Node2D member: exposes the underlying Transform2D for use from GDScript.
Transform2D& get_transform() {
	return *this;
}

We could, of course, use ClassDB::bind_method for very simple and common use cases: scale and rotate. But more complex operations should be done directly through Transform2D, as in the sketch below.
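
For those simple cases, the bound methods could just delegate to the underlying Transform2D, as in this sketch (it assumes Node2D inherits Transform2D as suggested; _notify_transform() here is only a placeholder for whatever mechanism Node2D uses to propagate changes):

void Node2D::rotate(real_t angle) {
	Transform2D::rotate(angle); // rotate about the local origin, as defined earlier
	_notify_transform();        // placeholder: propagate the change to the canvas
}

void Node2D::scale(real_t factor) {
	Transform2D::scale(factor);
	_notify_transform();
}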

Current source code status.

This is a very difficult topic to write about. I do want to criticize the code and suggest improvements. However, I do not want to make any developer feel uncomfortable in any manner! So, I'd like to start this section by apologizing! :-)

Vector2D and Vector3D

  1. Classes should be renamed to Vector2D and Vector3D.
  2. Files .cpp and .h should be renamed to include a "d".

Basis2D

  1. There is no Basis2D. It should be implemented exactly like Basis3D.

Basis3D: basis.h and basis.cpp

Basis is very well written.

Matrices... rows and columns...

The original (and proposed) class name suggests that Basis3D is not just a matrix. Its data can be represented by a matrix, and its operations can be translated into matrix language. In general, I do not see any reason to do this. A matrix is a table of numbers over which many operations are defined. The meaning of the table, as well as the meaning of those operations, depends on what the matrix is being used to represent.

Basis3D is not just a matrix. As the name suggests, it represents some basis for the 3-dimensional vector space we are working in. Those vectors could be called ex, ey and ez, for example. But they are called elements[0], elements[1] and elements[2]. So far, so good! Basis3D also represents a linear transform, in a very simple way. This linear transform converts local coordinates to canonical coordinates:

canonical = local.x * ex + local.y * ey + local.z * ez

There is no "left"/"right", no "row"/"column" in this.

Matrices, however, when you "draw" them as a table, do have "rows" and "columns". Vectors represented as matrices can be in the shape of "row matrices" or "column matrices". If we use "column matrices", then conversion from local (x,y,z) coordinates can be calculated like this:

[ex.x ey.x ez.x] [x]
[ex.y ey.y ez.y] [y]
[ex.z ey.z ez.z] [z]

with (x,y,z), ex, ey and ez represented by "columns". If, however, we choose "row matrices", we get:

[x y z] [ex.x ex.y ex.z]
        [ey.x ey.y ey.z]
        [ez.x ez.y ez.z]

We can think of matrices as a box that translates from "local" to "global" coordinates. When we use "columns", the "local side" of the matrix is the "right side". If you operate on the matrix from the "global point of view", you operate on the "left". If you want to operate "locally", you operate on the "right".

But this is when you choose to represent vectors as "columns". It happens that people usually like to consider vectors as "rows". But at the same time, they like to treat the matrix as if "from the right" means "local" and "from the left" means "global". This is not consistent!

Currently, the Basis constructor gets 3 vectors passed to it. I was supposing they were the "basis vectors", as the class name suggests. They are assigned to elements[n] and treated like "rows" of the matrix, because the matrix has entries elements[n][m], and people like to say that n is the "row" and m is the "column". So, vectors are rows, right?

The code, however, treats "local" things as done "from the right" and "global" things as done "from the left". That is, vectors are columns??? This is a little problematic, and this fact can be checked in the Basis::get_scale_abs() definition:

Vector3 Basis::get_scale_abs() const {
	return Vector3(
			Vector3(elements[0][0], elements[1][0], elements[2][0]).length(),
			Vector3(elements[0][1], elements[1][1], elements[2][1]).length(),
			Vector3(elements[0][2], elements[1][2], elements[2][2]).length());
}

What is being calculated here is the length of the columns of the matrix! With the notation of ex, ey and ez, the code would be:

Vector3 Basis::get_scale_abs() const {
	return Vector3(ex.length(), ey.length(), ez.length());
}

Another example is the comment at the beginning of Transform2D's constructor:

// Warning #1: basis of Transform2D is stored differently from Basis. In terms of elements array, the basis matrix looks like "on paper":
// M = (elements[0][0] elements[1][0])
//     (elements[0][1] elements[1][1])
// This is such that the columns, which can be interpreted as basis vectors of the coordinate system "painted" on the object, can be accessed as elements[i].
// Note that this is the opposite of the indices in mathematical texts, meaning: $M_{12}$ in a math book corresponds to elements[1][0] here.
// This requires additional care when working with explicit indices.
// See https://en.wikipedia.org/wiki/Row-_and_column-major_order for further reading.

I hope this convinces anyone of the harm that those unneeded "matrices", "rows", "columns", "lefts" and "rights" may cause. Since Basis3D and Basis2D are not part of a multidimensional matrix library, my suggestion is to avoid double indexes and simply use ex, ey and ez. I am not saying, of course, that the rows of the matrix are not important! The "rows", that is, vectors like Vector3D(ex.x, ey.x, ez.x), have an important meaning, especially when you are looking "from the inside to the outside". But those, too, ought to be called by a name that is meaningful in their context, not "row"! For example, the first "row" is the gradient (in local coordinates) of the x "global coordinate". So, in this context, instead of calling it "row", it could be called "x-gradient".

Roadmap.

  1. The class should be renamed to Basis3D.
  2. File names should have "_3d" appended.
  3. If documentation dictates that the vectors must be linearly independent (determinant != 0, invertible), then this should be asserted (via MATH_CHECKS) during construction; see the sketch after this list. (IMO)
  4. Redesign classes to use ex, ey and ez, just like Transform2D uses x and y.
  5. Document the precise geometric and algebraic meaning of each method. Possibly, rename some methods to something more "uniform" and meaningful.
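
For item 3, here is a sketch of what the assertion could look like (it assumes the renamed Vector3D/Basis3D and the ex/ey/ez members; MATH_CHECKS and ERR_FAIL_COND_MSG are existing engine mechanisms, but the constructor signature is only illustrative):

Basis3D::Basis3D(const Vector3D& p_ex, const Vector3D& p_ey, const Vector3D& p_ez) :
		ex(p_ex), ey(p_ey), ez(p_ez) {
#ifdef MATH_CHECKS
	ERR_FAIL_COND_MSG(Math::is_zero_approx(determinant()),
			"The basis vectors must be linearly independent (non-zero determinant).");
#endif
}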

transform.h and transform.cpp

  1. The class should be called Transform3D.
  2. File names should have "_3d" appended.
  3. Methods shall not work only in special cases. For example, inverse() always has to invert. See "Wrong formula for 2x2 matrix inversion" (godotengine/godot#48712).
  4. Give the user direct access to Transform3D::basis. (Instead of implementing scale_basis(), for example.)

transform_2d.h and transform_2d.cpp

  1. Transform2D has to be implemented (almost) exactly like Transform3D.

node_2d.h and node_2d.cpp

Node2D is basically a CanvasItem with coordinates. It is a Transform2D that can be put on a canvas. Maybe Node2D should subclass both CanvasItem and Transform2D. Everything you might want to do with a Transform2D, you also want to do with a Node2D: rotate, translate, scale, etc.

  1. Subclass Transform2D.
  2. Eliminate redundant variables angle, _scale, skew, etc.
  3. Since there is no multiple inheritance for GDScript, implement a get_transform() cast as explained above (somewhere).
  4. For GDScript, leave the rotate, translate and scale functions for ease of use, but eliminate the rest. If the user wants more complex manipulations, s/he should call get_transform().

Node3D.

Do by analogy! :-)
