Skip to content

Conversation

@mmatera
Copy link
Contributor

@mmatera mmatera commented Oct 25, 2022

This PR splits the mathics.core.formatter module in one piece (of the same name) including the routines associated to boxes_to_format, and a second piece mathics.core.makeboxes with the format_element associated routines.

Also, a new function do_makeboxes (should I have named it eval_makeboxes?) splits the task of applying MakeBoxes rules.

include_form = True

# If form is Fullform, return it without changes
if form is SymbolFullForm:
Copy link
Member

@rocky rocky Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this code where the boxing is querying the form feels like it is going the wrong way around.

Let's hit reset here. Here is a proposed alternate organization.

At the top level, we have a Form which directs boxing for a given expression.

The form when given an expression then goes through the expression, boxing various parts of the expression.

Recall that previously we had just created a FormBaseClass. A suggestion is to add a method off of that, like make_boxes() or simply box().

"Boxing" is is a process in of itself; there are a number of standard boxing classes and methods. In Boxing, a format routine is called. For things like Complex numbers, or Integers, boxing is formatting. For Rational numbers and certain kinds of forms like DisplayForm or MathMLForm, there could still be boxes such as to separate the box numerator from the box denominator, and either that lower-level boxing is done immediately or another pass is made.

Organizationally, we'd could have a several kinds of boxing routines for say Complex numbers. These just call corresponding format routines appropriate for that kind of form.

But the Form doesn't have to be a parameter. Instead it can be implicit in the box() method of say some form like StandardForm or TranditionalForm.

Developers can add additional Boxes, or formats for particular new kinds of elements. And developers may want to use existing pieces of Forms, Boxes and formats in this.

However it shouldn't be necessary to specify everything, just the special parts. So again, the default boxing methods can be indicated by registering a boxing method for that new element type inside FormBaseClass.

So how would all of this work?

# In forms.py
class FormBaseClass(...):
   def register_box_method(cls, element_type: type, method: Callable):
      """ register ``method`` so method(...) is called when
      ``form.box(element, ...)`` is called.
      form is something like StandardForm, TraditionalForm, etc.

      To register the default boxing routine, register under the class
      ``FormBaseClass``
      """
      ....
      
   def box(cls, element: BaseElement, ...):
      """
      This method needs to be implemented for each Form and element_type.
      """
      element_type = type(element)
      method = cls.lookup(element_type)
      if method is None:
         # The default class should have been registered under FormBaseClass
	 method = FormBaseClass.lookup(element_type)
	 if method is None:
	    raise ImplementationError(f"No boxing routine or default boxing routine registered for {element_type}; encountered when handling form {cls}"
      method(element, ...)
      ...

  @classmethod
  def lookup(element_type):
      ...
# In builtin/box/numbers.py:

def box_complex_type1(element: BaseElement, ...):
   """
   Box and format a complex number of <type1>. (Replace <type1> with
   a more specific description.)
   """
   # Here there is no boxing, just formatting.
   format_complex_type1(element)

# Both InputForm and FullForm use the same kind of Boxing for Complex numbers,
# and they use type1 boxing/formatting.
InputForm.register_box_method(complex_box_type1)
FullForm.register_box_method(complex_box_type1)

def box_complex_type2(element: BaseElement, ...):
   """
   Box and format a complex number of <type2>. (Replace <type2> with
   a more specific description.)
   """
   # Here there is no boxing, just formatting.
   format_complex_type2(element)

# Register complex_box_type2 as the default way to box Complex
# when nothing else has been specified.
FormBaseClass.register_box_method(complex_box_type2)

# The below is not strictly necessary since we have a default,
# but we might want to make explicit which boxing is used for a given Form.
StandardForm.register_box_method(complex_box_type2)
Traditional.register_box_method(complex_box_type2)

def box_rational_2D: 
   ""
    Boxing for a Rational in 2D where a bar appears as dashes below the numerator and denominator
   """
   # Whatever needs to be done here
# Register box_rational_2D 
...
# In format/numbers.py:

def format_complex_type1(c: Complex, ...) -> str
   """
   format a complex number of <type1>. (Replace <type1> with
   a more specific description.)
   """
   # The the kind of formatting of complex numbers that we might do for other kinds of forms

def format_complex_type2(c: Complex, ...) ->str:
   """
   format a complex number of <type2>. (Replace <type2> with
   a more specific description.)
   """
   # The the kind of formatting of complex numbers that we might do for other kinds of forms

# in format/mathml.py
def format_rational_mathml(rational: Rational, ...)-> str:
   """
   format a rational number using MathML
   """
   # ...

# in format/2D
def format_rational_2Dl(rational: Rational, ...)-> str:
   """
   format a rational number in 2D such as used in DisplayForm
   # ...

Your thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's go by parts.
The aim of this PR was just to split two things that should be analyzed separately: "boxing" and "formatting" (where "formatting" is in the sense of "file formatting" or "rendering").
So, we have an expression (le't say, Plus[a,b]) then "boxing" produces, applying rules, something like RowBox[{"a","+","b"}]. This is "boxing". Then, we could keep it as an expression (that is what MakeBoxes or ToBoxes does) or we can render it into a text format ("a + b") or in a MathML format ("a + b") or into a "png" image, which is something needed to show the expression in the front end, or to implement ToString, or to export an expression into a file.

Now, "formatting" (/"rendering") has a nice and flexible implementation (which still could be improved), but "boxing" doesn't. So I am proposing first to split this into two different modules, which is all that is done here. Notice that just splitting this implies changes in many files.

So, what I do not get from your comment is if you think that there is an alternative way to implement all the functionality that here is in mathics.core.makeboxes without having this module at all, or what you are pointing out is that a part of the code does not belong here or must be re-implemented.

If we want to completely get ride of functions like format_element, do_format, etc, then let's close this PR and start in a different direction, moving back functions in mathics.core.formatter to methods/functions in mathics.builtin.forms.

Otherwise, I would like to merge this before doing any further change in the implementation, to make the subsequent PRs easier to follow.

Copy link
Member

@rocky rocky Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to get rid of do_format and format_element and replace that with a box instance method off of the Form. The box method then calls format methods for based on that form and possibly other additional parameters: encoding might be in there, the top-level may have set what kinds of low-level backends are available (SVG, MathML, Asymptote).

Although evaluation of M-expressions is complicated and although we pulled out of the class various operations for other reasons, here things I think can be simpler than it is for evaluation.

So let's close this and start again.

Let me make a first and incomplete cut at this like I did for FormBaseClass() since I may have this organization in mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, regarding your proposal. If I understand it well, the idea would be to replace the lookup table mechanism (through the element_formatter dict) to choose which function is used to implement the formatting on certain kinds of objects, by something attached to FormBaseClass.

I am OK with the part of if that exactly replaces the use of the lookup table, and possibly the default implementation of atom_to_boxes. However, we should take into account that formatting rules can be added from inside a session by adding definitions.
In that way, I guess we should keep a function like format_element and "do_format" ("eval_format"?) and "do_makeboxes" ("eval_makeboxes") that respectively, apply Format (high-level) and MakeBoxes (low-level) rules to the expression, but with an improved an clearer implementation.

Copy link
Member

@rocky rocky Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, regarding your proposal. If I understand it well, the idea would be to replace the lookup table mechanism (through the element_formatter dict) to choose which function is used to implement the formatting on certain kinds of objects, by something attached to FormBaseClass.

I am OK with the part of if that exactly replaces the use of the lookup table, and possibly the default implementation of atom_to_boxes. However, we should take into account that formatting rules can be added from inside a session by adding definitions. In that way, I guess we should keep a function like format_element and "do_format" ("eval_format"?) and "do_makeboxes" ("eval_makeboxes") that respectively, apply Format (high-level) and MakeBoxes (low-level) rules to the expression, but with an improved an clearer implementation.

I think we can accomodate all of this. And do it in a way where we can very closely mimic what we see in WMA with extensible rules.

Refactoring is a bit of work, so probably to start out there will be duplicate routines - the current way, and a new way. And/or that the first PR will work only on a small subset of things for the refactored way.

But after the first draft cut is in place, I think it will be both easier to discuss and clear and straightforward to fill out everything. And we will be able to more simply test the formatting of something like the low level rendering of something like 1 / x for either MathML or DisplayForm. (The test for something like 1 / x //MathMLForm right now is very fragile).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to get rid of do_format and format_element and replace that with a box instance method off of the Form. The box method then calls format methods for based on that form and possibly other additional parameters: encoding might be in there, the top-level may have set what kinds of low-level backends are available (SVG, MathML, Asymptote).

Although evaluation of M-expressions is complicated and although we pulled out of the class various operations for other reasons, here things I think can be simpler than it is for evaluation.

So let's close this and start again.

Let me make a first and incomplete cut at this like I did for FormBaseClass() since I may have this organization in mind.

OK, sure.

@mmatera mmatera closed this Oct 25, 2022
@rocky
Copy link
Member

rocky commented Oct 25, 2022

Let's leave this open as a draft. This is probably a better model to try to rewrite things than the current code base.

@rocky rocky reopened this Oct 25, 2022
@rocky rocky marked this pull request as draft October 25, 2022 12:23
rocky and others added 7 commits October 26, 2022 13:37
In mathicsscript we find that PrintForms and OutputForms are the empty
set otherwise. Note mystery around why we cannot use "importlib"
Move import forms *before* use of PrintForms
In particular we separate:

* top-level forms in $OutputForms,
* top-level forms not in $OutputForms
* form variables $OutputForms, $PrintForms
* base class code

This is the non-experimental part of #588
@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

Deprecated

@mmatera mmatera closed this Feb 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants