Skip to content

Conversation

andreasgerstmayr
Copy link
Contributor

Currently, [...] GROUP BY links throws the following exception:

beanquery.compiler.CompilationError: GROUP-BY a non-hashable type is not supported: "Column(name='links')"

because set is not hashable. frozenset is hashable, and can be used for the links and tags which don't need to be mutable.

Currently, `[...] GROUP BY links` throws the following exception:

    beanquery.compiler.CompilationError: GROUP-BY a non-hashable type is not supported: "Column(name='links')"

because `set` is not hashable. `frozenset` is hashable, and can be used
for the `links` and `tags` which don't need to be mutable.
@dnicolodi dnicolodi changed the title Support GROUP BY for links and tags sources.beancount: Correct type for links and tags columns Feb 4, 2025
@dnicolodi
Copy link
Collaborator

Thanks for working on this. I trusted the old type annotation in Beancount and deduced that these fields are returned as sets, not frozen sets. Newer Beancount has the right annotations. This should definitely be fixed, however, the proposed patch is not enough: there are other things that need to be adjusted to support frozen sets. The first that comes to mind is support to render columns with frozenset data type (without that the rendering code falls back to the rendering of generic objects, which is quite ugly for frozen sets).

It is not strictly related to the PR itself, but if the added test case is an example of why you need this functionality, I think there is a better way of doing what you are doing:

2010-02-21 open Assets:AccountsReceivable:Doctor20100221

2010-02-21 * "Doctor appointment"
    Assets:AccountsReceivable:Doctor20100221   1000.00 USD
    Assets:Bank:Checking

2010-02-22 * "Insurance reimbursement"
    Assets:AccountsReceivable:Doctor20100221   -100.00 USD
    Assets:Bank:Checking

2010-03-22 * "Insurance reimbursement"
    Assets:AccountsReceivable:Doctor20100221   -900.00 USD
    Assets:Bank:Checking

2010-03-22 close Assets:AccountsReceivable:Doctor20100221

with a query like:

SELECT
  FIRST(date) as date, 
  FIRST(payee) AS payee, 
  FIRST(narration) AS narration,
  SUM(position) AS balance
WHERE account ~ 'Assets:AccountsReceivable:'
GROUP BY account
HAVING not empty(sum(position))
ORDER BY balance DESC

@andreasgerstmayr
Copy link
Contributor Author

Thanks for working on this. I trusted the old type annotation in Beancount and deduced that these fields are returned as sets, not frozen sets. Newer Beancount has the right annotations. This should definitely be fixed, however, the proposed patch is not enough: there are other things that need to be adjusted to support frozen sets. The first that comes to mind is support to render columns with frozenset data type (without that the rendering code falls back to the rendering of generic objects, which is quite ugly for frozen sets).

Thanks! I'll update the PR and also add a new test for the rendering.

It is not strictly related to the PR itself, but if the added test case is an example of why you need this functionality, I think there is a better way of doing what you are doing:

I agree, it feels a bit like a hack (I was surprised that grouping by links works in beancount.query), but I like to simplicity of it, as I don't have to open/close a new account for every reimbursement. The auto_accounts plugin would help with this, but then I don't catch typos in account names (e.g. a misspelled Expenses:Groceries etc.).

@dnicolodi
Copy link
Collaborator

I agree, it feels a bit like a hack (I was surprised that grouping by links works in beancount.query), but I like to simplicity of it, as I don't have to open/close a new account for every reimbursement. The auto_accounts plugin would help with this, but then I don't catch typos in account names (e.g. a misspelled Expenses:Groceries etc.).

The problem of using links in this way is that it makes it impossible to use them for anything else as adding links to a transaction break the GROUP BY but there isn't a mechanism in place to make sure that additional links are not added. Furthermore, there is no check on the set of links is consistent (ie that there are no typos). Finally, it works kind of ok for tracking reimbursements where you have one expense and a few reimbursements transactions. However, it breaks down quite badly if you have a single transaction reimbursing expenses occurred in multiple transactions, or for which only a part is reimbursable. All these problems are not there using a dedicated reimbursement account per reimbursement request. I use this system to track the reimbursement of my travel expenses and it scales well. I think the cost of the open and close entries is worth the advantages.

@andreasgerstmayr
Copy link
Contributor Author

Good points, I'll consider it in the future.

I added a frozenset renderer (inherited from the set renderer) to the PR. Do I need to update anything else?

@andreasgerstmayr
Copy link
Contributor Author

Actually, you got me convinced. I just refactored my entire ledger (with autobean-refactor), and added a customized version of the auto_accounts plugin to only auto-open Assets:AccountsReceivable accounts. Best of both worlds.

Anyway, I think the PR is still valid, possibly there are other use cases for grouping by links or tags.

@hoostus
Copy link
Contributor

hoostus commented Aug 24, 2025

I ran into this today. I think it should be fixed primarily because it is a regression versus previous behaviour.

While I understand what dnicolodi is saying about using subaccounts instead that doesn't really help me with the reality that I've got 10 years of using beancount with tags and bean-query being able to group them. Yes, I could refactor my file but....I'd rather not if I don't have to.

If this won't be fixed then perhaps @andreasgerstmayr could post an example script of how to refactor a file to switch from tags to subaccounts.

@andreasgerstmayr
Copy link
Contributor Author

If this won't be fixed then perhaps @andreasgerstmayr could post an example script of how to refactor a file to switch from tags to subaccounts.

This is the script I used on my ledger:

# /// script
# dependencies = ["autobean-refactor"]
# ///
from pathlib import Path

from autobean_refactor import editor, models


def rename_account(model: models.RawModel, from_account: str, to_account: str):
    for token in model.tokens:
        if isinstance(token, models.Account) and token.value == from_account:
            token.value = to_account


def update_txn(txn: models.Transaction):
    for token in txn.tokens:
        if isinstance(token, models.Account) and token.value == "Assets:AccountsReceivable:Pending":
            for link in txn.links:
                if "reimbursement" in link:
                    year, name = link.replace("-reimbursement", "").split("-", 1)
                    txn.links.remove(link)
                    token.value = f"Assets:AccountsReceivable:{year}:{name.title()}"
                    break


e = editor.Editor()
with e.edit_file_recursive(Path("main.beancount")) as files:
    for path, file in files.items():
        print(f"Processing: {path}")
        for directive in file.directives:
            if isinstance(directive, models.Transaction):
                update_txn(directive)

For every transaction with a posting to Assets:AccountsReceivable:Pending and a link for example ^2022-reimbursement-kitchen, it changed the account to Assets:AccountsReceivable:2022:Kitchen. Make a backup before and use the script at your own risk.

@hoostus
Copy link
Contributor

hoostus commented Aug 25, 2025

If this won't be fixed then perhaps @andreasgerstmayr could post an example script of how to refactor a file to switch from tags to subaccounts.

This is the script I used on my ledger:

Thanks for that. Notes for others who use this:

  1. You'll still need to create all the accounts (or use an autoaccount plugin as mentioned above)
  2. You may need to update queries being used in scripts etc now that there are subaccounts. You need to use account ~ XYZ instead account = XYZ
  3. the rename_account method isn't actually used, so you can delete that. (But it does show you how you can use autobean_refactor to change account names as well.)
  4. I get an exception from the autobean_format framework but it can be worked around.
File "/home/justus/Documents/beancount/.venv/lib/python3.12/site-packages/autobean_refactor/editor.py", line 60, in edit_file_recursive
    os.makedirs(os.path.dirname(current_path), exist_ok=True)
  File "<frozen os>", line 225, in makedirs
FileNotFoundError: [Errno 2] No such file or directory: ''

This is because edit_file_recursive ends up calling os.makedirs(os.path.dirname(current_path), exist_ok=True) but because current_path was created with Path("my.beancount") the result of os.path.dirname is an empty string which means os.makedirs can't work. I don't know enough about autobean_refactor to understand why this didn't work as expected but you can just specify the absolute pathname instead Path("/home/justus/Documents/beancount/my.beancount") and it works as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants