Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dedent=True doesn't work in python 3.12 #942

Closed
mmoskal opened this issue Jul 6, 2024 · 5 comments · Fixed by #959
Closed

dedent=True doesn't work in python 3.12 #942

mmoskal opened this issue Jul 6, 2024 · 5 comments · Fixed by #959

Comments

@mmoskal
Copy link
Collaborator

mmoskal commented Jul 6, 2024

The bug

The dedend=True argument to @guidance decorator seems to work in 3.10 and 3.11 but not in 3.12.

To Reproduce

The test below passes in 3.11 but not 3.12 on macOS x86.

import guidance

@guidance(stateless=True, dedent=True)
def character_maker2(lm):
    lm += f"""\
    {{
        "name": "{guidance.gen('name', stop='"', max_tokens=1)}",
    }}"""
    return lm

def test():
    lm = guidance.models._mock.Mock()
    lm += character_maker2()
    assert str(lm).startswith("{")

test()

System info (please complete the following information):

  • OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): macOS x86
  • Guidance Version (guidance.__version__): 0.1.15
@hudson-ai
Copy link
Collaborator

I've noticed the dedent kwarg but didn't previously know what it was doing... Essentially we are introspecting on the source code and rewriting it as if the user had written textwrap.dedent on their block strings?

Do you know if this behavior is documented anywhere? Is it important to maintain? @Harsha-Nori any opinions/thoughts?

@Harsha-Nori
Copy link
Collaborator

Harsha-Nori commented Jul 9, 2024

Python 3.12 had major changes to f-string behavior, so it doesn't surprise me that dedent is acting up now. Thanks for catching this!

@hudson-ai I don't know that we ever documented it, but the idea is to follow the same semantics as textwrap.dedent (ignore leading spaces) so that you can do natural in-line f-strings without having to undo all the indentation if you're in an already nested python block. If we don't have this, the f-string syntax for writing guidance programs gets quite ugly since every line after the first needs to be on the 0-indent level, eg:

def prog(lm):
    if blah:
        lm += f"""first line

second line
third line"""
    return lm

@hudson-ai
Copy link
Collaborator

Understood. That is indeed nice to have...

@riedgar-ms
Copy link
Collaborator

Just messing around a little with this (on Windows.... same behaviour), and the call to dedent happens here:

source = textwrap.dedent(inspect.getsource(f))

The result on Python 3.11 is

@guidance(stateless=True, dedent=True)
def character_maker2(lm):
    lm += f"""\
    {{
        "name": "{guidance.gen('name', stop='"', max_tokens=1)}",
    }}"""
    return lm

and on Python 3.12

@guidance(stateless=True, dedent=True)
def character_maker2(lm):
    lm += f"""\
    {{
        "name": "{guidance.gen('name', stop='"', max_tokens=1)}",
    }}"""
    return lm

which looks rather similar.

@riedgar-ms
Copy link
Collaborator

And continued prodding, this is definitely to do with something changing in how f-strings are handled, rather than dedent() per se.

In Python 3.12, the next portion of strip_multiline_string_indents() builds an AST. Judicious application of print(ast.dump(old_ast, indent=4)) yields:

Module(
    body=[
        FunctionDef(
            name='character_maker2',
            args=arguments(
                posonlyargs=[],
                args=[
                    arg(arg='lm')],
                kwonlyargs=[],
                kw_defaults=[],
                defaults=[]),
            body=[
                AugAssign(
                    target=Name(id='lm', ctx=Store()),
                    op=Add(),
                    value=JoinedStr(
                        values=[
                            Constant(value='    {\n        "name": "'),
                            FormattedValue(
                                value=Call(
                                    func=Attribute(
                                        value=Name(id='guidance', ctx=Load()),
                                        attr='gen',
                                        ctx=Load()),
                                    args=[
                                        Constant(value='name')],
                                    keywords=[
                                        keyword(
                                            arg='stop',
                                            value=Constant(value='"')),
                                        keyword(
                                            arg='max_tokens',
                                            value=Constant(value=1))]),
                                conversion=-1),
                            Constant(value='",\n    }')])),
                Return(
                    value=Name(id='lm', ctx=Load()))],
            decorator_list=[],
            type_params=[])],
    type_ignores=[])

Note in particular the line Constant(value=' {\n "name": "') Shifting back to Python 3.11, this line becomes Constant(value='{\n "name": "'), which lacks the leading spaces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants