Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix generate method to work with lists and dicts #46

Merged
merged 1 commit into from
Sep 8, 2022

Conversation

ialarmedalien
Copy link
Contributor

@ialarmedalien ialarmedalien commented Sep 6, 2022

This is a bug fix for cases where the top level data structure in a file is a list, not a dictionary.

e.g. try running json2models with test/test_cli/data/users.json as the input:

json2models -f pydantic -m ModelName test/test_cli/data/users.json

Result:

Traceback (most recent call last):
  File "/json2python/json_to_models/__main__.py", line 4, in <module>
    main()
  File "/json2python/json_to_models/cli.py", line 395, in main
    print(cli.run())
  File "/json2python/json_to_models/cli.py", line 121, in run
    meta = generator.generate(*data)
  File "/json2python/json_to_models/generator.py", line 49, in generate
    fields_sets = [self._convert(data) for data in data_variants]
  File "/json2python/json_to_models/generator.py", line 49, in <listcomp>
    fields_sets = [self._convert(data) for data in data_variants]
  File "/json2python/json_to_models/generator.py", line 58, in _convert
    for key, value in data.items():
AttributeError: 'list' object has no attribute 'items'

Flattening out the input lists in the generate method allows files like this to be parsed correctly.

@coveralls
Copy link

coveralls commented Sep 6, 2022

Coverage Status

Coverage decreased (-0.02%) to 97.967% when pulling 3e274be on ialarmedalien:fix_list_file_input into 7a2c72d on bogdandm:master.

@bogdandm
Copy link
Owner

bogdandm commented Sep 7, 2022

So when do you get this error exactly? Why can't you call g.generate(*data) directly like cli.py does?

@bogdandm
Copy link
Owner

bogdandm commented Sep 7, 2022

If you use cli there is 'list' mode - see example https://github.com/bogdandm/json2python-models#f1-season-results

@ialarmedalien
Copy link
Contributor Author

So when do you get this error exactly? Why can't you call g.generate(*data) directly like cli.py does?
If you use cli there is 'list' mode - see example https://github.com/bogdandm/json2python-models#f1-season-results

One of the APIs I'm building models for returns results where the top level data structure is a list. I have a directory full of this API data that I want to be able to build a model from by running

json2models -f pydantic -m ModelName /path/to/results/dir/*.json

list mode only works on a single file, you can't (currently) combine numerous files if the top level structure is a list.

@bogdandm
Copy link
Owner

bogdandm commented Sep 7, 2022

list mode only works on a single file, you can't (currently) combine numerous files if the top level structure is a list.

Oh, I see. Maybe is it better to fix it? Should not be too hard (json_to_models/cli.py:89)

Main problem with your PR is that you fix only one edge case. You do not change signature to generate(*data: list | dict) and do not handle cases like g.generate([...], [...]) when 2+ lists are passed.

@ialarmedalien
Copy link
Contributor Author

ialarmedalien commented Sep 7, 2022

Main problem with your PR is that you fix only one edge case. You do not change signature to generate(*data: list | dict)

This is true -- I just fixed my problem, since I was surprised that the top level data structure would cause this issue (nothing in the README about it). I'm using this module for a work project so I want it to function on all the relevant inputs. It has been very useful and a real time saver!

It would be easy enough to extend the list flattening to take account of multiple levels of nesting, although if no one else has come across the issue yet, it would be a case of YAGNI...

The signature of generate doesn't need to be changed to fix the issue, and in an unfamiliar codebase, it's best to touch as little as possible to minimise the risk of messing up other stuff. 🙂

and do not handle cases like g.generate([...], [...]) when 2+ lists are passed.

That would mean there were two sets of -l <model_name> <JSON key> <JSON file> entered on the command line -- do you have a use case for that? (I didn't even know you could do that until I tried it just now)

I can imagine wanting to use several JSON key / JSON file combinations to build one model, e.g.

json2models -f pydantic \
    -l Driver DriverStandings.Driver testing_tools/real_apis/f1/driver_standings.json \
    -l Driver - testing_tools/real_apis/f1/drivers.json

but that's something I wouldn't implement unless someone actually asked for it (see: YAGNI).

Let me know what you would want to make the pull request acceptable.

@bogdandm
Copy link
Owner

bogdandm commented Sep 7, 2022

Since your main problem is the input data format and you are using CLI, you need to fix it at CLI code.

You could add logic to handle *.json to https://github.com/bogdandm/json2python-models/blob/master/json_to_models/cli.py#L89 (just like it's done at line 85). models_lists can contain multiple records for single model, later it will be grouped at setup_models_data. _process_path already implemented to handle file patterns, just not used in list mode (without any reason I think)

@ialarmedalien
Copy link
Contributor Author

I think it might make more sense to remove the list mode altogether (so that the user doesn't have to worry about what the top level data structure is), and put in a new argument, --path <JSON path>, which would be available regardless of how the input file is structured. That simplifies the logic in cli.py when setting up models vs models_lists and it enables filtering when the input is /path/to/*.json.

i.e.

# old command: json2models -f attrs -l ModelName json.path /path/to/file.json
# new version:
json2models -f attrs --path json.path -m ModelName /path/to/file.json

# old: json2models -f attrs -l ModelName - /path/to/file.json
# new:
json2models -f attrs -m ModelName /path/to/file.json

# doesn't matter whether *.json contains a dict or a list
json2models -f attrs -m ModelName /path/to/*.json

# currently impossible
json2models -f attrs --path json.path -m ModelName /path/to/*.json

The -l command line param could be left in place for backwards compatibility and behind the scenes converted to this kind of --path model.

@bogdandm
Copy link
Owner

bogdandm commented Sep 8, 2022

remove the list mode altogether

Didn't think about it before but sounds like a good idea. Obviously it requires a lot more work than your fix (and version 0.3 release probably), few edge cases come to mind.

Let do the following:

  • Merge and realease your fix as it will not hurt anybody
  • When I have free time I will try to remove list mode and add logic to detect 'list vs dict' cases
  • If everything works out - release version 0.3.0 and revert your fix

@ialarmedalien
Copy link
Contributor Author

That sounds great, thank you. I may have some time to work on it myself this week or next week, so if you could push any WIP branches to the repo, I can check to make sure that I'm not duplicating your effort.

@bogdandm bogdandm merged commit b09d9bf into bogdandm:master Sep 8, 2022
@ialarmedalien ialarmedalien deleted the fix_list_file_input branch September 8, 2022 16:43
bogdandm added a commit that referenced this pull request Sep 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants