Migration to new API #131

Open
@sebffischer

The new OpenML API is currently about 50% done. It introduces several changes, some of which are listed here:
https://openml.github.io/server-api/migration/.
The new API is hosted here: https://test.openml.org/py/docs

Filter specifications are now part of the request body instead of the query.
Migration to the new API happens in this branch: https://github.com/mlr-org/mlr3oml/tree/feat/new-api

Overview of the API requests that are currently supported by mlr3oml and their status (i.e. whether they work in the branch with the new API):

  • Download Data Description:
    • Tags might differ: datasets are now annotated by LLMs, and the database on the new server predates that, so it does not have these new tags. This is not an issue.
    • The format of processing_time is now slightly different, so it had to be adjusted.
    • Also, some numbers are now correctly encoded as integers rather than strings.
    • The different arff link seems to be a bug.
    • The different parquet_url is intended, as the buckets are restructured.
    • We need to add some conversions but could also remove others. E.g. empty character vectors are returned as list() for length 0, but as character(1) for length 1.
    • In the future, ignore_attribute, row_id_attribute and target_attribute should never be NULL but always character vectors, so expect_oml_data() can change the test from expect_character(..., null.ok = TRUE) to expect_character(..., null.ok = FALSE).
  • Download Task Description
    • The $input field now has a different structure. This has to be addressed.
    • The task_name field is now called name.
  • Download Task Splits: Take the new structure of the task description into account.
  • Download Flow Description
  • Download Run Description
  • Download Predictions
  • List Data
  • List Tasks
  • List Flows
  • List Setups
  • List Evaluations
  • List Measures
  • Upload Dataset
  • Upload Task
  • Upload Collection
  • Download Data (arff)
    • Nothing should change here.
  • Download Data (parquet)
    • The URL changes, but this should not affect the code, as we retrieve it from the metadata description.
  • Download Data qualities
    • Nothing should change here.
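
The list() vs. character conversion mentioned under "Download Data Description" could be handled by a small normalization step. The following is only a sketch: as_chr_field() is a hypothetical helper that does not exist in mlr3oml, and desc is a hand-written stand-in for a parsed description, not real server output.

```r
# Hypothetical helper (not part of mlr3oml): coerce an attribute field of a
# parsed description to a character vector, so that NULL and list() both
# become character(0).
as_chr_field = function(x) {
  if (is.null(x) || length(x) == 0L) {
    return(character(0))
  }
  as.character(unlist(x, use.names = FALSE))
}

# Hand-written example description; the values are assumptions.
desc = list(
  ignore_attribute = list(),
  row_id_attribute = NULL,
  target_attribute = "class"
)

# Normalize the three attribute fields in place.
fields = c("ignore_attribute", "row_id_attribute", "target_attribute")
desc[fields] = lapply(desc[fields], as_chr_field)

str(desc[fields])
```

With such a step in place, all three fields would always be character vectors, which is what the stricter expect_character(..., null.ok = FALSE) check requires.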

Other stuff that will / might break:

  • return(NA_integer_)
    (These "No results" custom OML codes will soon be a single HTTP status code and will no longer be an error.)

  • if (response$oml_code %in% c(107L)) {

    (error code might be adjusted)

  • This needs to be updated, as this information is no longer part of the query but of the body:

    filters = imap_chr(filters, function(x, name) {

  • if (response$oml_code %in% c(107L)) {
    (Error code 107 will probably not exist anymore.)

  • Increment cache version when everything is done
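
To illustrate the filter item above, here is a rough sketch of the difference between encoding filters in the query and passing them in the body. The filter names and values are made up, the "name/value" path format for the old style is an approximation, and the body shape for the new API is an assumption that should be checked against the new API docs.

```r
# Hypothetical filter values for a listing request; NULL means "not set".
filters = list(
  tag = "study_14",
  number_instances = c(100L, 1000L),
  number_classes = NULL
)

# Drop filters that were not set.
active = Filter(Negate(is.null), filters)

# Old style (approximation): every filter becomes a "name/value" path
# segment; a length-2 value is treated as a range and joined with "..".
values = vapply(active, function(x) paste(x, collapse = ".."), character(1))
old_query = paste(sprintf("%s/%s", names(active), values), collapse = "/")

# New style (assumed shape): the same filters stay a named list that would
# be serialized to JSON and sent as the request body.
new_body = active

print(old_query)
str(new_body)
```

The point of the sketch is only that the string-building step disappears: the named list can be passed on as structured data instead of being pasted into the URL.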
