WIP: generator accepts NdArray attribute #575

Merged
merged 45 commits into from
Jan 13, 2022
Commits
a7907cf
added static methods to infer pack type
Nov 1, 2021
def940c
added unit tests for datapack type inference.
Nov 2, 2021
b6e363a
Merge branch 'master' of github.com:asyml/forte
Nov 2, 2021
cc3423b
Merge branch 'master' of github.com:asyml/forte
Nov 19, 2021
a6fa453
fixed issue#558
Nov 23, 2021
4cb691c
Merge branch 'master' of github.com:asyml/forte
Dec 6, 2021
16bd32f
Merge branch 'master' into issue#558
zhanyuanucb Dec 7, 2021
16a934b
Merge branch 'master' of github.com:asyml/forte into issue#558
Dec 7, 2021
dd85896
fixed black: forte/data/selector.py
Dec 7, 2021
4731eb3
Merge branch 'issue#558' of github.com:zhanyuanucb/forte into issue#558
Dec 7, 2021
148a03b
fixed import-outside-toplevel
Dec 7, 2021
aca7510
Merge branch 'issue#558'
Dec 7, 2021
5d95063
Merge branch 'master' of github.com:asyml/forte
Dec 7, 2021
0cb0987
WIP: waiting for ndarray supported
Dec 8, 2021
45b4ce0
Ndarray -> ndarray
Dec 8, 2021
0193898
values -> value
Dec 10, 2021
d082966
added ndarray to SUPPORTED_PRIMITIVES
Dec 15, 2021
a4d8069
Merge branch 'master' of github.com:asyml/forte into issue#567
Dec 15, 2021
2ba663f
WIP: generator accepts NdArray attribute
Dec 20, 2021
6e58a57
remove breakpoints
Dec 20, 2021
0449b44
fixed None -> "None"
Dec 20, 2021
92d9eac
WIP: designing NdArrayProperty
Dec 21, 2021
54c4355
WIP: ndarray_size -> ndarray_shape
Dec 21, 2021
e4afbba
fixed lint
Dec 21, 2021
bcf7e3b
fixed lint
Dec 21, 2021
5702eb1
added ndarray property test cases
Dec 21, 2021
cc86f95
removed irrelevant onto spec
Dec 21, 2021
3ab5c8f
fixed lint
Dec 21, 2021
e1cd712
fixed lint
Dec 21, 2021
5886cb5
fixed black
Dec 21, 2021
f0ec23f
fixed description
Dec 21, 2021
605e374
added FNdArray, a wrapper class for NdArray metric
Dec 22, 2021
672c54a
handle None dtype
Dec 22, 2021
d98ee2d
removed Optional typing
Dec 22, 2021
c4f1279
fixed description
Dec 22, 2021
f355196
added unit tests for ndarray attribute
Dec 23, 2021
b8f7784
fixed black
Dec 23, 2021
d5c4184
Merge branch 'master' of github.com:asyml/forte into issue#567
Jan 5, 2022
4229358
Merge branch 'master' into issue#567
zhanyuanucb Jan 6, 2022
e0f1a37
added doc string and more tests
Jan 10, 2022
b729235
Merge branch 'issue#567' of github.com:zhanyuanucb/forte into issue#567
Jan 10, 2022
33ebd58
fixed type
Jan 10, 2022
c02d5f8
added reference to np.ndarray
Jan 10, 2022
a7e73a9
added unit tests for ndarray attribute against dtype, shape, and warning
Jan 11, 2022
7599c13
Merge branch 'master' into issue#567
zhanyuanucb Jan 13, 2022
46 changes: 46 additions & 0 deletions forte/data/ontology/code_generation_objects.py
Expand Up @@ -17,6 +17,7 @@
from abc import ABC
from pathlib import Path
from typing import Optional, Any, List, Dict, Set, Tuple
from numpy import ndarray

from forte.data.ontology.code_generation_exceptions import (
CodeGenerationException,
Expand Down Expand Up @@ -382,6 +383,51 @@ def to_field_value(self):
return self.name


class NdArrayProperty(Property):
Member

I think we need a docstring for this class. I understand that many other classes in this file don't have one, but that is a historical mistake; let's add docstrings wherever possible in future code.

Collaborator Author

Agree.

    """
    NdArrayProperty accepts parsed properties of NdArray and
    instructs the import manager to import and instantiate FNdArray
    as the default value in the generated code.
    """

    def __init__(
        self,
        import_manager: ImportManager,
        name: str,
        ndarray_dtype: Optional[str] = None,
        ndarray_shape: Optional[List[int]] = None,
        description: Optional[str] = None,
        default_val: Optional[ndarray] = None,
    ):
        self.type_str = "forte.data.ontology.core.FNdArray"
        super().__init__(
            import_manager,
            name,
            self.type_str,
            description=description,
            default_val=default_val,
        )
        self.ndarray_dtype: Optional[str] = ndarray_dtype
        self.ndarray_shape: Optional[List[int]] = ndarray_shape

    def internal_type_str(self) -> str:
        type_str = self.import_manager.get_name_to_use(self.type_str)
        return f"{type_str}"

    def default_value(self) -> str:
        if self.ndarray_dtype is None:
            return f"FNdArray(shape={self.ndarray_shape}, dtype={self.ndarray_dtype})"
        else:
            return f"FNdArray(shape={self.ndarray_shape}, dtype='{self.ndarray_dtype}')"

    def _full_class(self):
        item_type = self.import_manager.get_name_to_use(self.type_str)
        return item_type

    def to_field_value(self):
        return self.name


class DictProperty(Property):
def __init__(
self,
Expand Down
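To make the effect of `NdArrayProperty.default_value` concrete, here is a minimal sketch that reproduces only its string-building logic outside of Forte. The function name `render_fndarray_default` is hypothetical and not part of this PR; the output format follows the two f-strings in the diff above.

```python
from typing import List, Optional


def render_fndarray_default(
    shape: Optional[List[int]], dtype: Optional[str]
) -> str:
    """Hypothetical stand-in for NdArrayProperty.default_value()."""
    if dtype is None:
        # No dtype given: emit the bare None literal, not the string 'None'.
        return f"FNdArray(shape={shape}, dtype={dtype})"
    # A dtype name is quoted so that the generated code passes a string.
    return f"FNdArray(shape={shape}, dtype='{dtype}')"


print(render_fndarray_default([2, 3], "int64"))
print(render_fndarray_default(None, None))
```

The two branches exist because interpolating a dtype name without quotes would produce generated code that refers to an undefined name, while quoting `None` would turn it into a string.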
67 changes: 67 additions & 0 deletions forte/data/ontology/core.py
Expand Up @@ -598,6 +598,73 @@ def __iter__(self) -> Iterator[KeyType]:
yield from self.__data


class FNdArray:
    """
    FNdArray is a wrapper of a NumPy array that stores shape and data type
    of the array if they are specified. Only when both shape and data type
    are provided will FNdArray initialize a placeholder array through
    np.ndarray(shape, dtype=dtype).
    More details about np.ndarray(...):
    https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
    """

    def __init__(
        self, dtype: Optional[str] = None, shape: Optional[Iterable[int]] = None
    ):
        super().__init__()
        self._dtype: Optional[np.dtype] = (
            np.dtype(dtype) if dtype is not None else dtype
        )
        self._shape: Optional[tuple] = (
            tuple(shape) if shape is not None else shape
        )
        self._data: Optional[np.ndarray] = None
        if dtype and shape:
            self._data = np.ndarray(shape, dtype=dtype)

    @property
    def dtype(self):
        return self._dtype

    @property
    def shape(self):
        return self._shape

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, array: Union[np.ndarray, List]):
        if isinstance(array, np.ndarray):
            if self.dtype and not np.issubdtype(array.dtype, self.dtype):
                raise TypeError(
                    f"Expecting type or subtype of {self.dtype}, but got {array.dtype}."
                )
            if self.shape and self.shape != array.shape:
                raise AttributeError(
                    f"Expecting shape {self.shape}, but got {array.shape}."
                )
            self._data = array

        elif isinstance(array, list):
            array_np = np.array(array, dtype=self.dtype)
            if self.shape and self.shape != array_np.shape:
                raise AttributeError(
                    f"Expecting shape {self.shape}, but got {array_np.shape}."
                )
            self._data = array_np

        else:
            raise ValueError(
                f"Can only accept numpy array or python list, but got {type(array)}"
            )

        # Stored dtype and shape should match those of the provided array.
        self._dtype = self._data.dtype
        self._shape = self._data.shape


class Pointer(BasePointer):
"""
A pointer that points to an entry in the current pack, this is basically
Expand Down
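The checks performed by the `FNdArray.data` setter can be tried without installing Forte. The sketch below is a hypothetical helper (`coerce_like_fndarray` is not part of this PR) that mirrors the same order of checks and the same exception types; it assumes NumPy is available and uses explicit `is not None` tests where the diff relies on truthiness.

```python
import numpy as np


def coerce_like_fndarray(array, dtype=None, shape=None):
    """Hypothetical helper mirroring the checks in the FNdArray.data setter."""
    expected_dtype = np.dtype(dtype) if dtype is not None else None
    expected_shape = tuple(shape) if shape is not None else None

    if isinstance(array, list):
        # Lists are converted, so the declared dtype is applied on creation.
        array = np.array(array, dtype=expected_dtype)
    elif isinstance(array, np.ndarray):
        # Existing arrays are never cast; their dtype must already fit.
        if expected_dtype is not None and not np.issubdtype(
            array.dtype, expected_dtype
        ):
            raise TypeError(
                f"Expecting type or subtype of {expected_dtype}, "
                f"but got {array.dtype}."
            )
    else:
        raise ValueError(f"Unsupported input type: {type(array)}")

    if expected_shape is not None and expected_shape != array.shape:
        raise AttributeError(
            f"Expecting shape {expected_shape}, but got {array.shape}."
        )
    return array


ok = coerce_like_fndarray([[1, 2], [3, 4]], dtype="float64", shape=[2, 2])
print(ok.dtype, ok.shape)
```

Note the asymmetry this exposes: a Python list is cast to the declared dtype, whereas an `np.ndarray` with a non-matching dtype is rejected with `TypeError`.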
4 changes: 3 additions & 1 deletion forte/data/ontology/ontology_code_const.py
Expand Up @@ -28,6 +28,8 @@ class SchemaKeywords:
    element_type = "item_type"
    dict_key_type = "key_type"
    dict_value_type = "value_type"
    ndarray_dtype = "ndarray_dtype"
    ndarray_shape = "ndarray_shape"


# Some names are used as properties by the core types, they should not be
Expand Down Expand Up @@ -83,7 +85,7 @@ def get_ignore_error_lines(json_filepath: str) -> List[str]:

SUPPORTED_PRIMITIVES = {"int", "float", "str", "bool"}
NON_COMPOSITES = {key: key for key in SUPPORTED_PRIMITIVES}
-COMPOSITES = {"List", "Dict"}
+COMPOSITES = {"List", "Dict", "NdArray"}

ALL_INBUILT_TYPES = set(list(NON_COMPOSITES.keys()) + list(COMPOSITES))

Expand Down
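As a quick check of what the one-line change to `COMPOSITES` does, the constants from the diff above can be evaluated on their own. This snippet copies only those four assignments; nothing else from `ontology_code_const.py` is assumed.

```python
SUPPORTED_PRIMITIVES = {"int", "float", "str", "bool"}
NON_COMPOSITES = {key: key for key in SUPPORTED_PRIMITIVES}
COMPOSITES = {"List", "Dict", "NdArray"}

ALL_INBUILT_TYPES = set(list(NON_COMPOSITES.keys()) + list(COMPOSITES))

# "NdArray" is now a built-in type name, but it is not a primitive.
print(sorted(ALL_INBUILT_TYPES))
print("NdArray" in NON_COMPOSITES)
```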
38 changes: 38 additions & 0 deletions forte/data/ontology/ontology_code_generator.py
Expand Up @@ -32,6 +32,7 @@
import jsonschema
import typed_ast.ast3 as ast
import typed_astunparse as ast_unparse
from numpy import ndarray

from forte.data.ontology import top, utils
from forte.data.ontology.code_generation_exceptions import (
Expand All @@ -48,6 +49,7 @@
OntologySourceNotFoundException,
)
from forte.data.ontology.code_generation_objects import (
NdArrayProperty,
NonCompositeProperty,
ListProperty,
ClassTypeDefinition,
Expand Down Expand Up @@ -1078,6 +1080,40 @@ def parse_entry(

return entry_item, property_names

    def parse_ndarray(
        self,
        manager: ImportManager,
        schema: Dict,
        att_name: str,
        desc: str,
    ):
        ndarray_dtype = None
        if SchemaKeywords.ndarray_dtype in schema:
            ndarray_dtype = schema[SchemaKeywords.ndarray_dtype]

        ndarray_shape = None
        if SchemaKeywords.ndarray_shape in schema:
            ndarray_shape = schema[SchemaKeywords.ndarray_shape]

        if ndarray_dtype is None or ndarray_shape is None:
            warnings.warn(
                "Either dtype or shape is not specified."
                " It is recommended to specify both of them."
            )

        default_val = None
        if ndarray_dtype and ndarray_shape:
            default_val = ndarray(ndarray_shape, dtype=ndarray_dtype)

        return NdArrayProperty(
            manager,
            att_name,
            ndarray_dtype,
            ndarray_shape,
            description=desc,
            default_val=default_val,
        )

def parse_dict(
self,
manager: ImportManager,
Expand Down Expand Up @@ -1250,6 +1286,8 @@ def parse_property(self, entry_name: EntryName, schema: Dict) -> Property:
            return self.parse_dict(
                manager, schema, entry_name, att_name, att_type, desc
            )
        elif att_type == "NdArray":
            return self.parse_ndarray(manager, schema, att_name, desc)
        elif att_type in NON_COMPOSITES or manager.is_imported(att_type):
            self_ref = entry_name.class_name == att_type
            return self.parse_non_composite(
Expand Down
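The keyword lookup and warning in `parse_ndarray` can be illustrated with a Forte-free sketch. The function `read_ndarray_keywords` and the attribute name `embedding` are hypothetical; the keys `ndarray_dtype` and `ndarray_shape`, the `NdArray` type name, and the warning text come from the diff above.

```python
import warnings
from typing import Any, Dict, List, Optional, Tuple


def read_ndarray_keywords(
    schema: Dict[str, Any],
) -> Tuple[Optional[str], Optional[List[int]]]:
    """Hypothetical, Forte-free version of the lookups in parse_ndarray."""
    dtype = schema.get("ndarray_dtype")
    shape = schema.get("ndarray_shape")
    if dtype is None or shape is None:
        # Both keywords are optional, but omitting either one is discouraged.
        warnings.warn(
            "Either dtype or shape is not specified."
            " It is recommended to specify both of them."
        )
    return dtype, shape


# Illustrative attribute entry; the attribute name "embedding" is made up.
attribute = {
    "name": "embedding",
    "type": "NdArray",
    "ndarray_dtype": "float32",
    "ndarray_shape": [2, 3],
}
print(read_ndarray_keywords(attribute))

# Omitting both keywords still succeeds but emits a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(read_ndarray_keywords({"name": "embedding", "type": "NdArray"}))
print(len(caught))
```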
45 changes: 45 additions & 0 deletions forte/data/ontology/validation_schema.json
Expand Up @@ -95,6 +95,34 @@
"value_type": {
"description": "Item type for the case of Dice attributes",
"type": "string"
},
"ndarray_dtype": {
Member

Do we have tests that validate a JSON spec against this schema, ideally covering both success and failure cases?

Collaborator Author

Added a test against invalid shape.
Do you think we can rely on NumPy to validate dtype?

Member

It would be great if the schema could check the allowed dtype values, since schema validation happens without actually running the code, whereas NumPy validation happens late, at run time.

Collaborator Author

Sounds good.
I've added more tests against dtype and shape.

Member

The validation looks awesome, and thanks for the tests.

"description": "Data type for the case of NdArray attributes. Allow a subset of NumPy supported data types",
"type": "string",
"enum": [
"bool",
"bool8",
"int",
"int8",
"int32",
"int64",
"uint8",
"uint32",
"uint64",
"float",
"float32",
"float64",
"float96",
"float128",
"complex",
"complex128",
"complex192",
"complex256"
]
},
"ndarray_shape": {
"description": "Shape of N-dimensional array for the case of NdArray attributes",
"type": "array"
}
},
"anyOf": [
Expand Down Expand Up @@ -135,6 +163,23 @@
}
]
},
{
"allOf": [
{
"properties": {
"name": {
"enum": [
"NdArray"
]
}
},
"required": [
"name",
"type"
]
}
]
},
{
"allOf": [
{
Expand Down
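The review discussion above asks for dtype and shape to be rejected at schema-validation time rather than at run time. The project validates specs with `jsonschema`; the sketch below is only a standard-library stand-in for the two constraints added in this diff (`enum` on `ndarray_dtype`, `"type": "array"` on `ndarray_shape`). The function name `ndarray_spec_errors` is hypothetical.

```python
from typing import Any, Dict, List

# dtype names copied from the "enum" list in validation_schema.json above.
ALLOWED_DTYPES = (
    "bool", "bool8", "int", "int8", "int32", "int64",
    "uint8", "uint32", "uint64",
    "float", "float32", "float64", "float96", "float128",
    "complex", "complex128", "complex192", "complex256",
)


def ndarray_spec_errors(attribute: Dict[str, Any]) -> List[str]:
    """Hypothetical check of the two NdArray keywords in one attribute."""
    errors = []
    dtype = attribute.get("ndarray_dtype")
    if dtype is not None and dtype not in ALLOWED_DTYPES:
        errors.append(f"ndarray_dtype {dtype!r} is not an allowed value")
    shape = attribute.get("ndarray_shape")
    # The schema only requires a JSON array; element types are not checked.
    if shape is not None and not isinstance(shape, list):
        errors.append("ndarray_shape must be an array")
    return errors


print(ndarray_spec_errors({"ndarray_dtype": "float32", "ndarray_shape": [2, 3]}))
print(ndarray_spec_errors({"ndarray_dtype": "float16", "ndarray_shape": "2,3"}))
```

Because the enum is a subset of NumPy's dtypes, a name such as `float16` is valid for NumPy but would still be rejected by the schema.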