This folder contains a list of data samples that are used by forte to facilitate test cases.

# List of Data Samples

## audio_reader_test

This directory consists of audio files that are used in a unit test for verifying the AudioReader in `forte/tests/forte/data/readers/audio_reader_test.py`. Currently it contains two `.flac` files excerpted from a HuggingFace dataset called [patrickvonplaten/librispeech_asr_dummy](https://huggingface.co/datasets/patrickvonplaten/librispeech_asr_dummy) for automatic speech recognition.
"2","Great CD","My lovely Pat has one of the GREAT voices of her generation. I have listened to this CD for YEARS and I still LOVE IT. When I'm in a good mood it makes me feel better. A bad mood just evaporates like sugar in the rain. This CD just oozes LIFE. Vocals are jusat STUUNNING and lyrics just kill. One of life's hidden gems. This is a desert isle CD in my book. Why she never made it big is just beyond me. Everytime I play this, no matter black, white, young, old, male, female EVERYBODY says one thing ""Who was that singing ?"""
"2","One of the best game music soundtracks - for a game I didn't really play","Despite the fact that I have only played a small portion of the game, the music I heard (plus the connection to Chrono Trigger which was great as well) led me to purchase the soundtrack, and it remains one of my favorite albums. There is an incredible mix of fun, epic, and emotional songs. Those sad and beautiful tracks I especially like, as there's not too many of those kinds of songs in my other video game soundtracks. I must admit that one of the songs (Life-A Distant Promise) has brought tears to my eyes on many occasions.My one complaint about this soundtrack is that they use guitar fretting effects in many of the songs, which I find distracting. But even if those weren't included I would still consider the collection worth it."
"1","Batteries died within a year ...","I bought this charger in Jul 2003 and it worked OK for a while. The design is nice and convenient. However, after about a year, the batteries would not hold a charge. Might as well just get alkaline disposables, or look elsewhere for a charger that comes with batteries that have better staying power."
"2","works fine, but Maha Energy is better","Check out Maha Energy's website. Their Powerex MH-C204F charger works in 100 minutes for rapid charge, with option for slower charge (better for batteries). And they have 2200 mAh batteries."
"2","Great for the non-audiophile","Reviewed quite a bit of the combo players and was hesitant due to unfavorable reviews and size of machines. I am weaning off my VHS collection, but don't want to replace them with DVD's. This unit is well built, easy to setup and resolution and special effects (no progressive scan for HDTV owners) suitable for many people looking for a versatile product.Cons- No universal remote."
"1","DVD Player crapped out after one year","I also began having the incorrect disc problems that I've read about on here. The VCR still works, but hte DVD side is useless. I understand that DVD players sometimes just quit on you, but after not even one year? To me that's a sign on bad quality. I'm giving up JVC after this as well. I'm sticking to Sony or giving another brand a shot."
"1","Incorrect Disc","I love the style of this, but after a couple years, the DVD is giving me problems. It doesn't even work anymore and I use my broken PS2 Now. I wouldn't recommend this, I'm just going to upgrade to a recorder now. I wish it would work but I guess i'm giving up on JVC. I really did like this one... before it stopped working. The dvd player gave me problems probably after a year of having it."
"1","DVD menu select problems","I cannot scroll through a DVD menu that is set up vertically. The triangle keys will only select horizontally. So I cannot select anything on most DVD's besides play. No special features, no language select, nothing, just play."
"2","Unique Weird Orientalia from the 1930's","Exotic tales of the Orient from the 1930's. ""Dr Shen Fu"", a Weird Tales magazine reprint, is about the elixir of life that grants immortality at a price. If you're tired of modern authors who all sound alike, this is the antidote for you. Owen's palette is loaded with splashes of Chinese and Japanese colours. Marvelous."
"1","Not an ""ultimate guide""","Firstly,I enjoyed the format and tone of the book (how the author addressed the reader). However, I did not feel that she imparted any insider secrets that the book promised to reveal. If you are just starting to research law school, and do not know all the requirements of admission, then this book may be a tremendous help. If you have done your homework and are looking for an edge when it comes to admissions, I recommend some more topic-specific books. For example, books on how to write your personal statment, books geared specifically towards LSAT preparation (Powerscore books were the most helpful for me), and there are some websites with great advice geared towards aiding the individuals whom you are asking to write letters of recommendation. Yet, for those new to the entire affair, this book can definitely clarify the requirements for you."
`DataPack` includes a payload for audio data and metadata for the sample rate. You can set them by calling the `set_audio` method:

```python
from forte.data.data_pack import DataPack

pack: DataPack = DataPack()
pack.set_audio(audio, sample_rate)
```

The input parameter `audio` should be a NumPy array of the raw waveform, and `sample_rate` should be an integer that specifies the sample rate. You can then access these data through `DataPack.audio` and `DataPack.sample_rate`.
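For illustration, here is a minimal sketch (pure NumPy, no forte required) of building a waveform array of the kind `set_audio` expects; the one-second 440 Hz tone and the 16 kHz rate are arbitrary choices for this sketch:

```python
import numpy as np

sample_rate = 16000  # samples per second (arbitrary choice for this sketch)
t = np.linspace(0.0, 1.0, num=sample_rate, endpoint=False)
audio = np.sin(2.0 * np.pi * 440.0 * t)  # one second of a 440 Hz tone

# `audio` is the raw-waveform array and `sample_rate` the integer rate
# that DataPack.set_audio takes as arguments.
print(audio.shape)  # (16000,)
```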
## Audio Reader

`AudioReader` supports reading audio data from files under a specific directory. You can set it as the reader of your forte pipeline whenever you need to process audio files:

```python
from forte.pipeline import Pipeline
from forte.data.readers.audio_reader import AudioReader

Pipeline().set_reader(
    reader=AudioReader(),
    config={"file_ext": ".wav"}
).run(
    "path-to-audio-directory"
)
```

The example above builds a simple pipeline that walks through the specified directory and loads all the files with the `.wav` extension. `AudioReader` creates a `DataPack` for each file with the corresponding audio payload and sample rate.
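Conceptually, a reader like this just needs to walk the directory and collect files by extension; here is a stand-alone sketch of that collection step (an illustration only, not forte's actual implementation):

```python
import os
from typing import Iterator


def collect_audio_files(root: str, file_ext: str = ".wav") -> Iterator[str]:
    """Yield the path of every file under `root` whose name ends in `file_ext`."""
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(file_ext):
                yield os.path.join(dirpath, name)
```

Each collected path would then correspond to one `DataPack` produced by the reader.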
`DataPack.get()` is commonly used to retrieve entries from a datapack. In some cases, we are only interested in entries from a specific range. `DataPack.get()` allows users to set `range_annotation`, which controls the search area of the sub-types. If `DataPack.get()` is called frequently with queries related to the `range_annotation`, you may consider building a coverage index for the related entry types. Users can call `DataPack.build_coverage_for(context_type, covered_type)` to create a mapping between a pair of entry types and the target entries that are covered by the ranges of the outer entries.

For example, if you need to get all the `Token`s from some `Sentence`, you can write your code as:
```python
from ft.onto.base_ontology import Sentence, Token

# Iterate through all the sentences in the pack.
for sentence in input_pack.get(Sentence):
    # Take all tokens from a sentence
    token_entries = input_pack.get(
        entry_type=Token, range_annotation=sentence
    )
```
However, the snippet above may become a bottleneck if you have many `Sentence` and `Token` entries inside the datapack. To speed up this process, you can build a coverage index first:
```python
# Build coverage index between `Token` and `Sentence`
input_pack.build_coverage_for(
    context_type=Sentence,
    covered_type=Token
)
```
This `DataPack.build_coverage_for(context_type, covered_type)` function builds a mapping from `context_type` to `covered_type`, allowing faster retrieval of inner entries covered by outer entries inside the datapack.
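The speed-up comes from precomputing, for each outer entry, the inner entries whose spans fall inside it. A toy sketch of such a mapping using plain `(begin, end)` offset tuples (forte's internal index structure differs; this only illustrates the idea):

```python
# character-offset spans for two sentences and their tokens
sentences = [(0, 10), (11, 25)]
tokens = [(0, 3), (4, 10), (11, 15), (16, 25)]

# coverage index: sentence span -> token spans it fully covers
coverage = {
    sent: [tok for tok in tokens if sent[0] <= tok[0] and tok[1] <= sent[1]]
    for sent in sentences
}
print(coverage[(0, 10)])  # [(0, 3), (4, 10)]
```

With the index built once, each lookup is a dictionary access instead of a scan over all inner entries.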
We also provide a function called `DataPack.covers(context_entry, covered_entry)` for coverage checking. It returns `True` if the span of `covered_entry` is covered by the span of `context_entry`.
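The coverage check itself is just span containment; a minimal sketch of the semantics on `(begin, end)` tuples (not forte's implementation, which operates on entry objects):

```python
def covers(context_span, covered_span):
    """True when `covered_span` lies entirely within `context_span`."""
    return (context_span[0] <= covered_span[0]
            and covered_span[1] <= context_span[1])


print(covers((0, 10), (4, 10)))   # True
print(covers((0, 10), (8, 12)))   # False: extends past the context
```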
## docs/ontology_generation.md
Let us consider a simple ontology for documents of a pet shop.

```json
        {
            "name": "pet_type",
            "type": "str"
        },
        {
            "name": "price",
            "description": "Price for pet. A 2x2 matrix, whose columns are female/male and rows are juvenile/adult.",
            "type": "NdArray",
            "ndarray_dtype": "float",
            "ndarray_shape": [2, 2]
        }
    ]
},
```
Each entry definition can define a number of attributes (possibly none), mimicking class attributes:

* The `description` keyword is optionally used as the comment to describe the attribute.
* The `type` keyword is used to define the type of the attribute. Currently supported types are:
  * Primitive types - `int`, `float`, `str`, `bool`
  * Composite types - `List`, `Dict`, `NdArray`
  * Entries defined in the `top` module - The attributes can be of the type of base entries (defined in the `forte.data.ontology.top` module) and can be directly referred to by the class name.
* `key_type` and `value_type`: If the `type` of the property is a `Dict`, then these two represent the types of the key and the value of the dictionary; currently, only primitive types are supported as the `key_type`.
* `ndarray_dtype: str` and `ndarray_shape: array`: If the `type` of the property is `NdArray`, then these two represent the data type and the shape of the array. `NdArray` allows storing an N-dimensional array in an entry. For instance, with the simple pet shop ontology above, we can instantiate `Pet` and name it `dog`, then assign a matrix to the attribute `price` by `dog.price.data = [[2.99, 1.99], [4.99, 3.99]]`. Internally, this $2 \times 2$ matrix is stored as a NumPy array. When `ndarray_shape`/`ndarray_dtype` is specified, the shape/data type of any assigned array is verified against it. If both `ndarray_dtype` and `ndarray_shape` are provided, a placeholder is created by `numpy.ndarray(ndarray_shape, dtype=ndarray_dtype)`.
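The placeholder behaviour can be sketched directly with NumPy; the shape and dtype values below mirror the pet shop example above:

```python
import numpy as np

ndarray_shape = [2, 2]
ndarray_dtype = "float"

# when both dtype and shape are given, a placeholder array is created
placeholder = np.ndarray(ndarray_shape, dtype=ndarray_dtype)
print(placeholder.shape)  # (2, 2)

# the price matrix from the example; its shape matches the declaration
price = np.array([[2.99, 1.99], [4.99, 3.99]], dtype=ndarray_dtype)
assert price.shape == tuple(ndarray_shape)
```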
## Major ontology types, Annotations, Links, Groups and Generics
This folder contains a series of tutorial examples that walk through the basics of building audio processing pipelines using forte.

## Introduction

We provide a simple speech processing example here to showcase forte's capability to support a wide range of audio processing tasks. This example consists of two parts: speaker segmentation and automatic speech recognition.
### Speaker Segmentation

Speaker segmentation partitions an input audio stream into acoustically homogeneous segments according to speaker identity, splitting a conversation among one or more speakers into speaker turns. A typical speaker segmentation system finds potential speaker change points using the audio characteristics. In this example, the speaker segmentation is backed by a pretrained Hugging Face model; you can find the details [here](https://huggingface.co/pyannote/speaker-segmentation).
### Automatic Speech Recognition

Automatic Speech Recognition (ASR) develops methodologies and technologies that enable computers to recognize and translate spoken language into text. Here we use a simple example to show how to build a forte pipeline that performs speech transcription. This example is based on a pretrained wav2vec2 model; you can check out the details [here](https://huggingface.co/facebook/wav2vec2-base-960h).
## Run the Example Script

This example requires **Python 3.8 or later**. Before running the script, install the required packages:
```bash
pip install -r requirements.txt
```
Note that some packages (e.g., `soundfile`) depend on a system library called `libsndfile`, which might entail [additional steps](https://pysoundfile.readthedocs.io/en/latest/#installation) for Linux users.
Now you are able to run the example script `speaker_segmentation_pipeline.py`:
```bash
python speaker_segmentation_pipeline.py
```
which will print out the annotated transcription results, including speakers and their corresponding utterances. Each audio segment will be played through your PC speaker. Example output:
```
INFO:speaker_segmentation_pipeline.py:SPEAKER_01: HE JOINS US LIFE FROM THE ALLERT CENTER WITH WHAT VOTERS THINK OF TO NIGHT'S DEBATE MICHAEL
```
We include a `test_audio.wav` extracted from the [VoxConverse speaker diarisation dataset](https://github.com/joonson/voxconverse) in this example. It is a conversation among three speakers speaking in turns. The example script will partition the audio, transcribe the waveform, and play the audio segment for each speaker. The results are not meant to be 100% accurate, but they are still recognizable and reasonable.

## Code Walkthrough

The backbone of the example script is a simple forte pipeline for speech processing:
- [`AudioReader`](https://github.com/asyml/forte/blob/master/forte/data/readers/audio_reader.py) supports reading audio data from files under a specific directory. Use `file_ext` to configure the target file extension to include as input to your pipeline. You can set it as the reader of your forte pipeline whenever you need to process audio files.
- `SpeakerSegmentationProcessor` performs the speaker segmentation task utilizing a pretrained [model](https://huggingface.co/pyannote/speaker-segmentation) from HuggingFace. After partitioning the recording into segments, it creates annotations called [`AudioUtterance`](https://github.com/asyml/forte/blob/master/ft/onto/base_ontology.py#L537) to store the audio span and speaker information for later retrieval.
- `AudioUtteranceASRProcessor` transcribes audio segments into text for each `AudioUtterance` found in the input datapack. It appends the transcribed text to the text payload of the datapack and creates a corresponding [`Utterance`](https://github.com/asyml/forte/blob/master/ft/onto/base_ontology.py#L211) with speaker identity for each segment. To capture the one-to-one correspondence between `AudioUtterance` and `Utterance` within each segment, it adds a [`Link`](https://github.com/asyml/forte/blob/master/forte/data/ontology/top.py#L194) entry for each speech-to-text relationship.

After running the pipeline, you can retrieve the audio and text annotations from each segment by getting all the `Link`s inside the output datapack.
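Conceptually, each `Link` pairs an `AudioUtterance` (the audio span) with the `Utterance` holding its transcription. A stand-alone sketch of that one-to-one pairing using plain dataclasses (hypothetical stand-ins for the forte entry classes, showing only the shape of the data, not the forte API):

```python
from dataclasses import dataclass


@dataclass
class AudioUtterance:  # stand-in: audio span plus speaker identity
    speaker: str
    begin: int
    end: int


@dataclass
class Utterance:  # stand-in: transcribed text plus speaker identity
    speaker: str
    text: str


@dataclass
class Link:  # stand-in: one speech-to-text relationship
    parent: AudioUtterance
    child: Utterance


links = [
    Link(AudioUtterance("SPEAKER_01", 0, 48000),
         Utterance("SPEAKER_01", "HE JOINS US LIFE FROM THE ALLERT CENTER")),
]

for link in links:
    # each link gives direct access to both sides of a segment
    print(f"{link.child.speaker}: {link.child.text}")
```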