Visualizations #7

Open
wants to merge 84 commits into
base: main
84 commits
c84f126
display cos distance of each nugget in document view
nils-bz May 29, 2024
244f06e
add 3D grid to document widget which could later be used for visualiz…
nils-bz May 30, 2024
8920206
Show the cosine similarity value beneath the nuggets names
eneapane Jun 1, 2024
4885e34
Add bar chart, design and button need to be improved
eneapane Jun 1, 2024
234b3f0
add scatterplot
tagzyassi Jun 11, 2024
d627bd2
Adjust buttons in the view, lay them side by side
eneapane Jun 12, 2024
d27fa7b
Add labels on click for bar chart
eneapane Jun 13, 2024
435673f
Add colored bar charts
eneapane Jun 13, 2024
92827e3
show pca reduced embedding of attribute in DocumentWidget
nils-bz Jun 14, 2024
79cb3a5
fix type of point passed to update grid leading to wrong points being…
nils-bz Jun 14, 2024
1cd1741
refactor dim_red_value computation and add nugget embeddings to grid
nils-bz Jun 16, 2024
038f6cc
Add full screen 3D Grid View in separate window
eneapane Jun 22, 2024
0f47224
fixed scatterplot/barchart accumulation error
tagzyassi Jun 22, 2024
8a608e6
Remove print debugs and fix error
eneapane Jun 23, 2024
7d91b95
highlight currently selected nugget in visualizer
nils-bz Jun 24, 2024
258af40
remove pycache folder
nils-bz Jun 24, 2024
26472ac
rm pycache
Dongtaes Jun 25, 2024
0490a4f
implement possibility to use T-SNE dimension reduction
nils-bz Jun 25, 2024
90c205a
add static annotation indicating corresponding nugget to items in 3D …
nils-bz Jun 30, 2024
e32c832
Add list of most likely choices in the interactive matching widget
eneapane Jul 1, 2024
9585515
Make last commit's code more pythonic
eneapane Jul 1, 2024
2696666
adjust annotation boxes for scatterplot/barchart
tagzyassi Jul 3, 2024
51980c1
change colormap of scatterplot
tagzyassi Jul 6, 2024
0473890
enhance 3D grid by adding distances to annotations and possibility to…
nils-bz Jul 10, 2024
6b00569
Change buttons layout below 3D Grid
eneapane Jul 10, 2024
80f5f3e
add 3D grid enhancements to fullscreen grid
nils-bz Jul 15, 2024
3b08a0b
make bar chart horizontally scrollable
eneapane Jul 20, 2024
a8ef4f3
Adjust on-click bar chart
eneapane Jul 20, 2024
629284e
implement simple visualizer for document overview
nils-bz Jul 25, 2024
dca5725
several grid related enhancements and complete refactoring of visuali…
nils-bz Jul 27, 2024
1fce6c2
fix several grid related errors
nils-bz Jul 27, 2024
ad39638
improve visualizations in document overview: threshold label displays…
nils-bz Jul 29, 2024
32e34ea
implement first version of list visualizing changed best matches
nils-bz Jul 30, 2024
1cb12ce
fix issue with not correctly highlighted confirmed matches
nils-bz Jul 30, 2024
ee59737
improve tooltips related to newly added nuggets shown to the user
nils-bz Jul 30, 2024
8ceea6c
Track the user usage of the visual gadgets, preparation for the study
eneapane Aug 4, 2024
d0b657c
Improve logging, and add logs to .gitignore
eneapane Aug 7, 2024
1c30e98
Small fix
eneapane Aug 7, 2024
2340167
fix data not being set initially for scatterPlot and barChart
nils-bz Aug 5, 2024
2cc0a37
add possibility to en-/disable visualizations
nils-bz Aug 5, 2024
4dcba8b
add lists indicating which nuggets moved below/above threshold due to…
nils-bz Aug 7, 2024
6b7255b
track match/no_match button
Dongtaes Aug 15, 2024
36a054f
added accessibility button with IBM color palette
Dongtaes Aug 17, 2024
20c3d2f
Logging ready for show bar chart, show scatter plot, and embedding vi…
eneapane Aug 7, 2024
77e25db
fix issue with match update list
nils-bz Aug 15, 2024
7be5d8c
add dimension reducer to preprocess script
nils-bz Aug 16, 2024
740424f
remove duplicated nuggets in preprocessing phase
nils-bz Aug 17, 2024
7dcd162
fix several issues with changes lists and some small UI improvements
nils-bz Aug 17, 2024
05ce188
add possibility to switch between 3 levels of visualizations
nils-bz Aug 17, 2024
290708b
Track Show Suggestions in 3D button
eneapane Aug 24, 2024
aa39d1e
track tooltips in logs/user_report.txt
eneapane Aug 24, 2024
c3f30c5
add json file for jupyter processing
eneapane Aug 24, 2024
38bf3e2
more relevant information on which tooltip was activated, no informat…
eneapane Aug 24, 2024
577432f
Bug Fix of the Accessibility Button
Dongtaes Aug 27, 2024
149569f
add simple legend for 3D grid
nils-bz Sep 8, 2024
6d252e2
Add tutorial for using bar chart
eneapane Sep 11, 2024
5f77868
Fix y-axis annotation
eneapane Sep 11, 2024
674c0e7
Remove scatter plot
eneapane Sep 11, 2024
45334c4
Show tutorial for bar chart only once per application usage
eneapane Sep 11, 2024
e92fbca
Fix bar chart with suggestions
eneapane Sep 13, 2024
4617a55
implement information popups, splash screen and corresponding help menu
nils-bz Sep 14, 2024
cadb1d6
adjust resource folder
nils-bz Sep 14, 2024
2fdfe43
move helper model classes to proper location
nils-bz Sep 14, 2024
1f96924
replace png with svg
nils-bz Sep 19, 2024
2a56abf
increase nugget size in grid
nils-bz Sep 20, 2024
6b32d1e
fix wrong colors in legend
nils-bz Sep 20, 2024
b6b940a
fix custom-match workflow
nils-bz Sep 20, 2024
25c2c73
display similarity instead of distance in nugget list of document view
nils-bz Sep 24, 2024
a38f378
improve var name
nils-bz Sep 24, 2024
ca0cd8e
add documentation for PointLegend class
nils-bz Sep 24, 2024
47824dd
Add comments for bar chart
eneapane Sep 26, 2024
03505bd
add remaining documentation to visualizations.py
nils-bz Sep 28, 2024
503310c
start documenting data_insights.py
nils-bz Sep 28, 2024
f01f83d
add remaining documentation to data_insights.py
nils-bz Sep 28, 2024
7071bc9
add further documentation
nils-bz Sep 29, 2024
27ab0ca
add further documentation
nils-bz Sep 29, 2024
4afb9bf
add further documentation
nils-bz Sep 29, 2024
1659f95
fix enabling / disabling of accessible color palette
nils-bz Sep 30, 2024
ff53145
change color indicating threshold change and appearance of attribute …
nils-bz Sep 30, 2024
ab09738
Add comments study.py
eneapane Sep 30, 2024
c3f196b
Refactor text
eneapane Sep 30, 2024
c32464a
Update .gitignore
Jannis25 Jan 3, 2025
d596ca2
Update requirements and refactor visualization component in interacti…
Jannis25 Jan 20, 2025
697677d
Fix errors that occurred during rebase
Jannis25 Jan 21, 2025
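
Several commits above switch between showing the cosine distance and the cosine similarity of a nugget. As a reminder of how the two relate, here is a minimal sketch in plain Python. The function names are illustrative and not taken from the repository, and the convention `distance = 1 - similarity` is an assumption about what the UI displays.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equally long, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed convention: distance is one minus similarity."""
    return 1.0 - cosine_similarity(a, b)
```

Under this convention, parallel embeddings have distance 0 and orthogonal embeddings have distance 1, so a lower distance corresponds to a higher similarity.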
6 changes: 6 additions & 0 deletions .gitignore
@@ -5,6 +5,12 @@

/models/
/cache/
/build/
/executable/
.json
.pdf
.bson
/evaluation/datasets/aviation/documents
/evaluation/datasets/nobel/documents
/evaluation/results/
/logs/
2 changes: 2 additions & 0 deletions main.py
@@ -1,6 +1,7 @@
import logging
import sys

from PyQt6.QtCore import Qt
from PyQt6.QtWidgets import QApplication

from wannadb.resources import ResourceManager
@@ -14,6 +15,7 @@

with ResourceManager() as resource_manager:
    # set up PyQt application
    QApplication.setAttribute(Qt.ApplicationAttribute.AA_ShareOpenGLContexts)
    app = QApplication(sys.argv)

    window = MainWindow()
7 changes: 7 additions & 0 deletions requirements.txt
@@ -109,6 +109,7 @@ language-data==1.2.0
    # via langcodes
marisa-trie==1.2.0
    # via language-data
markdown==3.7
markdown-it-py==3.0.0
    # via rich
markupsafe==2.1.5
@@ -329,5 +330,11 @@ xxhash==3.5.0
yarl==1.12.1
    # via aiohttp

pyqtgraph==0.13.7

PyOpenGL==3.1.9

PyOpenGL_accelerate==3.1.9

# The following packages are considered to be unsafe in a requirements file:
# setuptools
7 changes: 5 additions & 2 deletions scripts/preprocess.py
@@ -6,11 +6,12 @@
 from wannadb.configuration import Pipeline
 from wannadb.data.data import Document, DocumentBase
 from wannadb.interaction import EmptyInteractionCallback
+from wannadb.preprocessing.dimension_reduction import PCAReducer
 from wannadb.preprocessing.embedding import BERTContextSentenceEmbedder, RelativePositionEmbedder, SBERTTextEmbedder, SBERTLabelEmbedder
 from wannadb.preprocessing.extraction import StanzaNERExtractor, SpacyNERExtractor
 from wannadb.preprocessing.label_paraphrasing import OntoNotesLabelParaphraser, SplitAttributeNameLabelParaphraser
 from wannadb.preprocessing.normalization import CopyNormalizer
-from wannadb.preprocessing.other_processing import ContextSentenceCacher
+from wannadb.preprocessing.other_processing import ContextSentenceCacher, DuplicatedNuggetsCleaner
 from wannadb.resources import ResourceManager
 from wannadb.statistics import Statistics
 from wannadb.status import EmptyStatusCallback
@@ -68,7 +69,9 @@ def main() -> None:
     SBERTLabelEmbedder("SBERTBertLargeNliMeanTokensResource"),
     SBERTTextEmbedder("SBERTBertLargeNliMeanTokensResource"),
     BERTContextSentenceEmbedder("BertLargeCasedResource"),
-    RelativePositionEmbedder()
+    RelativePositionEmbedder(),
+    DuplicatedNuggetsCleaner(),
+    PCAReducer()
 ])

 document_base = DocumentBase(documents, [])
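
The diff above appends a `DuplicatedNuggetsCleaner` step to the preprocessing pipeline, but its implementation is not part of this excerpt. The following is only a sketch of what such a step might do, assuming that two nuggets count as duplicates when they cover the same character span of a document (for example because two extractors found the same entity). The `Nugget` tuple and `drop_duplicate_nuggets` are hypothetical names, not the repository's API.

```python
from typing import List, NamedTuple, Set, Tuple


class Nugget(NamedTuple):
    """Hypothetical stand-in for InformationNugget: a text span in a document."""
    text: str
    start_char: int
    end_char: int


def drop_duplicate_nuggets(nuggets: List[Nugget]) -> List[Nugget]:
    """Keep the first nugget for every (start, end) span, preserving order.

    Assumption: "duplicated" means "covers the same character span".
    """
    seen: Set[Tuple[int, int]] = set()
    unique: List[Nugget] = []
    for nugget in nuggets:
        span = (nugget.start_char, nugget.end_char)
        if span not in seen:
            seen.add(span)
            unique.append(nugget)
    return unique
```

Running the cleaner before the dimension reducer, as the pipeline order in the diff suggests, would avoid computing reduced embeddings for nuggets that are discarded anyway.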
220 changes: 220 additions & 0 deletions wannadb/change_captor.py
@@ -0,0 +1,220 @@
"""
Module providing model classes which can be used to capture the changes caused by user feedback and propagate them
to the UI.
These changes are computed after each round of user feedback.
"""

from typing import Optional, Union, List

from PyQt6.QtGui import QColor

from wannadb.data.data import InformationNugget
from wannadb_ui.common import ThresholdPosition, AddedReason


class BestMatchUpdate:
    """
    Instances of this class represent an update of the best match of a document.

    Each instance provides the old best match and the new best match of a document as well as a count specifying how
    often similar best match changes happened.
    Another best match change is considered similar if it happened in the same feedback round and its new best match
    is equal.

    Methods
    -------
    old_best_match()
        Returns the old best match of the related document.
    new_best_match()
        Returns the new best match of the related document.
    count()
        Returns the number of similar best match changes that happened in the same feedback round.
    """

    def __init__(self, old_best_match: str, new_best_match: str, count: int):
        """
        Parameters
        ----------
        old_best_match: str
            The old best match of the related document.
        new_best_match: str
            The new best match of the related document.
        count: int
            The number of similar best match changes that happened in the same feedback round.
        """

        self._old_best_match: str = old_best_match
        self._new_best_match: str = new_best_match
        self._count: int = count

    @property
    def old_best_match(self) -> str:
        return self._old_best_match

    @property
    def new_best_match(self) -> str:
        return self._new_best_match

    @property
    def count(self) -> int:
        return self._count
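

The docstring above says that best match changes with an equal new best match in the same feedback round are collapsed into one instance with a count. The code that performs this aggregation is not shown in this excerpt, so the following is only a sketch of how it might work. `BestMatchUpdateSketch` and `aggregate_best_match_changes` are hypothetical names, and keeping the old best match of the first change in each group is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BestMatchUpdateSketch:
    """Hypothetical stand-in mirroring the three properties of BestMatchUpdate."""
    old_best_match: str
    new_best_match: str
    count: int


def aggregate_best_match_changes(changes: List[Tuple[str, str]]) -> List[BestMatchUpdateSketch]:
    """Collapse the (old, new) changes of one feedback round by their new best match.

    Assumption: the old best match of the first change in a group represents the group.
    """
    grouped: Dict[str, BestMatchUpdateSketch] = {}
    for old, new in changes:
        if new in grouped:
            previous = grouped[new]
            grouped[new] = BestMatchUpdateSketch(previous.old_best_match, new, previous.count + 1)
        else:
            grouped[new] = BestMatchUpdateSketch(old, new, 1)
    # dicts preserve insertion order, so groups appear in order of first occurrence
    return list(grouped.values())
```

With this grouping, a feedback round that changes the best match of two documents to the same text yields a single entry with `count == 2`, which keeps the list shown to the user short.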


class ThresholdPositionUpdate:
    """
    Instances of this class represent an update of the position of a nugget's distance relative to the threshold.

    Each instance provides the text of the nugget whose position changed, the old position (above or below), the new
    position (above or below), the old and new distance of the nugget as well as a count indicating how often similar
    changes happened in the same feedback round.
    A change is considered similar if it happened in the same feedback round, the text represented by the nugget is
    equal, and it has the same type of update (above -> below or below -> above).

    As mentioned, an instance of this class can cover multiple changes if the texts of the changed nuggets are equal.
    In this case the distance related properties are None as the instance doesn't refer to a single nugget.
    """

    def __init__(self,
                 nugget_text: str,
                 old_position: Optional[ThresholdPosition], new_position: ThresholdPosition,
                 old_distance: Optional[float], new_distance: Optional[float],
                 count: int):
        """
        Parameters
        ----------
        nugget_text: str
            Text of the nuggets whose position relative to the threshold changed.
        old_position: Optional[ThresholdPosition]
            Previous position of the covered nuggets relative to the threshold (above or below).
        new_position: ThresholdPosition
            New position of the covered nuggets relative to the threshold (above or below).
        old_distance: Optional[float]
            Old distance associated with the nugget. If multiple nuggets are covered by this instance, this will be
            None.
        new_distance: Optional[float]
            New distance associated with the nugget. If multiple nuggets are covered by this instance, this will be
            None.
        count: int
            Number of similar changes that happened in the same feedback round.
        """

        self._best_guess: str = nugget_text
        self._old_position: Optional[ThresholdPosition] = old_position
        self._new_position: ThresholdPosition = new_position
        self._old_distance: Optional[float] = old_distance
        self._new_distance: Optional[float] = new_distance
        self._count: int = count

    @property
    def nugget_text(self) -> str:
        return self._best_guess

    @property
    def old_position(self) -> Optional[ThresholdPosition]:
        return self._old_position

    @property
    def new_position(self) -> ThresholdPosition:
        return self._new_position

    @property
    def old_distance(self) -> Optional[float]:
        return self._old_distance

    @property
    def new_distance(self) -> Optional[float]:
        return self._new_distance

    @property
    def count(self) -> int:
        return self._count
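

The docstring above describes two rules: changes are grouped by nugget text and direction, and the distances become None once a group covers more than one nugget. A minimal sketch of that rule follows. The grouping code itself is not part of this excerpt, so `Position`, `PositionChange`, `PositionUpdateSketch` and `group_position_changes` are hypothetical stand-ins, and the enum member names are assumptions rather than the real `ThresholdPosition` values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Position(Enum):
    """Hypothetical stand-in for ThresholdPosition; member names are assumed."""
    ABOVE = "above"
    BELOW = "below"


@dataclass(frozen=True)
class PositionChange:
    """One nugget crossing the threshold in a feedback round."""
    nugget_text: str
    old_position: Position
    new_position: Position
    old_distance: float
    new_distance: float


@dataclass(frozen=True)
class PositionUpdateSketch:
    """Mirrors the properties of ThresholdPositionUpdate."""
    nugget_text: str
    old_position: Position
    new_position: Position
    old_distance: Optional[float]
    new_distance: Optional[float]
    count: int


def group_position_changes(changes: List[PositionChange]) -> List[PositionUpdateSketch]:
    """Group changes by (text, old position, new position)."""
    grouped: Dict[Tuple[str, Position, Position], List[PositionChange]] = {}
    for change in changes:
        key = (change.nugget_text, change.old_position, change.new_position)
        grouped.setdefault(key, []).append(change)

    updates = []
    for (text, old_pos, new_pos), members in grouped.items():
        # distances are only meaningful if the group refers to exactly one nugget
        single = members[0] if len(members) == 1 else None
        updates.append(PositionUpdateSketch(
            text, old_pos, new_pos,
            single.old_distance if single else None,
            single.new_distance if single else None,
            len(members)))
    return updates
```

Consumers of such an update therefore need to handle `None` distances whenever `count` is greater than one.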


class NewlyAddedNuggetContext:
    """
    Instances of this class represent a nugget newly added to the document overview.
    Each instance provides the old and new distance of the nugget as well as the reason why the system newly added
    the nugget.
    """

    def __init__(self,
                 nugget: InformationNugget,
                 old_distance: Optional[float],
                 new_distance: float,
                 added_reason: AddedReason):
        """
        Parameters
        ----------
        nugget: InformationNugget
            Newly added nugget.
        old_distance: Optional[float]
            Old distance associated with the nugget, may be None.
        new_distance: float
            New distance associated with the nugget.
        added_reason: AddedReason
            Reason for the nugget being newly added.
        """

        self._nugget: InformationNugget = nugget
        self._old_distance: Optional[float] = old_distance
        self._new_distance: float = new_distance
        self._added_reason: AddedReason = added_reason

    @property
    def nugget(self) -> InformationNugget:
        return self._nugget

    @property
    def old_distance(self) -> Optional[float]:
        return self._old_distance

    @property
    def new_distance(self) -> float:
        return self._new_distance

    @property
    def added_reason(self) -> AddedReason:
        return self._added_reason


class NuggetUpdatesContext:
    """
    Wrapper class bundling multiple types of nugget related updates.
    Nugget related updates refer to `NewlyAddedNuggetContext`, `ThresholdPositionUpdate` and `BestMatchUpdate`. Each
    instance holds one list of updates for each of these three update types.
    """

    def __init__(self,
                 newly_added_nugget_contexts: List[NewlyAddedNuggetContext],
                 best_match_updates: List[BestMatchUpdate],
                 threshold_position_updates: List[ThresholdPositionUpdate]):
        """
        Parameters
        ----------
        newly_added_nugget_contexts: List[NewlyAddedNuggetContext]
            List of all `NewlyAddedNuggetContext` instances wrapped by this instance.
        best_match_updates: List[BestMatchUpdate]
            List of all `BestMatchUpdate` instances wrapped by this instance.
        threshold_position_updates: List[ThresholdPositionUpdate]
            List of all `ThresholdPositionUpdate` instances wrapped by this instance.
        """

        self._newly_added_nugget_contexts: List[NewlyAddedNuggetContext] = newly_added_nugget_contexts
        self._best_match_updates: List[BestMatchUpdate] = best_match_updates
        self._threshold_position_updates: List[ThresholdPositionUpdate] = threshold_position_updates

    @property
    def newly_added_nugget_contexts(self) -> List[NewlyAddedNuggetContext]:
        return self._newly_added_nugget_contexts

    @property
    def best_match_updates(self) -> List[BestMatchUpdate]:
        return self._best_match_updates

    @property
    def threshold_position_updates(self) -> List[ThresholdPositionUpdate]:
        return self._threshold_position_updates
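

How the UI consumes a `NuggetUpdatesContext` is not visible in this excerpt. As an illustration only, a consumer might turn the three lists into a short status line after each feedback round. `MatchUpdate`, `UpdatesContext` and `summarize` below are hypothetical stand-ins, not code from the pull request.

```python
from typing import List, NamedTuple


class MatchUpdate(NamedTuple):
    """Hypothetical stand-in for BestMatchUpdate."""
    old_best_match: str
    new_best_match: str
    count: int


class UpdatesContext(NamedTuple):
    """Hypothetical stand-in exposing the same three lists as NuggetUpdatesContext."""
    newly_added_nugget_contexts: List[object]
    best_match_updates: List[MatchUpdate]
    threshold_position_updates: List[object]


def summarize(context: UpdatesContext) -> str:
    """Build a one-line summary of a feedback round."""
    # each BestMatchUpdate may stand for several similar changes, so sum the counts
    changed = sum(update.count for update in context.best_match_updates)
    return (f"{len(context.newly_added_nugget_contexts)} nuggets added, "
            f"{changed} best matches changed, "
            f"{len(context.threshold_position_updates)} threshold position updates")
```

Summing `count` instead of taking the list length matters because one aggregated update can represent several documents.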


