Rebase Shruti's testing changes (#303)
* Urgent fix to remove LIWC lexicons from public repo (#279) (#280)

* delete small test lexicons

* move .pkl files to assets and remove from GH

* filesystem cleanup

* update certainty pickle location

* remove unpickling certainty

* remove lexicons from pyproject

* change lexical pkl path

* add error handling when lexicons are not found

* update warning message

* add legal caveat and update name of certainty pkl to be correct

* ensure lexicons are ignored

* Update Documentation (Complete Conceptual Documentation, Document Assumptions) (#289)

* new docs

* lexicons hotfix

* emilys doc edits

* update deprecated github actions to latest

* update last remaining text features

* update index

* update docs

* update index

* update docs

* update docs and the feature dictionary

* add basics.rst

* add new basics page

* update docs

---------

Co-authored-by: Xinlan Emily Hu <xehu@wharton.upenn.edu>
Co-authored-by: Xinlan Emily Hu <xehu@cs.stanford.edu>

* update torch requirements to resolve compatibility issue on torch end (#290)

* Update Website (#291)

* website updates

* renaming tpm-website to website

* deploying via gh-pages

* changed from tpm-website to website

* deployed website

* copyright and team

* team headshots and footer

* edits to the pages

* website updates

* updated links

* updated homepage

* link updates

* mobile compatibility

* mobile adjustments

* navbar mobile updates

* whitespace edits

* homepage updates

* feature table

* add table of features

* updated team page titles

* include flask in requirements.txt

* updates to table of features

* load pages from top

* fix to 404 issues

* moved build under website folder

* updates to package launch

* hyperlink ./setup.sh

* fix nav bar sizing and hamburger logo

* include preprint

* updates to "getting started"

* update team

---------

Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>

* update documentation for clarity and correct typos in positivity z-score and information exchange and liwc

* Patch Release 0.1.3 (#292)

---------

Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>

* update pyproject.toml

* add demo notebook

* update notebook and add information to docs

* update documentation

* Add Examples Notebook (#294)

* add demo notebook

* update notebook and add information to docs

* update documentation

---------

Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>

* fix typo in demo

* Bump path-to-regexp and express in /website (#298)

* Bump path-to-regexp and express in /website

Bumps [path-to-regexp](https://github.com/pillarjs/path-to-regexp) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `path-to-regexp` from 0.1.7 to 0.1.10
- [Release notes](https://github.com/pillarjs/path-to-regexp/releases)
- [Changelog](https://github.com/pillarjs/path-to-regexp/blob/master/History.md)
- [Commits](pillarjs/path-to-regexp@v0.1.7...v0.1.10)

Updates `express` from 4.19.2 to 4.21.0
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.0/History.md)
- [Commits](expressjs/express@4.19.2...4.21.0)

---
updated-dependencies:
- dependency-name: path-to-regexp
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Xinlan Emily Hu <xehu@cs.stanford.edu>
Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>
Co-authored-by: Xinlan Emily Hu <xehu@wharton.upenn.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nltk from 3.8.1 to 3.9 (#297)

* Bump nltk from 3.8.1 to 3.9

Bumps [nltk](https://github.com/nltk/nltk) from 3.8.1 to 3.9.
- [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog)
- [Commits](nltk/nltk@3.8.1...3.9)

---
updated-dependencies:
- dependency-name: nltk
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update pyproject.toml

* Update requirements.txt

* Update download_resources.py

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Xinlan Emily Hu <xehu@cs.stanford.edu>
Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>
Co-authored-by: Xinlan Emily Hu <xehu@wharton.upenn.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump body-parser and express in /website (#296)

* Bump body-parser and express in /website

Bumps [body-parser](https://github.com/expressjs/body-parser) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `body-parser` from 1.20.2 to 1.20.3
- [Release notes](https://github.com/expressjs/body-parser/releases)
- [Changelog](https://github.com/expressjs/body-parser/blob/master/HISTORY.md)
- [Commits](expressjs/body-parser@1.20.2...1.20.3)

Updates `express` from 4.19.2 to 4.21.0
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.0/History.md)
- [Commits](expressjs/express@4.19.2...4.21.0)

---
updated-dependencies:
- dependency-name: body-parser
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Xinlan Emily Hu <xehu@cs.stanford.edu>
Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>
Co-authored-by: Xinlan Emily Hu <xehu@wharton.upenn.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Check embedding update (#295)

* update check embeddings with tqdm loading bar and BERT tokenization update

* (1) allow BERT sentiments to be generated from the messages with punctuation, rather than the preprocessed messages; (2) batch BERT sentiment generation to make it more efficient; (3) add loading bar for generation of chat-level features

---------

Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>

* Update README.md to remove col = "message"

* info diversity tests

* intermediate info diversity changes

rebasing!

* intermediate changes

* rebase dev

* removed print

* Update run_tests.py

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Shruti Agarwal <46203852+agshruti12@users.noreply.github.com>
Co-authored-by: amytangzheng <amy.tang.zheng@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: agshruti12 <agshruti2901@gmail.com>
5 people authored Sep 30, 2024
1 parent 5e3c3c6 commit 5d74b89
Showing 177 changed files with 8,287 additions and 1,292 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/github-actions-test-simple.yml
@@ -37,7 +37,7 @@ jobs:
 pytest test_package.py
 - name: Upload test results
-uses: actions/upload-artifact@v2
+uses: actions/upload-artifact@v4
 with:
 name: test-log
 path: ./tests/test.log
9 changes: 6 additions & 3 deletions .gitignore
@@ -31,8 +31,8 @@ MANIFEST
 .DS_Store

 # unwanted files
-src/team_comm_tools/features/lexicons/liwc_lexicons/
-src/team_comm_tools/features/lexicons/liwc_lexicons_small_test/
+src/team_comm_tools/features/lexicons/liwc_lexicons/*
+src/team_comm_tools/features/lexicons/liwc_lexicons_small_test/*
 src/team_comm_tools/features/lexicons/certainty.txt
 src/team_comm_tools/modules/
 src/team_comm_tools/output/*
@@ -55,4 +55,7 @@ node_modules/
 # testing
 /output
 /vector_data
-test.py
+test.py
+
+
+
4 changes: 2 additions & 2 deletions README.md
@@ -85,7 +85,7 @@ my_feature_builder = FeatureBuilder(
 )

 # this line of code runs the FeatureBuilder on your data
-my_feature_builder.featurize(col="message")
+my_feature_builder.featurize()
 ```

 ### Data Format
@@ -112,4 +112,4 @@ For more information, please refer to the [Introduction on our Read the Docs Pag
 Please visit our website, [https://teamcommtools.seas.upenn.edu/](https://teamcommtools.seas.upenn.edu/), for general information about our project and research. For more detailed documentation on our features and examples, please visit our [Read the Docs Page](https://conversational-featurizer.readthedocs.io/en/latest/).

 # Becoming a Contributor
-If you would like to make pull requests to this open-sourced repository, please read our [GitHub Repo Getting Started Guide](/github_repo_getting_started.md). We welcome new feature contributions or improvements to our framework.
+If you would like to make pull requests to this open-sourced repository, please read our [GitHub Repo Getting Started Guide](/github_repo_getting_started.md). We welcome new feature contributions or improvements to our framework.
Binary file modified docs/build/doctrees/environment.pickle
Binary file modified docs/build/doctrees/examples.doctree
Binary file modified docs/build/doctrees/feature_builder.doctree
Binary file modified docs/build/doctrees/features/basic_features.doctree
Binary file modified docs/build/doctrees/features/burstiness.doctree
Binary file modified docs/build/doctrees/features/certainty.doctree
Binary file modified docs/build/doctrees/features/discursive_diversity.doctree
Binary file modified docs/build/doctrees/features/fflow.doctree
Binary file modified docs/build/doctrees/features/get_all_DD_features.doctree
Binary file modified docs/build/doctrees/features/get_user_network.doctree
Binary file modified docs/build/doctrees/features/hedge.doctree
Binary file modified docs/build/doctrees/features/index.doctree
Binary file modified docs/build/doctrees/features/info_exchange_zscore.doctree
Binary file modified docs/build/doctrees/features/information_diversity.doctree
Binary file removed docs/build/doctrees/features/keywords.doctree
Binary file modified docs/build/doctrees/features/lexical_features_v2.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/other_lexical_features.doctree
Binary file modified docs/build/doctrees/features/politeness_features.doctree
Binary file modified docs/build/doctrees/features/politeness_v2.doctree
Binary file modified docs/build/doctrees/features/politeness_v2_helper.doctree
Binary file modified docs/build/doctrees/features/question_num.doctree
Binary file modified docs/build/doctrees/features/readability.doctree
Binary file modified docs/build/doctrees/features/reddit_tags.doctree
Binary file modified docs/build/doctrees/features/temporal_features.doctree
Binary file modified docs/build/doctrees/features/textblob_sentiment_analysis.doctree
Binary file modified docs/build/doctrees/features/turn_taking_features.doctree
Binary file removed docs/build/doctrees/features/user_centroids.doctree
Binary file modified docs/build/doctrees/features/variance_in_DD.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/word_mimicry.doctree
Binary file modified docs/build/doctrees/features_conceptual/TEMPLATE.doctree
Binary file modified docs/build/doctrees/features_conceptual/index.doctree
Binary file modified docs/build/doctrees/index.doctree
Binary file modified docs/build/doctrees/intro.doctree
Binary file modified docs/build/doctrees/utils/assign_chunk_nums.doctree
Binary file modified docs/build/doctrees/utils/calculate_chat_level_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/calculate_user_level_features.doctree
Binary file modified docs/build/doctrees/utils/check_embeddings.doctree
Binary file modified docs/build/doctrees/utils/gini_coefficient.doctree
Binary file modified docs/build/doctrees/utils/preload_word_lists.doctree
Binary file modified docs/build/doctrees/utils/preprocess.doctree
Binary file modified docs/build/doctrees/utils/summarize_features.doctree
Binary file modified docs/build/doctrees/utils/zscore_chats_and_conversation.doctree
2 changes: 1 addition & 1 deletion docs/build/html/.buildinfo
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 9a01a2cd3d4384710101b4a99edd7683
+config: d7678f479036f3220c73480ec4f2c467
 tags: 645f666f9bcd5a90fca523b33c5a78b7
37 changes: 23 additions & 14 deletions docs/build/html/_sources/examples.rst.txt
@@ -1,9 +1,16 @@
.. _examples:

Examples
=============
Worked Example
================

**Note:** Our "Examples" page is constantly being improved. This page is a work in progress!
Demo / Sample Code
*******************

After following the "Getting Started" steps below, the Team Communication Toolkit can be imported at the top of any Python script. We have provided a simple example file, "featurize.py", and a demo notebook, "demo.ipynb," under our `examples folder <https://github.com/Watts-Lab/team_comm_tools/tree/main/examples>`_ on GitHub.

You can also `access our demo notebook on Google Colab <https://colab.research.google.com/drive/1e8D5h_prRJsGs_N563EvpoQK0uZIAYsJ?usp=sharing>`_, where you can make a copy and run it on your own.

Finally, this page will walk you through a case study, highlighting top use cases and considerations when using the toolkit.

Getting Started
****************
@@ -27,18 +34,15 @@ In the event that some dependency installations fail (for example, you may get a
If you encounter a further issue in which the 'wordnet' package from NLTK is not found, it may be related to a known bug in NLTK in which the wordnet package does not unzip automatically. If this is the case, please follow the instructions to manually unzip it, documented in `this thread <https://github.com/nltk/nltk/issues/3028>`_.

You can also find a full list of our requirements `here <https://github.com/Watts-Lab/team_comm_tools/blob/main/requirements.txt>`_.

Import Recommendations: Virtual Environment and Pip
+++++++++++++++++++++++++++++++++++++++++++++++++++++

**We strongly recommend using a virtual environment in Python to run the package.** We have several specific dependency requirements. One important one is that we are currently only compatible with numpy < 2.0.0 because `numpy 2.0.0 and above <https://numpy.org/devdocs/release/2.0.0-notes.html#changes>`_ made significant changes that are not compatible with other dependencies of our package. As those dependencies are updated, we will support later versions of numpy.

**We also strongly recommend that your version of pip is up-to-date (>=24.0).** There have been reports in which users have had trouble downloading dependencies (specifically, the Spacy package) with older versions of pip. If you get an error with downloading ``en_core_web_sm``, we recommend updating pip.
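
As a rough illustration of the two recommendations above, the following sketch checks an environment for numpy < 2.0.0 and pip >= 24.0 using only the standard library. The helper names (``parse_version``, ``installed_version``, ``check_environment``) are hypothetical and are not part of the toolkit, and the version parsing is deliberately simplistic, so treat the result as a hint rather than a guarantee.

```python
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '1.26.4' into a tuple of ints.

    Only the leading digits of each dot-separated piece are used, so
    '2.0.0rc1' becomes (2, 0, 0). Hypothetical helper, not part of the toolkit.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version of `package` as a tuple, or None if absent."""
    try:
        return parse_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect warnings for the numpy (< 2.0.0) and pip (>= 24.0) recommendations."""
    warnings = []
    numpy_version = installed_version("numpy")
    if numpy_version is not None and numpy_version >= (2,):
        warnings.append("numpy >= 2.0.0 found; the docs recommend numpy < 2.0.0")
    pip_version = installed_version("pip")
    if pip_version is not None and pip_version < (24,):
        warnings.append("pip < 24.0 found; the docs recommend upgrading pip")
    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("WARNING:", message)
```

Running this inside the virtual environment you plan to use is the point of the check: a clean result in one environment says nothing about another.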

Using the Package
******************

After you install it, the Team Communication Toolkit can be imported at the top of any Python script. We have provided a simple example file, "featurize.py", under our `examples folder <https://github.com/Watts-Lab/team_comm_tools/tree/main/examples>`_ on GitHub, and this walkthrough will highlight some of our top use cases. However, it won't follow the file exactly.

Importing the Package
++++++++++++++++++++++

@@ -52,10 +56,15 @@ Now you have access to the :ref:`feature_builder`. This is the main class that y

*Note*: PyPI treats hyphens and underscores equally, so "pip install team_comm_tools" and "pip install team-comm-tools" are equivalent. However, Python does NOT treat them equally, and **you should use underscores when you import the package, like this: from team_comm_tools import FeatureBuilder**.

Running the FeatureBuilder on Your Data
++++++++++++++++++++++++++++++++++++++++
Walkthrough: Running the FeatureBuilder on Your Data
*****************************************************

Next, we'll go through the details of running the FeatureBuilder on your data, discussing each of the specific options / parameters at your disposal.

Configuring the FeatureBuilder
++++++++++++++++++++++++++++++++

Next, you'll want to get some data to run your FeatureBuilder on! The FeatureBuilder accepts any Pandas DataFrame as the input, so you can read in data in whatever format you like. For the purposes of this walkthrough, we'll be using some jury deliberation data from `Hu et al. (2021) <https://dl.acm.org/doi/pdf/10.1145/3411764.3445433?casa_token=d-b5sCdwpNcAAAAA:-U-ePTSSE3rY1_BLXy1-0spFN_i4gOJqy8D0CeXHLAJna5bFRTee9HEnM0TnK_R-g0BOqOn35mU>`_.
The FeatureBuilder accepts any Pandas DataFrame as the input, so you can read in data in whatever format you like. For the purposes of this walkthrough, we'll be using some jury deliberation data from `Hu et al. (2021) <https://dl.acm.org/doi/pdf/10.1145/3411764.3445433?casa_token=d-b5sCdwpNcAAAAA:-U-ePTSSE3rY1_BLXy1-0spFN_i4gOJqy8D0CeXHLAJna5bFRTee9HEnM0TnK_R-g0BOqOn35mU>`_.

We first import Pandas and read in the dataframe:

@@ -81,7 +90,7 @@ Now we are ready to call the FeatureBuilder on our data. All we need to do is de
output_file_path_conv_level = "./jury_output_conversation_level.csv",
turns = True
)
jury_feature_builder.featurize(col="message")
jury_feature_builder.featurize()
Basic Input Columns
^^^^^^^^^^^^^^^^^^^^
@@ -106,7 +115,7 @@ Basic Input Columns
timestamp_col = ("timestamp_start", "timestamp_end")
* **In the FeatureBuilder, we assume that every conversation has a unique identifying string, and that all the messages belonging to the same conversation have the same identifier.** Typically, we would use the column **conversation_id_col** to indicate the name of this identifier. However, we also support cases in which there is more than one identifier per conversation, and our example here illustrates this functionality. The **grouping_keys** parameter means that we want to group by more than one column, and allow the FeatureBuilder to treat unique combinations of the grouping keys as the "conversational identifier". This means that we treat each unique combination of "batch_num" and "round_num" as a different conversation.
* **In the FeatureBuilder, we assume that every conversation has a unique identifying string, and that all the messages belonging to the same conversation have the same identifier.** Typically, we would use the column **conversation_id_col** to indicate the name of this identifier. However, we also support cases in which there is more than one identifier per conversation, and our example here illustrates this functionality. The **grouping_keys** parameter means that we want to group by more than one column, and allow the FeatureBuilder to treat unique combinations of the grouping keys as the "conversational identifier". This means that we treat each unique combination of "batch_num" and "round_num" as a different conversation, and we *override* the **conversation_id_col** if a list of **grouping_keys** is present.

* In cases where you are using **conversation_id_col**, "conversation_num" is the default value for this parameter.
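
To illustrate what "unique combinations of the grouping keys" means, here is a minimal pandas sketch on toy data. It is not the toolkit's internal implementation; the rows are made up, and only the column names ("batch_num", "round_num", "conversation_num") are taken from the text above.

```python
import pandas as pd

# Toy chat data (made up for illustration).
chats = pd.DataFrame({
    "batch_num": [0, 0, 0, 1, 1],
    "round_num": [0, 0, 1, 0, 0],
    "message": ["hi", "hello", "next round", "new batch", "ok"],
})

grouping_keys = ["batch_num", "round_num"]

# Assign one integer label per unique combination of the grouping keys,
# numbered in order of first appearance. Rows that share both a batch_num
# and a round_num end up in the same conversation.
chats["conversation_num"] = chats.groupby(grouping_keys, sort=False).ngroup()

print(chats)
```

In this toy example the five rows fall into three conversations: (0, 0), (0, 1), and (1, 0).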

@@ -162,7 +171,7 @@ Basic Input Columns

* These messages by John can be thought of as a single turn, in which he says, "Hey Michael, how are you? I wanted to talk to you real quick!" Instead, however, John sent three messages in a row, suggesting that he took three "turns." When the **turns** parameter is set to True, the FeatureBuilder will automatically combine messages like this into a single "turn."

* We note, however, that one of our features (`:ref:turn_taking_index`) will always give the value of "1" in the case when you set **turns=True**, since, by definition, people will never take multiple "turns" in a row.
* We note, however, that one of our features (:ref:`turn_taking_index`) will always give the value of "1" in the case when you set **turns=True**, since, by definition, people will never take multiple "turns" in a row.
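
The idea of collapsing consecutive messages from the same speaker into one "turn" can be sketched in pandas as follows. This is only an illustration under assumed column names ("speaker", "turn_id") and a hypothetical split of John's messages; it is not how the FeatureBuilder implements **turns=True** internally.

```python
import pandas as pd

# Toy conversation: John sends three messages in a row, then Michael replies.
chats = pd.DataFrame({
    "conversation_num": [1, 1, 1, 1],
    "speaker": ["John", "John", "John", "Michael"],
    "message": [
        "Hey Michael,",
        "how are you?",
        "I wanted to talk to you real quick!",
        "Sure, what's up?",
    ],
})

# A new turn starts whenever the speaker or the conversation differs
# from the previous row (the first row always starts a turn).
new_turn = (chats["speaker"] != chats["speaker"].shift()) | (
    chats["conversation_num"] != chats["conversation_num"].shift()
)
chats["turn_id"] = new_turn.cumsum()

# Join all messages within a turn into a single string.
turns = chats.groupby("turn_id", as_index=False).agg(
    conversation_num=("conversation_num", "first"),
    speaker=("speaker", "first"),
    message=("message", " ".join),
)

print(turns)
```

After collapsing, no speaker appears in two consecutive rows of the same conversation, which is why a turn-taking measure computed on such data cannot vary.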


Advanced Configuration Columns
10 changes: 9 additions & 1 deletion docs/build/html/_sources/features/index.rst.txt
@@ -48,4 +48,12 @@ Once utterance-level features are computed, we compute conversation-level featur

Speaker- (User) Level Features
*********************************
User-level features currently represent an aggregation of features at the utterance level (for example, the average number of words spoken *by a particular user*). There is therefore no separate speaker-level feature documentation; you may reference the :ref:`Speaker (User)-Level Features Page <user_level_features>` for more information.
User-level features generally represent an aggregation of features at the utterance level (for example, the average number of words spoken *by a particular user*). There is therefore limited speaker-level feature documentation, other than a function used to compute the "network" of other speakers that an individual interacts with in a conversation.

You may reference the :ref:`Speaker (User)-Level Features Page <user_level_features>` for more information.
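
As a rough illustration of what "an aggregation of utterance-level features" looks like, the sketch below averages a per-message word count for each speaker. The column names ("speaker", "num_words", "average_num_words") and the data are assumptions made for this example, not the toolkit's actual output schema.

```python
import pandas as pd

# Toy utterance-level features: one row per message.
utterances = pd.DataFrame({
    "conversation_num": [1, 1, 1, 1],
    "speaker": ["A", "B", "A", "B"],
    "num_words": [4, 2, 6, 10],
})

# Aggregate to the speaker level: average number of words per message,
# computed separately for each speaker within each conversation.
user_level = (
    utterances.groupby(["conversation_num", "speaker"], as_index=False)["num_words"]
    .mean()
    .rename(columns={"num_words": "average_num_words"})
)

print(user_level)
```

Other aggregations (sum, maximum, standard deviation) follow the same pattern by swapping the aggregation function.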


.. toctree::
:maxdepth: 1

get_user_network
7 changes: 0 additions & 7 deletions docs/build/html/_sources/features/keywords.rst.txt

This file was deleted.

7 changes: 0 additions & 7 deletions docs/build/html/_sources/features/user_centroids.rst.txt

This file was deleted.

@@ -1,4 +1,4 @@
.. _TEMPLATE:
.. _TEMPLATE:

FEATURE NAME
============
16 changes: 12 additions & 4 deletions docs/build/html/_sources/features_conceptual/index.rst.txt
@@ -5,15 +5,16 @@ Features: Conceptual Documentation

In contrast with the :ref:`Features: Technical Documentation <features_technical>` page, this page aims to provide a resource for conceptually understanding the features: what are they, what are they meant to measure, and how is our operationalization connected to concepts from social science?

**Please note that this page is currently under construction.**

Utterance- (Chat) Level Features
*********************************

.. toctree::
:maxdepth: 1

named_entity_recognition
time_difference
liwc
certainty
information_exchange
proportion_of_first_person_pronouns
message_length
@@ -28,16 +29,23 @@ Utterance- (Chat) Level Features
function_word_accommodation
mimicry_bert
moving_mimicry
time_difference
forward_flow
hedge
questions
conversational_repair
politeness_strategies
politeness_receptiveness_markers
online_discussions_tags


Conversation-Level Features
****************************

.. toctree::
:maxdepth: 1

turn_taking_index
gini_coefficient
turn_taking_index
team_burstiness
discursive_diversity
information_diversity
10 changes: 4 additions & 6 deletions docs/build/html/_sources/index.rst.txt
@@ -1,7 +1,4 @@
.. Team Communication Toolkit documentation master file, created by
sphinx-quickstart on Fri Jun 14 12:54:37 2024.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
.. _index_main:

The Team Communication Toolkit
===============================
@@ -79,15 +76,16 @@ Once you import the tool, you will be able to declare a FeatureBuilder object, w
)
# this line of code runs the FeatureBuilder on your data
my_feature_builder.featurize(col="message")
my_feature_builder.featurize()
Use the Table of Contents below to learn more about our tool. We recommend that you begin in the "Introduction" section, then explore other sections of the documentation as they become relevant to you. More information on using our tool can be found in :ref:`examples`.
Use the Table of Contents below to learn more about our tool. We recommend that you begin in the "Introduction" section, then explore other sections of the documentation as they become relevant to you. We recommend reading :ref:`basics` for a high-level overview of the requirements and parameters, and then reading through the :ref:`examples` for a detailed walkthrough and discussion of considerations.

.. toctree::
:maxdepth: 2
:caption: Contents:

intro
basics
feature_builder
features/index
features_conceptual/index
6 changes: 5 additions & 1 deletion docs/build/html/_sources/intro.rst.txt
@@ -20,7 +20,7 @@ Finally, even when researchers build and measure their own features, what happen

What if there existed a single package that did it all for you? What if, instead of combing through the literature, deciding on constructs of interest, and putting together packages to build out features on your own, a vast (and ever-increasing!) collection of conversational attributes was readily available at your fingertips?

We introduce the **Team Communication Toolkit**: a "one-stop shop" for exploring conversational data. Our framework is a single package encompassing a variety of common, research-backed measures of communication. These include tools like `LIWC <https://www.liwc.app/>`_, `Convokit <https://convokit.cornell.edu/>`_, `The Conversational Receptiveness Package <https://www.mikeyeomans.info/papers/receptiveness.pdf>`_, `The Lexical Suite <https://www.lexicalsuite.com/>`_ and much more. If you are working with conversational data for the first time, or just seeking to understand what you can possibly learn from open-ended conversations, this is the right place for you. We have collected over 100 features that you can explore, so that researchers can spend more time learning from conversations and less time worrying about how to begin studying them.
We introduce the **Team Communication Toolkit**: a "one-stop shop" for exploring conversational data. Our framework is a single package encompassing a variety of common, research-backed measures of communication. These include tools like `LIWC <https://www.liwc.app/>`_, `ConvoKit <https://convokit.cornell.edu/>`_, `The Conversational Receptiveness Package <https://www.mikeyeomans.info/papers/receptiveness.pdf>`_, `The Lexical Suite <https://www.lexicalsuite.com/>`_ and much more. If you are working with conversational data for the first time, or just seeking to understand what you can possibly learn from open-ended conversations, this is the right place for you. We have collected over 100 features that you can explore, so that researchers can spend more time learning from conversations and less time worrying about how to begin studying them.

The FeatureBuilder
*******************
@@ -54,6 +54,10 @@ The three levels of analysis are closely interconnected. In the Toolkit, Utteran

The driving functions for generating features at different levels are located in the :ref:`Utilities <utils>`. In general, you do not have to directly interact with these utilities, as the Toolkit generates utterance-, speaker-, and conversational-level features by default. However, you (as a researcher) may only be interested in a subset of the outputs, and customizable options will be made available in the FeatureBuilder soon.

Getting Started
*****************
Please refer to the :ref:`index_main` to get started. From there, we recommend reading :ref:`basics` for a high-level overview of the requirements and parameters, and then reading through the :ref:`examples` for a detailed walkthrough and discussion of considerations.

Feature Documentation
**********************
For technical information on the features generated by our Toolkit, please refer to the :ref:`Features: Technical Documentation <features_technical>` page.
7 changes: 4 additions & 3 deletions docs/build/html/_static/searchtools.js
@@ -178,7 +178,7 @@ const Search = {

htmlToText: (htmlString, anchor) => {
const htmlElement = new DOMParser().parseFromString(htmlString, 'text/html');
for (const removalQuery of [".headerlinks", "script", "style"]) {
for (const removalQuery of [".headerlink", "script", "style"]) {
htmlElement.querySelectorAll(removalQuery).forEach((el) => { el.remove() });
}
if (anchor) {
@@ -328,13 +328,14 @@ const Search = {
for (const [title, foundTitles] of Object.entries(allTitles)) {
if (title.toLowerCase().trim().includes(queryLower) && (queryLower.length >= title.length/2)) {
for (const [file, id] of foundTitles) {
let score = Math.round(100 * queryLower.length / title.length)
const score = Math.round(Scorer.title * queryLower.length / title.length);
const boost = titles[file] === title ? 1 : 0; // add a boost for document titles
normalResults.push([
docNames[file],
titles[file] !== title ? `${titles[file]} > ${title}` : title,
id !== null ? "#" + id : "",
null,
score,
score + boost,
filenames[file],
]);
}

0 comments on commit 5d74b89
