Commit

Update Documentation (Complete Conceptual Documentation, Document Assumptions) (#289)

* new docs

* lexicons hotfix

* emilys doc edits

* update deprecated github actions to latest

* update last remaining text features

* update index

* update docs

* update index

* update docs

* update docs and the feature dictionary

* add basics.rst

* add new basics page

* update docs

---------

Co-authored-by: Xinlan Emily Hu <xehu@wharton.upenn.edu>
Co-authored-by: Xinlan Emily Hu <xehu@cs.stanford.edu>
3 people authored Sep 9, 2024
1 parent 8178caa commit 93e77d8
Showing 138 changed files with 809 additions and 422 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/github-actions-test-simple.yml
@@ -37,7 +37,7 @@ jobs:
pytest test_package.py
- name: Upload test results
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: test-log
path: ./tests/test.log
Binary file modified docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/build/doctrees/examples.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/feature_builder.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/basic_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/burstiness.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/certainty.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/discursive_diversity.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/fflow.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/get_all_DD_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/get_user_network.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/hedge.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/info_exchange_zscore.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/information_diversity.doctree
Binary file not shown.
Binary file removed docs/build/doctrees/features/keywords.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/lexical_features_v2.doctree
Binary file not shown.
Binary file not shown.
Binary file modified docs/build/doctrees/features/other_lexical_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/politeness_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/politeness_v2.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/politeness_v2_helper.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/question_num.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/readability.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/reddit_tags.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/temporal_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/textblob_sentiment_analysis.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/turn_taking_features.doctree
Binary file not shown.
Binary file removed docs/build/doctrees/features/user_centroids.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/variance_in_DD.doctree
Binary file not shown.
Binary file not shown.
Binary file modified docs/build/doctrees/features/word_mimicry.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features_conceptual/TEMPLATE.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features_conceptual/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/intro.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/assign_chunk_nums.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/calculate_chat_level_features.doctree
Binary file not shown.
Binary file not shown.
Binary file modified docs/build/doctrees/utils/calculate_user_level_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/check_embeddings.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/gini_coefficient.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/preload_word_lists.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/preprocess.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/summarize_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/zscore_chats_and_conversation.doctree
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/build/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 9a01a2cd3d4384710101b4a99edd7683
config: d7678f479036f3220c73480ec4f2c467
tags: 645f666f9bcd5a90fca523b33c5a78b7
4 changes: 1 addition & 3 deletions docs/build/html/_sources/examples.rst.txt
@@ -3,8 +3,6 @@
Examples
=============

**Note:** Our "Examples" page is constantly being improved. This page is a work in progress!

Getting Started
****************

@@ -106,7 +104,7 @@ Basic Input Columns
timestamp_col = ("timestamp_start", "timestamp_end")
* **In the FeatureBuilder, we assume that every conversation has a unique identifying string, and that all the messages belonging to the same conversation have the same identifier.** Typically, we would use the column **conversation_id_col** to indicate the name of this identifier. However, we also support cases in which there is more than one identifer per conversation, and our example here illustrates this functionality. The **grouping_keys** parameter means that we want to group by more than one column, and allow the FeatureBuilder to treat unique combinations of the grouping keys as the "conversational identifier". This means that we treat each unique combination of "batch_num" and "round_num" as a different conversation.
* **In the FeatureBuilder, we assume that every conversation has a unique identifying string, and that all the messages belonging to the same conversation have the same identifier.** Typically, we would use the column **conversation_id_col** to indicate the name of this identifier. However, we also support cases in which there is more than one identifer per conversation, and our example here illustrates this functionality. The **grouping_keys** parameter means that we want to group by more than one column, and allow the FeatureBuilder to treat unique combinations of the grouping keys as the "conversational identifier". This means that we treat each unique combination of "batch_num" and "round_num" as a different conversation, and we *override* the **conversation_id_col** if a list of **grouping_keys** is present.

* In cases where you are using **conversation_id_col**, "conversation_num" is the default value for this parameter.
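Reviewer note: the override rule described in this hunk can be sketched in plain Python. This is only an illustration of the documented behavior, not the Toolkit's implementation; the helper names `resolve_conversation_id` and `group_messages` and the sample rows are hypothetical, while `conversation_id_col`, `grouping_keys`, "conversation_num", "batch_num" and "round_num" come from the documentation text above.

```python
from collections import defaultdict


def resolve_conversation_id(row, conversation_id_col="conversation_num", grouping_keys=None):
    """Return the identifier used to assign a message to a conversation.

    Hypothetical helper that mirrors the documented rule only.
    """
    if grouping_keys:
        # A non-empty list of grouping keys overrides conversation_id_col:
        # each unique combination of key values is one conversation.
        return tuple(row[key] for key in grouping_keys)
    return row[conversation_id_col]


def group_messages(rows, **kwargs):
    """Collect message texts under their resolved conversation identifier."""
    conversations = defaultdict(list)
    for row in rows:
        conversations[resolve_conversation_id(row, **kwargs)].append(row["message"])
    return dict(conversations)


# Made-up sample data for illustration.
messages = [
    {"conversation_num": "A", "batch_num": 0, "round_num": 0, "message": "hi"},
    {"conversation_num": "A", "batch_num": 0, "round_num": 1, "message": "hello"},
    {"conversation_num": "A", "batch_num": 0, "round_num": 1, "message": "ready?"},
    {"conversation_num": "B", "batch_num": 1, "round_num": 0, "message": "start"},
]

by_default = group_messages(messages)
by_keys = group_messages(messages, grouping_keys=["batch_num", "round_num"])
```

With the default column the sample splits into two conversations ("A" and "B"); with the two grouping keys it splits into three, because the rows sharing "A" differ in "round_num".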

10 changes: 9 additions & 1 deletion docs/build/html/_sources/features/index.rst.txt
@@ -48,4 +48,12 @@ Once utterance-level features are computed, we compute conversation-level featur

Speaker- (User) Level Features
*********************************
User-level features currently represent an aggregation of features at the utterance- level (for example, the average number of words spoken *by a particular user*). There is therefore no separate speaker-level feature documentation; you may reference the :ref:`Speaker (User)-Level Features Page <user_level_features>` for more information.
User-level features generally represent an aggregation of features at the utterance- level (for example, the average number of words spoken *by a particular user*). There is therefore limited speaker-level feature documentation, other than a function used to compute the "network" of other speakers that an individual interacts with in a conversation.

You may reference the :ref:`Speaker (User)-Level Features Page <user_level_features>` for more information.


.. toctree::
:maxdepth: 1

get_user_network
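Reviewer note: the aggregation mentioned in this hunk (the average number of words spoken by a particular user) might look roughly like the sketch below. It assumes hypothetical column names "speaker" and "message" and made-up rows; it is not the Toolkit's own code.

```python
from collections import defaultdict
from statistics import mean

# Made-up utterance-level rows; column names are assumptions for illustration.
utterances = [
    {"speaker": "alice", "message": "hello there team"},
    {"speaker": "bob", "message": "hi"},
    {"speaker": "alice", "message": "shall we begin"},
    {"speaker": "bob", "message": "yes let us start"},
]


def average_words_per_speaker(rows):
    """Aggregate an utterance-level feature (word count) to the speaker level."""
    word_counts = defaultdict(list)
    for row in rows:
        # Utterance-level feature: whitespace-delimited word count.
        word_counts[row["speaker"]].append(len(row["message"].split()))
    # Speaker-level feature: mean of that speaker's utterance-level values.
    return {speaker: mean(counts) for speaker, counts in word_counts.items()}


speaker_averages = average_words_per_speaker(utterances)
```

Any other utterance-level feature could be substituted for the word count; only the per-speaker grouping step is the point of the sketch.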
7 changes: 0 additions & 7 deletions docs/build/html/_sources/features/keywords.rst.txt

This file was deleted.

7 changes: 0 additions & 7 deletions docs/build/html/_sources/features/user_centroids.rst.txt

This file was deleted.

@@ -1,4 +1,4 @@
.. _TEMPLATE:
.. _TEMPLATE:

FEATURE NAME
============
16 changes: 12 additions & 4 deletions docs/build/html/_sources/features_conceptual/index.rst.txt
@@ -5,15 +5,16 @@ Features: Conceptual Documentation

In contrast with the :ref:`Features: Technical Documentation <features_technical>` page, this page aims to provide a resource for conceptually understanding the features: what are they, what are they meant to measure, and how is our operationalization connected to concepts from social science?

**Please note that this page is currently under construction.**

Utterance- (Chat) Level Features
*********************************

.. toctree::
:maxdepth: 1

named_entity_recognition
time_difference
liwc
certainty
information_exchange
proportion_of_first_person_pronouns
message_length
@@ -28,16 +29,23 @@ Utterance- (Chat) Level Features
function_word_accommodation
mimicry_bert
moving_mimicry
time_difference
forward_flow
hedge
questions
conversational_repair
politeness_strategies
politeness_receptiveness_markers
online_discussions_tags


Conversation-Level Features
****************************

.. toctree::
:maxdepth: 1

turn_taking_index
gini_coefficient
turn_taking_index
team_burstiness
discursive_diversity
information_diversity
3 changes: 2 additions & 1 deletion docs/build/html/_sources/index.rst.txt
@@ -81,13 +81,14 @@ Once you import the tool, you will be able to declare a FeatureBuilder object, w
# this line of code runs the FeatureBuilder on your data
my_feature_builder.featurize(col="message")
Use the Table of Contents below to learn more about our tool. We recommend that you begin in the "Introduction" section, then explore other sections of the documentation as they become relevant to you. More information on using our tool can be found in :ref:`examples`.
Use the Table of Contents below to learn more about our tool. We recommend that you begin in the "Introduction" section, then explore other sections of the documentation as they become relevant to you. We recommend reading :ref:`basics` for a high-level overview of the requirements and parameters, and then reading through :ref:`examples` for a detailed walkthrough and discussion of considerations.

.. toctree::
:maxdepth: 2
:caption: Contents:

intro
basics
feature_builder
features/index
features_conceptual/index
6 changes: 5 additions & 1 deletion docs/build/html/_sources/intro.rst.txt
@@ -20,7 +20,7 @@ Finally, even when researchers build and measure their own features, what happen

What if there existed a single package did it all for you? What if, instead of combing through the literature, deciding on constructs of interest, and putting together packages to build out features on your own, a vast (and ever-increasing!) collection of conversational attributes was readily available at your fingertips?

We introduce the **Team Communication Toolkit**: a "one-stop shop" for exploring conversational data. Our framework is a single package encompassing a variety of common, research-backed measures of communication. These include tools like `LIWC <https://www.liwc.app/>`_, `Convokit <https://convokit.cornell.edu/>`_, `The Conversational Receptiveness Package <https://www.mikeyeomans.info/papers/receptiveness.pdf>`_, `The Lexical Suite <https://www.lexicalsuite.com/>`_ and much more. If you are working with conversational data for the first time, or just seeking to understand what you can possibly learn from open-ended conversations, this is the right place for you. We have collected over 100 features that you can explore, so that researchers can spend more time learning from conversations and less time worrying about how to begin studying them.
We introduce the **Team Communication Toolkit**: a "one-stop shop" for exploring conversational data. Our framework is a single package encompassing a variety of common, research-backed measures of communication. These include tools like `LIWC <https://www.liwc.app/>`_, `ConvoKit <https://convokit.cornell.edu/>`_, `The Conversational Receptiveness Package <https://www.mikeyeomans.info/papers/receptiveness.pdf>`_, `The Lexical Suite <https://www.lexicalsuite.com/>`_ and much more. If you are working with conversational data for the first time, or just seeking to understand what you can possibly learn from open-ended conversations, this is the right place for you. We have collected over 100 features that you can explore, so that researchers can spend more time learning from conversations and less time worrying about how to begin studying them.

The FeatureBuilder
*******************
@@ -54,6 +54,10 @@ The three levels of analysis are closely interconnected. In the Toolkit, Utteran

The driving functions for generating features at different levels are located in the :ref:`Utilities <utils>`. In general, you do not have to directly interact with these utilties, as the Toolkit generates utterance-, speaker-, and conversational-level features by default. However, you (as a researcher) may only only be interested a subset of the outputs, and customizable options will be made avilable in the FeatureBuilder soon.

Getting Started
*****************
Please refer to the :ref:`index` to get started. From there, we recommend reading :ref:`basics` for a high-level overview of the requirements and parameters, and then reading through :ref:`examples` for a detailed walkthrough and discussion of considerations.

Feature Documentation
**********************
For technical information on the features generated by our Toolkit, please refer to the :ref:`Features: Technical Documentation <features_technical>` page.
7 changes: 4 additions & 3 deletions docs/build/html/_static/searchtools.js
@@ -178,7 +178,7 @@ const Search = {

htmlToText: (htmlString, anchor) => {
const htmlElement = new DOMParser().parseFromString(htmlString, 'text/html');
for (const removalQuery of [".headerlinks", "script", "style"]) {
for (const removalQuery of [".headerlink", "script", "style"]) {
htmlElement.querySelectorAll(removalQuery).forEach((el) => { el.remove() });
}
if (anchor) {
@@ -328,13 +328,14 @@ const Search = {
for (const [title, foundTitles] of Object.entries(allTitles)) {
if (title.toLowerCase().trim().includes(queryLower) && (queryLower.length >= title.length/2)) {
for (const [file, id] of foundTitles) {
let score = Math.round(100 * queryLower.length / title.length)
const score = Math.round(Scorer.title * queryLower.length / title.length);
const boost = titles[file] === title ? 1 : 0; // add a boost for document titles
normalResults.push([
docNames[file],
titles[file] !== title ? `${titles[file]} > ${title}` : title,
id !== null ? "#" + id : "",
null,
score,
score + boost,
filenames[file],
]);
}
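Reviewer note: the revised title scoring in this hunk can be checked with a small Python transcription. It is a sketch under stated assumptions: the function names are hypothetical, and `title_weight=15` stands in for `Scorer.title`, whose value is not shown in this diff.

```python
import math


def js_round(value):
    """Round half up, like JavaScript's Math.round for non-negative values."""
    return math.floor(value + 0.5)


def title_match_score(query, title, document_title, title_weight=15):
    """Score a title match the way the updated hunk does.

    title_weight is an assumed stand-in for Scorer.title.
    Returns None when the title does not qualify as a match.
    """
    query_lower = query.lower().strip()
    if query_lower not in title.lower().strip():
        return None
    if len(query_lower) < len(title) / 2:
        # The query must cover at least half of the title's length.
        return None
    score = js_round(title_weight * len(query_lower) / len(title))
    # Document titles get a one-point boost over section headings.
    boost = 1 if document_title == title else 0
    return score + boost


page_hit = title_match_score("basics", "The Basics", "The Basics")
section_hit = title_match_score("getting started", "Getting Started", "Examples")
too_short = title_match_score("the", "The Basics", "The Basics")
```

Under the assumed weight, a full match on a section heading scores 15, a partial match on a page title scores 9 plus the 1-point boost, and a query shorter than half the title is not scored at all. The old code used a fixed factor of 100 and no boost.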
8 changes: 4 additions & 4 deletions docs/build/html/examples.html
@@ -22,7 +22,7 @@
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Utilities" href="utils/index.html" />
<link rel="prev" title="Turn Taking Index" href="features_conceptual/turn_taking_index.html" />
<link rel="prev" title="Information Diversity" href="features_conceptual/information_diversity.html" />
</head>

<body class="wy-body-for-nav">
@@ -47,6 +47,7 @@
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="basics.html">The Basics</a></li>
<li class="toctree-l1"><a class="reference internal" href="feature_builder.html">feature_builder module</a></li>
<li class="toctree-l1"><a class="reference internal" href="features/index.html">Features: Technical Documentation</a></li>
<li class="toctree-l1"><a class="reference internal" href="features_conceptual/index.html">Features: Conceptual Documentation</a></li>
@@ -97,7 +98,6 @@

<section id="examples">
<span id="id1"></span><h1>Examples<a class="headerlink" href="#examples" title="Link to this heading"></a></h1>
<p><strong>Note:</strong> Our “Examples” page is constantly being improved. This page is a work in progress!</p>
<section id="getting-started">
<h2>Getting Started<a class="headerlink" href="#getting-started" title="Link to this heading"></a></h2>
<p>To use our tool, please ensure that you have Python &gt;= 3.10 installed and a working version of <a class="reference external" href="https://pypi.org/project/pip/">pip</a>, which is Python’s package installer. Then, in your local environment, run the following:</p>
@@ -184,7 +184,7 @@ <h4>Basic Input Columns<a class="headerlink" href="#basic-input-columns" title="
</div>
</div></blockquote>
</li>
<li><p><strong>In the FeatureBuilder, we assume that every conversation has a unique identifying string, and that all the messages belonging to the same conversation have the same identifier.</strong> Typically, we would use the column <strong>conversation_id_col</strong> to indicate the name of this identifier. However, we also support cases in which there is more than one identifer per conversation, and our example here illustrates this functionality. The <strong>grouping_keys</strong> parameter means that we want to group by more than one column, and allow the FeatureBuilder to treat unique combinations of the grouping keys as the “conversational identifier”. This means that we treat each unique combination of “batch_num” and “round_num” as a different conversation.</p>
<li><p><strong>In the FeatureBuilder, we assume that every conversation has a unique identifying string, and that all the messages belonging to the same conversation have the same identifier.</strong> Typically, we would use the column <strong>conversation_id_col</strong> to indicate the name of this identifier. However, we also support cases in which there is more than one identifer per conversation, and our example here illustrates this functionality. The <strong>grouping_keys</strong> parameter means that we want to group by more than one column, and allow the FeatureBuilder to treat unique combinations of the grouping keys as the “conversational identifier”. This means that we treat each unique combination of “batch_num” and “round_num” as a different conversation, and we <em>override</em> the <strong>conversation_id_col</strong> if a list of <strong>grouping_keys</strong> is present.</p>
<blockquote>
<div><ul class="simple">
<li><p>In cases where you are using <strong>conversation_id_col</strong>, “conversation_num” is the default value for this parameter.</p></li>
@@ -372,7 +372,7 @@ <h3>Additional FeatureBuilder Considerations<a class="headerlink" href="#additio
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="features_conceptual/turn_taking_index.html" class="btn btn-neutral float-left" title="Turn Taking Index" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="features_conceptual/information_diversity.html" class="btn btn-neutral float-left" title="Information Diversity" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="utils/index.html" class="btn btn-neutral float-right" title="Utilities" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>

5 changes: 3 additions & 2 deletions docs/build/html/feature_builder.html
@@ -22,7 +22,7 @@
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Features: Technical Documentation" href="features/index.html" />
<link rel="prev" title="Introduction" href="intro.html" />
<link rel="prev" title="The Basics" href="basics.html" />
</head>

<body class="wy-body-for-nav">
@@ -47,6 +47,7 @@
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="basics.html">The Basics</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">feature_builder module</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#feature_builder.FeatureBuilder"><code class="docutils literal notranslate"><span class="pre">FeatureBuilder</span></code></a><ul>
<li class="toctree-l3"><a class="reference internal" href="#feature_builder.FeatureBuilder.chat_level_features"><code class="docutils literal notranslate"><span class="pre">FeatureBuilder.chat_level_features()</span></code></a></li>
@@ -308,7 +309,7 @@
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="intro.html" class="btn btn-neutral float-left" title="Introduction" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="basics.html" class="btn btn-neutral float-left" title="The Basics" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="features/index.html" class="btn btn-neutral float-right" title="Features: Technical Documentation" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>

1 change: 1 addition & 0 deletions docs/build/html/features/basic_features.html
@@ -47,6 +47,7 @@
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../intro.html">Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="../basics.html">The Basics</a></li>
<li class="toctree-l1"><a class="reference internal" href="../feature_builder.html">feature_builder module</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="index.html">Features: Technical Documentation</a><ul class="current">
<li class="toctree-l2 current"><a class="reference internal" href="index.html#utterance-chat-level-features">Utterance- (Chat) Level Features</a><ul class="current">
1 change: 1 addition & 0 deletions docs/build/html/features/burstiness.html
@@ -47,6 +47,7 @@
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../intro.html">Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="../basics.html">The Basics</a></li>
<li class="toctree-l1"><a class="reference internal" href="../feature_builder.html">feature_builder module</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="index.html">Features: Technical Documentation</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="index.html#utterance-chat-level-features">Utterance- (Chat) Level Features</a></li>
1 change: 1 addition & 0 deletions docs/build/html/features/certainty.html
@@ -47,6 +47,7 @@
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../intro.html">Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="../basics.html">The Basics</a></li>
<li class="toctree-l1"><a class="reference internal" href="../feature_builder.html">feature_builder module</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="index.html">Features: Technical Documentation</a><ul class="current">
<li class="toctree-l2 current"><a class="reference internal" href="index.html#utterance-chat-level-features">Utterance- (Chat) Level Features</a><ul class="current">
