Commit
fix bug in which politeness features were in the wrong order
xehu committed Apr 24, 2024
2 parents ff3ef8e + c0510aa commit 5be95d3
Showing 3 changed files with 38 additions and 3 deletions.
2 changes: 1 addition & 1 deletion feature_engine/featurize.py
@@ -28,7 +28,7 @@
# )
# feature_builder.featurize(col="message")

-# # Tiny multi-task
+# Tiny multi-task
# tiny_multi_task_feature_builder = FeatureBuilder(
# input_file_path = "../feature_engine/tpm-data/cleaned_data/test_data/multi_task_TINY.csv",
# vector_directory = "../feature_engine/tpm-data/vector_data/",
35 changes: 33 additions & 2 deletions feature_engine/testing/data/cleaned_data/test_chat_level.csv
@@ -87,6 +87,7 @@ I respond to that too",num_block_quote_responses,2
1,A,hello,Hello,1
1,B,So how should we answer this,Token_count,6
1,A,We can start here. What is the question?,YesNo_Questions,0
+<<<<<<< HEAD
1,B,I am not sure. Where is the rest of our team?,WH_Questions,1
1,B,"Please help me figure this out, I really want to do well on this please",Please,2
2,C,Hey,Hello,1
@@ -100,6 +101,19 @@ I respond to that too",num_block_quote_responses,2
3,G,hey,indirect_greeting,1
3,G,I think we should try something else,1st_person_start,1
3,F,Ok whatever. You should leave the team then,2nd_person_start,1
+=======
+1,B,I am not sure. Where is the rest of our team?,First_Person_Single,1
+1,B,"Well please help me figure this out, I really want to do well on this please okay",factuality,1
+2,C,Okay bro lets split it 50/50,Impersonal_Pronoun,0
+2,D,Maybe but how about 60/40? I doubt its fair otherwise,hedges,1
+2,C,Seems possible,hashedge,1
+2,E,I see what youre thinking but I disagree,Acknowledgement,1
+2,E,We get only one chance so we should understand how to split it,Acknowledgement,1
+2,D,"I just don't agree, I'm making the 60/40 split",Adverb_Limiter,1
+3,F,greetings everybody. should we use the first option?,Hello,1
+3,G,I think we should try something else,1st_person_start,1
+3,F,Ok whatever. You should leave the team then,2nd_person_start,0
+>>>>>>> c0510aabb22fa300950d6f3e3b9762ac2e7e56ae
4,H,Honestly thank you so so much,factuality,1
4,H,What's the plan?,direct_question,1
4,I,That is the dumbest idea I've heard; youre actually dumb af,hasnegative,1
@@ -110,20 +124,30 @@ I respond to that too",num_block_quote_responses,2
5,K,Sorry sorry I didn't mean to,apologizing,1
6,L,I don't really want to work with you all but let's get this over with,Impersonal_Pronoun,1
6,J,Fine by me,Affirmation,0
+<<<<<<< HEAD
6,K,Ok so which part should we do first? the first or second?,YesNo_Questions,1
7,L,Please don't do that?,please_start,1
7,L,I don't think that will work,hashedge,1
7,M,I'm exhuasted rn,hasnegative,0
7,M,i don't really care please just finish this up,haspositive,0
7,N,Please don't do that?,Please,1
+=======
+6,K,Ok so which part should we do first? the first or second?,YesNo_Questions,0
+7,L,please don't do that?,please_start,1
+7,L,I don't think that will work,hashedge,1
+7,M,I'm exhuasted rn,hasnegative,0
+7,M,i don't really care please just finish this up,haspositive,0
+7,N,Please don't do that?,please_start,1
+>>>>>>> c0510aabb22fa300950d6f3e3b9762ac2e7e56ae
7,N,I don't think that will work,Hedges,0
7,O,I'm exhuasted rn,Negative_Emotion,0
7,O,i don't really care please just finish this up,Positive_Emotion,1
-8,P,i appreciate all this from you,Gratitude,1
+8,P,i appreciate all this from you,gratitude,1
8,P,"well we should start rn, our part is long",1st_person_pl,1
8,Q,ok forgive me for this error but,apologizing,1
8,Q,you have to redo the whole thing,2nd_person,0
8,R,ok so who will work with me? where should we begin?,direct_question,0
+<<<<<<< HEAD
8,S,i appreciate all this from you,Gratitude,1
8,S,"well we should start rn, our part is long",First_Person_Plural,1
8,T,ok forgive us for this error but,Apology,0
@@ -142,4 +166,11 @@ I respond to that too",num_block_quote_responses,2
19,A,"I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.",Disagreement,1
20,A,"I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.",Acknowledgement,1
21,A,"I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.",First_Person_Plural,1
-22,A,"I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.",For_Me,0
+22,A,"I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.",For_Me,0
+=======
+8,S,i appreciate all this from you,Gratitude,0
+8,S,"well we should start rn, our part is long",Token_count,9
+8,T,ok forgive us for this error but,Apology,0
+8,T,I think you have to redo the whole thing,First_Person_Single,1
+8,U,ok so who will work with me? where should we begin?,WH_Questions,0
+>>>>>>> c0510aabb22fa300950d6f3e3b9762ac2e7e56ae
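The lines added to `test_chat_level.csv` in this commit include `<<<<<<< HEAD`, `=======`, and `>>>>>>> c0510aabb22fa300950d6f3e3b9762ac2e7e56ae`, which look like unresolved merge-conflict markers left inside the test data. As a minimal sketch of how such markers might be flagged before a CSV is loaded, the following uses only the standard library. The function name `find_conflict_markers` is hypothetical and is not part of this repository.

```python
from typing import Iterable, List, Tuple


def find_conflict_markers(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, text) for lines that look like Git conflict markers.

    Hypothetical helper for illustration; it only checks the three marker shapes
    that appear in the diff above.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # Opening and closing markers carry a label; the separator is exactly seven '='.
        if text.startswith(("<<<<<<< ", ">>>>>>> ")) or text == "=======":
            hits.append((number, text))
    return hits


# A few rows copied from the hunk above, used as sample input.
sample = [
    "1,A,We can start here. What is the question?,YesNo_Questions,0\n",
    "<<<<<<< HEAD\n",
    "1,B,I am not sure. Where is the rest of our team?,WH_Questions,1\n",
    "=======\n",
    "1,B,I am not sure. Where is the rest of our team?,First_Person_Single,1\n",
    ">>>>>>> c0510aabb22fa300950d6f3e3b9762ac2e7e56ae\n",
]

for number, text in find_conflict_markers(sample):
    print(f"line {number}: {text}")
```

The same function could be pointed at an open file handle, since iterating a text file yields its lines.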
4 changes: 4 additions & 0 deletions feature_engine/utils/preprocess.py
@@ -25,6 +25,10 @@ def assert_key_columns_present(df):
     # Assert that key columns are present
     if {'conversation_num', 'message', 'speaker_nickname'}.issubset(df.columns):
         print("Confirmed that data has `conversation_num`, `message`, and `speaker_nickname` columns!")
+        # ensure no NA's in essential columns
+        df['message'] = df['message'].fillna('')
+        df['conversation_num'] = df['conversation_num'].fillna(0)
+        df['speaker_nickname'] = df['speaker_nickname'].fillna(0)
     else:
         print("One of `conversation_num`, `message`, or `speaker_nickname` is missing! Raising error...")
         print("Columns available: ")
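The four added lines in `preprocess.py` fill missing values in the three key columns once their presence is confirmed. A minimal standalone sketch of that behavior on a toy DataFrame follows. The helper name `fill_key_columns`, the `KEY_COLUMNS` constant, and the sample data are assumptions made for illustration; unlike the code in the diff, this sketch works on a copy and leaves its input unchanged.

```python
import pandas as pd

# Column names taken from the diff above.
KEY_COLUMNS = ("conversation_num", "message", "speaker_nickname")


def fill_key_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with NAs in the key columns filled as in the diff."""
    missing = [col for col in KEY_COLUMNS if col not in df.columns]
    if missing:
        raise KeyError(f"Missing key columns: {missing}")
    out = df.copy()
    out["message"] = out["message"].fillna("")
    out["conversation_num"] = out["conversation_num"].fillna(0)
    out["speaker_nickname"] = out["speaker_nickname"].fillna(0)
    return out


# Toy input: the second row is missing all three key values.
example = pd.DataFrame({
    "conversation_num": [1, None],
    "message": ["hello", None],
    # Explicit object dtype so the integer fill value 0 is accepted alongside strings.
    "speaker_nickname": pd.Series(["A", None], dtype=object),
})

filled = fill_key_columns(example)
print(filled)
```

Filling `speaker_nickname` with the integer `0` mixes integers and strings in that column, so code that later treats nicknames as strings may need to account for it.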
