GitHub - stevenraphael/language-complexity

When ANN models are tested on various executive function tasks, language and non-language tasks activate clearly separate areas. While this may imply a similar separation of language and non-language tasks in human brains, one possible confound is task complexity: the language tasks are more complex than the non-language tasks tested on the models. The purpose of this project is therefore to find ways of reducing the complexity of the language data used to train and test models, to see whether this separation is maintained. This required two things: first, finding sets of more implicit English language data, and second, determining features for measuring text difficulty. This repository contains the code produced for the project.

complexity_measures.py: All functions for measuring language complexity
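
To make the idea of text-difficulty features concrete, here is a minimal sketch of a few common surface measures. The function names and the specific features are assumptions chosen for illustration; they are not taken from complexity_measures.py and may differ from what the project actually computes.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naively split on ., ! and ? (an assumption; real text needs more care)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def mean_sentence_length(text):
    """Average number of word tokens per sentence."""
    sentences = split_sentences(text)
    return len(tokenize(text)) / len(sentences) if sentences else 0.0


def mean_word_length(text):
    """Average number of characters per word token."""
    words = tokenize(text)
    return sum(len(w) for w in words) / len(words) if words else 0.0


def type_token_ratio(text):
    """Distinct words divided by total words; lower values mean more repetition."""
    words = tokenize(text)
    return len(set(words)) / len(words) if words else 0.0


def common_word_fraction(text, common_words):
    """Share of tokens found in a set of frequent words (hypothetical word list)."""
    words = tokenize(text)
    return sum(w in common_words for w in words) / len(words) if words else 0.0
```

Under these assumptions, a simpler text would tend to show shorter sentences, shorter words, and a higher share of tokens drawn from a list of frequent words.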

scraping.py: Used for scraping sites; designed with Wikipedia and Simple Wikipedia in mind
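
As an illustration of the scraping step, the sketch below pulls paragraph text out of an HTML page using only the Python standard library. The class and function names are hypothetical, and scraping.py may well use a different library or approach.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the visible text of each <p> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        # Text inside nested inline tags such as <b> still arrives here.
        if self._in_p:
            self._chunks.append(data)


def extract_paragraphs(html):
    """Return the non-empty paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

The HTML string itself could be fetched with `urllib.request.urlopen(url).read().decode("utf-8")`. Running the same extraction on a Wikipedia article and its Simple Wikipedia counterpart would give paired texts at two difficulty levels.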

generate_fake.py: Sample of probabilistic context-free grammar (PCFG) generation
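
For readers unfamiliar with PCFG generation, the following is a minimal sketch of sampling sentences from a small probabilistic context-free grammar. The grammar, its probabilities, and the function names are invented for illustration and are not those used in generate_fake.py.

```python
import random

# Toy grammar (an assumption): each nonterminal maps to a list of
# (right-hand side, probability) pairs whose probabilities sum to 1.
GRAMMAR = {
    "S": [(("NP", "VP"), 1.0)],
    "NP": [(("Det", "N"), 0.7), (("Det", "Adj", "N"), 0.3)],
    "VP": [(("V", "NP"), 0.6), (("V",), 0.4)],
    "Det": [(("the",), 0.6), (("a",), 0.4)],
    "Adj": [(("small",), 0.5), (("red",), 0.5)],
    "N": [(("dog",), 0.5), (("ball",), 0.5)],
    "V": [(("sees",), 0.5), (("chases",), 0.5)],
}


def expand(symbol, rng):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit the word itself
    rules = GRAMMAR[symbol]
    options = [rhs for rhs, _ in rules]
    weights = [prob for _, prob in rules]
    # Pick one production according to its probability.
    rhs = rng.choices(options, weights=weights, k=1)[0]
    words = []
    for child in rhs:
        words.extend(expand(child, rng))
    return words


def generate_sentence(seed=None):
    """Sample one sentence starting from S; a seed makes the result repeatable."""
    rng = random.Random(seed)
    return " ".join(expand("S", rng))
```

Because the vocabulary and rule set are fully controlled, a grammar like this gives direct control over sentence length and structure, which is one way to produce language data of reduced complexity.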

1-1000.txt
README.md
bert.py
complexity_measures.py
generate_fake.py
gpt2.py
scraping.py
tree.py
wiki_pipeline.py
