src/main/resources/application.properties


    
application.properties local
logging.file=logs/reciter.log

###Spring configuration ###
##This is to make sure bean oveririding is true since relase of 2.1.0 bean overriding is by default false. 
##For more details see - https://github.com/spring-projects/spring-boot/wiki/Spring-Boot-2.1-Release-Notes
spring.main.allow-bean-definition-overriding=true

##Spring Security configuration##
spring.security.enabled=true


#### Server configuration ####

## Server port. You can override this by passing your port using environment variable SERVER_PORT.
server.port=5000


#### Database configuration ####
## Set the region for your dynamodb tables. They can be us-east-1(US East (N. Virginia)), us-east-2(US East (Ohio)) etc.
aws.dynamodb.settings.region=us-east-1
aws.dynamodb.settings.table.create=true
aws.dynamodb.settings.table.readcapacityunits=5
aws.dynamodb.settings.table.writecapacityunits=5
## Note this is the billing mode for dynamodb. It accepts a enum of type BillingMode which has two values either PROVISIONED or PAY_PER_REQUEST
## Use PAY_PER_REQUEST for unpredicatable workloads. This provisions the resources for any amount data you want to insert or read.
## Use PROVISIONED for predictable workloads where you are sure about the data input. Also if you want to control the cost this is better.
aws.dynamodb.settings.table.billingmode=PAY_PER_REQUEST

## Method of identity data import ##
## You can import identity data to ReCiter in one of two ways:
## Option 1: Load data from the JSON file located here: https://github.com/wcmc-its/ReCiter/blob/master/src/main/resources/files/Identity.json To use this
## method, set the below value to "true." With this option, scholars' identities will only be refreshed upon application startup.
## Option 2: Use the Identity API. For this, you will need to use the /reciter/identity/ API for a single identity or the /reciter/save/identities/ API for multiple
## identities. To use this method, set the below value to "false."

aws.dynamodb.settings.file.import=false

## Local or AWS-hosted DynamoDB. Set this flag to true if you want to test ReCiter with DynamoDB local. 
## For more about local hosting, refer to https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBLocal.html
## If you are using an AWS hosted version, those parameters are controlled in the environment 
## variables of your IDE.
aws.dynamoDb.local=false 
aws.dynamoDb.local.port=8000
aws.dynamoDb.local.region=us-west-2
aws.dynamoDb.local.accesskey=dummy
aws.dynamoDb.local.secretkey=dummy
aws.dynamoDb.local.dbpath=/Users/szd2013/Documents/Reciter/dynamodb_local_latest/2019-06-25
 

#### AWS Simple Queue Service (SQS) ####
## This faeture is not yet ruled out with reciter. If you turn this on  uncomment code in /ReCiter/src/main/java/reciter/queue/sqs/AmazonSQSExtendedClientConfig.java
## Usage: toggle to use SQS in front of all DynamoDB operations for efficient request handling.
## Note: Turn this off if you are using dynamodb local
##aws.sqs.use=false
##aws.sqs.region=us-east-1

## Since SQS has a size limit of 256kb. Use if you need additional storage for large messages. 
## BucketName needs to have a *globally* unique name. If it isn't, you'll get an error during the build.
## Note: Turn this off if you are using dynamodb local
##aws.sqs.extendedClient=false
##aws.sqs.s3.bucketName=reciter-queue


#### AWS S3 file storage ####

## Enable S3. Usage: turn on if you expect the item size for DynamoDB to exceed 400kb. This is common.
## BucketName needs to have a *globally* unique name
## Consider naming it like "myInstitution-reciter-dynamodb"
## Note: Turn this off if you are using dynamodb local
aws.s3.use=true
aws.s3.region=us-east-1
aws.s3.dynamodb.bucketName=reciter-dynamodb
## This option might trigger a failed build if set as false since bucket name have to be globally unique. We recommend turning this option true. 
## So reciter will dynamically generate the bucket name following the convention of <aws.s3.dynamodb.bucketName>-<aws.s3.region>-<awsaccountNumber>
aws.s3.use.dynamic.bucketName=false
## This option allows you to cache identityAll endpoint result to store in S3 bucket in path <${aws.s3.dynamodb.bucketName}/idenity>.
## This helps in performance of the endpoint and latency thereby reducing cost by avoiding expensive scans dynamodb Identity table.
## In order to use caching options aws.s3.use flags and its corresponding flags should be set with proper values
aws.s3.use.cached.identityAll=true
## This option helps in setting number of days the data will be cached in S3 before replacing it. It takes number of days in integer.
aws.s3.use.cached.identityAll.cacheTime=1


#### Scopus configuration (optional) ####

## Usage: use Scopus for disambiguated organizations and more complete names.
## For more: see https://github.com/wcmc-its/ReCiter/wiki/Configuring-Scopus-(optional)
use.scopus.articles=true
 

#### Strategies ####

## By setting any of these strategies to false, you negate them.
## This allows you to see how individual strategies contribute to overall performance.
strategy.email=true
strategy.department=true
strategy.journalcategory=true
strategy.known.relationship=true
strategy.affiliation=true
strategy.scopus.common.affiliation=true
strategy.coauthor=true
strategy.journal=true
strategy.education=false
strategy.grant=true
strategy.citation=true
strategy.cocitation=true
strategy.article.size=true
strategy.bachelors.year.discrepancy=true
strategy.doctoral.year.discrepancy=true
strategy.cluster.size=true
strategy.mesh.major=true
strategy.persontype=true
strategy.averageclustering=true
strategy.gender=true


#### Retrieval ####

## Usage: controls the maximum number of results returned under "lenient" and "strict" retrieval. 
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#Retrieving-candidate-records-from-PubMed
searchStrategy-leninent-threshold=2000
searchStrategy-strict-threshold=1000
 

#### Clustering ####

## Goal: control how aggressive clustering is. 
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#Create-clusters

## This value speaks to the proportion of evidence that must be shared between two articles 
## the system is considering for clustering.
cluster.similarity.threshold.score=0.28

## This value excludes from consideration for this strategy, certain candidate articles that have more 
## than this many indexed grants.
clusteringGrants-threshold=12


#### Scoring ####

### Name evidence ###

## Each candidate targetAuthor is scored individually for similarity to the person of interest.
## For more, see https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#Name-evidence
nameMatchFirstType.full-exact=1.852
nameMatchFirstType.inferredInitials-exact=0.441
nameMatchFirstType.full-fuzzy=-0.75
nameMatchFirstType.noMatch=-1.941
nameMatchFirstType.full-conflictingAllButInitials=-2.646
nameMatchFirstType.full-conflictingEntirely=-3.087
nameMatchFirstType.nullTargetAuthor-MatchNotAttempted=-1.323
nameMatchLastType.full-exact=0.664
nameMatchLastType.full-fuzzy=0.332
nameMatchLastType.full-conflictingEntirely=-0.996
nameMatchLastType.nullTargetAuthor-MatchNotAttempted=-0.996
nameMatchMiddleType.full-exact=1.588
nameMatchMiddleType.exact-singleInitial=1.191
nameMatchMiddleType.inferredInitials-exact=0.794
nameMatchMiddleType.noMatch=-0.794
nameMatchMiddleType.full-fuzzy=0
nameMatchMiddleType.full-conflictingEntirely=-1.588
nameMatchMiddleType.nullTargetAuthor-MatchNotAttempted=-1.588
nameMatchMiddleType.identityNull-MatchNotAttempted=0.794
nameMatchModifier.combinedFirstNameLastName=0.451
nameMatchModifier.incorrectOrder=-0.804
nameMatchModifier.articleSubstringOfIdentity-lastName=-0.804
nameMatchModifier.articleSubstringOfIdentity-firstMiddleName=-0.804
nameMatchModifier.identitySubstringOfArticle-lastName=-1.608
nameMatchModifier.identitySubstringOfArticle-firstName=-1.608
nameMatchModifier.identitySubstringOfArticle-middleName=-0.804
nameMatchModifier.identitySubstringOfArticle-firstMiddleName=0.804
nameMatchModifier.combinedMiddleNameLastName=0

## Include all name suffixes seperated by comma. All the name suffixes will be checked for with following expression ,<space><suffix> and <space><suffix>
nameScoringStrategy-excludedSuffixes=Jr,Jr.,MD PhD,MD-PhD,PhD,MD,III,III.,II,II.,Sr,Sr.

## These author names are so common that they were messing up clustering. 
## This list comes from Torvik et al. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2805000/figure/F8/ and Vishnyakova et al. https://www.ncbi.nlm.nih.gov/pubmed/30958542
## Format is Last Name <space> First Initial
namesIgnoredCoauthors=Zhang Y, Lee J, Wang J, Kim J, Wang X, Chen Y, Li J, Zhang J, Kim H, Chen J, Wang L, Liu J, Wang Z, Zhang H, Wang S, Wang C, Lee H, Li H, Chen H, Zhang Z, Yang Y, Li L, Li Z, Park S, Yang J, Lee Y, Li S, Lee C, Chen L, Smith J, Kim S, Lee S
## As an alternative, here is a revised, more aggressive approach. These are the top 500 most common names in PubMed and represent ~7.5% of PubMed authors.
## namesIgnoredCoauthors=Wang Y, Zhang Y, Li Y, Wang J, Wang X, Liu Y, Li J, Lee J, Zhang X, Chen Y, Zhang J, Li X, Kim J, Lee S, Wang H, Kim S, Wang L, Liu J, Chen J, Zhang L, Liu X, Wang Z, Zhang H, Kim H, Li H, Wang S, Chen X, Wang C, Kim Y, Zhang Z, Yang Y, Li Z, Chen C, Li L, Chen H, Li S, Liu H, Yang J, Lee H, Chen S, Chen L, Wang W, Li C, Zhang W, Liu Z, Park J, Zhao Y, Wu Y, Zhang S, Huang Y, Li W, Wu J, Liu C, Park S, Lee K, Lee Y, Liu S, Chen Z, Liu L, Yang X, Wang Q, Zhang C, Lee C, Xu Y, Chen W, Li M, Zhou Y, Kim D, Kim K, Zhang Q, Xu J, Wang M, Li Q, Wang D, Sun Y, Yang H, Kim M, Liu W, Wu X, Xu X, Zhang M, Huang J, Lee M, Zhao J, Yang L, Yang S, Lin Y, Zhou J, Lin C, Chen M, Li D, Zhou X, Zhu Y, Wang T, Wang G, Wu C, Li G, Yang C, Wang F, Zhao X, Huang C, Yu J, Lu Y, Wu H, Yang Z, Lee D, Wang B, Hu Y, Huang X, Xu H, Lin J, Zhang D, Li B, Sun J, Smith J, Liu Q, Liu M, Sun X, Wang P, Gao Y, Huang H, Xu L, Jiang Y, Zhu J, Choi J, Guo Y, Zhang B, Wu S, Zhang G, Park H, Zhu X, Yu Y, Chen G, Ma Y, Liu B, Wang R, Yu H, Li F, Wang K, Liu D, Huang S, Ma J, Song Y, Chang C, Xu Z, Li T, Kim C, Lin S, Singh S, Liu F, He Y, Zhang F, Liu G, Zhao L, Hu J, Lee W, Zhao H, Lin H, Wu Z, Li P, Zheng Y, Chen D, Kumar A, Chen Q, Han J, Zhou H, Lu J, Kim B, Kumar S, Zhang R, Zhang T, Choi S, Yang W, Smith M, Huang L, Chen T, Kim T, Hu X, Liu T, Yu X, Smith R, He J, Shi Y, Wu L, Park Y, Smith D, Zhou L, Smith A, Huang W, Song J, Sun L, Chang Y, Ma X, Li R, Sun H, Wu W, Guo X, Chen K, Zhou Z, Yang M, Chen P, Park K, Lee B, Zhang P, Zhu H, Chen B, Jiang H, Xu W, Guo J, Wu M, Cheng Y, Lee T, Han S, Zhu L, Zhao Z, Lu X, Gupta S, Jiang X, Miller J, Tang Y, Shen Y, Kim E, Huang Z, Lee E, Sharma S, He X, Choi Y, Smith S, Jiang J, Smith C, Brown J, Chen F, Gao X, Chang S, Yu L, Yu S, Cao Y, Singh A, Luo Y, Ma L, Han Y, Suzuki K, Park C, Hong S, Liu P, Kang S, Gao J, Johnson J, Li K, Khan M, Williams J, Zhang K, Li N, Liu K, Yu C, Xu C, Chang J, Xu S, Martin J, Cho S, Johnson M, Chen R, Jones R, Sharma A, Yang D, Wu D, Feng Y, Liu R, Yang G, Shi J, Cheng J, Tang J, Kim W, Jiang L, Xie Y, Lu H, Chang H, Zheng X, Zheng J, Yang Q, Choi H, Lin L, Zhao S, Sun Z, Wei Y, Kang J, Lu C, Zhu Z, Williams R, Gupta A, Zhao W, Yang F, Kumar R, Liang Y, Luo J, Singh R, Zhou Q, Zhou W, Khan A, Wilson J, Yu Z, Lee A, Shen J, Guo L, Miller M, Smith G, Hu Z, Wang N, Lin X, Watanabe K, Anderson J, Wu T, Nguyen T, Xu M, Song H, Ding Y, Guo H, Sun S, Tang X, Shi X, Sun W, Jones D, Miller D, Zhao C, Yuan Y, Williams D, Miller R, Brown R, Lu Z, Brown M, Xu Q, Ma H, Kang H, Guo Z, Johnson R, Taylor J, Johnson D, Han X, Davis J, Cheng C, Jones J, Zhao Q, Yang T, Huang M, Zhao M, Zhu W, Thompson J, Khan S, Cho Y, Lin W, Lin M, Gupta R, Fu Y, Yan J, Lin Z, Pan Y, Hu S, Park M, Oh S, Thomas J, Lim S, Lu L, Luo X, Yan Y, Das S, Xu D, Peng Y, Jones M, Xu G, Brown D, Cohen M, Shin J, Smith P, Cui Y, Cheng H, Xu B, Feng J, Lu S, Hong J, Smith K, Anderson R, Zheng H, Ma C, Patel S, Wang A, Jiang S, Hu C, Smith T, Yu M, Cho J, Zhang N, Hu W, Huang T, Shah S, Cao J, Wei J, Du Y, Yang K, Shi L, Sharma R, Yang P, Yao Y, Zhu S, Lee P, Brown S, Zhou M, Johnson A, Cohen J, Smith L, Williams M, Lim J, Zhu C, Lee G, Kato T, Moore J, Brown C, Song S, Patel A, Hwang S, Ghosh S, Smith B, Yuan J, Jung S, Johnson C, Brown A, Martin M, Fan J, Kumar P, Shin S, Kim I, Chan K, Chung J, Wilson M, Jones A, Shen H, Jones C, Tang H, Martin R, Cheng L, Hwang J, Sun M, Smith E, Lee R, Li A, Kim G, Singh M, Cho H, Williams A, Fang Y, Cheng S, Johnson S, Jones S, Yu D, Chan C, Kumar V, Williams S, Young J, Miller A, Bai Y, Kang Y, Thomas D, Tang L, White J, Wright J, Wilson D, Cohen S, Chen A, Tang S, Zheng W, Thomas M, Kim N, Shen X, Martin C, Tan Y, Jiang C, Tang C, Oh J, Tang W, Johnson K, Martin A, Williams C, Wei L, Scott J, Yoon S, Lu M, Yao J, Jung H, Ye J, Shah A, Wilson R, Taylor A, Huang G, Singh P, Wong C, Zheng S, Roberts J, Garcia J, Lewis J, Thomas S, Miller S

### Organizational unit evidence ###

## Goal: reward cases where an author's known affiliation appears in the affiliation statement.
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#Organizational-unit-evidence
strategy.orgUnitScoringStrategy.organizationalUnitDepartmentMatchingScore=0.6

## Some org units (e.g., Medicine) are so common that we give them a lower score.
strategy.orgUnitScoringStrategy.organizationalUnitModifier=Medicine
strategy.orgUnitScoringStrategy.organizationalUnitModifierScore=0

## At least at Weill Cornell, org units of type "program" are relatively rare, so they get a higher score.
## We define program in identity.organizationalUnits.organizationalUnitType
## If an org unit is not a program, we score it as we would a department.
strategy.orgUnitScoringStrategy.organizationalUnitProgramMatchingScore=1.083

## Org units go by a lot of different names. This is important to get right.
## Note the proper use of the "," and "|" delimiters.
strategy.orgUnitScoringStrategy.organizationalUnitSynonym=strategy.orgUnitScoringStrategy.organizationalUnitSynonym=Anaesthesiology|Anesthesiology|Anesthesia,Anesthesiology And Critical Care|Anesthesiology and Critical Care Medicine,Anesthesiology and Intensive Care Medicine|Anesthesiology and Intensive Care|Anesthesiology And Critical Care,Biochemistry|Bioquímica,Biologia|Biology|Biological Science|Biological Sciences,Brain and Mind Research Institute|Neuroscience|Feil Family Brain and Mind Research Institute,Cancer|Meyer Cancer Center|Meyers Cancer Center|Sandra and Edward Meyer Cancer Center|Sandra & Edward Meyer Cancer Center,Cardiac Surgery|Cardiothoracic Surgery,Chemical and Biomolecular Engineering|Chemical and Biological Engineering,Chemistry And Biochemistry|Chemistry and Chemical Biology,Chemistry|Chimica|Quimica,Cirugía|Surgery,Clinical Research|Clinical Science,Critical Care Medicine|Critical Care,Ecology and Evolutionary Biology|Ecology and Evolution,Endocrine|Endocrinology,Environmental Health|Environmental Health Science,Environmental Science|Environmental Sciences,Food Science|Food Science and Technology,Health Informatics|Quality and Medical Informatics,Hematology/Oncology|Hematology and Oncology|Hematology and Medical Oncology,Histopathology|Histology,Infectious Diseases|Infectious Disease,Institute for Computational Biomedicine|HRH Prince Alwaleed Bin Talal Bin,Library|Library Science|Samuel J. Wood|Library and Information Science|Information Science|Wood Library,Life Science|Life Sciences,Materials Science|Materials Science and Engineering,Meyer Cancer Center|Meyers Cancer Center|Sandra and Edward Meyer Cancer Center|Sandra & Edward Meyer Cancer Center,Microbiología|Microbiology,Microbiology and Immunology|Microbiology|Immunology,Nephrology and Hypertension|Nephrology,Neurological Science|Neurology|Neurological Sciences,Neurological Surgery|Neurosurgery,Nutrition|Nutritional Science|Food Science and Human Nutrition|Food and Nutrition,Obstetrics and Gynecology|Obstetrics,Ophthalmology and Visual Sciences|Ophthalmology and Visual Science|Ophthalmology|Ophthalmology and Visual Science,Oral and Maxillofacial Surgery|Oral Surgery,Orthopaedic Surgery|Orthopedics|Orthopedic Surgery,Orthopaedics|Orthopedics|Orthopaedics and Trauma,Otolaryngology|Head and Neck Surgery|Ear Nose & Throat|Otolaryngology-Head and Neck Surgery|Otorhinolaryngology|Ear Nose and Throat|ENT,Paediatrics|Pediatrics,Pathology and Laboratory Medicine|Pathology,Pediatrics|Phyllis and David Komansky|Paediatrics,Pharmaceutics|Pharmaceutical Science|Pharmaceutical Science,Pharmacology and Therapeutics|Pharmacology,Pharmacy Practice|Pharmacy,Physics|Física,Physiological Science|Physiology|Physiological Sciences|Physiology and Biophysics|Physiology|Fisiología,Plant|Plant Science,Plastic and Reconstructive Surgery|Plastic Surgery,Population Health Sciences|Epidemiology and Population Health|Population Health|Public Health|Public Health Science|Public Health Sciences|Public Health Programs|Healthcare Policy and Research|Health Policy|Healthcare Policy & Research|Public Health|Health Care Policy & Research|Healthcare Policy and Research|Community Public Health|Epidemiology and Biostatistics|Epidemiology & Biostatistics|Epidemiology|Health Policy,Psychiatry and Behavioral Sciences|Psychiatry and Behavioral Science,Psychological Science|Psychological Sciences|Psychology,Pulmonary and Critical Care Medicine|Pulmonary and Critical Care,Pulmonary Medicine and Critical Care|Pulmonary & Critical Care Medicine|Pulmonary and Critical Care Medicine,Radiology and Radiological Science|Radiology,Radiology|Imaging|Diagnostic Imaging,Rehabilitation Medicine|Physical Medicine|Rehabilitation,Rehabilitation|Rehabilitation Medicine|Rehabilitation|Rehabilitation Medicine,Reproductive Medicine|Ronald O. Perelman and Claudia Cohen Center for Reproductive Medicine|Center for Reproductive Medicine,Surgical Research|Surgical Science|Surgical Sciences|Cirugía|Surgery,Thoracic and CardioVascular Surgery|Cardiothoracic Surgery,Veterans Affairs Medical Center|Veterans Affairs,Veterinary Clinical Science|Veterinary Clinical Sciences|Veterinary Science
 

### Email evidence ###

## Goal: see whether mail subdomains co-occur with a given target person's UID. For example: "paa2013" + "@nyp.org"
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#Email-evidence
strategy.email.default.suffixes=@med.cornell.edu,@mail.med.cornell.edu,@weill.cornell.edu,@nyp.org
strategy.email.emailMatchScore=8
strategy.email.emailNoMatchScore=-1
 

### Journal category evidence ###

## Goal: compare target individuals' known org units against any possible journal category matches in 
## the ScienceMetrixDepartmentCategory table. Updating this table can improve the accuracy 
## of scores for researchers working in certain smaller specialties.

## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#journal-category-evidence
strategy.journalCategoryScore.journalSubfieldFactorScore=0.470
strategy.journalCategoryScore.journalSubfieldScore=-0.470
 

### Affiliation evidence ###

## Goal: account for the extent to which affiliation of all authors affects the likelihood 
## a given targetAuthor authored an article

## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#affiliation-evidence

strategy.authorAffiliationScoringStrategy.targetAuthor-institutionalAffiliation-matchType-positiveMatch-individual-score=1.8
strategy.authorAffiliationScoringStrategy.targetAuthor-institutionalAffiliation-matchType-positiveMatch-institution-score=1.0
strategy.authorAffiliationScoringStrategy.targetAuthor-institutionalAffiliation-matchType-null-score=0
strategy.authorAffiliationScoringStrategy.targetAuthor-institutionalAffiliation-matchType-noMatch-score=-0.8

## Consistently remove certain stopwords from both identity and article metadata.
strategy.authorAffiliationScoringStrategy.institutionStopwords=of, the, for, and, to

## For target author, we search the PubMed affiliation statement for groups of terms suggesting 
## this is your home institution.
## Note the proper use of the "," and "|" delimiters.
strategy.authorAffiliationScoringStrategy.homeInstitution-keywords=weil|cornell, weill|cornell, weill|medicine, cornell|medicine, cornell|medical, weill|medical, weill|bugando, weill|graduate, cornell|presbyterian, weill|presbyterian, 10065|cornell, 10065|presbyterian, 10021|cornell, 10021|presbyterian, weill|qatar, cornell|qatar, @med.cornell.edu, @qatar-med.cornell.edu, tri-institutional|md-phd, memorial|sloan|kettering, rockefeller|university, hospital|special|surgery

## Used only for output purposes.
strategy.authorAffiliationScoringStrategy.homeInstitution-label=Weill Cornell Medicine / NewYork-Presbyterian Hospital

## If Scopus is configured, attempt to match Scopus institution IDs to any of the authors.
## Used for both targetAuthor and nonTargetAuthors.
strategy.authorAffiliationScoringStrategy.homeInstitution-scopusInstitutionIDs=60007997,105529809,108567977,60019868,60000247,60072750,60109878,60026978,60025849,105533257,105529809,100315705,120474126,100370280,115325844,60011045,107923633,112568533,117155719,60009343,60026827,60022875,118363394,109378892,60104858,118197089,126361147,120163969,112627712,126185471,114308448,106543182,126269335,105455582,101996631,100361301,122917640,108960262,60009656,126208061,109784927,123171596,126598090,106087453,109854037,123278324,125813951,101019046,126569819,127416979,117953990,126209228,112743895,112651838,107138783,124947167,106167999,118611789,126140916,123187312,124678793,112805461,126267592,126291601,127328905,102020098,117898168,112684684,116739393,112625535,121645847,126789206,128316808,123643846,123953277,125305204,114850318,123854989,126155466,126155728,126384844,123888058,126490784,127790532,120167334,127026077,125303244,108477936,126232598,128252749,127074522,112890978,60005676,115296805,115806170,119862316,128024060,119003489,116516911,128218578,113546601,113384516,101778222,105184720,60138910,125316739,126961696,120409098,100520080,122214686,115630826,108092594,105771528,118724684,109488938,106541996,101633292,116150182,122604316,123730906,106537581,113913820,113207685,107163477,60012732,126269126,126272114,127419075,127567119,127863145,128462600,122033412,108364787,124041467,105800915,106582936,113734633

strategy.authorAffiliationScoringStrategy.collaboratingInstitutions-scopusInstitutionIDs=60007776,60010570,60026827,60025849,60012732,60018043,60008981,60022875,60019970,60025879,60009343,60009656,60072743,60072746,60104769,60012981,60000764,60004670,60014933,60022377,60005705,60003158,60027954,60003711,60103484,60029961,60031841,60005208,60002388,60024099,60030304,60029652,60026273,60024541,60023247,60007555,60017027,60002896,60011605,60027565,60032105,100654821,60013574,60021784,60030162

## Attempt to match these groups of keywords to the affiliation of the targetAuthor in PubMed.
strategy.authorAffiliationScoringStrategy.collaboratingInstitutions-keywords=new|york|presbyterian, rockefeller|university, HSS, hospital|special|surgery, North|Shore|hospital, Long|Island|Jewish, memorial|sloan, sloan|kettering, hamad, mount|sinai, methodist|houston, National|Institute|Mental|Health, beth israel, University|Pennsylvania|Medicine, Merck|Research, New|York|Medical|College, Medicine|Dentistry|New|Jersey, Montefiore, Lenox|Hill, Cold|Spring|Harbor, St|Luke|Roosevelt, New|York|University|Medicine, Langone, SUNY|Downstate, Albert|Einstein|Medicine, Yeshiva, UMDNJ, Icahn|Medicine, Mount|Sinai, columbia|medical, columbia|physicians
 
## The following non-target author scores are implemented only for Scopus affiliations rather than PubMed affiliations.
strategy.authorAffiliationScoringStrategy.nonTargetAuthor-institutionalAffiliation-matchType-positiveMatch-individual-score=0.598
strategy.authorAffiliationScoringStrategy.nonTargetAuthor-institutionalAffiliation-matchType-positiveMatch-institution-score=0.299
strategy.authorAffiliationScoringStrategy.nonTargetAuthor-institutionalAffiliation-matchType-null-score=0
strategy.authorAffiliationScoringStrategy.nonTargetAuthor-institutionalAffiliation-matchType-noMatch-score=-0.449
strategy.authorAffiliationScoringStrategy.nonTargetAuthor-institutionalAffiliation-weight=0.15
strategy.authorAffiliationScoringStrategy.nonTargetAuthor-institutionalAffiliation-maxScore=0.897


### Relationship evidence ###

## Goal: use institutionally tracked relationships (e.g., two people are on the same grant) 
## to increase article scores where such individuals' names appear as co-authors on an article.

## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#relationship-evidence

strategy.knownrelationships.relationshipMatchingScore=0.52

## Provide extra credit if first name is exact match.
strategy.knownrelationships.relationshipVerboseMatchModifier=1.68

## Provide extra credit if relationship is of type mentor. Individuals with mentors tend to
## author fewer articles.
strategy.knownrelationships.relationshipMatchModifier-Mentor=0.522

## Provide extra credit if mentor is listed as a senior author, which is quite common.
strategy.knownrelationships.relationshipMatchModifier-Mentor-SeniorAuthor=0

## Provide extra credit if relationship is of type manager.
strategy.knownrelationships.relationshipMatchModifier-Manager=1.01

## Provide extra credit if relationship is of type manager and is listed a senior author.
strategy.knownrelationships.relationshipMatchModifier-Manager-SeniorAuthor=0

## Minimum score for relationship strategy when number of authors is huge and non match is high.
strategy.knownrelationships.relationshipMinimumTotalScore=-1.592
## Score for each non matching relationship for non target author
strategy.knownrelationships.relationshipNonMatchScore=-0.048
 

### Grant evidence ###

## Goal: see whether NIH grant IDs from identity.grants match those indexed in articles.
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#Grant-evidence
strategy.grant.grantMatchScore=2.253

### Gender evidence ###
## For more, see: https://github.com/wcmc-its/ReCiter/issues/357

strategy.genderStrategyScore.minimumScore=-1.36
strategy.genderStrategyScore.rangeScore=1.6

### Education year evidence ###

## Goal: down-weight candidate articles where candidate articles are published relatively early
## in a scholar's academic career as benchmarked against known years of degree.

## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#education-year-evidence

strategy.discrepancyDegreeYear.degreeYearDiscrepancyScore=-99|-12.6,-98|-12.5,-97|-12.4,-96|-12.3,-95|-12.2,-94|-12.1,-93|-12,-92|-11.9,-91|-11.8,-90|-11.7,-89|-11.6,-88|-11.5,-87|-11.4,-86|-11.3,-85|-11.2,-84|-11.1,-83|-11,-82|-10.9,-81|-10.8,-80|-10.7,-79|-10.6,-78|-10.5,-77|-10.4,-76|-10.3,-75|-10.2,-74|-10.1,-73|-10,-72|-9.9,-71|-9.8,-70|-9.7,-69|-9.6,-68|-9.5,-67|-9.4,-66|-9.3,-65|-9.2,-64|-9.1,-63|-9,-62|-8.9,-61|-8.8,-60|-8.7,-59|-8.6,-58|-8.5,-57|-8.4,-56|-8.3,-55|-8.2,-54|-8.1,-53|-8,-52|-7.9,-51|-7.8,-50|-7.7,-49|-7.6,-48|-7.5,-47|-7.4,-46|-7.3,-45|-7.2,-44|-7.1,-43|-7,-42|-6.9,-41|-6.8,-40|-6.7,-39|-6.6,-38|-6.5,-37|-6.4,-36|-6.3,-35|-6.2,-34|-6.1,-33|-6,-32|-5.9,-31|-5.8,-30|-5.7,-29|-5.6,-28|-5.5,-27|-5.4,-26|-5.3,-25|-5.2,-24|-5.1,-23|-5,-22|-4.9,-21|-4.8,-20|-4.7,-19|-4.6,-18|-4.5,-17|-4.4,-16|-4.3,-15|-4.2,-14|-4.1,-13|-4,-12|-3.78,-11|-3.56,-10|-3.34,-9|-3.12,-8|-2.9,-7|-2.68,-6|-2.46,-5|-2.24,-4|-2.02,-3|-1.8,-2|-1.58,-1|-1.36,0|-1.16,1|-0.96,2|-0.76,3|-0.56,4|-0.36,5|-0.175,6|0.01,7|0.195,8|0.38,9|0.55,10|0.72,11|0.89,12|0.9,13|0.9,14|0.9,15|0.9,16|0.9,17|0.9,18|0.9,19|0.9,20|0.9,21|0.9,22|0.9,23|0.9,24|0.9,25|0.9,26|0.9,27|0.9,28|0.9,29|0.9,30|0.9,31|0.9,32|0.9,33|0.9,34|0.9,35|0.9,36|0.9,37|0.9,38|0.9,39|0.9,40|0.9,41|0.9,42|0.9,43|0.9,44|0.9,45|0.9,46|0.9,47|0.9,48|0.89,49|0.88,50|0.86,51|0.85,52|0.83,53|0.81,54|0.8,55|0.77,56|0.75,57|0.73,58|0.7,59|0.68,60|0.65,61|0.62,62|0.59,63|0.56,64|0.53,65|0.49,66|0.45,67|0.42,68|0.38,69|0.34,70|0.3,71|0.25,72|0.21,73|0.16,74|0.12,75|0.07,76|0.02,77|-0.04,78|-0.09,79|-0.14,80|-0.2,81|-0.26,82|-0.31,83|-0.37,84|-0.43,85|-0.5,86|-0.56,87|-0.63,88|-0.69,89|-0.76,90|-0.83,91|-0.9,92|-0.98,93|-1.05,94|-1.12,95|-1.2,96|-1.28,97|-1.36,98|-1.44,99|-1.52,100|-1.61


## If a scholar does not have a doctoral year degree, we look for year of bachelor degree. 
## If a scholar has a bachelors degree of 2000, and this value is -7, this is functionally equivalent to a scholar having a 
## doctoral degree of 2007.
strategy.discrepancyDegreeYear.bacherlorYearWeight=-7


### Person type evidence ###

## Goal: downweight or upweight individual articles depending on person types in identity.personTypes.
## Upweight individual articles when they come from a small corpus of candidate articles.
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#person-type-evidence
## Update: the recent change to "education year evidence" (see above) obviates the need for person type evidence. 
## In the interests of simplicity, these values are being set to 0. 

## Upweight this person type where the person type is "academic-faculty-weillfulltime."
strategy.personTypeScoringStrategy.personTypeScore-academic-faculty-weillfulltime=0

## Downweight this person type.
strategy.personTypeScoringStrategy.personTypeScore-student-md-new-york=0
 

### Article count evidence ###

## Goal: reward each candidate article in which there are few articles retrieved and penalizes cases
## in which a lot of articles are retrieved. This is consistent with Bayesian insights about probability.

## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#article-count-evidence

## If this number of candidate articles, score for this strategy would be 0.
strategy.articleCountScoringStrategy.articleCountThresholdScore=800

## Increasing this value decreases the weight of this strategy. Decreasing this value, increases the weight.
strategy.articleCountScoringStrategy.articleCountWeight=583.9
 

### Average clustering evidence ###

## Goal: take advantage of the fact that articles in the same cluster are more likely to be written by the same person.
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#average-clustering-evidence

## These accepted and rejected scores are only used for clustering and are not included in the 
## individual scores for an article. This is so that users can judge score independently of 
## any feedback.
strategy.acceptedRejectedScoringStrategy.feedbackScore-accepted=1.5
strategy.acceptedRejectedScoringStrategy.feedbackScore-rejected=-10
strategy.acceptedRejectedScoringStrategy.feedbackScore-null=0
 
## This represents the maximum effect a cluster average can have on the scores of other articles in that cluster.
strategy.averageClusteringScoringStrategy.clusterScore-Factor=0.66

## Increasing the score increases the penalty for having inconsistent first names for the targetAuthor in a 
## given cluster.
strategy.averageClusteringScoringStrategy.clusterReliabilityScoreFactor=3
 

#### Scoring ####

## ReCiter outputs both a raw or non-standardized and standardized scores.
## For more, see: https://github.com/wcmc-its/ReCiter/wiki/How-ReCiter-works#map-article-to-a-standardized-score

## To insulate users from changing raw scores and its non-intuitive range ("How good is 4.02?"), we map the raw score
## to a 1-10 scale. Articles with scores between the 1st and 2nd term in standardizedScoreMapping have a score of 1. 
## Articles with scores between the 2nd and 3rd term in standardizedScoreMapping have a score of 2. Etc.
standardizedScoreMapping=-999,1.78,2.01,2.25,2.54,2.85,3.24,3.55,4.3,5.4


## If an admin does not select any global score, we default to this score.
totalArticleScore-standardized-default=7
 
## Use this property to set a lower limit for storing ReCiter's output in the "Analysis" DynamoDB table. 
## (Make sure to set aws.s3.use=true, see above, if you wish to store larger objects in s3.)
reciter.minimumStorageThreshold=3

### Keywords ###
## This sets the maximum number of keywords to return in the Feature Generator API.
reciter.feature.generator.keywordCountMax=10

### API Parameters Configuration ###
## Maximum count for unique identifiers allowed for feature-generator Group API ##
# Takes Integer the higher the value higher the response time of the api #
reciter.feature.generator.group.uids.maxCount=100