Showing 22 changed files with 992 additions and 97 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,38 @@ | ||
# MPEDS Coder - An Event Coding System | ||
# MPEDS Annotation Interface | ||
|
||
This is the annotation interface used in creating datasets for the Machine-learning Protest Event Data System (MPEDS). While applied to the specific task of coding for protest events, this can also be used for the development of other event datasets. | ||
|
||
The MPEDS project uses this interface to generate training data for event data coding. As structured in this project, coders must first discern whether an article contains a protest event (the haystack task) and then highlight the text in which variables of interest are present. Although many of the variables (e.g. claims) are not explicit in the text, we must rely on the text itself to produce variables of interest. After this 'first pass' of coding, articles that are candidates for event coding are passed to a 'second pass', in which coders disentangle multiple events in a single article, categorize forms, claims, and targets into discrete categories, and ensure that specific locations, dates, social movement organizations, and crowd sizes are coded. | ||
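
The hand-off between the two passes described above can be pictured with a small sketch. This is not the project's actual code: the `FirstPassResult` class, its fields, and the rule that a single first-pass flag is enough to send an article to the second pass are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FirstPassResult:
    """One coder's first-pass judgment on one article (hypothetical structure)."""
    article_id: str
    coder: str
    has_protest: bool  # outcome of the "haystack" task
    # Variable name -> list of highlighted text snippets (assumed layout).
    highlights: dict = field(default_factory=dict)


def second_pass_candidates(results):
    """Return ids of articles that at least one first-pass coder flagged.

    Assumption: one positive judgment is enough to queue an article for
    the second pass. The order of first appearance is preserved.
    """
    candidates = []
    seen = set()
    for result in results:
        if result.has_protest and result.article_id not in seen:
            seen.add(result.article_id)
            candidates.append(result.article_id)
    return candidates
```

A stricter policy, such as requiring agreement between both first-pass coders, would only change the condition inside `second_pass_candidates`.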
|
||
This system is built in Python using the [Flask](http://flask.pocoo.org/) microframework. It can source articles parsed from Lexis-Nexis (using the `split-ln.py` script), [Apache Solr](http://lucene.apache.org/solr/), or XML files formatted in [News Industry Text Format](http://www.nitf.org/), such as the [LDC's New York Times Annotated Corpus](https://catalog.ldc.upenn.edu/LDC2008T19). | ||
|
||
It also uses [Bootstrap](http://getbootstrap.com/) for CSS and [jQuery](https://jquery.com/) for JavaScript. It only works in Firefox (for now). | ||
|
||
## Setup | ||
|
||
To populate the database with example information, first run the setup script. | ||
|
||
python setup.py | ||
|
||
This will add five users: an admin (admin), two first-pass coders (coder1p\_1, coder1p\_2), and two second-pass coders (coder2p\_1, coder2p\_2). They will all have the password `default`. It will add a variable hierarchy for second-pass coding. It will also enter metadata for all the articles in the `example-articles` directory, and queue them up for the first-pass coders. | ||
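
As a rough illustration of the state the setup step leaves behind, the sketch below lists the five seeded users and builds a first-pass queue. The usernames and password come from the description above. The role labels, the `build_first_pass_queue` helper, and the policy of queueing every article for every first-pass coder are assumptions, not a description of what `setup.py` actually does.

```python
from itertools import product

# Usernames as listed above; the role labels are illustrative only.
SEED_USERS = [
    ("admin", "admin"),
    ("coder1p_1", "first_pass"),
    ("coder1p_2", "first_pass"),
    ("coder2p_1", "second_pass"),
    ("coder2p_2", "second_pass"),
]
DEFAULT_PASSWORD = "default"


def build_first_pass_queue(article_ids, users):
    """Pair each article with each first-pass coder (assumed queueing policy)."""
    coders = [name for name, role in users if role == "first_pass"]
    return list(product(article_ids, coders))
```

With two example articles this produces four queue entries, one per article and first-pass coder pair.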
|
||
Then run the Flask test server with the following. | ||
|
||
python mpeds_coder.py | ||
|
||
## Development plan | ||
|
||
This is a product in early alpha stages. Features we hope to have working soon: | ||
|
||
* Robust admin dashboard | ||
* Template system for variables | ||
* Ability to specify multiple article sources | ||
* Generalizing an n-pass structure and control flow | ||
* Integration with multiple databases | ||
* User management | ||
* Cross-browser compatibility | ||
* Visual article queuing | ||
|
||
## Acknowledgments | ||
|
||
This research has been supported by a National Science Foundation Graduate Research Fellowship and National Science Foundation grant SES-1423784. Thanks to Emanuel Ubert and Katie Fallon for working with this system since its inception, and to many undergraduate annotators who have put a lot of time into working with and refining this system. |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
TITLE: TRIAL OPENS FOR 11 WHO AIDED CENTRAL AMERICANS | ||
DATE: 1985-11-16 | ||
PUBLICATION: The New York Times | ||
LANGUAGE: ENGLISH | ||
SECTION: Section 1; Page 10, Column 3; National Desk | ||
DATELINE: TUCSON, Ariz., Nov. 15 | ||
EDITION: | ||
LENGTH: 844 words | ||
DATE: 1985-11-16 | ||
SEARCH_ID: 10 | ||
The Government today detailed its evidence against 11 people, including two Roman Catholic priests, a nun and a Protestant minister, who are on trial in connection with providing sanctuary to Central Americans in this country. | ||
|
||
Repeatedly referring to ''this alien smuggling conspiracy,'' the chief prosecutor, Assistant United States Attorney Donald M. Reno, said in his opening argument that the ''underground railroad'' was a ''three-tiered criminal enterprise.'' | ||
|
||
A minister, a nun and two church lay workers, he said, were ''the C.E.O.'s, the chief executive officers,'' or the ''generals'' of the group that brought undocumented Central Americans into this country and announced publicly that it was doing so. | ||
|
||
The 'Nogales Connection' | ||
|
||
Underneath this top tier, Mr. Reno said, were two other tiers, of people who ran the smuggling or transporting operation and those who operated ''staging'' areas and ''safe houses,'' including two Roman Catholic churches in the sister cities of Nogales, Ariz., and Nogales, Sonora, in Mexico. | ||
|
||
Mr. Reno, over objections of defense lawyers, referred to this as the ''Nogales connection.'' | ||
|
||
The religious leaders and church workers, whose activities are sanctioned and supported by the national governing bodies of several denominations, including the United Methodist Church and the Presbyterian Church (U.S.A.), maintain that they brought refugees into the United States from war-torn Central America as a matter of religious conscience and in protest of what they say is the Government's own failure to honor its laws for granting asylum to political refugees. The prosecution contends the trial is nothing more than ''an alien smuggling case.'' | ||
|
||
Mr. Reno identified the ''generals'' of the enterprise as the Rev. John M. Fife, 45 years old, pastor of the Tucson Southside Presbyterian Church; James A. Corbett, 52, a retired rancher who founded the ''underground railroad'' in 1981 in sympathy with Central Americans denied asylum in this country; Philip Willis-Conger, 27, director of the Tucson Ecumenical Council's Refugee Task Force, and Sister Darlene Nicgorski, 41, of the order of School Sisters of St. Francis, who taught in Guatemala until 1980, when her pastor there was killed and she returned to this country to work on behalf of Central Americans. | ||
|
||
Fife 'the Principal Person' | ||
|
||
Mr. Reno said Mr. Fife was ''the principal person in this tier of the conspiracy.'' Mr. Fife's church was among the first to declare itself a sanctuary for Central Americans, along with six others in the Bay Area of California, on March 24, 1982, and the church has become the focal point of the movement. | ||
|
||
Organizers of the sanctuary movement say there are now 270 churches allied with the movement in 33 states. | ||
|
||
In his opening statement, Mr. Reno acknowledged that Sister Darlene was a Roman Catholic nun, but pointedly said that he would refer to her as ''Miss Nicgorsky'' and not as ''Sister,'' although he said he meant her ''no disrespect.'' | ||
|
||
He referred to the two priests, the Rev. Ramon Dagoberto Quinones, 49, of Nogales, Sonora, and the Rev. Anthony Clark, 37, of Nogales, Ariz., as ''Father,'' and to Mr. Fife alternately as ''Mr. Fife'' and ''Reverend Fife.'' | ||
|
||
Defense lawyers, who will not get to their own opening arguments until next week, moved for a mistrial on the grounds of Mr. Reno's refusal to use the religious honorific, and also for what they characterized as inflammatory designations such as ''generals'' and ''Nogales connection.'' | ||
|
||
The presiding Federal district judge, Earl H. Carroll, denied the motions. He has also denied motions for dismissals, based on various grounds, over the past four weeks of jury selection and legal skirmishing that had delayed opening arguments until today. | ||
|
||
Although the opening arguments of the defense will not be presented until next week, the lawyers have conceded that their planned defense had been effectively ''gutted'' by prosecution pretrial motions. | ||
|
||
In months of legal maneuvering leading up to the trial, Judge Carroll granted a succession of Government motions to sharply limit defense arguments and evidence. | ||
|
||
The defendants will not, for example, be allowed to present any defense relating to religious motivation, none relating to conditions in Central America, none relating to United States foreign policy and none relating to international law. | ||
|
||
'Defense in Silhouette' | ||
|
||
None of the defendants have ever denied that they were actively engaged in clandestinely bringing undocumented Central Americans into this country and hiding them from the Government in homes and churches. | ||
|
||
Thus shorn of the intended basis of their defense, that they acted out of religious motivation and in accord with what they believe the law on immigration to actually be, the defendants are now limited to what Mr. Corbett, the defendant, described as ''a defense in silhouette.'' | ||
|
||
That is the hope that the jury will somehow be able to perceive suggestions of the motives of the defendants even though they are unable to present specific arguments. |
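
The example article above opens with a block of `KEY: value` header lines (`TITLE`, `DATE`, `PUBLICATION`, and so on) followed by the body text. The sketch below shows one way such a file could be split into metadata and body. It is not taken from the repository: `parse_article` and `HEADER_RE` are hypothetical names, and the code assumes that headers use only uppercase letters and underscores and that the header block ends at the first line that does not match that pattern.

```python
import re

# A header line looks like "TITLE: some text" or "EDITION:" (empty value).
HEADER_RE = re.compile(r"^([A-Z_]+):\s?(.*)$")


def parse_article(text):
    """Split article text into (metadata dict, body string).

    Header lines are read from the top until the first non-matching line.
    A key that appears twice (DATE does in the example) keeps its last value.
    """
    metadata = {}
    lines = text.splitlines()
    body_start = len(lines)  # used when every line is a header
    for index, line in enumerate(lines):
        match = HEADER_RE.match(line)
        if match is None:
            body_start = index
            break
        metadata[match.group(1)] = match.group(2).strip()
    body = "\n".join(lines[body_start:]).strip()
    return metadata, body
```

One limitation of this sketch: a body whose first line itself begins with an all-caps word and a colon would be misread as a header.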