Proposals/publications/README.md (43 additions & 3 deletions)

@@ -4,12 +4,20 @@ This project is intended to help add DOIs to existing publications, support meta
## Updating Existing Neotoma Records
Most of these scripts run directly against the Neotoma Database, so they read their connection settings from a `.env` file. The `.env` file should contain the database connection string, as a JSON object, assigned to the `DBAUTH` variable. For example:
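
The exact keys expected inside `DBAUTH` are not shown here, so the names below (`host`, `port`, `dbname`, `user`, `password`) are assumptions rather than the project's confirmed schema. As a minimal sketch, a `.env` line holding a JSON object could be parsed like this:

```python
import json

# Hypothetical .env content: the key names and values are placeholders,
# not the project's confirmed settings.
ENV_TEXT = 'DBAUTH={"host": "localhost", "port": 5432, "dbname": "neotomatank", "user": "USERNAME", "password": "PASSWORD"}'


def parse_dbauth(env_text: str) -> dict:
    """Return the JSON object assigned to DBAUTH in .env-style text."""
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        if name.strip() == "DBAUTH":
            # Tolerate an optional pair of single quotes around the JSON.
            return json.loads(value.strip().strip("'"))
    raise KeyError("DBAUTH is not defined")


auth = parse_dbauth(ENV_TEXT)
print(auth["dbname"])
```
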
Without this connection string the scripts cannot connect to the database. For development work, please connect to the `neotomatank` database.

### Finding new DOIs

For existing publications that do not include DOIs, we can scan the Neotoma Publications database from the command line:

```bash
uv run src/find_potential_dois.py --limit 30000 --output ./data/offset100.csv
```

This will return a CSV file (written to the path given by `--output`) with the Neotoma `publicationid`, the current `citation`, the `doi` stored in Neotoma (generally empty), a `newdoi` column obtained from a CrossRef search, and the matching `bibtex` citation.

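
To review the output before uploading anything, it can help to list only the rows where Neotoma has no DOI but CrossRef suggested one. The sketch below assumes the column names listed above and uses a made-up two-row sample (the DOI and citations are placeholders, not real records):

```python
import csv
import io

# Hypothetical two-row sample using the column names described above.
SAMPLE = io.StringIO(
    "publicationid,citation,doi,newdoi,bibtex\n"
    '1,"Author, A. (1990). An example title.",,10.1000/example,"@article{a1990}"\n'
    '2,"Author, B. (1991). Another title.",,,\n'
)


def rows_with_candidates(handle):
    """Yield rows with no stored DOI but a CrossRef suggestion."""
    for row in csv.DictReader(handle):
        if not row["doi"].strip() and row["newdoi"].strip():
            yield row


for row in rows_with_candidates(SAMPLE):
    print(row["publicationid"], row["newdoi"])
```
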
@@ -32,9 +40,41 @@ The `--commit` flag allows us to test the run, to ensure that we don't accidenta
Otherwise the upload will end with the statement:

```text
The --commit flag was set to False, rolling back operation.
```
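
The dry-run behaviour amounts to running the update inside a transaction and committing only on request. A minimal sketch of that pattern follows, using the standard-library `sqlite3` module as a stand-in for the Neotoma database connection. The table layout, the `update_doi` helper, and the placeholder DOI are all hypothetical:

```python
import sqlite3


def update_doi(conn, publicationid, doi, commit=False):
    """Apply a DOI update, rolling back unless commit is True."""
    conn.execute(
        "UPDATE publications SET doi = ? WHERE publicationid = ?",
        (doi, publicationid),
    )
    if commit:
        conn.commit()
    else:
        print("The --commit flag was set to False, rolling back operation.")
        conn.rollback()


# In-memory stand-in database with a single publication and no DOI.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publications (publicationid INTEGER, doi TEXT)")
conn.execute("INSERT INTO publications VALUES (1, NULL)")
conn.commit()

# Dry run: the update is applied and then rolled back.
update_doi(conn, 1, "10.1000/example", commit=False)
print(conn.execute("SELECT doi FROM publications").fetchone())
```
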

## Inserting New Publications from DOIs

To facilitate bulk uploads of records we can use a file of raw DOI strings and let CrossRef resolve the citations for us. The functions we use are in the [`publications`](./src/publications/) folder: [`return_bibtex.py`](./src/publications/return_bibtex.py), which takes a DOI and returns a formatted BibTeX citation, and [`add_citation.py`](./src/publications/add_citation.py), which takes the BibTeX citation and formats it in APA style.

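
How `return_bibtex.py` queries CrossRef is not shown here. One common way to resolve a DOI to BibTeX is DOI content negotiation: requesting `https://doi.org/<DOI>` with an `Accept: application/x-bibtex` header. A minimal sketch under that assumption follows; the function names are hypothetical and this is not necessarily what the project does:

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def bibtex_request(doi: str) -> Request:
    """Build a content-negotiation request asking doi.org for BibTeX."""
    url = "https://doi.org/" + quote(doi.strip(), safe="/")
    return Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi: str, timeout: float = 30.0) -> str:
    """Resolve a DOI to a BibTeX string (requires network access)."""
    with urlopen(bibtex_request(doi), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network.
request = bibtex_request("10.1017/S0033822200001089")
print(request.full_url)
```
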
Given a text file with one DOI per line, possibly with blank lines (to support copying a column directly from a spreadsheet), the script processes each unique entry.

```csv
10.1017/S0033822200001089

10.1017/S0033822200020452
10.1017/S0033822200001089
10.1017/S0033822200001089
10.1017/S0033822200001089
10.1080/0734578X.2017.1377510
```

Pass the file to the script with the `--input` flag:

```bash
uv run insert_new_from_doi.py --input FILEPATH.csv
```

By default the script parses each DOI and performs a dry run of the insertion. To insert the records into the database, include the `--commit` flag. If `--commit` is set, you will see the output:

```text
Committing the following citation:
Emslie, S. D., & Mead, J. I. (2020 , August). The age and vertebrate paleontology of labor-of-love cave, white pine county, nevada. Western North American Naturalist, 80(3). URL: http://dx.doi.org/10.3398/064.080.0301, doi:10.3398/064.080.0301
```

This message is printed for each record submitted. Note that capitalization is often inconsistent: most publishers on CrossRef do not properly capitalize their records, so any automated system will do a poor job of returning properly capitalized citations.

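
The input handling described in this section, where blank lines are skipped and each unique DOI is processed once, could be sketched as follows. The `unique_dois` helper is hypothetical, and it assumes duplicates are detected by exact string comparison after trimming whitespace:

```python
def unique_dois(lines):
    """Return DOIs in first-seen order, skipping blanks and duplicates."""
    seen = set()
    result = []
    for line in lines:
        doi = line.strip()
        if doi and doi not in seen:
            seen.add(doi)
            result.append(doi)
    return result


# Sample input with a blank line and a repeated DOI.
sample = """10.1017/S0033822200001089

10.1017/S0033822200020452
10.1017/S0033822200001089
10.1080/0734578X.2017.1377510
"""
print(unique_dois(sample.splitlines()))
```
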
## Conclusion
The scripts together support the management of publication records within Neotoma.