Commit

Test and compile
benben-miao committed Apr 15, 2022
0 parents commit 13d4abe
Showing 19 changed files with 2,501 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
ncbi-parser.exe
50 changes: 50 additions & 0 deletions MK919144.1.gbk
@@ -0,0 +1,50 @@
LOCUS MK919144 567 bp DNA linear VRT 10-JUL-2020
DEFINITION Acanthopagrus pacificus voucher SPAR040418-1 16S ribosomal RNA
gene, partial sequence; mitochondrial.
ACCESSION MK919144
VERSION MK919144.1
KEYWORDS .
SOURCE mitochondrion Acanthopagrus pacificus (Pacific sunbream)
ORGANISM Acanthopagrus pacificus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Actinopterygii; Neopterygii; Teleostei; Neoteleostei;
Acanthomorphata; Eupercaria; Spariformes; Sparidae; Acanthopagrus.
REFERENCE 1 (bases 1 to 567)
AUTHORS Hasan,M.E., Durand,J.-D. and Iwatsuki,Y.
TITLE Acanthopagrus datnia (Hamilton, 1822), a senior synonym
of#Acanthopagrus longispinnis (Valenciennes, 1830) (Perciformes:
Sparidae)
JOURNAL Zootaxa 4750 (2), 151-181 (2020)
REFERENCE 2 (bases 1 to 567)
AUTHORS Hassan,M.E., Durand,J.-D. and Iwatsuki,Y.
TITLE Direct Submission
JOURNAL Submitted (10-MAY-2019) Ocean, IRD, MARBEC cc093 Bat 24 Place E.
Bataillon, Montpellier, Herault 34095, France
FEATURES Location/Qualifiers
source 1..567
/organism="Acanthopagrus pacificus"
/organelle="mitochondrion"
/mol_type="genomic DNA"
/specimen_voucher="SPAR040418-1"
/db_xref="BOLD:SPARM032-19.16S"
/db_xref="taxon:1129777"
/country="Viet Nam: Ba Ria-Vung Tau, V##ng T##u (fish
market in Ho Chi Minh, D2)"
/collection_date="04-Apr-2018"
/collected_by="Jean-Dominique Durand"
/identified_by="Jean-Dominique Durand"
rRNA <1..>567
/product="16S ribosomal RNA"
ORIGIN
1 aagaggtccc gcctgccctg tgactatatg ttcaacggcc gcggtatttt gaccgtgcga
61 aggtagcgta atcacttgtc ttttaaatga agacctgtat gaatggcacc acgagggctt
121 aactgtctcc ctctcccagt caatgaaatt gattcccccg tgcagaagcg gggataaaag
181 cataagacga gaagacccta tggagcttaa gacgccagga cagctcatgt taaacactcc
241 aaaataaagg aaataaactg attgaaaccc tgtcctagtg tctttggttg gggcgaccac
301 ggggaaaaat ctaaccccca tgtggaatag gaatactatt ttcccatacc caagagttcc
361 cgctctaata aacagaactt ctgaccaaaa tggatccggc aatgccgatc aacggaccga
421 gttaccctag ggataacagc gcaatcctct taaagagtcc ctatcgacaa gggggtttac
481 gacctcgatg ttggatcagg acatcctaat ggtgcagccg ctattaaggg ttcgtttgtt
541 caacgattaa agtcctacgt gatctga
//

1 change: 1 addition & 0 deletions MK919144.1.seq
@@ -0,0 +1 @@
AAGAGGTCCCGCCTGCCCTGTGACTATATGTTCAACGGCCGCGGTATTTTGACCGTGCGAAGGTAGCGTAATCACTTGTCTTTTAAATGAAGACCTGTATGAATGGCACCACGAGGGCTTAACTGTCTCCCTCTCCCAGTCAATGAAATTGATTCCCCCGTGCAGAAGCGGGGATAAAAGCATAAGACGAGAAGACCCTATGGAGCTTAAGACGCCAGGACAGCTCATGTTAAACACTCCAAAATAAAGGAAATAAACTGATTGAAACCCTGTCCTAGTGTCTTTGGTTGGGGCGACCACGGGGAAAAATCTAACCCCCATGTGGAATAGGAATACTATTTTCCCATACCCAAGAGTTCCCGCTCTAATAAACAGAACTTCTGACCAAAATGGATCCGGCAATGCCGATCAACGGACCGAGTTACCCTAGGGATAACAGCGCAATCCTCTTAAAGAGTCCCTATCGACAAGGGGGTTTACGACCTCGATGTTGGATCAGGACATCCTAATGGTGCAGCCGCTATTAAGGGTTCGTTTGTTCAACGATTAAAGTCCTACGTGATCTGA
115 changes: 115 additions & 0 deletions README.md
@@ -0,0 +1,115 @@
### NCBI-Parser
Search any term across the NCBI databases to retrieve matching entries, download the GenBank-format files and sequences for each accession number into the genbank directory, and export the species Latin name, accession number, sequence length, division, collection country, collection date, collector, identifier, and reference information of the search records as a tabular file.

### Usage
```bash
ncbi-parser.exe --help
Usage: ncbi-parser.exe [OPTIONS]

Description: Search any term from all NCBI databases to obtain matched
entries, and download all NCBI GenBank format files and sequences
corresponding to accessory number in genbank directory, finally export all
Species Latin names, Accessory Number, Sequence Length, Division, Collection
Country, Collection Date, Collected Worker, Identified Worker, and Reference
information, etc. of the search records as tabular files.

1. Get options and parameters help:

ncbi-parser --help

2. Example (simple information). Users only need to input the search term,
and the default search is in NCBI nucleotide database:

ncbi-parser --term "Acanthopagrus 16S" --output results.xls

3. Example (complete information). Specify the NCBI database type, input the
search term, specify the number of records to download and extract, and
suggest setting a larger parameter max_record:

ncbi-parser --db_type nucleotide --term "Acanthopagrus 16S" --max_record 500
--res_type gb --output results.xls

Options:
--db_type TEXT Please input NCBI database type, default: "nucleotide",
including: [pubmed, protein, nuccore, ipg, nucleotide,
structure, genome, annotinfo, assembly, bioproject,
biosample, blastdbinfo, books, cdd, clinvar, gap,
gapplus, grasp, dbvar, gene, gds, geoprofiles,
homologene, medgen, mesh, ncbisearch, nlmcatalog, omim,
orgtrack, pmc, popset, proteinclusters, pcassay, protfam,
pccompound, pcsubstance, seqannot, snp, sra, taxonomy,
biocollections, gtr] [required]
--term TEXT Please input search term content, default: "Acanthopagrus
16S" [required]
--max_record TEXT Please input max record number, default: "100", up to:
10000 [required]
--res_type TEXT Please input result type, default: "gb", including:
["gb", "fasta", "gbwithparts", "gbcoll"]
--output TEXT Please input output full name (path + name + extension).
default="results.xls"
--help Show this message and exit.
```
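
The README does not show how the tool talks to NCBI. The `esearch.dtd` and `einfo.dtd` files in this commit suggest it uses the Entrez E-utilities, so the following is only a minimal sketch of how the `--db_type`, `--term`, `--max_record`, and `--res_type` options could map onto E-utilities request URLs. The function names `esearch_url` and `efetch_url` are hypothetical and are not taken from the tool's source.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI Entrez E-utilities.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="nucleotide", max_record=100):
    """Build an ESearch URL that asks for up to max_record matching IDs."""
    query = urlencode({"db": db, "term": term, "retmax": max_record})
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(ids, db="nucleotide", res_type="gb"):
    """Build an EFetch URL that downloads the given records as plain text."""
    query = urlencode(
        {"db": db, "id": ",".join(ids), "rettype": res_type, "retmode": "text"}
    )
    return f"{EUTILS}/efetch.fcgi?{query}"


if __name__ == "__main__":
    # Mirrors: ncbi-parser --term "Acanthopagrus 16S" --max_record 500
    print(esearch_url("Acanthopagrus 16S", max_record=500))
    # Mirrors: --res_type gb for one accession from the search result
    print(efetch_url(["MK919144.1"]))
```

The sketch only builds the URLs and makes no network request, so it can be run offline. A real client would also need to respect NCBI's request rate limits.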

### NCBI GenBank format
```shell
LOCUS LC649152 1081 bp DNA linear VRT 03-SEP-2021
DEFINITION Acanthopagrus bifasciatus Kuroshio Biological Research Foundation
KBF-I 361 mitochondrial genes for 12S rRNA, tRNA-Val, 16S rRNA,
partial and complete sequence.
ACCESSION LC649152
VERSION LC649152.1
KEYWORDS .
SOURCE mitochondrion Acanthopagrus bifasciatus (twobar seabream)
ORGANISM Acanthopagrus bifasciatus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Actinopterygii; Neopterygii; Teleostei; Neoteleostei;
Acanthomorphata; Eupercaria; Spariformes; Sparidae; Acanthopagrus.
REFERENCE 1
AUTHORS Sado,T., Fukuchi,T. and Miya,M.
TITLE Reference data for MiFish metabarcoding analysis
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 1081)
AUTHORS Sado,T., Fukuchi,T. and Miya,M.
TITLE Direct Submission
JOURNAL Submitted (27-AUG-2021) Contact:Tetsuya Sado Natural History Museum
& Institute, Chiba; 955-2 Aoba-cho, Chuo-ku, Chiba, Chiba 260-8682,
Japan URL :http://www2.chiba-muse.or.jp/NATURAL/
FEATURES Location/Qualifiers
source 1..1081
/organism="Acanthopagrus bifasciatus"
/organelle="mitochondrion"
/mol_type="genomic DNA"
/specimen_voucher="Kuroshio Biological Research Foundation
KBF-I 361"
/db_xref="taxon:767411"
/PCR_primers="fwd_name: L-708-12S, fwd_seq:
ttayacatgcaagtatccgc, rev_name: H-1784-16SG, rev_seq:
ttcagctttcccttgcggtac"
rRNA <1..894
/product="12S ribosomal RNA"
tRNA 895..967
/product="tRNA-Val"
rRNA 968..>1081
/product="16S ribosomal RNA"
ORIGIN
1 acccccgtga aaatgcccta cagttccccg cccggaaaca aggagccggt atcaggcaca
61 ttcaatttag cccacgacac cttgctcagc cacaccctca agggtactca gcagtgataa
121 accttgacac ataagtgaaa acttgaatca gttaaagcta agtagggccg gtaaaactcg
181 tgccagccac cgcggttata cgagaggccc aagttgttag aaatcggcgt aaagggtggt
241 taagaataag attaaaatta aagccgaaca tctttagtag ctgttatacg ctttcaaaga
301 caagaagccc aactgcgaaa gtagctttat attttctgaa cccacgaaag ctaaggtaca
361 aactgggatt agatacccca ctatgcttag ccgtaaacat cgacagttta ttacattttc
421 tgtccgcctg ggtactacaa gcattagctc aaaacccaaa ggacttggcg gtgctttaga
481 cccacctaga ggagcctgtt ctagaaccga tattccccgt tcaacctcac ctctccttgc
541 ctctcagcct atataccgcc gtcgttcagc ttaccctgtg aagggcaaaa agtaagcaaa
601 attggcactg cccagtacgt caggtcgagg tgtagtcaat ggagtgggaa gaaatgggct
661 acattccctt gtcttcaggg aactacgaat ggtgcactga aaatgtgtgc ctgaaggagg
721 atttagcagt aagtagtaat ttagaatatt ctactgaagc cggctcttaa gcgcgcacac
781 accgcccgtc actctccccg agactttaaa ttcacattaa ctaaaatatt aaatatcata
841 gaggggaggc aagtcgtaac atggtaagtg taccggaagg tgtacttgga aaaccagcgc
901 atagctaaac tagataaagc acctccctta cactgagaag atattcgtgc aaatcgaatt
961 gccctgagcc tatcagctag ccctctaaca aaaaacaaca cacccccatc aattaacccc
1021 caatgcactt acattaaatt aaacaaatca tttttccacc caagtatggg cgacagaaaa
1081 g
//
```
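
As a rough illustration of how the tabular fields could be pulled out of a record like the one above, here is a minimal standard-library sketch. It is not the tool's actual parser. The function name `parse_genbank` and the returned keys are hypothetical, and the sketch assumes one record per string, a LOCUS line that ends with the division and date, and qualifier values wrapped in double quotes.

```python
import re


def parse_genbank(text):
    """Extract a few summary fields from one GenBank flat-file record."""
    locus = next(l for l in text.splitlines() if l.startswith("LOCUS")).split()
    record = {
        "length": int(locus[2]),  # e.g. 567 in "567 bp"
        "division": locus[-2],    # e.g. VRT
        "date": locus[-1],        # e.g. 10-JUL-2020
    }
    accession = re.search(r"^ACCESSION\s+(\S+)", text, re.M)
    if accession:
        record["accession"] = accession.group(1)
    # Qualifiers look like /key="value"; a value may wrap over several lines.
    # Only the first occurrence of a repeated key (e.g. db_xref) is kept.
    for key, value in re.findall(r'/(\w+)="([^"]*)"', text):
        record.setdefault(key, " ".join(value.split()))
    # The sequence sits between the ORIGIN line and the closing "//" line.
    origin = re.search(r"^ORIGIN.*?$(.*?)^//", text, re.M | re.S)
    if origin:
        record["sequence"] = re.sub(r"[^A-Za-z]", "", origin.group(1)).upper()
    return record


if __name__ == "__main__":
    sample = (
        "LOCUS       MK919144   567 bp    DNA     linear   VRT 10-JUL-2020\n"
        "ACCESSION   MK919144\n"
        "FEATURES             Location/Qualifiers\n"
        '                     /organism="Acanthopagrus pacificus"\n'
        '                     /collection_date="04-Apr-2018"\n'
        "ORIGIN\n"
        "        1 aagaggtccc gcctgccctg\n"
        "//\n"
    )
    for field, value in parse_genbank(sample).items():
        print(field, value, sep="\t")
```

Stripping digits and whitespace from the ORIGIN block and upper-casing the result gives the same kind of single-line sequence as the `MK919144.1.seq` file in this commit.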
75 changes: 75 additions & 0 deletions einfo.dtd
@@ -0,0 +1,75 @@
<!--
This is the Current DTD for Entrez eInfo
$Id: einfo.dtd 577924 2019-01-09 22:59:07Z fialkov $
need xml2json.xsl
-->
<!-- ================================================================= -->

<!--~~ !dtd
~~json
<json type='einfo' version='0.3'>
<config lcnames='true'/>
</json>
~~-->

<!ELEMENT DbName (#PCDATA)> <!-- \S+ -->
<!ELEMENT Name (#PCDATA)> <!-- .+ -->
<!ELEMENT FullName (#PCDATA)> <!-- .+ -->
<!ELEMENT Description (#PCDATA)> <!-- .+ -->
<!ELEMENT DbBuild (#PCDATA)> <!-- .+ -->
<!ELEMENT TermCount (#PCDATA)> <!-- \d+ -->
<!ELEMENT Menu (#PCDATA)> <!-- .+ -->
<!ELEMENT DbTo (#PCDATA)> <!-- \S+ -->
<!ELEMENT MenuName (#PCDATA)> <!-- .+ -->
<!ELEMENT Count (#PCDATA)> <!-- \d+ -->
<!ELEMENT LastUpdate (#PCDATA)> <!-- \d+ -->

<!ELEMENT ERROR (#PCDATA)> <!-- .+ -->
<!ELEMENT Warning (#PCDATA)> <!-- .+ -->

<!ELEMENT IsDate (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT IsNumerical (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT SingleToken (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT Hierarchy (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT IsHidden (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT IsRangable (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT IsTruncatable (#PCDATA)> <!-- (Y|N) -->


<!ELEMENT DbList (DbName+)>

<!ELEMENT Field (Name,
FullName,
Description,
TermCount,
IsDate,
IsNumerical,
SingleToken,
Hierarchy,
IsHidden,
IsRangable?,
IsTruncatable?)>

<!ELEMENT Link (Name,Menu,Description,DbTo)>


<!ELEMENT LinkList (Link*)>
<!ELEMENT FieldList (Field*)>

<!ELEMENT DbInfo (DbName,
MenuName,
Description,
DbBuild?,
Warning?,
Count?,
LastUpdate?,
FieldList?,
LinkList?)>
<!--~~ <eInfoResult>
~~ json
<object>
<member select='DbList'/>
<array key='dbinfo' select='DbInfo|ERROR'/>
</object>
~~-->
<!ELEMENT eInfoResult (DbList|(DbInfo|ERROR)+)>
103 changes: 103 additions & 0 deletions esearch.dtd
@@ -0,0 +1,103 @@
<!--
This is the Current DTD for Entrez eSearch
$Id: eSearch_020511.dtd 85163 2006-06-28 17:35:21Z olegh $
-->
<!-- ================================================================= -->

<!--~~ !dtd
~~json
<json type='esearch' version='0.3'>
<config lcnames='true'/>
</json>
~~-->

<!ELEMENT eSearchResult (
(
(
Count,
( RetMax,
RetStart,
QueryKey?,
WebEnv?,
IdList,
TranslationSet,
TranslationStack?,
QueryTranslation
)?
) | ERROR
),
ErrorList?,
WarningList?
)>


<!ELEMENT Count (#PCDATA)> <!-- \d+ -->
<!ELEMENT RetMax (#PCDATA)> <!-- \d+ -->
<!ELEMENT RetStart (#PCDATA)> <!-- \d+ -->
<!ELEMENT Id (#PCDATA)> <!-- \d+ -->

<!ELEMENT From (#PCDATA)> <!-- .+ -->
<!ELEMENT To (#PCDATA)> <!-- .+ -->
<!ELEMENT Term (#PCDATA)> <!-- .+ -->

<!ELEMENT Field (#PCDATA)> <!-- .+ -->

<!ELEMENT QueryKey (#PCDATA)> <!-- \d+ -->
<!ELEMENT WebEnv (#PCDATA)> <!-- \S+ -->

<!ELEMENT Explode (#PCDATA)> <!-- (Y|N) -->
<!ELEMENT OP (#PCDATA)> <!-- (AND|OR|NOT|RANGE|GROUP) -->
<!ELEMENT IdList (Id*)>

<!ELEMENT Translation (From, To)>
<!ELEMENT TranslationSet (Translation*)>

<!ELEMENT TermSet (Term, Field, Count, Explode)>

<!--~~ <TranslationStack>
~~ json <array/>
~~-->
<!ELEMENT TranslationStack ((TermSet|OP)*)>

<!-- Error message tags -->
<!--~~ <ERROR>
~~ json <json key="ERROR"/>
~~-->
<!ELEMENT ERROR (#PCDATA)> <!-- .+ -->

<!ELEMENT OutputMessage (#PCDATA)> <!-- .+ -->

<!ELEMENT QuotedPhraseNotFound (#PCDATA)> <!-- .+ -->

<!ELEMENT PhraseIgnored (#PCDATA)> <!-- .+ -->

<!ELEMENT FieldNotFound (#PCDATA)> <!-- .+ -->

<!ELEMENT PhraseNotFound (#PCDATA)> <!-- .+ -->


<!ELEMENT QueryTranslation (#PCDATA)> <!-- .+ -->

<!--~~ <ErrorList>
~~ json
<object>
<array key="phrasesnotfound" select='PhraseNotFound'/>
<array key="fieldsnotfound" select='FieldsNotFound'/>
</object>
~~-->
<!ELEMENT ErrorList (PhraseNotFound*,FieldNotFound*)>

<!--~~ <WarningList>
~~ json
<object>
<array key="phrasesignored" select='PhraseIgnored'/>
<array key="quotedphrasesnotfound" select='QuotedPhraseNotFound'/>
<array key="outputmessages" select='OutputMessage'/>
</object>
~~-->
<!ELEMENT WarningList ( PhraseIgnored*,
QuotedPhraseNotFound*,
OutputMessage* )>
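
For orientation, a response that follows the `eSearchResult` structure declared above can be read with Python's standard `xml.etree.ElementTree`. This is a minimal sketch, not code from this commit: the function name `parse_esearch`, the returned keys, and the sample document (including its made-up IDs `111` and `222`) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that follows the DTD; the IDs are made up.
SAMPLE = """<eSearchResult>
  <Count>2</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <IdList><Id>111</Id><Id>222</Id></IdList>
  <TranslationSet/>
  <QueryTranslation>Acanthopagrus[All Fields] AND 16S[All Fields]</QueryTranslation>
</eSearchResult>"""


def parse_esearch(xml_text):
    """Return the hit count, the ID list, and any phrases that were not found."""
    root = ET.fromstring(xml_text)
    # The DTD allows a bare ERROR element in place of Count and the result body.
    error = root.findtext("ERROR")
    if error is not None:
        raise ValueError(f"ESearch error: {error}")
    return {
        "count": int(root.findtext("Count")),
        "ids": [node.text for node in root.findall("IdList/Id")],
        # ErrorList is optional, so this list is empty when it is absent.
        "phrases_not_found": [
            node.text for node in root.findall("ErrorList/PhraseNotFound")
        ],
    }


if __name__ == "__main__":
    print(parse_esearch(SAMPLE))
```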



Binary file added favicon.ico
Binary file not shown.