Kalamari database link out of date #531

Mattstorey · 2022-02-06T20:50:48Z

Expected Behavior

mmseqs databases Kalamari kalamari tmp should grab the "Kalamari" sequences and produce a taxonomy database.

Current Behavior

Fails to download anything and database creation crashes. Inspection of the ../data/workflow/databases.sh script shows the link to download a .tsv file from the kalamari repo is no longer valid.

Steps to Reproduce (for bugs)

mmseqs databases Kalamari kalamari tmp

MMseqs Output (for bugs)

databases Kalamari kalamari tmp 

MMseqs Version:              	fcf52600801a73e95fd74068e1bb1afb437d719d
Force restart with latest tmp	false
Remove temporary files       	false
Compressed                   	0
Threads                      	8
Verbosity                    	3

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    14  100    14    0     0     34      0 --:--:-- --:--:-- --:--:--    34
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    32    0    32    0     0     24      0 --:--:--  0:00:01 --:--:--    24
createdb tmp/10928370746232574590/kalamari.fasta kalamari --compressed 0 -v 3 

Converting sequences

Time for merging to kalamari_h: 0h 0m 0s 3ms
Time for merging to kalamari: 0h 0m 0s 3ms
Database type: Nucleotide
The input files have no entry:  - tmp/10928370746232574590/kalamari.fasta
Please check your input files. Only files in fasta/fastq[.gz|bz2] are supported
Error: createdb died

Context

Updating the link might partially resolve the error. I have tried a manual download of the Kalamari database but can't get the final taxonomy database to be successfully produced.
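
The curl progress output above reports only 14 and 32 bytes received, which suggests that short error responses, not sequence data, were saved and passed on to createdb. As a minimal sketch of a sanity check one could run on a manually downloaded file before calling createdb: the helper name `looks_like_fasta` is hypothetical and not part of MMseqs2, and it only handles plain-text (uncompressed) FASTA.

```shell
# Hypothetical helper, not part of MMseqs2: succeeds only if the file is
# non-empty and its first non-blank line starts with '>' (a FASTA header).
# Assumption: the file is uncompressed plain text.
looks_like_fasta() {
    # -s: file exists and has a size greater than zero
    [ -s "$1" ] || return 1
    # Grab the first line that contains at least one field (non-blank)
    first=$(awk 'NF { print; exit }' "$1")
    case "$first" in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Example use with a manually downloaded file (path is illustrative):
#   looks_like_fasta kalamari.fasta || echo "kalamari.fasta is not FASTA" >&2
```

A check like this would reject a saved error page (for example a file containing only a short "not found" message) before createdb reports "The input files have no entry".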

Your Environment


Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters): fcf5260
Which MMseqs version was used (Statically-compiled, self-compiled, Homebrew, etc.): Statically-compiled
Server specifications (especially CPU support for AVX2/SSE and amount of system memory): ARMv8 Processor rev 0 (v8l) × 8
Operating system and version: Ubuntu 18.04


martin-steinegger · 2022-02-12T04:39:53Z

I have changed the link to point to a fixed commit that contains the 3.7v tsv file.

ce7bf53b82 Point Kalamari3.7v to a fixed commit soedinglab/MMseqs2#531
git-subtree-dir: lib/mmseqs git-subtree-split: ce7bf53b8241f7ced20f5a75bab052da98e5ca79

result2pp test 3fac8dde Copy profile information of unaligned regions from query profile 967a4555 Cleanup and fix result2pp b2f49a25 Add NR taxonomy information efdbe941 Change serial sort to std::sort 97a8f1dc Update regression 0c123fe7 Fix comp. bias correction in expandaln 401d8e6f Add --max-seqs to ungappedprefilter f57d1a71 Update expandaln, expand2profile and regression a62ea9a9 Update reassign cov. mode in prefilter and fix regression 64f9294b Update regression to include expansion test 61d8b64d Fix coverage read in for nucl-nucl alignment results (#339) 45ae9276 Compute evalues and sort correctly in expandaln 38fab36e Fix wrong sequences being loading in expandaln due to wrong sorting 3aa032be Cleanup in MultipleAlignment ea3212f0 Fix realloc size in profile set size increase 7cca0508 Fix restart cluster-reassign 0945e5a5 Add prefilter parameter to reassign 4e436c79 Fix compile error in tests 47e62299 Avoid constant allocations in PSSMCalculator 657a97c0 Don't clone the whole result_t vector uselessly in profile related modules b87cae01 MultipleAlignment does not require constantly allocating and deallocating Sequence objects anymore 486e13ac Remove add internal ID parameter in result2msa 0a8a7a3a expand2profile module should be able to directly build a new profile a84e6f48 Make max set size in profile classes dynamically growable 5baf62ab Cleanup Sequence class e4b2ffb0 Move PSSM masking and writing to its own file d10a6104 Fix clang warning 76d7d83b Fix progressbar in first clust readin step 01937be2 Taxonomy expressions in filtertax(seq)db interpret , as || now #320 fddf635d Add SILVA to databases module 9ec7c5e6 Fix MPI warning ce65cb86 disable ICC in travis, beta08 breaks their setvars.sh script and SIMDe has many issues 87183135 Fix warning in clang 97653a92 Check the return code of fclose to handle full disk errors better 06bd0cfd Add filterresult for pairwise HHblits filtering to reduce redundancy in a result db #316 3bdaf488 Fix various result2msa 
modes (compress works cleanly now, --filter-msa mode could return invalid MSAs) c1f78338 Fix invalid projected backtraces in expandaln d741a251 Remove circular include 595625a1 Cleanup result2msa/profile 8ad36374 Unify to computation of alignments in msa2result and transitivealign 55534d71 Fix wrong lengths used in msa2profile 5d10ce00 Rewrite expandaln module 4be0d6e1 Add msa2result module for generating result dbs from MSAs a179ab27 Cleanup DBConcat a9c56e57 Merge branch 'master' of github.com:soedinglab/MMseqs2 ec3b8254 Try out new aggregate tax algoritm cfba9f02 Fox .index.0 files not being removed after sorting dde4b2e3 Next try to downgrade ICC 618331da Downgrade ICC since latest version seems to be broken ed45a9f2 Remove unused variables in rewritten microtar 328732a1 Update regression ae7398d6 Added fident to convertalis. fident prints the fraction of the sequence identity. pident reports the percentage. soedinglab/MMseqs2#337 a61b9eb9 handle the unranked root and cell orgs d2141f32 ORF filter with high-eval thr ungapped alignment ea01a174 Remove useless cast in QueryMatcher 1e95b6bd Update tantan 207d0d21 Allow overwriting string parameters with empty strings 755a7b03 Add new binaries to README and fix whitespace be05b8d0 Add orf-filter to taxpercontig and cleanup 22e17aa4 orf-filter should also work in easy-search and easy-taxonomy 7fefa8af added mode to ByteParser 4393c5aa typo b05d7d75 Speed up read index and kmermatcher 3f9a6031 Fix --search-type 4 in createindex 18e90119 Rework read index in DBReader 1eb72611 Do not sort indexes when already ordered while DB close 65f246b1 Merge branch 'master' of https://github.com/soedinglab/mmseqs2 053fd61c Improve multi-threaded speed of writing clustering results e1a71066 Fix typo in arch name 3b1e528c Add SSE2 binary to Docker d66ee416 Use Ubuntu 20.04 for cross-compilation 8ba605e8 Add SSE2 and cross-compiled ARM64/POWER8/POWER9 builds to azure a5e485ba Fix broken checks for libraries when cross-compiling 
7fe0cb90 Fix progress bar in DBConcat cef0731b Create translated index if --search-type 4 is used in createindex 47afc572 Fix --search-mode 4 issues in offsetalignment 80fdcbed Change cluster reassing to bool soedinglab/MMseqs2#329 57e8a9df Allow ORF filter only in combination with query nucleotides d55f06ce Fix Pfam.full database creation 659cc1f8 Add additional experimental ORF prefiltering step before translated search e934f1c4 made ByteParser more informative 4d14c9fe tax-lineage modes: 0 nothing, 1 names, 2 taxids b777cd09 Disable ips4o on ppc for now 95a88524 find_package is case sensitive 8e797b1d Allow disabling use of IPS4O, cleanup 850a196b added seqs assignment agreement to the output e21dc40f Fix wrong existence checks for databases in workflows 5901a0a9 Set minimum clang to 5.0 for now d7b46e60 Disable ips4o on cygwin 033fda23 Change travis gcc check to 4.9 908675d2 Add includes ee7b5c11 Change random_shuffle to shuffle d1a1af5e Rewrite atomic check in cmake d6590f39 Add missing FastSort.h 704d0fb4 Merge branch 'master' of https://github.com/soedinglab/mmseqs2 109be7bb Change sort to ips4o if possible 02059366 Fix warning d092a469 Fix kmermatcher MPI support b001dfb2 Made modifications for Profile-Profile alignment. Changes belong to SSW, Alignment, Matcher. Right before integrating lin space vector cost calculation for H value. 521c0d25 Made modifications to ssw algorithm implementation. 
2f1db01c Rename martin.steinegger@mpibpc.mpg.de to martin.steinegger@snu.ac.kr 0f7b6856 Fix #326 wrong citation link 62a387ed Merge branch 'master' of https://github.com/soedinglab/mmseqs2 c125a217 Fix issues in expandaln 648bc1f6 Add Pfam-B download script 16e79a2a Add dbCAN2 download script 7c0ed7f8 Microtar would try to seek backwards resulting in horrible gzip read performance cab0e838 Fix #323 createdb not correctly reading gz/bzip with --createdb-mode 1 1d650034 mmseqs --help should not give a useless correction suggestion 35c58af9 Improve download of taxdmp file in createtaxdb 68feeb20 Merge branch 'master' of https://github.com/soedinglab/mmseqs2 6546822c Add missing .dbtype to newSeqDb header in cluster update 2a787482 Merge pull request #321 from milot-mirdita/simde 72d19b96 Seems like travis reduced the RAM available on ARM 565ad3f9 Add script to update SIMDe b9783a7f Squashed 'lib/simde/simde/' content from commit 2119ac73 9828f0d6 Merge commit 'b9783a7fca1677486f2f830a9c59fda11330980c' as 'lib/simde/simde' 641ef68b Remove submodule in preparation for subtree b6dd6447 Work around clang issue a877dc00 Rebuild SIMD autodetection 5ba9e7ae Cleanup warnings 3980d2a7 Add one Newton-Raphson it to make division with _mm_rcp_ps always consistent 27b82963 Try limiting threads in ppc to not crash on 4gig ram c95bdcc1 Silence strict aliasing warning in Itoa for NEON 590cfb96 Rebuild 128/256 bit SIMD split in simd.h f5750fee Enable building on non-x86 and less than SSE4.1 21d798f0 Remove not finished createtaxdb changes b59c3381 Make orf information available through convertalis 284bb757 Merge branch 'master' of https://github.com/soedinglab/mmseqs2 f4bbce84 Add MemoryTracker, Account for index size when computing available memory e2510e8f fixed comment because it wasmisleading def7ace2 Add convertalis HTML output based on MMseqs2 app (app.mmseqs.com) dd3ff63a Fix convertkb to work without a mapping file dc054792 Previous lookup writer would always report failing 
52ac0f36 Refactor lookup writing to not corrupt memory if an accession is too long 9f2be0e0 Disable ICC travis for now d319bb92 Merge branch 'master' of github.com:soedinglab/MMseqs2 d1522365 remove appendtaxaln 648cf836 refined code as per Milot's feedback 94db0316 One more INT -> UINT warning that ICC complains about 271b7c13 Next try for travis 2c1dfdd4 Fix terminate value in SSW again 6016e1b1 Try to fix travis ddaaaf7f Fix various warnings reported by ICC, add ICC to travis f3adc10f added aggregatetaxweights to get rid of appendtaxaln 211dd7a7 change to SSTR 517b01ba true in createParameterString saves defining taxonomy defaults aa068c86 added taxonomy default parameter values because life f9d1face in the process of adding taxPerContig workflow 94f895a3 fixed english 19f1dfbd moved weight const back 8db3e714 moved definition of constants 05cfc8bd added mode of tax output: both lca and aln 2cd59046 added voteMode parameter 128f57b5 extended aggregatetax to handle eval-based weights e4a10bd7 added appendtaxaln for extending aggregatetax 0c29da4a Actually fix the filterdb --join-db issue 7ff6ae7c Restore fix lost char in joindb mode change f5c8b28c Update README.md e4f7e745 Add qOrfStart/qOrfEnd, dbOrfStart,dbOrfEnd to offsetalignment cf40916c Merge branch 'master' of https://github.com/soedinglab/mmseqs2 c0dac797 Do not write null byte in splitdb cbb542af added rand id to tmp files created at localTmp 214e87e9 Remove goto in lca.cpp c8309fce Merge branch 'master' of https://github.com/soedinglab/mmseqs2 b761ddf4 Fix issue with qset format output 80bff832 Do not write .lookup in easy workflows if not needed 21f7a05f createdb can now read a database containing FASTA/Q entries d5a05376 Fix whitespace and cleanup output strings in createdb d14b622e Fix cygwin compile issue 9e5fb33b Introduce KSeqWrapper to read from memory location dc7b9626 Merge branch 'master' of https://github.com/soedinglab/mmseqs2 de6f7524 Fix soft link createdb bug if multiple input file are 
provided b06bee91 Fix alpha regex 46c84389 Update combine pval agg-mode 3 67d61013 Disable fancy progress bars on travis to reduce output 203a2173 Updated two more tests to use tighter ROC thresholds a9052f44 Update regression with tighter bounds for ROC tests c62736a6 Correctly parse keys from data files in filterdb --filter-file This was causing a linsearch instability fe007cb4 Use MultiParam for gapOpen, gapExtend costs 3513001d Add easy-rbh workflow d0d3032e Fix RBH search if using -a to show alignments ce1a43bf Merge branch 'master' of https://github.com/soedinglab/mmseqs2 ea24e493 Fix issues with abs. path if using aria2c 5228745f Improve --alignment-mode parameter description and make it a non expert parameter fffa9b10 Fix various inconsistencies and usability issues with alignall: * alignall alignment-mode did not correspond to align alignment-mode * add-backtrace did not do anything, has to be specified now if backtrace is needed * Did return a alignment db type even though it is incompatible with that type, uses generic for now * various parameters were passed but unused - zdrop and scorebias are used now (however see below) - realign, alt ali, max accept/reject, wrapped are now gone 29066847 Fix wrong warning 813d81f2 Update regression 264d7811 Switch greedy clustering algorithm back to old idea c09f6574 Improve nucleotide clustering workflow 38a73770 Set k-mers in linclust to 0 for the nucleotide clustering 7df6e3f7 Replace characters that can not be reversed by N in extract frames e9678f62 Update regression f886e868 Add nucleotide support to cluster (workflow nucleotide_clustering), clust module will infer identity automatically if missing, Improve low. mem. 
greedy incremental algorithm, Update regression 5f873587 Add kmers-per-sequence-scale to linsearch 0310eb60 Change --kmer-per-seq-scale to a multi parameter, add error if cluster is called with a nucleotide sequence e258bc8d Fix #299 PDB70 database creation was not working 7095f37e Add support reverse complemente in rescorediagonal --rescore-mode 0 and 1 61ca4888 Fix result2dnamsa 70d014e4 Add search-type 4 to Search 462f24cb Add module result2dnamsa 5670d990 Fix regression error e4451d59 Add result direction parameter to kmersearch 12c499dc Fix reverse sequences issues in linclust and linsearch 44499c3c Update filterdb regression test 807b4a56 Fix issue soedinglab/MMseqs2#290. Filterdb checked for mode == true but mode was 2. 24479bc2 Fix Docker a578f52a Fix char signedness on PPC a0d64a98 Update regression a07a266f Working on PPC64LE support 09734177 Remove remaining _mm_shuffle_epi32 cdef78a6 Merge pull request #285 from hgsommer/misc_small 283c8d03 Replace goto end in ssw 6bfc5028 Fix c/p mistake in convertalignments e61da344 Fix spelling of 'length' 9a63760f Replace nested ternary operator 4349b5c6 Avoid repeatedly checking for profile db types c170a11f Call MsaFilter::shuffleSequences() from MsaFilter::filter() ef49ba22 Return value from MsaFilter::filter() d155dc36 Replace int by bool literals for bool variable ec6722ad Align headings with column in PSSMCalculator::printProfile() 548a9bd6 Avoid forward declaration of ScoreMatrix d0fbe471 Do some cleanup in StripedSmithWaterman.cpp 91d1aedd Replace check for zero-sized containers by empty() e47b8eed Remove superfluous parameter from ssw_init() 250b1221 Simplify return statements 4fe1116a Remove counting zero scores in Sequence::mapProfile() 4303728b Replace multiplication by zero 1bd60242 Remove increment by zero e4d4389f Move check for exit condition in front of allocations 556d26d1 Clean up function signatures in MultipleAlignment 3863af9a Move include back to header to restore build e1208493 Remove unused 
TmpResult score field 1fd4db8f Die if DBReader cannot reopen files (e.g. no more file handles left) 1e21b87b Purge sequenceLookup early since its recreate in split databases 40854ddc Prefiltering and CacheFriendlyOperations refactoring 2433e086 WASM work in progress 14014cd0 Fix prefilter overflow instability e0f97184 Add conda forge to conda install instructions aa175d63 Fix off by one in kmermatcher soedinglab/MMseqs2#274 (comment) d1607bc8 Remove LINE_MAX eca2155d Clear string buffer instead of reassigning in swapresults 0f4645ed Fix wrong reverse marking in linsearch reported by UBSAN 5b612a32 Missing mpi binaries for travis regression 83d22417 Next try for ARM compiler flags 7ad122f0 Missed a few variables ac7914be Do not require a cmake variable to build ARM 0dcfaadb Update regression to fix broken samtools call on ARM 29927b4c More NEON fixes, we assume signed chars, ARM uses unsigned by default 7760220f Next try to get the ARM regression to work cc6d0d52 Add hack to not break travis log size limit 5408c3d1 Try to get NEON to compile 83192cab Fix search workflow parameters printed twice f6f001c8 Fix new clang-10 warnings and further travis fixes 259e6434 llvm-10 alias is not whitelisted in travis yet b1249fd5 Fix errors in Travis YAML from previous commit 18486d4c Update travis - use native aarch64 for neon - use xenial - shorten script 98c37f3c shortend MultiParam usage, improved line breaks in usage c9be07f1 Add gcc-9 to travis 2e5fb309 Fix travis clang build d5865c89 Remove MultiParam g++-9 warning 73679835 Rework target split merging ca586939 Fix RESSIZE issue in slice search if sequences are used 491900b9 Improve usage text of cluster/linclust 0166850a Remove old greedy incremental clustering code and just run the memory efficient version instead. 
15163e64 Fix Verbosity in workflows aa78af46 Fix issue soedinglab/MMseqs2#274 7846dfce fixed clang template error e1206371 extended MultiParam class, replaced ScoreMatrixFile type by MultiParam<char*> b88b5475 rewrite alphabetSize as multi parameter ecb4e35d started template class MultiParam to store sequence type specific values e1a1c122 changed dbtype comparision in AlignmentSymmetry 2a829aef Replace symlinkat call with getcwd/chdir/symlink/chdir to fix Conda build using macOS 10.9 SDK 28e83e8d Add OpenMP include to DBReader fb00aa0c Fix realloc issue while IndexTable creation of profiles 504e5021 Take max. seq. len of query and target db in prefilter and alignment 16e23521 Fix bug if seq. len > max seq. length in Alignment 80d0187d Fix asan issue 751f5c19 Make ZDROP an expert parameter, change description text 1b6edd0d Rework x detection (SIMD) 9677254a Merge branch 'master' of https://github.com/soedinglab/mmseqs2 1ac1e686 Fix max seq issues in prefilter cb737033 Reset download strategy to not use aria2c for the NCBI download c95f3ee0 fixed ksw2 test 72b95c0c Error if we cannot download from NCBI 1d0aad50 Fix databases not piecing togehter all kalamari accessions 516723d5 Merge branch 'master' of https://github.com/soedinglab/MMseqs2 d81b6cca added zdrop parameter to control banded nucleotide alignment e2e39a97 Add Kalamari Contaminants database c0c538ea Various fixes in databases script 08cc95b3 Fix createtaxdb redownloading when taxdump already exists 018eb349 Remove a bit whitespace in front of each parameter in usage message 8aa7513d add aggregatetax example, fix typos 8bcd7c74 Fix typo 8e581b76 Rework usage texts 7dc25764 Hide most parameters from createindex 2baa609e Add examples to many modules 00a7d769 fixed bugs for long or wrapped nucleotide sequences a4bdcb47 eggNOG profiles should not depend on the deleted MSAs 4c783095 Fix eggNOG database construction f7a5599c Cleanup not needed files immediately in databases workflow 3ed3690d Fix downloads always 
restarting in databases workflow 4cfac9a8 Fix aria warning with more than 16 connections e0a00e10 Revert "Use SW instead of BandedNucAln if we don't have diagonals" 7ac966b2 Fix result2msa could fail if it was writing compressed output 95729ac7 Fix wrong output DB type written in alignall f899e7c7 Use SW instead of BandedNucAln if we don't have diagonals c08d9fa8 Allow parameter descriptions to span multiple lines 57868498 MMseqs2 is not limited to proteins, update README to reflect that 11818b0a Cleanup hiding parameters in workflows c481cea6 Remove some useless includes 2f64aeeb Fix databases timestamp appending instead of overwriting ae9e9e32 Add eggNOG setup procedure to databases 31c8e5d5 Shorten two short parameter descriptions 2f49d3e3 Read header from lookup in msa2profile if available 1356869b add option to reverese profile dbs ac3482e8 More issues with zlib and tar2db aaafafe4 Fix tar2db keys c751d9e2 More tar2db fixes a9c93014 Fix variadic input to tar2db 51a76130 Add tar2db module to convert content of any tar to a DB 96f9a91e Use nedmalloc on Windows/Cygwin 73f5c2a2 Add databases workflow to README 5a7ac9e5 make align output consistent c5ebe529 fixed setcover cluster mode (by fixing bug in similarity reading for short aln results e.g. hamming distance aln) 481696b5 Fix databases output c6b4a57a Beginning cleaning up parameter descriptions a9552a17 Show default value of bool parameters af89c467 Add a proposed example text structure git-subtree-dir: lib/mmseqs git-subtree-split: c48da9d781b81804727b5cccfed7f97cfcc20c9d

martin-steinegger added a commit that referenced this issue Feb 12, 2022

Point Kalamari3.7v to a fixed commit #531

ce7bf53
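The fix referenced above pins the Kalamari download to a fixed commit, so it no longer follows a branch whose files can move. A minimal sketch of that idea in shell follows, assuming the file is fetched from raw.githubusercontent.com. The helper names `pinned_raw_url` and `has_fasta_records`, and the owner, repository, commit, and path values in the usage comments, are hypothetical placeholders and are not taken from `databases.sh`.

```shell
#!/bin/sh
# Hypothetical helper (not from databases.sh): build a raw GitHub URL that
# points at one specific commit instead of a moving branch name.
#   $1 = owner/repo, $2 = commit hash, $3 = path inside the repository
pinned_raw_url() {
    printf 'https://raw.githubusercontent.com/%s/%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical guard: succeed only if the file is non-empty and contains at
# least one FASTA header line, so an empty download is caught before createdb.
has_fasta_records() {
    [ -s "$1" ] && grep -q '^>' "$1"
}

# Usage sketch (placeholders only, left commented out because it needs network):
#   URL=$(pinned_raw_url "example-owner/example-repo" "<commit-sha>" "path/to/list.tsv")
#   curl -fsSL -o list.tsv "$URL" || { echo "download failed: $URL" >&2; exit 1; }
#   has_fasta_records kalamari.fasta || { echo "no sequences downloaded" >&2; exit 1; }
```

With `curl -f`, an HTTP error makes curl exit non-zero instead of saving the error page to disk, so a dead link stops the script at the download step rather than at `createdb`.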

milot-mirdita closed this as completed Jul 5, 2022

martin-steinegger commented Feb 12, 2022