To reproduce, run the following on a fresh c6id.large instance:
wget https://dev.mmseqs.com/latest/mmseqs-linux-avx2.tar.gz
tar -xvzf mmseqs-linux-avx2.tar.gz
export PATH=$PWD/mmseqs/bin:$PATH
rm mmseqs-linux-avx2.tar.gz
wget https://dl.secondarymetabolites.org/mibig/mibig_prot_seqs_4.0.fasta
mmseqs createdb mibig_prot_seqs_4.0.fasta mibig_db
mkdir -p example/
mmseqs cluster mibig_db "example/example" tmp \
--compressed 1 \
--cluster-mode 2
This fails with the error shown in full under "Error output". In brief:
Clustering mode: Greedy
9036 ZSTD_decompressStream Corrupted block detected
Error: Pre-clustering step died
Error: linclust died
However, the same command with --cluster-mode 0 works:
mmseqs cluster mibig_db "example/example" tmp \
--compressed 1 \
--cluster-mode 0
This happens with both the prebuilt binary and a self-compiled build of mmseqs.
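To narrow down which combination of options triggers the failure, the two commands above can be run as a small matrix. This is only a sketch: it assumes `mibig_db` was already created as shown, that `--compressed 0` turns compression off, and the `MMSEQS` variable, the `run_matrix` function and the `matrix/` output directory are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run `mmseqs cluster` for each combination of --compressed and
# --cluster-mode and report which combinations exit non-zero.
# MMSEQS, run_matrix and the matrix/ directory are illustrative names,
# not part of MMseqs2.
MMSEQS="${MMSEQS:-mmseqs}"

run_matrix() {
  local db="$1" compressed mode out
  for compressed in 0 1; do
    for mode in 0 2; do
      # Separate output and tmp directories so runs do not reuse state
      out="matrix/c${compressed}_m${mode}"
      mkdir -p "$out"
      if "$MMSEQS" cluster "$db" "$out/clu" "$out/tmp" \
           --compressed "$compressed" --cluster-mode "$mode" \
           > "$out/log.txt" 2>&1; then
        echo "compressed=$compressed cluster-mode=$mode OK"
      else
        echo "compressed=$compressed cluster-mode=$mode FAILED"
      fi
    done
  done
}

# Only run when mmseqs is on PATH and the database exists
if command -v "$MMSEQS" >/dev/null 2>&1 && [ -e mibig_db ]; then
  run_matrix mibig_db
fi
```

Each run writes its full output to `matrix/c<compressed>_m<mode>/log.txt`, so a failing combination can be inspected afterwards.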
Error output:
Create directory tmp
cluster mibig_db example/example tmp --compressed 1 --cluster-mode 2
MMseqs Version: bd01c2229f027d8d8e61947f44d11ef1a7669212
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
Sensitivity 4
k-mer length 0
Target search mode 0
k-score seq:2147483647,prof:2147483647
Alphabet size aa:21,nucl:5
Max sequence length 65535
Max results per query 20
Split database 0
Split mode 2
Split memory limit 0
Coverage threshold 0.8
Coverage mode 0
Compositional bias 1
Compositional bias scale 1
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Mask lower letter repeating N times 0
Minimum diagonal score 15
Selected taxa
Include identical seq. id. false
Spaced k-mers 1
Preload mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Spaced k-mer pattern
Local temporary path
Threads 2
Compressed 1
Verbosity 3
Add backtrace false
Alignment mode 3
Alignment mode 0
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Max reject 2147483647
Max accept 2147483647
Score bias 0
Realign hits false
Realign score bias -0.2
Realign max seqs 2147483647
Correlation score weight 0
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Zdrop 40
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Cluster mode 2
Max connected component depth 1000
Similarity type 2
Weight file name
Cluster Weight threshold 0.9
Set mode false
Single step clustering false
Cascaded clustering steps 3
Cluster reassign false
Remove temporary files false
Force restart with latest tmp false
MPI runner
k-mers per sequence 21
Scale k-mers per sequence aa:0.000,nucl:0.200
Adjust k-mer length false
Shift hash 67
Include only extendable false
Skip repeating k-mers false
Set cluster sensitivity to -s 6.000000
Set cluster iterations to 3
linclust mibig_db tmp/12627170530073326854/clu_redundancy tmp/12627170530073326854/linclust --cluster-mode 2 --max-iterations 1000 --similarity-type 2 --threads 2 --compressed 1 -v 3 --cluster-weight-threshold 0.9 --set-mode 0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0.8 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0--pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --alph-size aa:13,nucl:5 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale aa:0.000,nucl:0.200 --adjust-kmer-len 0 --mask 0 --mask-prob 0.9 --mask-lower-case 0 --mask-n-repeat 0 -k 0 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --rescore-mode 0 --filter-hits 0 --sort-results 0 --remove-tmp-files 0 --force-reuse 0
kmermatcher mibig_db tmp/12627170530073326854/linclust/7507599336006465408/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --alph-size aa:13,nucl:5 --min-seq-id 0 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale aa:0.000,nucl:0.200 --adjust-kmer-len 0 --mask 0 --mask-prob 0.9 --mask-lower-case 0 --mask-n-repeat 0 --cov-mode 0 -k 0 -c 0.8 --max-seq-len 65535 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 2 --compressed 1 -v 3 --cluster-weight-threshold 0.9
kmermatcher mibig_db tmp/12627170530073326854/linclust/7507599336006465408/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --alph-size aa:13,nucl:5 --min-seq-id 0 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale aa:0.000,nucl:0.200 --adjust-kmer-len 0 --mask 0 --mask-prob 0.9 --mask-lower-case 0 --mask-n-repeat 0 --cov-mode 0 -k 0 -c 0.8 --max-seq-len 65535 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 2 --compressed 1 -v 3 --cluster-weight-threshold 0.9
Database size: 46987 type: Aminoacid
Reduced amino acid alphabet: (A S T) (C) (D B N) (E Q Z) (F Y) (G) (H) (I V) (K R) (L J M) (P) (W) (X)
Generate k-mers list for 1 split
[=================================================================] 100.00% 46.99K 0s 621ms
Sort kmer 0h 0m 0s 97ms
Sort by rep. sequence 0h 0m 0s 18ms
Time for fill: 0h 0m 0s 11ms
Time for merging to pref: 0h 0m 0s 0ms
Time for processing: 0h 0m 0s 779ms
rescorediagonal mibig_db mibig_db tmp/12627170530073326854/linclust/7507599336006465408/pref tmp/12627170530073326854/linclust/7507599336006465408/pref_rescore1 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --rescore-mode 0 --wrapped-scoring 0 --filter-hits 0 -e 0.001 -c 0.8 -a 0 --cov-mode 0 --min-seq-id 0.5 --min-aln-len 0 --seq-id-mode 0 --add-self-matches 0 --sort-results 0 --db-load-mode 0 --threads 2 --compressed 1 -v 3
[=================================================================] 100.00% 46.99K 0s 48ms
Time for merging to pref_rescore1: 0h 0m 0s 9ms
Time for processing: 0h 0m 0s 70ms
clust mibig_db tmp/12627170530073326854/linclust/7507599336006465408/pref_rescore1 tmp/12627170530073326854/linclust/7507599336006465408/pre_clust --cluster-mode 2 --max-iterations 1000 --similarity-type 2 --threads 2 --compressed 1 -v 3 --cluster-weight-threshold 0.9 --set-mode 0
Clustering mode: Greedy
9036 ZSTD_decompressStream Corrupted block detected
Error: Pre-clustering step died
Error: linclust died
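In a long log like this one, it can help to pull out the error lines together with the last module that was started before them. The following is a rough sketch, assuming the output above was saved to a file. The `find_failure` function name is hypothetical, and the list of module names is taken only from the commands visible in this log, so it is not exhaustive.

```shell
#!/usr/bin/env bash
# Sketch: given a saved log, print the last module started before the
# first error, then the error lines with their line numbers.
# find_failure is an illustrative name; the module list covers only the
# modules that appear in the log above.
find_failure() {
  log="$1"
  awk '
    # Remember the most recent module command line
    /^(linclust|kmermatcher|rescorediagonal|clust) / { last = $1 }
    # Stop at the first error line and report that module
    /ZSTD_|^Error:/ { print "failed in: " last; exit }
  ' "$log"
  grep -n -E 'ZSTD_|^Error:' "$log"
}

# Usage, assuming the log was saved as log.txt:
# find_failure log.txt
```

For the output above, the last module line before the ZSTD error is the `clust` call with `--cluster-mode 2`.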