RepeatModeler Version 2.0.4
===========================
Using output directory = /dev/shm/rModeler.B15YZN/RM_24236.TueAug130144372024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1723538676
Database = /dev/shm/rModeler.B15YZN/GCA_004125335.1_ASM412533v1
  - Sequences = 116756
  - Bases = 543605392
  - N50 = 9937
  - Contig Histogram:
  Size(bp)                                                          Count
  -----------------------------------------------------------------------
  128217-137364 |                                                   [ 1 ]
  119070-128217 |                                                   [ ]
  109923-119070 |                                                   [ ]
  100776-109923 |                                                   [ ]
  91629-100776  |                                                   [ ]
  82482-91629   |                                                   [ 2 ]
  73335-82482   |                                                   [ 6 ]
  64188-73335   |                                                   [ 12 ]
  55041-64188   |                                                   [ 25 ]
  45894-55041   |                                                   [ 74 ]
  36747-45894   |                                                   [ 265 ]
  27600-36747   |                                                   [ 853 ]
  18453-27600   |*                                                  [ 3301 ]
  9306-18453    |******                                             [ 13365 ]
  159-9306      |************************************************** [ 98852 ]

WARN: The N50 for this assembly is low ( <10,000 ). The de novo methods employed by RepeatModeler are intended for use with long contiguous sequences and may not perform well with an over-abundance of short contigs in the database.

Storage Throughput = excellent ( 1032.94 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40182927 bp ( 40000934 non ambiguous )
   - Num Contigs Represented = 8809
   - Sequence extraction : 00:00:05 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:16:41 (hh:mm:ss) Elapsed Time
Round Time: 01:28:13 (hh:mm:ss) Elapsed Time : 435 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:02 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:46 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       12575 repeats masked totaling 8016743 bp(s).
   - TE Masking time 00:00:46 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10052362 bp
       Num Contigs Represented = 2165
       Non ambiguous bp:
             Initial: 10009718 bp
             After Masking: 1907477 bp
             Masked: 80.94 %
 -- Input Database Coverage: 10052362 bp out of 543605392 bp ( 1.85 % )
Sampling Time: 00:01:36 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
 - Total Comparisons = 2342530
Comparison Time: 00:11:04 (hh:mm:ss) Elapsed Time, 25709 HSPs Collected
Number of families returned by RECON: 482
Round Time: 00:13:29 (hh:mm:ss) Elapsed Time : 8 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:04 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:02:15 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       37591 repeats masked totaling 24635177 bp(s).
   - TE Masking time 00:02:16 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30142626 bp
       Num Contigs Represented = 6648
       Non ambiguous bp:
             Initial: 30003104 bp
             After Masking: 5136899 bp
             Masked: 82.88 %
 -- Input Database Coverage: 40194988 bp out of 543605392 bp ( 7.39 % )
Sampling Time: 00:04:40 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
 - Total Comparisons = 22094628
Comparison Time: 00:37:39 (hh:mm:ss) Elapsed Time, 112185 HSPs Collected
Number of families returned by RECON: 1401
Round Time: 00:43:26 (hh:mm:ss) Elapsed Time : 36 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:00:11 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:06:43 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       116063 repeats masked totaling 74169370 bp(s).
   - TE Masking time 00:06:59 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90410420 bp
       Num Contigs Represented = 19453
       Non ambiguous bp:
             Initial: 90008333 bp
             After Masking: 15069085 bp
             Masked: 83.26 %
 -- Input Database Coverage: 130605408 bp out of 543605392 bp ( 24.03 % )
Sampling Time: 00:14:05 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
 - Total Comparisons = 189336070
Comparison Time: 02:27:03 (hh:mm:ss) Elapsed Time, 852082 HSPs Collected
Number of families returned by RECON: 3646
Round Time: 02:45:33 (hh:mm:ss) Elapsed Time : 219 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:00:32 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:20:33 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       371156 repeats masked totaling 231672505 bp(s).
   - TE Masking time 00:28:46 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 271211390 bp
       Num Contigs Represented = 58145
       Non ambiguous bp:
             Initial: 270000355 bp
             After Masking: 36042532 bp
             Masked: 86.65 %
 -- Input Database Coverage: 401816798 bp out of 543605392 bp ( 73.92 % )
Sampling Time: 00:50:28 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
 - Total Comparisons = 1694696871
Comparison Time: 12:50:09 (hh:mm:ss) Elapsed Time, 6481954 HSPs Collected
Number of families returned by RECON: 9212
Round Time: 13:59:02 (hh:mm:ss) Elapsed Time : 472 families discovered.

RepeatScout/RECON discovery complete: 1170 families found
Classification Time: 01:41:15 (hh:mm:ss) Elapsed Time
Program Time: 20:50:58 (hh:mm:ss) Elapsed Time