RepeatModeler Version 2.0.4
===========================
Using output directory = /scratch/tmp/rModeler.mR6DjY/RM_4100171.SunNov172337262024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1731915445
Database = /scratch/tmp/rModeler.mR6DjY/GCA_964106825.2_mVulVul1.hap1.2
  - Sequences = 461
  - Bases = 2411727275
  - N50 = 139751513
  - Contig Histogram:
  Size(bp)                                                                Count
  -----------------------------------------------------------------------
  187478798-200869998 |                                                  [ 2 ]
  174087598-187478797 |                                                  [ ]
  160696398-174087597 |                                                  [ 2 ]
  147305198-160696397 |                                                  [ 2 ]
  133913998-147305197 |                                                  [ 3 ]
  120522798-133913997 |                                                  [ 3 ]
  107131598-120522797 |                                                  [ 2 ]
  93740399-107131598  |                                                  [ 3 ]
  80349199-93740398   |                                                  [ ]
  66957999-80349198   |                                                  [ ]
  53566799-66957998   |                                                  [ ]
  40175599-53566798   |                                                  [ ]
  26784399-40175598   |                                                  [ ]
  13393199-26784398   |                                                  [ ]
  2000-13393199       |************************************************* [ 444 ]
Storage Throughput = excellent ( 1507.37 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40033083 bp ( 40027683 non ambiguous )
   - Num Contigs Represented = 41
   - Sequence extraction : 00:01:29 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:07:36 (hh:mm:ss) Elapsed Time
Round Time: 00:12:05 (hh:mm:ss) Elapsed Time : 174 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:21 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:14 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       13751 repeats masked totaling 2191296 bp(s).
   - TE Masking time 00:00:04 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10019411 bp
       Num Contigs Represented = 26
       Non ambiguous bp:
             Initial: 10018411 bp
             After Masking: 7727432 bp
             Masked: 22.87 %
 -- Input Database Coverage: 10019411 bp out of 2411727275 bp ( 0.42 % )
Sampling Time: 00:00:40 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 31125
Comparison Time: 00:03:06 (hh:mm:ss) Elapsed Time, 6361 HSPs Collected
Number of families returned by RECON: 868
Round Time: 00:04:01 (hh:mm:ss) Elapsed Time : 22 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:01:05 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:01:02 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       45077 repeats masked totaling 7280790 bp(s).
   - TE Masking time 00:00:09 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30013592 bp
       Num Contigs Represented = 33
       Non ambiguous bp:
             Initial: 30009192 bp
             After Masking: 22428839 bp
             Masked: 25.26 %
 -- Input Database Coverage: 40033003 bp out of 2411727275 bp ( 1.66 % )
Sampling Time: 00:02:17 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 283128
Comparison Time: 00:14:59 (hh:mm:ss) Elapsed Time, 38989 HSPs Collected
Number of families returned by RECON: 2448
Round Time: 00:17:45 (hh:mm:ss) Elapsed Time : 65 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:03:14 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:03:38 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       144620 repeats masked totaling 24927258 bp(s).
   - TE Masking time 00:00:29 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90040192 bp
       Num Contigs Represented = 81
       Non ambiguous bp:
             Initial: 90025888 bp
             After Masking: 63802704 bp
             Masked: 29.13 %
 -- Input Database Coverage: 130073195 bp out of 2411727275 bp ( 5.39 % )
Sampling Time: 00:07:25 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 2550411
Comparison Time: 01:27:25 (hh:mm:ss) Elapsed Time, 214620 HSPs Collected
Number of families returned by RECON: 9119
Round Time: 01:41:59 (hh:mm:ss) Elapsed Time : 159 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:09:54 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:09:04 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       474171 repeats masked totaling 83164089 bp(s).
   - TE Masking time 00:02:14 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 270048102 bp
       Num Contigs Represented = 153
       Non ambiguous bp:
             Initial: 270006102 bp
             After Masking: 183354369 bp
             Masked: 32.09 %
 -- Input Database Coverage: 400121297 bp out of 2411727275 bp ( 16.59 % )
Sampling Time: 00:21:22 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 23007936
Comparison Time: 10:05:29 (hh:mm:ss) Elapsed Time, 187952 HSPs Collected
Number of families returned by RECON: 39137
Round Time: 10:40:59 (hh:mm:ss) Elapsed Time : 330 families discovered.

RepeatScout/RECON discovery complete: 750 families found
Classification Time: 00:17:36 (hh:mm:ss) Elapsed Time
Program Time: 13:14:25 (hh:mm:ss) Elapsed Time