RepeatModeler Version 2.0.4
===========================
Using output directory = /data/tmp/rModeler.i4k5vT/RM_835855.WedNov131917432024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1731554263
Database = /data/tmp/rModeler.i4k5vT/GCA_042242105.1_fPemKlu1.hap1
  - Sequences = 441
  - Bases = 646252061
  - N50 = 27811033
  - Contig Histogram:
  Size(bp)                                                          Count
  -----------------------------------------------------------------------
  34280133-36727992 |                                                   [ 1 ]
  31832274-34280132 |                                                   [ ]
  29384415-31832273 |                                                   [ 5 ]
  26936556-29384414 |                                                   [ 5 ]
  24488698-26936556 |                                                   [ 4 ]
  22040839-24488697 |                                                   [ 6 ]
  19592980-22040838 |                                                   [ 1 ]
  17145121-19592979 |                                                   [ 1 ]
  14697262-17145120 |                                                   [ 1 ]
  12249404-14697262 |                                                   [ ]
  9801545-12249403  |                                                   [ ]
  7353686-9801544   |                                                   [ ]
  4905827-7353685   |                                                   [ ]
  2457968-4905826   |                                                   [ ]
  10110-2457968     |************************************************** [ 417 ]
Storage Throughput = excellent ( 1106.65 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40033046 bp ( 40032246 non ambiguous )
   - Num Contigs Represented = 65
   - Sequence extraction : 00:00:34 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:14:34 (hh:mm:ss) Elapsed Time
Round Time: 00:21:54 (hh:mm:ss) Elapsed Time : 253 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:07 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:42 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       4632 repeats masked totaling 693890 bp(s).
   - TE Masking time 00:00:09 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10030299 bp
       Num Contigs Represented = 31
       Non ambiguous bp:
             Initial: 10030299 bp
             After Masking: 8836403 bp
             Masked: 11.90 %
 -- Input Database Coverage: 10030299 bp out of 646252061 bp ( 1.55 % )
Sampling Time: 00:00:59 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 32131
Comparison Time: 00:05:35 (hh:mm:ss) Elapsed Time, 7117 HSPs Collected
Number of families returned by RECON: 1516
Round Time: 00:07:43 (hh:mm:ss) Elapsed Time : 12 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:20 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:01:56 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       15498 repeats masked totaling 2484112 bp(s).
   - TE Masking time 00:00:20 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30002667 bp
       Num Contigs Represented = 58
       Non ambiguous bp:
             Initial: 30001867 bp
             After Masking: 26023163 bp
             Masked: 13.26 %
 -- Input Database Coverage: 40032966 bp out of 646252061 bp ( 6.19 % )
Sampling Time: 00:02:38 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 287661
Comparison Time: 00:34:23 (hh:mm:ss) Elapsed Time, 47465 HSPs Collected
Number of families returned by RECON: 5412
Round Time: 00:42:17 (hh:mm:ss) Elapsed Time : 102 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:01:08 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:07:16 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       58337 repeats masked totaling 9267938 bp(s).
   - TE Masking time 00:01:16 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90036121 bp
       Num Contigs Represented = 113
       Non ambiguous bp:
             Initial: 90034140 bp
             After Masking: 76409946 bp
             Masked: 15.13 %
 -- Input Database Coverage: 130069087 bp out of 646252061 bp ( 20.13 % )
Sampling Time: 00:09:48 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 2609470
Comparison Time: 03:20:01 (hh:mm:ss) Elapsed Time, 202476 HSPs Collected
Number of families returned by RECON: 20388
Round Time: 04:02:41 (hh:mm:ss) Elapsed Time : 379 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:02:40 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:20:01 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       227263 repeats masked totaling 37014525 bp(s).
   - TE Masking time 00:05:55 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 270024007 bp
       Num Contigs Represented = 254
       Non ambiguous bp:
             Initial: 270015288 bp
             After Masking: 220136519 bp
             Masked: 18.47 %
 -- Input Database Coverage: 400093094 bp out of 646252061 bp ( 61.91 % )
Sampling Time: 00:28:53 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 23348361
Comparison Time: 23:53:47 (hh:mm:ss) Elapsed Time, 820837 HSPs Collected
Number of families returned by RECON: 83081
Round Time: 29:17:47 (hh:mm:ss) Elapsed Time : 1009 families discovered.

RepeatScout/RECON discovery complete: 1755 families found
Classification Time: 01:05:42 (hh:mm:ss) Elapsed Time
Program Time: 35:38:04 (hh:mm:ss) Elapsed Time