Pacific Biosciences PacificBiosciences Start from the website below! http://pacbiodevnet.com

PacificBiosciences/Bioinformatics-Training 402

Bioinformatics training resources

PacificBiosciences/FALCON 196

FALCON: experimental PacBio diploid assembler -- Out-of-date -- Please use a binary release: https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries

PacificBiosciences/blasr 130

BLASR: The PacBio® long read aligner

PacificBiosciences/DevNet 97

The DevNet project on github stores the PacBio DevNet website.

PacificBiosciences/ccs 76

CCS: Generate Highly Accurate Single-Molecule Consensus Reads (HiFi Reads)

PacificBiosciences/barcoding 45

Lima - Demultiplex Barcoded PacBio Samples

PacificBiosciences/FALCON-integrate 30

Mostly deprecated. See https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries

PacificBiosciences/bam2fastx 21

Converting and demultiplexing of PacBio BAM files into gzipped fasta and fastq files.

PacificBiosciences/ANGEL 16

Robust Open Reading Frame prediction (ANGLE re-implementation)

PacificBiosciences/apps-scripts 15

Miscellaneous scripts for applications of PacBio systems

started PacificBiosciences/DevNet

started time in 2 hours

started PacificBiosciences/pbbioconda

started time in 3 hours

issue comment PacificBiosciences/pbbioconda

Lima terminate called without an active exception

thanks! File is shared :)

zmz1988

comment created time in 5 hours

started PacificBiosciences/pbbioconda

started time in 6 hours

issue comment PacificBiosciences/pbbioconda

Lima terminate called without an active exception

Include the header :)

samtools view -h sample.ccs.bam | head -n 1000 | samtools view -bS - > test.ccs.bam
zmz1988

comment created time in 6 hours

issue comment PacificBiosciences/align-clr-to-ccs

If no .bam.pbi files for associated .bam files, then actc exits without outputting stderr or stdout

Thanks for the bug report. I will check and fix it.

jelber2

comment created time in 11 hours

issue opened PacificBiosciences/align-clr-to-ccs

If no .bam.pbi files for associated .bam files, then actc exits without outputting stderr or stdout

I noticed that if there are no .bam.pbi files for the associated input .bam files, actc exits without writing anything to stderr or stdout (using https://github.com/PacificBiosciences/align-clr-to-ccs/releases/download/0.1.0/actc on a Debian 11 machine). A simple error message would be great, as would documenting this requirement and how to generate the index with the pbindex binary from the bioconda package pbbam.
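
Until actc reports this itself, a pre-flight check along these lines could catch missing indexes. This is only a sketch: the function name `check_pbi` is hypothetical, and it assumes each index sits next to its BAM as `<file>.bam.pbi`, which is where `pbindex` writes it.

```shell
# check_pbi (hypothetical helper): report every BAM whose .pbi index is missing.
# Returns 0 when all indexes exist, 1 otherwise.
check_pbi() {
  missing=0
  for bam in "$@"; do
    if [ ! -f "${bam}.pbi" ]; then
      echo "missing index: ${bam}.pbi (generate it with: pbindex ${bam})" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_pbi subreads.bam ccs.bam` (placeholder file names) before actc prints one message per missing index and returns a non-zero status, so it can gate the actc call in a script.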

created time in 12 hours

push event PacificBiosciences/pbcommand

Nathaniel Echols

commit sha 79df48210f151322a6538602e1ed806d0dbfc7e6

[SL-7546] add name to fast dataset metadata loading

Nathaniel Echols

commit sha 5b4e6d0bc2cf9d3f96eca07ab1b6438cce6faae0

Pull request #165: [SL-7546] add name to fast dataset metadata loading

Merge in SL/pbcommand from bugfix/SL-7546-all-consensusreadsets-show-up-as-lima-0-in-demultiplex-barcodes-file-downloads to develop

* commit '79df48210f151322a6538602e1ed806d0dbfc7e6':
  [SL-7546] add name to fast dataset metadata loading

push time in 12 hours

issue comment PacificBiosciences/pb-metagenomics-tools

Fail job due to "[E::sam_parse1] no SQ lines present in the header"

@CaroleBelliardo I am not sure what the specific issue is here. It looks like the diamond search failed, but it is not clear why.

Would you be able to paste the contents of logs/cat.CheckForBins.log? This can tell us if there were actually bins produced by Metabat2, which are required for DAS_Tool.

CaroleBelliardo

comment created time in a day

issue closed PacificBiosciences/pb-metagenomics-tools

Testing RMA parameters

Two parameters may help improve recall while preserving the high precision currently seen. These are in sam2rma and used to create the RMA file from the DIAMOND or minimap2 alignments:

-top, --topPercent [number]          Top percent. Default value: 10.0.

Test value of 5.0

-supp, --minSupportPercent [number]   Min support as percent of assigned reads (0==off). Default value: 0.05.
Change to 0.01?

Test 0 or 0.01.

closed time in a day

dportik

issue comment PacificBiosciences/pb-metagenomics-tools

Testing RMA parameters

Changed defaults for --minSupportPercent to 0.01.

dportik

comment created time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Oh, never mind. I see that it is labeled in bioconda as 1.0.0, but the binary reports version 1.2.0.

gevro

comment created time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Just confirming, 1.0.0 on bioconda now is the same as the release version you linked to in github? (i.e. 1.2.0 is just a labeling issue, and not a different version?) Thanks.

gevro

comment created time in a day

issue comment PacificBiosciences/pbbioconda

Lima terminate called without an active exception

Thanks for replying, @armintoepfer. I'm having a problem extracting, e.g., the first 1000 reads from the .ccs.bam file. It seems that something in the BAM file changes if I use samtools view sample.ccs.bam | head -n 1000 | samtools -S -b > test.ccs.bam. When I run lima on test.ccs.bam, it complains that it can't recognise the read type.

I tried the same samtools command on the subreads.bam file before CCS analysis, and ccs can't recognise test.subreads.bam either. Could you please give me some hints on how to generate a small file to share? Thanks!

zmz1988

comment created time in a day

issue closed PacificBiosciences/pbbioconda

zmwfilter very slow

Hi, Filtering 14,000 ZMW holes from an 800 Gb subreads BAM file using 'zmwfilter' is taking > 3 hours. That seems too slow. Is there any speed improvement, or any other way to do this?

closed time in a day

gevro

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

It's in bioconda. Still as version 1.0.0, though the internal version is 1.2.0. Will correct that at some point.

gevro

comment created time in a day

started PacificBiosciences/zmwfilter

started time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Ah, yes, correct. My mistake.

gevro

comment created time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

samtools index on an unaligned BAM always results in an empty BAI.
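
An unaligned BAM typically declares no reference sequences, so a coordinate index has nothing to cover. One way to tell the two cases apart is to look for `@SQ` lines in the header. The sketch below assumes the header text arrives on standard input (for example from `samtools view -H`), and `has_sq_lines` is a hypothetical name.

```shell
# has_sq_lines (hypothetical helper): succeed if the SAM header on stdin
# declares at least one reference sequence (@SQ line).
has_sq_lines() {
  grep -q '^@SQ'
}
```

For instance, `samtools view -H output.bam | has_sq_lines || echo "no @SQ lines: expect an empty BAI"` (with `output.bam` as a placeholder) would flag the situation before running `samtools index`.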

gevro

comment created time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Actually, there's some issue I think. I'm trying to 'samtools index' the resulting BAM, but it isn't working. It outputs a 16 byte BAM index. Maybe something is malformed in the BAM output.

gevro

comment created time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Works great! Please let me know once on bioconda.

gevro

comment created time in a day

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

I've rebuilt with the latest dependencies. 20k ZMWs are now being filtered in 45s on my machine. Please let me know if that works and I'll upload to bioconda:

https://github.com/PacificBiosciences/zmwfilter/releases/tag/v1.0.0

gevro

comment created time in a day

release PacificBiosciences/zmwfilter

v1.0.0

released time in a day

started PacificBiosciences/FALCON

started time in 2 days

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Some more data, and a workaround I found in the meantime:

  1. Further tests show that extraction time is linear in the number of holes, at ~0.2 sec/hole on the fastest disk/machine. That is still too slow for large numbers of holes: 100,000 holes would take about 5.5 hours.

  2. A single pass through the full subreads.bam using the awk script below is faster than zmwfilter when extracting more than about 20,000 holes. The script takes the same amount of time regardless of how many holes are extracted, so it is slower than zmwfilter for fewer than about 20,000 holes.

samtools view -@8 -h subreads.bam | awk 'BEGIN{idx=1;matched=0}{if(FNR==NR){holes[NR]=$1}else if($0 ~ /^@/){print $0}else{split($1,hole,"/");if(hole[2]==holes[idx]){print $0;matched=1}else{if(matched==1){idx++;matched=0;if(hole[2]==holes[idx]){print $0;matched=1}}}}}' holes.txt - | samtools view -b - > output.bam

#holes.txt = numerically sorted hole numbers, one per line
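
The figures in point 1 follow from simple arithmetic, sketched here. `estimate_hours` is a hypothetical helper, and the 0.2 sec/hole rate is the measurement quoted above, not a general property of zmwfilter.

```shell
# estimate_hours (hypothetical helper): hours needed to extract N holes
# at a fixed cost per hole. $1 = number of holes, $2 = seconds per hole.
estimate_hours() {
  LC_ALL=C awk -v n="$1" -v s="$2" 'BEGIN { printf "%.2f\n", n * s / 3600 }'
}

estimate_hours 100000 0.2   # prints 5.56
estimate_hours 20000 0.2    # prints 1.11
```

If the break-even near 20,000 holes in point 2 holds at the same rate, the second figure also approximates the duration of one full awk pass. That is an inference from the numbers above, not a reported measurement.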
gevro

comment created time in 2 days

issue comment PacificBiosciences/pbbioconda

zmwfilter very slow

Thanks. This is a bit urgent for me, because I can't complete my analysis without this. So I appreciate any help.

gevro

comment created time in 2 days

issue comment PacificBiosciences/pbbioconda

pbindexdump - uses a lot of memory

FYI: Bug doesn't seem to happen when doing --format cpp. Issue is probably in the json conversion.

gevro

comment created time in 2 days

issue closed PacificBiosciences/pbbioconda

pbindexdump - uses a lot of memory

pbindexdump uses a large amount of RAM. It crashes out of memory with 64 G RAM on a 2.7Gb pbi file. Seems like it should not take that much RAM to dump an index file.

It also sometimes crashes with "Bus error".

Also, it runs very slowly.

Perhaps this is the cause of zmwfilter being slow, if it is using the same underlying pbindexdump code

closed time in 2 days

gevro

issue comment PacificBiosciences/pbbioconda

pbindexdump - uses a lot of memory

pbindexdump is an internal tool and not used in production pipelines. We will have a look, but this is low priority. Will ping here if we find something

gevro

comment created time in 2 days

issue opened PacificBiosciences/pbbioconda

pbindexdump - uses a lot of memory

pbindexdump uses a large amount of RAM. It crashes out of memory with 64 G RAM on a 2.7Gb pbi file. Seems like it should not take that much RAM to dump an index file.

Also, it runs very slowly.

Perhaps this is the cause of zmwfilter being slow, if it is using the same underlying pbindexdump code

created time in 2 days
