Problem - Genomic Region Viewing Window #33

Open
BolisLab opened this issue Apr 5, 2024 · 3 comments

Comments

@BolisLab

BolisLab commented Apr 5, 2024

Hi, thanks in advance for the help. I am plotting some bigWig files and can successfully create a track plot when the viewing window is fairly narrow:

> loci_names
         SYMBOL                      LOCI
ANKRD11 ANKRD11   chr16:89007630-89650561

**Indeed, the code works:**

> t = track_extract(colData = sample_bigWigs_list[[j]], loci = loci_names$LOCI[i])
Parsing loci..
    Queried region: chr16:89007630-89650561 [642931 bps]
Querying UCSC genome browser for gene model and cytoband..
Extracting gene models from UCSC:
    chromosome: chr16
    build: hg38
    query: mysql --user genome --host genome-mysql.soe.ucsc.edu -NAD hg38 -e 'select chrom, txStart, txEnd, strand, name, name2, exonStarts, exonEnds from refGene WHERE chrom ="chr16"'
Extracting cytobands from UCSC:
    chromosome: chr16
    build: hg38
    query: mysql --user genome --host genome-mysql.soe.ucsc.edu -NAD hg38 -e 'select chrom, chromStart, chromEnd, name, gieStain from cytoBand WHERE chrom ="chr16"'
Generating windows [10 bp window size]
Extracting signals
    Processing iPSC_1_MNCdLS1 ..
    Processing iPSC_2_G12 ..
    Processing iPSC_3_MNFa1 ..
    Processing iPSC_4_MNMo1 ..
OK!

When I try to widen the viewing window:

> loci_names
         SYMBOL                      LOCI
ANKRD11 ANKRD11   chr16:88667630-89990561

I get the following error:

t = track_extract(colData = sample_bigWigs_list[[j]], loci = loci_names$LOCI[i])
Parsing loci..
    Queried region: chr16:88667630-89990561 [1322931 bps]
Querying UCSC genome browser for gene model and cytoband..
Extracting gene models from UCSC:
    chromosome: chr16
    build: hg38
    query: mysql --user genome --host genome-mysql.soe.ucsc.edu -NAD hg38 -e 'select chrom, txStart, txEnd, strand, name, name2, exonStarts, exonEnds from refGene WHERE chrom ="chr16"'
Extracting cytobands from UCSC:
    chromosome: chr16
    build: hg38
    query: mysql --user genome --host genome-mysql.soe.ucsc.edu -NAD hg38 -e 'select chrom, chromStart, chromEnd, name, gieStain from cytoBand WHERE chrom ="chr16"'
Generating windows [10 bp window size]
Extracting signals
    Processing iPSC_1_MNCdLS1 ..
invalid unsigned integer: "8.9e+07"
    Processing iPSC_2_G12 ..
invalid unsigned integer: "8.9e+07"
    Processing iPSC_3_MNFa1 ..
invalid unsigned integer: "8.9e+07"
    Processing iPSC_4_MNMo1 ..
invalid unsigned integer: "8.9e+07"
Error: Can't assign 1 names to a 0-column data.table

I have checked that the selected regions do not exceed the chromosome size, so that is not the cause.
What could be causing this, and can it be resolved?

@PoisonAlien
Owner

Hi,

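Thanks for the issue. The problem here is that R switches to scientific notation when it formats large, round numbers as text (here 89000000 became "8.9e+07"), and that string is not a valid integer coordinate. You can raise the threshold for this switch with options(scipen = ...) before running the trackplot commands.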

options(scipen = 15L)
t = track_extract(colData = sample_bigWigs_list[[j]], loci = loci_names$LOCI[i])

I hope this helps.
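The effect of `scipen` can be checked in isolation. The sketch below uses base R's `format()` as a stand-in for the number-to-text conversion; how trackplot converts coordinates internally is not shown in this thread, so that part is an assumption. It illustrates why only the wider window failed: with 10 bp steps starting at 88667630, one window boundary lands exactly on 89000000, which is shorter in scientific notation than in fixed notation.

```r
# Save the current options and start from R's default (scipen = 0).
old <- options(scipen = 0)

# A round coordinate inside the wider window: scientific form is narrower.
default_round <- format(89000000)

# The window start itself: fixed form is narrower, so it stays as-is.
default_start <- format(88667630)

# With the suggested setting, fixed notation is preferred.
options(scipen = 15L)
scipen_round <- format(89000000)

# Restore whatever was set before.
options(old)

# Alternative that does not depend on the global option.
explicit_fixed <- format(89000000, scientific = FALSE)

print(c(default_round, default_start, scipen_round, explicit_fixed))
```

Setting `options(scipen = 15L)` once at the top of a script affects every later conversion in the session, which is why it fixes the call without any change to the `track_extract()` arguments.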

@BolisLab
Author

BolisLab commented Apr 8, 2024 via email

@BolisLab
Author

BolisLab commented Apr 8, 2024

Hi,

Thank you very much, with options(scipen = 15L) it now works.

I have one more small question.
With this setting, the following step now always completes without errors:

t = track_extract(colData = sample_bigWigs_list[[j]], loci = loci_names$LOCI[i])

However, for some genomic regions (not all of them), "track_plot" then fails with a different error.
Below is the code for a region that does not work:

> # Extract bigWig signal for a loci of interest
> options(scipen = 15L)
> t = track_extract(colData = sample_bigWigs_list[[j]], loci = loci_names$LOCI[i])
Parsing loci..
    Queried region: chrX:70801691-74073101 [3271410 bps]
Querying UCSC genome browser for gene model and cytoband..
Extracting gene models from UCSC:
    chromosome: chrX
    build: hg38
    query: mysql --user genome --host genome-mysql.soe.ucsc.edu -NAD hg38 -e 'select chrom, txStart, txEnd, strand, name, name2, exonStarts, exonEnds from refGene WHERE chrom ="chrX"'
Extracting cytobands from UCSC:
    chromosome: chrX
    build: hg38
    query: mysql --user genome --host genome-mysql.soe.ucsc.edu -NAD hg38 -e 'select chrom, chromStart, chromEnd, name, gieStain from cytoBand WHERE chrom ="chrX"'
Generating windows [10 bp window size]
Extracting signals
    Processing iPSC_1_MNCdLS1 ..
    Processing iPSC_2_G12 ..
    Processing iPSC_3_MNFa1 ..
    Processing iPSC_4_MNMo1 ..
OK!
> 
> # Extract peak names
> bw_sample_names <- sample_bigWigs_list[[j]]$bw_sample_names
> peaks_names <- sub("^iPSC_([0-9]+_[^_]+)_.*$", "\\1", bw_sample_names)
> peaks_names <- paste0(peaks_names, "_peaks")
> 
> # Extract track names
> bw_sample_names <- sample_bigWigs_list[[j]]$bw_sample_names
> track_names <- sub("^iPSC_([0-9]+_[^_]+)_.*$", "\\1", bw_sample_names)
> track_names <- paste0(track_names, "_track")
> 
> # Set a larger top margin for the plot
> par(oma = c(2, 2, 2, 2), cex.main = 1) 
> 
> # TrackPlot Execution:
> trackplot::track_plot(summary_list = t, col = track_cols, show_ideogram = TRUE, y_min = 0, y_max = 250, gene_fsize = 0.6, left_mar = 14, track_names = track_names, track_names_to_left = T, peaks = sample_NarrowPeak_list[[j]], peaks_track_names = peaks_names, layout_ord = c("c", "g", "p", "b", "h"), gene_track_height = 8, peaks_track_height = 5)
Collapsing transcripts..
Error in if (attr(txtbl, "strand") == "+") { : 
  the condition has length > 1

In particular, the genomic region referred to is the following:

> loci_names
      SYMBOL                   LOCI
HDAC8  HDAC8 chrX:70801691-74073101

What could this be due to?

Thank you very much in advance.
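For anyone reading this with the same message: since R 4.2.0, `if()` raises an error when its condition has more than one element (earlier versions only warned). The sketch below reproduces that class of error with a hypothetical `strand` vector holding two values. It does not claim to show what `txtbl` contains inside trackplot, which this thread does not reveal.

```r
# Hypothetical attribute value with more than one element.
strand <- c("+", "-")

# Capture the condition message instead of stopping the script.
# R >= 4.2.0 signals an error; older versions signal a warning.
msg <- tryCatch(
  {
    if (strand == "+") "forward" else "reverse"
  },
  error = function(e) conditionMessage(e),
  warning = function(w) conditionMessage(w)
)
print(msg)

# A defensive pattern: reduce to distinct values and test the length first.
strand_values <- unique(strand)
label <- if (length(strand_values) == 1 && strand_values == "+") {
  "forward"
} else if (length(strand_values) == 1) {
  "reverse"
} else {
  "mixed"
}
print(label)
```

If the failing region differs from the working ones in this way, inspecting the length of the value being tested would be a reasonable first diagnostic step.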
