3.0.0 gapindex::get_data function triggers space quota issues #63

Open
@EmilyMarkowitz-NOAA

Issue Description

TLDR: The new 3.0.0 gapindex::get_data function triggers space quota issues. Tagging in @sowasser for awareness.

Steps to Reproduce

When I run the species complex example on gapindex, everything runs great! (yay!)

Image

However, in running the download data script for the Bering Sea data report, pulling through the GAP_PRODUCTS schema, I experienced this issue:

Image

And when we try running this through a personal schema (here, @sowasser's personal Oracle schema), the same error occurs much sooner:

Image

So I tried running the get_data function line by line and got Oracle error 1536 (ORA-01536), which indicates a space quota issue. As described in this StackOverflow post, the error is due to:

"Cause: The space quota for the segment owner in the tablespace has been exhausted and the operation attempted the creation of a new segment extent in the tablespace."
"Action: Either drop unnecessary objects in the tablespace to reclaim space or have a privileged user increase the quota on this tablespace for the segment owner."

This explains why the GAP_PRODUCTS schema and a personal schema trigger the space error at different points in the code: the GAP_PRODUCTS schema has a larger space quota!

Image
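To see how close a schema is to its quota, a minimal sketch along these lines might help. It assumes that `channel` is an RODBC handle (I believe that is what `gapindex::get_connected()` returns, but I have not confirmed it) and that the account can read Oracle's `USER_TS_QUOTAS` view. The helper `summarise_quota()` and the numbers in `example_quotas` are made up for illustration and are not part of gapindex.

```r
# Summarise rows shaped like Oracle's USER_TS_QUOTAS view.
# In that view, MAX_BYTES is -1 when the quota is unlimited.
summarise_quota <- function(quotas) {
  unlimited <- quotas$MAX_BYTES < 0
  data.frame(
    TABLESPACE_NAME = quotas$TABLESPACE_NAME,
    USED_MB = round(quotas$BYTES / 1024^2, 1),
    QUOTA_MB = ifelse(unlimited, Inf, round(quotas$MAX_BYTES / 1024^2, 1)),
    PCT_USED = ifelse(unlimited, NA_real_,
                      round(100 * quotas$BYTES / quotas$MAX_BYTES, 1)),
    stringsAsFactors = FALSE
  )
}

# Made-up numbers, for illustration only
example_quotas <- data.frame(
  TABLESPACE_NAME = c("USERS", "BIG_TS"),
  BYTES = c(90 * 1024^2, 500 * 1024^2),
  MAX_BYTES = c(100 * 1024^2, -1),
  stringsAsFactors = FALSE
)
quota_report <- summarise_quota(example_quotas)
print(quota_report)

# Against a live connection (assumption: `channel` is an RODBC handle):
# quotas <- RODBC::sqlQuery(
#   channel,
#   "SELECT tablespace_name, bytes, max_bytes FROM user_ts_quotas")
# summarise_quota(quotas)
```

Running the commented-out query under both GAP_PRODUCTS and a personal schema should show whether the difference in when the error appears really comes down to quota size.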

Steps to Reproduce

Here is the excerpt of code from the Bering Sea data report download data script:

# Load libraries and connect to Oracle
library(gapindex)
library(magrittr) # provides the %>% pipe used below
channel <- gapindex::get_connected()

# Species Covered
googledrive::drive_download(file = googledrive::as_id("10Pn3fWkB-Jjcsz4iG7UlR-LXbIVYofy1yHhKkYZhv2M"),
                            type = "csv",
                            overwrite = TRUE,
                            path = paste0(dir_out_rawdata, "/species-local-names"))

# identify which species complexes you need
report_spp <- readr::read_csv(file = paste0(dir_out_rawdata, "/species-local-names.csv"), 
                              skip = 1, 
                              show_col_types = FALSE) %>%  
  dplyr::filter(grepl(x = species_code, pattern = "c(", fixed = TRUE)) 

# Build a lookup of GROUP_CODE (complex name) to SPECIES_CODE
temp1 <- data.frame()
for (i in seq_len(nrow(report_spp))) {
  # species_code holds an R expression such as "c(...)"; evaluate it once
  temp2 <- eval(expr = parse(text = report_spp$species_code[i]))
  temp1 <- dplyr::bind_rows(temp1, 
                            dplyr::bind_cols(GROUP_CODE = report_spp$print_name[i], 
                                             SPECIES_CODE = temp2))
}

## Pull data. 
production_data <- gapindex::get_data(
  year_set = 1982:maxyr,
  survey_set = "EBS",
  spp_codes = temp1,
  pull_lengths = TRUE, 
  haul_type = 3, 
  abundance_haul = "Y", 
  taxonomic_source = "GAP_PRODUCTS.TAXONOMIC_CLASSIFICATION", # same thing happens with "RACEBASE.SPECIES"
  channel = channel)
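For anyone reproducing this without access to the Google Drive file, here is a rough sketch of what `temp1` (the `spp_codes` argument) ends up looking like. The `report_spp` rows and species codes below are placeholders, not the real report list, and `parse_codes()` is a hypothetical helper that pulls the integers out of each "c(...)" string instead of using `eval(parse())`.

```r
# Hypothetical stand-in for the filtered species-local-names table
report_spp <- data.frame(
  print_name = c("complex A", "complex B"),
  species_code = c("c(10261, 10262)", "c(21740, 21741, 21742)"),
  stringsAsFactors = FALSE
)

# Extract the integers inside one "c(...)" string.
# Only handles plain comma-separated codes, not ranges such as 1:5.
parse_codes <- function(x) {
  as.integer(regmatches(x, gregexpr("[0-9]+", x))[[1]])
}

# One row per species, labelled with the complex it belongs to
spp_lookup <- do.call(rbind, lapply(seq_len(nrow(report_spp)), function(i) {
  data.frame(GROUP_CODE = report_spp$print_name[i],
             SPECIES_CODE = parse_codes(report_spp$species_code[i]),
             stringsAsFactors = FALSE)
}))
print(spp_lookup)
```

The result has one row per species code, with `GROUP_CODE` repeated for every member of a complex, which is the same shape the loop above produces.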

Suggested solution

I'm not sure exactly what the solution is, but maybe there is a way to increase the space when we run the get_data function? This code worked in the last version of gapindex, and the function doesn't look that different, so I'm not sure what changed.

Thanks for your help, @zoyafuso-NOAA!

Labels

bug (Something isn't working)
