getTombo cannot deal with special characters in collection column #148

Open
@ggrittz

Description

The collectionCode.new column can contain regex special characters (for example [ or .), which getTombo does not handle.

This causes a regex error in lines 40-41:

tmb2 <- tmb1[pos > 1]
col2 <- paste0("^", col1[pos > 1], "+")
tmb2 <- mapply(function(x, y) { # error occurs in this mapply
  gsub(y, "", x, perl = TRUE)
}, tmb2, col2)
Error in gsub(y, "", x, perl = TRUE) : 
  invalid regular expression '^ Collected for Botany , U.N.C. [University of North Caroli+'
In addition: Warning message:
In gsub(y, "", x, perl = TRUE) : PCRE pattern compilation error
	'missing terminating ] for character class'
	at ''
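
For anyone who wants to reproduce this outside the package, here is a minimal sketch. The collection code and record string are made-up values, not taken from real data; the only thing that matters is the unbalanced "[" ending up inside the pattern.

```r
# Hypothetical collection code containing an unbalanced "[".
code <- "U.N.C. [Herb"

# Same construction as in getTombo: the code is pasted into a pattern unescaped.
pattern <- paste0("^", code, "+")

# Capture the error object instead of stopping; the accompanying PCRE
# warning is suppressed so only the error is kept.
result <- tryCatch(
  suppressWarnings(gsub(pattern, "", "U.N.C. [Herb1234", perl = TRUE)),
  error = function(e) e
)

# TRUE when gsub() rejected the pattern.
inherits(result, "error")
```

PCRE reads "[Herb+" as the start of a character class that is never closed, which is the "missing terminating ]" complaint in the warning above.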

This can be fixed by escaping special characters right before doing the mapply:

pos <- as.double(regexpr("\\d", tmb1, perl = TRUE))
pos[is.na(pos)] <- 0
tmb2 <- tmb1[pos > 1]

# Escape special characters in col1
col1_escaped <- gsub("([][{}()*+?.\\^$|])", "\\\\\\1", col1[pos > 1])
col2 <- paste0("^", col1_escaped, "+")

tmb2 <- mapply(function(x, y) {
  gsub(y, "", x, perl = TRUE)
}, tmb2, col2)
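
The same idea can be checked in isolation with a small helper. This is only a sketch: escape_regex is a hypothetical name (it does not exist in the package), the input vectors are made-up values, and the character class is written in PCRE syntax with perl = TRUE so that a literal backslash is explicitly included as well.

```r
# Hypothetical helper: backslash-escape regex metacharacters so that the
# string is matched literally when pasted into a pattern.
escape_regex <- function(x) {
  gsub("([\\[\\]{}()*+?.\\\\^$|])", "\\\\\\1", x, perl = TRUE)
}

# Made-up collection codes and record strings for illustration.
codes  <- c("U.N.C. [Herb", "RB")
tombos <- c("U.N.C. [Herb1234", "RB5678")

# Build the patterns as in the proposed fix, then strip the leading code.
patterns <- paste0("^", escape_regex(codes), "+")
cleaned <- mapply(function(x, y) {
  gsub(y, "", x, perl = TRUE)
}, tombos, patterns, USE.NAMES = FALSE)

cleaned
```

With these inputs the leading collection code is removed from each string and no regex error is raised, since "[" and "." are now matched literally.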
