This is how I use R, some command line utilities, and imagemagick to get IODP core photographs in a reduced size into my scientific figures!
- R
- bash or another command line shell
- imagemagick
- the develpment R package ggtextures
library(stringr)
library(dplyr)
library(glue)
library(readxl)
library(readr)
## library(ggtextures)
devtools::load_all("~/Downloads/ggtextures")
Loading ggtextures
IODP info is available on the IODP website
- Core data from all ODP, DSDP and IODP expedition 301–312 data are available on the janus database
- For older core images (before leg 198) we need the url: http://www-odp.tamu.edu/publications/###_IR/VOLUME/CORES/IMAGES to get PDFs of the whole core, sections laid out next to each other.
- There is also a newer file format called MrDIT, but this was too hard to get to work.
- Core data for all newer IODP expeditions from 317–present are available in the LORE database.
In this example setup we’re looking at Expedition 208, Site 1264. After some digging around on the janus database, we find that the JPG images are hosted with a regular url: http://www-odp.tamu.edu/publications/208_IR/VOLUME/CORES/JPEG/1264B/1264b_026h/1264b_026h_06.jpg
These are the ones I’ll be using.
This function converts leg, site, hole, etc. data to create the URL to the image.
create_image_url <- function(leg, site, hole, core, type = "H", section, image_type = "JPEG", extension = ".jpg") {
base_url <- "http://www-odp.tamu.edu/publications/"
core_padded <- str_pad(core, 3, pad = 0)
type_l <- str_to_lower(type)
sec <- section %>% str_to_lower() %>% str_pad(2, pad=0)
glue("{base_url}{leg}_IR/VOLUME/CORES/{image_type}/{site}{hole}/{site}{str_to_lower(hole)}_{core_padded}{type_l}/{site}{str_to_lower(hole)}_{core_padded}{type_l}_{sec}{extension}")
}
# some tests
create_image_url(leg = 208, site = 1264, hole = "B", core = 30, type = "H", section = "CC")
create_image_url(leg = 208, site = 1264, hole = "C", core = 1, type = "H", section = 2)
create_image_url(leg = 208, site = 1264, hole = "B", core = 26, type = "H", section = 6)
http://www-odp.tamu.edu/publications/208_IR/VOLUME/CORES/JPEG/1264B/1264b_030h/1264b_030h_cc.jpg http://www-odp.tamu.edu/publications/208_IR/VOLUME/CORES/JPEG/1264C/1264c_001h/1264c_001h_02.jpg http://www-odp.tamu.edu/publications/208_IR/VOLUME/CORES/JPEG/1264B/1264b_026h/1264b_026h_06.jpg
This section is probably something that you need to tweak some more for your use-case.
In the end, you need the IODP core info up to the section level. Each sections hould have a top_depth
(in mbsf, mcd, or armcd) and bot_depth
.
# this is the affine table, with the recovery and adjusted depths of the cores
md <- read_excel("~/SurfDrive/PhD/sites/site_1264_walvisridge/208 1264 Composite.xlsx",
sheet = "Affine Table", range = "A1:S31") %>% # hole A
# I combine the info for the three holes
bind_rows(read_excel("~/SurfDrive/PhD/sites/site_1264_walvisridge/208 1264 Composite.xlsx",
sheet = "Affine Table", range = "A33:S63")) %>% # hole B
bind_rows(read_excel("~/SurfDrive/PhD/sites/site_1264_walvisridge/208 1264 Composite.xlsx",
sheet = "Affine Table", range = "A65:S66")) # hole C
# now I need the recovered depth per section from somewhere
sd <- read_tsv("~/SurfDrive/PhD/programming/iodp_read_info/data/208-1264/sections/208_1264_sections.txt",
skip = 2) %>%
rename_all(str_to_lower)
# combine the metadata and clean up
cm <- left_join(sd, md, by = c("expedition" = "Leg",
"site" = "Site",
"hole" = "H",
"core" = "Cor",
"core_type" = "T")) %>%
mutate(armcd_top = `New Cum Offset` + top_depth,
armcd_bot = `New Cum Offset` + bottom_depth) %>%
select(expedition:core_type, section_label, top_depth, rev_length, armcd_top, armcd_bot) %>%
# here we create the image url based on the metadata
mutate(url = create_image_url(leg=expedition,
site=site,
hole=hole,
core=core,
type=core_type,
section=section_label,
image_type="JPEG"))
We create a directory where we save all the pretty large jpg images. For walvis ridge, this takes quite a while: 4 minutes and 30 seconds to download 455 files.
This is the part where you have to have bash or another shell with the cat, awk, grep, and wget utilities installed.
dir <- "1264_corepics"
dir.create(dir)
# save the clean metadata to a tsv so we can use awk to work with it
write_tsv(cm, "cm.txt")
# download the corepics
system("cat cm.txt | awk '{ print $11 }' | grep '^http' | wget -P 1264_corepics -i-")
FINISHED –2020-11-09 12:02:14– Total wall clock time: 5m 48s Downloaded: 455 files, 1.2G in 4m 30s (4.57 MB/s)
If you are working with a newer core, good on you! It’s much easier to get those data in shape!
- Go to http://web.iodp.tamu.edu/LORE/ in a modern browser
- In the left menu, click on “images”
- click “Core Sections (LSIMG)”
- click “standard”
- in the filtering view, filter by your site/core/section as desired
- optional: click an alternate depth scale
- click “View data”
- click “Download tabular data” to save the csv with the metadata
- click “Batch download linked files” to save the jpg images
- Choose “cropped images”
We can merge the file paths with the metadata by their image link:
cp <- read_csv("dat/core_site_pictures.csv")
# list all the downloaded jpg files
fl <- tibble(file = list.files("dat/corepics", pattern=".jpg")) %>%
# I'm assuming that these ID's are unique here
mutate(id = str_extract(file, "[0-9]+.jpg$") %>% str_replace(".jpg", "") %>% parse_double())
cp <- cp %>%
# merge the file info (section depth etc.) with the image name
left_join(fl, by = c("Cropped image (JPG) link" = "id")) %>%
mutate(file = str_replace(file, ".jpg", ".png"))
The image at the top is the result of reading in and combining data for IODP site U1411 using this method! Don’t forget the next cropping step though, or your computer might crash.
The images are much too large to load into memory all at once, so we downsize and crop them all.
The target width at 300 dpi if we want to plot it at half a cm:
300 / 2.54 * .5 # 300 dpi in cm, for half a cm
[1] 59.05512
Use magick’s mogrify to batch resize and crop the images. Play around with the magick commandline options to get the cropping correct.
system("mogrify -resize 60 -crop -17-54 -format png 1264_corepics/*.jpg")
cm <- cm %>%
mutate(file = paste0("1264_corepics/", basename(url) %>% str_replace(".jpg", ".png"))) %>%
# some of the sections don't have images, or our download failed perhaps?
mutate(file_exists = !is.na(file.info(file)$size),
file = ifelse(file_exists, file, NA_character_))
cm %>%
ggplot(aes(y = armcd_bot, xmin = 0L, xmax = 1L, ymin = armcd_bot, ymax = armcd_top, image = file)) +
geom_rect(alpha = .2) +
## coord_cartesian(ylim = c(17, 3)) +
geom_textured_rect(colour = NA, nrow = 1, ncol = 1, img_width = unit(1, "null"), img_height = unit(1, "null"), interpolate = FALSE) +
facet_grid(cols = vars(hole)) +
scale_y_reverse() +
labs(title="IODP Leg 208 Site 1264", subtitle = "core photographs", caption = "created by Ilja Kocken") +
theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank(),
panel.grid = element_blank()) -> pl
pl
To do this, I recommend the patchwork package. Make sure to set the exact same y-axis for both plots.
cm %>%
ggplot(aes(y = armcd_bot, xmin = 0L, xmax = 1L, ymin = armcd_bot, ymax = armcd_top, image = file)) +
geom_rect(alpha = .2) +
coord_cartesian(ylim = c(17, 3)) +
geom_textured_rect(colour = NA, nrow = 1, ncol = 1, img_width = unit(1, "null"), img_height = unit(1, "null"), interpolate = FALSE) +
facet_grid(cols = vars(hole)) +
scale_y_reverse() +
theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank(),
panel.grid = element_blank()) -> pl
cm %>%
mutate(D47 = rnorm(n(), mean = 0.76, sd = 0.2)) %>%
ggplot(aes(x = D47, y = armcd_bot)) +
scale_y_reverse() +
coord_cartesian(ylim = c(17, 3)) +
theme(axis.title.y = element_blank(), axis.text.y = element_blank()) +
geom_point() -> pl2
library(patchwork)
pc <- (pl + pl2) + plot_layout(widths = c(.1, .9))
pc
If you need to put the depth on the x-axis, the images need to be rotated. Make sure you rotate them in the correct way! If you want to put greater depths to the right, rotate the images 90° anti-clockwise, like so:
system("mogrify -rotate -90 1264_corepics/*.png")
If you want to put the deeper sediments to the left, so that time progresses from left to right, rotate them by +90°.
system("mogrify -rotate 90 1264_corepics/*.png")
cm %>%
ggplot(aes(x = armcd_bot, ymin = 0L, ymax = 1L, xmin = armcd_bot, xmax = armcd_top, image = file)) +
geom_rect(alpha = .2) +
coord_cartesian(xlim = c(17, 3)) +
geom_textured_rect(colour = NA, nrow = 1, ncol = 1, img_width = unit(1, "null"), img_height = unit(1, "null"), interpolate = FALSE) +
facet_grid(rows = vars(hole)) +
## scale_x_reverse() +
theme(axis.title.y = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank(),
panel.grid = element_blank()) -> pl
cm %>%
mutate(D47 = rnorm(n(), mean = 0.76, sd = 0.2)) %>%
ggplot(aes(y = D47, x = armcd_bot)) +
scale_y_reverse() +
coord_cartesian(xlim = c(17, 3)) +
theme(axis.title.x = element_blank(), axis.text.x = element_blank()) +
geom_point() -> pl2
pc <- (pl2/pl) + plot_layout(heights = c(.9, .1))
pc