Description
Sometimes we talk about GMT sessions in issues/PRs. It's important to know that there are two different kinds of sessions: (1) the GMT CLI session; (2) the GMT C API session. Here are some technical notes on the two kinds of GMT sessions to help understand how PyGMT works and where its potential flaws are.
The GMT CLI session
Here is a simple GMT CLI script:
gmt begin map -V
gmt basemap -R0/10/0/10 -JX10c -Baf
gmt end show
The gmt begin command creates the so-called GMT CLI session directory under the ~/.gmt/sessions directory. The session directory is named like ~/.gmt/sessions/gmt_session.XXXXX, in which XXXXX is the parent process ID (PPID) by default, but it can be changed via the environment variable GMT_SESSION_NAME. Subsequent GMT commands then read/write information from/to files in this directory. This is how different GMT module calls communicate in modern mode. See https://docs.generic-mapping-tools.org/dev/begin.html#note-on-unix-shells for the official explanation.
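As a rough illustration of the naming rule described above, here is a minimal sketch that predicts the session directory path. The helper name expected_session_dir is hypothetical, and the logic only mirrors the description in this note (process ID by default, GMT_SESSION_NAME as an override). It does not call GMT and is not how GMT itself computes the path.

```python
import os
from pathlib import Path


def expected_session_dir(ppid, env=None):
    """Guess the GMT CLI session directory from the rule described above.

    Hypothetical helper: uses GMT_SESSION_NAME if it is set in `env`,
    otherwise falls back to the given parent process ID.
    """
    env = os.environ if env is None else env
    name = env.get("GMT_SESSION_NAME", str(ppid))
    # expanduser leaves "~" untouched if the home directory is unknown
    root = Path(os.path.expanduser("~/.gmt/sessions"))
    return root / f"gmt_session.{name}"


# Default naming versus an explicit session name
print(expected_session_dir(12345, env={}))
print(expected_session_dir(12345, env={"GMT_SESSION_NAME": "worker-1"}))
```

Passing the process ID and environment explicitly keeps the sketch independent of which process actually launches GMT.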
In the current PyGMT implementation, when we import the PyGMT library (i.e., import pygmt), we call gmt begin to create the GMT CLI session. This GMT CLI session is used by all subsequent GMT calls. That is usually OK, but with multiprocessing, GMT module calls from different processes access this directory at the same time, which can cause corruption. This explains why PyGMT has trouble with multiprocessing (#217).
So, to make PyGMT support multiprocessing, the solution seems straightforward:
- Set the environment variable GMT_SESSION_NAME to a unique value (we already have the unique_name() function) so that each process has a unique session name
- Do not call gmt begin at import time so that each process has a unique GMT CLI session directory
A proof-of-concept PR is opened at #3392.
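The first step above can be sketched with the standard library alone. This is only an illustration under stated assumptions: set_unique_session_name is a hypothetical helper (PyGMT's unique_name() could play the same role), the name format is made up, and this is not necessarily what #3392 does.

```python
import os
import uuid


def set_unique_session_name(prefix="pygmt-session"):
    """Give the current process its own GMT_SESSION_NAME and return it.

    Hypothetical helper. It must run before any GMT session is created
    in this process, because the session name is decided at that point.
    """
    # Combine the process ID with a random suffix to avoid collisions
    name = f"{prefix}-{os.getpid()}-{uuid.uuid4().hex[:8]}"
    os.environ["GMT_SESSION_NAME"] = name
    return name


print(set_unique_session_name())
```

A function like this could be passed as the initializer argument of multiprocessing.Pool so that each worker process runs it once on start-up, before doing any GMT work.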
The GMT C API session
We also need to know a little about the GMT C API session. Here is a simplified C example that calls GMT C API functions (the original example is https://github.com/GenericMappingTools/gmt/blob/master/src/testapi_modern.c):
#include "gmt.h"
int main () {
    void *API = NULL;
    /* Initialize the GMT session */
    API = GMT_Create_Session ("testapi_modern", 2, GMT_SESSION_RUNMODE, NULL);
    /* Start a modern-mode session, draw a basemap, and show the result */
    GMT_Call_Module (API, "begin", GMT_MODULE_CMD, "apimodern png");
    GMT_Call_Module (API, "basemap", GMT_MODULE_CMD, "-BWESN -Bafg -JM16c -R5/41/9/43");
    GMT_Call_Module (API, "end", GMT_MODULE_CMD, "show");
    /* Destroy the GMT session */
    GMT_Destroy_Session (API);
    return 0;
}
The C API function GMT_Create_Session creates the so-called GMT C API session. This function does a lot of things, including allocating memory for internal variables, deciding the session name, loading gmt.conf settings, and more. The API function GMT_Destroy_Session is responsible for destroying the GMT C API session.
The equivalent PyGMT version should be:
from pygmt.clib import Session
with Session() as lib:
    lib.call_module("begin", "pygmt-session")
    lib.call_module("figure", "apimodern -")
    lib.call_module("basemap", "-BWESN -Bafg -JM16c -R5/41/9/43")
    lib.call_module("psconvert", "-A -Tg")
    lib.call_module("end")
However, in the current implementation, the PyGMT version effectively looks like this:
from pygmt.clib import Session
with Session() as lib:
    lib.call_module("begin", "pygmt-session")
with Session() as lib:
    lib.call_module("figure", "apimodern -")
with Session() as lib:
    lib.call_module("basemap", "-BWESN -Bafg -JM16c -R5/41/9/43")
with Session() as lib:
    lib.call_module("psconvert", "-A -Tg")
with Session() as lib:
    lib.call_module("end")
in which the GMT C API sessions are created/destroyed multiple times. We may see some performance improvement if we can use a single GMT C API session, but we also need to note that a GMT CLI script creates/destroys GMT C API sessions repeatedly as well (once for each gmt command).
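To make the difference between the two patterns concrete, here is a small sketch that only counts session lifecycles. FakeSession is a stand-in written for this note; it does not touch GMT or PyGMT, and the counts say nothing about real performance.

```python
class FakeSession:
    """Stand-in for a GMT C API session that only counts lifecycles."""

    created = 0
    destroyed = 0

    def __enter__(self):
        FakeSession.created += 1  # would be GMT_Create_Session
        return self

    def __exit__(self, *exc):
        FakeSession.destroyed += 1  # would be GMT_Destroy_Session
        return False

    def call_module(self, module, args=""):
        pass  # a real session would run the GMT module here


def run_single(calls):
    """Run all module calls inside one session."""
    with FakeSession() as lib:
        for module, args in calls:
            lib.call_module(module, args)


def run_per_call(calls):
    """Open a new session for every module call."""
    for module, args in calls:
        with FakeSession() as lib:
            lib.call_module(module, args)


def count_sessions(runner, calls):
    """Return how many sessions the given runner creates."""
    FakeSession.created = FakeSession.destroyed = 0
    runner(calls)
    return FakeSession.created


calls = [
    ("begin", "pygmt-session"),
    ("figure", "apimodern -"),
    ("basemap", "-BWESN -Bafg -JM16c -R5/41/9/43"),
    ("psconvert", "-A -Tg"),
    ("end", ""),
]
print(count_sessions(run_single, calls), count_sessions(run_per_call, calls))
```

For the five module calls in the example, the single-session pattern creates one session and the per-call pattern creates five.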
Extra notes
Here are some extra notes:
- The session name is decided in the C API function GMT_Create_Session. So, GMT_SESSION_NAME should be defined before calling GMT_Create_Session.
- Data processing modules can be called in either classic mode or modern mode (i.e., outside or inside a gmt begin session), but some modules behave differently in classic/modern mode. For example, gmt makecpt writes the output to stdout in classic mode but to a hidden CPT file in modern mode. To make things as simple as possible, it's OK to always call gmt begin at the beginning.
- A GMT C API session is required when calling any GMT C API functions. However, a GMT CLI session is only required when calling GMT modules. For example, the following Python script works without a GMT CLI session:
from pygmt.clib import Session
with Session() as lib:
    lib.read_data("@earth_relief_01d_g", kind="grid")