Skip to content

[lldb][headers] Create Python script to fix up framework headers #142051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

chelcassanova
Copy link
Contributor

This commit replaces the shell script that fixes up includes for the LLDB framework with a Python script. This script will also be used when fixing up includes for the LLDBRPC.framework.

@llvmbot
Copy link
Member

llvmbot commented May 29, 2025

@llvm/pr-subscribers-lldb

Author: Chelsea Cassanova (chelcassanova)

Changes

This commit replaces the shell script that fixes up includes for the LLDB framework with a Python script. This script will also be used when fixing up includes for the LLDBRPC.framework.


Full diff: https://github.com/llvm/llvm-project/pull/142051.diff

7 Files Affected:

  • (modified) lldb/cmake/modules/LLDBFramework.cmake (+14-19)
  • (added) lldb/scripts/framework-header-fix.py (+78)
  • (removed) lldb/scripts/framework-header-fix.sh (-17)
  • (added) lldb/test/Shell/Scripts/Inputs/Main/SBAddress.h (+8)
  • (added) lldb/test/Shell/Scripts/Inputs/RPC/RPCSBAddress.h (+9)
  • (added) lldb/test/Shell/Scripts/TestFrameworkFixScript.test (+11)
  • (added) lldb/test/Shell/Scripts/TestRPCFrameworkFixScript.test (+14)
diff --git a/lldb/cmake/modules/LLDBFramework.cmake b/lldb/cmake/modules/LLDBFramework.cmake
index 471aeaaad3c0d..b0eae27eb29e9 100644
--- a/lldb/cmake/modules/LLDBFramework.cmake
+++ b/lldb/cmake/modules/LLDBFramework.cmake
@@ -68,24 +68,16 @@ if(NOT APPLE_EMBEDDED)
   )
 endif()
 
-# At configuration time, collect headers for the framework bundle and copy them
-# into a staging directory. Later we can copy over the entire folder.
-file(GLOB public_headers ${LLDB_SOURCE_DIR}/include/lldb/API/*.h)
-set(generated_public_headers ${LLDB_OBJ_DIR}/include/lldb/API/SBLanguages.h)
-file(GLOB root_public_headers ${LLDB_SOURCE_DIR}/include/lldb/lldb-*.h)
-file(GLOB root_private_headers ${LLDB_SOURCE_DIR}/include/lldb/lldb-private*.h)
-list(REMOVE_ITEM root_public_headers ${root_private_headers})
-
 find_program(unifdef_EXECUTABLE unifdef)
 
-set(lldb_header_staging ${CMAKE_CURRENT_BINARY_DIR}/FrameworkHeaders)
-foreach(header
-    ${public_headers}
-    ${generated_public_headers}
-    ${root_public_headers})
+# All necessary header files should be staged in the include directory in the build directory,
+# so just copy the files from there into the framework's staging directory.
+set(lldb_build_dir_header_staging ${CMAKE_BINARY_DIR}/include/lldb)
+set(lldb_framework_header_staging ${CMAKE_CURRENT_BINARY_DIR}/FrameworkHeaders)
+foreach(header ${lldb_build_dir_header_staging})
 
   get_filename_component(basename ${header} NAME)
-  set(staged_header ${lldb_header_staging}/${basename})
+  set(staged_header ${lldb_framework_header_staging}/${basename})
 
   if(unifdef_EXECUTABLE)
     # unifdef returns 0 when the file is unchanged and 1 if something was changed.
@@ -107,14 +99,17 @@ endforeach()
 # Wrap output in a target, so lldb-framework can depend on it.
 add_custom_target(liblldb-resource-headers DEPENDS lldb-sbapi-dwarf-enums ${lldb_staged_headers})
 set_target_properties(liblldb-resource-headers PROPERTIES FOLDER "LLDB/Resources")
+
+# We're taking the header files from where they've been staged in the build directory's include folder,
+# so create a dependency on the build step that creates that directory.
+add_dependencies(liblldb-resource-headers liblldb-header-staging)
 add_dependencies(liblldb liblldb-resource-headers)
 
-# At build time, copy the staged headers into the framework bundle (and do
-# some post-processing in-place).
+# Take the headers from the staging directory and fix up their includes for the framework.
+# Then write them to the output directory.
 add_custom_command(TARGET liblldb POST_BUILD
-  COMMAND ${CMAKE_COMMAND} -E copy_directory ${lldb_header_staging} $<TARGET_FILE_DIR:liblldb>/Headers
-  COMMAND ${LLDB_SOURCE_DIR}/scripts/framework-header-fix.sh $<TARGET_FILE_DIR:liblldb>/Headers ${LLDB_VERSION}
-  COMMENT "LLDB.framework: copy framework headers"
+  COMMAND ${LLDB_SOURCE_DIR}/scripts/framework-header-fix.py lldb ${lldb_framework_header_staging} $<TARGET_FILE_DIR:liblldb>/Headers
+  COMMENT "LLDB.framework: Fix up and copy framework headers"
 )
 
 # Copy vendor-specific headers from clang (without staging).
diff --git a/lldb/scripts/framework-header-fix.py b/lldb/scripts/framework-header-fix.py
new file mode 100755
index 0000000000000..af20389900601
--- /dev/null
+++ b/lldb/scripts/framework-header-fix.py
@@ -0,0 +1,78 @@
+#!/usr/bin/env python3
+
+"""
+Usage: <path/to/input-directory> <path/to/output-directory>
+
+This script is used when building LLDB.framework or LLDBRPC.framework. For each framework, local includes are converted to their respective framework includes.
+
+This script is used in 2 ways:
+1. It is used on header files that are copied into LLDB.framework. For these files, local LLDB includes are converted into framework includes, e.g. #include "lldb/API/SBDefines.h" -> #include <LLDB/SBDefines.h>.
+
+2. It is used on header files for LLDBRPC.framework. For these files, includes of RPC common files will be converted to framework includes, e.g. #include <lldb-rpc/common/RPCCommon.h> -> #include <LLDBRPC/RPCCommon.h>. It will also change local includes to framework includes, e.g. #include "SBAddress.h" -> #include <LLDBRPC/SBAddress.h>
+"""
+
+import argparse
+import os
+import re
+
+# Main header regexes
+INCLUDE_FILENAME_REGEX = re.compile(r'#include "lldb/API/(?P<include_filename>.*){0,1}"')
+
+# RPC header regexes
+RPC_COMMON_REGEX = re.compile(r'#include <lldb-rpc/common/(?P<include_filename>.*)>')
+RPC_INCLUDE_FILENAME_REGEX = re.compile(r'#include "(?P<include_filename>.*)"')
+
+def modify_rpc_includes(input_directory_path, output_directory_path):
+    for input_filepath in os.listdir(input_directory_path):
+        current_input_file = os.path.join(input_directory_path, input_filepath)
+        output_dest = os.path.join(output_directory_path, input_filepath)
+        if os.path.isfile(current_input_file):
+            with open(current_input_file, "r") as input_file:
+                lines = input_file.readlines()
+                file_buffer = "".join(lines)
+            with open(output_dest, "w") as output_file:
+                # Local includes must be changed to RPC framework level includes.
+                # e.g. #include "SBDefines.h" -> #include <LLDBRPC/SBDefines.h>
+                # Also, RPC common code includes must change to RPC framework level includes.
+                # e.g. #include "lldb-rpc/common/RPCPublic.h" -> #include <LLDBRPC/RPCPublic.h>
+                rpc_common_matches = RPC_COMMON_REGEX.finditer(file_buffer)
+                rpc_include_filename_matches = RPC_INCLUDE_FILENAME_REGEX.finditer(file_buffer)
+                for match in rpc_common_matches:
+                    file_buffer = re.sub(match.group(), r'#include <LLDBRPC/' + match.group('include_filename') + '>', file_buffer)
+                for match in rpc_include_filename_matches:
+                    file_buffer = re.sub(match.group(), r'#include <LLDBRPC/' + match.group('include_filename') + '>', file_buffer)
+                output_file.write(file_buffer)
+
+def modify_main_includes(input_directory_path, output_directory_path):
+    for input_filepath in os.listdir(input_directory_path):
+        current_input_file = os.path.join(input_directory_path, input_filepath)
+        output_dest = os.path.join(output_directory_path, input_filepath)
+        if os.path.isfile(current_input_file):
+            with open(current_input_file, "r") as input_file:
+                lines = input_file.readlines()
+                file_buffer = "".join(lines)
+            with open(output_dest, "w") as output_file:
+                # Local includes must be changed to framework level includes.
+                # e.g. #include "lldb/API/SBDefines.h" -> #include <LLDB/SBDefines.h>
+                regex_matches = INCLUDE_FILENAME_REGEX.finditer(file_buffer)
+                for match in regex_matches:
+                    file_buffer = re.sub(match.group(), r'#include <LLDB/' + match.group('include_filename') + '>', file_buffer)
+                output_file.write(file_buffer)
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("framework", choices=["lldb_main", "lldb_rpc"])
+    parser.add_argument("input_directory")
+    parser.add_argument("output_directory")
+    args = parser.parse_args()
+    input_directory_path = str(args.input_directory)
+    output_directory_path = str(args.output_directory)
+    framework_version = args.framework
+
+    if framework_version == "lldb_main":
+        modify_main_includes(input_directory_path, output_directory_path)
+    if framework_version == "lldb_rpc":
+        modify_rpc_includes(input_directory_path, output_directory_path)
+
+if __name__ == "__main__":
+    main()
diff --git a/lldb/scripts/framework-header-fix.sh b/lldb/scripts/framework-header-fix.sh
deleted file mode 100755
index 3459dd91c9ec1..0000000000000
--- a/lldb/scripts/framework-header-fix.sh
+++ /dev/null
@@ -1,17 +0,0 @@
-#!/bin/sh
-# Usage: framework-header-fix.sh <source header dir> <LLDB Version>
-
-set -e
-
-for file in `find $1 -name "*.h"`
-do
-  /usr/bin/sed -i.bak 's/\(#include\)[ ]*"lldb\/\(API\/\)\{0,1\}\(.*\)"/\1 <LLDB\/\3>/1' "$file"
-  /usr/bin/sed -i.bak 's|<LLDB/Utility|<LLDB|' "$file"
-  LLDB_VERSION=`echo $2 | /usr/bin/sed -E 's/^([0-9]+).([0-9]+).([0-9]+)(.[0-9]+)?$/\\1/g'`
-  LLDB_REVISION=`echo $2 | /usr/bin/sed -E 's/^([0-9]+).([0-9]+).([0-9]+)(.[0-9]+)?$/\\3/g'`
-  LLDB_VERSION_STRING=`echo $2`
-  /usr/bin/sed -i.bak "s|//#define LLDB_VERSION$|#define LLDB_VERSION $LLDB_VERSION |" "$file"
-  /usr/bin/sed -i.bak "s|//#define LLDB_REVISION|#define LLDB_REVISION $LLDB_REVISION |" "$file"
-  /usr/bin/sed -i.bak "s|//#define LLDB_VERSION_STRING|#define LLDB_VERSION_STRING \"$LLDB_VERSION_STRING\" |" "$file"
-  rm -f "$file.bak"
-done
diff --git a/lldb/test/Shell/Scripts/Inputs/Main/SBAddress.h b/lldb/test/Shell/Scripts/Inputs/Main/SBAddress.h
new file mode 100644
index 0000000000000..33eb54cd0397c
--- /dev/null
+++ b/lldb/test/Shell/Scripts/Inputs/Main/SBAddress.h
@@ -0,0 +1,8 @@
+// This is a truncated version of an SB API file
+// used to test framework-header-fix.py to make sure the includes are correctly fixed
+// up for the LLDB.framework.
+
+// Local includes must be changed to framework level includes.
+// e.g. #include "lldb/API/SBDefines.h" -> #include <LLDB/SBDefines.h>
+#include "lldb/API/SBDefines.h"
+#include "lldb/API/SBModule.h"
diff --git a/lldb/test/Shell/Scripts/Inputs/RPC/RPCSBAddress.h b/lldb/test/Shell/Scripts/Inputs/RPC/RPCSBAddress.h
new file mode 100644
index 0000000000000..556afa38a9225
--- /dev/null
+++ b/lldb/test/Shell/Scripts/Inputs/RPC/RPCSBAddress.h
@@ -0,0 +1,9 @@
+// This is a truncated version of an SB API file generated by lldb-rpc-gen
+// used to test framework-header-fix.py to make sure the includes are correctly fixed
+// up for the LLDBRPC.framework.
+
+// Local includes must be changed to framework level includes.
+// e.g. #include "lldb/API/SBDefines.h" -> #include <LLDB/SBDefines.h>
+#include "LLDBRPC.h"
+#include "SBDefines.h"
+#include <lldb-rpc/common/RPCPublic.h>
diff --git a/lldb/test/Shell/Scripts/TestFrameworkFixScript.test b/lldb/test/Shell/Scripts/TestFrameworkFixScript.test
new file mode 100644
index 0000000000000..3f6dd88940517
--- /dev/null
+++ b/lldb/test/Shell/Scripts/TestFrameworkFixScript.test
@@ -0,0 +1,11 @@
+# Create a temp dir for output and run the framework fix script on the truncated version of SBAddress.h in the inputs dir.
+RUN: mkdir -p %t/Outputs
+RUN: %python %p/../../../scripts/framework-header-fix.py lldb_main %p/Inputs/ %t/Outputs/
+
+# Check the output
+RUN: cat %t/Outputs/SBAddress.h | FileCheck %s
+
+# Local includes must be changed to framework level includes.
+# e.g. #include "lldb/API/SBDefines.h" -> #include <LLDB/SBDefines.h>
+CHECK: #include <LLDB/SBDefines.h>
+CHECK: #include <LLDB/SBModule.h>
diff --git a/lldb/test/Shell/Scripts/TestRPCFrameworkFixScript.test b/lldb/test/Shell/Scripts/TestRPCFrameworkFixScript.test
new file mode 100644
index 0000000000000..be2f70f7a2461
--- /dev/null
+++ b/lldb/test/Shell/Scripts/TestRPCFrameworkFixScript.test
@@ -0,0 +1,14 @@
+# Create a temp dir for output and run the framework fix script on the truncated version of SBAddress.h in the inputs dir.
+RUN: mkdir -p %t/Outputs
+RUN: %python %p/../../../scripts/framework-header-fix.py lldb_rpc %p/Inputs/ %t/Outputs/
+
+# Check the output
+RUN: cat %t/Outputs/RPCSBAddress.h | FileCheck %s
+
+# Local includes must be changed to RPC framework level includes.
+# e.g. #include "SBDefines.h" -> #include <LLDBRPC/SBDefines.h>
+# Also, RPC common code includes must change to RPC framework level includes.
+# e.g. #include "lldb-rpc/common/RPCPublic.h" -> #include <LLDBRPC/RPCPublic.h>
+CHECK: #include <LLDBRPC/RPCPublic.h>
+CHECK: #include <LLDBRPC/SBDefines.h>
+CHECK: #include <LLDBRPC/LLDBRPC.h>

Copy link

github-actions bot commented May 29, 2025

✅ With the latest revision this PR passed the Python code formatter.

@chelcassanova chelcassanova force-pushed the create-python-framework-fix-script branch from 62d0c96 to e78cdf5 Compare May 29, 2025 23:12
Comment on lines 73 to 74
# All necessary header files should be staged in the include directory in the build directory,
# so just copy the files from there into the framework's staging directory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should be/will be/ (because you added the dependency on liblldb-header-staging)

@chelcassanova chelcassanova force-pushed the create-python-framework-fix-script branch 3 times, most recently from dd86d08 to 4064ae7 Compare May 30, 2025 19:06
parser.add_argument("output_directory")
parser.add_argument(
"--unifdef_guards",
nargs="*",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use nargs='+' then you can specify multiple and get a list. That way you don't need the splitting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, though in trying this it doesn't like to play along with the positional args that come before the guards 🤔



def remove_guards(output_directory_path, unifdef_guards):
unifdef_path = shutil.which("unifdef")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since CMake is computing the unifdef binary, let's pass that in to the script. If it's not specified, we can fallback to this (e.g. for the tests).

@chelcassanova chelcassanova force-pushed the create-python-framework-fix-script branch 5 times, most recently from baad2eb to 4bdd38c Compare June 3, 2025 17:17
Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could benefit from following the pattern of the other scripts, which is running on a single file and having CMake doing the globbing, presumably that means the build system can do all of this in parallel instead of making this a single-threaded operation.

@chelcassanova
Copy link
Contributor Author

I think this could benefit from following the pattern of the other scripts, which is running on a single file and having CMake doing the globbing, presumably that means the build system can do all of this in parallel instead of making this a single-threaded operation.

I was actually wondering if file-level operations should be done in the script or by the build system. With the other scripts we do this in the build system so this can be changed to match that.

@chelcassanova chelcassanova force-pushed the create-python-framework-fix-script branch 4 times, most recently from c0079c3 to b8096b6 Compare June 4, 2025 20:13
This commit replaces the shell script that fixes up includes for the
LLDB framework with a Python script. This script will also be used when
fixing up includes for the LLDBRPC.framework.
@chelcassanova chelcassanova force-pushed the create-python-framework-fix-script branch from b8096b6 to 2d63dfe Compare June 4, 2025 23:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants