Skip to content

[GOFF] Add writing of section symbols #133799

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 35 commits into
base: users/redstar/goffwriter-2
Choose a base branch
from

Conversation

redstar
Copy link
Member

@redstar redstar commented Mar 31, 2025

The GOFF format uses symbol definitions to represent sections and symbols. Introducing a section can require up to 3 symbol definitions. However, most of these details are not needed by the AsmPrinter. To mapped from a section (a MCSectionGOFF) to the symbol definitions, a new class called MCGOFFSymbolMapper is used. The same information can also be used by the assembly output, which justifies this centralized approach. Writing the mapped symbols is then straight forward.

The GOFF format uses symbol definitions to represent sections and
symbols. Introducing a section can require up to 3 symbol definitions.
However, most of these details are not needed by the AsmPrinter.
To mapped from a section (a MCSectionGOFF) to the symbol definitions,
a new class called MCGOFFSymbolMapper is used. The same information
can also be used by the assembly output, which justifies this
centralized approach. Writing the mapped symbols is then straight
forward.
@llvmbot
Copy link
Member

llvmbot commented Mar 31, 2025

@llvm/pr-subscribers-llvm-binary-utilities
@llvm/pr-subscribers-backend-systemz

@llvm/pr-subscribers-mc

Author: Kai Nacke (redstar)

Changes

The GOFF format uses symbol definitions to represent sections and symbols. Introducing a section can require up to 3 symbol definitions. However, most of these details are not needed by the AsmPrinter. To mapped from a section (a MCSectionGOFF) to the symbol definitions, a new class called MCGOFFSymbolMapper is used. The same information can also be used by the assembly output, which justifies this centralized approach. Writing the mapped symbols is then straight forward.


Patch is 35.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133799.diff

8 Files Affected:

  • (modified) llvm/include/llvm/BinaryFormat/GOFF.h (+85)
  • (added) llvm/include/llvm/MC/MCGOFFSymbolMapper.h (+148)
  • (modified) llvm/lib/MC/CMakeLists.txt (+1)
  • (modified) llvm/lib/MC/GOFFObjectWriter.cpp (+253-37)
  • (added) llvm/lib/MC/MCGOFFSymbolMapper.cpp (+203)
  • (modified) llvm/lib/MC/MCObjectFileInfo.cpp (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/zos-ppa2.ll (+1-1)
  • (added) llvm/test/MC/GOFF/section.ll (+73)
diff --git a/llvm/include/llvm/BinaryFormat/GOFF.h b/llvm/include/llvm/BinaryFormat/GOFF.h
index 443bcfc9479a8..43d80e0c247e9 100644
--- a/llvm/include/llvm/BinaryFormat/GOFF.h
+++ b/llvm/include/llvm/BinaryFormat/GOFF.h
@@ -169,6 +169,91 @@ enum SubsectionKind : uint8_t {
   SK_PPA1 = 2,
   SK_PPA2 = 4,
 };
+
+// The standard System/390 convention is to name the high-order (leftmost) bit
+// in a byte as bit zero. The Flags type helps to set bits in byte according
+// to this numeration order.
+class Flags {
+  uint8_t Val;
+
+  constexpr static uint8_t bits(uint8_t BitIndex, uint8_t Length, uint8_t Value,
+                                uint8_t OldValue) {
+    uint8_t Pos = 8 - BitIndex - Length;
+    uint8_t Mask = ((1 << Length) - 1) << Pos;
+    Value = Value << Pos;
+    return (OldValue & ~Mask) | Value;
+  }
+
+public:
+  constexpr Flags() : Val(0) {}
+  constexpr Flags(uint8_t BitIndex, uint8_t Length, uint8_t Value)
+      : Val(bits(BitIndex, Length, Value, 0)) {}
+
+  template <typename T>
+  constexpr void set(uint8_t BitIndex, uint8_t Length, T NewValue) {
+    Val = bits(BitIndex, Length, static_cast<uint8_t>(NewValue), Val);
+  }
+
+  template <typename T>
+  constexpr T get(uint8_t BitIndex, uint8_t Length) const {
+    return static_cast<T>((Val >> (8 - BitIndex - Length)) &
+                          ((1 << Length) - 1));
+  }
+
+  constexpr operator uint8_t() const { return Val; }
+};
+
+// Structure for the flag field of a symbol. See
+// https://www.ibm.com/docs/en/zos/3.1.0?topic=formats-external-symbol-definition-record,
+// offset 41, for the definition.
+struct SymbolFlags {
+  Flags SymFlags;
+
+#define GOFF_SYMBOL_FLAG(NAME, TYPE, BITINDEX, LENGTH)                         \
+  void set##NAME(TYPE Val) { SymFlags.set<TYPE>(BITINDEX, LENGTH, Val); }      \
+  TYPE get##NAME() const { return SymFlags.get<TYPE>(BITINDEX, LENGTH); }
+
+  GOFF_SYMBOL_FLAG(FillBytePresence, bool, 0, 1)
+  GOFF_SYMBOL_FLAG(Mangled, bool, 1, 1)
+  GOFF_SYMBOL_FLAG(Renameable, bool, 2, 1)
+  GOFF_SYMBOL_FLAG(RemovableClass, bool, 3, 1)
+  GOFF_SYMBOL_FLAG(ReservedQwords, ESDReserveQwords, 5, 3)
+
+#undef GOFF_SYMBOL_FLAG
+
+constexpr operator uint8_t() const { return static_cast<uint8_t>(SymFlags); }
+};
+
+// Structure for the behavioral attributes. See
+// https://www.ibm.com/docs/en/zos/3.1.0?topic=record-external-symbol-definition-behavioral-attributes
+// for the definition.
+struct BehavioralAttributes {
+  Flags Attr[10];
+
+#define GOFF_BEHAVIORAL_ATTRIBUTE(NAME, TYPE, ATTRIDX, BITINDEX, LENGTH)       \
+  void set##NAME(TYPE Val) { Attr[ATTRIDX].set<TYPE>(BITINDEX, LENGTH, Val); } \
+  TYPE get##NAME() const { return Attr[ATTRIDX].get<TYPE>(BITINDEX, LENGTH); }
+
+  GOFF_BEHAVIORAL_ATTRIBUTE(Amode, GOFF::ESDAmode, 0, 0, 8)
+  GOFF_BEHAVIORAL_ATTRIBUTE(Rmode, GOFF::ESDRmode, 1, 0, 8)
+  GOFF_BEHAVIORAL_ATTRIBUTE(TextStyle, GOFF::ESDTextStyle, 2, 0, 4)
+  GOFF_BEHAVIORAL_ATTRIBUTE(BindingAlgorithm, GOFF::ESDBindingAlgorithm, 2, 4,
+                            4)
+  GOFF_BEHAVIORAL_ATTRIBUTE(TaskingBehavior, GOFF::ESDTaskingBehavior, 3, 0, 3)
+  GOFF_BEHAVIORAL_ATTRIBUTE(ReadOnly, bool, 3, 4, 1)
+  GOFF_BEHAVIORAL_ATTRIBUTE(Executable, GOFF::ESDExecutable, 3, 5, 3)
+  GOFF_BEHAVIORAL_ATTRIBUTE(DuplicateSymbolSeverity,
+                            GOFF::ESDDuplicateSymbolSeverity, 4, 2, 2)
+  GOFF_BEHAVIORAL_ATTRIBUTE(BindingStrength, GOFF::ESDBindingStrength, 4, 4, 4)
+  GOFF_BEHAVIORAL_ATTRIBUTE(LoadingBehavior, GOFF::ESDLoadingBehavior, 5, 0, 2)
+  GOFF_BEHAVIORAL_ATTRIBUTE(COMMON, bool, 5, 2, 1)
+  GOFF_BEHAVIORAL_ATTRIBUTE(IndirectReference, bool, 5, 3, 1)
+  GOFF_BEHAVIORAL_ATTRIBUTE(BindingScope, GOFF::ESDBindingScope, 5, 4, 4)
+  GOFF_BEHAVIORAL_ATTRIBUTE(LinkageType, GOFF::ESDLinkageType, 6, 2, 1)
+  GOFF_BEHAVIORAL_ATTRIBUTE(Alignment, GOFF::ESDAlignment, 6, 3, 5)
+
+#undef GOFF_BEHAVIORAL_ATTRIBUTE
+};
 } // end namespace GOFF
 
 } // end namespace llvm
diff --git a/llvm/include/llvm/MC/MCGOFFSymbolMapper.h b/llvm/include/llvm/MC/MCGOFFSymbolMapper.h
new file mode 100644
index 0000000000000..dbdc1408dab2f
--- /dev/null
+++ b/llvm/include/llvm/MC/MCGOFFSymbolMapper.h
@@ -0,0 +1,148 @@
+//===- MCGOFFSymbolMapper.h - Maps MC section/symbol to GOFF symbols ------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Maps a section or a symbol to the GOFF symbols it is composed of, and their
+// attributes.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_MC_MCGOFFSYMBOLMAPPER_H
+#define LLVM_MC_MCGOFFSYMBOLMAPPER_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+#include "llvm/Support/Alignment.h"
+#include <string>
+#include <utility>
+
+namespace llvm {
+class MCAssembler;
+class MCContext;
+class MCSectionGOFF;
+
+// An "External Symbol Definition" in the GOFF file has a type, and depending on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has the
+// predefined name C_CODE64.
+//
+// Data can be placed into this structure in 2 ways. First, the data (in a text
+// record) can be associated with an ED symbol. To refer to data, a Label
+// Definition (LD) is used to give an offset into the data a name. When binding,
+// the whole data is pulled into the resulting executable, and the addresses
+// given by the LD symbols are resolved.
+//
+// The alternative is to use a Part Defiition (PR). In this case, the data (in a
+// text record) is associated with the part. When binding, only the data of
+// referenced PRs is pulled into the resulting binary.
+//
+// Both approaches are used, which means that the equivalent of a section in ELF
+// results in 3 GOFF symbol, either SD/ED/LD or SD/ED/PR. Moreover, certain
+// sections are fine with just defining SD/ED symbols. The SymbolMapper takes
+// care of all those details.
+
+// Attributes for SD symbols.
+struct SDAttr {
+  GOFF::ESDTaskingBehavior TaskingBehavior = GOFF::ESD_TA_Unspecified;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for ED symbols.
+struct EDAttr {
+  bool IsReadOnly = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDAmode Amode;
+  GOFF::ESDRmode Rmode;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
+  GOFF::ESDTextStyle TextStyle = GOFF::ESD_TS_ByteOriented;
+  GOFF::ESDBindingAlgorithm BindAlgorithm = GOFF::ESD_BA_Concatenate;
+  GOFF::ESDLoadingBehavior LoadBehavior = GOFF::ESD_LB_Initial;
+  GOFF::ESDReserveQwords ReservedQwords = GOFF::ESD_RQ_0;
+  GOFF::ESDAlignment Alignment = GOFF::ESD_ALIGN_Doubleword;
+};
+
+// Attributes for LD symbols.
+struct LDAttr {
+  bool IsRenamable = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
+  GOFF::ESDBindingStrength BindingStrength = GOFF::ESD_BST_Strong;
+  GOFF::ESDLinkageType Linkage = GOFF::ESD_LT_XPLink;
+  GOFF::ESDAmode Amode;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for PR symbols.
+struct PRAttr {
+  bool IsRenamable = false;
+  bool IsReadOnly = false; // ???? Not documented.
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
+  GOFF::ESDLinkageType Linkage = GOFF::ESD_LT_XPLink;
+  GOFF::ESDAmode Amode;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+  GOFF::ESDDuplicateSymbolSeverity DuplicateSymbolSeverity =
+      GOFF::ESD_DSS_NoWarning;
+  GOFF::ESDAlignment Alignment = GOFF::ESD_ALIGN_Byte;
+  uint32_t SortKey = 0;
+};
+
+struct GOFFSectionData {
+  // Name and attributes of SD symbol.
+  StringRef SDName;
+  SDAttr SDAttributes;
+
+  // Name and attributes of ED symbol.
+  StringRef EDName;
+  EDAttr EDAttributes;
+
+  // Name and attributes of LD or PR symbol.
+  StringRef LDorPRName;
+  LDAttr LDAttributes;
+  PRAttr PRAttributes;
+
+  // Indicates if there is a LD or PR symbol.
+  enum { None, LD, PR } Tag;
+
+  // Indicates if the SD symbol is to root symbol (aka the Csect Code).
+  bool IsSDRootSD;
+};
+
+class GOFFSymbolMapper {
+  MCContext &Ctx;
+
+  std::string RootSDName;
+  SDAttr RootSDAttributes;
+
+  std::string ADALDName;
+
+  StringRef BaseName;
+
+  bool IsCsectCodeNameEmpty;
+  bool Is64Bit;
+  bool UsesXPLINK;
+
+public:
+  GOFFSymbolMapper(MCContext &Ctx);
+  GOFFSymbolMapper(MCAssembler &Asm);
+
+  // Required order: .text first, then .ada.
+  std::pair<GOFFSectionData, bool> getSection(const MCSectionGOFF &Section);
+
+  void setBaseName();
+  void determineRootSD(StringRef CSectCodeName);
+  llvm::StringRef getRootSDName() const;
+  const SDAttr &getRootSD() const;
+};
+
+} // namespace llvm
+
+#endif
diff --git a/llvm/lib/MC/CMakeLists.txt b/llvm/lib/MC/CMakeLists.txt
index f49f14c848b90..967ec73a2be5b 100644
--- a/llvm/lib/MC/CMakeLists.txt
+++ b/llvm/lib/MC/CMakeLists.txt
@@ -26,6 +26,7 @@ add_llvm_component_library(LLVMMC
   MCExpr.cpp
   MCFragment.cpp
   MCGOFFStreamer.cpp
+  MCGOFFSymbolMapper.cpp
   MCInst.cpp
   MCInstPrinter.cpp
   MCInstrAnalysis.cpp
diff --git a/llvm/lib/MC/GOFFObjectWriter.cpp b/llvm/lib/MC/GOFFObjectWriter.cpp
index efaf5ff006ddc..92603c6fb1002 100644
--- a/llvm/lib/MC/GOFFObjectWriter.cpp
+++ b/llvm/lib/MC/GOFFObjectWriter.cpp
@@ -13,7 +13,11 @@
 #include "llvm/BinaryFormat/GOFF.h"
 #include "llvm/MC/MCAssembler.h"
 #include "llvm/MC/MCGOFFObjectWriter.h"
+#include "llvm/MC/MCGOFFSymbolMapper.h"
+#include "llvm/MC/MCSectionGOFF.h"
 #include "llvm/MC/MCValue.h"
+#include "llvm/Support/Casting.h"
+#include "llvm/Support/ConvertEBCDIC.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/Endian.h"
 #include "llvm/Support/raw_ostream.h"
@@ -23,44 +27,13 @@ using namespace llvm;
 #define DEBUG_TYPE "goff-writer"
 
 namespace {
-
-// The standard System/390 convention is to name the high-order (leftmost) bit
-// in a byte as bit zero. The Flags type helps to set bits in a byte according
-// to this numeration order.
-class Flags {
-  uint8_t Val;
-
-  constexpr static uint8_t bits(uint8_t BitIndex, uint8_t Length, uint8_t Value,
-                                uint8_t OldValue) {
-    assert(BitIndex < 8 && "Bit index out of bounds!");
-    assert(Length + BitIndex <= 8 && "Bit length too long!");
-
-    uint8_t Mask = ((1 << Length) - 1) << (8 - BitIndex - Length);
-    Value = Value << (8 - BitIndex - Length);
-    assert((Value & Mask) == Value && "Bits set outside of range!");
-
-    return (OldValue & ~Mask) | Value;
-  }
-
-public:
-  constexpr Flags() : Val(0) {}
-  constexpr Flags(uint8_t BitIndex, uint8_t Length, uint8_t Value)
-      : Val(bits(BitIndex, Length, Value, 0)) {}
-
-  void set(uint8_t BitIndex, uint8_t Length, uint8_t Value) {
-    Val = bits(BitIndex, Length, Value, Val);
-  }
-
-  constexpr operator uint8_t() const { return Val; }
-};
-
 // Common flag values on records.
 
 // Flag: This record is continued.
-constexpr uint8_t RecContinued = Flags(7, 1, 1);
+constexpr uint8_t RecContinued = GOFF::Flags(7, 1, 1);
 
 // Flag: This record is a continuation.
-constexpr uint8_t RecContinuation = Flags(6, 1, 1);
+constexpr uint8_t RecContinuation = GOFF::Flags(6, 1, 1);
 
 // The GOFFOstream is responsible to write the data into the fixed physical
 // records of the format. A user of this class announces the begin of a new
@@ -223,13 +196,113 @@ void GOFFOstream::finalizeRecord() {
 }
 
 namespace {
+// A GOFFSymbol holds all the data required for writing an ESD record.
+class GOFFSymbol {
+public:
+  std::string Name;
+  uint32_t EsdId;
+  uint32_t ParentEsdId;
+  uint64_t Offset = 0; // Offset of the symbol into the section. LD only.
+                       // Offset is only 32 bit, the larger type is used to
+                       // enable error checking.
+  GOFF::ESDSymbolType SymbolType;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_ProgramManagementBinder;
+
+  GOFF::BehavioralAttributes BehavAttrs;
+  GOFF::SymbolFlags SymbolFlags;
+  uint32_t SortKey = 0;
+  uint32_t SectionLength = 0;
+  uint32_t ADAEsdId = 0;
+  uint32_t EASectionEDEsdId = 0;
+  uint32_t EASectionOffset = 0;
+  uint8_t FillByteValue = 0;
+
+  GOFFSymbol() : EsdId(0), ParentEsdId(0) {}
+
+  GOFFSymbol(StringRef Name, uint32_t EsdID, const SDAttr &Attr)
+      : Name(Name.data(), Name.size()), EsdId(EsdID), ParentEsdId(0),
+        SymbolType(GOFF::ESD_ST_SectionDefinition) {
+    BehavAttrs.setTaskingBehavior(Attr.TaskingBehavior);
+    BehavAttrs.setBindingScope(Attr.BindingScope);
+  }
+
+  GOFFSymbol(StringRef Name, uint32_t EsdID, uint32_t ParentEsdID,
+             const EDAttr &Attr)
+      : Name(Name.data(), Name.size()), EsdId(EsdID), ParentEsdId(ParentEsdID),
+        SymbolType(GOFF::ESD_ST_ElementDefinition) {
+    this->NameSpace = Attr.NameSpace;
+    // TODO Do we need/should set the "mangled" flag?
+    SymbolFlags.setFillBytePresence(1);
+    SymbolFlags.setReservedQwords(Attr.ReservedQwords);
+    BehavAttrs.setReadOnly(Attr.IsReadOnly);
+    BehavAttrs.setExecutable(Attr.Executable);
+    BehavAttrs.setAmode(Attr.Amode);
+    BehavAttrs.setRmode(Attr.Rmode);
+    BehavAttrs.setTextStyle(Attr.TextStyle);
+    BehavAttrs.setBindingAlgorithm(Attr.BindAlgorithm);
+    BehavAttrs.setLoadingBehavior(Attr.LoadBehavior);
+    BehavAttrs.setAlignment(Attr.Alignment);
+  }
+
+  GOFFSymbol(StringRef Name, uint32_t EsdID, uint32_t ParentEsdID,
+             const LDAttr &Attr)
+      : Name(Name.data(), Name.size()), EsdId(EsdID), ParentEsdId(ParentEsdID),
+        SymbolType(GOFF::ESD_ST_LabelDefinition) {
+    this->NameSpace = Attr.NameSpace;
+    SymbolFlags.setRenameable(Attr.IsRenamable);
+    BehavAttrs.setExecutable(Attr.Executable);
+    BehavAttrs.setBindingStrength(Attr.BindingStrength);
+    BehavAttrs.setLinkageType(Attr.Linkage);
+    BehavAttrs.setAmode(Attr.Amode);
+    BehavAttrs.setBindingScope(Attr.BindingScope);
+  }
+
+  GOFFSymbol(StringRef Name, uint32_t EsdID, uint32_t ParentEsdID,
+             const PRAttr &Attr)
+      : Name(Name.data(), Name.size()), EsdId(EsdID), ParentEsdId(ParentEsdID),
+        SymbolType(GOFF::ESD_ST_PartReference) {
+    this->NameSpace = Attr.NameSpace;
+    SymbolFlags.setRenameable(Attr.IsRenamable);
+    BehavAttrs.setExecutable(Attr.Executable);
+    BehavAttrs.setAlignment(Attr.Alignment);
+    BehavAttrs.setAmode(Attr.Amode);
+    BehavAttrs.setLinkageType(Attr.Linkage);
+    BehavAttrs.setBindingScope(Attr.BindingScope);
+    BehavAttrs.setDuplicateSymbolSeverity(Attr.DuplicateSymbolSeverity);
+    BehavAttrs.setReadOnly(Attr.IsReadOnly);
+  }
+};
+
 class GOFFWriter {
   GOFFOstream OS;
   [[maybe_unused]] MCAssembler &Asm;
 
+  /// Mapping from MCSectionGOFF/MCSymbolGOFF to GOFF symbols and attributes.
+  GOFFSymbolMapper SymbolMapper;
+
+  /// Counter for symbol id's.
+  uint32_t EsdIdCounter = 0;
+
+  /// Id's of some special symbols.
+  uint32_t RootSDEsdId = 0;
+  uint32_t ADAEsdId = 0;
+
   void writeHeader();
+  void writeSymbol(const GOFFSymbol &Symbol);
   void writeEnd();
 
+  GOFFSymbol createGOFFSymbol(StringRef Name, const SDAttr &Attr);
+  GOFFSymbol createGOFFSymbol(StringRef Name, const EDAttr &Attr,
+                              uint32_t ParentEsdId);
+  GOFFSymbol createGOFFSymbol(StringRef Name, const LDAttr &Attr,
+                              uint32_t ParentEsdId);
+  GOFFSymbol createGOFFSymbol(StringRef Name, const PRAttr &Attr,
+                              uint32_t ParentEsdId);
+
+  void defineRootSymbol(const MCSectionGOFF *Text);
+  void defineSectionSymbols(const MCSectionGOFF &Section);
+  void defineSymbols();
+
 public:
   GOFFWriter(raw_pwrite_stream &OS, MCAssembler &Asm);
   uint64_t writeObject();
@@ -237,7 +310,108 @@ class GOFFWriter {
 } // namespace
 
 GOFFWriter::GOFFWriter(raw_pwrite_stream &OS, MCAssembler &Asm)
-    : OS(OS), Asm(Asm) {}
+    : OS(OS), Asm(Asm), SymbolMapper(Asm) {}
+
+GOFFSymbol GOFFWriter::createGOFFSymbol(StringRef Name, const SDAttr &Attr) {
+  return GOFFSymbol(Name, ++EsdIdCounter, Attr);
+}
+
+GOFFSymbol GOFFWriter::createGOFFSymbol(StringRef Name, const EDAttr &Attr,
+                                        uint32_t ParentEsdId) {
+  return GOFFSymbol(Name, ++EsdIdCounter, ParentEsdId, Attr);
+}
+
+GOFFSymbol GOFFWriter::createGOFFSymbol(StringRef Name, const LDAttr &Attr,
+                                        uint32_t ParentEsdId) {
+  return GOFFSymbol(Name, ++EsdIdCounter, ParentEsdId, Attr);
+}
+
+GOFFSymbol GOFFWriter::createGOFFSymbol(StringRef Name, const PRAttr &Attr,
+                                        uint32_t ParentEsdId) {
+  return GOFFSymbol(Name, ++EsdIdCounter, ParentEsdId, Attr);
+}
+
+void GOFFWriter::defineRootSymbol(const MCSectionGOFF *Text) {
+  // There is always a text section except for DWARF unit tests.
+  SymbolMapper.determineRootSD("");
+  GOFFSymbol RootSD =
+      createGOFFSymbol(SymbolMapper.getRootSDName(), SymbolMapper.getRootSD());
+  writeSymbol(RootSD);
+  RootSDEsdId = RootSD.EsdId;
+}
+
+void GOFFWriter::defineSectionSymbols(const MCSectionGOFF &Section) {
+  auto [GOFFSectionData, Found] = SymbolMapper.getSection(Section);
+  if (Found) {
+    uint32_t SDEsdId = RootSDEsdId;
+    if (!GOFFSectionData.IsSDRootSD) {
+      GOFFSymbol SD = createGOFFSymbol(GOFFSectionData.SDName,
+                                       GOFFSectionData.SDAttributes);
+      SDEsdId = SD.EsdId;
+      writeSymbol(SD);
+    }
+
+    GOFFSymbol ED = createGOFFSymbol(GOFFSectionData.EDName,
+                                     GOFFSectionData.EDAttributes, SDEsdId);
+    if (GOFFSectionData.Tag == GOFFSectionData::None ||
+        GOFFSectionData.Tag == GOFFSectionData::LD) {
+      ED.SectionLength = Asm.getSectionAddressSize(Section);
+    }
+    writeSymbol(ED);
+
+    if (GOFFSectionData.Tag == GOFFSectionData::LD) {
+      GOFFSymbol LD = createGOFFSymbol(GOFFSectionData.LDorPRName,
+                                       GOFFSectionData.LDAttributes, ED.EsdId);
+      if (Section.isText())
+        LD.ADAEsdId = ADAEsdId;
+      writeSymbol(LD);
+    }
+
+    if (GOFFSectionData.Tag == GOFFSectionData::PR) {
+      GOFFSymbol PR = createGOFFSymbol(GOFFSectionData.LDorPRName,
+                                       GOFFSectionData.PRAttributes, ED.EsdId);
+      PR.SectionLength = Asm.getSectionAddressSize(Section);
+      if (Section.getName() == ".ada") {
+        // We cannot have a zero-length section for data.  If we do,
+        // artificially inflate it. Use 2 bytes to avoid odd alignments. Note:
+        // if this is ever changed, you will need to update the code in
+        // SystemZAsmPrinter::emitCEEMAIN and SystemZAsmPrinter::emitCELQMAIN to
+        // generate -1 if there is no ADA
+        if (!PR.SectionLength)
+          PR.SectionLength = 2;
+        ADAEsdId = PR.EsdId;
+      }
+      writeSymbol(PR);
+    }
+    return;
+  }
+  // TODO It is possible to get here. This will be handled later.
+}
+
+void GOFFWriter::defineSymbols() {
+  // Search for .text and .ada sections. These should be the first sections in
+  // the list, so the loop should be cheap.
+  MCSectionGOFF *Text = nullptr;
+  MCSectionGOFF *ADA = nullptr;
+  for (MCSection &S : Asm) {
+    if (S.getName() == ".text")
+      Text = &cast<MCSectionGOFF>(S);
+    if (S.getName() == ".ada")
+      ADA = &cast<MCSectionGOFF>(S);
+  }
+  defineRootSymbol(Text);
+  if (ADA)
+    defineSectionSymbols(*ADA);
+  if (Text)
+    defineSectionSymbols(*Text);
+
+  // Process the other sections.
+  for (MCSection &S : Asm) {
+    auto &Section = cast<MCSectionGOFF>(S);
+    if (Text != &Section && ADA != &Section)
+      defineSectionSymbols(Section);
+  }
+}
 
 void GOFFWriter::writeHeader() {
   OS.newRecord(GOFF::RT_HDR);
@@ -253,6 +427,45 @@ void GOFFWriter::writeHeader() {
   OS.write_zeros(6);       // Reserved
 }
 
+void GOFFWriter::writeSymbol(const GOFFSymbol &Symbol) {
+  if (Symbol.Offset >= (((...
[truncated]

Copy link

github-actions bot commented Mar 31, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@uweigand uweigand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general comment, I'm not really sure why we need the whole SymbolMapper indirection. The various flags for the sections seem basically hard-coded. Couldn't we just require callers of getGOFFSection to pass in the necessary attributes, and then move the whole SymbolMapper logic to initGOFFMCObjectFileInfo and just have it use the appropriate attributes to begin with?

struct GOFFSectionData {
// Name and attributes of SD symbol.
StringRef SDName;
SDAttr SDAttributes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to always need a pair of "name, attributes" to identify a symbol. Would it be simpler to have the name be a member of the attributes struct?

std::string RootSDName;
SDAttr RootSDAttributes;

std::string ADALDName;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be here? There's only a single local use.

// Required order: .text first, then .ada.
std::pair<GOFFSectionData, bool> getSection(const MCSectionGOFF &Section);

void setBaseName();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nowhere called or defined.

}

void GOFFSymbolMapper::determineRootSD(StringRef CSectCodeName) {
IsCsectCodeNameEmpty = CSectCodeName.empty();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be always called with an empty CSectCodeName ?

: GOFFSymbolMapper(Asm.getContext()) {
if (!Asm.getWriter().getFileNames().empty())
BaseName =
sys::path::stem((*(Asm.getWriter().getFileNames().begin())).first);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the nuance here, but I'm wondering why we need the MCAssembler here. Couldn't this simply use getMainFileName() from the MCContext instead?

@redstar
Copy link
Member Author

redstar commented Apr 1, 2025

As a general comment, I'm not really sure why we need the whole SymbolMapper indirection. The various flags for the sections seem basically hard-coded. Couldn't we just require callers of getGOFFSection to pass in the necessary attributes, and then move the whole SymbolMapper logic to initGOFFMCObjectFileInfo and just have it use the appropriate attributes to begin with?

That was actually my first approach. There are a couple of challenges:

  • The root SD and the ADA PR name are not static

The name is either derived from the file name, or defined by a pragma or a command line option and passed as meta data in the module. When the sections are initialized in MCObjectFileInfo::initGOFFMCObjectFileInfo() neither the file name is available (MCConfig::getFileName() is empty) nor access to the meta data is possible. Thus, I have the need to provide the name later (e.g. in TargetLoweringObjectFileGOFF::getModuleMetadata()).

  • The root SD is shared among most sections
  • There are 2 or 3 symbols (aka names) per section

The sections are stored in a hash map in the context. Of the 2 or 3 symbol names, there is not a single one which is unique, which means that a synthetic key like "SD name/ED name/PR name" needs to be constructed. Since the root SD is shared in many cases, and not available when the section is instantiated, it adds another special case.
("Sharing of the root SD" means that the name is needed for HLASM output and the id in the object case.)

  • Not all sections are defined in MCObjectFileInfo.

If a section is only used once in the AsmPrinter, then there is no need to cache the reference in MCObjectFileInfo. Of course, this is still possible when adding the attributes to getGOFFSection(), but it becomes very hard to debug and change the data. It also means that class names like "C_WSA64" and the logic for the little non-static part are spread over and repeated in different classes.
The ADA and the IDRL sections are such sections. The DWARF EH data for each function is emitted into a separate section for each functions.

  • The SDAttr/EDAttr/LDAttr/PRAttr structures are also needed for MCSymbolGOFF

However, the situation for MCSymbolGOFF is more dynamic, and cannot be done at construction time. The SymbolMapper adds a convenient place for the logic (but it could also moved to the MCSymbolGOFF class if needed.)

As a result, I found it more convenient to have all GOFF specific attributes defined in a single class, and map the ELF-style sections to the actual GOFF symbols later.

However, if you think it is better/more streamlined to move everything into getGOFFSections() then I can move the code. E.g. initialization of the text section would be:

  TextSection = Ctx->getGOFFSection(
      SectionKind::getText(), CODE[Is64Bit],
      EDAttr{true, GOFF::ESD_EXE_CODE, AMODE[Is64Bit], RMODE[Is64Bit],
             GOFF::ESD_NS_NormalName, GOFF::ESD_TS_ByteOriented,
             GOFF::ESD_BA_Concatenate, GOFF::ESD_LB_Initial, GOFF::ESD_RQ_0,
             GOFF::ESD_ALIGN_Doubleword},
      LDAttr{false, GOFF::ESD_EXE_CODE, GOFF::ESD_NS_NormalName,
             GOFF::ESD_BST_Strong, LINKAGE[UsesXPLINK], AMODE[Is64Bit],
             GOFF::ESD_BSC_Section});

The root SD name and the binding scope would then be updated in TargetLoweringObjectFileGOFF::getModuleMetadata(). The MCSectionGOFF class would then carry all members from GOFFSectionData.

// in a byte as bit zero. The Flags type helps to set bits in byte according
// to this numeration order.
class Flags {
uint8_t Val;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

= 0 and make the default ctor defaulted = default;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

};

// Structure for the flag field of a symbol. See
// https://www.ibm.com/docs/en/zos/3.1.0?topic=formats-external-symbol-definition-record,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a space before the ,? Otherwise the link doesn't work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.


#undef GOFF_SYMBOL_FLAG

constexpr operator uint8_t() const { return static_cast<uint8_t>(SymFlags); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not indented correctly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@@ -19,7 +19,7 @@
; CHECK: .quad L#PPA2-CELQSTRT * A(PPA2-CELQSTRT)
; CHECK: L#PPA1_void_test_0:
; CHECK: .long L#PPA2-L#PPA1_void_test_0 * Offset to PPA2
; CHECK: .section "B_IDRL"
; CHECK: .section ".idrl"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Is the previous "B_IDRL" wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to have unique, uniform synthetic name for the sections. But with Uli's suggestion this is gone now. In the object file/HLASM file, the name is indeed B_IDRL.

@@ -0,0 +1,73 @@
; RUN: llc <%s --mtriple s390x-ibm-zos --filetype=obj -o - | \
; RUN: od -Ax -tx1 -v | FileCheck --ignore-case %s
; REQUIRES: systemz-registered-target
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unneeded thanks to

% cat llvm/test/MC/SystemZ/lit.local.cfg
if not "SystemZ" in config.root.targets:
    config.unsupported = True

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

GOFF_SYMBOL_FLAG(Mangled, bool, 1, 1)
GOFF_SYMBOL_FLAG(Renameable, bool, 2, 1)
GOFF_SYMBOL_FLAG(RemovableClass, bool, 3, 1)
GOFF_SYMBOL_FLAG(ReservedQwords, ESDReserveQwords, 5, 3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I once knew the answer to this question but I have since forgotten...

What's up with this field? The documentation linked a few lines above say that field is reserved - does the documentation need an update, or can this be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ED- and PR-type records, this field specifies the amount of space (in multiples of 16-byte quadwords) that the Binder will reserve at the origin of an element.
It's required, and the documentation needs an update.

redstar and others added 6 commits April 2, 2025 11:25
Co-authored-by: Neumann Hon <neumann.hon@ibm.com>
Co-authored-by: Neumann Hon <neumann.hon@ibm.com>
Co-authored-by: Neumann Hon <neumann.hon@ibm.com>
@redstar
Copy link
Member Author

redstar commented Apr 2, 2025

I implemented the suggestion from @uweigand. The GOFF attributes are set directly at the MCSectionGOFF, and the GOFFSymbolMapper is gone. I still need to update a couple of tests, since now the section names have changed.

@uweigand
Copy link
Member

uweigand commented Apr 3, 2025

I implemented the suggestion from @uweigand. The GOFF attributes are set directly at the MCSectionGOFF, and the GOFFSymbolMapper is gone. I still need to update a couple of tests, since now the section names have changed.

Thanks, I actually like this better. Looking at the current state, I'm wondering if this could be simplified even further: couldn't we simply allocate separate MCSectionGOFF objects for each of the SD / ED / PR (and possibly LD, but that's maybe really rather a label instead of a section?) symbols?

Then each of those could have its own name (there'd probably still have to be a synthesized name to avoid dupliation?) and attributes, and then a "parent" link to identify the containing section. During emission, we could follow those links to ensure the proper numbering and ordering. Or even simpler, we might be able to assign the ID numbers already in the MCSectionGOFF, similar to how the UniqueID is handled for MCSectionELF.

Finally, this would allow representing the ADA link explicitly, without having to impose a particular policy in the low-level writer.

Does this make sense or am I missing something here?

@@ -2767,15 +2774,55 @@ MCSection *TargetLoweringObjectFileGOFF::getExplicitSectionGlobal(
MCSection *TargetLoweringObjectFileGOFF::getSectionForLSDA(
const Function &F, const MCSymbol &FnSym, const TargetMachine &TM) const {
std::string Name = ".gcc_exception_table." + F.getName().str();
return getContext().getGOFFSection(Name, SectionKind::getData(), nullptr, 0);
constexpr bool Is64Bit = true;
constexpr bool UsesXPLINK = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are of course always true in the current upstream code base. I'm not sure it makes sense to add support for features not (currently) usable upstream ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My goal was to mark those values somehow, because it is very difficult to consistently find those spots again. But using a constant is enough, so this is gone now.

GOFFSymbol PR = createGOFFSymbol(Section.getLDorPRName(),
Section.getPRAttributes(), ED.EsdId);
PR.SectionLength = Asm.getSectionAddressSize(Section);
if (Section.getName() == ".ada") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this name will ever match with the latest code. In any case, this is really not the proper place to put this kind of decision logic ... that should really be handled elsewhere (or more generically), I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that does not work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

void TargetLoweringObjectFileGOFF::getModuleMetadata(Module &M) {
// Set the main file name if not set previously by the tool.
if (getContext().getMainFileName().empty())
getContext().setMainFileName(M.getSourceFileName());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. This sets the common-code "MainFileName", where it otherwise wouldn't be set. I'm not sure if this is causing any problems - but it would be a place where we behave differently than others, which is usually better to avoid ...

If it makes sense in general to default MainFileName to SourceFileName when compiling a module, maybe that could and should be done in common code.

Otherwise, could we instead set the name to MainFileName (as default) in initGOFFMCObjectFileInfo, and then override to SourceFileName here (if empty)? Probably need to hold the Root/Ada names as members of TLOF then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not using anymore.

@redstar
Copy link
Member Author

redstar commented Apr 3, 2025

Does this make sense or am I missing something here?

I think that splitting the SD/ED/LD into 3 "section"s implies that a MCSectionGOFF has a fundamentally different semantic than the other MCSectionXXX. This is something I would like to avoid. On the other hand, the SD/ED pair is almost the same as an ELF section, so just putting those 2 into a MCSectionGOFF instance and handling the LD/PR symbol differently makes sense. In HLASM, a section definition looks like

HLASMHEL CSECT                        * The SD symbol
HLASMHEL AMODE 64
HLASMHEL RMODE ANY
C_CODE64 CATTR EXECUTABLE,RENT        * The ED symbol; the LD symbol is automatically generated.
         ENTRY MAIN
MAIN     AMODE 64
MAIN     XATTR LINKAGE(XPLINK),SCOPE(LIBRARY),REFERENCE(CODE)
MAIN     DS    0FD                    * A "normal" label, with attributes. Results in another LD symbol

When I switch sections, and later want to continue the code section, then I need to repeat both statements in a short form:

HLASMHEL CSECT ,
C_CODE64 CATTR ,

I thought about this when moving the definitions from the SymbolMapper to the MCObjectFileInfo class, but I got no good idea how to handle the LD or PR symbol. A possible way could be to attach the LD or PR symbol as the begin symbol of a section. I need to try this.

One possible downside is that it makes handling of relocations much more complicated. For the base of a relocation, I need the PR symbol (when I have the SD/ED/PR case) or the LD and ED (in the SD/ED/LD case). Currently, this information is available in the MCSectionGOFF, without the need to chase a list to the correct element.

@uweigand
Copy link
Member

uweigand commented Apr 4, 2025

I think that splitting the SD/ED/LD into 3 "section"s implies that a MCSectionGOFF has a fundamentally different semantic than the other MCSectionXXX. This is something I would like to avoid. On the other hand, the SD/ED pair is almost the same as an ELF section, so just putting those 2 into a MCSectionGOFF instance and handling the LD/PR symbol differently makes sense.

Thinking a bit more about this, it looks to me that we should treat SD/ED/PR on the one hand differently from LD (and ER) on the other. The former identify a range of address space and may hold contents of those ranges in the form of text records; the latter identify a single address (and hold no content of their own).

From that perspective, the former correspond to the "section" concept, while the latter correspond to the "symbol" concept. Now, among the section types SD/ED/PR, GOFF is a bit special in that those are nested - this is somewhat similar to the subsection concept, but it is explicit in the object file format (as opposed to, say, ELF subsections).

It seems to me that modelling that nested section concept explicitly by creating a separate MCSectionGOFF for each of SD, ED, and PR, and linking them as appropriate via a Parent pointer (which we actually already have today!), doesn't look too fundamentally different ... As long as we ensure that text emission happens into the right section (ED or PR as appropriate), this should work fine with common code.

In fact, considering that at some point we want to be able to implement a general HLASM AsmParser, which would require handling any allowed combination of CSECT with multiple CATTR, we should not merge SD and ED into a single section. (Also, by having them separately, we no longer need special treatment of the "root" SD in the writer.)

Finally, having separate MCSession structures for each ESD record may allow using the MCSession::Ordinal field as the ESD ID, which matches its purpose for other object file formats, and which would allow easy resolution of parent (and ADA) section pointers to ESD IDs in the writer.

The LD record, on the other hand, clearly should not get a MCSectionGOFF. Rather, it would make sense for this to be represented as a MCSymbolGOFF. Specifically, this symbol really represents the implicit section start symbol (which ELF also has!); so it should probably best be emitted not from the section table but from the symbol table. (MCSection already has a Begin symbol - it should be possible to use this for that purpose.) That would also unify emission of that type of LD record with the other LD records for "normal" symbols.

Attributes associated with the LD record should likewise come from the MCSymbolGOFF. This would include the ADA section, which means that association no longer needs to be hard-coded in the writer, but can instead set up by codegen as appropriate when defining symbols. (E.g. this would also allow handling arbitrary user-provided XATTR PSECT attributes in an HLASM AsmParser.)

@redstar
Copy link
Member Author

redstar commented Apr 4, 2025

Thinking a bit more about this, it looks to me that we should treat SD/ED/PR on the one hand differently from LD (and ER) on the other. The former identify a range of address space and may hold contents of those ranges in the form of text records; the latter identify a single address (and hold no content of their own).

Yes, that is correct.

From that perspective, the former correspond to the "section" concept, while the latter correspond to the "symbol" concept. Now, among the section types SD/ED/PR, GOFF is a bit special in that those are nested - this is somewhat similar to the subsection concept, but it is explicit in the object file format (as opposed to, say, ELF subsections).

I try to implement this. Well, first I'll fix the failing test cases....

if (FileName.size() > 1 && FileName.starts_with('<') &&
FileName.ends_with('>'))
FileName = FileName.substr(1, FileName.size() - 2);
DefaultRootSDName = Twine(FileName).concat("#C").str();

This comment was marked as resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a name and setting the binding scope to "section scope" is similar to using " " as name and leaving the binding scope unspecified.
The XLC and Open XL compilers provide a command line option to change this name (XLC: -qcsect, -qnocsect; Open XL: -mcsect, -mnocsect). The Open XL compiler defaults to the variant coded here but that can be changed to having a name with binding scope unspecified (-mcsect=a) or set to space (-mnocsect).
Using -mcsect=a results in exactly the problem you describe.
The compiler option will be added later to clang, along with the required code here.

The front end (aka clang) provides this value in the source_filename property. All strings in LLVM/clang are in UTF-8 so there is no other choice. The same problem arises for symbols derived from function names etc.

Copy link

@AidoP AidoP Apr 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that's very interesting. The documentation seems to suggest that the binding scope attribute only applies to LDs or ERs. Interestingly AMBLIST doesn't seem to display it for sections (N/A is shown).

It seems like there are a few undocumented fields and behaviours being relied upon now. Is there anything being done or are there any plans for IBM to update the doc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my plan is to request an update to the documentation.

struct EDAttr {
bool IsReadOnly = false;
GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
GOFF::ESDAmode Amode;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole Amode/Rmode handling is a bit confusing to me. As I understand the GOFF docs, Amode is supposed to be a symbol property (i.e. set on LD and ER records) while Rmode is supposed to be an element property (i.e. set on ED records). So it is unclear what an Amode property on ED or PR is supposed to do, exactly. There also doesn't seem to be a way to specify those with any HLASM command I can see.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to do a bit of research here. The Amode at the ED symbol acts as a default when no Amode at the LD/ER is present. The Amode at the PR seems to be not necessary. However, I need to check if this results in binder errors if I remove this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No binding problems, so it seems sage to make this change. That said, amblist shows the Amode on PR and ED symbols.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I removed this. However, it feels like that there is an inconsistency in the documentation / implementations. The following HLASM code

stdin#C CSECT
C_WSA64 CATTR ALIGN(4),DEFLOAD,NOTEXECUTABLE,PART(a),RMODE(64)
        DC      0X
        END

results in RMODE(64) set at the ED symbol, and AMODE(64) set at the PR symbol.
Setting the Amode on a PR symbol makes sense to me because it is not possible to add a LD symbol to the part - this is causing binder errors. To reference the part from a different compilation unit, I have to use a, thus the PR has also some symbol semantics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double-check: your current code does not set Amode on the PR symbol. Is is necessary to do this or not?

bool IsRenamable = false;
GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
GOFF::ESDBindingStrength BindingStrength = GOFF::ESD_BST_Strong;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For LD (as opposed to ER), it seems "strong" is the only allowed value here. Again, if this is true, it doesn't make much sense to specify it.

Copy link
Member Author

@redstar redstar Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually needed for C++ inline functions etc. which can end up in several object files. Also for weak definitions, e.g.:

// a.c
#include <stdio.h>

__attribute__((weak)) void fun() {
  printf("Weak fun\n");
}

void feature() {
  fun();
}

and

// b.c
#include <stdio.h>

extern void feature();

void fun() {
  printf("Other fun\n");
}

int main(int argc, char *argv[]) {
  feature();
  return 0;
}

(example taken from a blog by @MaskRay)

Another case were the documentation needs an update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. This is not currently reflected in the HLASM output, however. How would one do this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to ask for this. I see only WXTRN for weak externals in the HLASM documentation.

if (!Emitted) {
emitCATTR(OS, ED->getName(), ED->getParent()->getName(), !SDEmitted,
PRAttributes.Amode, getParent()->EDAttributes.Rmode,
PRAttributes.Alignment, getParent()->EDAttributes.LoadBehavior,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoadBehavior and Rmode are only supported on ED, not PR. Should we even emit those with the CATTR PART at all, or only with the base CATTR for the ED?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HLASM statement

C_WSA64 CATTR ALIGN(4),DEFLOAD,NOTEXECUTABLE,RMODE(64),PART(a)

generates both the ED symbol C_WSA64 and the PR symbol a.
At the ED symbol, Rmode and LoadBehaviour are set.
At the PR symbol, Amode (derived from RMode) and the LoadBehaviour are set. The latter looks like a HLASM bug. I am not aware of a way to defined the class (ED) and the part (PR) separately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HLASM docs contain the somewhat cryptic statement: "Binding attributes assigned to the class are also assigned to the part." where is not really defined anywhere (I can see) what exactly "binding attributes" mean.

Do we need to replicate the HLASM behavior of setting Amode and LoadBehavior or not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO there is no need to replicate the HLASM behaviour. The table in the MVS Program Management: Advanced Facilities book clearly states that the LoadBehaviour and the Rmode are set at the ED.

Unfortunately, the attributes which can be set at the PR are in another table, inside the notes. This does not include LoadBehaviour and Rmode.

Over the first table there is a similar comment: "Note that the attributes of PR items (with the exception of alignment) are determined from the element EDID to which they belong." This seems to imply that those attributes set at the ED do not need to be set at the PR but I could not verify this.

@@ -599,8 +600,18 @@ class MCContext {
unsigned Flags,
unsigned EntrySize);

MCSectionGOFF *getGOFFSection(StringRef Section, SectionKind Kind,
MCSection *Parent, uint32_t Subsection = 0);
private:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of adding a private, just move the getGOFFSection to the existing private part.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the code.

@@ -26,6 +27,23 @@ GOFFObjectWriter &MCGOFFStreamer::getWriter() {
return static_cast<GOFFObjectWriter &>(getAssembler().getWriter());
}

namespace {
// Make sure that all section are registered in the correct order.
void registerSectionHierarchy(MCAssembler &Asm, MCSectionGOFF *Section) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style prefers static to anonymous namespace for internal functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the anonymous namespace, and added static.

@@ -26,6 +27,23 @@ GOFFObjectWriter &MCGOFFStreamer::getWriter() {
return static_cast<GOFFObjectWriter &>(getAssembler().getWriter());
}

namespace {
// Make sure that all section are registered in the correct order.
void registerSectionHierarchy(MCAssembler &Asm, MCSectionGOFF *Section) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style prefers static to anonymous namespace for internal functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the anonymous namespace, and added static.

using namespace llvm;

namespace {
void emitCATTR(raw_ostream &OS, StringRef Name, GOFF::ESDRmode Rmode,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the anonymous namespace, and added static.

@@ -0,0 +1,112 @@
; RUN: llc <%s --mtriple s390x-ibm-zos --filetype=obj -o - | \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llc < %s does not need -o -. .ll tests in llvm/test/MC is probably not a good convention we'd recommend. Move to llvm/test/CodeGen/SystemZ, perhaps under a directory zos/ ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the tests.

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As I'm not familiar with z/OS, my review focused on its compatibility with the current MC infrastructure. (My internet access will be limited between April 20th and May 4th, which may cause delays in my response time.)

Thanks for reimplementing the getPPA1/PPA2 stuff that puzzled me :)

The value would serve as default if not specified on LD or PR. However, we can make sure that it is always specified on LD and PR, s not need to have it on ED.
@redstar
Copy link
Member Author

redstar commented Apr 23, 2025

@uweigand I made all the suggested changes.

// Attributes for PR symbols.
struct PRAttr {
bool IsRenamable = false;
bool IsReadOnly = false; // ???? Not documented.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it ever make sense for this value to differ from the ED value? Or is this one of those attributes that should be copied from the element to the part? It doesn't seem possible to specific differing values in HLASM ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked again, and it seems we are only using the default value. I remove it.

GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
GOFF::ESDLinkageType Linkage = GOFF::ESD_LT_XPLink;
GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
GOFF::ESDAlignment Alignment = GOFF::ESD_ALIGN_Byte;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question for the alignment, can it ever differ from the ED alignment? How would you specify this in HLASM?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requirement for the alignment is that the aligment of ED must be equal or higher than the alignment of PR. However, to my knowledge it is not possible to share the ED element among several PR elements (because the SD must have the same name as the PR). Which means that both alignments are the same.
In HLASM, the alignment is given with the CATTR statement. If the CATTR defines no PART, then the alignment is set at the ED element, otherwise the same value is set at the ED and PR element.
I remove the alignment in the PR.

@MaskRay
Copy link
Member

MaskRay commented May 3, 2025 via email

@redstar
Copy link
Member Author

redstar commented May 6, 2025

@MaskRay Sorry, your comment is basically empty. I guess a GitHub problem?

@MaskRay
Copy link
Member

MaskRay commented May 7, 2025

@MaskRay Sorry, your comment is basically empty. I guess a GitHub problem?

Sorry... Could be my accidentally pushing a comment to a wrong PR..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants