Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate processor and feature detection from OpenJ9 to OMR #4339

Open
fjeremic opened this issue Sep 20, 2019 · 43 comments
Open

Migrate processor and feature detection from OpenJ9 to OMR #4339

fjeremic opened this issue Sep 20, 2019 · 43 comments

Comments

@fjeremic
Copy link
Contributor

Background

The CPU class in the compiler in OMR [1] currently has no way of determining the processor we are running on at runtime. The OMR::CPU class is an extensible class which is meant to be overriden by the target code generator. Inside of these extended codegen classes we have APIs to determine whether we support generating instructions for a particular processor. For example [2] is the override on Z and [3] is the override on Power.

Each code generator seems to use different, although quite similar APIs to determine if we can generate instructions for a particular processor.

  • On Power we use TR::Compiler->target.cpu.id()
    • Example in [4]
  • On Z we use TR::Compiler->target.cpu.getSupportsArch()
    • Example in [5]
  • On x86 we use cg()->getX86ProcessorInfo()
    • Example in [6]

As we can see we're somewhat inconsistent in the APIs we use. More over, if you notice the APIs we use across the different codegens will always default down to the minimum Architecture Level Set (ALS) defined in OMR. This means that irregardless of what processor we are actually running on we'll generate instructions for the least supported processor.

This is problematic in dynamic runtimes.

[1] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/env/OMRCPU.cpp
[2] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/z/env/OMRCPU.cpp
[3] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/p/env/OMRCPU.cpp
[4] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/p/codegen/BinaryEvaluator.cpp#L2087-L2096
[5] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/z/codegen/BinaryEvaluator.cpp#L3125-L3128
[6] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/x/codegen/OMRCodeGenerator.cpp#L385-L388

Proposed Solution

Our friends at the OpenJ9 project have implemented processor detection in the port library already, and the JIT compiler in OpenJ9 is already making use of the data provided by the port library and caching the various answers for use in the JIT.

For example on Power and Z we use the j9sysinfo_processor_has_feature API [7] to query whether we have support for a particular CPU feature, which is initialized in the port library on initialization. Example of this is in the j9sysinfo.c file for both Power and Z [8].

Similarly processors are detected via the J9ProcessorDesc struct which is initialized via the j9sysinfo_get_processor_description API which can be found in the same j9sysinfo.c file. The way in which we initialize these structures depends on the OS and architecture type.

What we effectively need to do is to sink the OpenJ9 pieces of the processor and feature detection down into the OMR layer so that we can access it directly from the OMR compiler. The work here will involve investigating what pieces of the OpenJ9 port library are used and finding the correct place to move the APIs in OMR. Processor and feature detection is a delicate piece of OpenJ9 and is crucial for performance so such a change must be staged, that is to first copy the processor detection into OMR, make use of it in OMR, only once that is working we can work on deprecating the OpenJ9 piece to take advantage of the OMR support.

[7] https://github.com/eclipse/openj9/blob/82609efc1fa05a5d1af51ad6ceed7215376dde84/runtime/compiler/z/env/J9CPU.cpp#L230-L301
[8] https://github.com/eclipse/openj9/blob/82609efc1fa05a5d1af51ad6ceed7215376dde84/runtime/port/unix/j9sysinfo.c

@fjeremic
Copy link
Contributor Author

@0xdaryl FYI. @AidanHa is this something you could tackle?

@AidanHa
Copy link
Contributor

AidanHa commented Sep 23, 2019

Sure, I can definitely look into this.

@AidanHa
Copy link
Contributor

AidanHa commented Sep 25, 2019

After reading through the issue and getting familiar with the code, I think I will divide this issue into 4 PRs.
1: Creation of processor and feature detection in OMR (from OpenJ9).
2: Implementation of the new feature throughout out OMR.
3: Implementation of the new feature throughout OpenJ9.
4: Deletion of the processor detection in OpenJ9 port library.

@0xdaryl
Copy link
Contributor

0xdaryl commented Sep 25, 2019

For #2 and #3, by "implementation of the feature" do you mean "exploitation of the feature" ? (for lack of a better word).

@AidanHa
Copy link
Contributor

AidanHa commented Sep 25, 2019

@0xdaryl that is correct, I couldn't find a better way to phrase that point.

@AidanHa
Copy link
Contributor

AidanHa commented Oct 16, 2019

Update for those following: I got a working version of processor detection locally on my x86 machine. Will test other architectures, and submit a PR when all is ready!

@AidanHa
Copy link
Contributor

AidanHa commented Oct 17, 2019

I was wondering how should we be testing feature detection? In my opinion, Openj9 currently has a fairly "weak" test for feature detection (https://github.com/eclipse/openj9/blob/master/runtime/tests/port/si.c#L965-L971), and I would like to improve upon this if possible...

@fjeremic
Copy link
Contributor Author

I was wondering how should we be testing feature detection? In my opinion, Openj9 currently has a fairly "weak" test for feature detection (eclipse/openj9:runtime/tests/port/si.c@master#L965-L971

On Linux we can likely query /proc/cpuinfo and parse the string and compare it against what we programmatically detect?

@AidanHa
Copy link
Contributor

AidanHa commented Oct 18, 2019

On Linux we can likely query /proc/cpuinfo and parse the string and compare it against what we programmatically detect?

This might have some issues where the user doesn't have permission to view the output of /proc/cpuinfo, and all that will show up is "permission denied".

IMO, it's very difficult to actually precisely verify if all the known features (for a specific processor) are set...

@fjeremic
Copy link
Contributor Author

IMO, it's very difficult to actually precisely verify if all the known features (for a specific processor) are set...

I would agree. We'll just have to be very diligent when manually testing that this was done right. i.e. hop on a bunch of different machines and validate before/after. Perhaps using OpenJ9 may help here since processor detection is known to work there already.

@AidanHa
Copy link
Contributor

AidanHa commented Dec 5, 2019

I am currently on step 2 of resolving this issue: Implementing the port library function throughout OMR and unifying the different API calls for processor detection. One thing I noticed is that the current processor detection in compiler/env/OMRCPU.hpp (TR::Compiler>target.cpu.setProcessor() and TR::Compiler>target.cpu.id() is currently being used downstream in OpenJ9 (Eg: https://github.com/eclipse/openj9/blob/master/runtime/compiler/env/ProcessorDetection.cpp#L696), which makes it difficult to simply remove the current implementation and replace it with the port library version. To work my way around this issue, I've decided to go with the following:

1: In the CPU class in compiler/env/OMRCPU.hpp, add separate processor detection calls that utilize the new port library functions at #4503.

2: In OpenJ9, reconfigure all of the instances where the old processor detection functions were use to utilize the new port library functions.

3: Safely delete the old processor detection functions from OMR.

Let me know if this works!

@AidanHa
Copy link
Contributor

AidanHa commented Dec 5, 2019

For clarity:
"old processor detection functions" = TR::Compiler>target.cpu.setProcessor().
"new processor detection functions" = the new port library functions.

@fjeremic
Copy link
Contributor Author

Let me know if this works!

Sounds like a good plan.

@harryyu1994
Copy link
Contributor

harryyu1994 commented Jan 20, 2020

Went through the above discussions:

Here's my understanding:

In OMR we have

env/OMRCPU
z/env/OMRCPU
p/env/OMRCPU
x/env/OMRCPU
  • the main issue here is that when you use the OMRCPU APIs, we get default values.
  • further more the APIs across platforms aren't consistent

In OpenJ9 we have

env/J9CPU
z/env/J9CPU
p/env/J9CPU
x/env/J9CPU
  • in addition to those we have compiler/env/ProcessorDetection.cpp
    • which contains a collection of VM APIs that initializes the processor features depending on the platform during startup
    • TR_J9VM::initializeProcessorType() as an example
  • we want to be able to initialize the processor features in OMR like we do in OpenJ9.
  • The processor detection APIs in ProcessorDetection.cpp relies on a bunch of openj9 port library functions.
  • So this means we need to first move the openj9 port library functions from OpenJ9 to OMR, which has been dealt with in Migrate processor/feature detection from OpenJ9 to OMR Part 1 #4503
  • next step to add processor detection in OMR
  • ^ a question I have with that is in OpenJ9 we call initializeProcessorType() during the onLoadInternal() routine. Where should we do that in OMR?

Last Week's Meeting:

  • to get things going and as a first task, we want to migrate the class TR_X86ProcessorInfo which is in OMRCodeGenerator.hpp to x/OMRCPU.hpp in order to make the API calls for X somewhat consistent with Z and P. (to all use TR::Compiler->target.cpu, right now X's feature queries are in the codegen class instead of the cpu class).
  • and I would assume this is also a 3 step process:
    • add new APIs in x/OMRCPU
    • switch OMR to use the new APIs
    • switch OpenJ9 to use the new APIs

@fjeremic @dsouzai Let me know if I understood everything correctly, I want to make sure everything is clear before I proceed to implementation. Thanks!

@mpirvu FYI, this is the processor feature detection work for OMR that I'll be working on for a while, I will likely log all my progress in this issue.

@fjeremic
Copy link
Contributor Author

^ a question I have with that is in OpenJ9 we call initializeProcessorType() during the onLoadInternal() routine. Where should we do that in OMR?

The initialization should happen in the constructor of the OMRCPU class, i.e. calling the port library and caching the result if needed.

@fjeremic fjeremic assigned harryyu1994 and unassigned AidanHa Jan 21, 2020
@harryyu1994
Copy link
Contributor

harryyu1994 commented Jan 28, 2020

Had a high priority JITServer item last week, back to this work item now.

Some notes about TR::Compiler->target.cpu, how and where it's initialized

  • TR::Compiler is a global pointer of type TR::CompilerEnv
  • TR::CompilerEnv(from CompilerEnv.hpp) is a child of TR::OMR::CompilerEnv (from OMRCompilerEnv.hpp) through extensible classes
namespace OMR
{
class OMR_EXTENSIBLE CompilerEnv
   {
public:
   CompilerEnv(TR::RawAllocator raw, const TR::PersistentAllocatorKit &persistentAllocatorKit);
   TR::CompilerEnv *self();
   ...
   // Compilation target environment
   //
   TR::Environment target;
  • target is a TR::Environment object.
  • TR::Environment (from Environment.hpp) is a child of TR::OMR::Environment (from OMREnvironment.hpp)
namespace OMR
{
class Environment
   {
public:
   Environment() :
         _majorOS(TR::os_unknown),
         _bitness(TR::bits_unknown),
         _isSMP(false),
         _numberOfProcessors(1),
         cpu()
      {}
   Environment(TR::MajorOperatingSystem o, TR::Bitness b) :
         _majorOS(o),
         _bitness(b),
         _isSMP(false),
         _numberOfProcessors(1),
         cpu()
      {}
   TR::CPU cpu;
  • cpu is a TR::CPU object
  • TR::CPU is a child of TR::OMR::X86::CPU (from x/env/OMRCPU.hpp if we are on x) which is a child of TR::OMR::CPU (from env/OMRCPU.hpp) through extensible classes
  • in TR::OMR::CPU there is an API initializeByHostQuery()
    • so there could be two options for processor feature initialization, one is inside the TR::OMR::X86::CPU constructor and the other one is by overriding initializeByHostQuery() in TR::OMR::X86::CPU
namespace OMR
{
class CPU
   {
protected:
   CPU() :
         _processor(TR_NullProcessor),
         _endianness(TR::endian_unknown),
         _majorArch(TR::arch_unknown),
         _minorArch(TR::m_arch_none)
      {}
public:
   TR::CPU *self();
   // Initialize CPU info by querying the host processor at compile-time
   //
   void initializeByHostQuery();
void
OMR::CPU::initializeByHostQuery()
   {

#ifdef OMR_ENV_LITTLE_ENDIAN
   _endianness = TR::endian_little;
#else
   _endianness = TR::endian_big;
#endif

   // Validate host endianness #define with an additional compile-time test
   //
   char SwapTest[2] = { 1, 0 };
   bool isHostLittleEndian = (*(short *)SwapTest == 1);

   if (_endianness == TR::endian_little)
      {
      TR_ASSERT(isHostLittleEndian, "expecting host to be little endian");
      }
   else
      {
      TR_ASSERT(!isHostLittleEndian, "expecting host to be big endian");
      }

#if defined(TR_HOST_X86)
   _majorArch = TR::arch_x86;
   #if defined (TR_HOST_64BIT)
   _minorArch = TR::m_arch_amd64;
   #else
   _minorArch = TR::m_arch_i386;
   #endif
#elif defined (TR_HOST_POWER)
   _majorArch = TR::arch_power;
#elif defined(TR_HOST_ARM)
   _majorArch = TR::arch_arm;
#elif defined(TR_HOST_S390)
   _majorArch = TR::arch_z;
#elif defined(TR_HOST_ARM64)
   _majorArch = TR::arch_arm64;
#elif defined(TR_HOST_RISCV)
   _majorArch = TR::arch_riscv;
   #if defined (TR_HOST_64BIT)
   _minorArch = TR::m_arch_riscv64;
   #else
   _minorArch = TR::m_arch_riscv32;
   #endif
#else
   TR_ASSERT(0, "unknown host architecture");
   _majorArch = TR::arch_unknown;
#endif

   }
  • initializeByHostQuery() is called in OMR::CompilerEnv::initializeTargetEnvironment()
void
OMR::CompilerEnv::initializeTargetEnvironment()
   {

   // Target processor bitness
   //
#ifdef TR_TARGET_64BIT
   target.setBitness(TR::bits_64);
#elif TR_TARGET_32BIT
   target.setBitness(TR::bits_32);
#else
   target.setBitness(TR::bits_unknown);
#endif

   // Initialize the target CPU by querying the host processor
   //
   target.cpu.initializeByHostQuery();

   // Target major operating system
   //
#if HOST_OS == OMR_LINUX
   target.setMajorOS(TR::os_linux);
#elif HOST_OS == OMR_AIX
   target.setMajorOS(TR::os_aix);
#elif HOST_OS == OMR_WINDOWS
   target.setMajorOS(TR::os_windows);
#elif HOST_OS == OMR_ZOS
   target.setMajorOS(TR::os_zos);
#elif HOST_OS == OMR_OSX
   target.setMajorOS(TR::os_osx);
#else
   target.setMajorOS(TR::os_unknown);
#endif

   }

@fjeremic
Copy link
Contributor Author

Let's be consistent with other platforms. If an object can be fully initialized at the allocation point we should initialize the object in the Constructor and simplify the callers by not having to call an additional API which they may or may not be aware of.

@harryyu1994
Copy link
Contributor

harryyu1994 commented Jan 30, 2020

Summary of the Processor Detection OMR Port Library Additions

  • port/common/omrport.c

    • add omrsysinfo_get_processor_description() and omrsysinfo_processor_has_feature() into OMRPortLibrary MasterPortLibraryTable
  • include_core/omrport.h

    • enum OMRProcessorArchitecture
    • struct OMRProcessorDesc (comments still contain j9, needs to be corrected, j9sysinfo_get_processor_description, j9sysinfo_processor_has_feature)
    • PowerPC feature macros
    • s390 feature macros
    • x86 feature macros
    • add omrsysinfo_get_processor_description and omrsysinfo_processor_has_feature function pointers to struct OMRPortLibrary
    • shorthanded omrsysinfo_get_processor_description and omrsysinfo_processor_has_feature macros
  • port/common/omrport.tdf

    • Trc_PRT_sysinfo_get_processor_description_Entered, Group = j9sysinfo, this needs to be corrected (get rid of j9)
    • Trc_PRT_sysinfo_get_processor_description_Exit, Group = j9sysinfo, this needs to be corrected (get rid of j9)
    • Trc_PRT_sysinfo_processor_has_feature_Entered
    • Trc_PRT_sysinfo_processor_has_feature_Exit
  • port/common/omrsysinfo.c and omrsysinfo.c

    • omrsysinfo_get_processor_description()
    • omrsysinfo_processor_has_feature()
  • port/common/omrsysinfo_helpers.c and omrsysinfo_helpers.h

    • getX86Description()
    • getX86CPUID()
  • port/linuxs390/omrsysinfo_helpers.c and omrsysinfo_helpers.h

    • getstfle()
  • port/unix/omrsysinfo.c

    • mapPPCProcessor()
    • setFeature()
    • getLinuxPPCDescription()
    • getAIXPPCDescription()
    • testSTEFLE()
    • getS390Description()
    • omrsysinfo_get_processor_descriptions()
    • omrsysinfo_processor_has_feature()
  • port/unix_include/omrportpg.h

    • OMRSTFLEFacilities
    • OMRSTFLECache
    • PPG_stfleCache
  • port/win32/omrsysinfo.c

    • omrsysinfo_get_processor_description()
    • omrsysinfo_processor_has_feature()
  • port/zos390/omrportpg.h

    • OMRSTFLEFacilities
    • OMRSTFLECache
    • PPG_stfleCache

@harryyu1994
Copy link
Contributor

harryyu1994 commented Jan 30, 2020

How OpenJ9 initializes "TR::Compiler->target.cpu" for x86

  • top level, we are using the vm/frontend API initializeProcessorType()
   else if (TR::Compiler->target.cpu.isX86())
      {
      const char *vendor = TR::Compiler->target.cpu.getProcessorVendorId();
      uint32_t processorSignature = TR::Compiler->target.cpu.getProcessorSignature();

      TR::Compiler->target.cpu.setProcessor(portLibCall_getX86ProcessorType(vendor, processorSignature));

      TR_ASSERT(TR::Compiler->target.cpu.id() >= TR_FirstX86Processor
             && TR::Compiler->target.cpu.id() <= TR_LastX86Processor, "Not a valid X86 Processor Type");
      }
  • portLibCall_getX86ProcessorType() is not a very special call, just parses the vendor and processorSignature to get the processor type.

  • the more special call is getProcessorSignature(), which is able to get us feature flags on x86

  • it seems that, unlike Power and Z, X doesn't use j9sysinfo_get_processor_description() and j9sysinfo_processor_has_feature() to get its features, instead it uses queryX86TargetCPUID().

  • there are two versions of queryX86TargetCPUID(), one in x/env/OMRCPU.hpp (from omr) and one in x/env/J9CPU.hpp(from openj9). They are very similar in that they both call jitGetCPUID()

  • openj9's version

TR_X86CPUIDBuffer *
CPU::queryX86TargetCPUID()
   {
   static TR_X86CPUIDBuffer buf = { {'U','n','k','n','o','w','n','B','r','a','n','d'} };
   jitGetCPUID(&buf);
   return &buf;
   }
  • omr's version
TR_X86CPUIDBuffer *
OMR::X86::CPU::queryX86TargetCPUID()
   {
   static TR_X86CPUIDBuffer *buf = NULL;
   auto jitConfig = TR::JitConfig::instance();

   if (jitConfig && jitConfig->getProcessorInfo() == NULL)
      {
      buf = (TR_X86CPUIDBuffer *) malloc(sizeof(TR_X86CPUIDBuffer));
      if (!buf)
         return 0;
      jitGetCPUID(buf);
      jitConfig->setProcessorInfo(buf);
      }
   else
      {
      if (!buf)
         {
         if (jitConfig && jitConfig->getProcessorInfo())
            {
            buf = (TR_X86CPUIDBuffer *)jitConfig->getProcessorInfo();
            }
         else
            {
            buf = (TR_X86CPUIDBuffer *) malloc(sizeof(TR_X86CPUIDBuffer));

            if (!buf)
               memcpy(buf->_vendorId, "Poughkeepsie", 12); // 12 character dummy string (NIL term not used)

            buf->_processorSignature = 0;
            buf->_brandIdEtc = 0;
            buf->_featureFlags = 0x00000000;
            buf->_cacheDescription.l1instr = 0;
            buf->_cacheDescription.l1data  = 0;
            buf->_cacheDescription.l2      = 0;
            buf->_cacheDescription.l3      = 0;
            }
         }
      }

   return buf;
   }
  • the omr version seems unnecessarily complicated for such a straightforward task

  • it looks like it's trying to cache the results in jitConfig, but if we cache the result in the CPU class, then it's probably not necessary.

  • also this query means that if we don't have a jitConfig then we can't get the results, which doesn't make much sense either

  • also getProcessorInfo() is not used anywhere else except in this function

  • can we replace this with the OpenJ9 version? @fjeremic

  • another question, which is about the extensible classes:
    OMR::CPU --> OMR::X86::CPU --> J9::CPU --> J9::X86::CPU --> TR::CPU
    The TR::CompilerEnv pointer will be of TR::CPU type

TR::CPU *
OMR::CPU::self()
   {
   return static_cast<TR::CPU*>(this);
   }
  • when we do self()->queryX86TargetCPUID(), which one are we actually invoking? the one closer to TR::CPU?

@harryyu1994
Copy link
Contributor

harryyu1994 commented Jan 30, 2020

  • when we do self()->queryX86TargetCPUID(), which one are we actually invoking? the one closer to TR::CPU?

This question I can answer it myself. self() returns a pointer to the most derived/specialized class, which means we will always invoke the functions "closer" to the most specialized class.
Dynamic polymorphism: we have a base pointer, then rely on the virtual keyword to find the most specialized function.
Static polymorphism (what's used here): we will always get the most specialized pointer through self()

harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jan 31, 2020
The purpose of this change is to make x86 consistent with other platforms (Power, Z) which use the CPU class instead of codegen to query and cache processor related information.

Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jan 31, 2020
The purpose of this change is to make x86 consistent with other platforms (Power, Z) which use the CPU class instead of codegen to query and cache processor related information.

Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
@fjeremic
Copy link
Contributor Author

fjeremic commented Jan 31, 2020

can we replace this with the OpenJ9 version? @fjeremic

Yes we can. Ideally as you mentioned earlier where we want to arrive at the end is for this processor detection to be migrated into the port library (sysinfo_get_processor_description) and x86 would query that, similarly to other platforms.

harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jun 8, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jun 8, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jun 8, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jun 8, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jun 8, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/omr that referenced this issue Jun 9, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
kbeaton2-UNB3035 pushed a commit to CAS-Atlantic/omr that referenced this issue Jun 23, 2020
- Replace old processor APIs with new APIs from the CPU class
- Replace global TR::Compiler target with per comp target so that each compilation can either use target or relocatableTarget depending on the type of the compilation (JIT or AOT)
- When there is no port lib available, continue to use old APIs

Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
kbeaton2-UNB3035 pushed a commit to CAS-Atlantic/omr that referenced this issue Jun 23, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
kbeaton2-UNB3035 pushed a commit to CAS-Atlantic/omr that referenced this issue Jun 23, 2020
Issue: eclipse-omr#4339

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
harryyu1994 added a commit to harryyu1994/openj9 that referenced this issue Sep 23, 2020
Summary of changes:
- Retire old version cpu detection code on Power
- Move `CPU::getProcessorDescription()` to base J9CPU class as the 3 platforms (X, Z, Power) share the same code. There is no need to have 3 copies
- Reasons for the removal:
  - We kept the old version because we wanted to make sure new version and old version give us the same behaviour
  - Now it's been a few month and we don't see any bugs with the new version there's no reason to keep the old version
  - Also we want to prevent developers from using the old version.

Old design
- Initializes the `TR_Processor` struct via j9port library
  - Initialization occurs in `TR_J9VM::initializeProcessorType()`
  - obtain cpu information using `CPU::TO_PORTLIB_getJ9ProcessorDesc()`
  - `TR::Compiler->target.cpu.setProcessor` to set processor
- Fields and methods associated with the old design live in `Class CPU`
- Common APIs:
  - `CPU::TO_PORTLIB_getJ9ProcessorDesc()`
  - `j9sysinfo_processor_has_feature(processorDesc, feature)` (j9port library API)
  - `CPU::id()`
  - `CPU::setProcessor(TR_Processor)`

New design
- Initializes the `OMRProcessorDesc` struct via omrport library
  - `CPU::detect()` to initialize `OMRProcessorDesc processorDescription`
  - `CPU::applyUserOptions()` to apply any debug options after initialization
  - `TR::Compiler->target.cpu` singleton variable that holds the CPU instance
  - `comp()->target().cpu` per compilation variable which is used for cross-cpu compilation support
    - For JIT we initialize `comp()->target().cpu` with `TR::Compiler->target.cpu` (Host CPU)
    - For AOT we initialize `comp()->target().cpu` with `TR::Compiler->relocatableTarget.cpu` (Portable CPU)
- Fields and methods associated with the new design live in `Class CPU`
- Common APIs:
  - `CPU::supportsFeature(feature)`
  - `CPU::isAtLeast(OMRProcessorArchitecture)`
- Benefits:
  - Common to all 3 platforms
  - Simpler interface
  - Cross-CPU compilation support (Use one cpu for AOT and another for JIT)
- The implementation for the new design is first introduced here eclipse-omr/omr#5197 and eclipse-openj9#9571

issue: eclipse-omr/omr#4339
depends on: eclipse-omr/omr#5547

Signed-off-by: Harry Yu <harryyu1994@gmail.com>
@harryyu1994
Copy link
Contributor

harryyu1994 commented Oct 5, 2020

Latest Changes:

Summary of changes:

  • Retire old version cpu detection code on Power
  • Move CPU::getProcessorDescription() to base J9CPU class as the 3 platforms (X, Z, Power) share the same code. There is no need to have 3 copies
  • Reasons for the removal:
    • We kept the old version because we wanted to make sure new version and old version give us the same behaviour
    • Now it's been a few month and we don't see any bugs with the new version there's no reason to keep the old version
    • Also we want to prevent developers from using the old version.

Old design

  • Initializes the TR_Processor struct via j9port library
    • Initialization occurs in TR_J9VM::initializeProcessorType()
    • obtain cpu information using CPU::TO_PORTLIB_getJ9ProcessorDesc()
    • TR::Compiler->target.cpu.setProcessor to set processor
  • Fields and methods associated with the old design live in Class CPU
  • Common APIs:
    • CPU::TO_PORTLIB_getJ9ProcessorDesc()
    • j9sysinfo_processor_has_feature(processorDesc, feature) (j9port library API)
    • CPU::id()
    • CPU::setProcessor(TR_Processor)

New design

  • Initializes the OMRProcessorDesc struct via omrport library
    • CPU::detect() to initialize OMRProcessorDesc processorDescription
    • CPU::applyUserOptions() to apply any debug options after initialization
    • TR::Compiler->target.cpu singleton variable that holds the CPU instance
    • comp()->target().cpu per compilation variable which is used for cross-cpu compilation support
      • For JIT we initialize comp()->target().cpu with TR::Compiler->target.cpu (Host CPU)
      • For AOT we initialize comp()->target().cpu with TR::Compiler->relocatableTarget.cpu (Portable CPU)
  • Fields and methods associated with the new design live in Class CPU
  • Common APIs:
    • CPU::supportsFeature(feature)
    • CPU::isAtLeast(OMRProcessorArchitecture)
  • Benefits:
    • Common to all 3 platforms
    • Simpler interface
    • Cross-CPU compilation support (Use one cpu for AOT and another for JIT)

Introduce Feature Masks

  • Guard cpu feature tests against old version cpu feature detection on x86 with an environment variable.
  • Remove redundant assertion tests on x86.
  • Eliminate applyUserOptions().
    • Disassociate the options from TR::CPU.
    • Issue with applyUserOptions() is that the options are not initialized when TR::CPU is initialized which means if the user calls applyUserOptions() at the incorrect place the options queries that it contains will either crash the program or not take effect. I want to eliminate this risk of incorrect usage.
  • Introduce new API called TR::CPU::customize(OMRProcessorDesc processorDescription)
    • it's a static factory method that takes in an OMRProcessorDesc from user and output a TR::CPU object
    • The design is that we do not give user to ability to modify the fields of TR::CPU object in place. You will need to go through a factory method to change TR::CPU. This makes it easier for us to keep track where TR::CPU is modified and eliminate any unintentional illegal TR::CPU modifications from user which can be disastrous.
  • Introduce new API called TR::CPU::enableFeatureMasks()
    • You can turn on "featureMasks" support using this API
    • Once "featureMasks" support is turned on, we will mask out all cpu features that are not used by the current compiler from the OMRProcessorDesc struct in TR::CPU which allows to do CPU compatibility checks for AOT more correctly
    • For each platform, we keep track of the utilized cpu features inside TR::CPU::enableFeatureMasks()
    • If developer utilizes a previously unused feature, they need to add it to TR::CPU::enableFeatureMasks()
    • We have assert tests in place to detect queries to unused feature
      • The purpose of this detection is to minimize the risk of masking out cpu features that are being used.
  • Summary of current design:
    • TR::CPU::detect(omrPortLib): static factory method that gives you the TR::CPU object based on port library
    • TR::CPU::customize(processorDescription): static factory method that gives you the TR::CPU object based on a processorDescription provided by user

@0xdaryl
Copy link
Contributor

0xdaryl commented Apr 26, 2021

@fjeremic @dsouzai : There are a lot of detailed comments in this issue tracking the progress of this work (thanks @harryyu1994!). However, can one of you please provide a summary of what work remains to be done from its current state?

@nbhuiyan will return to the port library initialization problem soon (he's working on a PR) which will make the port lib accessible within OMR and should deprecate some of the duplicate code within OMR itself.

@fjeremic
Copy link
Contributor Author

AFAIK the detection has been sunk into OMR and there are two main pieces still remaining to finish:

  1. Hookup the OMR port library to the detection

    For example we have the following code: https://github.com/eclipse/omr/blob/3d85fbdf9c73e46c6da5df8989c119164711b368/compiler/env/OMRCompilerEnv.cpp#L77-L79

    Which detects the target CPU using the OMR port library. Unfortunately because the port library has not been hooked up to the compiler in OMR this value is NULL, so when we query for processor detection, for example here:https://github.com/eclipse/omr/blob/3d85fbdf9c73e46c6da5df8989c119164711b368/compiler/x/env/OMRCPU.cpp#L244-L256

    We default down to using the old API.

  2. Deprecate the old processor detection code

    When the port library is hooked up to the compiler the above queries should go away since the port library should never be NULL and the old processor detection APIs become dead at that point. We need to clean them up.

@fjeremic
Copy link
Contributor Author

Just stumbled on eclipse-openj9/openj9#12670 where it seems we have duplicated code for processor detection in OpenJ9. We need to remove this as part of the migration effort here to avoid developers adding more code into the OpenJ9 versions.

@nbhuiyan wondering if you have a timeline to resume this work?

@nbhuiyan
Copy link
Contributor

@fjeremic I plan on resuming this work later this week or early next week.

@r30shah
Copy link
Contributor

r30shah commented Jul 19, 2022

While fixing a bug to make sure we do vectorize some of the vector API intrinsics for Float type (On Z, we need to have Vector Facility Enhancement-1 installed to generate vector instruction for float type), we found out that we do not set the _flags which were supposed to be checked when portLib is null in compiler [1][2]. While working on the change, I found out that we do not set the _flags on first place, so thought of cleaning up them but bumped into this issue.

I guess continuing discussion from Filip's comment, if we have hooked up the port library to compiler, we can clean-up the old checks so that we would not use them accidentally. Trying to find out in the code-base, if we ever had change for hooking up portlib to compiler but can not find. @0xdaryl @nbhuiyan did we finished working on remaining pieces of work for this issue and can the code-gen go ahead and clean-up old feature detection code?

[1]. https://github.com/eclipse/omr/blob/edf2ae5033cb412348dbeb74d31bab2d26fe933d/compiler/z/env/OMRCPU.cpp#L101-L102
[2]. https://github.com/eclipse/omr/blob/edf2ae5033cb412348dbeb74d31bab2d26fe933d/compiler/z/env/OMRCPU.cpp#L154

@nbhuiyan
Copy link
Contributor

@r30shah I had the work to support the port library usage in the compiler completed a few months ago, but I was not able to get around to opening a PR for it yet. I am working on that and will update this issue soon.

@r30shah
Copy link
Contributor

r30shah commented Jul 20, 2022

Thanks a lot @nbhuiyan , We will wait for your PR before cleaning up the old processor feature recognition code. Trying to understand so as for current state, in the function supportsFeature we are not guaranteed to have omrPortLib ? I am asking as we do not set the flags for old API anywhere in the code, and right now code in the same function does seem to call [1].

[1]. https://github.com/eclipse/omr/blob/066d80097f5c72c48456423b87cd495a7c2c8277/port/unix/omrsysinfo.c#L762

@nbhuiyan
Copy link
Contributor

Currently while within the OMR Compiler, there is no guarantee that the port library is initialized. This is because OMR Compiler-based projects can initialize the compiler without the port library, which made it necessary to perform the check for the port library when used within OMR Compiler code. Given that the flags necessary for the old API is never set currently, supportsFeature was only returning the correct results if the port library was available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants