Difference between revisions of "Development:InstCountCI"

From FEX-Emu Wiki

Revision as of 08:21, 23 August 2023

InstCountCI is a continuous integration tool that FEX-Emu uses to ensure that instruction implementations aren't getting worse over time.

Getting Started

Make sure to follow Development:Setting_up_FEX to get an initial build environment set up.

What you need

  • An Arm64 Linux device that can build FEX

Additional cmake options

Some additional options need to be passed to the FEX-Emu cmake invocation to get the tests building.

  • -DBUILD_TESTS=True
  • -DENABLE_VIXL_DISASSEMBLER=True

Quality of life improvements

Add these cmake options to speed up iteration and to enable debug assertions that catch problems.

  • -DENABLE_LTO=False
  • -DCMAKE_BUILD_TYPE=RelWithDebInfo
  • -DENABLE_ASSERTIONS=True
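
Putting both option lists together, a configure step might look like the sketch below. It assumes you run it from a build directory directly inside the FEX checkout (so `..` is the source tree) and that you use the Ninja generator, which the `ninja` commands on this page suggest; adjust both to your setup. `FEX_CMAKE_OPTIONS` is only a hypothetical variable name for this example.

```shell
# Options from the two lists above: the first two are required for the tests,
# the last three are the quality-of-life settings.
FEX_CMAKE_OPTIONS="-DBUILD_TESTS=True -DENABLE_VIXL_DISASSEMBLER=True -DENABLE_LTO=False -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_ASSERTIONS=True"

# Assumption: the current directory is a build directory one level below the
# FEX source tree. Only configure if that looks true; otherwise just print
# the command that would have been run.
if [ -f ../CMakeLists.txt ]; then
  # $FEX_CMAKE_OPTIONS is left unquoted on purpose so it splits into words.
  cmake -G Ninja $FEX_CMAKE_OPTIONS ..
else
  echo "No ../CMakeLists.txt found; would run: cmake -G Ninja $FEX_CMAKE_OPTIONS .."
fi
```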

Running InstCountCI

First, you need to build the tests. This step parses all the JSON files inside unittests/InstructionCountCI/ and sets up the CI run for the next step.

  • ninja instcountci_test_files

Next, run the tests. This runs every instruction declared in unittests/InstructionCountCI/*.json. It is okay if this step fails: a failure means either that an instruction translation has gotten worse or, if the test crashed, that something catastrophic happened.

  • ninja instcountci_tests

The next step takes the data generated by the previous step and updates the JSON files that are tracked by git.

  • ninja instcountci_update_tests

To see how the implementations have changed, run git diff and look at the changes to the JSON files in unittests/InstructionCountCI/.

  • Reset the files with git checkout -- unittests/InstructionCountCI/*.json if the changes weren't desired.
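
The steps above can be chained in a small shell helper. This is only a sketch: `instcountci_refresh`, `FEX_SRC` and `FEX_BUILD` are hypothetical names, and the default paths are assumptions about where your checkout and build directory live. It relies on `ninja -C` and `git -C` to pick the directory each command runs in.

```shell
# Assumed locations; override them in your environment if they differ.
FEX_SRC="${FEX_SRC:-$HOME/FEX}"
FEX_BUILD="${FEX_BUILD:-$FEX_SRC/Build}"

instcountci_refresh() {
  # Step 1: generate the test files from the JSON descriptions.
  ninja -C "$FEX_BUILD" instcountci_test_files || return 1

  # Step 2: run the tests. A failure here is acceptable (a translation may
  # simply have gotten worse), so report it and carry on.
  ninja -C "$FEX_BUILD" instcountci_tests ||
    echo "instcountci_tests reported failures; updating the JSON anyway" >&2

  # Step 3: write the new results into the tracked JSON files.
  ninja -C "$FEX_BUILD" instcountci_update_tests || return 1

  # Finally, show what changed relative to git.
  git -C "$FEX_SRC" diff -- unittests/InstructionCountCI/
}
```

Calling `instcountci_refresh` then runs the whole cycle once. If the second step crashed rather than merely reporting a regression, treat the resulting diff with suspicion.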

Diving deeper into the assembly

While the instruction count CI is good at showing the final result, it isn't the best at showing what FEX did to get to that result. This is where the assembly test harness can come in handy.

  • Create a file in unittests/ASM/Test.asm
  • Add the following data:
 %ifdef CONFIG
 {
 }
 %endif
 addps xmm0, xmm1
 hlt
  • Recompile the asm tests with `ninja asm_files`
  • Run the assembly test manually now with FEX_DUMPIR=stderr FEX_DISASSEMBLE=blocks ./Bin/TestHarnessRunner -c irjit -n 1 -g ./unittests/ASM/Test.asm.bin ./unittests/ASM/Test.asm.config.bin
    • This will dump both FEX's internal IR and the disassembly of the code for each instruction
    • The second code block for the hlt can be ignored. It is just necessary for this test harness to run.
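
When trying several instructions in a row, the steps above can be wrapped in a helper. This is a sketch under assumptions: `run_asm_test`, `FEX_SRC` and `FEX_BUILD` are hypothetical names, and it assumes the runner command shown above is meant to be executed from the build directory.

```shell
# Assumed locations; override them in your environment if they differ.
FEX_SRC="${FEX_SRC:-$HOME/FEX}"
FEX_BUILD="${FEX_BUILD:-$FEX_SRC/Build}"

run_asm_test() {
  # Instruction to test; defaults to the one used on this page.
  insn="${1:-addps xmm0, xmm1}"

  # Write the minimal test file with an empty config block.
  cat > "$FEX_SRC/unittests/ASM/Test.asm" <<EOF
%ifdef CONFIG
{
}
%endif
$insn
hlt
EOF

  # Rebuild the asm tests, then run the harness from the build directory
  # with IR dumping and per-block disassembly enabled.
  ninja -C "$FEX_BUILD" asm_files || return 1
  (
    cd "$FEX_BUILD" &&
      FEX_DUMPIR=stderr FEX_DISASSEMBLE=blocks ./Bin/TestHarnessRunner \
        -c irjit -n 1 -g ./unittests/ASM/Test.asm.bin ./unittests/ASM/Test.asm.config.bin
  )
}
```

For example, `run_asm_test "addps xmm0, xmm1"` rewrites Test.asm, rebuilds it and runs the harness in one go.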

The resulting output will be:

 IR-post 0x10000:
       (%0) IRHeader %2, #65536, #0, #1
       (%2) CodeBlock %3, %10
               (%3 i0) BeginBlock %2(Invalid)
               %4(FPRFixed1) i128 = LoadRegister #0x0, #0xd0, FPR, FPRFixed, u8:Tmp:Size
               %5(FPRFixed0) i128 = LoadRegister #0x0, #0xc0, FPR, FPRFixed, u8:Tmp:Size
               %6(FPRFixed0) i32v4 = VFAdd u8:Tmp:RegisterSize, u8:Tmp:ElementSize, %5(FPRFixed0) i128, %4(FPRFixed1) i128
               (%7 i128) StoreRegister %6(FPRFixed0) i32v4, #0x0, #0xc0, FPR, FPRFixed, u8:Tmp:Size
               (%8 i64) InlineEntrypointOffset #0x3, u8:Tmp:RegisterSize
               (%9 i64) ExitFunction %8(Invalid)
               (%10 i0) EndBlock %2(Invalid)
 @@@@@
 [INFO] Disassemble Begin
 [INFO] adr x0, #-0x4 (addr 0xffff6fa00018)
 [INFO] str x0, [x28, #184]
 [INFO] fadd v16.4s, v16.4s, v17.4s
 [INFO] ldr x0, pc+8 (addr 0xffff6fa00030)
 [INFO] blr x0
 [INFO] unallocated (Unallocated)
 [INFO] udf #0xffff
 [INFO] unallocated (Unallocated)
 [INFO] udf #0x0
 [INFO] Disassemble End
  • The disassembly has some instructions at the start and end which are necessary for the JIT to run
    • InstCountCI strips this code out automatically.
  • When a single instruction is tested in isolation, the code block header and tail can dominate the code size.
    • It's recommended to become familiar with what the header and tail look like and ignore it in the resulting code generation.
    • Currently the header is the first two instructions adr+str
    • Currently the tail starts with the ldr+blr after the fadd and continues with some metadata afterwards.
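
If you would rather not skip the header and tail by eye, a small filter can drop them from the harness output. This is a heuristic sketch, not how InstCountCI itself strips the code: `strip_block_overhead` is a hypothetical helper that assumes the header is exactly the first two instructions and that the tail begins at the first pc-relative ldr immediately followed by a blr, as described above. It will need adjusting if FEX changes either.

```shell
strip_block_overhead() {
  awk '
    /Disassemble Begin/ { in_block = 1; n = 0; held = ""; tail_seen = 0; next }
    /Disassemble End/   { if (held != "") print held; held = ""; in_block = 0; next }
    !in_block           { next }              # ignore IR dumps and other log lines
    {
      sub(/^.*\[INFO\] /, "")                 # drop the log prefix
      n++
      if (n <= 2 || tail_seen) next           # header (adr + str) or already in the tail
      if (held != "") {
        # A held pc-relative ldr followed by blr marks the start of the tail.
        if ($0 ~ /^blr /) { held = ""; tail_seen = 1; next }
        print held                            # it was an ordinary instruction after all
        held = ""
      }
      if ($0 ~ /^ldr .*pc\+/) { held = $0; next }
      print
    }
  '
}
```

Pipe the harness output through it, for example by appending `2>&1 | strip_block_overhead` to the TestHarnessRunner command. For the output shown above, this leaves only the line `fadd v16.4s, v16.4s, v17.4s`.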

Diving Deeper

I would recommend looking at the man page for FEX to see additional options that can be useful.

  • Specifically the options for FEX_PASSMANAGERDUMPIR to get more IR dumping options and FEX_HOSTFEATURES to fake CPU feature support.
  • Enabling the vixl simulator with the cmake option -DENABLE_VIXL_SIMULATOR=True can be useful to test features your CPU doesn't support!
    • Be warned! This invalidates InstCountCI results, since it changes the instruction counts for x87 instructions, which need to jump out of the JIT.