Page MenuHomeFreeBSD

loader: support for mixed-endianness ELF/loader and POWER8
ClosedPublic

Authored by wma on Sep 20 2017, 11:09 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 4, 8:30 PM
Unknown Object (File)
Thu, Jan 2, 9:46 AM
Unknown Object (File)
Sun, Dec 29, 12:37 AM
Unknown Object (File)
Wed, Dec 18, 8:51 PM
Unknown Object (File)
Fri, Dec 13, 2:31 PM
Unknown Object (File)
Dec 8 2024, 10:45 AM
Unknown Object (File)
Dec 8 2024, 10:45 AM
Unknown Object (File)
Dec 8 2024, 10:45 AM
Subscribers

Details

Summary

On POWER8 with the latest petitpoot, the loader.kboot is being
run as little-endian application. The FreeBSD kernel is
always big-endian on PPC, so the load_elf_* routines must be aware
of proper endianness of all fields.

Also:

  • dynamically resolve the base address for kexec load
  • modify trampoline code to be mixed-endianness compatible
  • fix syscall implementation on PPC Linux

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/boot/common/load_elf.c
90–91

Since there's only 3 of them (each) I'd be tempted to encode the bits into the macro name. I think the code will be more readable that way... Just a thought.

681–690

Why can't all this be inside of arch_copyin?

747–748

Don't like the duplication. Can this code be written so that you don't need it?

sys/boot/powerpc/kboot/main.c
39

I'd remove this

sys/boot/powerpc/kboot/ppc64_elf_freebsd.c
165

I thought __dead2 was the right way to tag a function to keep the compiler happy...

sys/boot/common/load_elf.c
681–690

arch_copyin is responsible only for copying, and is not aware of the nature of the data provided - it's just an architecture-specific equivalent of bcopy.

sys/boot/powerpc/kboot/ppc64_elf_freebsd.c
165

Might be. Tomorrow I'll check why it was added and remove if not required.

sys/boot/powerpc/kboot/ppc64_elf_freebsd.c
137

Uh, should be htobe here as well.

Some advice from an IBM firmware team member:
‎[22:51] ‎<‎segher‎>‎ ah ok. some review... don't use endian-swapping macros; use a macro to do the actual memory accesses in the endian you want, instead
‎[22:52] ‎<‎segher‎>‎ and yup, very cool to see some other os on p8 :-)

Kevin, is there any explanation for this change?
All ELF parsing code is machine-independent. Some architectures don't have instructions allowing choose of data endianness, so I don't see a way how could we avoid using generic byte-swap macros.
I've used instructions you mentioned in the machine-specific trampoline code, which is a part of ppc64-kernel bootstrap and where we no longer care about code interoperatibility.

Also, most changes here touch only the loader.kboot, which is a Linux appliation being run to start the fbsd kernel.

I'm just relaying a conversation :)

‎[02:14] ‎<‎segher‎>‎ https://commandcenter.blogspot.nl/2012/04/byte-order-fallacy.html
‎[02:15] ‎<‎segher‎>‎ pike indepently reached the same conclusion as i did, so we must both be right ;-)
‎[02:16] ‎<‎segher‎>‎ the point is that you *never* have to byteswap
‎[02:16] ‎<‎segher‎>‎ either read or write in either BE or LE
‎[02:16] ‎<‎segher‎>‎ if stuff is not in memory, it *has* no endianness
‎[02:24] ‎<‎segher‎>‎ the point is that you should not access "foreign" data as if it is "native"

Freenode #ppc64 if you want to go direct, a lot of the firmware and kernel team are there

If I understood you correctly, you're referring to the part such as load_elf, lines 131-177.
The purpose of this code is to read the ELF header and store it as ehdr structure, which can be accessed by CPU. To do that, following algorithm is used:

  1. Read the ELF header structure and store it in the memory. Loader operates on blocks of data, so we're loading the whole chunk of data at once using "block read" operation.
  2. We might fall into sutuation, that some ELF header contents are BE-encoded while the loader runs as LE.
  3. An exact ehdr structure shape denepends on the ELF arch/bits/etc. and is generated during compilation.

I read Pike's blog, and agree that having

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif

is just dumb. However, the case here is sligtly different

function load_elf_header(...) {
 read(fd, ehdr, PAGE_SIZE)
 /* The ELF might have values either BE or LE encoded, so once the structure is loaded to memory using " block read", we need to be aware of its endianness */
 // here goes the swapping code making structure fields native-endian
 xETOH(ehdr->e_type, 16);  // etc.
}

/* generic code, which uses ehdr->e_xxx values to navigate through ELF file */
printf("%x", ehdr->e_entry);  // etc.

The problem here has two solutions:

  1. Swap the fields just after the blob containing elfhdr is loaded into the memory, since that data is only read and used by the loader code itself. This is the way how it's implemented here.
  2. Leave structure fields in as-loaded endianness and provide accessors for the structure fields performing Xetoh translation when necessary. I didn't go that path, because:
    • the accessors will be complex and every layer would require an access to the ehdr to resovle class and endianness. For example, the section header accessor would be like
shdr_XXX_get(ehdr, shdr) {
   if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
      if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
         return  be64toh(shdr->XXX);
      else
         return  be32toh(shdr->XXX);
   else
      if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
         return  le64toh(shdr->XXX);
      else
         return  le32toh(shdr->XXX);
}
  • changing every possible place which access hdr->sh_XXX directly to use accessors will generate diff having about 2-3k lines changed. I just don't see a reason for such complication, especially that it all maps to the same operation - swapping the structure fields before they're used.
  • it is more logical that the function called elf_load_header returns a header structure which fields can be used directly without any black magic
  • I wanted to have an ugly and error prone code contained to one file instead of spread across numerous places :)

Just to be clear, I'm not opting for my solution and I'm OK with either #1 or #2. It's just a matter of which one suits our needs.

Any objections for commiting this?

This revision was not accepted when it landed; it landed in state Needs Review.Jan 29 2018, 9:24 AM
This revision was automatically updated to reflect the committed changes.