PPC64: gcc currently compiles for power4 by default, causing glibc's sqrtf to fail on e6500

Discussion:

Bas Vermeulen

2018-02-06 14:50:01 UTC

Hi,

I am trying to run the ppc64 unstable on a Freescale T2080, which uses the
e6500 CPU. Running python (or any other application using sqrt or sqrtf)
will cause an illegal instruction exception, because the sqrtf opcode is
not supported on the e6500.

This seems to be caused by gcc compiling for power4 by default
(_ARCH_PWR4=1 and _ARCH_PPCSQ=1 set in gcc -E -dM - < /dev/null), although
jrtc27 on #debian-ports pointed to
https://github.com/gcc-mirror/gcc/blob/da8dff89fa9398f04b107e388cb706517ced9505/gcc/config/rs6000/default64.h#L30
which sets MASK_PPC_GPOPT by default (which includes fp sqrt).

What would be the best way to solve this problem?

Bas Vermeulen

Lennart Sorensen

2018-02-06 16:30:01 UTC

Post by Bas Vermeulen
I am trying to run the ppc64 unstable on a Freescale T2080, which uses the
e6500 CPU. Running python (or any other application using sqrt or sqrtf)
will cause an illegal instruction exception, because the sqrtf opcode is
not supported on the e6500.
This seems to be caused by gcc compiling for power4 by default
(_ARCH_PWR4=1 and _ARCH_PPCSQ=1 set in gcc -E -dM - < /dev/null), although
jrtc27 on #debian-ports pointed to
https://github.com/gcc-mirror/gcc/blob/da8dff89fa9398f04b107e388cb706517ced9505/gcc/config/rs6000/default64.h#L30
which sets MASK_PPC_GPOPT by default (which includes fp sqrt).
What would be the best way to solve this problem?

Why is the default not powerpc64 instead of power4? After all that is
the setting for generic 64-bit big-endian powerpc as far as I know.

It seems the e6500 is missing quite a few floating point instructions
from the Power ISA so targeting power4 (which is I believe full Power
ISA 2.01) is likely to cause issues.

--
Len Sorensen

John Paul Adrian Glaubitz

2018-02-06 16:50:01 UTC

I am evaluating the following patch to gcc-7 to fix the problem. It's currently building, I'll follow up when I know it works. The patch is added to debian/rules.patch to get it included.

Why do you want to patch the upstream sources when you could just modify debian/rules2 to pass the proper "--with-cpu=powerpc64" to gcc's configure script?

Adrian

Bas Vermeulen

Post by Lennart Sorensen

Post by Bas Vermeulen
I am trying to run the ppc64 unstable on a Freescale T2080, which uses the
e6500 CPU. Running python (or any other application using sqrt or sqrtf)
will cause an illegal instruction exception, because the sqrtf opcode is
not supported on the e6500.
This seems to be caused by gcc compiling for power4 by default
(_ARCH_PWR4=1 and _ARCH_PPCSQ=1 set in gcc -E -dM - < /dev/null), although
jrtc27 on #debian-ports pointed to
https://github.com/gcc-mirror/gcc/blob/da8dff89fa9398f04b107e388cb706517ced9505/gcc/config/rs6000/default64.h#L30
which sets MASK_PPC_GPOPT by default (which includes fp sqrt).
What would be the best way to solve this problem?

Why is the default not powerpc64 instead of power4? After all that is
the setting for generic 64-bit big-endian powerpc as far as I know.
It seems the e6500 is missing quite a few floating point instructions
from the Power ISA so targeting power4 (which is I believe full Power
ISA 2.01) is likely to cause issues.
--
Len Sorensen

<ppc64-use-powerpc64-by-default.diff>

Bas Vermeulen

2018-02-06 17:10:01 UTC

Mostly because I didn't know/think of that. But you are right, that would
be better.

Bas Vermeulen

On Tue, Feb 6, 2018 at 5:48 PM, John Paul Adrian Glaubitz <

I am evaluating the following patch to gcc-7 to fix the problem. It's
currently building, I'll follow up when I know it works. The patch is added
to debian/rules.patch to get it included.
Why do you want to patch the upstream sources when you could just modify
debian/rules2 to pass the proper "--with-cpu=powerpc64" to gcc's configure
script?
Adrian
Bas Vermeulen
On Tue, Feb 6, 2018 at 5:16 PM, Lennart Sorensen <

Post by Bas Vermeulen
I am trying to run the ppc64 unstable on a Freescale T2080, which uses the
e6500 CPU. Running python (or any other application using sqrt or sqrtf)
will cause an illegal instruction exception, because the sqrtf opcode is
not supported on the e6500.
This seems to be caused by gcc compiling for power4 by default
(_ARCH_PWR4=1 and _ARCH_PPCSQ=1 set in gcc -E -dM - < /dev/null), although
jrtc27 on #debian-ports pointed to
https://github.com/gcc-mirror/gcc/blob/da8dff89fa9398f04b107e388cb706517ced9505/gcc/config/rs6000/default64.h#L30
which sets MASK_PPC_GPOPT by default (which includes fp sqrt).
What would be the best way to solve this problem?

Why is the default not powerpc64 instead of power4? After all that is
the setting for generic 64-bit big-endian powerpc as far as I know.
It seems the e6500 is missing quite a few floating point instructions
from the Power ISA so targeting power4 (which is I believe full Power
ISA 2.01) is likely to cause issues.
--
Len Sorensen

<ppc64-use-powerpc64-by-default.diff>

John Paul Adrian Glaubitz

2018-02-06 18:00:01 UTC

Mostly because I didn't know/think of that. But you are right, that would be better.
https://sources.debian.org/src/gcc-7/7.3.0-1/debian/rules2/#L380-L392

It would be nice if anyone who cares about the ppc64 could actually
post what kind of hardware they have. We should find a consensus on
where to put the baseline.

Rebuilding the whole archive isn't so much of a problem. We have already
done that :).

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - ***@debian.org
`. `' Freie Universitaet Berlin - ***@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Bas Vermeulen

2018-02-06 18:30:02 UTC

I just checked gcc's configure, and that doesn't take --with-cpu= as an
argument. It doesn't fail when it has it, but it doesn't actually do
anything with it either.
So my patch would be the correct way to handle this.

Bas Vermeulen

On Tue, Feb 6, 2018 at 6:58 PM, John Paul Adrian Glaubitz <

Post by Bas Vermeulen
Mostly because I didn't know/think of that. But you are right, that
would be better.
https://sources.debian.org/src/gcc-7/7.3.0-1/debian/rules2/#L380-L392

It would be nice if anyone who cares about the ppc64 could actually
post what kind of hardware they have. We should find a consensus on
where to put the baseline.
Rebuilding the whole archive isn't so much of a problem. We have already
done that :).
Adrian
--
.''`. John Paul Adrian Glaubitz
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

John Paul Adrian Glaubitz

2018-02-06 18:30:02 UTC

I just checked gcc's configure, and that doesn't take --with-cpu= as an argument. It doesn't fail when it has it, but it doesn't actually do anything with it either.
So my patch would be the correct way to handle this.
https://sources.debian.org/src/gcc-7/7.3.0-1/debian/rules2/#L390

It's in the very same file I linked :).

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - ***@debian.org
`. `' Freie Universitaet Berlin - ***@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Bas Vermeulen

2018-02-06 18:30:02 UTC

I get that. But configure in the gcc sources doesn't actually process
those. So while they are in rules2, it doesn't actually change anything.

See
https://github.com/gcc-mirror/gcc/blob/da8dff89fa9398f04b107e388cb706517ced9505/configure
and search for with-cpu (which would catch both --with-cpu-32 and
--with-cpu-64), and you won't find it there.

Bas Vermeulen

On Tue, Feb 6, 2018 at 7:25 PM, John Paul Adrian Glaubitz <

Post by Bas Vermeulen
I just checked gcc's configure, and that doesn't take --with-cpu= as an
argument. It doesn't fail when it has it, but it doesn't actually do
anything with it either.
So my patch would be the correct way to handle this.
https://sources.debian.org/src/gcc-7/7.3.0-1/debian/rules2/#L390

It's in the very same file I linked :).
--
.''`. John Paul Adrian Glaubitz
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

John Paul Adrian Glaubitz

2018-02-06 18:40:01 UTC

I get that. But configure in the gcc sources doesn't actually process those. So while they are in rules2, it doesn't actually change anything.
See https://github.com/gcc-mirror/gcc/blob/da8dff89fa9398f04b107e388cb706517ced9505/configure and search for with-cpu (which would catch both --with-cpu-32 and
--with-cpu-64), and you won't find it there.

I really don't think that Matthias Klose would have put those there without testing.

"--with-cpu-*" is a common configure switch supported by gcc across all
architectures. We use it on sparc64 as well, for example to set the
default CPU to UltraSPARC for 32-bit sparc.

https://github.com/gcc-mirror/gcc/blob/master/gcc/config.gcc

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - ***@debian.org
`. `' Freie Universitaet Berlin - ***@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

John Paul Adrian Glaubitz

2018-02-09 10:40:01 UTC

mator on #debian-ports compiled gcc-7 for me with the attached patch.
With the resulting gcc, I compiled glibc and got a library I can use
sqrtf without running into an illegal instruction exception.
Would it be possible to get this applied by default? The resulting
binaries work on e6500, and ought to work on all supported CPUs
for the ppc64 port.

This is something that needs to be discussed. A single user alone shouldn't
warrant such a major change in a port. You always have to keep in mind that
changing the default compiler options also has a potential impact on the
performance of more modern ppc64 systems like Apple Macintosh.

So, while I'm generally not against such a change, I would like to hear
some voices first.

Thanks,
Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - ***@debian.org
`. `' Freie Universitaet Berlin - ***@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Mathieu Malaterre

2018-02-09 11:30:02 UTC

On Fri, Feb 9, 2018 at 11:34 AM, John Paul Adrian Glaubitz

Post by John Paul Adrian Glaubitz

mator on #debian-ports compiled gcc-7 for me with the attached patch.
With the resulting gcc, I compiled glibc and got a library I can use
sqrtf without running into an illegal instruction exception.
Would it be possible to get this applied by default? The resulting
binaries work on e6500, and ought to work on all supported CPUs
for the ppc64 port.

This is something that needs to be discussed. A single user alone shouldn't
warrant such major change in a port. You always have to keep in mind that
changing the default compiler options also has potential impact on the
performance on more modern ppc64 systems like Apple Macintosh.
So, while I'm generally not against such a change, I would like to hear
some voices first.

Uh, could someone please actually list the diff? I am guessing something like:

$ echo | gcc -dM -E -

or maybe:

$ gcc -Q --help=target

Bas Vermeulen

2018-02-09 12:00:01 UTC

diff of gcc -dM -E - without and with the patch applied:

--- gcc-default 2018-01-28 18:06:14.116000000 +0000
+++ gcc-powerpc64 2018-01-28 18:03:06.572000000 +0000
@@ -34,7 +34,6 @@
#define __FLT64_DECIMAL_DIG__ 17
#define __GCC_ATOMIC_CHAR32_T_LOCK_FREE 2
#define pixel pixel
-#define _ARCH_PPCSQ 1
#define bool bool
#define __UINT_FAST64_MAX__ 0xffffffffffffffffUL
#define __SIG_ATOMIC_TYPE__ int
@@ -78,7 +77,6 @@
#define __USER_LABEL_PREFIX__
#define __STDC_HOSTED__ 1
#define __LDBL_HAS_INFINITY__ 1
-#define _ARCH_PWR4 1
#define __builtin_vsx_xvmaddmsp __builtin_vsx_xvmaddsp
#define __CMODEL_MEDIUM__ 1
#define __FLT32_DIG__ 6

gcc -Q --help=target (default vs with the patch applied):

--- gcc-Q-default 2018-01-28 18:08:29.232000000 +0000
+++ gcc-Q-powerpc64 2018-01-28 18:07:39.984000000 +0000
@@ -35,7 +35,7 @@
-mcmodel= medium
-mcmpb [disabled]
-mcompat-align-parm [disabled]
- -mcpu= [default]
+ -mcpu= powerpc64
-mcrypto [disabled]
-mdebug=
-mdirect-move [disabled]
@@ -70,7 +70,7 @@
-mlong-double-<n> 128
-mlongcall [disabled]
-mlra [enabled]
- -mmfcrf [enabled]
+ -mmfcrf [disabled]
-mmfpgpr [disabled]
-mminimal-toc [disabled]
-mmodulo [disabled]
@@ -89,7 +89,7 @@
-mpointers-to-nested-functions [enabled]
-mpopcntb [disabled]
-mpopcntd [disabled]
- -mpower8-fusion [enabled]
+ -mpower8-fusion [disabled]
-mpower8-fusion-sign [disabled]
-mpower8-vector [disabled]
-mpower9-dform [enabled]
@@ -101,9 +101,9 @@
-mpower9-vector [disabled]
-mpowerpc
-mpowerpc-gfxopt [enabled]
- -mpowerpc-gpopt [enabled]
+ -mpowerpc-gpopt [disabled]
-mpowerpc64 [enabled]
- -mprioritize-restricted-insns= 1
+ -mprioritize-restricted-insns= 0
-mprofile-kernel [disabled]
-mprototype [disabled]
-mquad-memory [disabled]
@@ -145,7 +145,7 @@
-mtoc [disabled]
-mtoc-fusion [disabled]
-mtraceback= [default]
- -mtune= [default]
+ -mtune= powerpc64
-muclibc [disabled]
-mupdate [enabled]
-mupper-regs [enabled]

Bas Vermeulen

Bas Vermeulen

2018-02-09 12:10:01 UTC

Taking ***@daedalean.ai out of the cc list (it has an auto-reply I
didn't know about, sorry).

Bas Vermeulen


Mathieu Malaterre

2018-02-09 15:30:02 UTC

Hi all,

I apologize for the 'reply in 48 hours' spam my extcontacts mailing list
caused for any of you that replied here. I asked Bas to keep us posted but
I didn't realize it would send this out. I switched it off.
Thanks for all your hard work.
/Luuk

Post by Bas Vermeulen
--- gcc-default 2018-01-28 18:06:14.116000000 +0000
+++ gcc-powerpc64 2018-01-28 18:03:06.572000000 +0000
@@ -34,7 +34,6 @@
#define __FLT64_DECIMAL_DIG__ 17
#define __GCC_ATOMIC_CHAR32_T_LOCK_FREE 2
#define pixel pixel
-#define _ARCH_PPCSQ 1
#define bool bool
#define __UINT_FAST64_MAX__ 0xffffffffffffffffUL
#define __SIG_ATOMIC_TYPE__ int
@@ -78,7 +77,6 @@
#define __USER_LABEL_PREFIX__
#define __STDC_HOSTED__ 1
#define __LDBL_HAS_INFINITY__ 1
-#define _ARCH_PWR4 1
#define __builtin_vsx_xvmaddmsp __builtin_vsx_xvmaddsp
#define __CMODEL_MEDIUM__ 1
#define __FLT32_DIG__ 6
--- gcc-Q-default 2018-01-28 18:08:29.232000000 +0000
+++ gcc-Q-powerpc64 2018-01-28 18:07:39.984000000 +0000
@@ -35,7 +35,7 @@
-mcmodel= medium
-mcmpb [disabled]
-mcompat-align-parm [disabled]
- -mcpu= [default]
+ -mcpu= powerpc64
-mcrypto [disabled]
-mdebug=
-mdirect-move [disabled]
@@ -70,7 +70,7 @@
-mlong-double-<n> 128
-mlongcall [disabled]
-mlra [enabled]
- -mmfcrf [enabled]
+ -mmfcrf [disabled]
-mmfpgpr [disabled]
-mminimal-toc [disabled]
-mmodulo [disabled]
@@ -89,7 +89,7 @@
-mpointers-to-nested-functions [enabled]
-mpopcntb [disabled]
-mpopcntd [disabled]
- -mpower8-fusion [enabled]
+ -mpower8-fusion [disabled]
-mpower8-fusion-sign [disabled]
-mpower8-vector [disabled]
-mpower9-dform [enabled]
@@ -101,9 +101,9 @@
-mpower9-vector [disabled]
-mpowerpc
-mpowerpc-gfxopt [enabled]
- -mpowerpc-gpopt [enabled]
+ -mpowerpc-gpopt [disabled]
-mpowerpc64 [enabled]
- -mprioritize-restricted-insns= 1
+ -mprioritize-restricted-insns= 0
-mprofile-kernel [disabled]
-mprototype [disabled]
-mquad-memory [disabled]
@@ -145,7 +145,7 @@
-mtoc [disabled]
-mtoc-fusion [disabled]
-mtraceback= [default]
- -mtune= [default]
+ -mtune= powerpc64
-muclibc [disabled]
-mupdate [enabled]
-mupper-regs [enabled]

The diff looks a bit odd to me; I do not understand why we would be
moving away from the default mtune. Anyway, you have my vote. I know
Adrian sometimes gets inspiration from the Fedora team, so maybe it is
time to reach a consensus with them as well. IBM seems to be moving
toward ppc64el anyway, so I believe this should not affect too many people.

2cts,
-M

Christian Zigotzky

2018-02-09 11:40:02 UTC

We need this change too. We have e5500 CPUs in our AmigaOnes.

— Christian

Sent from my iPhone

Post by John Paul Adrian Glaubitz

mator on #debian-ports compiled gcc-7 for me with the attached patch.
With the resulting gcc, I compiled glibc and got a library I can use
sqrtf without running into an illegal instruction exception.
Would it be possible to get this applied by default? The resulting
binaries work on e6500, and ought to work on all supported CPUs
for the ppc64 port.

This is something that needs to be discussed. A single user alone shouldn't
warrant such major change in a port. You always have to keep in mind that
changing the default compiler options also has potential impact on the
performance on more modern ppc64 systems like Apple Macintosh.
So, while I'm generally not against such a change, I would like to hear
some voices first.
Thanks,
Adrian
--
.''`. John Paul Adrian Glaubitz
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Lennart Sorensen

2018-02-09 16:50:01 UTC

Post by John Paul Adrian Glaubitz
This is something that needs to be discussed. A single user alone shouldn't
warrant such major change in a port. You always have to keep in mind that
changing the default compiler options also has potential impact on the
performance on more modern ppc64 systems like Apple Macintosh.
So, while I'm generally not against such a change, I would like to hear
some voices first.

Without this change, you have to declare officially that e6500 and e5500
are simply not supported CPUs.

--
Len Sorensen

Dennis Clarke

2018-02-10 22:10:02 UTC

Post by John Paul Adrian Glaubitz

mator on #debian-ports compiled gcc-7 for me with the attached patch.
With the resulting gcc, I compiled glibc and got a library I can use
sqrtf without running into an illegal instruction exception.
Would it be possible to get this applied by default? The resulting
binaries work on e6500, and ought to work on all supported CPUs
for the ppc64 port.

This is something that needs to be discussed. A single user alone shouldn't
warrant such major change in a port. You always have to keep in mind that
changing the default compiler options also has potential impact on the
performance on more modern ppc64 systems like Apple Macintosh.

Not sure how modern an Apple Mac is but here is a photo I took only a
few minutes ago:

https://i.imgur.com/6UbviKb.jpg

I have this old Mac G5 running as a fine example of a big-endian machine
and the PPC970MP processors in it seem to work very well. However it is
certainly becoming difficult to get results from it that can compare to
what I get from some other machines like Fujitsu SPARC for example. The
biggest complaint is with floating point wherein the data representation
may be actual IEEE 754-2008 style or some new IBM variant that I am not
at all familiar with. In fact, some code, trivial, won't compile at all
if I try to use "IEEE extended precision long double" with very few ways
to get around that :

gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors \
-pedantic-errors -mabi=ieeelongdouble ...

The gcc that I am using claims to be :

GNU C99 (Debian 7.2.0-17) version 7.2.1 20171205 (powerpc64-linux-gnu)
compiled by GNU C version 7.2.1 20171205, GMP version 6.1.2,
MPFR version 3.1.6, MPC version 1.0.3, isl version isl-0.18-GMP

I can take the exact same source of a trivial floating point test and
drop it on a very old SPARC, on a system running a very up-to-date
Red Hat Enterprise Linux 7.4 with AMD Opterons, and on this old Mac
G5 with its PPC970MP processors, and I see wildly different results on
all of them. When I say "wildly" I mean to say that the in-memory data
isn't even remotely the same given the same constant inputs. I know that
the x86 hardware is somewhat crippled ( a strange ten byte format ) in
this regard but I was quite surprised by what happens on the PPC970MP
processors when compared to sparc. Regardless what compiler I use on
the sparc ( very very old Sun and much newer Fujitsu ) with Solaris 10
I always get nearly perfect results. The Debian PPC970MP produces close
results but again the in memory data is quite different.

In any case there are people out there messing with these things for
various reasons ( educational even, in that I do teach ) and it is quite
weird to have to tell a student that, in the year 2018, they should not
expect similar results across different machines when it comes to doing
any floating point math.

Dennis

ps: long boring stuff follows where numbers don't quite work
and libquadmath seems to be out of the question.

----- feel free to compile this on anything and show results ------

#define _XOPEN_SOURCE 600

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <locale.h>
#include <sys/utsname.h>

int main (int argc, char* argv[]){

int j;
struct utsname uname_data;
long double theta, pi, approx_pi, one_over_sqrt2, ld_error;

setlocale( LC_MESSAGES, "C" );
if ( uname( &uname_data ) < 0 ) {
fprintf ( stderr,
"WARNING : Could not attain system uname data.\n" );
perror ( "uname" );
} else {
printf (" system name = %s\n", uname_data.sysname );
printf (" node name = %s\n", uname_data.nodename );
printf (" release = %s\n", uname_data.release );
printf (" version = %s\n", uname_data.version );
printf (" machine = %s\n", uname_data.machine );
}
printf ("\n");

/* plenty of digits well past the precision of binary128 */
pi = 3.1415926535897932384626433832795028841971693993751L;

printf("sizeof(long double) = %2i\n", sizeof(long double));
printf(" pi may be %+40.38Lf\n", pi);
printf("reference val = ");
printf("+3.1415926535897932384626433832795028841971693993751\n\n");

printf("%p : ", &pi);
for ( j=0; j<sizeof(long double); j++ )
printf("%02x ", ((unsigned char *)&pi)[j] );
printf("\n\n" );

ld_error = (long double)
3.1415926535897932384626433832795028841971693993751L
- pi;
printf(" ld_error = %+40.38Lf\n\n", ld_error);

printf("sinl(pi) may be %+40.38Lf\n", sinl(pi));

approx_pi = (long double) 4.0L * atanl( (long double) 1.0L);
printf(" approx_pi = %+40.38Lf\n", approx_pi);
ld_error = (long double)
3.1415926535897932384626433832795028841971693993751L
- approx_pi;

printf(" ld_error = %+40.38Lf\n\n", ld_error);

theta = pi / ( (long double) 4.0L);
printf(" theta = %+40.38Lf\n", theta);
one_over_sqrt2 = sinl(theta);
printf(" sinl(theta) = %+40.38Lf\n", one_over_sqrt2);

ld_error = (long double)
0.7071067811865475244008443621048490392848359376884L
- one_over_sqrt2;

printf(" ld_error = %+40.38Lf\n\n", ld_error);

return EXIT_SUCCESS;

}

EOF
If you copy and paste that correctly you should have sha256 hash :

836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d

***@n0$ psrinfo -pv
The physical processor has 8 virtual processors (0-7)
SPARC64-VII+ (portid 1024 impl 0x7 ver 0xa1 clock 2860 MHz)
***@n0$ /usr/local/gcc6/bin/gcc --version
gcc (genunix Wed Jul 26 02:41:24 GMT 2017) 6.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

***@n0$ /usr/local/gcc6/bin/gcc -m64 -std=iso9899:1999 -Wfatal-errors
-pedantic-errors -o s s.c -lm
***@n0$ ./s
system name = SunOS
node name = node000
release = 5.10
version = Generic_150400-59
machine = sun4u

sizeof(long double) = 16
pi may be +3.14159265358979323846264338327950279748
reference val = +3.1415926535897932384626433832795028841971693993751

ffffffff7fffeed0 : 40 00 92 1f b5 44 42 d1 84 69 89 8c c5 17 01 b8

ld_error = +0.00000000000000000000000000000000000000

sinl(pi) may be +0.00000000000000000000000000000000008672
approx_pi = +3.14159265358979323846264338327950279748
ld_error = +0.00000000000000000000000000000000000000

theta = +0.78539816339744830961566084581987569937
sinl(theta) = +0.70710678118654752440084436210484899217
ld_error = +0.00000000000000000000000000000000000000

however ....

ppc_nix$
ppc_nix$ gcc --version
gcc (Debian 7.2.0-17) 7.2.1 20171205
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ppc_nix$ grep "^cpu" /proc/cpuinfo
cpu : PPC970MP, altivec supported
cpu : PPC970MP, altivec supported
cpu : PPC970MP, altivec supported
cpu : PPC970MP, altivec supported
ppc_nix$

ppc_nix$ openssl dgst -sha256 s.c
SHA256(s.c)=
836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d

ppc_nix$ gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999
-Wfatal-errors -pedantic-errors -mabi=ieeelongdouble -o s s.c -lm
gcc: warning: using IEEE extended precision long double
cc1: warning: using IEEE extended precision long double
/tmp/cc348kuM.o: In function `main':
s.c:(.text+0x26c): undefined reference to `_q_sub'
s.c:(.text+0x3ac): undefined reference to `_q_sub'
s.c:(.text+0x424): undefined reference to `_q_div'
s.c:(.text+0x4ec): undefined reference to `_q_sub'
collect2: error: ld returned 1 exit status
ppc_nix$

ppc_nix$ gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999
-Wfatal-errors -pedantic-errors -mabi=ibmlongdouble -o s s.c -lm
gcc: warning: using IBM extended precision long double
cc1: warning: using IBM extended precision long double
ppc_nix$ ./s
system name = Linux
node name = nix
release = 4.13.0-1-powerpc64
version = #1 SMP Debian 4.13.13-1 (2017-11-16)
machine = ppc64

sizeof(long double) = 16
pi may be +3.14159265358979323846264338327948122706
reference val = +3.1415926535897932384626433832795028841971693993751

0x7fffc9d0c230 : 40 09 21 fb 54 44 2d 18 3c a1 a6 26 33 14 5c 06

ld_error = +0.00000000000000000000000000000000000000

sinl(pi) may be +0.00000000000000000000000000000002165713
approx_pi = +3.14159265358979323846264338327948122706
ld_error = +0.00000000000000000000000000000000000000

theta = +0.78539816339744830961566084581987030677
sinl(theta) = +0.70710678118654752440084436210483464400
ld_error = +0.00000000000000000000000000000000616298

ppc_nix$

A twenty year old sparc gives better results when using gcc 7.2.0 :

mimas $ psrinfo -pv
The physical processor has 1 virtual processor (0)
UltraSPARC-IIe (portid 0 impl 0x13 ver 0x14 clock 500 MHz)

mimas $ /usr/local/gcc7/bin/gcc --version
gcc (genunix Tue Aug 29 11:48:17 GMT 2017) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

mimas $

mimas $ openssl dgst -sha256 s.c
SHA256(s.c)=
836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d

mimas $ /usr/local/gcc7/bin/gcc -m64 -std=iso9899:1999 -Wfatal-errors
-pedantic-errors -o s s.c -lm
mimas $ ./s
system name = SunOS
node name = mimas
release = 5.10
version = Generic_150400-57
machine = sun4u

sizeof(long double) = 16
pi may be +3.14159265358979323846264338327950279748
reference val = +3.1415926535897932384626433832795028841971693993751

ffffffff7ffff0a0 : 40 00 92 1f b5 44 42 d1 84 69 89 8c c5 17 01 b8

ld_error = +0.00000000000000000000000000000000000000

sinl(pi) may be +0.00000000000000000000000000000000008672
approx_pi = +3.14159265358979323846264338327950279748
ld_error = +0.00000000000000000000000000000000000000

theta = +0.78539816339744830961566084581987569937
sinl(theta) = +0.70710678118654752440084436210484899217
ld_error = +0.00000000000000000000000000000000000000

mimas $

Other than the memory address this is bit for bit exact same as the
newer Fujitsu server. I was hoping to see the exact same from the
mac PPC970MP based unit.

Gabriel Paubert

2018-02-11 01:10:01 UTC

Post by Dennis Clarke

Post by John Paul Adrian Glaubitz

mator on #debian-ports compiled gcc-7 for me with the attached patch.
With the resulting gcc, I compiled glibc and got a library I can use
sqrtf without running into an illegal instruction exception.
Would it be possible to get this applied by default? The resulting
binaries work on e6500, and ought to work on all supported CPUs
for the ppc64 port.

This is something that needs to be discussed. A single user alone shouldn't
warrant such major change in a port. You always have to keep in mind that
changing the default compiler options also has potential impact on the
performance on more modern ppc64 systems like Apple Macintosh.

Not sure how modern an Apple Mac is but here is a photo I took only a
https://i.imgur.com/6UbviKb.jpg
I have this old Mac G5 running as a fine example of a big-endian machine
and the PPC970MP processors in it seem to work very well. However it is
certainly becoming difficult to get results from it that can compare to
what I get from some other machines like Fujitsu SPARC for example. The
biggest complaint is with floating point wherein the data representation
may be actual IEEE 754-2008 style or some new IBM variant that I am not
at all familiar with. In fact, some code, trivial, won't compile at all
if I try to use "IEEE extended precision long double" with very few ways
gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors \
-pedantic-errors -mabi=ieeelongdouble ...
GNU C99 (Debian 7.2.0-17) version 7.2.1 20171205 (powerpc64-linux-gnu)
compiled by GNU C version 7.2.1 20171205, GMP version 6.1.2,
MPFR version 3.1.6, MPC version 1.0.3, isl version isl-0.18-GMP
I can take the exact same source of a trivial floating point test and
drop it on a very, very old SPARC, on a system running a very up-to-date
Red Hat Enterprise Linux 7.4 with AMD Opterons, and on this old Mac
G5 with its PPC970MP processors, and I see wildly different results on
all of them. When I say "wildly" I mean that the in-memory data
isn't even remotely the same given the same constant inputs. I know that
the x86 hardware is somewhat crippled (a strange ten-byte format) in
this regard, but I was quite surprised by what happens on the PPC970MP
processors when compared to SPARC. Regardless of which compiler I use on
the SPARC (very, very old Sun and much newer Fujitsu) with Solaris 10,
I always get nearly perfect results. The Debian PPC970MP produces close
results, but again the in-memory data is quite different.
In any case there are people out there messing with these things for
various reasons (educational even, in that I do teach) and it is quite
weird to have to tell a student in the year 2018 not to expect
similar results across different machines when it comes to doing any
floating point math.
Dennis
ps: long boring stuff follows where numbers don't quite work
and libquadmath seems to be out of the question.

This is quite well known: for a long time, IBM on Power (not on
mainframes) used a non-IEEE format for long doubles. These are actually
two IEEE doubles "concatenated", so:
- the mantissa is somewhat less precise, 2 times 53 bits (106) instead of 113
- the exponent range is way smaller; in powers of 10 the range is
roughly ±308 (same as double) instead of ±4932.

The fact that the in-memory representation is completely different is not
surprising when you take this into account.

This was somewhat faster than a full emulation of IEEE quad math, but
now IBM has switched to real IEEE quad (in hardware even on Power9, I
suspect most Sparc do it in software).

For more details, you may have a look at:
https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format
there is even a full paragraph on the double-double arithmetic.

I'm away from my Power machine right now and it is switched off, so I
can't try your code and play with compiler options.

Cheers,
Gabriel

Post by Dennis Clarke
----- feel free to compile this on anything and show results ------
#define _XOPEN_SOURCE 600
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <locale.h>
#include <sys/utsname.h>

int main (int argc, char* argv[]){
    size_t j;
    struct utsname uname_data;
    long double theta, pi, approx_pi, one_over_sqrt2, ld_error;

    setlocale( LC_MESSAGES, "C" );

    /* report which machine we are running on */
    if ( uname( &uname_data ) < 0 ) {
        fprintf ( stderr,
                  "WARNING : Could not attain system uname data.\n" );
        perror ( "uname" );
    } else {
        printf (" system name = %s\n", uname_data.sysname );
        printf (" node name = %s\n", uname_data.nodename );
        printf (" release = %s\n", uname_data.release );
        printf (" version = %s\n", uname_data.version );
        printf (" machine = %s\n", uname_data.machine );
    }
    printf ("\n");

    /* plenty of digits well past the precision of binary128 */
    pi = 3.1415926535897932384626433832795028841971693993751L;
    /* sizeof yields a size_t, so print it with %zu */
    printf("sizeof(long double) = %2zu\n", sizeof(long double));
    printf(" pi may be %+40.38Lf\n", pi);
    printf("reference val = ");
    printf("+3.1415926535897932384626433832795028841971693993751\n\n");

    /* dump the in-memory bytes of pi; %p expects a void pointer */
    printf("%p : ", (void *)&pi);
    for ( j=0; j<sizeof(long double); j++ )
        printf("%02x ", ((unsigned char *)&pi)[j] );
    printf("\n\n" );

    ld_error = (long double)
        3.1415926535897932384626433832795028841971693993751L
        - pi;
    printf(" ld_error = %+40.38Lf\n\n", ld_error);

    printf("sinl(pi) may be %+40.38Lf\n", sinl(pi));

    approx_pi = (long double) 4.0L * atanl( (long double) 1.0L);
    printf(" approx_pi = %+40.38Lf\n", approx_pi);
    ld_error = (long double)
        3.1415926535897932384626433832795028841971693993751L
        - approx_pi;
    printf(" ld_error = %+40.38Lf\n\n", ld_error);

    theta = pi / ( (long double) 4.0L);
    printf(" theta = %+40.38Lf\n", theta);
    one_over_sqrt2 = sinl(theta);
    printf(" sinl(theta) = %+40.38Lf\n", one_over_sqrt2);
    ld_error = (long double)
        0.7071067811865475244008443621048490392848359376884L
        - one_over_sqrt2;
    printf(" ld_error = %+40.38Lf\n\n", ld_error);

    return EXIT_SUCCESS;
}
EOF
836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d
The physical processor has 8 virtual processors (0-7)
SPARC64-VII+ (portid 1024 impl 0x7 ver 0xa1 clock 2860 MHz)
gcc (genunix Wed Jul 26 02:41:24 GMT 2017) 6.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-pedantic-errors -o s s.c -lm
system name = SunOS
node name = node000
release = 5.10
version = Generic_150400-59
machine = sun4u
sizeof(long double) = 16
pi may be +3.14159265358979323846264338327950279748
reference val = +3.1415926535897932384626433832795028841971693993751
ffffffff7fffeed0 : 40 00 92 1f b5 44 42 d1 84 69 89 8c c5 17 01 b8
ld_error = +0.00000000000000000000000000000000000000
sinl(pi) may be +0.00000000000000000000000000000000008672
approx_pi = +3.14159265358979323846264338327950279748
ld_error = +0.00000000000000000000000000000000000000
theta = +0.78539816339744830961566084581987569937
sinl(theta) = +0.70710678118654752440084436210484899217
ld_error = +0.00000000000000000000000000000000000000
however ....
ppc_nix$
ppc_nix$ gcc --version
gcc (Debian 7.2.0-17) 7.2.1 20171205
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ppc_nix$ grep "^cpu" /proc/cpuinfo
cpu : PPC970MP, altivec supported
cpu : PPC970MP, altivec supported
cpu : PPC970MP, altivec supported
cpu : PPC970MP, altivec supported
ppc_nix$
ppc_nix$ openssl dgst -sha256 s.c
SHA256(s.c)=
836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d
ppc_nix$ gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors
-pedantic-errors -mabi=ieeelongdouble -o s s.c -lm
gcc: warning: using IEEE extended precision long double
cc1: warning: using IEEE extended precision long double
s.c:(.text+0x26c): undefined reference to `_q_sub'
s.c:(.text+0x3ac): undefined reference to `_q_sub'
s.c:(.text+0x424): undefined reference to `_q_div'
s.c:(.text+0x4ec): undefined reference to `_q_sub'
collect2: error: ld returned 1 exit status
ppc_nix$
ppc_nix$ gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors
-pedantic-errors -mabi=ibmlongdouble -o s s.c -lm
gcc: warning: using IBM extended precision long double
cc1: warning: using IBM extended precision long double
ppc_nix$ ./s
system name = Linux
node name = nix
release = 4.13.0-1-powerpc64
version = #1 SMP Debian 4.13.13-1 (2017-11-16)
machine = ppc64
sizeof(long double) = 16
pi may be +3.14159265358979323846264338327948122706
reference val = +3.1415926535897932384626433832795028841971693993751
0x7fffc9d0c230 : 40 09 21 fb 54 44 2d 18 3c a1 a6 26 33 14 5c 06
ld_error = +0.00000000000000000000000000000000000000
sinl(pi) may be +0.00000000000000000000000000000002165713
approx_pi = +3.14159265358979323846264338327948122706
ld_error = +0.00000000000000000000000000000000000000
theta = +0.78539816339744830961566084581987030677
sinl(theta) = +0.70710678118654752440084436210483464400
ld_error = +0.00000000000000000000000000000000616298
ppc_nix$
mimas $ psrinfo -pv
The physical processor has 1 virtual processor (0)
UltraSPARC-IIe (portid 0 impl 0x13 ver 0x14 clock 500 MHz)
mimas $ /usr/local/gcc7/bin/gcc --version
gcc (genunix Tue Aug 29 11:48:17 GMT 2017) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
mimas $
mimas $ openssl dgst -sha256 s.c
SHA256(s.c)=
836282023b62d3a09b6ad59424951d873b965a594f23e6c41d596c4845f74d5d
mimas $ /usr/local/gcc7/bin/gcc -m64 -std=iso9899:1999 -Wfatal-errors
-pedantic-errors -o s s.c -lm
mimas $ ./s
system name = SunOS
node name = mimas
release = 5.10
version = Generic_150400-57
machine = sun4u
sizeof(long double) = 16
pi may be +3.14159265358979323846264338327950279748
reference val = +3.1415926535897932384626433832795028841971693993751
ffffffff7ffff0a0 : 40 00 92 1f b5 44 42 d1 84 69 89 8c c5 17 01 b8
ld_error = +0.00000000000000000000000000000000000000
sinl(pi) may be +0.00000000000000000000000000000000008672
approx_pi = +3.14159265358979323846264338327950279748
ld_error = +0.00000000000000000000000000000000000000
theta = +0.78539816339744830961566084581987569937
sinl(theta) = +0.70710678118654752440084436210484899217
ld_error = +0.00000000000000000000000000000000000000
mimas $
Other than the memory address, this is bit for bit exactly the same as
the newer Fujitsu server. I was hoping to see exactly the same from the
Mac PPC970MP based unit.

Dennis Clarke

2018-02-11 03:40:02 UTC

Post by Gabriel Paubert

Post by Dennis Clarke

Post by John Paul Adrian Glaubitz
This is something that needs to be discussed. A single user alone shouldn't
warrant such a major change in a port. You always have to keep in mind that
changing the default compiler options also has a potential impact on the
performance on more modern ppc64 systems like Apple Macintosh.

Not sure how modern an Apple Mac is but here is a photo I took only a
https://i.imgur.com/6UbviKb.jpg
I have this old Mac G5 running as a fine example of a big-endian machine
and the PPC970MP processors in it seem to work very well. However it is
certainly becoming difficult to get results from it that can compare to
what I get from some other machines like Fujitsu SPARC for example. The
biggest complaint is with floating point wherein the data representation
may be actual IEEE 754-2008 style or some new IBM variant that I am not
at all familiar with. In fact, some code, trivial, won't compile at all
if I try to use "IEEE extended precision long double" with very few ways
gcc -mcpu=970 -mno-altivec -m64 -std=iso9899:1999 -Wfatal-errors \
-pedantic-errors -mabi=ieeelongdouble ...
GNU C99 (Debian 7.2.0-17) version 7.2.1 20171205 (powerpc64-linux-gnu)
compiled by GNU C version 7.2.1 20171205, GMP version 6.1.2,
MPFR version 3.1.6, MPC version 1.0.3, isl version isl-0.18-GMP
... snip ...

This is quite well known: for a long time, IBM on Power (not on
mainframes) used a non-IEEE format for long doubles. These are actually
two IEEE doubles "concatenated", so:
- the mantissa is somewhat less precise, 2 times 53 bits (106) instead of 113
- the exponent range is way smaller; in powers of 10 the range is
roughly ±308 (same as double) instead of ±4932.

That seems to make sense looking at the in memory values. I can't make
heads or tails out of it in terms of IEEE754-2008 formats. As for the
IBM mainframes, well gee, that is a long lost love of mine as I was an
IBM systems admin for the 3090 MVS/ESA systems and they were a real joy
with Fortran IV. A million years ago.

Post by Gabriel Paubert
The fact that the in-memory representation is completely different is not
surprising when you take this into account.
This was somewhat faster than a full emulation of IEEE quad math, but
now IBM has switched to real IEEE quad (in hardware even on Power9, I
suspect most Sparc do it in software).

I can assure you that every sparc does it in software emulation. The
64-bit floating point is pure hardware and works very well.

Post by Gabriel Paubert
I'm away from my Power machine right now and it is switched off, so I
can't try your code and play with compiler options.

Thank you for getting back to me and I look forward to seeing what your
IBM Power system has to say from that code snippet.

Dennis

Mathieu Malaterre

2018-02-11 08:50:01 UTC

Post by Dennis Clarke

Thank you for getting back to me and I look forward to seeing what your
IBM Power system has to say from that code snippet.

One shorter example, from Stack Overflow recently:

const constexpr long double DEGREE_TO_RAD =
0.0174532925199432954743716805978693;
const constexpr long double RAD_TO_DEGREE = 1. / DEGREE_TO_RAD;

https://stackoverflow.com/questions/48553127/error-with-long-doubles-on-powerpc-when-compiling-with-gcc

Bas Vermeulen

2018-02-16 10:20:01 UTC

Enabling CONFIG_MATH_EMULATION and CONFIG_MATH_EMULATION_HW_UNIMPLEMENTED
configuration options in the kernel fixes this problem without changes to
gcc-7/gcc-8.
This makes unstable run on the e6500 (and probably the e5500 as well)
without a problem.
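
For reference, a sketch of what the relevant part of the kernel .config would look like with the two options named above enabled. Only the two option names come from this post; everything else about the kernel configuration is left out, and the exact menu location may differ between kernel versions.

```
# PowerPC kernel .config fragment (sketch): emulate in the kernel the
# floating point instructions the hardware does not implement
CONFIG_MATH_EMULATION=y
CONFIG_MATH_EMULATION_HW_UNIMPLEMENTED=y
```

On a Debian system, whether a running kernel was built this way can usually be checked with `grep MATH_EMULATION /boot/config-$(uname -r)`, assuming the config file for the running kernel is installed under /boot.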

Sorry about the scare, I should have looked for emulation options earlier.

Bas Vermeulen

Lennart Sorensen

2018-02-06 18:30:02 UTC

Post by John Paul Adrian Glaubitz
It would be nice if anyone who cares about the ppc64 could actually
post what kind of hardware they have. We should find a consensus on
where to put the baseline.

Well at least the CPUs listed on https://wiki.debian.org/PPC64 would
be nice. Either that or make the page match what is really supported.

Post by John Paul Adrian Glaubitz
Rebuilding the whole archive isn't so much of a problem. We have already
done that :).

Yeah at least there are some pretty fast PPC64 build boxes.

And it might be possible to scan the binaries for unsupported instructions
and just rebuild those packages, but it might not be worth the bother.

--
Len Sorensen

Bas Vermeulen

2018-02-06 17:20:01 UTC

Hi Christian,

I am running into a similar problem with my T2080 (using an e6500). I'm
currently rebuilding gcc to target powerpc64 by default, so that the
unsupported instructions are no longer generated by default. Once that
is done, I'll have to rebuild glibc with the new gcc compiler to update
libm.

Bas Vermeulen

Post by Christian Zigotzky
Hi All,
I use Debian Buster/Sid PPC64 on my Cyrus board with a P5020 SoC [1]. The
E5500 fpu doesn't have a fpsqrt instruction. Does Debian's libm have
support for emulation of the fpsqrt instruction?
Thanks,
Christian
[1]
https://plus.google.com/u/0/photos/photo/115515624056477014971/6518950042669509138
https://plus.google.com/u/0/photos/photo/115515624056477014971/6518949590349211394

Christian Zigotzky

2018-02-06 17:40:01 UTC

Hi Bas,

Thank you. Could you please upload the libm when you are finished with
compiling? I would like to test it on my Cyrus board.

Thanks,
Christian

Bas Vermeulen

2018-02-09 12:30:02 UTC

Sorry about the delay. You can find the glibc-2.26 compiled with powerpc64
at http://blackstar.xs4all.nl/glibc-2.26-ppc64-powerpc64.tgz

Bas Vermeulen

Christian Zigotzky

2018-02-09 14:00:01 UTC

Many thanks!

-- Christian

Christian Zigotzky

2018-02-06 17:20:01 UTC

Hi All,

I use Debian Buster/Sid PPC64 on my Cyrus board with a P5020 SoC [1].
The E5500 fpu doesn't have a fpsqrt instruction. Does Debian's libm have
support for emulation of the fpsqrt instruction?

Thanks,
Christian

[1]

https://plus.google.com/u/0/photos/photo/115515624056477014971/6518950042669509138
https://plus.google.com/u/0/photos/photo/115515624056477014971/6518949590349211394
