inline(1)
NAME
inline - in-line procedure call expander
DESCRIPTION
Assembly language call instructions are replaced by a copy of their
corresponding function body obtained from the inline template (*.il)
file.
Inline files have the suffix .il, for example:
    % CC foo.il hello.c
Inlining is done by the compiler's code generator.
USAGE
Each inline file contains one or more labeled assembly language
templates of the form:
    inline-directive
    instructions
    ...
    .end
where the instructions constitute an in-line expansion of the named
routine. An inline-directive is a command of the form:
    .inline identifier, argsize
This declares a block of code for the routine named by identifier, with
argsize as the total size of the routine's arguments, in bytes. Calls
to the named routine are replaced by the code in the in-line template.
NOTE:
The value of argsize is ignored but the argument should be included for
compatibility with compiler versions predating the Sun WorkShop[tm] 5.0
compilers.
Multiple templates are permitted in an inline file; if more than one
template matches the same routine, templates after the first are
ignored.
Coding Conventions for all Sun Systems
Inline templates should be coded as expansions of C-compatible proce‐
dure calls, with the difference that the return address cannot be
depended upon to be in the expected place, since no call instruction
will have been executed.
Inline templates must conform to standard Sun parameter passing and
register usage conventions, as detailed below. They must not call
routines that violate these conventions; for example, assembly language
routines such as setjmp(3C) may cause problems.
Registers other than the ones mentioned below must not be used or set.
Branch instructions in an in-line template may only transfer to numeric
labels (1f, 2b, and so on) defined within the in-line template. No
other control transfers are allowed.
Templates do not need ret or retl instructions, and should not include
them.
Only opcodes and addressing modes generated by Sun compilers are guar‐
anteed to work. Binary encodings of instructions are not supported.
Coding Conventions for SPARC Systems
The first six arguments are passed in registers %o0-%o5. Arguments
beyond the sixth are passed using stack locations in accordance with
the target ABI. %sp is guaranteed to be 64-bit aligned. The contents
of %o7 are undefined, since no call instruction will have been exe‐
cuted.
Results are returned in %o0 or %f0/%f1.
Registers %o0-%o5 and %f0-%f31 may be used as temporaries.
Integral and single-precision floating-point arguments are 32-bit
aligned.
Double-precision floating-point arguments are guaranteed to be 64-bit
aligned if their offsets are multiples of 8.
Each control-transfer instruction (branches and calls) must be immedi‐
ately followed by a nop.
Call instructions must include an extra (final) argument which indi‐
cates the number of registers used to pass parameters to the called
routine.
Note that for SPARC systems, the instruction following an expanded
'call' is deleted.
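A minimal sketch of a SPARC template under these conventions follows.
The routine name add_int is hypothetical; the caller is assumed to
declare it as int add_int(int, int), and the "!" comment character is
an assumption about the platform assembler. The two arguments arrive in
%o0 and %o1, and the result is left in %o0.

```
.inline add_int,8
    add %o0, %o1, %o0       ! %o0 = first argument + second argument
.end
```

No retl is needed, and %o7 is not read because it holds no valid
return address.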
Coding Conventions for 32-bit x86 Systems
Arguments are passed on the stack. Since no call instruction was
issued, the first argument is at (%esp), the second argument is at
4(%esp), and so on. Integer results of 32 bits or less are returned in
%eax; 64-bit integer results are returned in %edx:%eax. Floating point
results are returned in %st(0).
The code may use registers %eax, %ecx and %edx. The values in any other
registers must be preserved. The floating point stack will be empty at
the start of the inline expansion template, and must be empty (except
for a returned floating point value) at the end.
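The following sketch shows two 32-bit x86 templates written to these
rules. The routine names add_int and my_sqrt are hypothetical; the
caller is assumed to declare them as int add_int(int, int) and
double my_sqrt(double). The "/" comment character is an assumption
about the platform assembler.

```
.inline add_int,8
    movl (%esp), %eax       / first argument
    addl 4(%esp), %eax      / add second argument; result in %eax
.end

.inline my_sqrt,8
    fldl (%esp)             / push the double argument onto the FP stack
    fsqrt                   / result stays in %st(0)
.end
```

The first template touches only %eax. The second leaves exactly one
value, the return value, on the floating point stack.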
SPECIAL x86 NOTE
Programs compiled with -xarch={sse|sse2} to run on Solaris x86 SSE/SSE2
Pentium 4-compatible platforms must be run only on platforms that are
SSE/SSE2 enabled. Running such programs on platforms that are not
SSE/SSE2-enabled could result in segmentation faults or incorrect
results occurring without any explicit warning messages. Starting with
the Solaris 10 release, the OS and compilers will prevent execution of
SSE/SSE2-compiled binaries on platforms not SSE/SSE2-enabled.
OS releases starting with Solaris 9 update 6 are SSE/SSE2-enabled on
Pentium 4-compatible platforms. Earlier versions of Solaris OS are not
SSE/SSE2-enabled.
This warning also extends to programs that employ .il inline assembly
language functions or __asm() assembler code that utilize SSE/SSE2
instructions.
If you compile and link in separate steps, always link using the com‐
piler and with -xarch={sse|sse2} to ensure that the correct startup
routine is linked.
Coding Conventions for x64 Platforms
Arguments are passed according to their classification. Each argument
is classified as an integer, sse, or memory argument.
Arguments of types (signed and unsigned) _Bool, char, short, int, long,
long long and pointers are integer arguments. Arguments of aggregate
types (struct, union, array) of size less than or equal to 16 bytes and
that contain aligned members of types _Bool, char, short, int, long,
long long and pointers are also integer.
Arguments of types float and double are sse arguments. Arguments of
aggregate types of size less than or equal to 16 bytes and that contain
aligned members of types float and double are also sse.
Arguments of type long double, and of aggregate types that are larger
than 16 bytes or that have unaligned members, are memory arguments.
Integer arguments are passed in integer registers in the following
order: %rdi, %rsi, %rdx, %rcx, %r8 and %r9. One integer argument of
aggregate type can occupy up to 2 integer registers. If the number of
integer arguments is greater than 6, the 7th and subsequent integer
arguments are treated as memory arguments.
Sse arguments are passed in sse registers in order from %xmm0 to %xmm7.
One sse argument of aggregate type can occupy up to 2 sse registers;
each sse register holds up to 8 bytes of the argument. For example, an
argument of type double complex is passed in 2 consecutive sse
registers, and an argument of type float complex is passed in 1 sse
register. If the number of sse arguments is greater than 8, the 9th and
subsequent sse arguments are treated as memory arguments.
Integer and sse arguments are numbered independently.
Memory arguments are passed on the stack in right-to-left order of
their appearance in the function's argument list. Each argument on the
stack is aligned according to its size: on 8 bytes if its size is less
than or equal to 8 bytes, on 16 bytes otherwise. At the start of the
inline expansion template the stack is aligned on 16 bytes.
Since no call instruction was issued, the first memory argument is at
(%rsp), the second argument is at 8(%rsp) or at 16(%rsp) depending on
the first memory argument size and the second memory argument align‐
ment, etc.
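As a sketch of how a memory argument is reached, consider a
hypothetical routine seventh that the caller declares as taking seven
long arguments and returning the last one. The first six are integer
arguments passed in registers, so the seventh is the first memory
argument and is found at (%rsp). The "/" comment character is an
assumption about the platform assembler.

```
.inline seventh,0
    movq (%rsp), %rax       / first memory argument (the 7th long)
.end
```

An eighth long argument would be the second memory argument and, being
8 bytes in size with 8-byte alignment, would be at 8(%rsp).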
Return values are classified in the same way as arguments.
Integer results of 8 bytes or less are returned in %rax; integer
results of 9 to 16 bytes are returned in %rdx:%rax.
Sse results are also returned according to their size, in %xmm0 or in
%xmm1:%xmm0.
Results of type long double are returned in %st(0).
If the return value is of type long double complex, the real part of
the value is returned in %st(0) and the imaginary part in %st(1).
For memory results the caller provides space for the return value and
passes the address of this storage in %rdi as if it were the first
argument to the function. In effect, this address becomes a hidden
first argument. On return %rax will contain the address that has been
passed in by the caller in %rdi.
The code must not change register %rbp. The floating point stack will
be empty at the start of the inline expansion template, and must be
empty (except for a returned floating point value) at the end.
In addition to %rbp, the values in registers %rbx and %r12-%r15 must be
preserved across the inlined code.
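The following sketch shows two x64 templates that use only register
arguments. The routine names add_long and add_double are hypothetical;
the caller is assumed to declare them as long add_long(long, long) and
double add_double(double, double). The "/" comment character is an
assumption about the platform assembler, and argsize is given as 0
because its value is ignored.

```
.inline add_long,0
    movq %rdi, %rax         / first integer argument
    addq %rsi, %rax         / add second; integer result in %rax
.end

.inline add_double,0
    addsd %xmm1, %xmm0      / %xmm0 = first + second sse argument
.end
```

Neither template touches %rbp, %rbx, or %r12-%r15, and neither uses the
floating point stack, so both leave it empty.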
EXAMPLES
Please review libm.il or vis.il for examples. You can find a version of
these libraries that is specific to each supported architecture under
the compiler's lib/ directory.
WARNING
inline does not check for violations of the coding conventions
described above.
SEE ALSO
"Techniques for Optimizing Applications: High Performance Computing" by
Rajat P. Garg and Ilya Sharapov uses Fortran to provide a useful expla‐
nation of inline templates. See Chapter 8.
"The SPARC Architecture Manual Version 9" provided by SPARC Interna‐
tional Inc. at http://www.sparc.com/resource.htm. See appendix G.
March 2007 inline(1)