by Peter McGoron
This SRFI organizes flonum operations into libraries according to the flonum representation they operate on. Each library can also report properties of its flonum operations, such as the rounding mode and deviations from IEEE 754 arithmetic. Additional procedures are defined that rely on the flonum representation being known from the library name.
#e#fl(binary64 1/2)?
(:+ x y z w …)?
This section is non-normative.
Standard Scheme doesn’t give specifics about the precision and range
of inexact numbers.
From the R4RS onward,
implementations could use s, f, d,
and l to denote inexact constants of different precisions.
The R6RS and
SRFI 144 included
“flonum” operations. However, these specifications do not specify what
the format of the flonum is. The flonum might not be an IEEE format
number and operations may differ from implementation to implementation.
This SRFI proposes a variant of SRFI 144 that is organized into
representation-specific libraries. The procedures exported from
a specific library operate on a precisely defined number format. For
example, to operate on binary64 floating point numbers,
one can import (srfi ### binary64).
One reason to use type-specific procedures is speed: the function
sqrt from (srfi ### binary32) can be compiled
to a single FSQRT instruction on a RISC-V processor. One could also
compile multiple square roots to a single vectorized SQRTPS
instruction on an x86_64 processor with SSE2.
Another reason to use type-specific procedures is portability. Given the same rounding mode, format, and IEEE 754 conformance flag below, operations like +, -, and √ will always return the same value given the same inputs.
Most programmers do not have the speed and portability of floating point operations as their top priorities. They care more that their floating point calculations work well than about blazing speed or bit-for-bit reproducibility across architectures. Basically, floating-point should do “what they want.” A type-flexible system is more likely to do what the non-numerically-inclined programmer wants: see Kahan 1997 p. 29 and Kahan and Darcy 1998 pp. 60ff.
Scheme’s module system, lack of special arithmetic syntax, and latent
typing allow us to separate strict correctness from “do what I want.”
Programmers who wish
for their programs to do “what they want” should use Scheme’s generic
arithmetic. An implementation is free to do things like widen operands
or optimize expressions (for example, using the SSE instruction
RSQRTPS for (/ 1 (sqrt x))) without worrying
about strict reproducibility or the absolute fastest speed.
All references to IEEE 754 refer to its 2019 revision.
A representation is a type of inexact number that has fixed properties, like exponent range and mantissa width. Examples include binary32, binary64, and posit32 (Gustafson 2022).
An operation is correctly rounded if the returned value is the same as if the operation were calculated to infinite precision and then rounded to fit in the resulting representation according to the current rounding mode.
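As a rough illustration of this definition, the following sketch uses only R7RS generic arithmetic, not the libraries of this SRFI. It assumes an implementation whose inexact reals are binary64 in round-to-nearest mode and which supports exact rationals. The helper name exactly-then-round is made up for the example.

```scheme
(import (scheme base))

;; Hypothetical helper: add x and y as exact rationals (standing in for
;; "infinite precision"), then round once back to an inexact number.
(define (exactly-then-round x y)
  (inexact (+ (exact x) (exact y))))

;; If + is correctly rounded, the direct sum and the reference agree.
(define sum-direct (+ 0.1 0.2))
(define sum-reference (exactly-then-round 0.1 0.2))
```

Under the stated assumptions the two values are equal, even though neither equals the literal 0.3: the rounding happens once, on the exact sum of the two inputs as they are actually represented.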
Operations are non-stop if all functions return a flonum in the same format as the input, even if the result is a subnormal, infinity, or NaN.
Square brackets [] denote a group of optional arguments: either all arguments in the group are present or all are absent. If one pair of square brackets is nested in another pair, then the nested pair is optional even when the other arguments are supplied.
In procedure arguments, it is an error if endianness is not the
symbol little, the symbol big, or an endianness
supplied by the endianness macro in (rnrs bytevectors). When
endianness is not supplied, the native endianness is used.
Requirements on implementations using RFC 2119 terminology are marked up in strong text.
The following library names, if available, must implement the library described in the section “IEEE binary floating point library”:
- (srfi ### binary16)
- (srfi ### binary32)
- (srfi ### binary64)
- (srfi ### binary128)
- (srfi ### binary256)
The implementation must export (srfi ###),
which implements the same library. It should be the same
as one of the above libraries.
The following library names are reserved (where ⟨n⟩ is a base-10 numeral). They are reserved because some of the procedures in the flonum library may not be appropriate for numbers in these formats. A future SRFI or Report will define operations on these representations.
- (srfi ### decimal⟨n⟩)
- (srfi ### complex-⟨format⟩⟨n⟩) where ⟨format⟩ is either binary or decimal
- (srfi ### binary⟨n⟩) for ⟨n⟩ not previously defined
An implementation may provide libraries with different
names than the ones above. Such a library should
implement all of the procedures described below. For example,
an implementation could provide (srfi ### posit32) for
operations on posits, with a similar API to the one below. However,
posits do not have infinite values, so infinite? would
not be exported.
This library exports the identifiers of SRFI 144, with the following modifications:
- Identifiers are no longer prefixed with fl. Instead, they are prefixed with :. (Exceptions: procedures that start with flonum now start with :flonum. The procedure make-fllog-base becomes :make-log-base.)
- The fl-fast-fl+* identifier is not exported.
- flnormalized? is now :normal?, and fldenormalized? is now :subnormal?.
- :flonum must return a NaN value if the argument is a non-real number.
- :+, :-, :*, :/, :+* (aka fused multiply-add), :sqrt, :abs, :absdiff, :posdiff, :floor, :ceiling, :round, :truncate and :remainder must return correctly rounded values.
Rationale: Multiple floating-point
libraries may be pulled into the same library. To disambiguate them,
one would prefix them differently. This would mean that procedures
would look like f32:fl+, which is redundant. Hence
this SRFI uses the shorter : prefix. Then the above
can be imported as f32:+ using f32 as a prefix.
The operations that must return correctly rounded values are the ones that IEEE 754 mandates to be correctly rounded.
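The way a :-prefixed export combines with an import prefix can be tried out today with a stand-in. The sketch below does not use this SRFI: it renames the generic sqrt from (scheme inexact) to :sqrt purely to imitate the naming scheme, then applies the prefix f32. The resulting f32:sqrt is still generic arithmetic, not a binary32 operation.

```scheme
(import (scheme base)
        ;; Stand-in only: imitate an export named :sqrt, then prefix it.
        (prefix (rename (only (scheme inexact) sqrt) (sqrt :sqrt)) f32))

;; The prefix f32 and the exported name :sqrt concatenate to f32:sqrt.
(define two (f32:sqrt 4.0))
```

With an actual implementation of this SRFI, the import set would presumably be just (prefix (srfi ### binary32) f32).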
The following identifiers are exported:
:e
:1/e
:e-2
:e-pi/4
:log2-e
:log10-e
:log-2
:1/log-2
:log-3
:log-pi
:log-10
:1/log-10
:pi
:1/pi
:2pi
:pi/2
:pi/4
:2/sqrt-pi
:pi-squared
:degree
:2/pi
:sqrt-2
:sqrt-3
:sqrt-5
:sqrt-10
:1/sqrt-2
:cbrt-2
:cbrt-3
:4thrt-2
:phi
:log-phi
:1/log-phi
:euler
:e-euler
:sin-1
:cos-1
:gamma-1/2
:gamma-1/3
:gamma-2/3
:greatest
:least
:epsilon
:integer-exponent-zero
:integer-exponent-nan
:flonum
:adjacent
:copysign
:make-flonum
:integer-fraction
:exponent
:integer-exponent
:normalized-fraction-exponent
:sign-bit
:flonum?
:=?
:<?
:>?
:<=?
:>=?
:unordered?
:max
:min
:integer?
:zero?
:positive?
:negative?
:odd?
:even?
:finite?
:infinite?
:nan?
:normal?
:subnormal?
:+
:*
:+*
:-
:/
:abs
:absdiff
:posdiff
:sgn
:numerator
:denominator
:floor
:ceiling
:round
:truncate
:exp
:exp2
:exp-1
:square
:sqrt
:cbrt
:hypot
:expt
:log
:log1+
:log2
:log10
:make-log-base
:sin
:cos
:tan
:asin
:acos
:atan
:sinh
:cosh
:tanh
:asinh
:acosh
:atanh
:quotient
:remainder
:remquo
:gamma
:loggamma
:first-bessel
:second-bessel
:erf
:erfc
:rounding-mode
:features
:read-random-flonum
:round/ties-to-away
:byte-width
:bytevector-flonum-ref
:bytevector-flonum-set!
:string->flonum
:flonum->string
The examples assume the optional reader syntax suggested below to denote values of different representations.
(:rounding-mode)
Returns the current rounding mode for this flonum type. This SRFI defines the following symbols which can be returned from this procedure. The SRFI defers to the IEEE 754 standard for the complete definition of these rounding modes. An implementation may add other rounding modes, which should be symbols. For example, an implementation with support for GNU MPFR may add MPFR's additional rounding modes.
- round-to-nearest/ties-to-even
- round-to-nearest/ties-to-away
- round-towards-positive
- round-towards-negative
- round-towards-zero
Note: This SRFI provides no portable way
to change the rounding mode because it is a major implementation burden
with little benefit. In a vacuum, the rounding mode is best
represented as a dynamic variable that can be parameterized.
However, the rounding mode is generally a global variable, and can
sometimes be attached to individual instructions (RISC-V is an example).
Modifying the rounding mode is a pretty rare operation: an analysis of
RISC-V code saw no use of any mode besides roundTiesToEven outside of
conversion procedures [Zurstraßen 2023].
In a similar vein, this SRFI provides no way of inspecting and raising IEEE 754 exceptions. An example of an implementation that has both IEEE 754 exception handling and rounding mode control is MIT Scheme.
The rounding mode is independent of the behavior of the round
function.
(:features)
Returns a list containing information about the floating point operations in this library. The following symbols have defined meanings. An implementation may add other features, which should be symbols.
- subnormals-are-zero
- flush-to-zero
- ieee-754-2019: this feature does not appear when subnormals-are-zero or flush-to-zero appear.
- non-stop
- fast-fma: (:+* x y z) is at least as fast as (:+ (:* x y) z). (Fused multiply-add must be rounded correctly when IEEE 754 compliance mode is on, regardless of whether fast-fma is available.)
- ⟨name⟩-correctly-rounded, where ⟨name⟩ is a procedure from the library without the : prefix. (If ieee-754-2019 appears, then features corresponding to functions that IEEE 754 requires to be correctly rounded must not appear.)

Note: DAZ/FTZ modes are usually enabled by the compiler, or are baked-in features of the architecture. As such, this SRFI does not provide a portable way to manipulate this mode.
This should not be confused with the
features
procedure in the R7RS.
This is a run-time procedure that reports on the run-time environment,
and the flags may change over the runtime of the program.
These flags are not accessible through cond-expand.
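One possible way to consume such a feature list is sketched below. It is not part of this SRFI: the helper correctly-rounded? and the list ieee-mandated are hypothetical names, and the feature lists passed in the usage lines are made-up stand-ins for what (:features) might return.

```scheme
(import (scheme base))

;; Procedures this SRFI requires to be correctly rounded under IEEE 754,
;; written without the : prefix.
(define ieee-mandated
  '(+ - * / +* sqrt abs absdiff posdiff floor ceiling round truncate remainder))

;; Hypothetical helper: does the feature list promise that the procedure
;; NAME is correctly rounded?
(define (correctly-rounded? features name)
  (or (and (memq 'ieee-754-2019 features)
           (memq name ieee-mandated)
           #t)
      (and (memq (string->symbol
                  (string-append (symbol->string name) "-correctly-rounded"))
                 features)
           #t)))

;; Made-up feature lists for illustration.
(define strict-sqrt (correctly-rounded? '(ieee-754-2019 non-stop) 'sqrt))
(define ftz-sin (correctly-rounded? '(flush-to-zero sin-correctly-rounded) 'sin))
(define ftz-sqrt (correctly-rounded? '(flush-to-zero) 'sqrt))
```

Because the flags may change while the program runs, a check like this would have to be repeated wherever the guarantee matters rather than done once at load time.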
(:read-random-flonum
binary-input-port
[start [end]])
Returns a random flonum between start (default 0) exclusive and end (default 1) exclusive calculated from the bytes from binary-input-port. If the bytes from the port are uniformly distributed, then the resulting flonum is drawn from a uniform distribution of flonums between the two supplied numbers, to the best extent possible.
Rationale: Floating point random number generators may take a variable number of bytes to return an answer: see for example Campbell 2014. Because of this, this procedure cannot take a bytevector. This procedure could take an SRFI 158 generator, but those have issues as described in SRFI 271.
This procedure requires that any flonum between the two ends
may be returned with roughly equal probability. This precludes some
methods such as filling in the lower 52 bits of a binary64 number,
because that does not sample all possible exponents. If one wants
this faster (and less accurate) sampling method, one can directly
manipulate the structure of the flonum using a bytevector and bytevector-flonum-ref.
It is not possible to pick a random flonum between two arbitrary finite flonums uniformly. It is possible in special cases, including the important case of (0,1): see Goualard 2022 for discussion and an algorithm that attempts to sample from arbitrary intervals as uniformly as possible.
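To make the precluded shortcut concrete, here is a minimal sketch of the “fill in 52 bits” method in plain R7RS, assuming binary64 inexact reals. The name naive-random-unit is hypothetical, and the fixed bytevectors stand in for a real source of random bytes.

```scheme
(import (scheme base))

;; Hypothetical helper: read 7 bytes, keep 52 bits, scale by 2^-52.
;; Unlike :read-random-flonum, this can return 0.0 and can only ever
;; return multiples of 2^-52.
(define (naive-random-unit port)
  (let loop ((i 0) (acc 0))
    (if (= i 7)
        ;; Drop 4 of the 56 bits, then divide the 52-bit integer by 2^52.
        (/ (inexact (quotient acc 16)) 4503599627370496.0)
        (let ((byte (read-u8 port)))
          (if (eof-object? byte)
              (error "not enough random bytes")
              (loop (+ i 1) (+ (* acc 256) byte)))))))

(define smallest
  (naive-random-unit (open-input-bytevector (bytevector 0 0 0 0 0 0 0))))
(define next-smallest
  (naive-random-unit (open-input-bytevector (bytevector 0 0 0 0 0 0 16))))
(define largest
  (naive-random-unit
   (open-input-bytevector (bytevector 255 255 255 255 255 255 255))))
```

The two smallest outputs are 0.0 and 2^-52, so none of the many flonums strictly between them can ever be produced. That gap is what the requirement above rules out.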
(:round/ties-to-away
fl)
Round fl to an integer flonum, with ties broken as in
roundTiesToAway (the behavior of C99 round).
(:round/ties-to-away 2.5) ⇒ 3.0
(:round 2.5) ⇒ 2.0
(:round/ties-to-away 3.5) ⇒ 4.0
(:round 3.5) ⇒ 4.0
Note: The flround procedure
in the R6RS and
SRFI 144 implements
Scheme’s round, which breaks ties to even. This is the
behavior of roundeven in C23 (introduced by ISO/IEC TS 18661-1 as an extension to C11).
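For comparison, ties-to-away rounding can be approximated with R7RS generic arithmetic. The sketch below is not the procedure specified here: it uses a hypothetical unprefixed name and operates on whatever inexact representation the implementation provides (assumed to be binary64 in the usage lines).

```scheme
(import (scheme base) (scheme inexact))

;; Hypothetical generic-arithmetic sketch of roundTiesToAway.
(define (round/ties-to-away x)
  (if (or (nan? x) (infinite? x))
      x
      (let ((t (truncate x)))
        ;; (- x t) is the fractional part; it is computed without
        ;; rounding error for binary floating point.
        (if (>= (abs (- x t)) 0.5)
            (+ t (if (negative? x) -1.0 1.0))
            t))))

(define away-2.5 (round/ties-to-away 2.5))
(define even-2.5 (round 2.5))
(define away-neg (round/ties-to-away -2.5))
(define just-below-half (round/ties-to-away 0.49999999999999994))
```

Truncating first and then inspecting the fractional part avoids the double rounding of the common (floor (+ x 0.5)) idiom, which would round the last input up to 1.0.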
:byte-width
Size of the flonum in bytes.
(:bytevector-flonum-ref
bv
k
[endianness])
It is an error if k to k +
:byte-width
are not valid indices of bv.
If endianness is not supplied, it is an error if
k is not a multiple of :byte-width.
Read the bytes in bv starting at k as a flonum of this type, using the given endianness.
If the value is a NaN, then the NaN may be coerced into another NaN.
(import (scheme base) (prefix (srfi ### binary64) f64))
(define bv (make-bytevector f64:byte-width))
(bytevector-u8-set! bv 0 #b01000000)
(bytevector-u8-set! bv 1 #b00001001)
(bytevector-u8-set! bv 2 #b00100001)
(bytevector-u8-set! bv 3 #b11111011)
(bytevector-u8-set! bv 4 #b01010100)
(bytevector-u8-set! bv 5 #b01000100)
(bytevector-u8-set! bv 6 #b00101101)
(bytevector-u8-set! bv 7 #b00011000)
(f64:bytevector-flonum-ref bv 0 'big) ⇒ 3.141592653589793116
Rationale: Some implementations, in particular those that use NaN boxing, may only be able to represent a limited set of NaNs. There were few requirements on quiet versus signalling NaN formats until 2019. Different systems may have different canonical NaNs. For this reason portable code should not expect that different NaNs are distinguishable.
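The decoding described above can be sketched for big-endian binary64 in portable R7RS. This is an illustration, not the specified procedure: the name decode-binary64/big is hypothetical, and the code assumes that inexact reals are binary64 and that the implementation has exact rationals and integers wider than 64 bits. Every NaN bit pattern decodes to the same +nan.0, which is one form of the NaN coercion discussed in the rationale.

```scheme
(import (scheme base))

;; Hypothetical helper: decode 8 big-endian bytes of BV starting at K.
(define (decode-binary64/big bv k)
  (let* ((bits (let loop ((i 0) (acc 0))
                 (if (= i 8)
                     acc
                     (loop (+ i 1)
                           (+ (* acc 256) (bytevector-u8-ref bv (+ k i)))))))
         (sign (if (>= bits (expt 2 63)) -1.0 1.0))
         (biased (quotient (modulo bits (expt 2 63)) (expt 2 52)))
         (frac (modulo bits (expt 2 52))))
    (cond ((= biased 2047)                    ; infinity or NaN
           (if (zero? frac) (* sign +inf.0) +nan.0))
          ((= biased 0)                       ; zero or subnormal
           (* sign (inexact (* frac (expt 2 -1074)))))
          (else                               ; normal: implicit leading 1
           (* sign (inexact (* (+ frac (expt 2 52))
                               (expt 2 (- biased 1075)))))))))

(define pi-bytes (bytevector #x40 #x09 #x21 #xfb #x54 #x44 #x2d #x18))
(define decoded-pi (decode-binary64/big pi-bytes 0))
(define decoded-inf
  (decode-binary64/big (bytevector #x7f #xf0 0 0 0 0 0 0) 0))
```

The bytes in pi-bytes are the same ones written one at a time in the example above.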
(:bytevector-flonum-set!
bv
k
fl
[endianness])
It is an error if k to k +
:byte-width
are not valid indices of bv.
If endianness is not supplied, it is an error if
k is not a multiple of :byte-width.
Write fl to bv at k with endianness.
With the exception of NaNs, this procedure and
:bytevector-flonum-ref
must round-trip. That is, given a non-NaN flonum fl,
(let ((bv (make-bytevector :byte-width)))
(:bytevector-flonum-set! bv 0 fl)
(eqv? fl (:bytevector-flonum-ref bv 0)))
always evaluates to #t.
(import (scheme base) (prefix (srfi ### binary32) f32))
(define bv (make-bytevector f32:byte-width))
(f32:bytevector-flonum-set! bv 0 1.41421353816986083984f0 'little)
bv ⇒ #u8(#xf3 #x04 #xb5 #x3f)
(:string->flonum
string
[radix])
It is an error if radix is not 2, 8, 10, or 16. The value of radix defaults to 10.
Read string as a number in that representation.
(import (scheme base) (prefix (srfi ### binary64) f64) (prefix (srfi ### binary128) f128))
(f128:string->flonum "1e400") ⇒ 1l400
(f64:string->flonum "1e400") ⇒ #fl(binary64 +inf.0)
(:flonum->string
fl
[radix])
It is an error if radix is not 2, 8, 10, or 16. The value of radix defaults to 10.
Return a string that represents fl in radix.
This procedure must round-trip fl with
:string->flonum in the way that the
R7RS specifies for
number->string.
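The R7RS round-trip property referred to here can be checked with generic arithmetic. The sketch below uses only number->string and string->number from (scheme base) in the default radix 10. The helper round-trips? and the sample values are chosen for illustration.

```scheme
(import (scheme base))

;; Hypothetical helper: does X survive printing and re-reading?
(define (round-trips? x)
  (eqv? x (string->number (number->string x))))

(define samples (list 0.1 1e-7 -2.5 3.141592653589793 1e300))

;; #t when every sample round-trips.
(define all-round-trip?
  (let loop ((rest samples))
    (or (null? rest)
        (and (round-trips? (car rest))
             (loop (cdr rest))))))
```

A representation-specific pair of procedures would be expected to pass the same kind of check with :flonum->string and :string->flonum in place of the generic procedures.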
When an implementation advertises that it implements a procedure
(e.g., sqrt) with a single rounding, it must
not reorder or optimize the program if doing so would return
a different result. For example, (/ 1 (sqrt x))
may return a different result when implemented literally as two
operations than when implemented as one inverse square root operation. Implementations
should offer modes that do not optimize mathematical
operations at the expense of reproducibility.
Given the same rounding mode and input values,
and with ieee-754-2019 and non-stop among the features,
any set of correctly rounded operations will produce the same
answers on one conforming implementation as on any other with
the same rounding mode, input values, and features implying correct
rounding.
This section is non-normative.
(import (scheme base) (prefix (srfi ### binary32) f32))
(unless (member 'ieee-754-2019 (f32:features))
  (error "requires IEEE 754 arithmetic"))
(define (f32:kahan-sum lst)
  (do ((sum (f32:flonum 0.0))
       (c (f32:flonum 0.0))
       (lst lst (cdr lst)))
      ((null? lst) sum)
    (let* ((y (f32:- (car lst) c))
           (t (f32:+ sum y)))
      (set! c (f32:- (f32:- t sum) y))
      (set! sum t))))
This code will always calculate the correct results with the desired algorithmic properties on any conforming implementation. In particular, a conforming implementation will not reorder operations in a way that makes the output values differ.
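The effect of the compensation term can be observed with generic arithmetic, with the caveat that nothing in the R7RS guarantees it: an implementation that reorders or widens generic operations could give other answers, which is the situation this SRFI addresses. The sketch assumes binary64 inexact reals in round-to-nearest mode. The names kahan-sum and naive-sum are illustrative.

```scheme
(import (scheme base))

;; Compensated (Kahan) summation with generic arithmetic.
(define (kahan-sum lst)
  (let loop ((lst lst) (sum 0.0) (c 0.0))
    (if (null? lst)
        sum
        (let* ((y (- (car lst) c))   ; next input, corrected by the error term
               (t (+ sum y)))        ; low-order bits of y may be lost here
          ;; (- (- t sum) y) recovers what was lost, with opposite sign.
          (loop (cdr lst) t (- (- t sum) y))))))

;; Plain left-to-right summation for comparison.
(define (naive-sum lst)
  (let loop ((lst lst) (sum 0.0))
    (if (null? lst)
        sum
        (loop (cdr lst) (+ sum (car lst))))))

(define inputs (list 1.0 1e-16 1e-16))
(define naive-result (naive-sum inputs))
(define kahan-result (kahan-sum inputs))
```

Under the stated assumptions, each 1e-16 is individually too small to change 1.0, so naive-result stays at 1.0, while kahan-result carries the two small terms together and lands on the next flonum above 1.0.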
SRFI 4 specifies f32vectors and f64vectors, and SRFI 160 specifies c64vectors and c128vectors. Implementors should make the elements of each vector the corresponding representation in the table:
| Vector | Representation |
|---|---|
| f32vector | binary32 |
| f64vector | binary64 |
| c64vector | each part is binary32 |
| c128vector | each part is binary64 |
The R7RS-Large is likely to promote that “should” to “must.”
On implementations with native binary floating point of multiple precisions, the exponent specifiers in the table should map to the corresponding representation:
| Exponent | Representation |
|---|---|
| s | binary16 |
| f | binary32 |
| d | binary64 |
| l | binary128 |
On implementations with wildly varying representations, such as decimal floats or posit numbers, one wants to specify the number format in a precise and portable manner. For that one may modify the grammar of the R7RS to be the following:
⟨real R⟩ → ⟨real numeral R⟩
| ⟨represented flonum R⟩
⟨real numeral R⟩ → ⟨sign⟩ ⟨ureal R⟩ | ⟨infnan⟩
⟨represented flonum R⟩ → #fl( ⟨representation name⟩ ⟨real numeral R⟩ )
⟨representation name⟩ → binary16 | binary32 | …
For example, #fl(binary256 1e400) reads as a finite number,
while #fl(binary64 1e400) reads the same as #fl(binary64 +inf.0).
The syntax allows for complex numbers to be written with mixed precision: for example,
#fl(binary32 1.0)+#fl(binary64 2.0)i.
A portable implementation is impossible. In general, a complete implementation of this SRFI would require knowledge of what optimizations occur on floating point operations, and the target architecture.
Most implementations have only one floating-point type.
Such an implementation can copy most of its
SRFI 144
implementation to (srfi ### binary64) with minor renamings
without issue.
The simplest way to implement this SRFI is an FFI to C’s
fenv.h.
Checking the FTZ/DAZ mode (for example, on Intel CPUs) requires intrinsics to
check the MXCSR register.
A sample implementation specific to one implementation and architecture will be written.
Thanks to those in Working Group 2 for discussing the semantics of this SRFI. In particular, I would like to thank Zhu Zihao for lots of information gathering.
I thank Bradley Lucier for his input.
I thank the authors of SRFI 144, as this work builds on theirs.
I also thank William Kahan, whose work on IEEE 754 and his many complaints about how programming language designers fail to understand it influenced the design of this SRFI (even if I could not incorporate all of his suggestions).
© 2026 Peter McGoron.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.