Implement Pure BF16 Computation for Approximation #5

For inputs outside $[0,\frac{\pi}{2}]$, the current approach uses payne_hanek_reduc for range reduction. However, this implementation converts the BF16 input to FP32 for the reduction, and the reduced value is then converted back from FP32 to BF16. When computing cos (L571), the value is converted from BF16 to FP32 yet again, see below.

https://github.com/Max042004/bf16_approximation/blob/3e486079e033d0b3985e20a345bdf887914f58e7/floating_point/bf16.c#L563-L572

Ideally, range reduction should be performed entirely in BF16, with no round trip through FP32.

Referring to the LLVM implementation [1], one can compute the sin and cos approximations only for very small angles within $\frac{\pi}{32}$, and apply lookup-table corrections for the specific inputs whose error exceeds 0.5 ULP. This approach avoids the need for the FP32 computations in payne_hanek_reduc.

[1] [llvm - sinf16.cpp](https://github.com/llvm/llvm-project/blob/main/libc/src/math/generic/sinf16.cpp)

	if ((a.bits & 0x7FFF) >= 0b0011111111001010) {
		a = fp32_to_bf16(payne_hanek_reduc(bf16_to_fp32(a), &k));
	}

	// sin(x)
	bf16_t sin_x = chebyshev_sin_8degrees(a);

	// cos(x) = sin(pi/2 + x)
	float cos_a = pi_over_two_float + bf16_to_fp32(a);
	bf16_t cos_x = chebyshev_sin_8degrees(fp32_to_bf16(cos_a));
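To make the cost of the round trips in the snippet concrete, here is a minimal sketch of the two conversions. The struct layout is an assumption inferred from `a.bits`, and the `_sketch` functions are hypothetical stand-ins, not the repository's `bf16_to_fp32` and `fp32_to_bf16`. Widening is exact because BF16 is the upper half of an FP32 value, while narrowing rounds away the lower 16 bits.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout: the snippet's `a.bits` suggests a 16-bit payload. */
typedef struct { uint16_t bits; } bf16_t;

/* Widening is exact: the BF16 bits become the upper half of a binary32. */
static float bf16_to_fp32_sketch(bf16_t h)
{
    uint32_t u = (uint32_t)h.bits << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Narrowing rounds to nearest, ties to even (NaN handling omitted here). */
static bf16_t fp32_to_bf16_sketch(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    u += 0x7FFFu + ((u >> 16) & 1u);
    bf16_t h = { (uint16_t)(u >> 16) };
    return h;
}
```

Under these assumptions, the threshold `0b0011111111001010` (0x3FCA) decodes to 1.578125, the first BF16 value above $\frac{\pi}{2}$, and $\frac{\pi}{2}$ itself narrows to 0x3FC9, which is 1.5703125, an error of roughly $4.8 \times 10^{-4}$. Every narrowing in the snippet can introduce a rounding error of this kind.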
