-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audio Math: Improve audio format saturation functions. #9344
Merged
lgirdwood
merged 10 commits into
thesofproject:main
from
ShriramShastry:shastry_saturation_hifi_performance
Aug 22, 2024
Merged
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
1fae91a
Audio Format: Update SPDX-License and copyright
ShriramShastry 03b2ca4
Audio Format: Include required libraries
ShriramShastry 0abfd5f
Audio Format: Improve sat_int32 function
ShriramShastry 1e0e080
Audio Format: Add vec_sat_int32x2 function
ShriramShastry 7ab5a58
Audio Format: Improve sat_int24 function
ShriramShastry f36904a
Audio Format: Add vec_sat_int24x2 function
ShriramShastry 10ad635
Audio Format: Improve sat_int16 function
ShriramShastry 0d7a041
Audio Format: Add vec_sat_int16x4 function
ShriramShastry c8efb45
Audio Format: Improve sat_int8 function
ShriramShastry c6af29a
Audio Format: Add vec_sat_int8x2 function
ShriramShastry File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,40 +1,177 @@ | ||
/* SPDX-License-Identifier: BSD-3-Clause | ||
* | ||
* Copyright(c) 2021 Intel Corporation. All rights reserved. | ||
* Copyright(c) 2024 Intel Corporation. All rights reserved. | ||
* | ||
* Author: Shriram Shastry <malladi.sastry@linux.intel.com> | ||
*/ | ||
|
||
#ifndef __SOF_AUDIO_FORMAT_HIFI3_H__ | ||
#define __SOF_AUDIO_FORMAT_HIFI3_H__ | ||
|
||
#include <stdint.h> | ||
#include <xtensa/config/defs.h> | ||
#include <xtensa/tie/xt_hifi3.h> | ||
#include <stdint.h> | ||
|
||
/* Saturation inline functions */ | ||
|
||
/** | ||
* @brief Saturate a 64-bit integer to 32-bit. | ||
* | ||
* @param x 64-bit integer. | ||
* @return 32-bit saturated integer. | ||
* | ||
* This function takes a 64-bit integer, performs bitwise shifting, | ||
* and returns a saturated 32-bit integer. | ||
*/ | ||
static inline int32_t sat_int32(int64_t x) | ||
{ | ||
return (ae_int32)AE_ROUND32F48SSYM(AE_SLAI64(x, 16)); | ||
/* Shift left by 32 bits with saturation */ | ||
ae_f64 shifted = AE_SLAI64S(x, 32); | ||
|
||
/* Shift right by 32 bits */ | ||
return (int32_t)AE_MOVINT32_FROMINT64(AE_SRAI64(shifted, 32)); | ||
} | ||
|
||
/** | ||
* @brief Saturate and round two 64-bit integers to 32-bit packed into a 32x2 vector. | ||
* | ||
* @param x 64-bit integer. | ||
* @param y 64-bit integer. | ||
* @return Packed 32-bit saturated integers. | ||
* | ||
* This function takes two 64-bit integers, converts them to ae_f64 format, | ||
* and returns them as a packed ae_int32x2 vector after saturation. | ||
*/ | ||
static inline ae_int32x2 vec_sat_int32x2(int64_t x, int64_t y) | ||
{ | ||
ae_f64 d0, d1; | ||
|
||
/* Convert 64-bit integers to ae_f64 format */ | ||
d0 = (ae_f64)x; | ||
d1 = (ae_f64)y; | ||
|
||
/* Round and saturate both 64-bit values to 32-bit and pack them */ | ||
return (ae_int32x2)AE_ROUND32X2F64SSYM(d0, d1); | ||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit about commit text, the new version isn't more accurate, it's mote efficient. |
||
/** | ||
* @brief Saturate and round a 32-bit integer to 24-bit. | ||
* | ||
* @param x 32-bit integer. | ||
* @return 24-bit saturated integer. | ||
* | ||
* This function takes a 32-bit integer and saturates it to a 24-bit integer | ||
* using saturating arithmetic instructions. | ||
*/ | ||
static inline int32_t sat_int24(int32_t x) | ||
{ | ||
return AE_SRAI32(AE_SLAI32S(x, 8), 8); | ||
/* Move 32-bit value to ae_f32x2 type */ | ||
ae_f32x2 d0 = AE_MOVDA32(x); | ||
|
||
/* Saturate to 24-bit */ | ||
return (ae_int32)AE_SAT24S(d0); | ||
singalsu marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
|
||
/** | ||
* @brief Saturate and round two 32-bit integers to 24-bit packed into a 32x2 vector. | ||
* | ||
* @param x 32-bit integer. | ||
* @param y 32-bit integer. | ||
* @return Packed 24-bit saturated integers. | ||
* | ||
* This function takes two 32-bit integers, packs them into a 32x2 vector, | ||
* and performs saturation to 24-bit on each element. | ||
*/ | ||
static inline ae_int32x2 vec_sat_int24x2(int32_t x, int32_t y) | ||
{ | ||
/* Move two 32-bit values to ae_f32x2 type */ | ||
ae_f32x2 d0 = AE_MOVDA32X2(x, y); | ||
|
||
/* Saturate both values to 24-bit and pack them */ | ||
return (ae_int32x2)AE_SAT24S(d0); | ||
singalsu marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
|
||
/** | ||
* @brief Saturate a 32-bit integer to 16-bit. | ||
* | ||
* @param x 32-bit integer. | ||
* @return 16-bit saturated integer. | ||
* | ||
* This function takes a 32-bit integer, packs it into a vector, | ||
* and performs saturation to 16-bit on each element. | ||
*/ | ||
static inline int16_t sat_int16(int32_t x) | ||
{ | ||
return AE_SAT16X4(x, x); | ||
singalsu marked this conversation as resolved.
Show resolved
Hide resolved
|
||
/* Move 32-bit value to ae_int32x2 type */ | ||
ae_int32x2 d0 = AE_MOVDA32(x); | ||
|
||
/* Saturate both values to 16-bit and pack them */ | ||
ae_int16x4 result = AE_SAT16X4(d0, d0); | ||
|
||
/* Use AE_MOVAD16_0 to extract the first 16-bit value */ | ||
return AE_MOVAD16_0(result); | ||
} | ||
|
||
/** | ||
* @brief Saturate and round four 32-bit integers to 16-bit packed into a 16x4 vector. | ||
* | ||
* @param x 32-bit integer. | ||
* @param y 32-bit integer. | ||
* @param z 32-bit integer. | ||
* @param q 32-bit integer. | ||
* @return Packed 16-bit saturated integers. | ||
* | ||
* This function takes four 32-bit integers, packs them into two 32x2 vectors, | ||
* and performs 16-bit saturation on each element, returning the result in a 16x4 vector. | ||
*/ | ||
static inline ae_int16x4 vec_sat_int16x4(int32_t x, int32_t y, int32_t z, int32_t q) | ||
{ | ||
/* Move four 32-bit values to ae_int32x2 type */ | ||
ae_int32x2 d0 = AE_MOVDA32X2(x, y); | ||
ae_int32x2 d1 = AE_MOVDA32X2(z, q); | ||
|
||
/* Saturate all values to 16-bit and pack them */ | ||
return AE_SAT16X4(d0, d1); | ||
} | ||
|
||
singalsu marked this conversation as resolved.
Show resolved
Hide resolved
|
||
/** | ||
* @brief Saturate and round a 32-bit integer to 8-bit. | ||
* | ||
* @param x 32-bit integer. | ||
* @return 8-bit saturated integer. | ||
* | ||
* This function takes a 32-bit integer, performs bitwise operations to extract the lower 8 | ||
* bits, and returns an 8-bit saturated integer. | ||
*/ | ||
static inline int8_t sat_int8(int32_t x) | ||
{ | ||
if (x > INT8_MAX) | ||
return INT8_MAX; | ||
else if (x < INT8_MIN) | ||
return INT8_MIN; | ||
else | ||
return (int8_t)x; | ||
/* Shift left by 24 bits */ | ||
ae_f32x2 a_i = AE_SLAI32S(x, 24); | ||
|
||
/* Shift right by 24 bits to sign-extend the 8-bit value */ | ||
a_i = AE_SRAI32(a_i, 24); | ||
|
||
/* Extract the lower 8 bits as an int8_t */ | ||
return (int8_t)AE_MOVAD32_L(a_i); | ||
} | ||
|
||
/** | ||
* @brief Saturate and round two 32-bit integers to 8-bit packed into an 8x2 vector. | ||
* | ||
* @param x 32-bit integer. | ||
* @param y 32-bit integer. | ||
* @return Packed 8-bit saturated integers. | ||
* | ||
* This function takes two 32-bit integers, packs them into a 32x2 vector, | ||
* and performs 8-bit saturation on each element, returning the result in an 8x2 vector. | ||
*/ | ||
static inline ae_int32x2 vec_sat_int8x2(int32_t x, int32_t y) | ||
{ | ||
/* Move two 32-bit values to ae_f32x2 type */ | ||
ae_f32x2 d0 = AE_MOVDA32X2(x, y); | ||
|
||
/* Shift left by 24 bits and then right by 24 bits to sign-extend and saturate */ | ||
return AE_F32X2_SRAI(AE_F32X2_SLAIS(d0, 24), 24); | ||
} | ||
|
||
#endif /* __SOF_AUDIO_FORMAT_HIFI3_H__ */ |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes an overflow with very large input values. But wouldn't this be a simpler fix:
return (ae_int32)AE_ROUND32F48SSYM(AE_SLAI64S(x, 16));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, do you have idea why your version is faster in your test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/thesofproject/sof/issues/9285
With input values of -9223372036854775808 and 9223372036854775807 . It is expected to return -2147483648 and 2147483647 but we are observing 0 and -1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is possible overflow with AE_SLAI64() instead of AE_SLAI64S(). The proper link to issue is: #9285.
Not sure, but without looking disasembly I'm guessing your version has the small speed advantage from easier type casts. Maybe (int32_t)AE_MOVINT32_FROMINT64() is just a normal store operation. I'm OK with this if you can change the typecast from (int) to (int32_t), if the code still compiles, since the that's type declared type of the function.