The PMADDWD/PMADDUBSW intrinsics can all be used in constexpr with suitable handling of the __builtin_ia32_pmadd* builtins inside VectorExprEvaluator::VisitCallExpr and InterpBuiltin.cpp, similar to #152540:
_mm_madd_pi16
_mm_madd_epi16 _mm256_madd_epi16 _mm512_madd_epi16
_mm_mask_madd_epi16 _mm256_mask_madd_epi16 _mm512_mask_madd_epi16
_mm_maskz_madd_epi16 _mm256_maskz_madd_epi16 _mm512_maskz_madd_epi16
_mm_maddubs_pi16
_mm_maddubs_epi16 _mm256_maddubs_epi16 _mm512_maddubs_epi16
_mm_mask_maddubs_epi16 _mm256_mask_maddubs_epi16 _mm512_mask_maddubs_epi16
_mm_maskz_maddubs_epi16 _mm256_maskz_maddubs_epi16 _mm512_maskz_maddubs_epi16
The constant folding doesn't map to a single APInt operation, so it should be handled locally. It's also a horizontal pairwise addition, so that will need handling as well:
// Split Lo/Hi element pairs, extend and add together.
// PMADDWD(X,Y) =
// add(mul(sext(lhs[0]),sext(rhs[0])),mul(sext(lhs[1]),sext(rhs[1])))
// PMADDUBSW(X,Y) =
// sadd_sat(mul(zext(lhs[0]),sext(rhs[0])),mul(zext(lhs[1]),sext(rhs[1])))
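For reference, the per-pair semantics above can be sketched as a scalar model in plain C++17. This is only an illustration and not Clang code: `pmaddwd_ref` and `pmaddubsw_ref` are hypothetical names, and the real evaluator would operate on APSInt vector elements rather than `std::array`. It assumes the PMADDUBSW lhs is already unsigned, as suggested in the note that follows.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reference model of PMADDWD: i16 x i16 -> i32, adjacent pairs summed.
template <std::size_t M>
constexpr std::array<int32_t, M / 2>
pmaddwd_ref(const std::array<int16_t, M> &lhs, const std::array<int16_t, M> &rhs) {
  static_assert(M % 2 == 0, "needs an even number of elements");
  std::array<int32_t, M / 2> out{};
  for (std::size_t i = 0; i != M / 2; ++i) {
    // Sign-extend and multiply in 64 bits so the pairwise sum cannot overflow.
    int64_t lo = int64_t(lhs[2 * i]) * rhs[2 * i];
    int64_t hi = int64_t(lhs[2 * i + 1]) * rhs[2 * i + 1];
    // The add wraps to 32 bits (no saturation).
    out[i] = static_cast<int32_t>(static_cast<uint32_t>(lo + hi));
  }
  return out;
}

// Hypothetical reference model of PMADDUBSW: u8 x i8 -> i16, pairs summed with
// signed saturation.
template <std::size_t M>
constexpr std::array<int16_t, M / 2>
pmaddubsw_ref(const std::array<uint8_t, M> &lhs, const std::array<int8_t, M> &rhs) {
  static_assert(M % 2 == 0, "needs an even number of elements");
  std::array<int16_t, M / 2> out{};
  for (std::size_t i = 0; i != M / 2; ++i) {
    // lhs is zero-extended, rhs is sign-extended; each product fits in i16.
    int32_t lo = int32_t(lhs[2 * i]) * int32_t(rhs[2 * i]);
    int32_t hi = int32_t(lhs[2 * i + 1]) * int32_t(rhs[2 * i + 1]);
    int32_t sum = lo + hi;
    // Clamping the exact sum is equivalent to sadd_sat on the two i16 products.
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    out[i] = static_cast<int16_t>(sum);
  }
  return out;
}

// Edge cases worth covering in tests: PMADDWD wraps, PMADDUBSW saturates.
static_assert(pmaddwd_ref(std::array<int16_t, 2>{-32768, -32768},
                          std::array<int16_t, 2>{-32768, -32768})[0] == INT32_MIN,
              "PMADDWD wraps on overflow");
static_assert(pmaddubsw_ref(std::array<uint8_t, 2>{255, 255},
                            std::array<int8_t, 2>{127, 127})[0] == INT16_MAX,
              "PMADDUBSW saturates");
```

The two `static_assert`s pick the inputs where wrapping and saturation differ, which are probably the cases the constexpr tests for these intrinsics should include.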
We might need to consider adjusting the PMADDUBSW builtins to take an unsigned lhs argument as well.