Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: add BLAS Level 1 drot routine #276

Closed
3 tasks done
kgryte opened this issue Mar 13, 2020 · 11 comments · Fixed by #1823
Closed
3 tasks done

RFC: add BLAS Level 1 drot routine #276

kgryte opened this issue Mar 13, 2020 · 11 comments · Fixed by #1823
Assignees
Labels
Accepted RFC feature request which has been accepted. BLAS Issue or pull request related to Basic Linear Algebra Subprograms (BLAS). C Issue involves or relates to C. difficulty: 3 Likely to be challenging but manageable. Feature Issue or pull request for adding a new feature. JavaScript Issue involves or relates to JavaScript. Native Addons Issue involves or relates to Node.js native add-ons. priority: High High priority concern or feature request. RFC Request for comments. Feature requests and proposed changes.

Comments

@kgryte
Copy link
Member

kgryte commented Mar 13, 2020

Checklist

Please ensure the following tasks are completed before submitting a feature request.

  • Read and understood the Code of Conduct.
  • Searched for existing issues and pull requests.
  • The issue name begins with RFC:.

Description

Description of the feature request.

This RFC proposes to add a routine to apply a plane rotation as defined in BLAS Level 1 routines. Specifically adding @stdlib/blas/base/drot is proposed.

Related Issues

Does this feature request have any related issues?

No.

Questions

Any questions for reviewers?

No.

Other

Any other information relevant to this feature request? This may include screenshots, references, sample output, and/or implementation notes.

No.

@kgryte kgryte added Feature Issue or pull request for adding a new feature. difficulty: 3 Likely to be challenging but manageable. Math Issue or pull request specific to math functionality. RFC Request for comments. Feature requests and proposed changes. labels Mar 13, 2020
@kgryte kgryte added BLAS Issue or pull request related to Basic Linear Algebra Subprograms (BLAS). and removed Math Issue or pull request specific to math functionality. labels Jun 26, 2020
@Pranavchiku
Copy link
Member

@aman-095 if you wish you can proceed with this one.

Fortran implementation: netlib

Reference PR: #929

@aman-095
Copy link
Member

Sure @Pranavchiku I would like to work on this. Please assign me.

@Pranavchiku
Copy link
Member

Pranavchiku commented Mar 10, 2024

Follow: @stdlib/blas/base/dswap.

@kgryte kgryte added Native Addons Issue involves or relates to Node.js native add-ons. JavaScript Issue involves or relates to JavaScript. C Issue involves or relates to C. priority: High High priority concern or feature request. labels Mar 10, 2024
@kgryte
Copy link
Member Author

kgryte commented Mar 10, 2024

Reference implementations:

@Pranavchiku
Copy link
Member

@kgryte
Copy link
Member Author

kgryte commented Mar 10, 2024

@Pranavchiku Ah, yes, those relatively inscrutable test cases. 😅

@kgryte kgryte added the Accepted RFC feature request which has been accepted. label Mar 10, 2024
@Pranavchiku
Copy link
Member

To be honest, I am unable to figure out how those are working XD

@kgryte
Copy link
Member Author

kgryte commented Mar 10, 2024

IIRC you were able to figure out for drotg/srotg, correct?

@Pranavchiku
Copy link
Member

Yes, they have values listed in data blocks. I do not remember it well, will try to figure out something.

@Pranavchiku
Copy link
Member

Run the following file using gfortran a.f && ./a.out to get corresponding input and output values.

Fortran file ( a.f )
      SUBROUTINE DROT(N,DX,INCX,DY,INCY,C,S)
*
*  -- Reference BLAS level1 routine (version 3.8.0) --
*  -- Reference BLAS is a software package provided by Univ. of Tennessee,    --
*  -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
*     November 2017
*
*     .. Scalar Arguments ..
      DOUBLE PRECISION C,S
      INTEGER INCX,INCY,N
*     ..
*     .. Array Arguments ..
      DOUBLE PRECISION DX(*),DY(*)
*     ..
*
*  =====================================================================
*
*     .. Local Scalars ..
      DOUBLE PRECISION DTEMP
      INTEGER I,IX,IY
*     ..
      IF (N.LE.0) RETURN
      IF (INCX.EQ.1 .AND. INCY.EQ.1) THEN
*
*       code for both increments equal to 1
*
         DO I = 1,N
            DTEMP = C*DX(I) + S*DY(I)
            DY(I) = C*DY(I) - S*DX(I)
            DX(I) = DTEMP
         END DO
      ELSE
*
*       code for unequal increments or equal increments not equal
*         to 1
*
         IX = 1
         IY = 1
         IF (INCX.LT.0) IX = (-N+1)*INCX + 1
         IF (INCY.LT.0) IY = (-N+1)*INCY + 1
         DO I = 1,N
            DTEMP = C*DX(IX) + S*DY(IY)
            DY(IY) = C*DY(IY) - S*DX(IX)
            DX(IX) = DTEMP
            IX = IX + INCX
            IY = IY + INCY
         END DO
      END IF
      RETURN
      END SUBROUTINE DROT

      SUBROUTINE CHECK3(SFAC)
*     .. Parameters ..
      INTEGER           NOUT
      PARAMETER         (NOUT=6)
*     .. Scalar Arguments ..
      DOUBLE PRECISION  SFAC
*     .. Scalars in Common ..
      INTEGER           ICASE, INCX, INCY, N
      LOGICAL           PASS
*     .. Local Scalars ..
      DOUBLE PRECISION  SC, SS
      INTEGER           I, K, KI, KN, KSIZE, LENX, LENY, MX, MY
*     .. Local Arrays ..
      DOUBLE PRECISION  COPYX(5), COPYY(5), DT9X(7,4,4), DT9Y(7,4,4),
     +                  DX1(7), DY1(7), MWPC(11), MWPS(11), MWPSTX(5),
     +                  MWPSTY(5), MWPTX(11,5), MWPTY(11,5), MWPX(5),
     +                  MWPY(5), SSIZE2(14,2), STX(7), STY(7), SX(7),
     +                  SY(7)
      INTEGER           INCXS(4), INCYS(4), LENS(4,2), MWPINX(11),
     +                  MWPINY(11), MWPN(11), NS(4)
*     .. External Subroutines ..
      EXTERNAL          DROT, STEST
*     .. Intrinsic Functions ..
      INTRINSIC         ABS, MIN
*     .. Common blocks ..
      COMMON            /COMBLA/ICASE, N, INCX, INCY, PASS
*     .. Data statements ..
      DATA              INCXS/1, 2, -2, -1/
      DATA              INCYS/1, -2, 1, -2/
      DATA              LENS/1, 1, 2, 4, 1, 1, 3, 7/
      DATA              NS/0, 1, 2, 4/
      DATA              DX1/0.6D0, 0.1D0, -0.5D0, 0.8D0, 0.9D0, -0.3D0,
     +                  -0.4D0/
      DATA              DY1/0.5D0, -0.9D0, 0.3D0, 0.7D0, -0.6D0, 0.2D0,
     +                  0.8D0/
      DATA              SC, SS/0.8D0, 0.6D0/
      DATA              DT9X/0.6D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.78D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.78D0, -0.46D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.78D0, -0.46D0, -0.22D0,
     +                  1.06D0, 0.0D0, 0.0D0, 0.0D0, 0.6D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.78D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.66D0, 0.1D0, -0.1D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.96D0, 0.1D0, -0.76D0, 0.8D0, 0.90D0,
     +                  -0.3D0, -0.02D0, 0.6D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.78D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, -0.06D0, 0.1D0,
     +                  -0.1D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.90D0,
     +                  0.1D0, -0.22D0, 0.8D0, 0.18D0, -0.3D0, -0.02D0,
     +                  0.6D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.78D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.78D0, 0.26D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.78D0, 0.26D0, -0.76D0, 1.12D0,
     +                  0.0D0, 0.0D0, 0.0D0/
      DATA              DT9Y/0.5D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.04D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.04D0, -0.78D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.04D0, -0.78D0, 0.54D0,
     +                  0.08D0, 0.0D0, 0.0D0, 0.0D0, 0.5D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.04D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.7D0,
     +                  -0.9D0, -0.12D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.64D0, -0.9D0, -0.30D0, 0.7D0, -0.18D0, 0.2D0,
     +                  0.28D0, 0.5D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.04D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.7D0, -1.08D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.64D0, -1.26D0,
     +                  0.54D0, 0.20D0, 0.0D0, 0.0D0, 0.0D0, 0.5D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.04D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.04D0, -0.9D0, 0.18D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.04D0, -0.9D0, 0.18D0, 0.7D0,
     +                  -0.18D0, 0.2D0, 0.16D0/
      DATA              SSIZE2/0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0, 0.0D0,
     +                  0.0D0, 1.17D0, 1.17D0, 1.17D0, 1.17D0, 1.17D0,
     +                  1.17D0, 1.17D0, 1.17D0, 1.17D0, 1.17D0, 1.17D0,
     +                  1.17D0, 1.17D0, 1.17D0/
*     .. Executable Statements ..
*
      ICASE = 4
      DO 60 KI = 1, 4
         INCX = INCXS(KI)
         INCY = INCYS(KI)
         MX = ABS(INCX)
         MY = ABS(INCY)
*
         DO 40 KN = 1, 4
            N = NS(KN)
            KSIZE = MIN(2,KN)
            LENX = LENS(KN,MX)
            LENY = LENS(KN,MY)
*
            IF (ICASE.EQ.4) THEN
*              .. DROT ..
               DO 20 I = 1, 7
                  SX(I) = DX1(I)
                  SY(I) = DY1(I)
                  STX(I) = DT9X(I,KN,KI)
                  STY(I) = DT9Y(I,KN,KI)
   20          CONTINUE
                print *, "Input: "
                
                print *, "N: ", N
                print *, "SX: ", SX
                print *, "INCX: ", INCX
                print *, "SY: ", SY
                print *, "INCY: ", INCY
                print *, "SC: ", SC
                print *, "SS: ", SS

               CALL DROT(N,SX,INCX,SY,INCY,SC,SS)

               print *, "Output: "
               print *, "SX Output: ", SX
                print *, "SY Output: ", SY
                
              !  CALL STEST(LENX,SX,STX,SSIZE2(1,KSIZE),SFAC)
              !  CALL STEST(LENY,SY,STY,SSIZE2(1,KSIZE),SFAC)
            ELSE
               WRITE (NOUT,*) ' Shouldn''t be here in CHECK3'
               STOP
            END IF
   40    CONTINUE
   60 CONTINUE
*
      MWPC(1) = 1
      DO 80 I = 2, 11
         MWPC(I) = 0
   80 CONTINUE
      MWPS(1) = 0
      DO 100 I = 2, 6
         MWPS(I) = 1
  100 CONTINUE
      DO 120 I = 7, 11
         MWPS(I) = -1
  120 CONTINUE
      MWPINX(1) = 1
      MWPINX(2) = 1
      MWPINX(3) = 1
      MWPINX(4) = -1
      MWPINX(5) = 1
      MWPINX(6) = -1
      MWPINX(7) = 1
      MWPINX(8) = 1
      MWPINX(9) = -1
      MWPINX(10) = 1
      MWPINX(11) = -1
      MWPINY(1) = 1
      MWPINY(2) = 1
      MWPINY(3) = -1
      MWPINY(4) = -1
      MWPINY(5) = 2
      MWPINY(6) = 1
      MWPINY(7) = 1
      MWPINY(8) = -1
      MWPINY(9) = -1
      MWPINY(10) = 2
      MWPINY(11) = 1
      DO 140 I = 1, 11
         MWPN(I) = 5
  140 CONTINUE
      MWPN(5) = 3
      MWPN(10) = 3
      DO 160 I = 1, 5
         MWPX(I) = I
         MWPY(I) = I
         MWPTX(1,I) = I
         MWPTY(1,I) = I
         MWPTX(2,I) = I
         MWPTY(2,I) = -I
         MWPTX(3,I) = 6 - I
         MWPTY(3,I) = I - 6
         MWPTX(4,I) = I
         MWPTY(4,I) = -I
         MWPTX(6,I) = 6 - I
         MWPTY(6,I) = I - 6
         MWPTX(7,I) = -I
         MWPTY(7,I) = I
         MWPTX(8,I) = I - 6
         MWPTY(8,I) = 6 - I
         MWPTX(9,I) = -I
         MWPTY(9,I) = I
         MWPTX(11,I) = I - 6
         MWPTY(11,I) = 6 - I
  160 CONTINUE
      MWPTX(5,1) = 1
      MWPTX(5,2) = 3
      MWPTX(5,3) = 5
      MWPTX(5,4) = 4
      MWPTX(5,5) = 5
      MWPTY(5,1) = -1
      MWPTY(5,2) = 2
      MWPTY(5,3) = -2
      MWPTY(5,4) = 4
      MWPTY(5,5) = -3
      MWPTX(10,1) = -1
      MWPTX(10,2) = -3
      MWPTX(10,3) = -5
      MWPTX(10,4) = 4
      MWPTX(10,5) = 5
      MWPTY(10,1) = 1
      MWPTY(10,2) = 2
      MWPTY(10,3) = 2
      MWPTY(10,4) = 4
      MWPTY(10,5) = 3
      DO 200 I = 1, 11
         INCX = MWPINX(I)
         INCY = MWPINY(I)
         DO 180 K = 1, 5
            COPYX(K) = MWPX(K)
            COPYY(K) = MWPY(K)
            MWPSTX(K) = MWPTX(I,K)
            MWPSTY(K) = MWPTY(I,K)
  180    CONTINUE
          print *, "Here input: "
          
          print *, "N: ", MWPN(I)
          print *, "SX: ", COPYX
          print *, "INCX: ", INCX
          print *, "SY: ", COPYY
          print *, "INCY: ", INCY
          print *, "SC: ", MWPC(I)
          print *, "SS: ", MWPS(I)
         CALL DROT(MWPN(I),COPYX,INCX,COPYY,INCY,MWPC(I),MWPS(I))
          print *, "Here output: "
          print *, "SX Output: ", COPYX
          print *, "SY Output: ", COPYY
          
        !  CALL STEST(5,COPYX,MWPSTX,MWPSTX,SFAC)
        !  CALL STEST(5,COPYY,MWPSTY,MWPSTY,SFAC)
  200 CONTINUE
      RETURN
      END


      PROGRAM MAIN
        CALL CHECK3(1.0D-20)
      END PROGRAM

@aman-095
Copy link
Member

Thanks, @Pranavchiku I was confused about how to create the test files this code helps. I have pushed the final changes please review them.

kgryte added a commit that referenced this issue Apr 19, 2024
PR-URL: #1823
Closes: #276
Ref: #2039
Co-authored-by: Athan Reines <kgryte@gmail.com>
Co-authored-by: Pranav Goswami <goswami.4@iitj.ac.in>
Reviewed-by: Athan Reines <kgryte@gmail.com> 
Reviewed-by: Pranav Goswami <goswami.4@iitj.ac.in>
Signed-off-by: Athan Reines <kgryte@gmail.com>
Signed-off-by: Pranav Goswami <goswami.4@iitj.ac.in>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted RFC feature request which has been accepted. BLAS Issue or pull request related to Basic Linear Algebra Subprograms (BLAS). C Issue involves or relates to C. difficulty: 3 Likely to be challenging but manageable. Feature Issue or pull request for adding a new feature. JavaScript Issue involves or relates to JavaScript. Native Addons Issue involves or relates to Node.js native add-ons. priority: High High priority concern or feature request. RFC Request for comments. Feature requests and proposed changes.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants