testsuite reports failure for GEMM for Randomize vectors and matrices using powers of 2 in narrow precision range #413

Closed
kvaragan opened this issue Jun 17, 2020 · 6 comments

Comments

@kvaragan
Contributor

The testsuite reports the following failures:
Output:
blis_sgemm_cn_ccc 1 1 1 0.00 -nan FAILURE
blis_cgemm1m_hc_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_nc_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_cc_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_ct_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_ht_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_hh_ccc 1 1 1 0.00 -nan FAILURE
blis_zgemm1m_tc_ccc 1 1 1 0.00 -nan FAILURE

In input.general we set the randomization option to 1, which selects "Randomize vectors and matrices using powers of 2 in narrow precision range".
However, this error doesn't occur when we use real values on [-1,1].

@devinamatthews
Member

@kvaragan can you explain a little more how the initialization is done (or @fgvanzee if this is something built-in)?

@kvaragan
Contributor Author

The initialization is built into the testsuite.

@devinamatthews
Member

@fgvanzee ?

@fgvanzee
Member

@kvaragan Thanks for this bug report, Kiran. I've isolated the issue.

The problem is that the current implementation of both scalar randomization functions (regular or narrow powers of 2) can sometimes return 0. For the narrow powers of 2, this happens much more frequently because only a handful of values are possible. Why is this of concern? When I wrote the narrow powers of 2 randomizer, I figured that 0 should be a valid value in our test matrices just like any other value in the range. The problem is that it can sometimes cause the numerical test to malfunction because it will attempt to normalize by the largest value in the matrix, even if it's 0. (So the nans were presumably the result of computing 0/0.) This can very easily happen for 1x1 matrices when using the narrow powers of 2 randomization.
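
To make the failure mode above concrete, here is a minimal sketch in C. It is not the testsuite's actual code: `max_abs` and `normalized_residual` are hypothetical helpers, and the residual formula is an assumption chosen only to show how normalizing by the largest entry of an all-zero 1x1 operand yields 0.0/0.0, which is NaN under IEEE 754 arithmetic.

```c
#include <assert.h>
#include <math.h>    /* isnan */
#include <stddef.h>

/* Hypothetical helper: largest absolute value among the n entries of x. */
static double max_abs(const double *x, size_t n)
{
    assert(n > 0);
    double m = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double a = x[i] < 0.0 ? -x[i] : x[i];
        if (a > m) m = a;
    }
    return m;
}

/* Hypothetical residual (an assumption, not the BLIS formula): the summed
 * absolute error, normalized by the largest entry of the input operand. */
static double normalized_residual(const double *input,
                                  const double *computed,
                                  const double *expected,
                                  size_t n)
{
    double err = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = computed[i] - expected[i];
        err += d < 0.0 ? -d : d;
    }
    /* If every input entry is 0.0 and the result is exact, this is 0.0/0.0. */
    return err / max_abs(input, n);
}
```

With a 1x1 operand equal to 0.0 and an exact result, `normalized_residual` returns NaN, so any comparison against a threshold fails even though the computation itself was correct.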

So, in retrospect, this problem can manifest for randomization on the real range [-1,1], but never does in practice because it is highly improbable.

At first I was going to simply disallow 0 as a valid random value--likely for both types of randomization--but then I realized that the problem is not zeroes, per se, but rather objects that are completely zero. So instead, I think I will check the randomized object's 1-norm, and if it's zero, re-randomize until it's not zero.
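
The re-randomization idea can be sketched as follows. This is only an illustration under stated assumptions: `rand_narrow_pow2`, `norm1_vec`, and `randomize_nonzero` are hypothetical names, and the value table is made up to mimic a randomizer with very few possible outputs. It is not the BLIS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>  /* rand */

/* Hypothetical stand-in for a narrow randomizer: only nine possible
 * values, one of which is exactly 0.0 (illustrative table). */
static double rand_narrow_pow2(void)
{
    static const double vals[] = {
        0.0, 1.0, 0.5, 0.25, 0.125, -1.0, -0.5, -0.25, -0.125
    };
    return vals[rand() % 9];
}

/* 1-norm of a vector: the sum of the absolute values of its entries. */
static double norm1_vec(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i] < 0.0 ? -x[i] : x[i];
    return s;
}

/* Individual zeros stay legal; only an all-zero object is rejected,
 * in which case the whole object is randomized again. */
static void randomize_nonzero(double *x, size_t n)
{
    assert(n > 0);
    do {
        for (size_t i = 0; i < n; ++i)
            x[i] = rand_narrow_pow2();
    } while (norm1_vec(x, n) == 0.0);
}
```

For a 1x1 object this behaves like disallowing 0, while larger objects may still contain zero entries as long as at least one entry is nonzero.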

fgvanzee added a commit that referenced this issue Jun 17, 2020
Details:
- Fixed an innocuous bug that manifested when running the testsuite on
  extremely small matrices with randomization via the "powers of 2 in
  narrow precision range" option enabled. When the randomization
  function emits a perfect 0.0 to fill a 1x1 matrix, the testsuite will
  then compute 0.0/0.0 during the normalization process, which leads to
  NaN residuals. The solution entails smarter implementations of randv,
  randnv, randm, and randnm, each of which will compute the 1-norm of
  the vector or matrix in question. If the object has a 1-norm of 0.0,
  the object is re-randomized until the 1-norm is not 0.0. Thanks to
  Kiran Varaganti for reporting this issue (#413).
- Updated the implementation of randm_unb_var1() so that it loops over
  a call to the randv_unb_var1() implementation directly rather than
  calling it indirectly via randv(). This was done to avoid the overhead
  of multiple calls to norm1v() when randomizing the rows/columns of a
  matrix.
- Updated comments.
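
The second item in the commit message can be illustrated with a small sketch. All names here (`fill_vector_unchecked`, `norm1_matrix`, `randomize_matrix`) are hypothetical and the column-major layout is an assumption. The point is only the structure: the matrix routine loops over an unchecked per-column fill, then evaluates the norm once for the whole matrix instead of once per column.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>  /* rand */

/* Hypothetical inner fill with no norm check; it plays the role the
 * commit message assigns to the lower-level vector variant. */
static void fill_vector_unchecked(double *x, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = (double)(rand() % 3) - 1.0;  /* one of -1.0, 0.0, 1.0 */
}

/* Matrix 1-norm: the largest absolute column sum (column-major storage
 * with leading dimension lda). */
static double norm1_matrix(const double *a, size_t m, size_t n, size_t lda)
{
    double best = 0.0;
    for (size_t j = 0; j < n; ++j) {
        double s = 0.0;
        for (size_t i = 0; i < m; ++i) {
            double v = a[i + j * lda];
            s += v < 0.0 ? -v : v;
        }
        if (s > best) best = s;
    }
    return best;
}

/* Fill every column without a per-column check, then test the norm
 * once; repeat only if the entire matrix came out zero. */
static void randomize_matrix(double *a, size_t m, size_t n, size_t lda)
{
    assert(m > 0 && n > 0 && lda >= m);
    do {
        for (size_t j = 0; j < n; ++j)
            fill_vector_unchecked(a + j * lda, m);
    } while (norm1_matrix(a, m, n, lda) == 0.0);
}
```

Calling a checked vector routine per column would also force every individual column to be nonzero, which is stricter than the stated requirement that the object as a whole be nonzero.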
@fgvanzee
Member

@kvaragan Please try out b5b604e. It should fix the test failures (which were false to begin with).

@kvaragan
Contributor Author

It's resolved now. Thanks, Field.

pradeeptrgit pushed a commit to amd/blis that referenced this issue Jun 30, 2020

Change-Id: I0e3d65ff97b26afde614da746e17ed33646839d1