-
Notifications
You must be signed in to change notification settings - Fork 105
Add BSGS support for bicyclic matmul #2366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc216d2 to
c2162e6
Compare
|
The culprit may be this heuristic with a TODO (which is hidden by the diff in this PR because it is unchanged): // Use a value of sqrt(n) as the baby step / giant step size.
int64_t numBabySteps = static_cast<int64_t>(std::floor(std::sqrt(steps)));
if (steps % numBabySteps != 0) {
// Find the nearest divisible number to use for baby step
// TODO(#2162): determine the right tradeoff here
int lower = numBabySteps;
int upper = numBabySteps;
while (steps % lower != 0 && steps % upper != steps) {
lower--;
upper++;
}
if (steps % lower == 0 && lower > 1) {
numBabySteps = lower;
} else if (steps % upper == 0) {
numBabySteps = upper;
} else {
numBabySteps = steps;
}
} |
|
This may just be caused by the divisibility requirement in the above code. In python Then for an example input like 137, which is prime, we don't find anything divisible and go full baby steps. Ah, #2162 already has tons of people giving suggestions for what to do! Now it seems it's necessary to resolve that issue :) In the mean time, I think we can submit this PR as-is and improve it going forward. |
b88a1cb to
021d561
Compare
|
Yeah, I'll take a look. |
lawrencekhlim
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work! At a high level, this looks great!
021d561 to
f8b3003
Compare
This PR adds baby-step-giant-step support for the bicyclic matmul kernel, as per 6.2.2 of https://eprint.iacr.org/2025/1200 and #2359 (comment)
It does so by extracting the subset of the "rotate and reduce" kernel that implements the BSGS routine into a separate helper, which is sufficiently generalized to express the bicyclic matmul variant of BSGS. This is two steps:
[EDIT]: the stuff below was the issue I was seeing before I found #2162 again which has suggestions for what to do to fix the issue.
I did notice that for the test I ran, for
n=124the actual number of rotations is158, while the predicted formula ofn + 2sqrt(n) - 3gives 143 rotations. 158 is closer ton + 3 sqrt(n)than the other formula. This led me to run the kernel for a range of dimension sizes (x, x+1, x+2) for x odd, and I got this chartNote troughs seem to line up with the claimed bound in the paper, but the peaks are decently close to
3n/2, which seems like a bug in my code. A concrete example that exhibits this behavior isn=133, m=134, p=135I don't have time to dig in before the weekend, so posting in case anyone has a chance to look over it before I return to work Monday.