@j2kun
Collaborator

@j2kun j2kun commented Oct 31, 2025

This PR adds baby-step giant-step (BSGS) support for the bicyclic matmul kernel, as per Section 6.2.2 of https://eprint.iacr.org/2025/1200 and #2359 (comment).

It does so by extracting the subset of the "rotate and reduce" kernel that implements the BSGS routine into a separate helper, which is sufficiently generalized to express the bicyclic matmul variant of BSGS. This requires two changes:

  • Adding the ability to have just one "plaintext" operand (which is renamed to the "babySteppedOperand") instead of a tensor of plaintexts that is extracted from.
  • Adding a callback that computes the actual rotation to use for the baby stepped operand, which is different in bicyclic matmul than in the simpler rotate-and-reduce case.
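
As a rough illustration of the two changes above, here is a plaintext-only Python sketch of BSGS for the simpler rotate-and-reduce case. It is not the helper from this PR: `rotate`, `naive_rotate_and_reduce`, `bsgs_rotate_and_reduce`, and the `baby_rotation` callback are hypothetical names, plain lists stand in for ciphertexts, and the default callback (rotate by the baby-step index) is the simple case. The bicyclic variant would presumably supply a different callback.

```python
def rotate(v, k):
    """Cyclically rotate the list v left by k positions (k may be negative)."""
    k %= len(v)
    return v[k:] + v[:k]


def naive_rotate_and_reduce(x, plaintexts):
    """Reference: sum over k of plaintexts[k] * rotate(x, k), elementwise."""
    result = [0] * len(x)
    for k, p in enumerate(plaintexts):
        rx = rotate(x, k)
        result = [r + a * b for r, a, b in zip(result, p, rx)]
    return result


def bsgs_rotate_and_reduce(x, plaintexts, num_baby_steps,
                           baby_rotation=lambda i: i):
    """Same sum, grouped into baby steps and giant steps.

    baby_rotation is a hypothetical callback mapping a baby-step index to
    the rotation applied to the single baby-stepped operand x.
    """
    steps = len(plaintexts)
    assert steps % num_baby_steps == 0, "this sketch requires divisibility"
    num_giant_steps = steps // num_baby_steps

    # Baby steps: rotate the operand once per baby-step index and reuse it.
    baby = [rotate(x, baby_rotation(i)) for i in range(num_baby_steps)]

    result = [0] * len(x)
    for j in range(num_giant_steps):
        shift = j * num_baby_steps
        inner = [0] * len(x)
        for i in range(num_baby_steps):
            # Pre-rotate the plaintext backwards so that one giant-step
            # rotation of the inner sum restores its alignment.
            p = rotate(plaintexts[shift + i], -shift)
            inner = [s + a * b for s, a, b in zip(inner, p, baby[i])]
        # Giant step: one rotation per group instead of one per term.
        result = [r + s for r, s in zip(result, rotate(inner, shift))]
    return result


x = [1, 2, 3, 4]
plaintexts = [[1, 0, 2, 1], [0, 3, 1, 1], [2, 2, 0, 1], [1, 1, 1, 5]]
assert bsgs_rotate_and_reduce(x, plaintexts, 2) == naive_rotate_and_reduce(x, plaintexts)
```

The grouping relies on rotation distributing over elementwise multiplication, which is why the plaintexts are rotated by the negated giant-step offset before being multiplied.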

[EDIT]: The rest of this description covers the issue I was seeing before I rediscovered #2162, which has suggestions for how to fix it.

I did notice that in the test I ran with n=124, the actual number of rotations is 158, while the predicted formula n + 2 sqrt(n) - 3 gives about 143 rotations. 158 is much closer to n + 3 sqrt(n), which is about 157. This led me to run the kernel for a range of dimension sizes (x, x+1, x+2) for x odd, and I got this chart:

[image: chart of rotation counts across the tested dimension sizes]

Note that the troughs seem to line up with the bound claimed in the paper, but the peaks are fairly close to 3n/2, which suggests a bug in my code. A concrete example that exhibits this behavior is n=133, m=134, p=135.
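
The n=124 comparison can be checked with a few lines of Python. This is only the arithmetic from the description; the variable names are made up for illustration, and 3n/2 is evaluated at n=124 purely for scale, since the peaks were observed on the chart rather than at this specific size.

```python
import math

n = 124
observed = 158  # rotation count reported for the n=124 test

paper_bound = n + 2 * math.sqrt(n) - 3  # about 143.3
alt_estimate = n + 3 * math.sqrt(n)     # about 157.4
peak_estimate = 3 * n / 2               # 186.0

print(round(paper_bound), round(alt_estimate), peak_estimate)
```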

I don't have time to dig in before the weekend, so posting in case anyone has a chance to look over it before I return to work Monday.

@j2kun j2kun marked this pull request as draft October 31, 2025 17:35
@j2kun j2kun force-pushed the bicyclic-matmul-bsgs branch from cc216d2 to c2162e6 Compare October 31, 2025 22:57
@j2kun j2kun marked this pull request as ready for review October 31, 2025 22:59
@j2kun j2kun requested review from asraa and lawrencekhlim October 31, 2025 23:10
@j2kun
Collaborator Author

j2kun commented Oct 31, 2025

The culprit may be this heuristic with a TODO (which is hidden by the diff in this PR because it is unchanged):

  // Use a value of sqrt(n) as the baby step / giant step size.
  int64_t numBabySteps = static_cast<int64_t>(std::floor(std::sqrt(steps)));
  if (steps % numBabySteps != 0) {
    // Find the nearest divisible number to use for baby step
    // TODO(#2162): determine the right tradeoff here
    int lower = numBabySteps;
    int upper = numBabySteps;

    while (steps % lower != 0 && steps % upper != steps) {
      lower--;
      upper++;
    }

    if (steps % lower == 0 && lower > 1) {
      numBabySteps = lower;
    } else if (steps % upper == 0) {
      numBabySteps = upper;
    } else {
      numBabySteps = steps;
    }
  }

@j2kun
Collaborator Author

j2kun commented Oct 31, 2025

This may just be caused by the divisibility requirement in the above code.

In Python:

import math

def find_baby_steps(steps):
    # Use a value of sqrt(n) as the baby step / giant step size
    num_baby_steps = int(math.floor(math.sqrt(steps)))
    
    if steps % num_baby_steps != 0:
        # Find the nearest divisible number to use for baby step
        lower = num_baby_steps
        upper = num_baby_steps
        
        while steps % lower != 0 and steps % upper != steps:
            lower -= 1
            upper += 1
        
        if steps % lower == 0 and lower > 1:
            num_baby_steps = lower
        elif steps % upper == 0:
            num_baby_steps = upper
        else:
            num_baby_steps = steps
    
    return num_baby_steps

Then for an input like 137, which is prime, the search finds no divisor other than 1, so the function falls back to using all 137 steps as baby steps.

find_baby_steps(137)  # floor(sqrt(137)) = 11
137
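
To see how often this fallback triggers, the function can be swept over a range of step counts. The sketch below repeats `find_baby_steps` from this comment so that it runs on its own. The range 120 to 139 is an arbitrary choice, and treating the dimensions 133, 134, 135 from the PR description as direct inputs to `steps` is an assumption, not something confirmed by the kernel code.

```python
import math


def find_baby_steps(steps):
    # Same heuristic as in the comment above, copied without behavior changes.
    num_baby_steps = int(math.floor(math.sqrt(steps)))
    if steps % num_baby_steps != 0:
        lower = num_baby_steps
        upper = num_baby_steps
        while steps % lower != 0 and steps % upper != steps:
            lower -= 1
            upper += 1
        if steps % lower == 0 and lower > 1:
            num_baby_steps = lower
        elif steps % upper == 0:
            num_baby_steps = upper
        else:
            num_baby_steps = steps
    return num_baby_steps


# Inputs for which the heuristic degenerates to "all baby steps".
degenerate = [n for n in range(120, 140) if find_baby_steps(n) == n]
print(degenerate)  # [127, 131, 137, 139]

# (baby steps, giant steps) for the dimensions in the PR description,
# assuming each one is passed directly as `steps`.
splits = {n: (find_baby_steps(n), n // find_baby_steps(n)) for n in (133, 134, 135)}
print(splits)  # {133: (7, 19), 134: (2, 67), 135: (9, 15)}
```

Under these assumptions, 134 splits into 2 baby steps and 67 giant steps, far from a balanced sqrt(n) split, so the divisibility requirement can hurt composite inputs as well as primes.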

Ah, #2162 already has tons of people giving suggestions for what to do! Now it seems it's necessary to resolve that issue :)

In the meantime, I think we can submit this PR as-is and improve it going forward.

@lawrencekhlim
Collaborator

Yeah, I'll take a look.

Collaborator

@lawrencekhlim lawrencekhlim left a comment

Amazing work! At a high level, this looks great!

@j2kun j2kun force-pushed the bicyclic-matmul-bsgs branch from 021d561 to f8b3003 Compare November 5, 2025 18:46
@j2kun j2kun added the pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing label Nov 5, 2025
@copybara-service copybara-service bot merged commit 55ed92f into google:main Nov 5, 2025
10 checks passed