Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up needed_blocks calculations in sharktank and shortfin #652

Open
renxida opened this issue Dec 5, 2024 · 0 comments
Open

Clean up needed_blocks calculations in sharktank and shortfin #652

renxida opened this issue Dec 5, 2024 · 0 comments

Comments

@renxida
Copy link
Contributor

renxida commented Dec 5, 2024

Here's an example. This is what sharktank uses for the decode step:

    # sharktank/sharktank/examples/paged_llm_v1.py:134
    def allocate_seq_block_ids(self):
        for i in range(self.bs):
            sl = int(self.seq_lens[i])

            if (sl % self.parent.block_seq_stride) == 0:
                needed_blocks = sl // self.parent.block_seq_stride + 1
            else:
                needed_blocks = math.ceil(sl / self.parent.block_seq_stride)

            block_ids_row = self.seq_block_ids[i]
            while len(block_ids_row) < needed_blocks:
                block_ids_row.append(self.parent.alloc_page())

There are 3 problems with the section:

            if (sl % self.parent.block_seq_stride) == 0:
                needed_blocks = sl // self.parent.block_seq_stride + 1
            else:
                needed_blocks = math.ceil(sl / self.parent.block_seq_stride)
  1. It's too complicated. This is really equivalent to needed_blocks = math.ceil( (sl + 1) / block_seq_stride. proof
  2. It allocates one extra page on block edges. It really should be needed_blocks = math.ceil( sl / block_seq_stride )
  3. It uses math.ceil instead of pure integer math. why this is bad
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant