Skip to content

Conversation

@KrystofS
Copy link
Contributor

@KrystofS KrystofS commented Nov 5, 2025

Microsoft Reviewers: Open in CodeFlow

Copilot AI review requested due to automatic review settings November 5, 2025 17:02
@github-actions github-actions bot added the area-ai Microsoft.Extensions.AI libraries label Nov 5, 2025
Comment on lines 46 to 49
foreach (var chunk in chunks)
{
yield return chunk;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? Won't chunks always be empty here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will, I've missed it in the initial version.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new SectionChunker class for the Microsoft.Extensions.DataIngestion library, which treats each IngestionDocumentSection as a separate entity for chunking. The implementation handles nested sections and maintains header context across chunks.

  • Adds SectionChunker class to support section-based document chunking
  • Creates comprehensive test coverage for the new chunker through SectionChunkerTests
  • Introduces a base test class DocumentChunkerTests for shared chunker test scenarios

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/Libraries/Microsoft.Extensions.DataIngestion/Chunkers/SectionChunker.cs Implements the new SectionChunker class with support for nested sections and header context preservation
test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Chunkers/SectionChunkerTests.cs Provides comprehensive test coverage including single/multiple sections, nested sections, size limits, and headers
test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Chunkers/DocumentChunkerTests.cs Defines abstract base test class with common test scenarios for all document chunkers

Comment on lines +46 to +49

private void Process(IngestionDocument document, IngestionDocumentSection section, List<IngestionChunk<string>> chunks, string? parentContext = null)
{
List<IngestionDocumentElement> elements = new(section.Elements.Count);
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final loop (lines 46-49) is unreachable dead code. The chunks list is cleared after each section iteration (line 43), so it will always be empty when this loop executes. This code should be removed.

Copilot uses AI. Check for mistakes.
@KrystofS KrystofS requested a review from stephentoub November 5, 2025 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-ai Microsoft.Extensions.AI libraries

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants