Higher-quality code retrieval with AST-based chunking

A new AST-based Code Chunker splits code into semantically meaningful units that preserve function and class boundaries. It works across multiple languages and supports configurable tokenizers. The result is more relevant retrieval and embeddings for code RAG and analysis, less token waste, and no need to write custom chunking logic.

Details:

Language-agnostic AST parsing for structured, coherent chunks
Configurable tokenizer settings to align with your model choices
Drop-in adoption for existing ingestion and retrieval pipelines
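
To make the idea concrete, here is a minimal sketch of AST-based chunking for Python source only, built on the standard-library ast module. It is not the Code Chunker's implementation or API: the names chunk_python_source, node_start, and whitespace_token_count are hypothetical, and the whitespace counter stands in for whichever model tokenizer you would configure. It requires Python 3.8 or later for end_lineno.

```python
import ast
from typing import Callable, List


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated tokens."""
    return len(text.split())


def node_start(node: ast.stmt) -> int:
    """First source line of a top-level node, including any decorators."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def chunk_python_source(
    source: str,
    max_tokens: int = 200,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Pack whole top-level statements into chunks of at most max_tokens.

    A function or class is never cut in the middle. A single unit that
    exceeds the budget is emitted as its own oversized chunk. Comments that
    sit between top-level statements are dropped in this simplified sketch.
    """
    lines = source.splitlines()
    units = [
        "\n".join(lines[node_start(node) - 1 : node.end_lineno])
        for node in ast.parse(source).body
    ]

    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for unit in units:
        n = count_tokens(unit)
        # Close the current chunk if adding this unit would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(unit)
        current_tokens += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Passing a different count_tokens callable is how this sketch mirrors the configurable tokenizer setting: the chunk boundaries stay on syntactic units while the budget is measured in the tokens your embedding or generation model actually sees.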

Who this is for: Teams building code-aware RAG, search, review assistants, static analysis, and compliance workflows that require precise code understanding.