Migrating to v0.8.x
New capabilities (non-breaking)
v0.8 adds optional workflows; existing chunk() / chunk_file() usage remains valid.
Hierarchical chunking
Use when you need multiple granularities (leaves for embeddings, roots for LLM context):
from omnichunk import Chunker
chunker = Chunker(max_chunk_size=256, size_unit="chars")
tree = chunker.hierarchical_chunk("api.py", source, levels=[64, 256, 1024])
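To make the idea of multiple granularities concrete, here is a minimal, library-free sketch of nested fixed-size windows. It does not call omnichunk and is not how hierarchical_chunk() is implemented; `Node`, `build_levels`, and `leaves` are hypothetical names, and plain character windows stand in for whatever boundaries the real chunker picks.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A character span [start, end) at one granularity (hypothetical type)."""
    start: int
    end: int
    depth: int  # 0 is the finest level
    children: list["Node"] = field(default_factory=list)


def build_levels(text: str, levels: list[int]) -> list[Node]:
    """Return the coarsest-level nodes; each nests the finer windows it covers."""
    sizes = sorted(levels)

    def split(start: int, end: int, depth: int) -> list[Node]:
        nodes = []
        for s in range(start, end, sizes[depth]):
            e = min(s + sizes[depth], end)
            node = Node(s, e, depth)
            if depth > 0:
                # Finer windows are cut inside the parent, so they always nest.
                node.children = split(s, e, depth - 1)
            nodes.append(node)
        return nodes

    return split(0, len(text), len(sizes) - 1)


def leaves(nodes: list[Node]):
    """Yield the finest-level nodes in document order."""
    for node in nodes:
        if node.children:
            yield from leaves(node.children)
        else:
            yield node
```

In this sketch the leaves would be the units sent to an embedding model, while a leaf's root supplies the wider span passed to the LLM as context.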
Incremental diff
Use when syncing a vector database with file updates:
from omnichunk import Chunker
chunker = Chunker()
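# old_chunks: the chunks produced for the previous version of the file
new_chunks = chunker.chunk("api.py", new_source)
diff = chunker.chunk_diff("api.py", new_source, previous_chunks=old_chunks)
# Inspect diff.added, diff.removed_ids, and diff.unchanged to update the store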
Stable IDs align with stable_chunk_id() and vector export row IDs from earlier releases.
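The sync step itself can be sketched without the library. The snippet below is an illustration only: a plain dict stands in for the vector database, `content_id` is a hypothetical hash-based ID (not omnichunk's stable_chunk_id() scheme), and `SketchDiff` merely mirrors the three field names above, with field shapes that are assumptions.

```python
import hashlib
from dataclasses import dataclass, field


def content_id(path: str, text: str) -> str:
    """Hypothetical stable ID: a hash of the file path and the chunk text."""
    return hashlib.sha256(f"{path}\0{text}".encode("utf-8")).hexdigest()[:16]


@dataclass
class SketchDiff:
    """Stand-in for a chunk diff; field shapes are assumed, not omnichunk's."""
    added: dict[str, str] = field(default_factory=dict)  # id -> chunk text
    removed_ids: list[str] = field(default_factory=list)
    unchanged: list[str] = field(default_factory=list)


def diff_chunks(path: str, old_texts: list[str], new_texts: list[str]) -> SketchDiff:
    """Compare two chunk lists by ID and classify every chunk."""
    old = {content_id(path, t): t for t in old_texts}
    new = {content_id(path, t): t for t in new_texts}
    return SketchDiff(
        added={cid: t for cid, t in new.items() if cid not in old},
        removed_ids=[cid for cid in old if cid not in new],
        unchanged=[cid for cid in new if cid in old],
    )


def apply_diff(store: dict[str, str], diff: SketchDiff) -> None:
    """Delete stale rows and upsert new ones; unchanged rows are left alone."""
    for cid in diff.removed_ids:
        store.pop(cid, None)
    store.update(diff.added)
```

Because unchanged chunks keep their IDs, only the added chunks would need to be re-embedded in a real pipeline.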
Token budget selection
Use when retrieved chunks must fit within a fixed token budget:
from omnichunk.budget import TokenBudgetOptimizer
opt = TokenBudgetOptimizer(budget=4096, strategy="greedy")
result = opt.select(retrieved_chunks, scores=scores)
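The source does not spell out what the "greedy" strategy does, so the following is a sketch of one common interpretation rather than TokenBudgetOptimizer's actual behavior: visit chunks from highest to lowest score and keep each one that still fits. `greedy_select` is a hypothetical helper, and whitespace splitting is an assumed stand-in for a real tokenizer.

```python
from typing import Callable, Sequence


def greedy_select(
    chunks: Sequence[str],
    scores: Sequence[float],
    budget: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> tuple[list[str], int]:
    """Pick chunks by descending score until the token budget is exhausted.

    Returns the selected chunks in their original order and the tokens used.
    """
    if len(chunks) != len(scores):
        raise ValueError("chunks and scores must have the same length")

    # Highest-scoring chunks get first claim on the budget.
    order = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)

    picked: list[int] = []
    used = 0
    for i in order:
        cost = count_tokens(chunks[i])
        if used + cost <= budget:  # skip chunks that would overflow
            picked.append(i)
            used += cost

    # Restore document order so the final context reads naturally.
    return [chunks[i] for i in sorted(picked)], used
```

A greedy pass like this is fast but not optimal: a large high-scoring chunk can crowd out several smaller chunks whose combined score is higher.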
Breaking changes
None for basic chunking callers; the new APIs are additive.