xgr.GrammarCompiler
- class xgrammar.GrammarCompiler(tokenizer_info: TokenizerInfo, *, max_threads: int = 8, cache_enabled: bool = True, cache_limit_bytes: int = -1)
 The compiler for grammars. It is bound to a specific TokenizerInfo and compiles grammars into CompiledGrammar objects for that tokenizer. It supports parallel compilation across multiple threads and caches compilation results, so the same grammar is not compiled more than once.
Methods:
__init__(tokenizer_info, *[, max_threads, ...]) – Construct the compiler.
compile_json_schema(schema, *[, ...]) – Get CompiledGrammar from the specified JSON schema and format.
compile_builtin_json_grammar() – Get CompiledGrammar from the standard JSON.
compile_regex(regex) – Get CompiledGrammar from the specified regex.
compile_structural_tag(...) – Compile a grammar from a structural tag.
compile_grammar(...) – Compile a grammar object.
Clear all cached compiled grammars.
The approximate memory usage of the cache in bytes.
Attributes:
The maximum memory usage for the cache in bytes.
- __init__(tokenizer_info: TokenizerInfo, *, max_threads: int = 8, cache_enabled: bool = True, cache_limit_bytes: int = -1)
 Construct the compiler.
- Parameters:
 tokenizer_info (TokenizerInfo) – The tokenizer info.
max_threads (int, default: 8) – The maximum number of threads used to compile the grammar.
cache_enabled (bool, default: True) – Whether to enable the cache.
cache_limit_bytes (int, default: -1) – The maximum memory usage for the cache, in bytes. Note that the actual memory usage may slightly exceed this value.
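As a rough illustration of the constructor parameters above, the sketch below builds a compiler with an explicit cache budget. It assumes the xgrammar package is installed and that a TokenizerInfo has already been created elsewhere. The helper names mib_to_bytes and build_compiler, the thread count, and the 256 MiB budget are hypothetical choices made only for this example.

```python
def mib_to_bytes(mib: int) -> int:
    """Convert mebibytes to the byte count that cache_limit_bytes expects."""
    return mib * 1024 * 1024


def build_compiler(tokenizer_info, cache_limit_mib: int = 256):
    """Build a GrammarCompiler with a bounded cache (hypothetical helper)."""
    # Imported lazily so that mib_to_bytes stays usable without xgrammar installed.
    import xgrammar as xgr

    return xgr.GrammarCompiler(
        tokenizer_info,
        max_threads=4,  # fewer threads than the documented default of 8
        cache_enabled=True,
        cache_limit_bytes=mib_to_bytes(cache_limit_mib),
    )
```

Because the documentation notes that actual memory usage may slightly exceed the limit, it may be sensible to leave some headroom when choosing the budget.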
- compile_json_schema(schema: Union[str, Type[BaseModel], Dict[str, Any]], *, any_whitespace: bool = True, indent: Optional[int] = None, separators: Optional[Tuple[str, str]] = None, strict_mode: bool = True, max_whitespace_cnt: Optional[int] = None) -> CompiledGrammar
 Get CompiledGrammar from the specified JSON schema and format. The indent and separators parameters follow the same convention as in json.dumps().
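Since indent and separators follow the json.dumps() convention, the standard library can be used to preview the formatting each combination implies. The snippet below only demonstrates json.dumps() itself; the sample value is made up for illustration.

```python
import json

value = {"name": "a", "tags": [1, 2]}

# indent=None, separators=None: one line, items joined by ", " and keys by ": ".
one_line = json.dumps(value)

# Explicit compact separators remove all optional spaces.
compact = json.dumps(value, separators=(",", ":"))

# indent=2, separators=None: the item separator defaults to "," with no trailing space.
indented = json.dumps(value, indent=2)

print(one_line)  # {"name": "a", "tags": [1, 2]}
print(compact)   # {"name":"a","tags":[1,2]}
print(indented)
```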
- Parameters:
 schema (Union[str, Type[BaseModel], Dict[str, Any]]) – The schema string or Pydantic model or JSON schema dict.
indent (Optional[int], default: None) – The number of spaces for indentation. If None, the output will be on one line.
separators (Optional[Tuple[str, str]], default: None) – Two separators used in the schema: comma and colon. Examples: (",", ":"), (", ", ": "). If None, the default separators will be used: (",", ": ") when the indent is not None, and (", ", ": ") otherwise.
strict_mode (bool, default: True) – Whether to use strict mode. In strict mode, the generated grammar does not allow properties and items that are not specified in the schema. This is equivalent to setting unevaluatedProperties and unevaluatedItems to false. This helps the LLM generate accurate output in grammar-guided generation with a JSON schema.
max_whitespace_cnt (Optional[int], default: None) – The maximum number of whitespace characters allowed between elements such as keys, values, and separators. If None, the number of whitespace characters is unlimited. If specified, it must be a positive integer, and at most max_whitespace_cnt whitespace characters are allowed.
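The following sketch shows how the keyword-only parameters above might be passed. It assumes compiler is an already constructed GrammarCompiler; PERSON_SCHEMA and compile_person_schema are hypothetical names introduced only for this example.

```python
from typing import Any, Dict

# Hypothetical schema used only for this example.
PERSON_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def compile_person_schema(compiler):
    """Compile PERSON_SCHEMA with strict mode and a whitespace bound."""
    return compiler.compile_json_schema(
        PERSON_SCHEMA,          # a JSON schema dict, one of the accepted schema types
        strict_mode=True,       # reject properties and items not listed in the schema
        max_whitespace_cnt=2,   # at most two whitespace characters between elements
    )
```

With strict_mode=True, an output object containing a key other than name or age would not be accepted by the resulting grammar, per the parameter description above.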
- Returns:
 compiled_grammar – The compiled grammar.
- Return type:
 CompiledGrammar
- compile_builtin_json_grammar() -> CompiledGrammar
 Get CompiledGrammar from the standard JSON.
- Returns:
 compiled_grammar – The compiled grammar.
- Return type:
 CompiledGrammar
- compile_regex(regex: str) -> CompiledGrammar
 Get CompiledGrammar from the specified regex.
- Parameters:
 regex (str) – The regex string.
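A minimal sketch of compiling a regex follows, assuming compiler is an existing GrammarCompiler. DATE_REGEX and compile_date_regex are hypothetical names. Python's re module is used only to illustrate which strings this simple pattern accepts; the regex dialect understood by compile_regex may differ from Python's, so treat the correspondence as an assumption.

```python
import re

# Hypothetical pattern for an ISO-style date, written with portable syntax only.
DATE_REGEX = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"


def compile_date_regex(compiler):
    """Compile DATE_REGEX into a CompiledGrammar."""
    return compiler.compile_regex(DATE_REGEX)


# Illustration with Python's re (an assumption about equivalent matching behavior).
print(bool(re.fullmatch(DATE_REGEX, "2024-01-31")))  # True
print(bool(re.fullmatch(DATE_REGEX, "2024-1-31")))   # False
```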
- Returns:
 compiled_grammar – The compiled grammar.
- Return type:
 CompiledGrammar
- compile_structural_tag(structural_tag: Union[StructuralTag, str, Dict[str, Any]]) -> CompiledGrammar
 - compile_structural_tag(tags: List[StructuralTagItem], triggers: List[str]) -> CompiledGrammar
 Compile a grammar from a structural tag. See Structural Tag Usage in the XGrammar documentation for details.
This method supports two calling patterns:
Single structural tag parameter: compile_structural_tag(structural_tag)
Legacy pattern (deprecated): compile_structural_tag(tags, triggers)
- Parameters:
 structural_tag (Union[StructuralTag, str, Dict[str, Any]]) – The structural tag, given as a StructuralTag object, a JSON string, or a dictionary.
tags (List[StructuralTagItem]) – (Deprecated) The structural tags. Use StructuralTag class instead.
triggers (List[str]) – (Deprecated) The triggers. Use StructuralTag class instead.
- Returns:
 compiled_grammar – The compiled grammar from the structural tag.
- Return type:
 CompiledGrammar
- Raises:
 InvalidJSONError – When the structural tag is not a valid JSON string.
InvalidStructuralTagError – When the structural tag is not valid.
TypeError – When the arguments are invalid.
Notes
The legacy pattern compile_structural_tag(tags, triggers) is deprecated. Use the StructuralTag class to construct structural tags instead.
- compile_grammar(ebnf_string: str, *, root_rule_name: str = 'root') -> CompiledGrammar
 - compile_grammar(grammar: Grammar) -> CompiledGrammar
 Compile a grammar object.
Overloads:
compile_grammar(ebnf_string: str, *, root_rule_name: str = "root") -> CompiledGrammar – Compile a grammar from an EBNF string. The string should follow the format described in https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md.
compile_grammar(grammar: Grammar) -> CompiledGrammar – Compile a grammar from a Grammar object.
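- Parameters:
 ebnf_string (str) – The grammar as an EBNF string (first overload).
root_rule_name (str, default: 'root') – The name of the root rule in the EBNF string (first overload).
grammar (Grammar) – The Grammar object to compile (second overload).
- Returns: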
 compiled_grammar – The compiled grammar.
- Return type:
 CompiledGrammar
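As a small sketch of the EBNF-string overload, the example below compiles a two-rule grammar. It assumes compiler is an existing GrammarCompiler; YES_NO_EBNF and compile_yes_no are hypothetical names, and the grammar text is written in the llama.cpp grammar format referenced above.

```python
# Hypothetical grammar: the root rule accepts exactly "yes" or "no".
YES_NO_EBNF = """\
root ::= answer
answer ::= "yes" | "no"
"""


def compile_yes_no(compiler):
    """Compile YES_NO_EBNF, naming the root rule explicitly."""
    return compiler.compile_grammar(YES_NO_EBNF, root_rule_name="root")
```

Passing root_rule_name is redundant here because "root" is the documented default; it is spelled out only to show where a different start rule would be named.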