xgr.CompiledGrammar

class xgrammar.CompiledGrammar

The primary object for storing a compiled grammar.

A CompiledGrammar can be used to construct a GrammarMatcher, which generates token masks efficiently.

Notes

Do not construct this class directly; use GrammarCompiler to construct it instead.
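The flow described above (GrammarCompiler produces a CompiledGrammar, which is then passed to a GrammarMatcher) might look roughly like the sketch below. The helper name build_json_matcher is hypothetical, and compile_builtin_json_grammar is assumed here as one of the compile methods on GrammarCompiler; it is not documented in this section, so check the GrammarCompiler reference before relying on it.

```python
def build_json_matcher(tokenizer_info):
    """Compile the built-in JSON grammar and wrap it in a matcher.

    `tokenizer_info` is expected to be an xgrammar.TokenizerInfo.
    `build_json_matcher` itself is a hypothetical helper, not part of xgrammar.
    """
    # Imported lazily so this module can be loaded without xgrammar installed.
    import xgrammar as xgr

    compiler = xgr.GrammarCompiler(tokenizer_info)
    # The compiler, not the CompiledGrammar constructor, produces the object.
    # Assumption: compile_builtin_json_grammar is available on the compiler.
    compiled_grammar = compiler.compile_builtin_json_grammar()
    # The matcher consumes the compiled grammar to generate token masks.
    matcher = xgr.GrammarMatcher(compiled_grammar)
    return compiled_grammar, matcher
```

The returned compiled_grammar exposes the attributes listed below (grammar, tokenizer_info, memory_size_bytes), while the matcher is the object used during decoding.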

Attributes:

grammar

The original grammar.

tokenizer_info

The tokenizer info associated with the compiled grammar.

memory_size_bytes

The approximate memory usage of the compiled grammar in bytes.

Methods:

serialize_json()

Serialize the compiled grammar to a JSON string.

deserialize_json(json_str, tokenizer_info)

Deserialize the compiled grammar from a JSON string and associate it with the specified tokenizer info.

property grammar: Grammar

The original grammar.

property tokenizer_info: TokenizerInfo

The tokenizer info associated with the compiled grammar.

property memory_size_bytes: int

The approximate memory usage of the compiled grammar in bytes.

serialize_json() → str

Serialize the compiled grammar to a JSON string. The tokenizer info itself is not included in the output, because a single tokenizer info is shared by multiple compiled grammars.

Notes

The metadata of the tokenizer info is serialized, however, and is checked during deserialization.

Returns:

json_string – The JSON string.

Return type:

str
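Because the result is a plain string, it can be persisted to avoid recompiling the same grammar later. The following is a minimal sketch of one way to do that, not something xgrammar provides: save_compiled_grammar, cache_dir, and key are hypothetical names, and the only call made on the compiled grammar is serialize_json().

```python
import hashlib
from pathlib import Path


def save_compiled_grammar(compiled_grammar, cache_dir, key):
    """Write compiled_grammar.serialize_json() to <cache_dir>/<sha256(key)>.json.

    `key` is any caller-chosen string identifying the grammar (hypothetical
    convention; for example the JSON schema text it was compiled from).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the key so arbitrary strings map to safe file names.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    path = cache_dir / f"{digest}.json"
    path.write_text(compiled_grammar.serialize_json(), encoding="utf-8")
    return path
```

Since the tokenizer info is not part of the serialized string, a cache like this should be kept per tokenizer, or the key should include something that identifies the tokenizer.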

static deserialize_json(json_str: str, tokenizer_info: TokenizerInfo) → CompiledGrammar

Deserialize the compiled grammar from a JSON string and associate it with the specified tokenizer info.

Notes

This checks that the metadata of tokenizer_info matches the metadata serialized in json_str. If it does not match, a DeserializeFormatError is raised.

Parameters:
  • json_str (str) – The JSON string.

  • tokenizer_info (TokenizerInfo) – The tokenizer info.

Returns:

compiled_grammar – The compiled grammar.

Return type:

CompiledGrammar

Raises:
  • InvalidJSONError – When the JSON string is invalid.

  • DeserializeFormatError – When the JSON string does not follow the serialization format of the grammar, or the tokenizer info metadata does not match.

  • DeserializeVersionError – When the __VERSION__ field in the JSON string does not match the current version.
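A caller that loads serialized grammars from a cache will usually want to treat these errors as a cache miss and recompile. The sketch below shows that pattern under stated assumptions: load_or_recompile is a hypothetical helper, deserialize would be CompiledGrammar.deserialize_json in real use, and recoverable_errors would be a tuple of the three exception classes listed above. The import path of those classes is not given in this section, so they are passed in rather than imported.

```python
def load_or_recompile(json_str, tokenizer_info, deserialize, recompile,
                      recoverable_errors):
    """Try to deserialize a compiled grammar; recompile if the data is unusable.

    deserialize:        callable(json_str, tokenizer_info) -> compiled grammar
                        (in real use, CompiledGrammar.deserialize_json).
    recompile:          zero-argument callable that builds a fresh compiled
                        grammar (hypothetical; for example a GrammarCompiler call).
    recoverable_errors: tuple of exception classes to treat as a cache miss.
    """
    try:
        return deserialize(json_str, tokenizer_info)
    except recoverable_errors:
        # Invalid JSON, a format or tokenizer mismatch, or a version mismatch
        # all mean the serialized data cannot be used as-is.
        return recompile()
```

Exceptions outside recoverable_errors still propagate, so unrelated failures are not silently masked by a recompile.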