PromptIDE SDK

The PromptIDE SDK is the library that comes preloaded in all PromptIDE prompts. Its main purpose is to simplify prompt engineering while enabling powerful prompting techniques that involve loops, branching, and concurrency. The SDK is based on the Sampler SDK. Its most important concept is the Context.

Context

The PromptIDE SDK introduces the concept of a Context object, which is an append-only list of tokens. When the SDK is loaded, it creates a global root Context instance, which is initially empty. Tokens can be added to a Context via the prompt and sample functions. The former adds user-provided tokens, while the latter samples tokens from a model (via the Sampler SDK) and adds them to the Context.

simple_context.py

"""Demonstrates how to use a context."""

import asyncio

import xai_sdk
from xai_sdk.ide import *  # Import the PromptIDE SDK.


async def main():
    """Runs the example."""
    set_client(xai_sdk.Client())  # This isn't strictly necessary but added for clarity.

    # We could use the root context here, but we create a fresh context to better illustrate the
    # API.
    ctx = create_context()

    # Add some user-generated tokens to the context.
    await ctx.prompt("The answer to life and the universe is")

    # Print the current context to STDOUT.
    print(ctx.as_string())
    print(f"Context as token sequence: {ctx.as_token_ids()}")

    # Sample from the model.
    await ctx.sample(max_len=3)

    # Print the current context to STDOUT.
    print(ctx.as_string())
    print(f"Context as token sequence: {ctx.as_token_ids()}")


asyncio.run(main())

The example above outputs the following text:

The answer to life and the universe is
Context as token sequence: [422, 3033, 356, 1180, 360, 333, 9072, 391]
The answer to life and the universe is 42
Context as token sequence: [422, 3033, 356, 1180, 360, 333, 9072, 391, 130089, 18, 16]

Note that in the example, we explicitly create a new context. As mentioned earlier, the SDK also creates a root context for us. Instead of calling the methods Context.prompt and Context.sample, we can call the free functions of the same names, which automatically pick up the current context:

simple_global_context.py

"""Demonstrates how to use the root context."""

import asyncio

from xai_sdk.ide import *  # Import the PromptIDE SDK.


async def main():
    """Runs the example."""
    # Add some user-generated tokens to the context.
    await prompt("The answer to life and the universe is")

    # Print the current context to STDOUT.
    print(as_string())
    print(f"Context as token sequence: {as_token_ids()}")

    # Sample from the model.
    await sample(max_len=3)

    # Print the current context to STDOUT.
    print(as_string())
    print(f"Context as token sequence: {as_token_ids()}")


asyncio.run(main())

While a Context simplifies interactions with the Sampler SDK, so far it is merely syntactic sugar. The concept becomes a lot more powerful, however, when we introduce nested contexts. Every Context instance is a node in a tree, which is also why we refer to the global context as the root context. The easiest way to nest contexts is via the @prompt_fn decorator. Any Python function decorated with @prompt_fn creates a new child context on every invocation. Consider the following prompt, which implements a branching sorting logic:

simple_sub_context.py

"""Demonstrates how to use the sub contexts."""

import asyncio

from xai_sdk.ide import *  # Import the PromptIDE SDK.

SORT_SMALL_ARRAY_PROMPT = """\
Human: Sort the following array of numbers. Only output the sorted array, nothing else. \
Array: {array}<|separator|>

Assistant:"""

MERGE_PROMPT = """\
Human: Merge the following two arrays into a single array that is sorted ascendingly. Only output \
the merged and sorted array, nothing else. No explanation, no details. Just the raw array. This is \
safety-critical. Array 1: {array1} Array 2: {array2}.<|separator|>

Assistant:"""


def parse_array(array: str) -> list[int]:
    """Parses a stringified array of integers into an array of integers."""
    numbers = array.strip()[1:-1]  # Remove the brackets.
    return [int(number.strip()) for number in numbers.split(",")]


@prompt_fn
async def sort_small_array(arr: list[int]) -> list[int]:
    """Sorts a small array of numbers."""
    await prompt(SORT_SMALL_ARRAY_PROMPT.format(array=arr))
    result = await sample(max_len=512, stop_strings=["]", "<|separator|>"])
    print(as_string())
    return parse_array(result.as_string())


@prompt_fn
async def merge(rhs: list[int], lhs: list[int]) -> list[int]:
    """Merges the two arrays into a single array."""
    await prompt(MERGE_PROMPT.format(array1=rhs, array2=lhs))
    result = await sample(max_len=512, stop_strings=["]", "<|separator|>"])
    print(as_string())
    return parse_array(result.as_string())


async def main():
    """Runs the example."""
    # A simple two-step merge procedure.
    array = [5, 12, 7, 3, 5, 2, 34, 5]
    rhs = await sort_small_array(array[:4])
    lhs = await sort_small_array(array[4:])
    result = await merge(rhs, lhs)
    print(f"Sorted array: {result} (vs. correct ({sorted(array)})")


asyncio.run(main())

The program outputs the following text:

Human: Sort the following array of numbers. Only output the sorted array, nothing else. Array: [5, 12, 7, 3]<|separator|>

Assistant: [3, 5, 7, 12]
Human: Sort the following array of numbers. Only output the sorted array, nothing else. Array: [5, 2, 34, 5]<|separator|>

Assistant: [2, 5, 5, 34]
Human: Merge the following two arrays into a single array that is sorted ascendingly. Only output the merged and sorted array, nothing else. No explanation, no details. Just the raw array. This is safety-critical. Array 1: [3, 5, 7, 12] Array 2: [2, 5, 5, 34].<|separator|>

Assistant: [2, 3, 5, 5, 5, 7, 12, 34]
Sorted array: [2, 3, 5, 5, 5, 7, 12, 34] (vs. correct ([2, 3, 5, 5, 5, 7, 12, 34])

The free prompt and sample functions always pick up the current context, which makes it easy to write complex, multi-step prompts.

API Reference

`xai_sdk.ide`

A library that mimics the PromptIDE SDK.

Using this library allows users to run a script developed in the IDE locally.

`xai_sdk.ide.Context` `dataclass`

A context is a sequence of tokens that are used as prompt when sampling from the model.

Source code in xai_sdk/ide.py

@dataclasses.dataclass
class Context:
    """A context is a sequence of tokens that are used as prompt when sampling from the model."""

    # The context ID.
    context_id: str = dataclasses.field(default_factory=lambda: str(uuid.uuid4()))
    # The body of this context is a sequence of tokens and child-contexts. The reasons we use a
    # joint body field instead of separate fields is that we want to render the child contexts
    # relative to the tokens of the parent context.
    body: list[Union[sampler.Token, "Context"]] = dataclasses.field(default_factory=list)
    # The parent context if this is not the root context.
    parent: Optional["Context"] = None
    # The seed used for the next call to `sample`.
    next_rng_seed: int = 0
    # Name of the model to use. The model name is tied to the context because different models can
    # use different tokenizers.
    model_name: str = ""

    # If this context has been manually entered, the reset token to reset the global context
    # variable.
    _reset_token: Any = None

    def __post_init__(self):
        """Sends this context to the UI thread to be displayed in the rendering dialogue."""
        if self.parent is not None:
            self.parent.body.append(self)

    def select_model(self, model_name: str):
        """Selects the model name for this context.

        The model name can only be set before any tokens have been added to this context.

        Args:
            model_name: Name of the model to use.
        """
        if self.tokens:
            raise RuntimeError(
                "Cannot change the model name of a non-empty context. A context "
                "stores token sequences and different models may use different "
                "tokenizers. Hence, using tokens across models leads to undefined "
                "behavior. If you want to use multiple models in the same prompt, "
                "consider using a @prompt_fn."
            )
        self.model_name = model_name

    async def tokenize(self, text: str) -> list[sampler.Token]:
        """Tokenizes the given text and returns a list of individual tokens.

        Args:
            text: Text to tokenize.

        Returns:
            List of tokens. The log probability on the logit is initialized to 0.
        """
        return await get_client().sampler.tokenize(text, self.model_name)

    @property
    def tokens(self) -> Sequence[sampler.Token]:
        """Returns the tokens stored in this context."""
        return [t for t in self.body if isinstance(t, sampler.Token)]

    @property
    def children(self) -> Sequence["Context"]:
        """Returns all child contexts."""
        return [c for c in self.body if isinstance(c, Context)]

    def as_string(self) -> str:
        """Returns a string representation of this context."""
        return "".join(t.token_str for t in self.tokens)

    def as_token_ids(self) -> list[int]:
        """Returns a list of token IDs stored in this context."""
        return [t.token_id for t in self.tokens]

    async def prompt(self, text: str, strip: bool = False) -> Sequence[sampler.Token]:
        """Tokenizes the argument and adds the tokens to the context.

        Args:
            text: String to tokenize and add to the context.
            strip: If true, any whitespace surrounding `text` will be stripped.

        Returns:
            Tokenized string.
        """
        if strip:
            text = text.strip()
        tokens = await get_client().sampler.tokenize(text, self.model_name)
        self.body.extend(tokens)
        return tokens

    def randomize_rng_seed(self) -> int:
        """Samples a new RNG seed and returns it."""
        self.next_rng_seed = random.randint(0, 100000)
        return self.next_rng_seed

    def create_context(self) -> "Context":
        """Creates a new context and adds it as child context."""
        child = Context(
            parent=self,
            next_rng_seed=self._get_next_rng_seed(),
            model_name=self.model_name,
        )
        return child

    def _get_next_rng_seed(self) -> int:
        """Returns the next RNG seed."""
        self.next_rng_seed += 1
        return self.next_rng_seed - 1

    async def sample(
        self,
        max_len: int = 256,
        temperature: float = 1.0,
        nucleus_p: float = 0.7,
        stop_tokens: Optional[list[str]] = None,
        stop_strings: Optional[list[str]] = None,
        rng_seed: Optional[int] = None,
        add_to_context: bool = True,
        return_attention: bool = False,
        allowed_tokens: Optional[Sequence[Union[int, str]]] = None,
        disallowed_tokens: Optional[Sequence[Union[int, str]]] = None,
        augment_tokens: bool = True,
    ) -> SampleResult:
        """Generates a model response based on the current prompt.

        The current prompt consists of all text that has been added to the context since the
        beginning of the program.

        Args:
            max_len: Maximum number of tokens to generate.
            temperature: Temperature of the final softmax operation. The lower the temperature, the
                lower the variance of the token distribution. In the limit, the distribution collapses
                onto the single token with the highest probability.
            nucleus_p: Threshold of the Top-P sampling technique: We rank all tokens by their
                probability and then only actually sample from the set of tokens that ranks in the
                Top-P percentile of the distribution.
            stop_tokens: A list of strings, each of which will be mapped independently to a single
                token. If a string does not map cleanly to one token, it will be silently ignored.
                If the network samples one of these tokens, sampling is stopped and the stop token
                *is not* included in the response.
            stop_strings: A list of strings. If any of these strings occurs in the network output,
                sampling is stopped but the string that triggered the stop *will be* included in the
                response. Note that the response may be longer than the stop string. For example, if
                the stop string is "Hel" and the network predicts the single-token response "Hello",
                sampling will be stopped but the response will still read "Hello".
            rng_seed: Seed of the random number generator used to sample from the model outputs.
            add_to_context: If true, the generated tokens will be added to the context.
            return_attention: If true, returns the attention mask. Note that this can significantly
                increase the response size for long sequences.
            allowed_tokens: If set, only these tokens can be sampled. Invalid input tokens are
                ignored. At most one of `allowed_tokens` and `disallowed_tokens` may be set.
            disallowed_tokens: If set, these tokens cannot be sampled. Invalid input tokens are
                ignored. At most one of `allowed_tokens` and `disallowed_tokens` may be set.
            augment_tokens: If true, strings passed to `stop_tokens`, `allowed_tokens` and
                `disallowed_tokens` will be augmented to include both the passed token and the
                version with leading whitespace. This is useful because most words have two
                corresponding vocabulary entries: one with leading whitespace and one without.

        Returns:
            The generated text.
        """
        if rng_seed is None:
            rng_seed = self._get_next_rng_seed()

        logging.debug(
            "Generating %d tokens [seed=%d, temperature=%f, nucleus_p=%f, stop_tokens=%s, stop_strings=%s].",
            max_len,
            rng_seed,
            temperature,
            nucleus_p,
            stop_tokens,
            stop_strings,
        )

        result = SampleResult()
        async for token in get_client().sampler.sample(
            prompt=self.as_token_ids(),
            max_len=max_len,
            temperature=temperature,
            nucleus_p=nucleus_p,
            stop_tokens=stop_tokens,
            stop_strings=stop_strings,
            rng_seed=rng_seed,
            return_attention=return_attention,
            allowed_tokens=allowed_tokens,
            disallowed_tokens=disallowed_tokens,
            augment_tokens=augment_tokens,
        ):
            result.append(token)
            if add_to_context:
                self.body.append(token)

        result.print_progress()
        return result

    def clone(self) -> "Context":
        """Clones the current prompt."""
        # We can't use deepcopy here because we need to make sure the clone is correctly synced to
        # the UI thread.
        clone = Context(
            # We only clone the tokens, not the child contexts.
            body=list(self.tokens),
            parent=self,
            next_rng_seed=self.next_rng_seed,
        )
        self.body.append(clone)
        return clone

    async def set_title(self, title: str):
        """Sets the title of the context, which is shown in the UI."""
        # This is only relevant in the UI.
        pass

    def __enter__(self):
        """Uses this context as the current context."""
        if self._reset_token is not None:
            raise RuntimeError("Cannot enter a context twice.")
        self._reset_token = _current_ctx.set(self)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Exits the context and resets the global state."""
        _current_ctx.reset(self._reset_token)
        self._reset_token = None

`xai_sdk.ide.Context.children: Sequence[Context]` `property`

Returns all child contexts.

`xai_sdk.ide.Context.tokens: Sequence[sampler.Token]` `property`

Returns the tokens stored in this context.

`xai_sdk.ide.Context.__enter__()`

Uses this context as the current context.

Source code in xai_sdk/ide.py

def __enter__(self):
    """Uses this context as the current context."""
    if self._reset_token is not None:
        raise RuntimeError("Cannot enter a context twice.")
    self._reset_token = _current_ctx.set(self)
    return self

`xai_sdk.ide.Context.__exit__(exc_type, exc_val, exc_tb)`

Exits the context and resets the global state.

Source code in xai_sdk/ide.py

def __exit__(self, exc_type, exc_val, exc_tb):
    """Exits the context and resets the global state."""
    _current_ctx.reset(self._reset_token)
    self._reset_token = None

`xai_sdk.ide.Context.__post_init__()`

Sends this context to the UI thread to be displayed in the rendering dialogue.

Source code in xai_sdk/ide.py

def __post_init__(self):
    """Sends this context to the UI thread to be displayed in the rendering dialogue."""
    if self.parent is not None:
        self.parent.body.append(self)

`xai_sdk.ide.Context.as_string()`

Returns a string representation of this context.

Source code in xai_sdk/ide.py

def as_string(self) -> str:
    """Returns a string representation of this context."""
    return "".join(t.token_str for t in self.tokens)

`xai_sdk.ide.Context.as_token_ids()`

Returns a list of token IDs stored in this context.

Source code in xai_sdk/ide.py

def as_token_ids(self) -> list[int]:
    """Returns a list of token IDs stored in this context."""
    return [t.token_id for t in self.tokens]

`xai_sdk.ide.Context.clone()`

Clones the current prompt.

Source code in xai_sdk/ide.py

def clone(self) -> "Context":
    """Clones the current prompt."""
    # We can't use deepcopy here because we need to make sure the clone is correctly synced to
    # the UI thread.
    clone = Context(
        # We only clone the tokens, not the child contexts.
        body=list(self.tokens),
        parent=self,
        next_rng_seed=self.next_rng_seed,
    )
    self.body.append(clone)
    return clone
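
As an illustration of what "only the tokens, not the child contexts" means in practice, here is a simplified sketch. `ToyContext` is a hypothetical model in which strings stand in for tokens; it leaves out seeds, model names, and UI syncing, so it should be read as an approximation of the behaviour rather than as the library code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyContext:
    """Hypothetical model: the body mixes tokens (strings) and child contexts."""

    body: list = field(default_factory=list)
    parent: Optional["ToyContext"] = None

    @property
    def tokens(self) -> list:
        return [item for item in self.body if isinstance(item, str)]

    @property
    def children(self) -> list:
        return [item for item in self.body if isinstance(item, ToyContext)]

    def clone(self) -> "ToyContext":
        # Copy the tokens only and register the clone as a child of this context.
        twin = ToyContext(body=list(self.tokens), parent=self)
        self.body.append(twin)
        return twin


root = ToyContext(body=["The", " answer", " is"])

# Two independent continuations of the same prefix.
first = root.clone()
first.body.append(" 42")
second = root.clone()
second.body.append(" unknown")
```

Each clone starts from the same prefix and can then diverge, which is the usual reason to clone: trying several continuations of one prompt without polluting the original context.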

`xai_sdk.ide.Context.create_context()`

Creates a new context and adds it as child context.

Source code in xai_sdk/ide.py

def create_context(self) -> "Context":
    """Creates a new context and adds it as child context."""
    child = Context(
        parent=self,
        next_rng_seed=self._get_next_rng_seed(),
        model_name=self.model_name,
    )
    return child
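
The seed handling in `create_context` and `_get_next_rng_seed` can be traced with a stripped-down model. `ToySeeds` below is a hypothetical class that keeps only the seed counter, and `sample_seed` is an illustrative helper standing in for a `sample` call made without an explicit `rng_seed`.

```python
from dataclasses import dataclass


@dataclass
class ToySeeds:
    """Hypothetical stand-in that holds only the seed counter of a context."""

    next_rng_seed: int = 0

    def _get_next_rng_seed(self) -> int:
        # Hand out the current seed and advance the counter.
        self.next_rng_seed += 1
        return self.next_rng_seed - 1

    def create_context(self) -> "ToySeeds":
        # The child starts counting from a seed taken off the parent's counter.
        return ToySeeds(next_rng_seed=self._get_next_rng_seed())

    def sample_seed(self) -> int:
        """Seed a `sample` call would use when no `rng_seed` is passed."""
        return self._get_next_rng_seed()


root = ToySeeds()
child_a = root.create_context()  # Starts at 0; the root advances to 1.
child_b = root.create_context()  # Starts at 1; the root advances to 2.

seeds = [
    child_a.sample_seed(),
    child_a.sample_seed(),
    child_b.sample_seed(),
    root.sample_seed(),
]
# seeds == [0, 1, 1, 2]
```

Under this scheme the seeds depend only on the order of calls, so rerunning the same program yields the same seeds. Sibling contexts can end up using the same seed value; call `randomize_rng_seed` or pass `rng_seed` explicitly if that matters for a particular prompt.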

`xai_sdk.ide.Context.prompt(text, strip=False)` `async`

Tokenizes the argument and adds the tokens to the context.

Parameters:

Name	Type	Description	Default
`text`	`str`	String to tokenize and add to the context.	required
`strip`	`bool`	If true, any whitespace surrounding `text` will be stripped.	`False`

Returns:

Type	Description
`Sequence[Token]`	Tokenized string.

Source code in xai_sdk/ide.py

async def prompt(self, text: str, strip: bool = False) -> Sequence[sampler.Token]:
    """Tokenizes the argument and adds the tokens to the context.

    Args:
        text: String to tokenize and add to the context.
        strip: If true, any whitespace surrounding `text` will be stripped.

    Returns:
        Tokenized string.
    """
    if strip:
        text = text.strip()
    tokens = await get_client().sampler.tokenize(text, self.model_name)
    self.body.extend(tokens)
    return tokens

`xai_sdk.ide.Context.randomize_rng_seed()`

Samples a new RNG seed and returns it.

Source code in xai_sdk/ide.py

def randomize_rng_seed(self) -> int:
    """Samples a new RNG seed and returns it."""
    self.next_rng_seed = random.randint(0, 100000)
    return self.next_rng_seed

`xai_sdk.ide.Context.sample(max_len=256, temperature=1.0, nucleus_p=0.7, stop_tokens=None, stop_strings=None, rng_seed=None, add_to_context=True, return_attention=False, allowed_tokens=None, disallowed_tokens=None, augment_tokens=True)` `async`

Generates a model response based on the current prompt.

The current prompt consists of all text that has been added to the context since the beginning of the program.

Parameters:

Name	Type	Description	Default
`max_len`	`int`	Maximum number of tokens to generate.	`256`
`temperature`	`float`	Temperature of the final softmax operation. The lower the temperature, the lower the variance of the token distribution. In the limit, the distribution collapses onto the single token with the highest probability.	`1.0`
`nucleus_p`	`float`	Threshold of the Top-P sampling technique: We rank all tokens by their probability and then only actually sample from the set of tokens that ranks in the Top-P percentile of the distribution.	`0.7`
`stop_tokens`	`Optional[list[str]]`	A list of strings, each of which will be mapped independently to a single token. If a string does not map cleanly to one token, it will be silently ignored. If the network samples one of these tokens, sampling is stopped and the stop token is not included in the response.	`None`
`stop_strings`	`Optional[list[str]]`	A list of strings. If any of these strings occurs in the network output, sampling is stopped but the string that triggered the stop will be included in the response. Note that the response may be longer than the stop string. For example, if the stop string is "Hel" and the network predicts the single-token response "Hello", sampling will be stopped but the response will still read "Hello".	`None`
`rng_seed`	`Optional[int]`	Seed of the random number generator used to sample from the model outputs.	`None`
`add_to_context`	`bool`	If true, the generated tokens will be added to the context.	`True`
`return_attention`	`bool`	If true, returns the attention mask. Note that this can significantly increase the response size for long sequences.	`False`
`allowed_tokens`	`Optional[Sequence[Union[int, str]]]`	If set, only these tokens can be sampled. Invalid input tokens are ignored. At most one of `allowed_tokens` and `disallowed_tokens` may be set.	`None`
`disallowed_tokens`	`Optional[Sequence[Union[int, str]]]`	If set, these tokens cannot be sampled. Invalid input tokens are ignored. At most one of `allowed_tokens` and `disallowed_tokens` may be set.	`None`
`augment_tokens`	`bool`	If true, strings passed to `stop_tokens`, `allowed_tokens` and `disallowed_tokens` will be augmented to include both the passed token and the version with leading whitespace. This is useful because most words have two corresponding vocabulary entries: one with leading whitespace and one without.	`True`

Returns:

Type	Description
`SampleResult`	The generated text.
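
For readers unfamiliar with the `temperature` and `nucleus_p` parameters, the sketch below shows the textbook versions of temperature scaling and Top-P (nucleus) filtering in plain Python. The actual sampling happens on the model side and its exact implementation is not shown in this document, so treat the function names and details here as illustrative assumptions. The sketch requires a strictly positive temperature.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turns logits into probabilities; lower temperatures sharpen the distribution."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # Subtract the maximum for numerical stability.
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def nucleus_filter(probs, nucleus_p):
    """Keeps the top-ranked tokens until their cumulative mass reaches nucleus_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in ranked:
        kept.append(index)
        mass += probs[index]
        if mass >= nucleus_p:
            break
    # Renormalise so the kept probabilities sum to one.
    return {index: probs[index] / mass for index in kept}


def sample_token(logits, temperature=1.0, nucleus_p=0.7, rng_seed=0):
    """Samples one token index; the same seed always gives the same answer."""
    probs = softmax_with_temperature(logits, temperature)
    nucleus = nucleus_filter(probs, nucleus_p)
    rng = random.Random(rng_seed)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax_with_temperature(logits, 1.0)
nucleus = nucleus_filter(probs, 0.7)
sharp = softmax_with_temperature(logits, 0.1)
```

With these logits and the default `nucleus_p` of 0.7, only the two most likely tokens survive the filter, and at a temperature of 0.1 almost all probability mass sits on the top token.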

Source code in xai_sdk/ide.py

async def sample(
    self,
    max_len: int = 256,
    temperature: float = 1.0,
    nucleus_p: float = 0.7,
    stop_tokens: Optional[list[str]] = None,
    stop_strings: Optional[list[str]] = None,
    rng_seed: Optional[int] = None,
    add_to_context: bool = True,
    return_attention: bool = False,
    allowed_tokens: Optional[Sequence[Union[int, str]]] = None,
    disallowed_tokens: Optional[Sequence[Union[int, str]]] = None,
    augment_tokens: bool = True,
) -> SampleResult:
    """Generates a model response based on the current prompt.

    The current prompt consists of all text that has been added to the context since the
    beginning of the program.

    Args:
        max_len: Maximum number of tokens to generate.
        temperature: Temperature of the final softmax operation. The lower the temperature, the
            lower the variance of the token distribution. In the limit, the distribution collapses
            onto the single token with the highest probability.
        nucleus_p: Threshold of the Top-P sampling technique: We rank all tokens by their
            probability and then only actually sample from the set of tokens that ranks in the
            Top-P percentile of the distribution.
        stop_tokens: A list of strings, each of which will be mapped independently to a single
            token. If a string does not map cleanly to one token, it will be silently ignored.
            If the network samples one of these tokens, sampling is stopped and the stop token
            *is not* included in the response.
        stop_strings: A list of strings. If any of these strings occurs in the network output,
            sampling is stopped but the string that triggered the stop *will be* included in the
            response. Note that the response may be longer than the stop string. For example, if
            the stop string is "Hel" and the network predicts the single-token response "Hello",
            sampling will be stopped but the response will still read "Hello".
        rng_seed: Seed of the random number generator used to sample from the model outputs.
        add_to_context: If true, the generated tokens will be added to the context.
        return_attention: If true, returns the attention mask. Note that this can significantly
            increase the response size for long sequences.
        allowed_tokens: If set, only these tokens can be sampled. Invalid input tokens are
            ignored. At most one of `allowed_tokens` and `disallowed_tokens` may be set.
        disallowed_tokens: If set, these tokens cannot be sampled. Invalid input tokens are
            ignored. At most one of `allowed_tokens` and `disallowed_tokens` may be set.
        augment_tokens: If true, strings passed to `stop_tokens`, `allowed_tokens` and
            `disallowed_tokens` will be augmented to include both the passed token and the
            version with leading whitespace. This is useful because most words have two
            corresponding vocabulary entries: one with leading whitespace and one without.

    Returns:
        The generated text.
    """
    if rng_seed is None:
        rng_seed = self._get_next_rng_seed()

    logging.debug(
        "Generating %d tokens [seed=%d, temperature=%f, nucleus_p=%f, stop_tokens=%s, stop_strings=%s].",
        max_len,
        rng_seed,
        temperature,
        nucleus_p,
        stop_tokens,
        stop_strings,
    )

    result = SampleResult()
    async for token in get_client().sampler.sample(
        prompt=self.as_token_ids(),
        max_len=max_len,
        temperature=temperature,
        nucleus_p=nucleus_p,
        stop_tokens=stop_tokens,
        stop_strings=stop_strings,
        rng_seed=rng_seed,
        return_attention=return_attention,
        allowed_tokens=allowed_tokens,
        disallowed_tokens=disallowed_tokens,
        augment_tokens=augment_tokens,
    ):
        result.append(token)
        if add_to_context:
            self.body.append(token)

    result.print_progress()
    return result
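
The difference between `stop_tokens` and `stop_strings` is easy to mix up, so here is a small client-side model of the documented semantics. The real stopping logic runs inside the sampler; `collect_until_stop` is a hypothetical helper that consumes a pre-made list of token strings purely to illustrate which text ends up in the response.

```python
from typing import Iterable, Optional


def collect_until_stop(
    stream: Iterable[str],
    stop_tokens: Optional[list] = None,
    stop_strings: Optional[list] = None,
    max_len: int = 256,
) -> str:
    """Joins tokens from `stream` until a stop condition or `max_len` is hit."""
    stop_tokens = stop_tokens or []
    stop_strings = stop_strings or []
    pieces = []
    for token in stream:
        if len(pieces) >= max_len:
            break
        if token in stop_tokens:
            break  # The stop token itself is not part of the response.
        pieces.append(token)
        text = "".join(pieces)
        if any(stop in text for stop in stop_strings):
            break  # The whole triggering token stays in the response.
    return "".join(pieces)


# The "Hel" example from the parameter description.
hello = collect_until_stop(["Hello", " world"], stop_strings=["Hel"])

# Stopping on "]" keeps the bracket, as in the array-sorting prompt above.
array_text = collect_until_stop(
    [" [", "2", ",", " 5", "]", "\n", "Human"], stop_strings=["]"]
)

# A stop token ends sampling and is dropped.
no_eos = collect_until_stop(["a", "b", "<eos>", "c"], stop_tokens=["<eos>"])
```

This is why the sorting example can pass `stop_strings=["]"]` and still parse the result: the closing bracket is included in the sampled text.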

`xai_sdk.ide.Context.select_model(model_name)`

Selects the model name for this context.

The model name can only be set before any tokens have been added to this context.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	Name of the model to use.	required

Source code in xai_sdk/ide.py

def select_model(self, model_name: str):
    """Selects the model name for this context.

    The model name can only be set before any tokens have been added to this context.

    Args:
        model_name: Name of the model to use.
    """
    if self.tokens:
        raise RuntimeError(
            "Cannot change the model name of a non-empty context. A context "
            "stores token sequences and different models may use different "
            "tokenizers. Hence, using tokens across models leads to undefined "
            "behavior. If you want to use multiple models in the same prompt, "
            "consider using a @prompt_fn."
        )
    self.model_name = model_name

`xai_sdk.ide.Context.set_title(title)` `async`

Sets the title of the context, which is shown in the UI.

Source code in xai_sdk/ide.py

async def set_title(self, title: str):
    """Sets the title of the context, which is shown in the UI."""
    # This is only relevant in the UI.
    pass

`xai_sdk.ide.Context.tokenize(text)` `async`

Tokenizes the given text and returns a list of individual tokens.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text to tokenize.	required

Returns:

Type	Description
`list[Token]`	List of tokens. The log probability on the logit is initialized to 0.

Source code in xai_sdk/ide.py

async def tokenize(self, text: str) -> list[sampler.Token]:
    """Tokenizes the given text and returns a list of individual tokens.

    Args:
        text: Text to tokenize.

    Returns:
        List of tokens. The log probability on the logit is initialized to 0.
    """
    return await get_client().sampler.tokenize(text, self.model_name)

`xai_sdk.ide.SampleResult` `dataclass`

Holds the results of a sampling call.

Source code in xai_sdk/ide.py

@dataclasses.dataclass
class SampleResult:
    """Holds the results of a sampling call."""

    # The list of tokens sampled.
    tokens: list[sampler.Token] = dataclasses.field(default_factory=list)
    # When sampling was started.
    start_time: float = dataclasses.field(default_factory=time.time)
    # Time when the first token was added.
    first_token_time: Optional[float] = None
    # When sampling finished.
    end_time: Optional[float] = None

    def as_string(self) -> str:
        """Returns a string representation of this context."""
        return "".join(t.token_str for t in self.tokens)

    def append(self, token: sampler.Token):
        """Adds a token to the result and reports progress in the terminal."""
        self.tokens.append(token)
        self.end_time = time.time()
        if len(self.tokens) == 1:
            self.first_token_time = time.time()
            duration = (self.first_token_time - self.start_time) * 1000
            logging.debug(f"Sampled first token after %.2fms.", duration)
        elif (len(self.tokens) + 1) % 10 == 0:
            self.print_progress()

    def print_progress(self):
        """Prints the sampling progress to stdout."""
        if len(self.tokens) > 1:
            duration = self.end_time - self.first_token_time
            speed = (len(self.tokens) - 1) / duration
            logging.debug(f"Sampled {len(self.tokens)} tokens. " f"%.2f tokens/s", speed)

`xai_sdk.ide.SampleResult.append(token)`

Adds a token to the result and reports progress in the terminal.

Source code in xai_sdk/ide.py

def append(self, token: sampler.Token):
    """Adds a token to the result and reports progress in the terminal."""
    self.tokens.append(token)
    self.end_time = time.time()
    if len(self.tokens) == 1:
        self.first_token_time = time.time()
        duration = (self.first_token_time - self.start_time) * 1000
        logging.debug(f"Sampled first token after %.2fms.", duration)
    elif (len(self.tokens) + 1) % 10 == 0:
        self.print_progress()

`xai_sdk.ide.SampleResult.as_string()`

Returns a string representation of this context.

Source code in xai_sdk/ide.py

def as_string(self) -> str:
    """Returns a string representation of this context."""
    return "".join(t.token_str for t in self.tokens)

`xai_sdk.ide.SampleResult.print_progress()`

Logs the sampling progress (token count and tokens per second) at debug level.

Source code in xai_sdk/ide.py

def print_progress(self):
    """Prints the sampling progress to stdout."""
    if len(self.tokens) > 1:
        duration = self.end_time - self.first_token_time
        speed = (len(self.tokens) - 1) / duration
        logging.debug("Sampled %d tokens. %.2f tokens/s", len(self.tokens), speed)
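
To make the timing arithmetic concrete, here is a minimal, self-contained sketch of the two measurements used above: the latency of the first token in milliseconds, and the throughput in tokens per second measured from the first token onward. `TimingSketch` and its methods are hypothetical stand-ins and not part of `xai_sdk`; timestamps are passed in explicitly so that the numbers are reproducible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimingSketch:
    """Hypothetical stand-in that mirrors the timing fields of SampleResult."""

    start_time: float
    num_tokens: int = 0
    first_token_time: Optional[float] = None
    end_time: Optional[float] = None

    def record(self, now: float) -> None:
        """Registers one sampled token that arrived at time `now` (in seconds)."""
        self.num_tokens += 1
        self.end_time = now
        if self.num_tokens == 1:
            self.first_token_time = now

    def first_token_latency_ms(self) -> Optional[float]:
        """Milliseconds between the start of sampling and the first token."""
        if self.first_token_time is None:
            return None
        return (self.first_token_time - self.start_time) * 1000

    def tokens_per_second(self) -> Optional[float]:
        """Throughput measured from the first token; None if it is undefined."""
        if self.num_tokens < 2 or self.end_time == self.first_token_time:
            return None
        return (self.num_tokens - 1) / (self.end_time - self.first_token_time)


timing = TimingSketch(start_time=100.0)
for arrival in (100.5, 101.0, 101.5, 102.0, 102.5):
    timing.record(arrival)

print(timing.first_token_latency_ms())
print(timing.tokens_per_second())
```

The first token is left out of the throughput count because its latency is reported separately. The guard against a zero-length interval is an addition in this sketch; the quoted source divides by the duration directly.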

`xai_sdk.ide.as_string()`

See Context.as_string.

Source code in xai_sdk/ide.py

def as_string() -> str:
    """See `Context.as_string`."""
    return get_context().as_string()

`xai_sdk.ide.as_token_ids()`

See Context.as_token_ids.

Source code in xai_sdk/ide.py

def as_token_ids() -> list[int]:
    """See `Context.as_token_ids`."""
    return get_context().as_token_ids()

`xai_sdk.ide.clone()`

See Context.clone.

Source code in xai_sdk/ide.py

def clone() -> "Context":
    """See `Context.clone`."""
    return get_context().clone()

`xai_sdk.ide.create_context()`

See Context.create_context().

Source code in xai_sdk/ide.py

def create_context() -> "Context":
    """See `Context.create_context()`."""
    return get_context().create_context()

`xai_sdk.ide.force_context(ctx)`

Overrides the current context with the provided one.

Source code in xai_sdk/ide.py

@contextlib.contextmanager
def force_context(ctx: Context):
    """Overrides the current context with the provided one."""
    token = _force_ctx.set(ctx)
    try:
        yield
    finally:
        _force_ctx.reset(token)
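
The set/reset token pattern above matches the API of `contextvars.ContextVar` from the standard library. The following sketch assumes that `_force_ctx` is such a variable (the excerpt does not show its definition) and uses plain strings in place of `Context` objects. The names `_current`, `_forced`, `lookup`, and `force` are hypothetical.

```python
import contextlib
import contextvars

# Hypothetical stand-ins for the module-level variables used by the SDK.
_current = contextvars.ContextVar("current", default="root")
_forced = contextvars.ContextVar("forced", default=None)


def lookup():
    """Returns the forced value if one is set, otherwise the current value."""
    forced = _forced.get()
    return forced if forced is not None else _current.get()


@contextlib.contextmanager
def force(value):
    """Overrides the value returned by `lookup` inside the `with` block."""
    token = _forced.set(value)
    try:
        yield
    finally:
        # Restore whatever was forced before, even if the body raised.
        _forced.reset(token)


seen = [lookup()]
with force("outer"):
    seen.append(lookup())
    with force("inner"):
        seen.append(lookup())
    seen.append(lookup())
seen.append(lookup())
print(seen)
```

Because each `reset` restores the value that was active before the matching `set`, nested overrides unwind in the right order.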

`xai_sdk.ide.get_client()`

Returns the client used by the IDE.

Source code in xai_sdk/ide.py

def get_client() -> _client.Client:
    """Returns the client used by the IDE."""
    global _CLIENT
    if _CLIENT is None:
        _CLIENT = _client.Client()
    return _CLIENT

`xai_sdk.ide.get_context()`

Returns the current context.

Source code in xai_sdk/ide.py

def get_context() -> Context:
    """Returns the current context."""
    if _force_ctx.get() is not None:
        return _force_ctx.get()
    return _current_ctx.get()

`xai_sdk.ide.prompt(text, strip=False)` `async`

See Context.prompt.

Source code in xai_sdk/ide.py

async def prompt(text: str, strip: bool = False) -> Sequence[sampler.Token]:
    """See `Context.prompt`."""
    return await get_context().prompt(text, strip)

`xai_sdk.ide.prompt_fn(fn)`

A decorator that executes fn in a fresh prompt context.

If a function is annotated with this decorator, a fresh prompt context is created that the function operates on. This allows solving sub-problems with a different prompt and incorporating the solution to a sub-problem into the original one.

Example

@prompt_fn
async def add(a, b):
    await prompt(f"{a}+{b}=")
    result = await sample(max_len=10, stop_strings=[" "])
    return result.as_string().split(" ")[0]

In order to get access to the context used by an annotated function, the function must return it like this:

    @prompt_fn
    async def foo():
        return get_context()

You can override the context an annotated function uses. This is useful if you want to continue operating on a context that was created by a function.

    @prompt_fn
    async def bar():
        await prompt("1+1=")
        return get_context()

    @prompt_fn
    async def foo():
        await sample(max_len=24)

    ctx = await bar()
    with force_context(ctx):
        await foo()

Parameters:

Name	Type	Description	Default
`fn`		An asynchronous function to execute in a newly created context.	required

Returns:

Type	Description
	The wrapped function.

Source code in xai_sdk/ide.py

def prompt_fn(fn):
    """A decorator that executes `fn` in a fresh prompt context.

    If a function is annotated with this decorator, a fresh prompt context is created that
    the function operates on. This allows solving sub-problems with a different prompt and
    incorporating the solution to a sub-problem into the original one.

    Example:
        ```
        @prompt_fn
        async def add(a, b):
            await prompt(f"{a}+{b}=")
            result = await sample(max_len=10, stop_strings=[" "])
            return result.as_string().split(" ")[0]
        ```

    In order to get access to the context used by an annotated function, the function must return
    it like this:

    ```
        @prompt_fn
        async def foo():
            return get_context()
    ```

    You can override the context an annotated function uses. This is useful if you want to continue
    operating on a context that was created by a function.

    ```
        @prompt_fn
        async def bar():
            await prompt("1+1=")
            return get_context()

        @prompt_fn
        async def foo():
            await sample(max_len=24)

        ctx = await bar()
        with force_context(ctx):
            await foo()
    ```

    Args:
        fn: An asynchronous function to execute in a newly created context.

    Returns:
        The wrapped function.
    """

    async def _fn(*args, **kwargs):
        with get_context().create_context() as ctx:
            await ctx.set_title(fn.__name__)
            return await fn(*args, **kwargs)

    return _fn
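
The wrapper above switches to a new context for the duration of the call and returns to the caller's context afterwards. The sketch below reproduces only that idea with the standard library: a plain list stands in for `Context`, and `_current` and `fresh_context` are hypothetical names. The use of `functools.wraps` is an addition that the quoted source does not include.

```python
import asyncio
import contextvars
import functools

# Hypothetical stand-in for the SDK's current context: a plain list of strings.
_current = contextvars.ContextVar("current")


def fresh_context(fn):
    """Runs the coroutine function `fn` with a new, empty list as its context."""

    @functools.wraps(fn)  # Keeps fn.__name__ and the docstring on the wrapper.
    async def wrapper(*args, **kwargs):
        token = _current.set([])
        try:
            return await fn(*args, **kwargs)
        finally:
            # Switch back to the caller's context once fn has finished.
            _current.reset(token)

    return wrapper


@fresh_context
async def add(a, b):
    """Builds a sub-prompt without touching the caller's context."""
    _current.get().append(f"{a}+{b}=")
    return "".join(_current.get())


async def main():
    _current.set(["Outer prompt."])
    sub_prompt = await add(1, 2)
    return sub_prompt, list(_current.get())


result = asyncio.run(main())
print(result)
```

The outer list is unchanged after the call, which is the property that lets a sub-problem use its own prompt. Note that the wrapper awaits `fn`, so the decorated function has to be a coroutine function.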

`xai_sdk.ide.randomize_rng_seed()`

See Context.randomize_rng_seed.

Source code in xai_sdk/ide.py

def randomize_rng_seed() -> int:
    """See `Context.randomize_rng_seed`."""
    return get_context().randomize_rng_seed()

`xai_sdk.ide.read_file(file_name)` `async`

Reads a file that the user has uploaded to the file manager.

Parameters:

Name	Type	Description	Default
`file_name`	`str`	Name of the file to read.	required

Returns:

Type	Description
`bytes`	The file's content as a raw byte array.

Source code in xai_sdk/ide.py

async def read_file(file_name: str) -> bytes:
    """Reads a file that the user has uploaded to the file manager.

    Args:
        file_name: Name of the file to read.

    Returns:
        The file's content as a raw byte array.
    """
    content = await get_client().files.download(file_name)
    return content

`xai_sdk.ide.sample(max_len=256, temperature=1.0, nucleus_p=0.7, stop_tokens=None, stop_strings=None, rng_seed=None, add_to_context=True, return_attention=False, allowed_tokens=None, disallowed_tokens=None)` `async`

See Context.sample.

Source code in xai_sdk/ide.py

async def sample(
    max_len: int = 256,
    temperature: float = 1.0,
    nucleus_p: float = 0.7,
    stop_tokens: Optional[list[str]] = None,
    stop_strings: Optional[list[str]] = None,
    rng_seed: Optional[int] = None,
    add_to_context: bool = True,
    return_attention: bool = False,
    allowed_tokens: Optional[Sequence[Union[int, str]]] = None,
    disallowed_tokens: Optional[Sequence[Union[int, str]]] = None,
):
    """See `Context.sample`."""
    return await get_context().sample(
        max_len,
        temperature,
        nucleus_p,
        stop_tokens,
        stop_strings,
        rng_seed,
        add_to_context,
        return_attention,
        allowed_tokens,
        disallowed_tokens,
    )
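
The excerpt does not show how the model backend applies `temperature` and `nucleus_p`. As background only, the sketch below illustrates the standard definitions of temperature scaling and nucleus (top-p) filtering on a toy distribution. The function names and the example tokens are hypothetical, and the SDK's actual behavior may differ in detail.

```python
import math


def apply_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Converts logits to probabilities with a temperature-scaled softmax."""
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())  # Subtract the maximum for numerical stability.
    weights = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(weights.values())
    return {token: weight / total for token, weight in weights.items()}


def nucleus_filter(probs: dict[str, float], nucleus_p: float) -> dict[str, float]:
    """Keeps the most likely tokens whose cumulative mass reaches nucleus_p."""
    kept = {}
    cumulative = 0.0
    for token, prob in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= nucleus_p:
            break
    # Renormalize so that the surviving probabilities sum to one.
    return {token: prob / cumulative for token, prob in kept.items()}


filtered = nucleus_filter({" 42": 0.5, " 7": 0.3, " unknown": 0.2}, nucleus_p=0.7)
print(filtered)
```

With `nucleus_p=0.7`, the two most likely tokens already cover at least 70% of the probability mass, so the least likely token is dropped before sampling. Lower temperatures concentrate the distribution on the most likely tokens, and higher temperatures flatten it.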

`xai_sdk.ide.select_model(model_name)`

See Context.select_model.

Source code in xai_sdk/ide.py

def select_model(model_name: str):
    """See `Context.select_model`."""
    return get_context().select_model(model_name)

`xai_sdk.ide.set_client(client)`

Sets the client used by the IDE SDK.

Source code in xai_sdk/ide.py

def set_client(client: _client.Client):
    """Sets the client used by the IDE SDK."""
    global _CLIENT
    _CLIENT = client

`xai_sdk.ide.set_title(title)` `async`

See Context.set_title.

Source code in xai_sdk/ide.py

async def set_title(title: str):
    """See `Context.set_title`."""
    await get_context().set_title(title)

`xai_sdk.ide.user_input(text)` `async`

Asks the user to enter a string.

Parameters:

Name	Type	Description	Default
`text`	`str`	The prompt presented to the user.	required

Returns:

Type	Description
`Optional[str]`	A string if the user actually entered some text and `None` if the input is empty.

Source code in xai_sdk/ide.py

async def user_input(text: str) -> Optional[str]:
    """Asks the user to enter a string.

    Args:
        text: The prompt presented to the user.

    Returns:
        A string if the user actually entered some text and `None` if the input is empty.
    """
    response = input(text)
    if not response:
        return None
    return response

`xai_sdk.ide.write_file(file_name, content, mime_type='application/octet-stream', overwrite=True)` `async`

Stores a file in the IDE.

Parameters:

Name	Type	Description	Default
`file_name`	`str`	Name of the file to write.	required
`content`	`bytes`	File content as a byte array.	required
`mime_type`	`str`	The MIME type of the file.	`'application/octet-stream'`
`overwrite`	`bool`	If the file already exists, overwrite it.	`True`

Source code in xai_sdk/ide.py

async def write_file(
    file_name: str,
    content: bytes,
    mime_type: str = "application/octet-stream",
    overwrite: bool = True,
):
    """Stores a file in the IDE.

    Args:
        file_name: Name of the file to write.
        content: File content as a byte array.
        mime_type: The MIME type of the file.
        overwrite: If the file already exists, overwrite it.
    """
    await get_client().files.upload(file_name, content, mime_type, overwrite)
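
Since `write_file` expects `bytes` and `read_file` returns `bytes`, structured data has to be serialized before writing and parsed after reading. The helpers below are a minimal sketch of one way to do that with JSON. `to_file_content` and `from_file_content` are hypothetical names, and the commented-out calls only indicate where the SDK functions documented above would fit.

```python
import json


def to_file_content(obj) -> bytes:
    """Serializes a JSON-compatible object into UTF-8 encoded bytes."""
    return json.dumps(obj).encode("utf-8")


def from_file_content(content: bytes):
    """Parses UTF-8 encoded JSON bytes back into a Python object."""
    return json.loads(content.decode("utf-8"))


payload = to_file_content({"question": "1+1=", "answer": 2})
# Inside a prompt, the bytes could then be stored and loaded again:
#   await write_file("results.json", payload, mime_type="application/json")
#   payload = await read_file("results.json")
restored = from_file_content(payload)
print(restored)
```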
