Inference Model Resource Reference

The inference model resource configures which third-party inference provider to use for session recording summarization, and contains both general and provider-specific options. Multiple models can be configured.

kind: inference_model
version: v1
metadata:
  name: example-model
spec:

  # openai tells Teleport to use OpenAI for summarizing session recordings
  # with this model. Currently, OpenAI is the only supported provider.
  openai:

    # openai_model_id is the provider-specific model name. If this model
    # connects to the real OpenAI API, this must be a valid OpenAI model
    # name. If it connects to an OpenAI-compatible proxy, this must be the
    # public model name used by that particular proxy. Required.
    openai_model_id: gpt-4o
    
    # temperature optionally overrides the model's sampling temperature.
    # Defaults to the model's own default.
    temperature: 0.4
    
    # api_key_secret_ref is the name of an inference_secret resource
    # containing the OpenAI API key (when connecting directly to OpenAI) or
    # another secret required to authenticate against an LLM proxy. Required.
    # See the example inference_secret resource after this spec.
    api_key_secret_ref: example-openai-key
    
    # base_url points Teleport to an alternate OpenAI-compatible API, e.g. a
    # proxy. Optional, defaults to the public OpenAI API endpoint.
    base_url: "http://my-llm-proxy:4000/"
  
  # max_session_length_bytes is the maximum length of a session recording to
  # be processed using this model configuration. Setting it protects against
  # incurring additional costs from summarization attempts that fail because
  # of the model's context window limit. Optional, defaults to 200kB.
  max_session_length_bytes: 235000
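
The api_key_secret_ref field above refers to a separate inference_secret resource that holds the actual credential. The following is a minimal sketch of such a resource; it assumes the secret is stored under spec.value, so consult the inference_secret resource reference for the exact schema:

kind: inference_secret
version: v1
metadata:
  # The name must match the api_key_secret_ref value in the model spec.
  name: example-openai-key
spec:
  # value is assumed here to hold the raw API key or proxy token.
  value: sk-example-api-key

Once saved to files, both resources can be created with tctl create <file>.yaml and inspected with tctl get inference_model/example-model.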