Usage

Learn how to use LLM Vision

Image Analyzer

Example Service Call

Once you have set up at least one provider and have at least one camera in Home Assistant, you can test LLM Vision:

service: llmvision.image_analyzer
data:
  provider: OpenAI
  message: Describe the image
  max_tokens: 100
  model: gpt-4o-mini
  image_entity: ["{{ states.camera|map(attribute='entity_id')|first }}"]
  temperature: 0.5

More examples & inspiration: https://llm-vision.gitbook.io/examples/
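
To use the answer in a script or automation, the call can store its response in a variable. The following is a minimal sketch, not a definitive setup. It assumes the service returns its answer under a `response_text` key, which this page does not state, so check it against your installed version. `camera.front_door` is a hypothetical entity ID.

```yaml
# Action sequence: analyze one camera snapshot, then show the answer.
- service: llmvision.image_analyzer
  data:
    provider: OpenAI
    message: Describe the image
    max_tokens: 100
    image_entity:
      - camera.front_door  # hypothetical entity ID
    temperature: 0.5
  response_variable: response  # Home Assistant stores the service response here
- service: persistent_notification.create
  data:
    title: LLM Vision
    # Assumption: the answer is returned under `response_text`.
    message: "{{ response.response_text }}"
```

`persistent_notification.create` is used here only because it is available without extra setup; any notify service could take its place.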

Full Service Call Reference

service: llmvision.image_analyzer
data:
  provider: OpenAI
  message: Describe what you see.
  max_tokens: 100
  model: gpt-4o-mini
  image_file: |-
    /config/www/tmp/example.jpg
    /config/www/tmp/example2.jpg
  image_entity:
    - camera.garage
    - image.front_door_person
  target_width: 1280
  detail: low
  temperature: 0.5
  include_filename: true

Video Analyzer

You can analyze video files and Frigate events the same way by calling the llmvision.video_analyzer service with the following data:

service: llmvision.video_analyzer
data:
  provider: OpenAI
  message: What is happening in the video?
  max_tokens: 100
  model: gpt-4o-mini
  video_file: |-
    /config/www/tmp/front_door.mp4
    /config/www/tmp/garage.mp4
  event_id: 1712108310.968815-r28cdt
  interval: 5 # Analyze one frame every 5 seconds
  target_width: 1280
  detail: low
  temperature: 0.5
  include_filename: true
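
The example above passes both video files and a Frigate event ID. The parameter table marks `video_file` and `event_id` as `No*`. Assuming this means only one video source is needed per call, a request that analyzes a single Frigate event might look like the sketch below. The event ID is the example value used on this page.

```yaml
service: llmvision.video_analyzer
data:
  provider: OpenAI
  message: What is happening in the video?
  max_tokens: 100
  event_id: 1712108310.968815-r28cdt  # example Frigate event ID from this page
  interval: 3  # one frame every 3 seconds (the documented default)
  temperature: 0.5
```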

Service Call Parameters

| Parameter | Required | Description | Default | Valid Values |
| --- | --- | --- | --- | --- |
| provider | Yes | The AI provider to call. | OpenAI | OpenAI, Anthropic, Google, Ollama, LocalAI |
| model | No | Model used for processing the image(s). | | See 'Choosing the right model' |
| message | Yes | The prompt to send along with the image(s). | | String |
| image_file | No* | The path to the image file(s). Each path must be on a new line. | | Valid path to an image file |
| image_entity | No* | An alternative to image_file for providing image input. | | Any image or camera entity |
| video_file | No* | The path to the video file(s). Each path must be on a new line. | | Valid path to a video file |
| event_id | No* | Event ID from Frigate. Each ID must be on a new line. | | e.g. 1712108310.968815-r28cdt |
| interval | Yes | Analyze one frame every 'interval' seconds. | 3 | Integer between 1 and 100 |
| include_filename | No | Whether to include the filename in the request. | false | true, false |
| target_width | No | Width to downscale the image to before encoding. | 1280 | Integer between 512 and 3840 |
| detail | No | Level of detail to use for image understanding. | auto | auto, low, high |
| max_tokens | Yes | The maximum number of response tokens to generate. | 100 | Integer between 10 and 1000 |
| temperature | Yes | Randomness of the output. | 0.5 | Float between 0.0 and 1.0 |
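
As an illustration of the defaults, the sketch below sets only the required parameters that apply to images, plus one image source. It leaves `model`, `target_width`, `detail`, and `include_filename` at their default values. It assumes that `interval` applies only to video analysis and can be omitted here. `camera.garage` is the entity from the reference example earlier on this page, and the prompt is illustrative.

```yaml
service: llmvision.image_analyzer
data:
  provider: OpenAI
  message: Is there a car in the garage?  # illustrative prompt
  max_tokens: 100   # documented range: 10 to 1000
  temperature: 0.5  # documented range: 0.0 to 1.0
  image_entity:
    - camera.garage
```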
