Jefouree

The discoveries worth talking about each week.


arXiv AI/ML

How AI Models Secretly Know Where to Look When They Describe Pictures

Like a spotlight operator following a stage actor's monologue, some attention heads in vision-language models consistently track the exact image region being described, and yanking that spotlight to a different spot forces the model to narrate something else entirely.

This means we're learning to see *inside* these models in concrete, manipulable ways, rather than treating them only as black boxes.

