The “See for Me” project addressed the problem of creating a cheaper and more flexible smart audio guide that adapts to the actions and interests of the visitor.
Personalization is viewed as a factor that enables museums to move from “talking to the visitor” to “talking with the visitors”, turning a monologue into a dialogue. Digital and mobile technologies are indeed becoming a key means of enhancing visitors’ experiences during a museum visit, e.g. by creating interactive and personalized visits. The “See for Me” project addressed the problem of creating a smart audio guide that adapts to the actions and interests of a museum visitor, understanding both the context of the visit and what the visitor is looking at.

“See for Me” is a prototype of a context-aware audio guide. Using sensors commonly available in wearable devices, such as a microphone, camera and accelerometers, it identifies the artwork being looked at and, if the visitor is paying attention (e.g. not walking, not talking to other people, and standing in front of the artwork), plays the corresponding audio guide content.
The smart audio guide, deployable on the most common smartphones in use, perceives the context and interacts with users: it recognizes artworks automatically, enabling a semi-automatic interaction with the wearer.
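
The attention check described above can be pictured as a simple gate over a few context signals. The following is only a minimal sketch in Python, not the project's implementation: the `Context` fields, the `should_play` function, and the assumption that each signal is already available as a plain boolean are all hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """Hypothetical snapshot of the visitor's context (field names are assumptions)."""
    artwork_id: Optional[str]  # artwork recognised by the camera, or None if no artwork is in view
    is_walking: bool           # e.g. estimated from the accelerometers
    is_talking: bool           # e.g. estimated from the microphone


def should_play(ctx: Context) -> bool:
    """Return True only when the visitor seems to be paying attention to an artwork."""
    if ctx.artwork_id is None:
        # Nothing recognised in front of the visitor: there is no content to play.
        return False
    # Play only if the visitor is neither walking nor talking to someone.
    return not (ctx.is_walking or ctx.is_talking)
```

In this sketch, a visitor standing silently in front of a recognised artwork would trigger playback, while walking past it or chatting with a companion would not.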

The goal of this work has been to implement a real-time computer vision system that runs on wearable devices and performs object classification and artwork recognition, improving the experience of a museum visit through the automatic detection of user behavior.

Resources needed

The project was funded by the Tuscany Region with resources from the Regional Operational Programme “POR CREO 2014-2020”, funding line “Support to Research and Innovation”.

Evidence of success

The system runs in real time on a mobile device and achieves high-precision artwork recognition (less than 0.3% errors on our dataset). We tested the system in the Bargello National Museum in Florence and also conducted a usability test using SUS (System Usability Scale) questionnaires, obtaining a high score (~80/100).
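
For context, a SUS score is conventionally computed from ten responses on a 1 to 5 scale: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a value between 0 and 100. The sketch below shows that standard calculation in Python. The function name `sus_score` is an illustrative choice, and no data from the actual usability test is reproduced here.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Compute the System Usability Scale score for one questionnaire.

    `responses` holds the ten answers in order, each an integer from 1 to 5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    # Scale the 0-40 raw sum to the 0-100 range.
    return total * 2.5
```

As a made-up example, a respondent answering 4 to every odd item and 2 to every even item would obtain a score of 75.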

Difficulties encountered

Although computer vision techniques are improving, feature extraction for artwork detection is a time-consuming process. In order to be fine-tuned to recognise artworks and people simultaneously, it also requires an existing dataset containing the artworks described by the audio guide.

Potential for learning or transfer

The potential for knowledge and technology transfer of this practice is very high, thanks to its easy production method and the scalability of the solution to every kind of museum.

The audio guide understands when the user is engaged in a conversation, when their field of view is occluded by other visitors, or when they are paying attention to another person or a human guide. In these cases, it is reasonable to stop the audio guide or temporarily interrupt playback, so that the user can carry on the conversation. This function makes it possible to use the smart audio guide in crowded museums as well, and prevents visitors from becoming excessively isolated while using it.

Furthermore, the smart audio guide has been developed to be deployed on the most common smartphones on the market: this is the key factor for its scalability and one of the factors that can encourage its use with other types of cultural assets.

Main institution
University of Florence, NEMECH (“New Media for Cultural Heritage” competence centre)
Toscana, Italy (Italia)
Start Date
March 2016
End Date
March 2017


