(Requires a Raspberry Pi or Dragonboard.) Instructions are given on the GitHub page mentioned at the end of this description.
Why:
There are about 285 million visually impaired people in the world who are not able to experience the world the way sighted people do. Smart cap aims to provide this missing experience for them. The system uses state-of-the-art deep learning techniques from Microsoft Cognitive Services for image classification and tagging.
What:
The smart cap aims to bring the beautiful world to the visually impaired as a narrative. The narrative is generated by converting the scene in front of the user into text that describes the important objects in it. Examples include 'a group of people playing a game of football', 'a yellow truck parked next to the car', and 'a bowl of salad kept on a table'. In the first prototype of the system, one line along with some keywords is played as audio to the user; in later versions, a more detailed description would be added as a feature.
How:
The architecture of the system includes Amazon Alexa, a Raspberry Pi and online computer vision APIs.
A webcam retrofitted into a regular cap is connected to the Raspberry Pi. The code given on the GitHub page (mentioned in the testing instructions) runs on the Raspberry Pi; it captures an image from the webcam and sends it to the Microsoft APIs for the recognition task. The response is then inserted into DynamoDB. When the user asks Alexa to describe the scene, the Alexa Skills Kit triggers an AWS Lambda function to fetch the data from the database (DynamoDB). The corresponding text is then played as audio on the Alexa device.
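To make this flow concrete, here is a minimal sketch of what the device-side loop could look like. It is not the repository's exact code: the table name `SmartCap`, the `userId`/`caption`/`tags`/`updatedAt` attribute names, the OpenCV capture, and the placeholder keys are all assumptions; only the general pattern (webcam capture, Microsoft Computer Vision `describe` call, DynamoDB update) follows the description above.

```python
# Hypothetical device-side loop (illustrative, not the repo's exact scripts).
import time

import boto3
import cv2        # assumption: frames are grabbed from the webcam with OpenCV
import requests

VISION_URL = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/describe'  # region may differ
VISION_KEY = 'YOUR_MICROSOFT_VISION_KEY'   # placeholder
USER_ID = 'PASTE_ALEXA_USER_ID_HERE'       # from "Alexa ask smart cap to get the user info"

table = boto3.resource('dynamodb').Table('SmartCap')  # hypothetical table name

def capture_frame(path='scene.jpg'):
    """Grab one frame from the webcam and save it to disk."""
    camera = cv2.VideoCapture(0)
    ok, frame = camera.read()
    camera.release()
    if not ok:
        raise RuntimeError('could not read from webcam')
    cv2.imwrite(path, frame)
    return path

def describe_image(path):
    """Send the image to the Computer Vision describe endpoint, return (caption, tags)."""
    with open(path, 'rb') as f:
        resp = requests.post(
            VISION_URL,
            headers={'Ocp-Apim-Subscription-Key': VISION_KEY,
                     'Content-Type': 'application/octet-stream'},
            data=f.read())
    resp.raise_for_status()
    description = resp.json()['description']
    caption = description['captions'][0]['text'] if description['captions'] else 'no caption'
    return caption, description.get('tags', [])

def push_to_dynamodb(caption, tags):
    """Store the latest caption so the Alexa skill's Lambda can read it back."""
    table.update_item(
        Key={'userId': USER_ID},
        UpdateExpression='SET #c = :c, #t = :t, #ts = :ts',
        ExpressionAttributeNames={'#c': 'caption', '#t': 'tags', '#ts': 'updatedAt'},
        ExpressionAttributeValues={':c': caption, ':t': tags, ':ts': int(time.time())})

if __name__ == '__main__':
    while True:
        caption, tags = describe_image(capture_frame())
        push_to_dynamodb(caption, tags)
        time.sleep(10)   # re-describe the scene every few seconds
```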
# Testing instructions
1. Speak to the Amazon Echo: "Alexa start smart cap" (you should hear the response: "Sure, you can ask me to describe the scene").
2. Speak to the Amazon Echo: "Alexa ask smart cap" -wait- "describe the scene" (you should hear the response: "No data received from device in past one minute"). This makes sure that the Alexa Skills Kit and DynamoDB are working as expected (a sketch of the Lambda logic behind these responses follows the example below).
3. Get the userId. Speak to the Amazon Echo: "Alexa ask smart cap to get the user info" (you should hear a long code).
4. Open http://alexa.amazon.com/ and log in.
5. In the userId card, you will see a long string.
6. Copy the userId and paste it into the aws_dynamodb.py file.
7. Make sure you have Python 2.7.9+: [Terminal] which python, [Terminal] python --version.
8. [Terminal] Run camera_image.py: python camera_image.py (you should see the images in the same folder).
9. [Terminal] Run ms_visionapi.py: python ms_visionapi.py (you should see the results in the terminal).
10. [Terminal] Run aws_dynamodb.py: sudo python aws_dynamodb.py (Note: this might require sudo access depending on whether you used sudo when running aws configure. It will tell you if the update item succeeded for DynamoDB.)
11. Speak to the Amazon Echo: "Alexa start smart cap" - wait - "describe the scene". If everything went well, you should now hear something relevant to the image that was captured by the camera.
Example: 'I think it is a yellow truck going on the road and the keywords are road, car, trees, sky'
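As referenced in step 2, the Lambda function behind the skill could look roughly like the sketch below. The table and attribute names match the assumptions in the earlier device-side sketch and are not necessarily those used in the repository; the one-minute staleness check mirrors the fallback response seen in step 2, and the speech format mirrors the example above.

```python
# Hypothetical Alexa Skills Kit Lambda handler (illustrative only).
import time

import boto3

table = boto3.resource('dynamodb').Table('SmartCap')  # hypothetical table name

def build_response(speech):
    """Wrap plain text in the Alexa Skills Kit response envelope."""
    return {
        'version': '1.0',
        'response': {
            'outputSpeech': {'type': 'PlainText', 'text': speech},
            'shouldEndSession': True,
        },
    }

def lambda_handler(event, context):
    user_id = event['session']['user']['userId']
    item = table.get_item(Key={'userId': user_id}).get('Item')
    # Treat data older than a minute as stale, matching the behaviour in step 2.
    if not item or int(time.time()) - int(item.get('updatedAt', 0)) > 60:
        return build_response('No data received from device in past one minute')
    speech = 'I think it is {0} and the keywords are {1}'.format(
        item['caption'], ', '.join(item.get('tags', [])))
    return build_response(speech)
```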
Detailed instructions can be found at: https://github.com/TusharChugh/SmartCap