Modifying Expectations

Ushahidi
Jul 10, 2019

My initial internship project timeline is summarized below.

The goals I have met so far are:

Familiarise myself with my organisation's coding standards and conventions

Here I learned about the PSR-2 coding style and rules such as:

Visibility MUST be declared on all properties and methods; abstract and final MUST be declared before the visibility; static MUST be declared after the visibility.

Opening braces for methods MUST go on the next line, and closing braces MUST go on the next line after the body.

Opening braces for classes MUST go on the next line, and closing braces MUST go on the next line after the body.

There MUST be one blank line after the namespace declaration, and there MUST be one blank line after the block of use declarations.

There MUST NOT be a hard limit on line length; the soft limit MUST be 120 characters; lines SHOULD be 80 characters or less.

Code MUST use 4 spaces for indenting, not tabs.
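To make these rules concrete, here is a small sketch of what they look like in practice. The namespace, class names and methods are made up for illustration and are not taken from the Ushahidi codebase. Both classes and the closing echo share one file only to keep the example short; in a real project each class would normally live in its own file.

```php
<?php

namespace App\Voice;

use InvalidArgumentException;

// One blank line follows the namespace and the block of use declarations.
abstract class Message
{
    // Visibility is declared on every property and method.
    protected $text;

    public function __construct($text)
    {
        if (!is_string($text) || $text === '') {
            throw new InvalidArgumentException('Message text must be a non-empty string.');
        }

        $this->text = $text;
    }

    // "abstract" comes before the visibility.
    abstract protected function channel();

    // "final" comes before the visibility, "static" comes after it.
    final public static function label()
    {
        return 'message';
    }

    public function describe()
    {
        return $this->channel() . ': ' . $this->text;
    }
}

// Opening braces for classes and methods go on their own line,
// and the code is indented with 4 spaces, not tabs.
class VoiceMessage extends Message
{
    protected function channel()
    {
        return 'voice';
    }
}

$message = new VoiceMessage('Hello from the voice integration');
echo $message->describe(), PHP_EOL;
```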

Research and decide which speech recognition library to use. Here I had to decide between AWS Transcribe (https://aws.amazon.com/transcribe/) and Google Speech-to-Text (https://cloud.google.com/speech-to-text/). After a lot of research, I settled on the Google Speech-to-Text service. Some of the reasons why I chose Google Speech-to-Text over AWS Transcribe:

Powerful speech recognition: Google Cloud Speech-to-Text enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. The API recognizes 120 languages and variants to support your global user base. You can enable voice command-and-control, transcribe audio from call centers, and more. It can process real-time streaming or prerecorded audio, using Google’s machine learning technology.

Automatically identifies spoken language: Using Cloud Speech-to-Text you can identify what language is spoken in the utterance (limited to four languages). This can be used for voice search (such as, “What is the temperature in Paris?”) and command use cases (such as, “Turn the volume up.”).

Returns text transcription in real time for short-form or long-form audio: Cloud Speech-to-Text can stream text results, immediately returning text as it’s recognized from streaming audio or as the user is speaking. Alternatively, Cloud Speech-to-Text can return recognized text from audio stored in a file. It’s capable of analyzing short-form and long-form audio.

These are just a few of the reasons.

In addition to the Google Cloud Speech-to-Text library, I researched, and am still researching, Twilio's Voice API. I had to find out which audio formats it provides, whether it is possible to get hold of the audio file, and what is needed to talk to the API.

Later on I had to create the Lumen application for the project. Here I used the knowledge I had gained from learning Laravel and Lumen, and I set up the development environment for the application, which I managed to do after several attempts and struggles. The task was to install and create a Lumen project and connect it to a MySQL database locally on my computer. My project mentors recommended Homestead, but I didn't use it because I was already running a Lumen environment on my machine using the native LAMPP stack, without the help of a third-party tool. In addition, I had to commit the application to the project repository created by my mentor on GitHub and open a pull request for the project.

At this point the coding phase had already begun. Knowing that I had never done such a project before, my mentors gave me all the help they could to understand more about a Lumen project, so I was given the link to the Facebook bot my mentor had worked on in the organisation a few years back. I had to look at the code and get a grasp of how it is set up. Although the Facebook bot is not similar to my project, it is also built on Lumen, so it was of much help.

I then had to practise routing, controllers and HTTP requests in Lumen, and study the Lumen application structure. My mentors gave me time to study these concepts because they are essential for my project. I had to download Postman, experiment with routes in the voice integration application, follow the examples in https://lumen.laravel.com/docs/5.8/routing and add my own routes to the voice application, test the routes I had created in Postman, and add controllers with the help of the Lumen docs: https://lumen.laravel.com/docs/5.8/controllers. I also read through the request and response docs for Lumen: https://lumen.laravel.com/docs/5.8/requests and https://lumen.laravel.com/docs/5.8/responses.

Learning those concepts and trying them out was very empowering. The time then came for me to apply what I had learned, so I was given the task of sending a post to the platform from the voice integration application: create a new route that accepts a POST request with "message" as a parameter, create a new controller for talking to the platform API, and create a function in the controller that talks to the platform API. Here I also had to install the Guzzle library, which is used to send a POST request to the platform API when we have incoming data. For this task I had to write the code and commit it to the repo. My first attempts were not good, but with help from my mentor I got the hang of it. It was while doing this task that I realised I had not yet fully understood all the concepts, so I had to go back and study them once more.
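The flow of that task can be sketched in plain PHP, without the framework, so that it runs on its own. This is only an illustration under stated assumptions, not the code in the project repository: the function name forwardMessage, the payload fields and the status codes are hypothetical. In the Lumen application the same logic would sit in a controller method reached through a POST route, and the sender would wrap a Guzzle client call instead of the stand-in used here.

```php
<?php

/**
 * Takes the parameters of an incoming POST request, checks that "message"
 * is present, and hands a payload to a sender that talks to the platform API.
 *
 * @param array    $input  Request parameters (in Lumen: $request->all()).
 * @param callable $sender Receives the payload array, returns an HTTP status code.
 *
 * @return array A pair of [status code, response body].
 */
function forwardMessage(array $input, callable $sender)
{
    $message = isset($input['message']) && is_string($input['message'])
        ? trim($input['message'])
        : '';

    if ($message === '') {
        return [422, ['error' => 'The "message" parameter is required.']];
    }

    // Hypothetical payload shape; the real platform API may expect other fields.
    $payload = ['title' => 'Voice message', 'content' => $message];

    $status = $sender($payload);

    return [$status, ['forwarded' => $status >= 200 && $status < 300]];
}

// Stand-in sender that records the payload instead of calling the network.
// In the real application this is where Guzzle would send the POST request.
$sent = [];
$fakeSender = function (array $payload) use (&$sent) {
    $sent[] = $payload;

    return 201;
};

list($status, $body) = forwardMessage(['message' => 'Hello'], $fakeSender);
echo $status, ' ', json_encode($body), PHP_EOL;
```

Passing the sender in as a parameter keeps the validation logic testable without a running platform instance, which is useful when checking a route in Postman before the platform side is ready.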

Presently I am at the point of integrating the Google Cloud Speech-to-Text library into the application. I am reading the documentation for the library (https://cloud.google.com/speech-to-text/) and have to answer questions such as: What is needed to get started? Do we need an account? Does it cost anything?

In this project, the goals that took longer than expected were learning routing and controllers, learning how to make HTTP requests, and viewing my route results in Postman.

This was difficult because I was learning these concepts for the very first time. They were also a bit difficult to learn, especially the specifics of the different types of routes that exist, how to create controllers that respond to routes, how to create functions, and most of all how to view the route results in Postman. I had a week to learn them and then another week to apply them, but when the time to apply them came I couldn't perform as required. I needed a lot of guidance from my mentors and another week to study further. This extra time studying the concepts helped me a lot, as I realised what I had missed and what I needed to focus on more.

If I were to begin the project all over again, I would spend more of my time learning how to build a voice-to-text library similar to, or even almost as good as, the Google Cloud Speech-to-Text library. I think so because it would have helped me learn a lot more, and it would have engaged me in cloud computing, which I have always been interested in.

Building the library myself was actually the original goal of the project, but later on, looking at the length of the internship and the fact that I still had so much to learn, we came to the conclusion that I would use the services that are already available. Nevertheless, I still intend to work on this project after the Outreachy internship, and I believe that at some point I will need to implement my own microservices.

My plan for the second half of the internship is to find a way to integrate the Google Speech-to-Text service and then later on integrate Twilio's Voice API as well. I have to integrate these services in such a way that they communicate successfully with the platform API. The application database is also a major concern: I have to set up the database for the voice application so that it smoothly stores the data sent and received by users communicating through the voice data source. The project documentation is also a very important part of the project. I will first create a Readme file for the project using Markdown and HTML, and then create documentation for the voice integration project in Ushahidi's GitBook documentation, where I will outline the building process for the project, where it needs modifications, the areas future contributors should focus on, and the testing and usage of the application.