CS-660 Interactive Machine Learning Final Paper.

Finally finished my semester project to ID stairs using Convolutional Neural Nets (CNN) for SEAR-RL.

Read it here if you want.

The Effect of Data Content and Human Oracles on Convolutional Neural Networks and Incremental Learning

If you just want the tl;dr, here it is (nothing groundbreaking):

  1. If you want to ID 3D objects with a CNN, you are better off using 3D data (point clouds) than 2D image data (even for 2D CNNs like I used).
  2. Using human reinforcement in incremental training of Neural Nets does not really improve training.  It might help if you are adding new classes to ID along with the data, but that would be future work to explore.

You can check out the code for the project here:


Although you need to get the data I collected for training from here:


(the data is too big to store on GitHub)

To run everything, you need:

  • Anaconda 5.0 / Python 3.6
  • TensorFlow 1.1.0
  • Keras 2.0.8

And if you want to check out the data collection application for iOS (or just need a starter Occipital Structure app written in Swift 4.0), you can get that here:


Teaching the Machine.

Progress update.

Well, I have not stopped working on SEAR-RL. I have just taken a small break to focus on a different aspect of it.

While I look for a full-time job, I have continued to take classes at UK. This semester it was a graduate-level Interactive Machine Learning class. As usual, my education has been a trial by fire, but for the most part I am enjoying it and learning a lot.

For my semester project, I decided to work on something to extend SEAR-RL. I am curious whether I can build a neural-net model using the depth data from the Occipital Structure to identify important pedestrian obstacles. Specifically, I want SEAR-RL to ID stairs, for walking up and down, and ledges someone could fall off. I have completed the data collection portion of the project and collected about 5 gigs of data. Now comes building the deep neural net.

Fortunately, while taking the UK class, I have been supplementing my education with Andrew Ng’s Deep Learning Course, as well as a great book “Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.”

It is scary because, like most things I end up wanting to do in life, there is no formal process to learn what I want to learn and do. For example, I was unfortunately just off-cycle in the curriculum to be able to take the Machine Learning classes during my proper graduate studies. So now that I am taking a graduate-level class, I am having to learn everything at high speed and by fire just to complete the course. [I have tried to complain to the management of the universe that I get tired of this recurring theme in my life, but oh well. Maybe someday I can do something I want without feeling under the gun for once.]

In general, I guess I just wish I knew what to do with SEAR-RL. I am not sure it is enough to turn into a business. Not to mention, business acumen is not my thing. But you would think that in a world where tech companies keep trying to push Augmented Reality tech, there might be a place for it.

I just do not know.

Anyway, that is the update.

The Great Refactoring

After a couple of slow months, I finally finished the great refactoring of SEAR-RL.

Hopefully, I can now clean up and add new features with a lot less pain.

So yes, I am still working on it.

It is now April

So a couple small updates.

Even though I made a fair sample bank for all the sounds in SEAR, my preferred goal has always been to have an actual software synth producing the noises. In this case, that would be an iOS AudioUnit.

However, learning how to program AudioUnits is one of the more annoying things I have tried to learn in iOS. There are few good resources, and those sources are usually either dated or piecemeal.

Still, I have created a new GitHub depot as a way to document everything I learn in trying to build one.  You can find it here.  https://github.com/ForeverTangent/AUv3Breakdown

Also, I have run into a really nasty SegFault bug in SEAR.  I think I might have a solution, but it is one of those bugs that makes you step away from a project to collect yourself and to build up the energy to deal with it.

All of this is because Prof. Jaromczyk asked me to demo SEAR this summer at an Engineering Summer Camp UK is hosting. So if I don’t have a job by then, I am using it as an ‘artificial’ deadline to get a few more things done.


Post-Masters Work

Ok, I know it has been a while since a blog update.

Yes, I passed my Masters and graduated but that has not stopped me from working on SEAR-RL.

With UK’s E-Day coming up this weekend, and Prof. Jaromczyk wanting me to demonstrate it for the event, I had a deadline to improve it. As my old teacher Jesse Schell once said, “Deadlines are magic.”

So, I did some major clean-up to the code base and UI. However, the most important change is that my old system for locating the closest objects in the user’s view is gone. I replaced it with a particle filter system to do the same thing [because all the cool kids are using Machine Learning in their projects in one way or another]. What is nice is that the particle filter works much better than my old system. Sure, it is a little wacky at moments [like all machine learning algorithms can be], but overall it seems to be a win.

The hardest part was teaching myself everything I needed to implement one. This GitHub project and this YouTube video probably helped me the most in understanding how particle filters work. Which is good, because most of the literature is a little math-y, and I am not great when it comes to learning from books. I am definitely a show-me-once [maybe twice] type of learner.
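For anyone who learns the same way, here is a minimal one-dimensional sketch of the predict, weight, and resample loop that a particle filter runs. It is not the SEAR-RL implementation: the names (Particle, ParticleFilter, SeededGenerator), the noise values, and the 0.5 to 5.0 metre range are all assumptions made for illustration, and it tracks a single distance rather than objects in a depth image.

```swift
import Foundation

// Small seeded generator (SplitMix64) so that runs are repeatable.
struct SeededGenerator: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// One hypothesis about how far away the object is (hypothetical type).
struct Particle {
    var distance: Double
    var weight: Double
}

struct ParticleFilter {
    var particles: [Particle]
    var rng: SeededGenerator
    let motionNoise: Double        // jitter added in the predict step (assumed)
    let measurementNoise: Double   // assumed sensor standard deviation

    init(count: Int, range: ClosedRange<Double>, seed: UInt64,
         motionNoise: Double = 0.05, measurementNoise: Double = 0.2) {
        var generator = SeededGenerator(state: seed)
        particles = (0..<count).map { _ in
            Particle(distance: Double.random(in: range, using: &generator),
                     weight: 1.0 / Double(count))
        }
        rng = generator
        self.motionNoise = motionNoise
        self.measurementNoise = measurementNoise
    }

    // After resampling all weights are equal, so the plain mean is the estimate.
    var estimate: Double {
        return particles.map { $0.distance }.reduce(0, +) / Double(particles.count)
    }

    mutating func update(measurement: Double) {
        let count = particles.count
        // Predict: nudge every hypothesis by a little random jitter.
        for index in particles.indices {
            let jitter = Double.random(in: -motionNoise...motionNoise, using: &rng)
            particles[index].distance += jitter
        }
        // Weight: particles close to the measurement get a higher likelihood.
        var total = 0.0
        for index in particles.indices {
            let error = particles[index].distance - measurement
            let likelihood = exp(-(error * error) / (2 * measurementNoise * measurementNoise))
            particles[index].weight = likelihood
            total += likelihood
        }
        let step = total / Double(count)
        guard step > 0 else {
            // Every particle is implausible; keep them all with equal weight.
            for index in particles.indices { particles[index].weight = 1.0 / Double(count) }
            return
        }
        // Resample (systematic): copy particles in proportion to their weight.
        var pointer = Double.random(in: 0..<step, using: &rng)
        var cumulative = particles[0].weight
        var source = 0
        var resampled: [Particle] = []
        resampled.reserveCapacity(count)
        for _ in 0..<count {
            while pointer > cumulative && source < count - 1 {
                source += 1
                cumulative += particles[source].weight
            }
            resampled.append(Particle(distance: particles[source].distance,
                                      weight: 1.0 / Double(count)))
            pointer += step
        }
        particles = resampled
    }
}

// Made-up noisy readings around a true distance of 2.0 metres.
var filter = ParticleFilter(count: 500, range: 0.5...5.0, seed: 42)
let readings = [2.1, 1.9, 2.05, 1.95, 2.0, 2.1, 1.9, 2.05, 1.95, 2.0]
for reading in readings {
    filter.update(measurement: reading)
}
let estimate = filter.estimate
```

The jitter in the predict step is also where the occasional "wacky" behaviour comes from: the cloud of guesses never fully settles, which is what lets it follow an object that moves.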

Now just a couple of random thoughts.

I am constantly surprised by how bad a lot of scientific/math/engineering writing is. I honestly think it is one of the things that turns people off from science. Just because one is writing about complex things does not mean one needs to write complicatedly.

The rule I was always taught is to write toward an audience at a 5th grade level, and I think there is something to be said for that. If I had never found that video and project, I am not sure I would have ever figured out how to build a particle filter. I think the scientific community really needs to take a look at what is considered good writing for the public at large.

As much as SEAR-RL is a passion project for me, I think one of the reasons my previous few blog posts seem sort of light is that working on anything for a long time can wear someone down.

Seriously, I have been working on this project for at least a decade, and even though I still have a lot of ideas for improvement, I am still not sure about my personal future or what to do with the project. So much of that can begin to hang on oneself like an albatross.

I really did not have anything to follow up that notion, but I think it does provide insight into how people like George Lucas can give up Star Wars and the rest of Lucasfilm to Disney. Even the best of things can wear on the person creating it.

Still, thanks to Jamie Martini for the suggestion to use a particle filter.

Finally, I think I have thought of the perfect way to describe my project to people: Augmented Reality for the Blind and Visually Impaired. I will try that for a while and see how it works.

It worked!

OK, I know this all is long overdue for an update.


Wow, the last blog post was over a month ago. I guess I should fill people in, so here is the short version.


I got my project running at the end of October. Then I had to schedule some user testing to see if all my work was good. Despite the promise of pizza, my first round of user testing was sort of a bust. Only 8 people showed up. I had to improvise a second round of user testing over Thanksgiving. Since family was coming to our house for Excess Carbs Day, I built a maze in our ‘playroom’ and had everyone walk through.


Here is a video of Sky doing the test.


In general, user testing was a success, in that I just wanted to test whether people could easily pick up and use the system. Almost everyone was able to use the sound of the system to get through the simple maze. The most interesting part, and this is just anecdotal evidence, is that despite scientific research stating that men do better on spatial tests, women tended to get through the maze faster than the men.


Then two weeks after that, I had my Masters Defense.




I guess you can figure it out. I passed and I am now a Master of Computer Science.


I probably should write something more introspective about the semester, but honestly, I have really been up in my head space since my Masters Defense, trying to figure a bunch of stuff out. So maybe I will add a few thoughts on everything when I eventually figure that out.


Along with making it through my Masters, I am just glad to know that I got a more practical prototype to work and that the idea itself is viable. Now I just need to figure out what happens next.


I got the kit.

Just a short update.

Got the Occipital Structure VR Kit today. In short, I have a proper headset to hold the Structure Sensor for testing on Saturday.



I am not sure if I need the Wide Angle Lens, but I think Bridge needs it.


It is nice to have a proper headset, but I cannot wear my glasses while wearing it. Also, there is no trigger button like on a normal Google Cardboard, at least unless you stick your finger through the nose space.


Still, I am glad to have a second sensor for Saturday. I am not sure how long the charges last, so it will be nice to have a backup.


Rounding the bend, report focus time.

So this week has been a bit quieter. For the most part, since I got to Beta, any code work is frozen for the rest of the semester. There is no way I am risking breaking anything until after graduation. However, I was able to schedule time in Young Library’s multipurpose room for user testing on the 19th. Yes, it is the weekend before Thanksgiving, but you take what you can get. Also, I got some feedback on the early draft of my final report, and I am hoping that with a month to work on the text, I should get it reading pretty nicely.

There is no question that there is a lot I would like to add to SEAR-RL, but I have to say, now that I have a base functioning beta, it is a lot easier to plan what to do next. Building the prototype was very problematic because every component had some purpose in the overall structure. This made it difficult to separate tasks. However, now that I have a base working system, it is a little easier to envision a specific feature and at least see in my mind a set series of tasks that need to be completed.

Finally, it was pretty cool that the Occipital guys noticed my project and gave it a shout-out on their Twitter feed. It is funny that it all came about because I decided to order a second sensor and get the VR kit. Good thing too, because my Google Cardboard hack has begun to fall apart, so it will be nice to have a proper headset. Plus, it will be interesting to see if the Bridge API Occipital is making for the VR kit can provide additional functionality when I get a chance to work on it again.



After a few months of madness, I have finally reached BETA!

Since it would be easier to show than tell, here is a short video of SEAR-RL in action.

Remember: LISTEN WITH HEADPHONES, or a pair of stereo speakers with good stereo separation to get the effect.

A quick explanation for those who don’t understand what they are hearing: where there is an object someone could run into, there is a sound. The color of the object determines the type of sound played. The sonar ping plays from the direction of North.

Anyway, after a few panics, I am just glad I am here. Knowing there is a working base makes it easier to improve now, but what did I learn over the past couple of weeks?

First, working with external hardware on iOS is a pain. Mainly because you are playing with quantum mechanics. The reason being, it is very hard to run a program and debug it. The data either exists in one state or another, and trying to get to the data can sometimes destroy it. Part of this is because I need the Lightning port on the iPhone 6 for the Structure Sensor, so I cannot get debugging data straight from the iPhone. On top of that the Wireless Debug system that the Structure offers can be a little finicky to work with.

Aside from dealing with hardware, the biggest problem was with Swift itself. In short, Apple is not kidding around about making Swift type safe.

The simplest example would be as such.

    for element in someList {
        element.someVariableToChange = theChange
    }

With C or C++, element could be a reference to an element in someList, which could itself be a reference. Even with Python, all variables are references, so if I change something somewhere, most of the time it is changed everywhere.

In those languages, when you change element.someVariableToChange, the change takes effect immediately. Not so with Swift, at least for value types such as structs and arrays. First, if the list was passed into the function, it would be passed by value. Next, element in the for-in loop is itself a copy of the item in the list, and a constant one unless you state var before it. So unless you explicitly state otherwise, you are often working with a copy of an object and not the object itself.

Therefore, when you change the variable, your change is totally forgotten about once you leave the loop. You have to explicitly put the element back where you got it for any changes to stick. This can be tedious, but it does enforce good type-safe habits. All this caused me the greatest headaches: trying to make sure I was interacting with the objects in the way I wanted to, with the object itself and not a copy.
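A small sketch of the difference, using made-up types (Marker and SharedMarker are hypothetical, not from the SEAR-RL code): a struct element mutated through the loop variable is lost, writing back through the index sticks, and a class instance behaves like the references in C++ or Python.

```swift
// Hypothetical value type for illustration.
struct Marker {
    var distance: Double
}

var markers = [Marker(distance: 1.0), Marker(distance: 2.0)]

// `var marker` is a mutable copy of each element, so the array is untouched.
for var marker in markers {
    marker.distance *= 2
}
let afterCopyLoop = markers.map { $0.distance }

// Mutating through the index writes the change back into the array itself.
for index in markers.indices {
    markers[index].distance *= 2
}
let afterIndexLoop = markers.map { $0.distance }

// A class is a reference type, so the loop variable refers to the shared object.
final class SharedMarker {
    var distance: Double
    init(distance: Double) { self.distance = distance }
}

let sharedMarkers = [SharedMarker(distance: 1.0)]
for marker in sharedMarkers {
    marker.distance *= 2
}
let afterClassLoop = sharedMarkers.map { $0.distance }
```

Here afterCopyLoop is still [1.0, 2.0], afterIndexLoop is [2.0, 4.0], and afterClassLoop is [2.0]. Whether to use the index or a class depends on whether the objects are meant to be shared.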

Still, I feel a great bit of relief now, and I am hoping I can pull together at least one good user testing session in the next couple of weeks. This weekend: a break from code, and work on my report.

To flirt with madness.

The verge of madness.

It is sort of funny: earlier this week I had an interview with Kurzweil Music. Although the interview went really well, we decided I probably was not the best fit for the position they had at this time. [They need someone who can program closer to the metal than I am used to.] Still, it was a good interview, and ironic because of one of the topics we talked about.

Trying to program audio for a system is to flirt with insanity.

The topic came up because we made the observation that one simply cannot debug audio. With just about every other form of computing, if something bad is happening, one can freeze the system to figure out what is going wrong. But if you freeze a system running audio, the audio stops. So trying to figure out if something is wrong with your programming truly becomes a task for a mad scientist. That is what happened this week.

First, I started the week totally excited and ready to go. I had all my classes set, the code looked good and… crash. Why? Well, I soon discovered the beautiful sample bank I had created for sound generation was too big. Actually, specifically, it had too many sample files in it.

I assumed that every instance I created of Apple’s sample player AudioUnit, AUSampler, would take care of the files I loaded into it. I was mistaken. Apparently, CoreAudio as an environment has a maximum limit on the number of samples all the instances of AUSampler can have open. This is really weird, because it means the AudioUnits are more closely coupled to the greater audio sub-system than one would be led to believe. After some experiments I discovered that the limit seems to be around 250 [probably 256]. The problem was that over the summer I had crafted a near-perfect sample bank for one of the sounds I need for the project. I was forced to cut my beautiful sound bank of 160 samples down to just 8 [5% of what it was] so I could get enough copies loaded into memory. The new bank does not have the fidelity of the old, but it does the job. That was the first blow.

However, after dealing with that problem, and building a new sampler bank, I ran into a new problem… silence.

For most of the week, I have been tearing my hair out trying to figure out why my new sample bank would not produce sound. What made it maddening was a tiny test audio function I wrote. The sound would play perfectly with the test function and not for anything else.

Initially, I thought it had something to do with the structure of my code. Specifically, my test function builds, plays, and destroys everything needed to play a sound, all within the scope of one function. Since I have to spread the elements for the audio over a couple of classes for the application proper, I thought the way Swift deals with referencing data might be causing the problem, because Swift sometimes passes data by value and other times by reference. So I thought maybe I was not actually playing the audio object but just a copy. However, that rabbit hole ended in a dead end. The real culprit was something else.

I noticed one little difference between my test code and the project code, which led me to the answer. I am triggering all my audio via MIDI commands to AUSampler. I noticed my test code triggered a sound using a velocity of 1, while the main project code was using the MIDI max velocity of 127. This led me back to look at the NEW sample bank I had created at the beginning of the week. For some reason AUSampler had mangled the key mapping for the sample bank, and in the end the bank would only respond to a velocity of 15 or below.
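The bug is easier to see as a toy model. The sketch below is not AUSampler's preset format or API. Zone and Bank are hypothetical types that only capture the idea that a sample plays when both the key and the velocity of a note fall inside a zone's ranges.

```swift
// Hypothetical model of one sampler zone: which keys and velocities trigger it.
struct Zone {
    let keys: ClosedRange<UInt8>
    let velocities: ClosedRange<UInt8>

    func responds(toKey key: UInt8, velocity: UInt8) -> Bool {
        return keys.contains(key) && velocities.contains(velocity)
    }
}

// Hypothetical bank: a note sounds if any zone responds to it.
struct Bank {
    let zones: [Zone]

    func sounds(key: UInt8, velocity: UInt8) -> Bool {
        return zones.contains { $0.responds(toKey: key, velocity: velocity) }
    }
}

// The mangled mapping only covered low velocities (0 is left out here as note-off).
let mangledBank = Bank(zones: [Zone(keys: 0...127, velocities: 1...15)])
// A corrected mapping covers the whole velocity range.
let fixedBank = Bank(zones: [Zone(keys: 0...127, velocities: 1...127)])

let testCallPlays = mangledBank.sounds(key: 60, velocity: 1)       // like the test function
let projectCallPlays = mangledBank.sounds(key: 60, velocity: 127)  // like the project code
let fixedCallPlays = fixedBank.sounds(key: 60, velocity: 127)
```

With the mangled mapping, the velocity 1 call plays and the velocity 127 call falls outside every zone, which matches the silence described above.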

OK… I know for the non-MIDI-literate a lot of that sounded like Greek. So let me explain it another way. I started out by building a Lamborghini. However, after discovering how much insurance was on the Lamborghini, I cannibalized it and built a Toyota Yaris from the parts. However, in building the Yaris, something went wrong, and now the Yaris can only drive in first gear.

After making a correct test bank late last night that responds to all the velocity levels [so I can drive past first gear], I realized all my code was working, so now I just have to finalize the new bank once more.

Despite the madness, there were two silver linings this week. The first is that in rebuilding the sample bank, I had to revisit a Python script I made this summer to create the sound files using SoX. Over the summer, I had documented that script so thoroughly and written it so robustly that I was able to use it to generate the new sample files I needed with one minor modification. SoX, although a great tool, is not the most intuitive command-line audio program, especially if you want to do more complex tasks. I think that was the first time I realized I had been able to re-use code I had written in the past for a current practical task. It was nice to take some solace in that thought.

The second is that trying to chase down the reasons the audio would not play forced me to take a good look at the code I had. Through that, I was able to refactor and simplify some of the code to make it easier to test and use, which I am sure will help going forward.