It worked!

OK, I know this is all long overdue for an update.

Wow, the last blog post was over a month ago. I guess I should fill people in, so here is the short version.

I got my project running at the end of October. Then I had to schedule some user testing to see whether all my work was any good. Despite the promise of pizza, my first round of user testing was sort of a bust: only 8 people showed up. So I improvised a second round of user testing over Thanksgiving. Since family was coming to our house for Excess Carbs Day, I built a maze in our ‘playroom’ and had everyone walk through it.

Here is a video of Sky doing the test.

In general, user testing was a success, in that I just wanted to see whether people could easily pick up and use the system. Nearly everyone was able to use the system's sound to get through the simple maze. The most interesting part, and this is just anecdotal evidence, is that despite scientific research stating that men do better on spatial tests, the women tended to get through the maze faster than the men.

Then two weeks after that, I had my Masters Defense.

I guess you can figure it out. I passed, and I am now a Master of Computer Science.

I probably should write something more introspective about the semester, but honestly, I have really been up in my head space since my Masters Defense, trying to figure a bunch of stuff out. So maybe I will add a few thoughts on how everything went once I eventually figure that out.

Along with making it through my masters, I am just glad to know that I got a more practical prototype to work, and that the idea itself is viable. Now I just need to figure out what happens next.

I got the kit.

Just a short update.

Got the Occipital Structure VR Kit today. In short, I now have a proper headset to hold the Structure Sensor for testing on Saturday.

I am not sure if I need the Wide Angle Lens, but I think Bridge needs it.

It is nice to have a proper headset, but I cannot wear my glasses while wearing it, and there is no trigger button like on a normal Google Cardboard. At least, not unless you stick your finger through the nose space.

Still, I am glad to have a second sensor for Saturday. I am not sure how long a charge lasts, so it will be nice to have a backup.

Rounding the bend, report focus time.

So this week has been a bit quieter. For the most part, since I got to the beta, any code work is frozen for the rest of the semester. There is no way I am risking breaking anything until after graduation. However, I was able to schedule time in Young Library's multipurpose room for user testing on the 19th. Yes, it is the weekend before Thanksgiving, but you take what you can get. Also, I got some feedback on my early draft of the final report, and I am hoping that with a month to work on the text, I can get it reading pretty nicely.

There is no question that there is a lot I would like to add to SEAR-RL, but I have to say, now that I have a functioning beta as a base, it is a lot easier to plan what to do next. Building the prototype was difficult because every component served some purpose in the overall structure, which made it hard to separate tasks. However, now that I have a base working system, it is a little easier to envision a specific feature and at least see in my mind the series of tasks that need to be completed.

Finally, it was pretty cool that the Occipital guys noticed my project and gave it a shout-out on their Twitter feed. It is funny that it all came about because I decided to order a second sensor and get the VR kit. Good thing too, because my Google Cardboard hack has begun to fall apart, so it will be nice to have a proper headset. Plus, it will be interesting to see whether the Bridge API Occipital is making for the VR kit can provide additional functionality when I get a chance to work on it again.

BETA!

BETA!

After a few months of madness, I have finally reached BETA!

Since it would be easier to show than tell, here is a short video of SEAR-RL in action.

Remember: LISTEN WITH HEADPHONES, or a pair of stereo speakers with good stereo separation to get the effect.

A quick explanation for those who don't understand what they are hearing: where there is an object someone could run into, there is a sound. The color of the object determines the type of sound played. The sonar ping plays from the direction of north.
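
For the technically curious, the mapping is conceptually that simple. Here is a toy sketch of the idea; the category and sound names are made up for illustration and are not the actual SEAR-RL code:

// Every obstacle the scanner reports gets a sound; which sound is chosen
// depends on the rough color class of the object. The compass ping is
// separate and is always placed in the direction of north.
enum ColorClass {
    case red, green, blue, neutral
}

func soundName(for color: ColorClass) -> String {
    switch color {
    case .red:     return "marimba"
    case .green:   return "strings"
    case .blue:    return "bells"
    case .neutral: return "noise"
    }
}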

Anyway, after a few panics, I am just glad I am here. Knowing there is a working base makes it easier to improve things now. But what did I learn over the past couple of weeks?

First, working with external hardware on iOS is a pain, mainly because it feels like you are playing with quantum mechanics: it is very hard to run the program and debug it at the same time. The data exists in one state or another, and trying to get at the data can sometimes destroy it. Part of this is because I need the Lightning port on the iPhone 6 for the Structure Sensor, so I cannot get debugging data straight from the iPhone. On top of that, the wireless debug system the Structure offers can be a little finicky to work with.

Aside from dealing with hardware, the biggest problem was with Swift itself. In short, Apple is not kidding around about making Swift type safe.

The simplest example would be something like this:

for element in someList {
    element.someVariableToChange = theChange
}

In C or C++, element could be a reference to an element in someList, which could itself be a reference. Even in Python, all variables are references, so if I change something somewhere, most of the time it is changed everywhere.

Not so in Swift. First, if the list was passed into the function, the list is passed by value, because Swift arrays are value types. Next, the element variable in a for-in loop is itself a copy of the element in the list, not a reference to it [at least when the elements are value types such as structs]; adding var before element only makes that copy mutable. So unless you explicitly state otherwise, you are often working with a copy of an object and not the object itself.

Therefore, when you change the variable, your change is totally forgotten once you leave the loop. You have to explicitly write the element back to where you got it for any changes to stick. This can be tedious, but it does enforce good type-safe habits. All this caused me the greatest headaches: trying to make sure I was interacting with the objects the way I wanted to, the object itself and not a copy.
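
For what it is worth, the pattern that finally made changes stick for me looks roughly like this; the type and property names here are made up for illustration:

// With a struct element, the loop variable is a throwaway copy, so the
// mutation has to happen on the array element itself, by index.
struct Marker {
    var soundLevel: Int
}

var markers = [Marker(soundLevel: 1), Marker(soundLevel: 2)]

for index in markers.indices {
    markers[index].soundLevel = 127   // this change survives the loop
}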

Still, I feel a good bit of relief now, and I am hoping I can pull together at least one good user-testing session in the next couple of weeks. This weekend: a break from code, and work on my report.

To flirt with madness.

The verge of madness.

It is sort of funny: earlier this week I had an interview with Kurzweil Music. Although the interview went really well, we decided I probably was not the best fit for the position they had at this time. [They need someone who can program closer to the metal than I am used to.] Still, it was a good interview, and ironic because of one of the topics we talked about.

To program audio for a system is to flirt with insanity.

The topic came up because we made the observation that one simply cannot debug audio. With just about every other form of computing, if something bad is happening, one can freeze the system to figure out what is going wrong. But if you freeze a system running audio, the audio stops. So trying to figure out whether something is wrong with your programming truly becomes a task for a mad scientist. That is what happened this week.

First, I started the week totally excited and ready to go. I had all my classes set, the code looked good and… crash. Why? Well, I soon discovered the beautiful sample bank I had created for sound generation was too big. Specifically, it had too many sample files in it.

I assumed that every instance of Apple's sampler AudioUnit, AUSampler, would take care of the files I load into that instance. I was mistaken. Apparently, CoreAudio as an environment has a maximum limit on the number of samples all the instances of AUSampler can have open, which is really weird, because it means the AudioUnits are more closely coupled to the greater audio subsystem than one would be led to believe. After some experiments, I discovered that the limit seems to be around 250 [probably 256]. The problem was that over the summer I had crafted a near-perfect sample bank for one of the sounds I need for the project. I was forced to cut my beautiful sound bank from 160 samples down to just 8 [5% of what it was] so I could get enough copies loaded into memory. The new bank does not have the fidelity of the old one, but it does the job. That was the first blow.

However, after dealing with that problem and building a new sample bank, I ran into a new problem… silence.

For most of the week, I have been tearing my hair out trying to figure out why my new sample bank would not produce sound. What made it maddening was a tiny audio test function I wrote: the sound would play perfectly with the test function and not with anything else.

Initially, I thought it had something to do with the structure of my code. Specifically, my test function builds, plays, and destroys everything needed to play a sound, all within the scope of one function. Since I have to spread the audio elements over a couple of classes for the application proper, I thought the way Swift deals with referencing data might be causing the problem, because sometimes Swift passes data by value and other times by reference. So I thought maybe I was not actually playing the audio but just a copy of it. However, that rabbit hole ended in a dead end. The real culprit was something else.

I noticed one little difference between my test code and the project code, which led me to the answer. I trigger all my audio via MIDI commands to AUSampler. My test code triggered a sound using a velocity of 1, while the main project code was using the MIDI maximum velocity of 127. This led me back to look at the NEW sample bank I had created at the beginning of the week. For some reason, AUSampler had mangled the mapping for the sample bank, and in the end the bank would only respond to velocities of 15 or below.
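
For context, the check that finally exposed the problem was as simple as triggering the same note at two different velocities. A rough, self-contained version of that test looks like this; the bank path and note number are placeholders, and in the real project the sampler is already part of a larger graph:

import AVFoundation

let engine = AVAudioEngine()
let sampler = AVAudioUnitSampler()
engine.attach(sampler)
engine.connect(sampler, to: engine.mainMixerNode, format: nil)

do {
    // Load the suspect sample bank, then poke it at two velocities.
    let bankURL = URL(fileURLWithPath: "/path/to/TestBank.aupreset")
    try sampler.loadInstrument(at: bankURL)
    try engine.start()
    sampler.startNote(60, withVelocity: 1, onChannel: 0)     // audible
    sampler.startNote(60, withVelocity: 127, onChannel: 0)   // silent with the mangled bank
} catch {
    print("Audio test failed: \(error)")
}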

OK… I know that for the non-MIDI-literate a lot of that sounded like Greek, so let me explain it another way. I started out by building a Lamborghini. However, after discovering how much the insurance on the Lamborghini was, I cannibalized the Lamborghini and built a Toyota Yaris from the parts. But in building the Yaris, something went wrong, and now the Yaris can only drive in first gear.

After making a correct test bank late last night that responds to all the velocity levels [so I can drive past first gear], I realized all my code was working, so now I just have to finalize the new bank once more.

Despite the madness, there were two silver linings this week. The first is that in rebuilding the sample bank, I had to revisit a Python script I wrote this summer to create the sound files using SoX. Over the summer, I had documented that script so thoroughly and written it so robustly that I was able to use it to generate the new sample files I needed with one minor modification. SoX, although a great tool, is not the most intuitive command-line audio program, especially if you want to do more complex tasks. I think that was the first time I realized I had been able to re-use code I wrote in the past for a current practical task. It was nice to take some solace in that thought.

The second is that trying to chase down reasons the audio would not play forced me to take a good look at the code I had. Through that, I was able to refactor and simplify some of the code so it is easier to test and use, which I am sure will help going forward.

Let there be sound.

Joy! I finally got some audio working in SEAR-RL. I also have to say that deciding to use Apple's higher-level audio API [AVFoundation] was probably a wise choice over using CoreAudio directly. First of all, AVFoundation is just so much cleaner than the CoreAudio functions: AVFoundation is Obj-C/Swift based, versus CoreAudio, which is C based. It also seems that Apple has tried to simplify much of the process of setting up an audio processing graph with AVFoundation compared to CoreAudio.

However, I cannot say it was a perfectly smooth start, largely because of how I am handling the sound assets themselves. While it is true I could have used the raw sound files, being the audio guy I am, I wanted to put the sounds in a sample instrument and access them that way. The first task was building the sample instrument, which I did over the summer. Actually, creating the sample instrument was not any more difficult than using any other sampler. The problem was that I needed to move the instrument files from where they would reside for music applications into my programming project. Since I was using Apple's EXS24 sampler and the AUSampler AudioUnit, this was a little tricky, because EXS24 expects to store and read data from particular locations on the system. Apple does provide some help with their Technical Note TN2283, which details how to load instruments into an app using AUSampler; it was just less clear about how one exports the sample instrument from EXS24. But I did get it exported, and I was also pleased to discover that the instrument behaved pretty much how I designed it in EXS24. This was a minor concern, because the Technical Note reported that AUSampler does not support all the functionality EXS24 does.
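
For reference, the basic shape of the setup ends up being something like this with AVAudioEngine and AVAudioUnitSampler; the file name here is a placeholder, not the actual SEAR-RL instrument:

import AVFoundation

let engine = AVAudioEngine()
let sampler = AVAudioUnitSampler()

// Minimal processing graph: sampler -> main mixer -> hardware output.
engine.attach(sampler)
engine.connect(sampler, to: engine.mainMixerNode, format: nil)

do {
    // Load the exported sampler instrument bundled with the app,
    // start the engine, and trigger a note over MIDI.
    if let presetURL = Bundle.main.url(forResource: "MyInstrument", withExtension: "aupreset") {
        try sampler.loadInstrument(at: presetURL)
    }
    try engine.start()
    sampler.startNote(60, withVelocity: 100, onChannel: 0)
} catch {
    print("Audio setup failed: \(error)")
}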

The only minor disappointment is that, looking at the available AudioUnits on iOS, it seems Apple has removed the Speech AudioUnit in favor of having people use speech synthesis at the system level. This should not be a problem, but I must admit I was hoping to integrate the speech synthesizer into the audio processing graph along with everything else.
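
If I do end up going the system-level route, my understanding is that it looks roughly like this with AVSpeechSynthesizer, living outside the AVAudioEngine graph; the phrase is just an example:

import AVFoundation

// System-level speech synthesis: spoken output is handled by iOS itself,
// not routed through the app's audio processing graph.
let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "Obstacle ahead")
utterance.rate = AVSpeechUtteranceDefaultSpeechRate
synthesizer.speak(utterance)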

The best part: finally being able to hear sound from a project I am responsible for is a big relief.

October and the Technical Debt monster.

So October arrives, along with the Technical Debt Monster.

I was ready this week to finally jump into the audio system. Instead, the Technical Debt Monster decided to turn up. Fortunately, it wasn't as savage as ones I have seen on projects in the past; this one ended up being an annoying little yapping dog, as opposed to a rampaging elephant. Specifically, it showed up as I was trying to transfer my work on the scanning system from the Xcode playgrounds, where I had been developing it, into the actual project. In doing so I learned several things.

First, in terms of code execution speed, Xcode playgrounds and Swift's REPL are nowhere near as fast as running Swift code on the iPhone or even in the iOS simulators. I was surprised by this after many years of writing Python code in Python's REPL environments. I think the correct observation is that I took the speed of Python's REPLs for granted and just assumed all REPLs were similar. This actually ended up being a good thing, because it helped me realize that some multi-threading I had added to my code to speed things along in the Playground was not needed. However, just because my code worked in the Playground does not mean it was perfect inside the project.

Second, I am seriously beginning to think the second law of thermodynamics does not apply to me in terms of coding. In all the software development I have done, I have noticed I have a habit of over-designing code before I write anything. In general, this isn't a bad thing; I like being organized and not wasting effort. However, my final code solutions always seem to be significantly simpler than I expect them to be. Moving the scanning system over was no different. I realized I had too much redundant code, and seeing the code run in the final implementation helped me notice bugs that were not apparent in the Playground. For example, I had not properly followed the Model-View-Controller paradigm in the Playground, because I never created a proper controller class; I had just used the main loop as the controller. As a result, my functions were referencing parts of the model and views that were valid in the Playground, but not in a true application.

In the end, once I fixed all the bugs, I realized my code was much simpler than what I had envisioned. I know this because, wanting to properly document everything in the project, I decided to go back and update my original UML documentation for the designs. After making the changes, I noticed that the UML designs were much clearer.

The third thing I noticed this week was not so much a lesson as a realization. First, my idea to use a priority queue to keep track of the scanning points worked. Actually, I ended up creating a priority queue of priority queues. It seems a little weird, and it was a little confusing to design, but it does seem to be doing the job well. Second, in order to debug the scanning points, I had to write a simple scaling linear transformation to convert from the model data to the view. I know neither of these things seems particularly unique, but what got me was seeing something we learned as theory in school become relevant and practical. I don't know why, but seeing the linear transformation work in particular just seemed like magic. I think it is because, along with over-designing my code, I have an annoying need to test every line of code I write with something small and then work bigger [a good habit, but tedious at times]. With the set of points, I didn't do that; I just ran it and it worked. If there is a lesson, I think it is just the universe trying to tell me to have a little more faith in my code and all the theory I have learned in computer science.
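
For the curious, the transformation is nothing fancy; here is a sketch of the idea, with made-up frame and view sizes rather than the Structure Sensor's actual resolution:

import CoreGraphics

// Scale a point from depth-frame (model) coordinates into view coordinates.
func scale(_ point: CGPoint, from model: CGSize, to view: CGSize) -> CGPoint {
    return CGPoint(x: point.x * (view.width / model.width),
                   y: point.y * (view.height / model.height))
}

let viewPoint = scale(CGPoint(x: 160, y: 120),
                      from: CGSize(width: 320, height: 240),
                      to: CGSize(width: 640, height: 480))
// viewPoint is (320.0, 240.0)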

 

Finally, I think I am a little behind where I was hoping to be in the project. Looking back, I have to admit my scheduling was a little too ambitious. I think part of the reason is that when I was thinking through the schedule, I broke one of my own rules. Years ago I did an internship at KET, and my manager there taught me a really elegant rule for planning and scheduling that seems to work out more times than not. His rule: when calculating the schedule and supplies for a project, take whatever you calculate as fairly optimal and add a third to everything: time, resources, whatever. About 95% of the time, your totals will work out to exactly what is needed. In my anecdotal observations of projects, his idea seems to be right on the money. This semester I forgot to do that, for a variety of reasons, most of them just based in ambition. However, when I think of everything with the 1/3 rule taken into account, my scheduling does seem about right. That makes me think a friend's observation that I might be putting too much pressure on myself to get through it is correct. Fortunately, with October and autumn [my favorite month and season] arriving, I feel like I will probably make up any time I have lost. I always seem to work a lot better when the oppressive summer has finally broken.

Behold life on iOS 10.

My apologies if this report is a little late. I have been under the weather the past 24 hours and only now have a clear head [so additional apologies if this report is rough]. In general, a lot of little things happened this week.

First, I think my idea to scan the depth data along a spiral path is going to work. Before I got too far into the idea, I spoke to my friend in Australia, Titus Tang, who has done a lot of similar visualization work. He was able to offer some good feedback on my algorithm idea and on how to manage retrieving points from the depth field for analysis. Titus's biggest suggestion was that I rescan the closest points first on each consecutive pass. Fortunately, I think that can be handled using a priority queue, but I am still trying to refine a couple of details in my head before I put anything into code.
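
As a rough sketch of what I have in mind, and nothing close to final code: the queue just needs to hand back the nearest points first on each pass. A toy version, with a plain sorted array standing in for a real heap:

import CoreGraphics

// A scan point: where it sits in the depth frame, plus its last measured depth.
struct ScanPoint {
    let position: CGPoint
    var depth: Float   // meters from the sensor
}

// Closest-first queue. A real implementation would use a binary heap;
// keeping the array sorted is just the simplest thing that works here.
struct ClosestFirstQueue {
    private var points: [ScanPoint] = []

    mutating func insert(_ point: ScanPoint) {
        points.append(point)
        points.sort { $0.depth < $1.depth }   // nearest first
    }

    mutating func popClosest() -> ScanPoint? {
        return points.isEmpty ? nil : points.removeFirst()
    }
}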

Still, I feel like I have enough of the scanning system that I can move on to building the audio tree and linking it to each region. This is actually the scariest part for me. Audio programming is always a little nerve-racking for me; it is basically real-time programming. With graphics and most data programming, one is able to freeze the state of the machine to examine and debug it. One cannot actually do that with audio. Audio only exists in the context of passing time. You either code everything right and hear something, or you do not. Plus, since this will be the first time I use AVFoundation, I am not sure what to expect, even though it does seem it will be easier than using straight CoreAudio.

Finally, I moved my iPhone to iOS 10 this week for development. The transition actually was not all that painful, because I made the smart move of shifting development to Swift 3.0 and Xcode 8.0 a couple of weeks ago. When I recompiled for the new OS, I did not have any problems, which was a relief. Still, trying to develop for and on 'bleeding edge' technology is challenging. The main reason is that the documentation for much of iOS and Swift's API does not seem to keep up. For example, in trying to build the set of scanning points I mentioned above, I started using Grand Central Dispatch, iOS's concurrency technology. On the plus side, Apple greatly simplified the syntax for creating queues [high-level threads] in Swift 3.0. On the down side, there are just not many examples online of the correct usage of the new syntax, so I had to spend a bit of time cobbling together answers, and I am still not sure I am doing everything 100% correctly.
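
For anyone hunting for the same answers, the Swift 3 style I eventually settled on looks roughly like this; the queue label and the work inside it are placeholders:

import Dispatch

// Swift 3 GCD: create a background queue, do the heavy work off the main
// thread, then hop back to the main queue for anything UI-related.
let scanQueue = DispatchQueue(label: "com.example.scanQueue", qos: .userInitiated)

scanQueue.async {
    let result = (0..<1_000).reduce(0, +)   // stand-in for the real scanning work
    DispatchQueue.main.async {
        print("Scan pass finished with result \(result)")
    }
}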

In other tangentially related tasks, I also started writing a first draft of the final report. This is going to be an interesting report to write, because I am afraid I might be tempted to write too much. I have been working on this idea for so long that I constantly find myself having to exercise restraint, so I do not write my life's story and instead just focus on the project. I have also begun to apply for positions after school. It is always a little odd when I see positions describing projects similar to what I am working on now. I do apply to them, but I wonder whether I will stand out from all the other applicants.

Starting to feel like things are happening.

To be honest, I was a little excited this week. After a bunch of preliminary work, and 2-3 years of CS graduate school, I finally got to the point where I am working on what I consider one of the core parts of SEAR-RL. Specifically, most of this week I focused on an idea I had for a more efficient way of analyzing the 3D data from the Structure Sensor. If it works, then the running time of the algorithm should be less than O(n) [where n is the number of points coming from the scanner]. That in itself should help, since I want to make this a real-time application.

The most troublesome parts so far actually haven't had anything to do with the idea, but with how to debug it. Specifically, I have been trying to create a visual overlay on the data views I receive from the sensor, so I can see that the points I am scanning are the points I actually want to scan. To do this I have had to dive into iOS GUI code once again to work out how to present the data. However, the task has not been as ominous as I feared, because Apple's introduction of Xcode Playgrounds has been a significant help. Playgrounds are Apple's REPL-like system for experimenting with code. They have been a help because I have been able to quickly experiment with the iOS GUI and see the results. That let me focus on this specific part of my project and not worry about accidentally breaking another part. But even with Playgrounds, I have to say getting answers from Apple's documentation is not the best. I find myself having to search the interweb to cross-reference how to do something in Apple's documentation, which I feel I should not have to do.

Aside from that, yes, there are still a few little issues, but the fact that I will soon get to see whether this idea works has me excited. I just find it funny that everything from linear algebra to algorithms has been helpful in figuring this one problem out, so I guess coming back to graduate school was not a terrible idea.

Another new iPhone.

I think I finally finished the last of the preliminary development coding. The main issue standing in my way was figuring out a way to debug the project while testing. Occipital's Structure Sensor occupies the Lightning port when attached to the iPhone. This is a problem for development, because Xcode needs to be connected to the iDevice via the Lightning port in order to debug. Fortunately, Occipital realized this and added a class, STWirelessLog, which rebroadcasts debug messages over the wireless network. One just needs to run 'netcat' on a nearby machine to receive the messages. The problem for me is that STWirelessLog, as written, requires a hardcoded IP address to send the messages to. Since I am moving around on campus, my IP address is always changing, so I added some helper functions to let me change the receiving netcat machine on the fly.
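
For anyone with the same setup, the helper side of it amounts to something like the hypothetical sketch below. The UserDefaults key and address are placeholders, and the actual Structure SDK call is the Objective-C method +[STWirelessLog broadcastLogsToWirelessConsoleAtAddress:usingPort:error:]; I have left its Swift-bridged spelling out here since it depends on the SDK version:

import Foundation

// On the receiving Mac, listen with:  nc -lk 4999

// Keep the receiver's address somewhere easy to change without recompiling.
func currentLogHost() -> String {
    return UserDefaults.standard.string(forKey: "debugLogHost") ?? "192.168.1.10"
}

func startWirelessLogging() {
    let host = currentLogHost()
    let port: Int32 = 4999
    // Hand host and port to STWirelessLog's broadcast method here.
    print("Wireless logging will broadcast to \(host):\(port)")
}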

Similarly, another problem related to the Lightning port had to do with Apple's announcement of their new iPhone this week. The biggest news about the new iPhone was probably the fact that they are removing the headphone jack. So now, if you want to listen to music with headphones on future iPhones, you either have to use a Lightning-based pair of wired headphones or wireless headphones. This could be a problem for future-proofing my project. Since I need the one Lightning port for the Structure Sensor, I am immediately short a port for the audio. Belkin has announced they are releasing an adapter called “Lightning Audio + Charge RockStar™”, which initially seems like a Lightning port doubler. However, it remains to be seen whether it actually works that way, since it is advertised as a way to listen to music through one port while the device charges through the other. If it does not allow two Lightning devices to be used at the same time, then I am left with wireless headphones. I am skeptical about that approach, because most wireless headphones use some form of audio compression to pass music from the device to the headphones to save bandwidth, and I have not been impressed with the quality of the Bluetooth headphones I have listened to in the past. Since audio is such a critical aspect of this project, and people need to discern even the slightest differences in frequency, I am worried that Bluetooth headphones would degrade too much audio information.

Fortunately, I am safe for the moment since my current development platform is the iPhone 6 and it still has a traditional headphone jack. However, I probably should keep these issues in mind if I plan to do anything with this project after this semester.