Playground

A virtual journey through sound

Here at TheTin we run monthly Tinnovation sessions, a chance to explore and discuss new trends, tools and technologies. They often lead to internal projects, which give us a practical way to work with something new, free from the normal constraints of deadlines and able to take more risks than we would on client work. Our latest Tinnovation project, Band Explorer VR, is up and running – but how did we do it? Strategist Dave and Designer Daisy talk us through the process…

Project inception

Following on from a Tinnovation session focusing on VR, we knew we wanted to have a go at building something for ourselves, but there had to be a reason for doing so. The underlying goal was to produce something where the VR medium itself allowed for a solution with benefits over and above a more traditional 2D experience, rather than VR for its own sake. The end product should allow the user to do something faster, more easily or in a more engaging way.

We ended up going back to an idea we had first looked at several years ago. We had wanted to explore connections within music, to be able to jump from band to band via their members, the label, the producer, the genre, anything and everything. A bit like the BBC’s Comedy Connections, but for music. The only problem was the data set, and the suspicion that filtering it all might be a bit much for all but the biggest of music geeks.

We started looking at building a custom API on top of the fantastic MusicBrainz database, but we all agreed we should start somewhere simpler. We love Spotify and use it every day - it has a robust API and already has related artists mapped. It also has the 30-second samples of music we would need - a music exploration tool without music wouldn’t be much use!
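
To give a flavour of how little is needed, fetching an artist’s related artists is a single call to the Web API - something like the simplified sketch below, rather than our production code (the artist ID and access token are placeholders).

// A simplified sketch, not our production code: fetch the related artists
// for a given artist ID. The artistId and accessToken values are placeholders.
function getRelatedArtists(artistId, accessToken) {
  return fetch('https://api.spotify.com/v1/artists/' + artistId + '/related-artists', {
    headers: { 'Authorization': 'Bearer ' + accessToken }
  })
    .then(function (response) { return response.json(); })
    .then(function (data) { return data.artists; });
}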

Spotify even had a visual representation of this in the form of the Artist Explorer, a nested tree diagram, in their developer showcase. It was built by one of Spotify’s own developers to help show off the API, and all the code was available in a repo for forking.

So we had our API, but how were we going to build our VR experience?

Tech considerations

Initially we thought about using Unity, a mature platform with lots of built-in VR capabilities and cross-platform support from (almost) a single code base. It’s possible to code in C# or UnityScript (based on JavaScript), skills we have in house, but it’s not a tool we have extensive experience of, and it’s geared towards richer experiences than we had the resources to invest in.

We had previously built a prototype for a client using three.js, a JavaScript framework which allows for browser-based, WebGL-powered 3D experiences. It was certainly impressive, but not the simplest library to work with at times. It didn’t take long, however, to discover that things had moved on!

It took only five minutes of looking at A-Frame to realise we had found our solution. A-Frame is built on top of three.js, but instead of coding every object by hand it has lots of primitive shapes built in. You don’t even need to create a scene, camera or renderer, as these are set up automatically. It’s built around the traditional building blocks of a web page, DOM elements, extended with JavaScript, which makes it easier to keep the code and the layout separate. It incorporates an entity-component-system architecture which allows for easy extensibility. It also has a great community of developers actively working together to build an ever-expanding wealth of components and tools.
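
As a taste of what that looks like, here is a purely illustrative component (not one from our project) that slowly spins whatever entity it is attached to, plus an entity created straight from JavaScript.

// A purely illustrative A-Frame component: slowly spins its entity every frame.
AFRAME.registerComponent('slow-spin', {
  schema: { speed: { type: 'number', default: 10 } }, // degrees per second
  tick: function (time, timeDelta) {
    var rotation = this.el.getAttribute('rotation');
    rotation.y += this.data.speed * (timeDelta / 1000);
    this.el.setAttribute('rotation', rotation);
  }
});

// Entities are just DOM elements, so they can be created from JavaScript too.
var sphere = document.createElement('a-sphere');
sphere.setAttribute('color', '#E91E63');
sphere.setAttribute('position', '0 1.5 -3');
sphere.setAttribute('slow-spin', 'speed: 20');
document.querySelector('a-scene').appendChild(sphere);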

Backed by Mozilla, it was born with WebVR in mind, and would allow us to build for every major VR platform around (apart from the PSVR, come on Sony) and serve it straight from a web page.

So we had our API, and our framework. Time to get going.

Starting to code

The first step was to pull down the repo from the Spotify Developer Showcase and figure out how it worked. There was quite a lot of code we knew we could use, most notably the helpers for making calls to Spotify’s API and handling the login authentication. But there was also a lot of code we knew we would have to rewrite.
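
For anyone curious, the login those helpers wrap is Spotify’s Implicit Grant flow, which in stripped-down form looks something like this sketch (the client ID, redirect URI and scope are placeholders for your own app’s values).

// Stripped-down outline of Spotify's Implicit Grant flow, which the showcase helpers wrap.
// CLIENT_ID, REDIRECT_URI and SCOPES are placeholders for your own app's values.
var CLIENT_ID = 'your-client-id';
var REDIRECT_URI = 'https://example.com/callback';
var SCOPES = 'playlist-modify-public';

function login() {
  window.location = 'https://accounts.spotify.com/authorize' +
    '?client_id=' + encodeURIComponent(CLIENT_ID) +
    '&response_type=token' +
    '&redirect_uri=' + encodeURIComponent(REDIRECT_URI) +
    '&scope=' + encodeURIComponent(SCOPES);
}

// After logging in, Spotify redirects back with the token in the URL fragment.
function getTokenFromUrl() {
  var match = window.location.hash.match(/access_token=([^&]*)/);
  return match ? match[1] : null;
}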

The sample project uses another great JavaScript library, d3. It’s very powerful in itself, allowing data sets to be mapped to screen elements, with elements automatically added and updated as the data changes. But it’s not something I had used before, so at first it caused a bit of confusion. It was hard to see how everything tied together, and how I could adapt it for our needs.

At first I tried swapping the 2D elements for basic objects in a VR space. Not only was this tricky (I hadn’t figured out how to make them interactive yet, so you couldn’t actually do anything), it quickly became apparent that we wanted to represent the data in a different way.

The original example lets users expand an artist and see their related artists, which in turn can be selected, showing more related artists in an ever-expanding tree diagram. We wanted to show more artists from the start, but with direct connections only between the focused artist and their related artists. Related artists of related artists would all be grouped together, levelled according to the number of steps it would take to reach them, stretching further into the distance to give a preview of what might come next.
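
In code terms that grouping is essentially a breadth-first walk of the related-artist graph, level by level - something like the simplified sketch below (the real thing also handles sorting and caps the number of artists; getRelated here is an assumed helper).

// Illustrative sketch: group artists by how many "related artist" steps they are
// from the focused artist. getRelated(id) is assumed to return the already-fetched
// related artists for that id.
function groupByDistance(focusedArtist, getRelated, maxDepth) {
  var levels = [[focusedArtist]];
  var seen = {};
  seen[focusedArtist.id] = true;

  for (var depth = 1; depth <= maxDepth; depth++) {
    var next = [];
    levels[depth - 1].forEach(function (artist) {
      (getRelated(artist.id) || []).forEach(function (related) {
        if (!seen[related.id]) {
          seen[related.id] = true;
          next.push(related);
        }
      });
    });
    levels.push(next);
  }
  return levels; // levels[0] = the focused artist, levels[1] = direct relations, and so on
}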

The sample project also had a lot of functionality and on-screen elements we didn’t think we needed, which complicated refactoring it for our purposes. After a few days of trying to bend the original project into shape, I realised I would need my own helper to load, filter, sort and group the artists, so I set about creating an artist controller.

Once I had the artists logging nicely to the console, I set about getting something to actually appear in a VR scene. Having stripped everything back even further, d3 suddenly clicked. It didn’t take long at all before I had artists appearing in rows, automatically arranging themselves based on the sorting within the artist controller. It was immensely satisfying to have something working, and it was time for our design function to really get involved.
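
The pattern that clicked is d3’s data join, which works just as happily on A-Frame’s DOM elements as it does on SVG. Roughly speaking it looks like the simplified sketch below (not our actual code; the row and col values are assumed to come from the artist controller).

// Simplified sketch of binding artist data to A-Frame entities with d3 (v4 syntax).
// The row and col values are assumed to come from the artist controller's sorting and grouping.
function renderArtists(artists) {
  var nodes = d3.select('a-scene')
    .selectAll('a-image.artist')
    .data(artists, function (d) { return d.id; });

  nodes.enter()
    .append('a-image')
    .attr('class', 'artist')
    .merge(nodes)
    .attr('src', function (d) { return d.images.length ? d.images[0].url : ''; })
    .attr('position', function (d) { return d.col + ' ' + d.row + ' -4'; });

  nodes.exit().remove();
}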

Over to Daisy...

Designing for a new medium always has its challenges, but when it came to designing for VR there were a lot more questions than usual. Where to begin? What software should be used? What had already been done? And, even worse - would my designs actually make people literally throw up? As far as bad user experiences go, I don’t think there’s a worse outcome than that.

Design research

The first thing that needed to be done was some fairly extensive research into the topic - and thankfully, there’s plenty of material out there for the newbie to get started with. As we discussed in a recent Tinnovation session on design trends, aesthetic standards for VR are still being figured out, but there are a few established rules to follow. Around the same time that Cardboard was launched, Google released a handy set of guidelines covering the most basic principles. The key takeaways: always maintain head tracking, keep the user at a constant velocity when they’re moving through the app, avoid too many brightness changes, and anchor the user to their environment.

I also found some great UX/VR resources at uxofvr.com - Mike Alger’s video on his VR interaction manifesto was particularly useful in getting a better understanding of how to create an accessible UI. In addition to those general rules, we also had to consider what was achievable in the build. A-Frame looked fairly flexible in terms of what we could achieve design-wise for our artist explorer - as long as we weren’t creating hyper-realistic new universes, a simple graphic interface seemed ideal for what we wanted to achieve.

UX

Once I had a better understanding of what I was dealing with, it was time to start designing. But before we could leap headfirst into the design world of tomorrow, we had to think about the basic UX of the thing. Who would be using it? What did we want our main features to be? What would be the user’s primary journey? It turns out that even when designing for cutting-edge tech, it’s always helpful to start with something familiar - good old pen and paper. I drew up some user flows and some initial layout sketches, including an ideal interface based on familiar objects like vinyl records and their sleeves, which would be simplified down the line.

We knew that we wanted our app’s main function to be music discovery, and we had to determine the quickest path for the user to reach this goal. We realised that an introductory screen was needed, not only to establish the parameters of the experience but also to demonstrate how all the controls would function. Beyond that, we also wanted the user to be able to search for artists. This would normally require a traditional field input…but without a keyboard in the VR environment, how could this be done? It was here that we realised the opportunities available to us via audio input and voice recognition.
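
What we had in mind was the browser’s Web Speech API (which Chrome exposes behind a webkit prefix), feeding the recognised phrase straight into Spotify’s search endpoint - roughly along the lines of the sketch below, rather than finished code.

// Sketch of voice-driven artist search: the Web Speech API feeding Spotify's search endpoint.
// Chrome exposes the API behind a webkit prefix; support elsewhere varies.
var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;

function listenForArtist(accessToken, onArtistFound) {
  var recognition = new Recognition();
  recognition.lang = 'en-GB';
  recognition.onresult = function (event) {
    var query = event.results[0][0].transcript;
    fetch('https://api.spotify.com/v1/search?type=artist&q=' + encodeURIComponent(query), {
      headers: { 'Authorization': 'Bearer ' + accessToken }
    })
      .then(function (response) { return response.json(); })
      .then(function (data) { onArtistFound(data.artists.items[0]); });
  };
  recognition.start();
}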

Perhaps the most exciting feature was allowing the user to generate a playlist from their experience, so we also needed to come up with a simple VUI. Another key consideration was, of course, exactly what the surrounding environment would look like. We wanted to make use of the 360 space and created an interface that allowed the user to explore all around them, but kept the main focus straight ahead so as not to create too much confusion.

Eventually we came up with a wireframe of sorts for our experience - but we could only achieve so much of our vision with pen and paper. There was little point in creating an extensive mockup in Photoshop or Illustrator in the limited timeframe we had, especially when we weren’t sure what would work style-wise in the VR environment - so it was time to strap on the headsets and start testing.

Trial and error... and error...

It was through a pretty long process of trial and error that we figured out what would work in terms of the general look and feel of our experience. We kept in mind all the basics from our research - users don’t like spaces that are too bright, so we kept the colour palette dark. Floating text was a no, so we made sure to align any copy with objects in the scene. We wanted to make use of the depth of the VR space too, but still have the objects further out remain visible, which was a tricky problem to solve.

Eventually we came up with a way to display our artists in staggered rows that disappeared upwards and outwards, which avoided clashes in depth perception. The user was now surrounded by a network of musicians they could click on and shuffle around, and the more they clicked, the more artists would appear. With the experience potentially displaying hundreds of artists at a time, we kept the UI simple and uncluttered.

There were some limitations with A-Frame - some animated transitions had to be left out so that the experience could run smoothly. We also faced a few challenges in getting the audio to play in a way that made sense - we didn’t want the user to be turning their head and triggering sounds every second - so we decided to have the user click to play audio, avoiding any sound clashes. We added a simple media player UI too, allowing the user to skip between tracks, pause a track, and generate a playlist they could listen to in Spotify later on.

We were close to achieving the vision set out in our sketches, but it needed something more than just floating heads in a black space. We needed to make the environment a little more 'real' - adding a simple horizon with a recognisable landscape made a huge difference. A sky, a ground and a subtle gradient gave the space some character. The user was no longer staring into an empty black void, and was instead in an environment that felt at least a little familiar. Spotify’s own collection of artist images completed the interface.

Back to development – have you done your maths homework?

Thanks to the simple way in which A-Frame handles assets, it didn’t take long at all to incorporate all of ours into the project, giving us a functional interface. A-Frame also provides a wonderful inspector which can be used to tweak the positions of all the elements in real time, cutting out a great deal of trial and error.

We needed to find a way to translate our 2D grid of artists (albeit one sitting in a 3D space) into something with more depth, and have it wrap around the user. The solution was to map Cartesian coordinates to spherical ones. I found a very in-depth explanation and was initially overwhelmed, but thankfully the Wikipedia entry was more forgiving.

I plugged in the equation and sat back. I didn’t see what I was hoping to see, just a confusing mess. At first I wasn’t sure what was going on, but then I noticed something: the x, y and z used in the maths don’t map to the x, y and z that three.js (and hence A-Frame) uses. To translate: the maths’ y is three.js’s x, its z is y, and its x is z.

I swapped things round but it still wasn’t right, so I resorted to an old tactic: swap sin for cos, multiply by -1, convert radians to degrees, and generally just hack around to see what changes. And then it happened. We had originally conceived of all the artists appearing on the inside of a sphere, with the user at the centre. But somehow we were at the base, with everything spreading out like a theatre audience in front of a stage. Result!
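
For anyone wanting to try the same trick, the conversion ended up looking roughly like the sketch below (simplified, and remember that in three.js and A-Frame y points up and negative z is in front of the camera).

// Simplified sketch: place an item on the inside of a sphere of radius r.
// theta sweeps around the user (left/right), phi tilts up from the horizon.
// In three.js / A-Frame, y is up and negative z is in front of the camera.
function sphericalToPosition(r, thetaDegrees, phiDegrees) {
  var theta = THREE.Math.degToRad(thetaDegrees);
  var phi = THREE.Math.degToRad(phiDegrees);
  return {
    x: r * Math.cos(phi) * Math.sin(theta),
    y: r * Math.sin(phi),
    z: -r * Math.cos(phi) * Math.cos(theta)
  };
}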

So we had our layout and our interface. Audio was playing, playlist saving was functional, and it all worked rather well in a desktop browser. But this was a VR project, and whilst it ran fine on powerful desktop hardware, we had always wanted it to be a mobile experience too.
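
Playlist saving, for the curious, is just two calls to the Web API: create a playlist, then add the track URIs the user has collected. A simplified sketch (the playlist name is illustrative; the user ID and token come from the login flow):

// Simplified sketch: save the collected tracks as a playlist on the user's account.
// userId and accessToken come from the login flow; trackUris is an array of
// strings like 'spotify:track:...'. The playlist name is illustrative.
function savePlaylist(userId, accessToken, trackUris) {
  var headers = {
    'Authorization': 'Bearer ' + accessToken,
    'Content-Type': 'application/json'
  };
  return fetch('https://api.spotify.com/v1/users/' + userId + '/playlists', {
    method: 'POST',
    headers: headers,
    body: JSON.stringify({ name: 'Band Explorer VR' })
  })
    .then(function (response) { return response.json(); })
    .then(function (playlist) {
      return fetch('https://api.spotify.com/v1/users/' + userId + '/playlists/' + playlist.id + '/tracks', {
        method: 'POST',
        headers: headers,
        body: JSON.stringify({ uris: trackUris })
      });
    });
}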

The complications of the bleeding edge

The amazing team working on A-Frame are exceedingly active, and each new version brings performance enhancements across the board, but browsers are updated less frequently. We were developing with beta versions of browsers, and the WebVR spec (which allows integration with VR headsets and hardware acceleration) had yet to be implemented. As such, performance was well below where we wanted it to be.

December was an exciting month. Oculus released a Developer Preview of their Carmel browser for the Samsung/Oculus Gear VR. Suddenly we had fantastic visual performance, but there was an issue - we had no sound. It took a while to figure out the problem. Carmel is based on the open source Chromium project, the base on which Chrome is also built. Spotify streams its audio as MP3, which is not an open source codec, and Chromium, unless extended, only plays WAV or OGG encoded audio. It simply couldn’t play the audio files we were using. So we had perfect visual fidelity, but no music!

At almost the same time, Google released a beta version of Chrome for Android with WebVR support, allowing us to use a Daydream headset and its controller. This gave us sound, but a rather pixelated display and a lot of lag - something you really don’t want if a VR experience is to be comfortable. Performance was also poor, with animations seeing the frame rate drop well below where we wanted it.

This forced us to really inspect our code and optimise everything as much as possible. A new release of A-Frame allowed us to revert to the built-in animation system (for a while I had been relying on the faithful GreenSock tweening libraries, as they seemed to run faster), but we still needed to tweak things further.

We found that loading our artist images sequentially, as opposed to in parallel, gave us a noticeable boost. Then we lowered the complexity of some of our 3D shapes and used the A-Frame inspector to optimise further.
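
The sequential loading itself is nothing clever, just a promise chain that waits for each image before requesting the next - a simplified sketch:

// Simplified sketch: load artist images one at a time instead of all at once.
function loadImagesSequentially(urls) {
  return urls.reduce(function (chain, url) {
    return chain.then(function () {
      return new Promise(function (resolve) {
        var img = new Image();
        img.onload = resolve;
        img.onerror = resolve; // don't let one bad image stall the rest
        img.src = url;
      });
    });
  }, Promise.resolve());
}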

And then, in the first week of February, Chrome 56 for Android came out of beta. The impact was huge. Performance increased dramatically. The lag was greatly reduced, the image was far less pixelated, and everything just worked.

So how did we do?

We are delighted with where we have got to: a stable build, running in publicly accessible browsers, which anyone can use. There are still issues to overcome, but we have learned so much.

But does it meet our original goal - to produce something in VR which works better than its 2D counterpart?

We believe it does.

Of course our solution differs from the original project in many ways. For starters, the original represented the artists, their relationships and their groupings differently. But we find ours makes it simpler to see large volumes of artists at once and to focus on what matters: the new artists you discover as you explore. At any given time you can quickly see your journey, how you got from your starting artist to your current one, and you can move forwards and backwards through the experience without getting lost. It’s also more immersive and engaging, with a richer visual representation and positional audio.

We did have to drop some features. We wanted to enable voice control, but due to a “feature” of Chrome this would always have to be preceded by a button press, as it isn’t possible to have the microphone constantly listening for commands. But overall we achieved pretty much everything we set out to do.

Hopefully Carmel will support MP3 audio soon and we will be able to bring the full experience to the Gear VR. We will integrate new versions of A-Frame as they are released, whilst continuing to make optimisations. Maybe we will plug in some more data sets too, adding more of those relationships we had originally envisaged.

Look out for more developments in the future - but for now, please enjoy Band Explorer VR.

TheTin is a brand & technology agency. Visit us at thetin.net
