Vocalio

Specialized pedagogical tool to improve the user experience of online voice lessons

by Dan Smith

October–December 2020

Background

Voice lessons provide students a creative outlet, artistic growth, and the satisfaction of self-improvement in a supportive, collaborative environment
Lessons cost anywhere from $50 to $200 per hour
Since COVID-19, nearly all voice instructors have shifted to online lessons via Skype/Zoom

Contributions

User research (interviews, contextual inquiry, task analysis, usability testing)
Solution ideation, feature design
Wireframes, interface design, interaction design, high-fidelity prototype

Timeline

This project lasted for about 10 weeks, from October 2020 to December 2020.

Problem

Singing instructors who teach over Skype or Zoom sometimes find it difficult to match the effectiveness of teaching in person.

Comparing an in-person lesson with an online one highlights the core issue, audio lag: sound sent over a video call arrives late, so instructor and student cannot stay in sync.

User needs

Singing teachers who teach online need a way to:

Provide students live accompaniment for exercises they prescribe during lessons
Improve their effectiveness as vocal technicians
Maintain control of the flow and pace of each lesson
Focus more on students and less on technology

Solution

The proposed solution is a responsive web app for voice instructors. It relies on a specific browser API that lets an instructor quickly modify an accompaniment pattern and send it to a student's device, so the accompaniment plays on the student's end. Instructors can also connect a MIDI keyboard to the application, which keeps the workflow seamless and makes the lesson more effective.

I let research drive the project, which kept the focus on users' needs and minimized my personal biases.

Link to interactive prototype →

Link to Figma board →

As the demo shows, an online lesson that uses the solution is much closer to an in-person lesson than before. It addresses all four needs listed above: live accompaniment, increased effectiveness, instructor control of lesson flow, and increased focus on the student.

Target audience

The target audience is quite niche since the problem is highly specific. Voice instructors who teach online lessons are the primary audience, and their students are the secondary audience.

Process

For my process, I visualized each research method as a landmark, with its output informing the subsequent research phase. This enabled me to pivot quickly and easily.

Research

As mentioned, research was continual, with each phase informing the next.

Specific research methods included:

Semi-structured remote user interviews (30 minutes)

3 instructors and 3 students
Used sacrificial concepts for both user types
For instructors, proposed process idea (capturing pattern on MIDI keyboard, transmitting to student's interface, modulating pre-captured pattern); seemed viable and solved for audio lag issues
For students, proposed interface visual idea via video recording, showing an example from GarageBand
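The instructor-side idea (capture a pattern once, modulate it, transmit it) can be sketched in a few lines. This is only an illustration under stated assumptions, not the prototype's implementation: the `Note` type, the `transpose` and `to_message` helpers, and the JSON message shape are all hypothetical, and "modulating" is read here as shifting the pattern by semitones. Because the student's device receives note data rather than audio, playback would happen on the student's end.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Note:
    pitch: int       # MIDI note number, 0-127 (60 = middle C)
    start: float     # onset in beats from the start of the pattern
    duration: float  # length in beats


def transpose(pattern, semitones):
    """Return a copy of the pattern shifted by the given number of semitones."""
    shifted = []
    for note in pattern:
        pitch = note.pitch + semitones
        if not 0 <= pitch <= 127:
            raise ValueError(f"pitch {pitch} is outside the MIDI range 0-127")
        shifted.append(Note(pitch, note.start, note.duration))
    return shifted


def to_message(pattern, tempo_bpm):
    """Serialize a pattern as JSON (hypothetical format) for the student's device."""
    return json.dumps({"tempo_bpm": tempo_bpm, "notes": [asdict(n) for n in pattern]})


# A five-note scale fragment (C D E F G), standing in for a pattern
# captured once from the instructor's MIDI keyboard.
captured = [Note(60 + step, float(beat), 1.0) for beat, step in enumerate([0, 2, 4, 5, 7])]

# Shift the captured pattern up a half step and prepare it for sending.
up_a_half_step = transpose(captured, 1)
message = to_message(up_a_half_step, tempo_bpm=90)
```

The original capture is never modified, so the instructor could keep re-sending the same pattern in new keys without playing it again.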

Contextual inquiry (30 minutes)

Discussion then observation of 3 instructors and 3 students during online lessons

Task analysis (20 minutes)

More in-depth walkthrough with 2 instructors
Helped yield user requirements, user flows (e.g. modes), and wireframes

Usability testing (20 minutes)

Tested mid-fidelity wireframes with 3 instructors
Helped validate implementation of modes
Validated layout with mostly static toolbars and dynamic main content

Research findings

Students need accompaniment audio source to be on their end

Allows for better pitch, since piano over video calls can sound "wobbly"/distorted
Helps students improve ear training and pitch matching

Students exert excess mental effort (cognitive load) when they accompany themselves on the piano during exercises
Students generally see online lessons as a bit of a compromise; current perceived value is lower than in-person lessons

Instructors prefer to accompany students and collaborate more directly
Instructors would like to retain control of lesson flow and pace; some of that is lost in online lessons
Instructors currently rely on online lessons to stay in business and to expand their studios, reaching students who might not otherwise have quality local instruction
Some instructors aim to change public perception of online lessons and improve their perceived value and validity, which audio lag and technical issues sometimes undermine

Both user types need a solution that's easy to use, lightweight, and responsive to different devices

Determined that video calls would be an unnecessary feature for the first version

Concept evolution

Starting from early sketches, the design evolved and pivoted as research shed new light.

Initial concept: dashboard / single-page app layout

After task analysis, I pivoted to "modes" that provide distinct functionality, flexibility, and structure based on instructor needs

After usability testing the "modes" prototype, I moved forward with visual design and refined interaction design

The student's interface remained largely the same throughout the process, since, aside from live accompaniment, there were only a few basic (mostly read-only) needs

Design principles

1. Keep users informed of system status

Users should be aware of any relevant system mode adjustments with constant feedback and visibility

2. Design with aesthetics and minimalism in mind

Users shouldn’t feel overloaded with information or stimuli, especially since they’re already either singing or teaching

3. Make systems flexible so novices and experts can choose to do more or less on them

Users of varying levels of musical experience should be able to customize their experience with features and settings relevant to them

Other considerations

Progressive disclosure

Used on main view for setting parameters via dropdown menus, modals

Staged disclosure

Used in setting pitch range for an auto mode exercise set
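To make the pitch range setting concrete, here is a minimal sketch of how an auto mode might expand a chosen range into a sequence of keys. It assumes, as is common in vocal warm-ups but not stated in the research, that the exercise climbs by semitones to the top of the range and then comes back down. The function name and parameters are hypothetical.

```python
def transposition_steps(low, high, base, descend=True):
    """Semitone offsets that move a pattern's starting note from low up to high.

    low, high: lowest and highest starting pitches as MIDI note numbers.
    base: the starting pitch of the pattern as originally captured.
    descend: if True, walk back down to low after reaching high.
    """
    if not 0 <= low <= high <= 127:
        raise ValueError("expected 0 <= low <= high <= 127")
    ascending = [pitch - base for pitch in range(low, high + 1)]
    if descend:
        # Reverse the climb, skipping the top step so it is not repeated.
        return ascending + ascending[-2::-1]
    return ascending


# Pattern captured on middle C (60); range set from C (60) to E (64).
steps = transposition_steps(low=60, high=64, base=60)
```

Each offset in `steps` would be applied to the captured pattern in turn, so the instructor sets the range once instead of transposing by hand.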

Responsive enabling

For features and parameters that depend on another condition, namely an active mode

Vertical positioning of content for each user's context

An instructor's eyes move between the piano and the application interface very often, so more frequently used content is placed toward the bottom of the screen to reduce the physical distance between the two targets
The student's priority is physical alignment and posture, so content was positioned toward the top of the screen to help minimize neck constriction

Outcome

The end result was a high-fidelity clickable prototype and concept, showing how to eliminate some of the issues that were initially present with online voice lessons. Many of my assumptions were validated, and some were proven false, which helped direct the project toward what the end user ultimately needs.

Challenges

Understanding and simplifying user needs into a streamlined, yet flexible process
Recruiting was sometimes challenging, since it meant scheduling time to speak with busy instructors
Technical challenges: learned that there will be limitations in how this can be implemented as a web app (e.g. only usable with specific browsers)

Reflection

There's a tremendous need for a solution to this problem
There's value in leveraging technology to be as present as possible, even when being physically present isn't possible

Next steps

Build out / code MVP
Additional usability testing and design iteration
Validate benefit and effectiveness of the tool
Prioritize additional features:

Record entire lesson of exercises for later practice
Denote consonants and vowels (with phonetic spelling or IPA) for each exercise