An AI-guided voice and speech analysis tool that assists mental health practitioners during diagnosis.
At a glance
An accurate mental health diagnosis can take anywhere from weeks to years to reach. Vocal Mind offers a solution that fits seamlessly into the clinical workflow. First, AI extracts voice patterns such as rhythm, emotion, brevity, and volume of speech. Doctors then receive additional objective information about each session. The analysis is stored in a record database, allowing doctors and patients to track progress.
Selected as 1 of 9 teams to pitch to Cornell Tech partners
Team
1 Product designer (self)
2 Developers
1 Product manager
1 LLM
Timeline
4 months
Area
MVP validation
Discovery
How is tech currently integrated into healthcare processes?
Transcarent partnered with Cornell Tech to explore how tech could be leveraged in the healthcare industry. We researched the stakeholder ecosystem, the current patient experience, and technological advancements to get a clear picture of the industry's complex systems and discover a key area of opportunity.
System Mapping
I outlined the journey of a patient seeking treatment in today's healthcare industry. We noticed that technology integration had increased since the pandemic, from online consultations to prescription delivery.
Stakeholder Mapping
We combined our industry and stakeholder research to visually represent the processes, players, and their concerns.
User Interviews
Tech could help with early diagnostics, especially for mental healthcare.
I spearheaded the creation of our interview guides: one for medical providers and one for people who have sought medical care. It was essential to understand their unique experiences and identify the interconnected issues.
5
Doctors
5
Patients
2
Med Students
Key Insights
Variability in mental health diagnostic criteria and the limitations of subjective screening tools can lead to delayed diagnoses or misdiagnoses.
Patients have difficulty articulating their emotions, which slows or prevents timely diagnosis.
Market Research
The Design Challenge
How might we improve the mental health diagnosis process by utilizing an AI-driven digital tool?
Final Solution
Vocal Mind is an AI-guided voice and speech analysis tool that assists mental health practitioners with diagnosis during sessions.
User Journey
Validating the MVP
To validate our product's technological feasibility and desirability, we conducted risky assumption testing.
I collaborated with the engineers and product manager to identify and address the most uncertain, high-impact assumptions early in the process. This let us prioritize critical features and reduce the risk of building something that doesn't resonate with users or meet technical requirements, increasing the product's chances of success.
Assumption 1
The way people speak is affected by their mental state.
Ten people were recruited for a week-long experiment in which they filled out a daily mood tracker and sent audio recordings about their day. Our questionnaire was modeled on clinical guidelines for mental health diagnosis. We analyzed the recordings using an existing speech behavior analysis API by Humane AI.
Clinical guidelines for mental health diagnostics
The questionnaire we sent to participants
We found a significant and useful correlation between participants' mood tracker entries and the API's analysis.
Participant's Mood Tracker Answers
Voice Sentiment Analysis AI API
The AI's insights remained consistent throughout the experiment.
Day 1
“Confident and focused”
Day 2
“Tired, bit stressed, and focused”
Day 3
“Felt confident”
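As a rough illustration of this comparison, here is a minimal sketch of how daily mood tracker entries could be checked against per-recording sentiment scores from a voice analysis API. The numbers, the single valence score, and the Spearman check below are illustrative assumptions, not our actual data or analysis pipeline.

```python
# Minimal sketch (not the project's production code) of comparing
# self-reported mood against API-derived voice sentiment.
from scipy.stats import spearmanr

# Hypothetical daily self-reported mood scores (1 = very low, 5 = very good)
mood_tracker = [4, 3, 4, 2, 3, 5, 4]

# Hypothetical valence scores returned by a voice sentiment analysis API,
# one per daily recording, scaled 0-1
api_valence = [0.72, 0.55, 0.68, 0.34, 0.51, 0.83, 0.70]

# Rank correlation is a reasonable first check, since both scales are ordinal
rho, p_value = spearmanr(mood_tracker, api_valence)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```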
Assumption 2
People speak as they normally would when being recorded.
We compared interviewees' answers and mannerisms while they initially spoke unrecorded and after we began recording the conversation. Participants were randomly selected, were unaware of the purpose of the research, and did not know ahead of time that they would be asked to be recorded halfway through.
80% had no significant voice change with or without recording.
We inferred that the changes in speech patterns were minimal and would not drastically affect the accuracy of the analysis.
75% took longer pauses as the conversation continued.
We took these pauses as an indicator that participants felt more comfortable the more they spoke with us.
What I Learned
Collaboration is the key to success.
I thoroughly loved collaborating with my team as each of our unique perspectives was valued throughout the development process. We were able to thoughtfully consider the user's needs, the tech needed to make our solution feasible, and the roadmap for our product launch. This genuine involvement and curiosity led to a meaningful solution, and we were selected as 1 of 9 teams from a cohort of 80 groups to pitch to Cornell Tech faculty and partners.