At a glance

Vocal Mind

An AI-guided voice and speech analysis tool that assists mental health practitioners in diagnosis during sessions.

Selected as 1 of 9 teams to pitch to Cornell Tech faculty and partners

Vocal Mind provides a seamless solution that fits into the existing clinical workflow. First, AI extracts voice patterns such as rhythm, emotion, brevity, and volume of speech. Then, doctors receive additional, objective information about each session. The analysis is stored in a record database, allowing doctors and patients to track progress.
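For illustration, here is a minimal sketch of how simple voice patterns such as volume, brevity, and pauses could be estimated from a session recording. It assumes a single WAV file and the open-source librosa library; the function name, silence threshold, and file name are examples for this sketch, not Vocal Mind's actual pipeline.

```python
# A rough sketch of the kind of voice-feature extraction described above, assuming a
# single WAV recording and the open-source librosa library. Names, thresholds, and
# the file path are illustrative, not Vocal Mind's actual pipeline.
import librosa

def extract_voice_features(path: str) -> dict:
    """Estimate simple proxies for volume, brevity, and rhythm of speech."""
    y, sr = librosa.load(path, sr=None)           # load audio at its native sample rate
    duration = len(y) / sr

    # Volume: mean root-mean-square energy over the recording.
    mean_volume = float(librosa.feature.rms(y=y).mean())

    # Rhythm / brevity proxies: split the signal into voiced intervals vs. silence.
    voiced = librosa.effects.split(y, top_db=30)  # non-silent intervals, in samples
    speech_time = sum(end - start for start, end in voiced) / sr
    pause_ratio = 1.0 - speech_time / duration if duration else 0.0

    return {
        "duration_sec": round(duration, 2),
        "mean_volume_rms": mean_volume,
        "speech_time_sec": round(speech_time, 2),
        "pause_ratio": round(pause_ratio, 2),
    }

print(extract_voice_features("session_clip.wav"))  # hypothetical recording
```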

Info
Role

UX Researcher

Team

2 Engineers

1 Product Manager

1 ML Engineer

Timeline

4 months
Aug - Dec 2023

Mentors

Adam Yormark

Kiyan Rajabi

Problem

An accurate mental health diagnosis can take anywhere from weeks to years to determine.

On the patient side, people face difficulty articulating their emotions, which slows or prevents timely diagnosis. On the medical provider side, variability in diagnostic criteria and limitations of subjective, standardized screening tools can lead to delayed or misdiagnosis.

Solution

Mental health diagnosis tool leveraging AI voice and speech analysis for timely diagnosis.

Vocal Mind leverages AI to analyze patient voice and speech patterns to enable timely, accurate diagnosis of mental health disorders.

Discovery

Industry Research

How might tech be leveraged in the healthcare industry?

Transcarent partnered with Cornell Tech to tackle this challenge. We researched the stakeholder ecosystem, current patient experience, and technological advancements to get a clear picture of the systems within the healthcare industry.

I led the user flow outline for a patient seeking treatment in today's healthcare industry. We noticed that technology integration had increased since the pandemic, from online consultations to prescription delivery.

We combined our industry and stakeholder research to visually represent the processes, the players, and their positive and negative sentiments.

Interviews

I spearheaded the creation of our interview guides: one for medical providers and the other for people who have sought medical care. It was essential to understand their unique experiences and identify the interconnected issues.

“Providers are interested in generative AI, but the research is very new. AI is already used in diagnostics and pathology, classification tasks.”

Medical Student at Weill Cornell Medicine

5 Doctors

5 Patients

2 Med Students

Key Insight

Patients have difficulty articulating their emotions, which slows or prevents timely diagnosis.

How might we create a tool that helps patients effectively express their emotions, enabling quicker and more accurate diagnoses?

Key Insight

Variability in mental health diagnostics criteria and limitations of subjective screening tools can lead to delayed or misdiagnosis.

How might we develop more consistent and objective diagnostic tools that reduce variability in mental health assessments?

Defining the Problem

How might we…

Improve the mental health diagnosis process by utilizing an AI digital tool?

By 2025, 90% of hospitals will use AI-driven technology for remote patient monitoring and early diagnostics. 

Our idea was fueled by the pain points in the diagnosis process that could benefit from technological enhancements and the healthcare industry's current openness to adopting technology.

Risky Assumption Testing

Backing Our Idea

I collaborated with the engineers and product manager to determine the product's technological feasibility and key desirability through our risky assumption tests.

RISKY ASSUMPTION 1

The way people speak is affected by their mental state.

Ten people were enlisted to complete a week-long experiment in which they filled out a mood tracker and sent audio recordings about their day. Our questionnaire was modeled on the clinical guidelines for mental health diagnosis. We interpreted the recordings using an existing speech behavior analysis API by Humane AI.

Clinical guideline for mental healthcare diagnostics

Our questionnaire that we sent to participants

Participant's mood tracker answers

Voice Sentiment Analysis AI API

We found a significant correlation between the participants’ mood tracker inputs and the API’s analysis.

Day 1: “Confident and focused”

Day 2: “Tired, bit stressed, and focused”

Day 3: “Felt confident”

We identified consistency in the AI’s insights throughout the experiment.
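
To make that comparison concrete, here is a minimal sketch of how the correlation could be checked, assuming the mood tracker ratings and the API's per-recording sentiment scores were exported to a single CSV file. The file name and column names are hypothetical; this is not the exact analysis we ran.

```python
# A minimal sketch of checking self-reported mood against API-derived sentiment,
# assuming both were exported per participant-day to a CSV with columns
# "mood_score" (from the tracker) and "api_sentiment" (from the speech analysis API).
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("mood_vs_api.csv")   # hypothetical export of the study data

r, p_value = pearsonr(df["mood_score"], df["api_sentiment"])
print(f"Pearson r = {r:.2f}, p = {p_value:.3f}")
# A strong positive r with a small p-value would support the assumption that
# the way people speak reflects their reported mental state.
```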

Key Takeaway

We found a useful correlation between participants’ mental health states and patterns in their audio recordings.

RISKY ASSUMPTION 2

People speak as they normally would when being recorded.

We compared interviewees’ answers and mannerisms while they initially spoke unrecorded and after we began recording the conversation. Participants were randomly selected, were unaware of the purpose of the research, and did not know ahead of time that we would ask to record the conversation halfway through.

80% of participants showed no significant voice change whether or not the conversation was recorded.

75% of participants took longer pauses as the conversation continued, which we took as an indicator that they felt more comfortable the more they spoke.
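
The pause observation could also be checked programmatically. Below is a minimal sketch that compares the average silent-gap length in the first and second halves of a recording using librosa; the silence threshold, helper name, and file name are assumptions for illustration, not what we actually used.

```python
# A minimal sketch of measuring whether pauses grow longer over a conversation,
# assuming a single WAV recording per interview. Names and thresholds are illustrative.
import librosa

def mean_pause_length(y, sr, top_db=30):
    """Average silent-gap length, in seconds, between voiced intervals."""
    voiced = librosa.effects.split(y, top_db=top_db)
    gaps = [(voiced[i + 1][0] - voiced[i][1]) / sr for i in range(len(voiced) - 1)]
    return sum(gaps) / len(gaps) if gaps else 0.0

y, sr = librosa.load("interview.wav", sr=None)    # hypothetical recording
half = len(y) // 2

first = mean_pause_length(y[:half], sr)
second = mean_pause_length(y[half:], sr)
print(f"Mean pause: first half {first:.2f}s, second half {second:.2f}s")
# Longer pauses later in the conversation would match the comfort signal we observed.
```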

Key Takeaway

Even when the conversation is being recorded, people still speak normally.

Solution

Vocal Mind

An AI-guided voice and speech analysis tool that assists mental health practitioners in diagnosis during sessions.

User journey of a patient and practitioner leveraging Vocal Mind during their consultation session

Reflection

Each Piece of the Puzzle Matters to the Whole

I thoroughly loved collaborating with my team as each of our unique perspectives was valued throughout the development process. We were able to thoughtfully consider the user's needs, the tech needed to make our solution feasible, and the roadmap for our product launch. This genuine involvement and curiosity led to a meaningful solution, and we were selected as 1 of 9 teams from a cohort of 80 groups to pitch to Cornell Tech faculty and partners.

Let's have a conversation! Get in touch ◡̈
