PRODUCT, UX, & AI EVALUATION
KARL A. NEUMANN
Will people actually follow the experience you plan for them? I evaluate products against how real people think, decide, & act, before you spend months building in the wrong direction.
Previously: led UXR for a mental health app from 0→100K active users. Employee #8 at Memora Health from Seed→Series B.
Published on AI evaluation in BMJ Innovations.
Work
"Don't think, but look!"
– Ludwig Wittgenstein
Recent Engagements
Embedded UX leadership and product strategy for a scaling proptech company. I redesigned their onboarding and streamlined their core experience, while providing competitive intelligence, product marketing, hiring, & UX writing support.
Embedded | Evaluative | Proptech
UX evaluation of their platform, candidate assessments, and onboarding. Surfaced usability issues that reduce efficiency and cause user churn. Delivered executive briefs and product strategy workshops that shape their product roadmap.
Project-based | Strategy | Multicultural
Product advisory for a platform facilitating court evidence in custody and family law contexts. Engagement covers UX evaluation of desktop & iOS app development, QA of key user flows, and strategic direction for branding and product marketing.
Advisory | End-to-end | Legal Tech
Case Studies
Statistical analyses of >700 patients investigated patient engagement, satisfaction, & accuracy related to patients’ race & socioeconomic status (SES). Answered: "How might Memora Health quantify AI bias?"
Quantitative | Evaluative | Health Equity
Semi-structured interviews explored clinicians' & care teams' workflows, expectations, & needs. This telehealth system connected rural ER patients & remote psychiatrists.
Qualitative | Generative | Lean UX
Product strategy & 8-week research timeline to design and refine a mobile app for Crohn's patients to monitor key gut biomarkers. Methods include interviews, focus groups, & unmoderated tree testing.
Mixed Methods | Service Design
How I Work
Every engagement begins by understanding what you have and what our open questions are
01
Evaluate
Your product, designs, and analytics hold clues about what's working and what isn't. I identify friction, gaps, and assumptions, then we iterate to improve the experience before testing with real people.
02
Test
How I test depends on what stage you are at: stakeholder interviews, usability studies, in-app experiments, or analysis of usage patterns. We find exactly where the experience holds up and where it breaks.
03
Integrate
You get data-driven guidance tied to specific decisions via stakeholder readouts, team workshops, executive summaries, or strategy sessions. Then we move to the next priority or cycle back to step 1 with sharper questions.
Writing
"There is no great writing, only great rewriting."
– Justice Louis Brandeis
Co-published with UPenn researchers in BMJ Innovations on methodology for evaluating healthcare AI chatbots. Established processes for proactive training and safety validation across patient interactions.
AI Eval | Peer-Reviewed Publication
Established a four-pillar framework for evaluating digital health equity: accessibility, literacy, demographics, and identity. Includes analyses of AI performance across 1,400+ patients in a UPenn postpartum program.
Technical Writing | Public Health Tech
Explored ethics of machine intelligence via a popular TV show, arguing that how we treat AI reflects power dynamics rather than principled ethics. Written before AI was a mainstream concern.
AI Ethics | Public Writing
And now, a poem
Behind the Scenes:
Every artificial object secretly saturated with intent, its subjects in mind, by design
Each item that comprises life involved countless meetings to (re)invent, iteratively built, by design
Everything engineered to fix problems we are now likely to prevent, taken for granted, by design.
June 9th, 2018
by Karl
About
"We can only see a short distance ahead, but we can see plenty there that needs to be done."
– Alan Turing

I run Desire Path Research, an independent UX research and AI evaluation consultancy. I work with founders & product leaders who need to understand how real people will experience what they've built before hiring a full UX org.
My background is in research. I began by studying the sense of smell at the University of Chicago, the neuroscience of music and pain psychology at McGill, and then the placebo effect during an internship at the NIH. After graduation, I spent two years in clinical research at a biopharma startup before moving into user research and tech.
In 2019, I started as UX researcher and employee #8 at Memora Health, scaling with the company from Seed through Series B. While there, I co-published research on AI evaluation methodology with UPenn in BMJ Innovations, which shaped how I think about what it means to evaluate language-based systems rigorously. I then consulted with ZS Associates on digital health products for Fortune 500 clients.
Before going independent, I led UX research at Kooth Digital Health, where I worked on their app Soluna. I joined before launch and saw the product scale and help >100K 13- to 25-year-olds with their mental health. I was also part of their global AI ethics committee and designed LLM evaluation frameworks built around clinical review panels.
The common thread across all of my work is a focus on whether the experiences we build actually hold up when real people try them, and what rigorous research can tell us about how we can improve their designs.
Outside of work, most of my time revolves around music: listening, producing, and playing instruments. Otherwise, you might find me tending to my plants, making coffee, or somewhere along Lake Michigan.