— PRODUCT DESIGNER · END-TO-END · 2 WEEKS

Designing an

AI-Powered

survey experience

How I designed a human-in-the-loop AI system that increased study launch rates by 78%, making it easier for researchers to move fast without losing control or rigor.

+78%

LAUNCH CONVERSION

-46%

TIME TO LAUNCH

+67%

ANSWER RICHNESS

~80%

FASTER INSIGHTS

ROLE
Founding Product Designer
TEAM
1 Designer · 3 Engineers · 1 PM
CROSS-FUNCTIONAL
CEO · Development
DURATION
2 Weeks
AREAS OF OWNERSHIP
  • 01 Product strategy & experience reframing
  • 02 AI-native UX / human-in-the-loop interaction design
  • 03 End-to-end UI / UX design across the survey lifecycle
  • 04 Decision architecture & information hierarchy design
  • 05 Trust & transparency system design
  • 06 Responsible AI & guardrail definition
  • 07 Validation, failure-mode testing & behavioral metrics
  • 08 Systems thinking & AI product system design

00 — executive summary

Executive summary – a 90-second skim

PRODUCT
Satellica — AI Voice Surveys

AI-powered user research platform. I led the end-to-end design of an AI-moderated, voice-based survey workflow as founding product designer.

FOUNDING DESIGNER 2 WEEKS END-TO-END
USERS
Researchers

PMs, designers, marketers running studies

Participants

Respondents in voice-based sessions

PROBLEM

Many users created a study but did not launch, due to three compounding issues:

  • Low trust in AI-generated plans — users rewrote plans manually before approving anything.
  • High setup friction — too many decisions front-loaded, high drop-off before launch.
  • Shallow responses — open-ended text questions failed to surface context or emotion.
STRATEGY
"AI proposes, people decide."

Make setup a human-in-the-loop decision system. Increase transparency over configurability, improve intake signal quality, and use voice moderation to drive deeper participant responses.

SHIPPED
2 WEEKS
01 AI Intake
02 Plan Editor
03 Recruitment Config
04 Payment → Launch
SCOPE &
CONSTRAINTS

Shipped the core end-to-end flow in 2 weeks with a small team (3 engineers). Prioritized setup → launch conversion and trust-critical review moments. Supported async stakeholder review via export/share artifacts (download + share link), while deferring in-product collaboration (approval states, commenting, co-editing) to the next iteration.

IMPACT
PILOT FUNNEL
+78% SETUP → LAUNCH CONVERSION
-46% TIME-TO-LAUNCH
+67% ANSWER RICHNESS
~80% FASTER INSIGHTS

01 — background

The intersection of AI capability, user trust, and business growth

Satellica is an AI-powered user research platform that helps teams run interviews, surveys, and usability tests at scale. Using AI to moderate conversations, ask adaptive follow-up questions, and synthesize responses in real time, Satellica enables teams to generate insights in hours instead of weeks. 

I joined as the founding product designer for the AI Voice Survey lifecycle, owning the end-to-end experience across intake, study planning, recruitment, and payment-to-launch. The challenge was not simply automation, but designing decision-making with AI so researchers could move fast without losing control or rigor.

AI Intake

Conversational setup that structures intent naturally

Plan Editor

Review & control surface for AI-generated study plans

Recruitment

Decision-first configuration for participants

Payment

Streamlined checkout with clear next steps

Launch

Confident launch with guardrails and recovery paths

Target Users

Researchers PRIMARY

PMs, designers, and marketers who run studies.
  • Own the decision to launch.
  • Accountable for study quality.

Participants SECONDARY

End users answering surveys.
  • Response depth ultimately determines insight quality.

02 — problem

Many users could create a study, but never launched

Many users could create a study, but hesitated to launch due to low trust in AI-generated plans and high setup friction. I broke the problem into business, user, and system layers to guide the strategy and design decisions that follow.

📉

Business Problem

Activation · Trust · Speed
  • Low activation: users created studies but hesitated to launch because they did not trust AI-generated plans.
  • Setup friction: too many decisions and manual inputs increased drop-off.
  • Competitive pressure: speed alone was not enough; differentiation required quality + trust.
  • Slow stakeholder alignment: approvals from managers/clients delayed launches.
🔬

User Problem

Quality · Scale
Researchers
  • Shallow open-ended responses in text surveys forced interpretation work and lowered insight quality.
  • High effort to design strong guides required expertise and time.
  • Limited bandwidth to moderate at scale capped qualitative learning.
Participants
  • Open-ended questions are hard to answer deeply in text. Many participants need guidance, examples, and follow-ups to express context and emotions.
🔁

System Problem

Input · Decision Quality
  • Existing survey tools are largely one-directional Q&A flows. Even when analytics are AI-assisted, they still operate on shallow inputs.
  • We needed to improve input quality (what participants say) and decision quality (how studies are configured), not only analysis.

03 — how might we

design question

How might we help researchers generate deeper insights faster while keeping them in control of AI-driven decisions?

solution

  • Turned AI from a black box into a trusted decision partner by redesigning how intent is captured and how recommendations are reviewed.
  • Shifted research from manual workflows to scalable conversations using AI voice moderation to unlock deeper insights.
  • Moved the business from “AI curiosity” to real adoption by removing launch friction and supporting stakeholder approval workflows.

04 — goal

What success looks like — for the business and the user

I defined success through two lenses: business outcomes, including launch rate, time-to-value, and voice adoption, and experience outcomes, including trust, control, and response depth, using them to guide every design trade-off that followed.

Business goals
Growth · Adoption
01
Increase study launch rate (setup → launch conversion)
02
Reduce time-to-value (time-to-launch)
03
Drive adoption of AI voice surveys as a differentiator
User goals
Researchers · Participants
Researchers
  • Create high-quality studies faster.
  • Understand and control AI decisions.
  • Scale qualitative research without extra headcount.
Participants
  • Make it easier to express detailed answers through guided voice conversations.

05 — approach

How I approached the problem 

I translated the goals into a human-in-the-loop workflow that reduces setup friction without sacrificing rigor, then validated the approach through interviews, usability tests, and early pilot signals to focus the design on the moments that most directly determine launch.

06 — ai system design

How I made this AI system reliable

I designed the system as a human-in-the-loop workflow with explicit decision rights, layered guardrails, and clear recovery paths, so the AI could accelerate setup without creating fragile or “black box” moments. I then pressure-tested the architecture with prototypes and early pilot usage, tuning the workflow for predictable outcomes under real constraints like model variability, latency, and cost.

DECISION RIGHTS MODEL (HUMAN-IN-THE-LOOP)
01

AI proposes

a study plan, question structure, and recommended defaults.

02

Human reviews

and edits key decisions.

03

System enforces

guardrails to prevent invalid configurations.

GUARDRAILS AND RECOVERY PATHS
GUARDRAILS
  • Flag goal–method mismatches.
  • Detect insufficient depth.
  • Trigger follow-up questions when critical info is missing.
RECOVERY
  • Reprompt or regenerate the plan.
  • Revert to a previous version.
  • Manual editing at any step.
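The guardrail logic above can be sketched as a simple validation pass over a draft plan. This is an illustrative sketch only: the names (`StudyPlan`, `check_guardrails`), the method labels, and the depth thresholds are hypothetical assumptions, not Satellica's actual implementation.

```python
# Hypothetical sketch of the guardrail pass described above.
# Thresholds and field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class StudyPlan:
    goal: str              # e.g. "deep qualitative insights"
    method: str            # e.g. "voice_interview" or "text_survey"
    participants: int
    session_minutes: int

def check_guardrails(plan: StudyPlan) -> list:
    """Return blocking issues; an empty list means the plan can proceed to review."""
    issues = []
    # Flag goal–method mismatches (depth goals need a conversational method).
    if "qualitative" in plan.goal and plan.method == "text_survey":
        issues.append("goal-method mismatch: qualitative depth needs voice moderation")
    # Detect insufficient depth: sessions too short or sample too small.
    if plan.session_minutes < 10:
        issues.append("insufficient depth: sessions under 10 minutes")
    if plan.participants < 5:
        issues.append("insufficient depth: fewer than 5 participants")
    return issues
```

Each issue maps to a recovery path in the UI: adjust the flagged constraint, accept a recommended default, or regenerate the plan.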
RESPONSIBLE AI CONSIDERATIONS (PRODUCT-LEVEL)
01

Transparency

Clear attribution of AI-generated content and what changed.

02

Bias risk

Avoid leading questions and overconfident conclusions.

03

Consent & privacy

Make voice recording expectations explicit and avoid unnecessary data capture.

04

Misuse prevention

Do not auto-generate conclusions beyond what the study design can support.

08 — how i validated

How I de-risked AI behaviour before launch

I validated both the user experience and the underlying AI behaviour by combining usability testing at trust checkpoints with targeted failure-mode scenarios, then iterating on prompts, guardrails, and recovery actions until users could confidently diagnose issues and move forward.


AI EVALUATION MODEL (LIGHTWEIGHT)
AI OUTPUT · QUALITY BAR · HOW I EVALUATED

01 Study plan draft
Quality bar: Complete structure, decision-ready, low bias risk, matches stated intent.
How I evaluated: Human rubric scoring in usability sessions, plus pilot review of edits and regenerate behavior.

02 Follow-up probes (voice moderation)
Quality bar: Non-leading, specific, increases depth without derailing.
How I evaluated: Failure-mode scripts (shallow / off-topic / silence), plus qualitative review of probe sequences.

03 System feedback and guardrails
Quality bar: Explains issues clearly, prevents invalid states, offers recovery paths.
How I evaluated: Scenario testing on contradictions and edge cases, plus usability observation of recovery success.
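The rubric scoring mentioned for plan drafts can be captured in a small aggregation helper. This is a minimal sketch under assumptions: the criterion names, 1–4 scale, and pass mark are illustrative, not the rubric actually used in sessions.

```python
# Illustrative rubric aggregation; criteria and thresholds are assumptions.
RUBRIC = ("complete_structure", "decision_ready", "low_bias_risk", "matches_intent")

def score_plan(ratings: dict, pass_mark: int = 3) -> dict:
    """Aggregate per-criterion ratings (1-4) into a pass/fail verdict.

    A plan passes only if every criterion clears the bar, so one weak
    dimension (e.g. bias risk) cannot be averaged away by strong ones.
    """
    missing = [c for c in RUBRIC if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    failed = [c for c in RUBRIC if ratings[c] < pass_mark]
    return {
        "mean": sum(ratings[c] for c in RUBRIC) / len(RUBRIC),
        "failed_criteria": failed,
        "pass": not failed,
    }
```

The per-criterion gate (rather than a simple average) mirrors the quality bar above: a draft that is decision-ready but biased still fails.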
Concrete Failure Stories (What Broke + How Users Recovered)

01 · Contradictory constraints (research intent vs feasibility)

What broke
Researchers asked for "deep qualitative insights" but set constraints that made it unrealistic — too short, too few participants, or conflicting method choices.
System did
Guardrails flagged the mismatch and asked for a focused trade-off (depth vs speed) before allowing a final plan.
Recovered
Users adjusted constraints or accepted recommended defaults, then re-reviewed the plan with clearer confidence.

02 · Shallow or off-topic participant answers (voice moderation)

What broke
Participants gave short answers or drifted off topic, which reduced insight quality.
System did
The moderator shifted to probing and pacing patterns (broad → specific, restate → ask for an example) and used gentle re-anchoring.
Recovered
Researchers could review the probe chain, regenerate a probe, or edit questions to better elicit depth in future sessions.
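The probing and re-anchoring behavior in failure story 02 can be sketched as a tiny decision function. This is an assumption-laden toy, not the production moderator: real classification of silence, drift, and shallowness would use the transcript and model signals, not word counts and keyword matching.

```python
# Toy sketch of the moderation pattern above (broad -> specific,
# supportive pause prompts, gentle re-anchoring). All heuristics here
# (keyword matching, 8-word shallowness threshold) are illustrative.

def next_probe(answer: str, topic_keywords: set) -> str:
    """Pick the next moderator move from the participant's last answer."""
    words = answer.lower().split()
    if not words:
        return "take_your_time"            # silence: supportive pause prompt
    if not any(w in topic_keywords for w in words):
        return "reanchor_to_topic"         # drift: gently restate the question
    if len(words) < 8:
        return "ask_for_specific_example"  # shallow: broad -> specific probe
    return "acknowledge_and_continue"
```

Keeping the move set small and inspectable is what lets researchers review the probe chain after a session and regenerate a single probe.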

09 — impact

How the work moved the metrics that matter

I tied the design decisions back to measurable outcomes across activation, trust, and research quality, using early pilot funnel signals to quantify what improved and where the remaining drop-offs were.


10 — takeaway

What I’d carry forward

I distilled the work into a set of reusable principles about designing with AI, plus what I would change next time to strengthen trust, quality, and adoption.


WHAT I LEARNED
01

Usability and trust are separate tracks in AI products.

Making something easy to use doesn’t automatically make it trustworthy. They require different design interventions — friction reduction vs. control and transparency.

02

Human-in-the-loop is an interaction contract.

Not just a philosophical principle — it needs to manifest as a clear UI contract: what the AI did, why, what you can change, and how to recover.

WHAT I GOT WRONG AND FIXED
GOT WRONG

Optimized for configurability too early, which caused decision fatigue.

FIXED

Moved to a decision-first hierarchy.

GOT WRONG

AI felt like a black box.

FIXED

Added review checkpoints and explicit controls.

GOT WRONG

Didn't design enough for failure.

FIXED

Added recovery paths and clearer error states.

WHAT I WOULD DO NEXT
01
Build an end-to-end quality measurement system tied to funnel outcomes.
02
Strengthen async collaboration into a full "comment → decision → version" loop.
THIS VERSION
Export/download + shareable artifacts for manager/client review.
NEXT ITERATION
Full stakeholder collaboration and approval workflows.
  • Commenting & @mentions — threaded comments on specific questions/sections, with decisions captured.
  • Suggested edits + change tracking — reviewers can propose edits with an audit trail.
  • Request approval — per section + overall plan (approve or request changes).
  • Approval status & audit trail — clear state (Draft → In review → Approved) + who/when.
  • Versioning — snapshot versions for each approval cycle, with diff/restore.
  • Success metric — reduce approval-cycle time and decrease "launch blocked by stakeholder review" drop-offs.
03
Improve participant pacing with turn-taking feedback — pause detection, supportive prompts, lightweight structure.