AI Platform

Verbal Quiz Case Study

Architecting a low-latency, real-time verbal examination and voice evaluation platform. Built voice-to-text pipelines and automated scoring frameworks designed to respond instantly.

Client
Verbal Quiz
Services
Platform
Stack
React, FastAPI
Status
Active
Launch Live Platform
Verbal Quiz Case Study Visual

Traditional browser-based examinations rely on manual text inputs or multiple-choice interfaces. Verbal Quiz shifts this paradigm by executing full oral evaluations. The primary engineering hurdle lay in keeping voice transcription, semantic analysis, and follow-up audio queries below a natural 1.5-second conversational latency threshold.

GCAN resolved this by writing an asynchronous voice processing core running over FastAPI WebSockets. Audio input streams are chunked and dispatched to specialized inference endpoints, generating concurrent transcripts while downstream LLM rubrics prepare grading criteria and return text-to-speech payloads in parallel.

Audio Streaming

Compresses raw voice inputs locally using Opus codecs prior to transport.

FastAPI Sockets

Maintains persistent bi-directional connection channels for immediate response delivery.

Evaluation Core
Transcription output will render here...
NLP Analytics
Pronunciation 0%
Grammar Coherence 0%
Semantic Match 0%
Waiting for speech simulation to finish calculations...
Speed Verification
Socket Handshake
Audio frame streams are packetized and routed.
Whisper ASR Inference
Whisper neural network processes audio data to string formats.
LLM Semantic Review
Embeddings map correct variables against rubric criteria.
Text-to-Speech Generation
Synthesized question payload returned to page receiver.
Speech-to-Text (ASR) 0ms
NLP Grading Analytics 0ms
Text-to-Speech Synthesis 0ms

Platform Processing Flow

🎙️

Audio Capture

Captures user speech inputs through standard Web Audio API integrations.

WS Streaming

Streams WebM/Opus data packets immediately over secure Fast WebSockets.

🤖

Inference Grid

Processes transcripts concurrently alongside vector search embeddings.

🔊

Pre-Synthesized Output

Compiles answers and returns natural vocal audio prompts to the page.