Hello Gen AI,
I have deployed multiple voice AI solutions, and one thing that consistently comes up is that the prompts work great for shorter conversations, say 12-15 turns (roughly 3 minutes).
But as the conversation lengthens (I am specifically referring to GPT-4o), the conversational context starts to fade. Let's say it's not a local LLM or a small model, which could have latency advantages, so I have to stick to a single prompt.
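For context, here is a minimal sketch of the kind of setup I mean (Python, assuming the standard OpenAI chat completions API; the system prompt and turn handling are simplified placeholders, not my production code):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Single system prompt carrying all behavioural instructions (placeholder text).
SYSTEM_PROMPT = "You are a voice assistant. Keep replies short and on-topic."

messages = [{"role": "system", "content": SYSTEM_PROMPT}]

def handle_turn(user_utterance: str) -> str:
    """Append the user's turn, call GPT-4o, and keep the full history.

    Works fine for ~12-15 turns; beyond that the instructions in the
    system prompt seem to lose their influence over the replies.
    """
    messages.append({"role": "user", "content": user_utterance})
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
```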
What's your strategy for a reliable and predictable deployment?