AI, Latency, and the Changing Face of User Expectations

TL;DR (AI generated):

While users currently tolerate slower AI response times thanks to the technology's novelty and usefulness, growing reliance on AI in daily tasks may revive expectations of instant results. This tolerance clashes with the trend towards increasingly complex, resource-heavy models that introduce delays. The future will likely hinge on balancing user demands with technical constraints as AI evolves.

For the longest time, usability experts have emphasised the importance of reducing latency in user interfaces. Jakob Nielsen’s article on response times famously underscores three critical reaction time thresholds:

  1. 0.1 second: Feels instantaneous
  2. 1 second: The delay is noticeable, but the user’s flow of thought stays uninterrupted
  3. 10 seconds: Causes users to switch their focus or multitask while waiting
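As a rough illustration, these thresholds can be expressed as a simple classifier. The function name and category labels below are my own shorthand, not Nielsen's terminology:

```python
def perceived_responsiveness(latency_s: float) -> str:
    """Map a response time in seconds to one of Nielsen's three bands."""
    if latency_s <= 0.1:
        return "instantaneous"      # no perceptible delay
    if latency_s <= 1.0:
        return "flow preserved"     # delay noticed, but thought process continues
    if latency_s <= 10.0:
        return "attention at risk"  # feedback needed to keep the user engaged
    return "attention lost"         # user will likely switch tasks


print(perceived_responsiveness(0.05))  # → instantaneous
print(perceived_responsiveness(4.2))   # → attention at risk
```

A chatbot that streams its first token within a second lands in the second band even if the full answer takes far longer, which is one reason streaming interfaces feel faster than they are.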

Traditionally, developers and designers have done their utmost to stay within these limits. However, with the rise of AI—particularly large language models capable of complex, “thinking” operations—delays are creeping closer to that 10-second mark. Paradoxically, users appear more tolerant of these extended waiting times. So why has latency suddenly become more acceptable?

It’s often the case that when a new technology emerges, users are swept up in its novelty. This effect can be powerful enough to temporarily override established norms, such as expectations for fast response times. Virtual reality (VR) is a good example: many people initially find it exciting and futuristic, but the novelty often wears off once they realise they must remain tethered to a single spot and wear what can feel like a heavy, cumbersome headset. Over time, this form factor becomes a deal-breaker, and users may abandon VR because they’re unwilling to sacrifice comfort and freedom of movement.

AI, however, offers far more tangible benefits. While VR has its sceptics (of which I am one), AI demonstrates real potential—particularly in processing unstructured or unknown data and providing human-readable insights. As an educator, researcher, and amateur developer, I can now contemplate building systems with capabilities I couldn’t reasonably have considered before: systems that may realistically be a single English-language prompt (and some shoe-horning) away from being realised.

Has this leap in capability, for now, persuaded users to overlook slow response times? Does the value gained often outweigh the frustration of waiting? Or will users eventually revert to insisting on near-instantaneous responses as AI matures? Possibly. As soon as the novelty fades and AI becomes fully integrated into everyday workflows, frustration with any noticeable delay may resurface. However, the sheer utility of AI may ensure that a certain amount of delay is permanently acceptable—if not expected—provided the results significantly reduce overall effort.

One potential solution for improving latency is rooted in continued advances in computing. In many ways, we’re witnessing a shift where software capabilities now dictate the need for more advanced hardware, rather than hardware placing strict limits on software ambitions. AI models are growing ever larger, and hardware must keep pace. This new dynamic could lead to rapid improvements in inference times, making the 10-second threshold feel archaic in the near future.

The question remains whether this acceptance of latency is merely a temporary phase or an indicator of a permanent shift in user behaviour. One thing is certain: as AI continues to evolve, the conversation around latency, and what users are willing to tolerate relative to the capabilities and benefits AI delivers, will remain something to watch closely.