1 Comment

Hi Katherine, thank you for sharing this. I’m curious: how many days or sprints did this entire process take? Would it have been easier to gather relevant feedback for evaluating your LLM if you had quickly shipped the MVP as a pilot to a very small but significant, low-risk set of customers, and then made decisions based on that?

Also, I assume you’re working in a product-based company that has the flexibility to invest in this type of research and testing. Do you have any insights on how agencies, which are building LLM bots for clients with much shorter deadlines (perhaps just a few days to a week at most), might go about testing and evaluating LLM outputs?
