Over the past week, we've been hard at work improving our backend systems to speed up queries and reduce compute costs. Previously, our system exchanged multiple verbose text prompts with the language model to gather the information needed to answer users' questions.
We have now updated this process to use structured JSON responses instead of plain text. In addition, we consolidated many separate prompts into a single prompt with multiple steps. Together, these changes have resulted in significant improvements:
- For queries answered directly from connected data sources, we've reduced token usage by 1-5% and improved speed by up to 30%
- For queries needing additional data gathering, like search or database access, we've cut token usage by up to 60% and reduced latency by 20%
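To illustrate the idea, here is a minimal sketch of consolidating several information-gathering steps into one prompt that requests a single JSON reply instead of free-form text. The function names, prompt wording, and JSON fields are purely illustrative assumptions, and a canned string stands in for the real model call:

```python
import json

def build_consolidated_prompt(question: str, steps: list[str]) -> str:
    """Fold multiple gathering steps into one prompt (hypothetical format)."""
    numbered = "\n".join(f"{i + 1}. {step}" for i, step in enumerate(steps))
    return (
        f"Question: {question}\n"
        f"Perform these steps:\n{numbered}\n"
        'Reply with JSON only: {"sources": [...], "answer": "..."}'
    )

def parse_model_reply(raw: str) -> dict:
    """Parse the structured reply; JSON is far terser than prose round-trips."""
    reply = json.loads(raw)
    # Validate the fields downstream code relies on.
    if "answer" not in reply or "sources" not in reply:
        raise ValueError("model reply missing required fields")
    return reply

# Canned reply standing in for an actual model response:
prompt = build_consolidated_prompt(
    "What was Q3 revenue?",
    ["Identify relevant data sources", "Retrieve the figures", "Answer"],
)
canned = '{"sources": ["finance_db"], "answer": "$1.2M"}'
print(parse_model_reply(canned)["answer"])  # -> $1.2M
```

A single structured exchange like this replaces several verbose text round-trips, which is where the token and latency savings come from.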
In concrete terms, this means faster responses to your queries and lower monthly compute bills. We're constantly working to optimize our systems so they keep delivering the same great experience answering questions from your data, at a lower cost.
As always, please feel free to reach out if you have any questions or feedback! We look forward to continuing to improve our service for you.