Based on the social mentions provided, users appreciate Baserun's simplicity and developer-friendly approach, with the SDK requiring only 2 steps to get started and easy integration with popular testing frameworks like pytest and Jest. The tool's main strengths appear to be its comprehensive visibility into LLM workflows (including sequence, duration, costs, and API calls) and its powerful comparison features that help developers spot differences between test executions and understand the impact of code changes. Users find the side-by-side comparison view particularly valuable for debugging complex agent workflows and identifying where divergence occurs. The community seems engaged and supportive, with active development including new evaluation features for automated test result checking, though no pricing information or significant complaints are evident in these mentions.
Mentions (30d): 0
Reviews: 0
Platforms: 2
Sentiment: 0% (0 positive)
Industry: information technology & services
Employees: 2
Funding Stage: Seed
Total Funding: $0.1M
🧵 A step-by-step guide for testing LLM features with Baserun SDK: When building LLM features, we don't always know how the end user might interact with them. It's useful to create a regression suite to ensure the most common cases are covered, and to add new test cases as you build. https://t.co/yQMdlFzf0C
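The regression-suite idea in the post above can be sketched in plain Python. This is a minimal illustration, not Baserun's SDK: `answer_question` is a hypothetical stand-in for the LLM-backed feature, and the cases and expected substrings are invented for the example. Under pytest, the same case list could feed a parametrized test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionCase:
    name: str
    user_input: str
    must_contain: str  # substring the answer is expected to include


def answer_question(user_input: str) -> str:
    """Hypothetical stand-in for the LLM-backed feature under test."""
    canned = {
        "what is your return policy?": "You can return items within 30 days.",
        "do you ship internationally?": "Yes, we ship to most countries.",
    }
    return canned.get(user_input.strip().lower(), "Sorry, I am not sure.")


# Start with the most common user inputs and append new cases as you build.
REGRESSION_CASES = [
    RegressionCase("return_policy", "What is your return policy?", "30 days"),
    RegressionCase("shipping", "Do you ship internationally?", "ship"),
]


def run_suite(cases):
    """Return the names of cases whose output lacks the expected substring."""
    failures = []
    for case in cases:
        output = answer_question(case.user_input)
        if case.must_contain.lower() not in output.lower():
            failures.append(case.name)
    return failures
```

Calling `run_suite(REGRESSION_CASES)` returns a list of failing case names, so an empty list means every covered case still passes.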
Let us help you fine-tune open source models, for free! https://t.co/tyQZYanGsW
Starting to share weekly updates here this year:
@gm_mertd @knarfeel_ You can sign up and use it right away, no need to "get a demo"
@knarfeel_ @gm_mertd super easy to set up :)
@swyx @retool @weaviate_io @CeloOrg @DeepAI @aragon_ai @helicone_ai Thank you!
DM us on Twitter or join our Discord https://t.co/738yrwEQgL if you have any suggestions or questions!
Check out the full demo video: https://t.co/bzaxYFN5Lj or sign up to try it out. https://t.co/fuHPgMlBs6
So, what's next? We understand that comparing results side by side is useful but still manual. We launched an evaluation feature to help users automatically check their test results. https://t.co/C728Yk7tnO
You can also compare two test executions side by side and see the differences in individual steps. This has been extremely helpful for folks building agents to spot at which step divergence occurred. https://t.co/EuFey9yAo4
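To make the step-level comparison concrete, here is a minimal sketch of locating the first step at which two executions diverge. It assumes each execution is simply a list of `(step_name, output)` tuples. That representation and the function name `first_divergence` are illustrative assumptions, not Baserun's data model.

```python
from itertools import zip_longest


def first_divergence(run_a, run_b):
    """Return (index, step_a, step_b) for the first differing step, or None.

    A run that ends early is reported with None in place of its missing step.
    """
    for index, (step_a, step_b) in enumerate(zip_longest(run_a, run_b)):
        if step_a != step_b:
            return index, step_a, step_b
    return None


# Two hypothetical agent executions: identical until the second step.
baseline = [("parse_query", "shoes"), ("search", "3 results"), ("summarize", "ok")]
candidate = [("parse_query", "shoes"), ("search", "0 results"), ("summarize", "ok")]
```

With the sample runs above, `first_divergence(baseline, candidate)` points at index 1, the `search` step, which is where a reader would start debugging.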
Baserun provides full visibility into your workflow, including the sequence and duration of the event, input, output, token sizes, cost of each LLM call, and 3rd party API calls or custom logic. https://t.co/znVovMhdRR
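As a rough illustration of the kind of per-step data described above (sequence, duration, token sizes, cost), the sketch below records each step of a workflow by hand. Everything here is an assumption made for the example: the `Trace` and `StepRecord` names, the flat per-1,000-token price, and the caller-supplied token count. It does not reflect how Baserun collects this data.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StepRecord:
    name: str
    duration_s: float
    tokens: int
    cost_usd: float


@dataclass
class Trace:
    price_per_1k_tokens: float  # assumed flat rate, for illustration only
    steps: list = field(default_factory=list)

    def record(self, name, fn, *args, tokens=0):
        """Run fn(*args), timing it and logging an estimated cost."""
        start = time.perf_counter()
        result = fn(*args)
        duration = time.perf_counter() - start
        cost = tokens / 1000 * self.price_per_1k_tokens
        self.steps.append(StepRecord(name, duration, tokens, cost))
        return result

    def total_cost(self):
        """Sum the estimated cost over all recorded steps."""
        return sum(step.cost_usd for step in self.steps)
```

For example, `trace = Trace(price_per_1k_tokens=0.002)` followed by `trace.record("normalize", str.upper, "hi", tokens=500)` appends one record to `trace.steps`, preserving call order so the sequence of steps can be read back later.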
Baserun also has a comparison view to help understand the impact of your code changes. https://t.co/Z5dXbjfMBW
When building LLM-powered features, developers often need to chain multiple LLM calls, third-party API calls, and other custom logic together. In the Shopping Assistant example, here are the steps happening in the background: https://t.co/MyOukRipMU
Baserun will log all the test results for you, including the metadata such as duration, cost, and a summary. https://t.co/hub942EwcD
Before Baserun, similar to most AI teams today, we used a sheet to manage the test results. https://t.co/odVpyjBtga
Based on 25 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.