Required Arguments
To use the ToolCorrectnessMetric, you need to provide the following arguments when creating a ModelTestCase:
- input: The task or goal the user wants the model to perform.
- actual_output: The output generated by the model.
- tools_called: The tools or actions the model used to accomplish the task.
- expected_tools: The tools that are expected to be used by the model.
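To make the shape of these four arguments concrete, here is a minimal sketch in Python. It does not import the real library: SketchToolCall and SketchTestCase are hypothetical stand-ins defined only for this illustration, the example values are made up, and the real ModelTestCase may represent tool calls differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchToolCall:
    """Hypothetical stand-in for a single tool call (not the library's type)."""
    name: str


@dataclass
class SketchTestCase:
    """Hypothetical stand-in that mirrors the four required arguments."""
    input: str
    actual_output: str
    tools_called: List[SketchToolCall] = field(default_factory=list)
    expected_tools: List[SketchToolCall] = field(default_factory=list)


# Illustrative values only: the model used a different tool than expected.
case = SketchTestCase(
    input="What is the weather in Paris?",
    actual_output="It is sunny in Paris.",
    tools_called=[SketchToolCall("web_search")],
    expected_tools=[SketchToolCall("weather_lookup")],
)
```

Keeping tools_called and expected_tools as separate lists on the same test case is what lets a metric compare what the model did against what it should have done.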
Optional Parameters
- threshold: A float representing the minimum passing score. Defaults to 0.5.
- evaluation_params: A list of ToolCallParams values that sets how strict the correctness criteria are. Options include ToolCallParams.INPUT_PARAMETERS and ToolCallParams.OUTPUT.
- include_reason: A boolean indicating whether to include a reason for the evaluation score. Defaults to True.
- strict_mode: Enforces a binary metric score: 1 for a perfect result, 0 otherwise. Also overrides the current threshold and sets it to 1. Defaults to False.
- verbose_mode: Prints the intermediate steps used to calculate the metric to the console. Defaults to False.
- should_consider_ordering: Takes into account the order in which tools were called. Defaults to False.
- should_exact_match: Requires tools_called and expected_tools to be exactly the same. Defaults to False.
