Required Arguments
To use the GEvalEvaluator, you need to provide the following arguments:
- name: A descriptive name for the metric.
- criteria: A description outlining the specific evaluation aspects for each test case.
- threshold: The minimum score required for a passing evaluation.
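As a sketch, the three required arguments might be supplied like this. The constructor below is a hypothetical stand-in that only mirrors the documented argument names; this document does not show the real import path for GEvalEvaluator, so substitute your project's actual import.

```python
from dataclasses import dataclass

# Hypothetical stand-in mirroring the documented required arguments.
# Replace with the real GEvalEvaluator import in your project.
@dataclass
class GEvalEvaluator:
    name: str         # descriptive name for the metric
    criteria: str     # what each test case is evaluated on
    threshold: float  # minimum score required to pass

correctness = GEvalEvaluator(
    name="Correctness",
    criteria="Determine whether the actual output is factually "
             "consistent with the expected output.",
    threshold=0.5,
)
```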
Optional Arguments
- evaluation_steps: A list of strings outlining the exact steps the LLM should take for evaluation. If not provided, G-Eval will generate steps based on the criteria.
- rubric: A list of rubric entries that confine the final metric score to defined ranges.
- model: The specific model to use for evaluation, defaulting to ‘gpt-4o’.
- strict_mode: Enforces a binary metric score (1 for a perfect output, 0 otherwise).
- async_mode: Enables concurrent execution within the measure() method.
- verbose_mode: Prints intermediate steps used to calculate the metric.
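To illustrate how the optional arguments fit together, here is a hedged sketch: a stand-in constructor carrying the documented default for model, plus the strict_mode rule described above. The defaults shown for strict_mode, async_mode, and verbose_mode are assumptions not stated in this document, and the class itself is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in mirroring the documented optional arguments.
@dataclass
class GEvalEvaluator:
    name: str
    criteria: str
    threshold: float
    evaluation_steps: Optional[List[str]] = None  # generated from criteria if omitted
    rubric: Optional[List[str]] = None            # confines the final score range
    model: str = "gpt-4o"                         # documented default
    strict_mode: bool = False                     # assumed default
    async_mode: bool = True                       # assumed default
    verbose_mode: bool = False                    # assumed default

    def apply_strict_mode(self, score: float) -> float:
        # Strict mode enforces a binary score: 1 for perfection, 0 otherwise.
        if self.strict_mode:
            return 1.0 if score == 1.0 else 0.0
        return score

strict = GEvalEvaluator(name="Clarity", criteria="Is the answer clear?",
                        threshold=0.7, strict_mode=True)
lenient = GEvalEvaluator(name="Clarity", criteria="Is the answer clear?",
                         threshold=0.7)
```

For example, `strict.apply_strict_mode(0.9)` collapses to 0.0, while `lenient.apply_strict_mode(0.9)` passes the raw score through unchanged.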
Usage Example
Here’s how you can use the GEvalEvaluator in your evaluation system:

Evaluation Steps
If you want more control over the evaluation process, you can provide evaluation_steps:

Best Practices
- Define Clear Criteria: Ensure your criteria are specific and measurable.
- Use Evaluation Steps: For more reliable results, provide detailed evaluation steps.
- Leverage Rubrics: Use rubrics to standardize scoring and provide clear feedback.
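The evaluation_steps and rubric practices above can be sketched together as follows. This is a minimal stand-in, not the real GEvalEvaluator API: only the argument names evaluation_steps and rubric come from this document, while the RubricBand structure and band_for helper are hypothetical illustrations of how a rubric standardizes scoring and feedback.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical rubric entry: a score range paired with the feedback it implies.
@dataclass
class RubricBand:
    score_range: Tuple[float, float]
    expected_outcome: str

# Explicit steps give the LLM judge a fixed procedure instead of
# letting it derive one from the criteria.
evaluation_steps = [
    "Check whether the actual output addresses every part of the input.",
    "Verify factual claims against the expected output.",
    "Penalize vague or contradictory statements.",
]

rubric = [
    RubricBand((0.0, 0.3), "Largely incorrect or off-topic."),
    RubricBand((0.3, 0.7), "Partially correct with notable gaps."),
    RubricBand((0.7, 1.0), "Accurate and complete."),
]

def band_for(score: float) -> str:
    # Map a raw score onto its rubric band for standardized feedback.
    for band in rubric:
        low, high = band.score_range
        if low <= score <= high:
            return band.expected_outcome
    return "Score outside rubric range."
```

With this sketch, a score of 0.85 maps to the "Accurate and complete." band, giving every test case the same clear, range-based feedback.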
