## Required Arguments
Each test case must be an `LLMTestCase` instance with the following fields:

- `input`: The user's query.
- `actual_output`: The LLM-generated response (not used in the score computation, but required).
- `expected_output`: The ideal response, used to extract ground-truth statements.
- `retrieval_context`: A list of strings representing retrieved context chunks.
## Optional Arguments
| Argument | Type | Description | Default |
|---|---|---|---|
| `threshold` | `float` | Minimum score to be considered a "pass". | `0.5` |
| `model` | `str` | LLM to use for evaluation (e.g., `'gpt-4o'`, or a custom DeepEval-compatible model). | `'gpt-4o'` |
| `include_reason` | `bool` | If `True`, includes an explanation for the evaluation score. | `True` |
| `strict_mode` | `bool` | Binary scoring mode: 1 for a full match, 0 otherwise. | `False` |
| `async_mode` | `bool` | Enables concurrent scoring for faster evaluations. | `True` |
| `verbose_mode` | `bool` | If `True`, logs detailed evaluation steps to the console. | `False` |
| `evaluation_template` | `ContextualRecallTemplate` | Optional custom prompt template class. | Default internal template |
## Usage Example
## How It Works
- The metric uses the `expected_output` to extract a list of statements.
- For each statement, an LLM determines whether it can be attributed to any node in the `retrieval_context`.
- The contextual recall score is calculated as:

Contextual Recall = Number of Attributable Statements / Total Number of Statements
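The steps above can be sketched in plain Python. Note that the real metric uses an LLM for the attribution judgment; the `attribute` callable below is a hypothetical stand-in, defaulting to a naive substring check for illustration only:

```python
def contextual_recall(statements, retrieval_context, attribute=None):
    """Fraction of expected-output statements attributable to retrieved context."""
    if attribute is None:
        # Naive stand-in for the LLM judge: case-insensitive substring match.
        attribute = lambda s, ctx: any(s.lower() in chunk.lower() for chunk in ctx)
    if not statements:
        return 0.0
    # Count statements supported by at least one retrieved chunk.
    attributed = sum(1 for s in statements if attribute(s, retrieval_context))
    return attributed / len(statements)


score = contextual_recall(
    ["Paris is the capital of France.", "France uses the euro."],
    ["Paris is the capital of France."],
)
# One of two statements is attributable, so score == 0.5
```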
## Use Cases
Ideal when you want to:

- Ensure critical facts from your knowledge base are retrieved.
- Improve the completeness of your RAG retriever, especially in complex or multi-fact queries.
