Frequently Asked Questions

What is ARES?

ARES (Automated RAG Evaluation System) is an open-source framework that simplifies the evaluation of retrieval-augmented generation (RAG) systems. It automates the assessment of key metrics such as context relevance, answer faithfulness, and answer relevance, significantly reducing the need for extensive human annotation.

How does ARES work?

ARES works by generating synthetic training data and fine-tuning lightweight language models to act as judges for the different components of a RAG system. The process involves three main stages (a usage sketch follows the list):

  1. Synthetic Data Generation: ARES uses generative language models to create synthetic question-answer pairs from a given corpus of documents. This data includes both positive and negative examples.
  2. Training LLM Judges: Using the synthetic data, ARES fine-tunes lightweight models to evaluate context relevance, answer faithfulness, and answer relevance. These models are trained to classify whether the retrieved passages and generated answers meet the evaluation criteria.
  3. Evaluation with PPI: ARES leverages prediction-powered inference (PPI) to improve the accuracy of evaluations. PPI combines predictions from the LLM judges with a small set of human-annotated data to generate confidence intervals for the scores, ensuring reliable and precise evaluations.
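
For orientation, here is a minimal sketch of the three stages using the ARES Python package. The configuration keys and file names are illustrative assumptions drawn from the project README and may differ across versions; consult the GitHub repository for the authoritative parameters.

```python
# Minimal sketch of the three ARES stages. Configuration keys and file
# names here are illustrative assumptions; see the README for the
# authoritative parameters of your installed version.
from ares import ARES

# Stage 1: generate synthetic query/answer pairs from an in-domain corpus.
synth_config = {
    "document_filepaths": ["my_passages.tsv"],       # hypothetical path
    "few_shot_prompt_filename": "my_few_shot.tsv",   # in-domain examples
    "synthetic_queries_filenames": ["synthetic_queries.tsv"],
    "documents_sampled": 10000,
}
ARES(synthetic_query_generator=synth_config).generate_synthetic_data()

# Stage 2: fine-tune a lightweight LLM judge on the synthetic data.
classifier_config = {
    "training_dataset": ["synthetic_queries.tsv"],
    "label_column": ["Context_Relevance_Label"],
    "num_epochs": 10,
    "patience_value": 3,
    "learning_rate": 5e-6,
}
ARES(classifier_model=classifier_config).train_classifier()

# Stage 3: score the RAG system's outputs and tighten the estimate with
# PPI, anchored by ~150 human-annotated datapoints.
ppi_config = {
    "evaluation_datasets": ["my_rag_outputs.tsv"],
    "checkpoints": ["checkpoints/context_relevance_judge.pt"],
    "labels": ["Context_Relevance_Label"],
    "gold_label_path": "human_validation_set.tsv",
}
results = ARES(ppi=ppi_config).evaluate_RAG()
print(results)
```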

Why should I use ARES?

ARES offers several benefits:

  • Efficiency: By automating much of the evaluation process, ARES saves significant time and resources compared to traditional methods that require extensive human annotations.
  • Accuracy: ARES provides precise and reliable evaluations through its combination of synthetic data generation, fine-tuned LLM judges, and prediction-powered inference.
  • Flexibility: It can be adapted to various domains and use cases, making it a versatile tool for different types of RAG systems.
  • Open Source: ARES is open-source, allowing you to access the code, customize it for your needs, and contribute to its development.

Is ARES open-source?

Yes, ARES is completely open-source and available on GitHub. This means you can freely access the source code, make modifications, and contribute to the project. The community-driven nature of the project encourages collaboration and continuous improvement.

How can I contribute to ARES?

You can contribute to ARES by visiting our GitHub repository. Here are a few ways to get involved:

  • Report Issues: If you encounter any bugs or have suggestions for improvements, you can open an issue on the GitHub repository.
  • Submit Pull Requests: If you've made enhancements or fixed bugs, submit a pull request for review and potential inclusion in the main codebase.
  • Join Discussions: Participate in discussions with other contributors and maintainers to share ideas and collaborate on new features.

Can ARES handle my custom RAG model?

Yes. ARES is model-agnostic: it generates synthetic queries and answers from your documents, and you can then use it to evaluate the queries and answers produced by your own RAG model.
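
As a concrete illustration, you might log your own system's retrievals and answers to a file that ARES can then score. The run_my_rag helper and the column names below are hypothetical placeholders, not a confirmed ARES schema.

```python
# Hypothetical sketch: exporting a custom RAG system's outputs to a TSV
# that ARES can score. Column names are assumptions, not a confirmed schema.
import pandas as pd

def run_my_rag(query: str) -> tuple[str, str]:
    """Placeholder for your own retrieve-then-generate pipeline."""
    passage = "Hamlet is a tragedy written by William Shakespeare."
    answer = "William Shakespeare wrote Hamlet."
    return passage, answer

queries = ["Who wrote Hamlet?", "When was Hamlet first performed?"]
rows = []
for q in queries:
    passage, answer = run_my_rag(q)
    rows.append({"Query": q, "Document": passage, "Answer": answer})

pd.DataFrame(rows).to_csv("my_rag_outputs.tsv", sep="\t", index=False)
```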

What are the main evaluation metrics used by ARES?

ARES evaluates RAG systems based on three main metrics (a scoring sketch follows the list):

  1. Context Relevance: Determines if the retrieved information is pertinent to the query.
  2. Answer Faithfulness: Checks if the response generated by the language model is properly grounded in the retrieved context and does not include hallucinated or extraneous information.
  3. Answer Relevance: Evaluates whether the generated response is relevant to the query, addressing all aspects of the question appropriately.
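
To make these metrics concrete, the sketch below applies a fine-tuned binary judge to a single (query, passage) pair for context relevance; answer faithfulness and answer relevance judges work analogously over (passage, answer) and (query, answer) pairs. The checkpoint name and input pairing are assumptions for illustration, not ARES's confirmed prompt format.

```python
# Hypothetical sketch: applying a fine-tuned binary judge to one example.
# The checkpoint name and input pairing are assumptions for illustration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "my-org/context-relevance-judge"  # hypothetical fine-tuned judge
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

def judge(query: str, passage: str) -> float:
    """Return P(passage is relevant to query), assuming label 1 = relevant."""
    inputs = tokenizer(query, passage, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

score = judge("Who wrote Hamlet?",
              "Hamlet is a tragedy written by William Shakespeare.")
print(f"context relevance ≈ {score:.2f}")
```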

What are the requirements to run ARES?

To run ARES, you need:

  • Hardware: A machine with a GPU for fine-tuning language models and generating synthetic data; GPUs such as NVIDIA A100s work well.
  • Software: Access to the required libraries and dependencies, as specified in the setup instructions available in the GitHub repository.
  • Data: An in-domain passage set, a human preference validation set with approximately 150 annotated datapoints, and few-shot examples of in-domain queries and answers (a format sketch follows the list).
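
For a sense of the shape and scale of the annotated data, a human preference validation set might look like the file built below. The three label columns are assumptions and should be matched to whatever format your ARES version expects.

```python
# Hypothetical sketch of a human preference validation file (~150 rows).
# Column names are assumptions; match them to your ARES version's format.
import pandas as pd

validation = pd.DataFrame(
    [
        {
            "Query": "Who wrote Hamlet?",
            "Document": "Hamlet is a tragedy written by William Shakespeare.",
            "Answer": "William Shakespeare wrote Hamlet.",
            "Context_Relevance_Label": 1,
            "Answer_Faithfulness_Label": 1,
            "Answer_Relevance_Label": 1,
        },
        # ... roughly 150 annotated rows in total
    ]
)
validation.to_csv("human_validation_set.tsv", sep="\t", index=False)
```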

Who can I contact for more information?

For more information, you can reach out to the project maintainers through the GitHub repository by opening an issue or joining the discussions. You can also refer to the contact details provided in the research paper. Additionally, the repository includes comprehensive documentation and user guides to help you understand and utilize ARES effectively.