X - Useful Resources

Building a RAG app from scratch with LangChain (by svpino)

Advanced LangChain chains
Extract & transcribe audio from a YouTube video (using youtube and openai-whisper libs)
https://www.youtube.com/watch?v=BrsocJb-fAo
https://github.com/svpino/youtube-rag/blob/main/rag.ipynb

Evaluating an LLM-powered RAG application automatically (by svpino)

Generate a test-set from the documents
Automatically evaluate the RAG system using Giskard
Integrate the evaluation with pytest in order to include it in CI/CD steps
https://www.youtube.com/watch?v=ZPX3W77h_1E
https://github.com/svpino/llm/blob/main/evaluation/notebook.ipynb
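
The CI/CD step above can be sketched as a pytest test that fails the build when an evaluation score drops below a threshold. This is a minimal sketch, not the notebook's code: `evaluate_rag`, `fake_rag_answer`, `TEST_SET` and the 0.8 threshold are all hypothetical stand-ins, and the substring check replaces the Giskard evaluation used in the video.

```python
# Hypothetical test set: (question, expected answer fragment) pairs.
# In the video the test set is generated from the documents instead.
TEST_SET = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]


def fake_rag_answer(question: str) -> str:
    """Stand-in for the real RAG chain (assumption: it maps a question to a string)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(question, "I don't know.")


def evaluate_rag(answer_fn, test_set) -> float:
    """Return the fraction of questions whose answer contains the expected fragment."""
    correct = sum(
        1
        for question, expected in test_set
        if expected.lower() in answer_fn(question).lower()
    )
    return correct / len(test_set)


def test_rag_correctness_above_threshold():
    """pytest collects this function by its test_ prefix; a failed assert fails the CI job."""
    score = evaluate_rag(fake_rag_answer, TEST_SET)
    assert score >= 0.8, f"RAG correctness dropped to {score:.2f}"
```

Saved in a file such as test_rag.py (file name is an assumption), this runs with a plain `pytest` command in a CI job.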

Building a RAG app from scratch with LlamaIndex (by Arize)

Very simple RAG with LlamaIndex
Question generation using Phoenix (but LlamaIndex can do it as well) for evaluation
Compute retrieval scores: NDCG@K (ndcg_score from sklearn.metrics), precision@K and hit rate
Compute response scores: Q&A correctness and hallucinations
https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/evals/evaluate_rag.ipynb
https://www.youtube.com/watch?v=LrMguHcbpO8
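
The retrieval scores listed above can be sketched in plain Python to make the formulas visible. This is a hand-written illustration, not the tutorial's code (the tutorial uses ndcg_score from sklearn.metrics). It assumes each query comes with a ranked list of relevance labels (1 = relevant, 0 = not relevant) for the retrieved documents; all function names are hypothetical.

```python
import math


def precision_at_k(relevance, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(relevance[:k]) / k


def hit_rate_at_k(relevance, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(relevance[:k]) else 0.0


def dcg_at_k(relevance, k):
    """Discounted cumulative gain: relevance discounted by log2(rank + 1), ranks starting at 1."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))


def ndcg_at_k(relevance, k):
    """DCG normalized by the DCG of the ideal ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0
```

For example, with relevance labels [0, 1, 1, 0] and k = 2, precision@2 is 0.5 and the hit rate is 1.0; NDCG@2 is below 1 because the relevant documents are not ranked first.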

LLM benchmarks review and origin

Describes six of the most famous LLM benchmarks (GLUE & SuperGLUE, Adversarial NLI, BIG-bench, MMLU, HELM, LM Evaluation Harness)
Explains why we need to use these benchmarks instead of the older NLP metrics and benchmarks
Part 1: https://medium.com/@myschang/evaluation-of-large-language-model-llm-introduction-9343424ad253
Part 2: https://medium.com/@myschang/benchmark-of-llms-part-1-glue-superglue-adversarial-nli-big-bench-8d1aed6bae12
Part 3: https://medium.com/aimonks/benchmark-of-llms-part-2-mmlu-helm-eleuthera-ai-lm-eval-e6fc54053e3d

Phoenix demo

Video demo showing how to identify problematic queries in a RAG app
https://www.youtube.com/watch?v=hbQYDpJayFw
http://bit.ly/llama-index-phoenix-tutorial (associated notebook)
https://phoenix-demo.arize.com/projects (live demo of Phoenix)

LLM / RAG / Evaluation overview

Great overview of the LLM + RAG + Evaluation steps
https://arize.com/blog-course/large-language-model-monitoring-observability/

Data Science clearly explained

Lots of videos explaining many ML concepts
https://www.youtube.com/@statquest/videos