Building a RAG app from scratch with LangChain (by svpino)
- Advanced LangChain chains
- Extract and transcribe audio from a YouTube video (using youtube and openai-whisper libs)
- https://www.youtube.com/watch?v=BrsocJb-fAo
- https://github.com/svpino/youtube-rag/blob/main/rag.ipynb
Evaluating an LLM-powered RAG application automatically (by svpino)
- Generate a test-set from the documents
- Automatically evaluate the RAG system using Giskard
- Integrate the evaluation with pytest in order to include it in CI/CD steps
- https://www.youtube.com/watch?v=ZPX3W77h_1E
- https://github.com/svpino/llm/blob/main/evaluation/notebook.ipynb
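The CI/CD integration idea above can be sketched as a plain pytest-style test that fails the build when an evaluation score drops below a threshold. This is only a minimal sketch, not the notebook's code: it does not call Giskard at all, `answer_question` is a hypothetical stand-in for the RAG pipeline, the test set is made up, and keyword matching is a crude substitute for the LLM-based grading an evaluation library would perform.

```python
# Hypothetical stand-in for the RAG pipeline under test (assumption:
# the real app exposes some function that maps a question to an answer).
def answer_question(question: str) -> str:
    knowledge = {
        "What is the capital of France?": "Paris is the capital of France.",
    }
    return knowledge.get(question, "I don't know.")


# Made-up test set; in the workflow described above it would be
# generated from the documents instead of written by hand.
TEST_SET = [
    {"question": "What is the capital of France?", "expected_keyword": "Paris"},
    {"question": "Who wrote Hamlet?", "expected_keyword": "Shakespeare"},
]

# Minimum acceptable share of correct answers (arbitrary value for this sketch).
THRESHOLD = 0.5


def correctness(test_set, answer_fn) -> float:
    """Share of test cases whose answer contains the expected keyword."""
    hits = sum(
        1
        for case in test_set
        if case["expected_keyword"].lower() in answer_fn(case["question"]).lower()
    )
    return hits / len(test_set)


def test_rag_correctness():
    # pytest collects functions named test_*; a failed assertion fails the CI step.
    assert correctness(TEST_SET, answer_question) >= THRESHOLD
```

Saved in a file such as `test_rag.py` (hypothetical name), this runs with a plain `pytest` command, so the evaluation becomes one more step in the pipeline.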
Building a RAG app from scratch with LlamaIndex (by Arize)
- Very simple RAG with LlamaIndex
- Question generation using Phoenix for evaluation (LlamaIndex can do it as well)
- Compute retrieval scores: NDCG@K (ndcg_score from sklearn.metrics), precision@K and hit rate
- Compute response scores: Q&A correctness, hallucinations
- https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/evals/evaluate_rag.ipynb
- https://www.youtube.com/watch?v=LrMguHcbpO8
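The retrieval scores listed above can be illustrated with a small pure-Python sketch. It is not the tutorial's code (which relies on sklearn.metrics for NDCG); it assumes each query comes with a list of relevance labels for the retrieved documents, in ranked order, and uses a linear gain with a log2 discount. The function names and sample data are made up for illustration.

```python
import math


def precision_at_k(relevance, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for r in relevance[:k] if r > 0) / k


def hit_at_k(relevance, k):
    """1.0 if at least one of the top-k documents is relevant, else 0.0."""
    return 1.0 if any(r > 0 for r in relevance[:k]) else 0.0


def dcg_at_k(relevance, k):
    """Discounted cumulative gain: relevance discounted by log2 of the rank."""
    return sum(r / math.log2(rank + 2) for rank, r in enumerate(relevance[:k]))


def ndcg_at_k(relevance, k):
    """DCG divided by the DCG of the ideal (best possible) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


def hit_rate(relevance_per_query, k):
    """Share of queries with at least one relevant document in the top k."""
    hits = [hit_at_k(rel, k) for rel in relevance_per_query]
    return sum(hits) / len(hits)


# Made-up relevance labels (1 = relevant, 0 = not) for three queries,
# each list ordered by retrieval rank.
relevance_per_query = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

for rel in relevance_per_query:
    print(precision_at_k(rel, 2), ndcg_at_k(rel, 2))
print("hit rate@2:", hit_rate(relevance_per_query, 2))
```

Precision@K and hit rate only look at whether relevant documents appear in the top K, while NDCG@K also rewards placing them near the top of the ranking.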
LLM benchmarks: review and origin
- Describes 6 of the most famous LLM benchmarks (GLUE & SuperGLUE, Adversarial NLI, BIG-bench, MMLU, HELM, LM Evaluation Harness)
- Explains why we need to use these benchmarks instead of the old NLP metrics and benchmarks
- Part1: https://medium.com/@myschang/evaluation-of-large-language-model-llm-introduction-9343424ad253
- Part2: https://medium.com/@myschang/benchmark-of-llms-part-1-glue-superglue-adversarial-nli-big-bench-8d1aed6bae12
- Part3: https://medium.com/aimonks/benchmark-of-llms-part-2-mmlu-helm-eleuthera-ai-lm-eval-e6fc54053e3d
Phoenix demo
- Great video demo of how to identify problematic queries in a RAG app
- https://www.youtube.com/watch?v=hbQYDpJayFw
- http://bit.ly/llama-index-phoenix-tutorial (associated notebook)
- https://phoenix-demo.arize.com/projects (live demo of Phoenix)
LLM / RAG / Evaluation overview
- Great overview of the LLM + RAG + Evaluation steps
- https://arize.com/blog-course/large-language-model-monitoring-observability/
Data Science clearly explained
- Lots of videos explaining many ML concepts
- https://www.youtube.com/@statquest/videos