Large Language Models (LLMs) have astonishing capabilities, and it is no surprise that they are trained on enormous datasets. But how many of us, just starting out on our founder journey, have the budget to train such models in-house? Very few, most likely.
But wait: are these pretrained LLMs of any use to us? Yes, provided they are open source. Fortunately, quite a few are available now.
So how useful are they, really? Most of us in the AI field know about RAG, or have at least heard of it. Here is a simple one-line definition: Retrieval-Augmented Generation is exactly what it sounds like. It retrieves data from external sources we provide and augments the LLM's output with it.
It is especially useful because it leverages the generative power of LLMs while incorporating the external knowledge we supply, producing results grounded in the corpus of interest. When the external corpus is limited, we can let the model fall back on the LLM's general knowledge.
I am particularly interested in how we eat, and I firmly believe in the "garbage in, garbage out" idea when it comes to food and our bodies. When we nourish ourselves with naturally healthy food, we reflect nature: strong, healthy, and unstoppable. But when we eat processed, unhealthy food, we begin to look and feel the same way: tired and unnatural. One of the worst consequences of today's overconsumption of processed and refined food is diabetes.
And who truly understands the real pain points of living with diabetes? Simple: the people who experience it first-hand. Given my interest in mining insights about diabetes using LLMs, I ran this exploration with Ollama, a tool for running locally any of the many open-source LLMs suited to such tasks.
I am sharing my notebook step by step, with an explanation for each section. To aid understanding, I have also included a high-level architecture diagram.
Step 1: Fetch and process content from multiple URLs. We scrape the text from the given list of Reddit URLs and store all of it in all_texts.
# Import necessary libraries
import requests
from bs4 import BeautifulSoup
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.llms import Ollama

# List of URLs to scrape
urls = [
    'https://www.reddit.com/r/diabetes/comments/1broigp/what_are_your_biggest_frustrations_with_diabetes/',
    'https://www.reddit.com/r/diabetes_t2/comments/156znkx/whats_the_most_challenging_part_about_dealing/',
    'https://www.reddit.com/r/diabetes/comments/qcsgji/what_is_the_hardest_part_about_managing_diabetes/',
    'https://www.reddit.com/r/diabetes_t1/comments/1hdlipr/diabetes_and_pain/',
    'https://www.reddit.com/r/diabetes/comments/ww6mrj/what_does_diabetic_nerve_pain_feel_like/',
    'https://www.reddit.com/r/AskReddit/comments/avl1x0/diabetics_of_reddit_what_is_your_experience_of/',
    'https://www.reddit.com/r/diabetes_t2/comments/1jggxi9/my_fathers_sugar_levels_are_not_dropping/',
    'https://www.reddit.com/r/diabetes_t2/comments/1jglmie/shaky_feeling/',
    'https://www.reddit.com/r/diabetes_t2/comments/1jgccvo/rant_from_a_depressedeating_disordered_diabetic/'
]

# Initialize text storage
all_texts = []

# Step 1: Fetch and process content from multiple URLs
for url in urls:
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract text from <p> tags
    text = ' '.join([para.get_text() for para in soup.find_all('p')])
    if text:  # Store only if text is found
        all_texts.append(text)
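To make the extraction step above concrete, here is a minimal, self-contained sketch of what soup.find_all('p') accomplishes, written with only the standard library's html.parser. The sample HTML string is hypothetical, purely for illustration.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collects the text content of every <p> element."""
    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.paragraphs[-1] += data

# Hypothetical page snippet standing in for a fetched Reddit post
sample_html = "<html><body><p>Managing diabetes is hard.</p><p>Nerve pain is common.</p></body></html>"
parser = ParagraphExtractor()
parser.feed(sample_html)
text = " ".join(parser.paragraphs)
print(text)  # Managing diabetes is hard. Nerve pain is common.
```

BeautifulSoup handles malformed real-world HTML far more robustly, which is why the notebook uses it; this sketch only shows the idea.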
Step 2: We split the scraped text into manageable chunks for efficient processing. This also reduces memory usage and improves search performance.
# Step 2: Split all content into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
all_chunks = []
for text in all_texts:
    all_chunks.extend(text_splitter.split_text(text))
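For intuition, here is a rough, illustrative version of size/overlap chunking. This is not how RecursiveCharacterTextSplitter works internally (it prefers to break on paragraph and sentence boundaries before falling back to raw characters); it only demonstrates the chunk_size and chunk_overlap idea from Step 2.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    """Slide a fixed-size window over the text, stepping back by the overlap."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 1200  # a stand-in for one scraped Reddit page
chunks = split_text(doc)
print(len(chunks))     # 3
print(len(chunks[0]))  # 500
```

The 50-character overlap means neighboring chunks share context, so a sentence cut at a chunk boundary is still recoverable at query time.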
Step 3: We initialize the Ollama embeddings. An embedding is a numerical representation (a vector) of raw text that captures its semantic meaning. Embeddings enable various ML models to process and understand text efficiently. The OllamaEmbeddings class uses the llama2 model to create these embeddings.
# Step 3: Initialize Ollama embeddings
embeddings = OllamaEmbeddings(model="llama2")  # Adjust model name if needed
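To see what "a numerical representation of text" means in the simplest possible terms, here is a toy bag-of-words "embedding" over a tiny fixed vocabulary. This is purely illustrative and is not how OllamaEmbeddings works internally (llama2 produces dense vectors with thousands of dimensions learned by the model).

```python
# Hypothetical mini-vocabulary for illustration only
VOCAB = ["diabetes", "pain", "sugar", "insulin"]

def toy_embed(text):
    """Count vocabulary occurrences to produce a fixed-length numeric vector."""
    words = text.lower().split()
    return [words.count(term) for term in VOCAB]

vec = toy_embed("Sugar spikes cause pain and more pain")
print(vec)  # [0, 2, 1, 0]
```

Real embeddings go further: texts with similar meaning end up with nearby vectors even when they share no words, which is what makes semantic search possible.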
Step 4: Facebook AI Similarity Search (FAISS) is a library for efficient similarity search and clustering of high-dimensional vectors. The from_texts function converts the text chunks into vectors (using the embeddings from Step 3) and stores them in a FAISS vector store. FAISS lets you find the chunks most similar to your query by comparing vector distances (cosine, Euclidean) in a highly optimized way.
# Step 4: Create a FAISS vector store using all chunks
vector_store = FAISS.from_texts(all_chunks, embeddings)
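Step 4 mentions comparing vector distances. Here is a minimal stdlib-only sketch of cosine similarity, one of the measures that underlies such comparisons (FAISS implements these far more efficiently over millions of vectors):

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|); 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 2-D vectors: the first document points the same way as the query,
# the second is orthogonal (completely dissimilar)
query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]
scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
print(scores)  # [1.0, 0.0]
```

A similarity search simply ranks stored vectors by such a score and returns the top k, which is exactly what similarity_search(query, k=3) does later in Step 6.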
Step 5: We initialize the Ollama LLM, which will generate responses or process queries based on the knowledge stored as embeddings in the FAISS vector store from Step 4.
# Step 5: Initialize the Ollama LLM
llm = Ollama(model="llama2", temperature=0.3)
Together, Steps 3-5 enable RAG, where the LLM can pull information from the chunks stored in the vector store and use the retrieved context to answer the user's question in the most relevant way.
Step 6: Here, we define the ask_question_with_fallback function to query the stored knowledge and answer the user's question. If it cannot find similar documents, or the similarity score is low, it falls back to the general knowledge of the underlying LLM (Ollama here).
# Step 6: Create the question-answering function
def ask_question_with_fallback(query):
    # Retrieve relevant documents
    docs = vector_store.similarity_search(query, k=3)
    for doc in docs:
        print(f"Retrieved doc: {doc.page_content[:200]}")

    # If no relevant documents or low similarity, use general knowledge
    # if not docs or all(doc.metadata.get('score', 1.0) < 0.3 for doc in docs):
    #     return use_general_knowledge(query)
    if not docs:
        return use_general_knowledge(query)

    # Format retrieved documents as context
    context = "\n\n".join([doc.page_content for doc in docs])

    # Construct RAG prompt
    rag_prompt = f"""
Use the following pieces of context to answer the question at the end.
If you don't know the answer based on this context, respond with "NO_ANSWER_FOUND".

Context:
{context}

Question: {query}

Provide a direct and concise answer to the question based only on the context above:
"""
    rag_answer = llm(rag_prompt)

    # Check for fallback trigger
    if "NO_ANSWER_FOUND" in rag_answer or "don't know" in rag_answer.lower() or "cannot find" in rag_answer.lower():
        return use_general_knowledge(query)

    return {
        "answer": rag_answer,
        "source": "URL content",
        "source_documents": docs
    }
Step 7: This is the fallback function. If no relevant documents are retrieved in Step 6, the LLM uses its general knowledge to answer the user's question.
# Step 7: Define fallback general knowledge function
def use_general_knowledge(query):
    general_prompt = f"""
Answer this question using your general knowledge:

{query}

Provide a direct and helpful response. If you don't know, simply say so.
"""
    general_answer = llm(general_prompt)
    return {
        "answer": general_answer,
        "source": "General knowledge",
        "source_documents": []
    }
Step 8: This step shows an example of how to use this RAG pipeline. You provide a query to the model, and the LLM uses external or internal knowledge to answer your question.
# Step 8: Example usage
query = "What is the hardest part about managing diabetes?"  # Replace with your actual question
result = ask_question_with_fallback(query)

# Display results
print("Answer:")
print(result["answer"])
print(f"\nSource: {result['source']}")

if result["source_documents"]:
    print("\nSource Documents:")
    for i, doc in enumerate(result["source_documents"]):
        print(f"Source {i+1}:")
        print(doc.page_content[:200] + "...")  # Print first 200 chars of each source
        print()
My output for Step 8 is below. The model used RAG, identifying a similar document in the FAISS store and answering my question.
In conclusion, applying RAG to URLs is a powerful way to get knowledge-augmented answers to diabetes-related questions. By layering real-world insights from community forums like Reddit on top of open-source LLMs, we can deliver personalized, accurate information; after all, who understands the condition better than those living with it every day?
This approach is not only inexpensive but also fosters collaboration, ultimately improving support for people managing diabetes. As AI continues to evolve, its potential to improve healthcare and well-being remains enormous.
Featured photo by Suzy Hazelwood: https://www.pexels.com/photo/close-up-photo-of-sugar-cubes-in-glass-jar-2523650/