Hello, good people! I hope 2025 is treating you well, even though it has been a buggy year for me so far.
Welcome to the Doodles and Programming blog. Today we'll build a sentiment analysis model using TensorFlow + Python.
In this tutorial, we'll also learn the basics of Machine Learning with Python and, as mentioned, build our own Machine Learning model with TensorFlow, a Python library. This model will be able to detect the tone/sentiment of an input text by studying and learning from the sample data provided.
All we need is some knowledge of Python (the very basics, of course). Also make sure you have Python installed on your system (Python 3.9+ is recommended).
If you find it hard to get through this tutorial, don't worry! Send me an email or a message; I'll get back to you ASAP.
In case you don't know, what is Machine Learning?
In simple terms, Machine Learning (ML) is making a computer learn and make predictions by studying data and statistics. By studying the data provided, the computer can identify and extract patterns and then make predictions based on them. Spam email detection, speech recognition, and traffic prediction are some real-life use cases of Machine Learning.
For a better example, imagine we want to teach a computer to recognize cats in photos. You'd show it lots of cat pictures and say, "Hey, computer, these are cats!" The computer looks at the pictures and starts noticing patterns, like pointy ears, whiskers, and fur. After seeing enough examples, it can recognize a cat in a new photo it has never seen before.
One such system that we take advantage of every day is the email spam filter. The following image shows how it's done.
Why use Python?
Even though the Python programming language wasn't built specifically for ML or Data Science, it's considered a great programming language for ML because of its adaptability. With hundreds of libraries available to download for free, anyone can easily build ML models using a pre-built library, without having to program the complete procedure from scratch.
TensorFlow is one such library, built by Google for Machine Learning and Artificial Intelligence. It is often used by data scientists, data engineers, and other developers to build machine learning models easily, as it includes a variety of machine learning and AI algorithms.
Visit the Official TensorFlow Website
To install TensorFlow, run the following command in your terminal:
pip install tensorflow
And to install Pandas and Numpy,
pip install pandas numpy
Please download the sample CSV file from this repository: Github Repository - TensorFlow ML Model
The #1 rule of data analysis and anything built on data: understand the data you have first.
In this case, the dataset has two columns: text and sentiment. The "text" column holds a variety of statements made about movies, books, etc., while the "sentiment" column shows whether each text is positive, neutral, or negative, using the numbers 1, 2, and 0 respectively.
The next rule of thumb is to clean up duplicates and remove null values from your sample data. In this case, since the given dataset is fairly small and contains no duplicates or null values, we can skip the data-cleaning process.
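If your own dataset does need cleaning, a minimal sketch with pandas could look like the following. The rows here are made up for illustration; only the column names text and sentiment come from the CSV used in this tutorial.

```python
import pandas as pd

# Made-up rows for illustration; the real data comes from the CSV file.
raw = pd.DataFrame({
    "text": ["Great movie", "Great movie", None, "Boring book"],
    "sentiment": [1, 1, 2, 0],
})

# Drop rows with missing values, then drop exact duplicate rows.
clean = raw.dropna().drop_duplicates().reset_index(drop=True)

print(clean)
```

Only two rows survive here: one copy of "Great movie" and the "Boring book" row.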
To begin building the model, we need to gather and prepare the dataset that will train the sentiment analysis model. Pandas, a popular library for data analysis and manipulation, can be used for this task.
import pandas as pd

# Load data from CSV
data = pd.read_csv('sentiment.csv')  # Change the path of downloaded CSV File accordingly

# Text data and labels
texts = data['text'].tolist()
labels = data['sentiment'].values
The code above converts the CSV file into a data frame using the pandas.read_csv() function. Next, it turns the values of the "text" column into a Python list using the tolist() function, and takes the values of the "sentiment" column as a Numpy array.
Why use a Numpy array?
As you may already know, Numpy is built for working with data. Numpy arrays handle the numerical labels used in machine learning tasks efficiently, and they offer flexibility in how the data is organized. That's why we're using Numpy in this case.
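To make the difference between the two variables concrete, here is a small sketch. It reads an in-memory stand-in for sentiment.csv (the three rows are invented), so it runs without the downloaded file.

```python
import io
import pandas as pd

# Stand-in for sentiment.csv; these rows are invented for illustration.
csv_text = "text,sentiment\nI loved it,1\nIt was okay,2\nTerrible plot,0\n"
data = pd.read_csv(io.StringIO(csv_text))

texts = data["text"].tolist()      # plain Python list of strings
labels = data["sentiment"].values  # NumPy array of integer labels

print(type(texts).__name__, texts)
print(type(labels).__name__, labels.shape)
```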
After preparing the sample data, we need to preprocess the text, which involves tokenization.
Tokenization is the process of splitting each text sample into individual words, or tokens, so that we can convert the raw text into a numerical format the model can process, allowing it to learn from the individual words in the text samples.
Refer to the image below to see how tokenization works.
In this project, it's best to use manual tokenization instead of a pre-built tokenizer, as it gives us more control over the tokenization process, keeps it compatible with specific data formats, and allows for customized preprocessing steps.
Note: In manual tokenization, we write the code that splits text into words ourselves, which makes it highly customizable to the needs of the project. Other methods, such as the TensorFlow Keras Tokenizer, come with ready-made tools and functions that split text automatically; they are easier to implement but less customizable.
The following is the code snippet we can use to tokenize the sample data.
word_index = {}
sequences = []
for text in texts:
    words = text.lower().split()
    sequence = []
    for word in words:
        if word not in word_index:
            word_index[word] = len(word_index) + 1
        sequence.append(word_index[word])
    sequences.append(sequence)
In the above code,
word_index: An empty dictionary created to store each unique word in the dataset, along with its index.
sequences: An empty list that stores the sequence of numerical word representations for each text sample.
for text in texts: Loops through each text sample in the "texts" list (created earlier).
words = text.lower().split(): Converts the text sample to lowercase and splits it into individual words, based on whitespace.
for word in words: A nested loop that iterates over each word in the "words" list, which holds the tokenized words of the current text sample.
if word not in word_index: If the word is not yet in the word_index dictionary, it is added with a unique index, obtained by adding 1 to the current length of the dictionary.
sequence.append(word_index[word]): Once the index of the current word is known, it is appended to the "sequence" list. This converts each word in the text sample to its corresponding index from the "word_index" dictionary.
sequences.append(sequence): After all the words in the text sample have been converted to numerical indices and stored in the "sequence" list, that list is appended to "sequences".
In summary, the code above tokenizes the text data by converting each word to its numerical representation based on the word_index dictionary, which maps words to unique indices. It creates a sequence of numbers for each text sample, which can then be used as input data for the model.
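To see what this produces, here is the same logic wrapped in a function (build_sequences is just a name picked for this sketch) and run on two invented sentences:

```python
def build_sequences(texts):
    """Map each distinct lowercase word to an integer index, starting at 1."""
    word_index = {}
    sequences = []
    for text in texts:
        sequence = []
        for word in text.lower().split():
            if word not in word_index:
                word_index[word] = len(word_index) + 1
            sequence.append(word_index[word])
        sequences.append(sequence)
    return word_index, sequences

# Two invented sentences, just to show the mapping.
word_index, sequences = build_sequences(["The movie was good", "The book was bad"])
print(word_index)  # {'the': 1, 'movie': 2, 'was': 3, 'good': 4, 'book': 5, 'bad': 6}
print(sequences)   # [[1, 2, 3, 4], [1, 5, 3, 6]]
```

Two things are worth noticing. Index 0 is never assigned to a word, which leaves it free to be used for padding later. And since split() only cuts on whitespace, punctuation stays attached to words, so "good." and "good" would get different indices.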
The architecture of a model is the arrangement of layers, components, and connections that determines how data flows through it. The architecture has a significant impact on the model's training speed, performance, and ability to generalize.
After preprocessing the input data, we can define the architecture of the model as in the example below:
import tensorflow as tf

# max_length is the fixed length every input sequence is padded to
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(word_index) + 1, 16, input_length=max_length),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(3, activation='softmax')
])
In the above code, we use TensorFlow Keras, a high-level neural network API built for fast experimentation and prototyping of Deep Learning models, which simplifies the process of building and compiling machine learning models.
tf.keras.Sequential(): Defines a sequential model, which is a linear stack of layers. The data flows from the first layer to the last, in order.
tf.keras.layers.Embedding(len(word_index) + 1, 16, input_length=max_length): This layer handles word embeddings, which convert words into dense vectors of a fixed size. len(word_index) + 1 specifies the vocabulary size (the + 1 accounts for index 0, which is used for padding), 16 is the dimensionality of the embedding, and input_length=max_length sets the input length of each sequence.
tf.keras.layers.LSTM(64): This is a Long Short-Term Memory (LSTM) layer, a type of recurrent neural network (RNN). It processes the sequence of word embeddings and can "remember" important patterns or dependencies in the data. It has 64 units, which determines the dimensionality of its output.
tf.keras.layers.Dense(3, activation='softmax'): This is a fully connected layer with 3 units and a softmax activation function. It is the output layer of the model, producing a probability distribution over the three possible classes (negative, positive, and neutral).
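If "probability distribution over three classes" sounds abstract, here is a rough, hand-written sketch of what softmax does to the three raw scores coming out of the last layer. The scores are hypothetical, and this is plain Python for illustration, not the Keras implementation.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing on large scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for classes 0 (negative), 1 (positive), 2 (neutral).
probs = softmax([0.5, 2.0, 1.0])
print([round(p, 3) for p in probs])
print(sum(probs))
```

The largest score gets the largest probability, and the three probabilities always add up to 1.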
In Machine Learning with TensorFlow, compiling refers to configuring the model for training by specifying three key components: the Loss Function, the Optimizer, and the Metrics.
Loss Function : Measures the error between the model's predictions and the actual targets, which guides how the model adjusts itself.
Optimizer : Adjusts the model's parameters to minimize the loss function, enabling efficient learning.
Metrics : Provide performance measures beyond the loss, such as accuracy or precision, which help when evaluating the model.
The code below can be used to compile the Sentiment Analysis Model:
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Here,
loss='sparse_categorical_crossentropy': A loss function commonly used for classification tasks where the target labels are integers and the model's output is a probability distribution over multiple classes. It measures the difference between the true labels and the predictions, and training aims to minimize it.
optimizer='adam': Adam is an optimization algorithm that adjusts the learning rate dynamically during training. It is widely used in practice because of its efficiency, robustness, and effectiveness across many kinds of tasks compared to other optimizers.
metrics=['accuracy']: Accuracy is a common metric for evaluating classification models. It gives a straightforward measure of the model's performance: the proportion of samples for which the model's prediction matches the true label.
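For the curious, the idea behind sparse categorical cross-entropy fits in a few lines: for each sample, take the probability the model gave to the true class and compute its negative logarithm, then average. The sketch below is a simplified hand-rolled version with hypothetical predictions, not the exact Keras code.

```python
import math

def sparse_categorical_crossentropy(true_labels, predicted_probs):
    """Mean of -log(probability assigned to the true class)."""
    losses = [
        -math.log(probs[label])
        for label, probs in zip(true_labels, predicted_probs)
    ]
    return sum(losses) / len(losses)

# Hypothetical predictions for two samples over three classes.
true_labels = [1, 0]
predicted_probs = [
    [0.1, 0.8, 0.1],  # true class 1 got 0.8: small loss
    [0.3, 0.5, 0.2],  # true class 0 got only 0.3: larger loss
]
print(sparse_categorical_crossentropy(true_labels, predicted_probs))
```

The more probability the model puts on the right class, the closer the loss gets to 0.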
Now that the input data is processed and ready and the model architecture is defined, we can train the model using the model.fit() method.
model.fit(padded_sequences, labels, epochs=15, verbose=1)
padded_sequences: The input data for training the model, consisting of sequences that all have the same length (padding is discussed later in this tutorial).
labels: The target labels corresponding to the input data (i.e., the sentiment category assigned to each text sample).
epochs=15: An epoch is one complete pass through the entire training dataset. So, in this program, the model iterates over the complete dataset 15 times during training.
Increasing the number of epochs can improve performance, as the model learns more complex patterns from the data samples. However, if too many epochs are used, the model may memorize the training data (this is called "overfitting"), which leads to poor generalization on new data. The training time also grows with the number of epochs, and shrinks with fewer.
verbose=1: This parameter controls how much output the fit method prints during training. A value of 1 means a progress bar is shown in the console as the model trains, 0 means no output, and 2 means one line per epoch. Since it's nice to see the accuracy, the loss, and the time taken for each epoch, we'll set it to 1.
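One common way to pick the number of epochs, rather than guessing, is to watch the loss on data held out from training and stop once it no longer improves. The sketch below shows only that stopping logic in plain Python; the function name, the patience value, and the loss numbers are all invented for illustration and are not part of this tutorial's code.

```python
def best_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss, ending
    the scan once the loss has failed to improve `patience` times in a row."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Invented validation losses: improving at first, then getting worse.
val_losses = [1.05, 0.90, 0.78, 0.74, 0.79, 0.85, 0.93]
print(best_stopping_epoch(val_losses))  # 4
```

Keras offers this idea ready-made as the tf.keras.callbacks.EarlyStopping callback, which could be passed to model.fit() together with some validation data.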
After compiling and training, the model can finally make predictions using the predict() function. To test it, we need to give it some input: we write a few text statements and then ask the model to predict the sentiment of each one.
test_texts = ["The price was too high for the quality", "The interface is user-friendly", "I'm satisfied"]

test_sequences = []
for text in test_texts:
    words = text.lower().split()
    sequence = []
    for word in words:
        if word in word_index:
            sequence.append(word_index[word])
    test_sequences.append(sequence)
Here, test_texts stores some input data, while test_sequences is a list used to store the tokenized test data: the words of each text, split on whitespace after converting to lowercase, and mapped to their indices. Words that never appeared in the training data have no index, so they are skipped. Even so, test_sequences can't be used as input for the model yet.
The reason is that many deep learning frameworks, including TensorFlow, usually require the input data to have a uniform shape (meaning every sequence must have the same length) in order to process batches of data efficiently. To achieve this, you can use techniques like padding, where sequences are extended to match the length of the longest sequence in the dataset, using a special token such as # or 0 (0, in this example).
import numpy as np

padded_test_sequences = []
for sequence in test_sequences:
    padded_sequence = sequence[:max_length] + [0] * (max_length - len(sequence))
    padded_test_sequences.append(padded_sequence)

# Convert to numpy array
padded_test_sequences = np.array(padded_test_sequences)
In the given code,
padded_test_sequences: An empty list that stores the padded sequences used to test the model.
for sequence in test_sequences: Loops through each sequence in the "test_sequences" list.
padded_sequence: Creates a new padded sequence for each original one. The original sequence is first truncated to its first max_length elements so that no sequence is too long. Then, if it is shorter than max_length, it is padded with zeros, so that every sequence ends up with the same length.
padded_test_sequences.append(): Adds the padded sequence to the list that will be used for testing.
padded_test_sequences = np.array(): Converts the list of padded sequences into a Numpy array.
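The model definition and the model.fit() call earlier rely on max_length and padded_sequences, which need to exist for the training data before those steps run. Here is a minimal sketch of how they could be built with the same padding logic. It assumes max_length is simply the length of the longest training sequence, and pad_sequences_to is a helper name made up for this example.

```python
import numpy as np

def pad_sequences_to(sequences, max_length, pad_value=0):
    """Truncate or right-pad every sequence to exactly max_length items."""
    padded = [
        seq[:max_length] + [pad_value] * (max_length - len(seq))
        for seq in sequences
    ]
    return np.array(padded)

# Stand-in for the tokenized training data built earlier.
sequences = [[1, 2, 3, 4], [5, 6], [7, 8, 9]]

# Assumption: max_length is the length of the longest training sequence.
max_length = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences_to(sequences, max_length)
print(padded_sequences)
```

With the real data, sequences would be the list produced by the tokenization step, and this would go right after it.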
Now that the input data is ready to use, the model can finally predict the sentiment of the input texts.
predictions = model.predict(padded_test_sequences)

# Print predicted sentiments
for i, text in enumerate(test_texts):
    print(f"Text: {text}, Predicted Sentiment: {np.argmax(predictions[i])}")
In the above code, the model.predict() method generates a prediction for each test sequence, producing an array of predicted probabilities, one per sentiment category. The loop then goes through each element of test_texts, and np.argmax(predictions[i]) returns the index of the highest probability among the predictions for the i-th test sample. That index is the sentiment category the model considers most likely for the sample, so the model's best guess is what gets printed.
Special Notes: np.argmax() is a NumPy function that finds the index of the maximum value in an array. In this context, np.argmax(predictions[i]) picks out the sentiment category with the highest predicted probability for each test sample.
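Since the printed result is just a number, it can help to map it back to a readable label. The sketch below uses the dataset's encoding (0 = negative, 1 = positive, 2 = neutral); the probabilities are hypothetical values standing in for real model.predict() output.

```python
import numpy as np

# Label encoding used in the dataset: 0 = negative, 1 = positive, 2 = neutral.
label_names = {0: "negative", 1: "positive", 2: "neutral"}

# Hypothetical output of model.predict() for two test sentences.
predictions = np.array([
    [0.70, 0.10, 0.20],
    [0.05, 0.85, 0.10],
])

predicted_labels = []
for probs in predictions:
    class_index = int(np.argmax(probs))  # position of the largest probability
    predicted_labels.append(label_names[class_index])

print(predicted_labels)  # ['negative', 'positive']
```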
The program is now ready to run. After compiling and training, the Machine Learning model prints out its predictions for the input data.
In the model's output, we can see the "Accuracy" and "Loss" values for each epoch. As mentioned, accuracy is the proportion of correct predictions out of all predictions. Higher accuracy is better. An accuracy of 1.0 means 100%: the model predicted correctly in every case. Similarly, 0.5 means the model was right half the time, 0.25 means it was right a quarter of the time, and so on.
Loss, on the other hand, shows how far the model's predictions are from the true values. A lower loss means a better model with fewer errors, with 0 being the perfect loss value, as it means no errors were made.
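As a quick sanity check of what the accuracy number means, here it is computed by hand on a few invented labels:

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Invented labels: 3 of the 4 predictions are right.
print(accuracy([1, 0, 2, 1], [1, 0, 2, 0]))  # 0.75
```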
However, the per-epoch figures shown above don't tell us the overall accuracy and loss of the trained model. For that, we can evaluate the model using the evaluate() method and print its accuracy and loss.
evaluation = model.evaluate(padded_sequences, labels, verbose=0)

# Extract loss and accuracy
loss = evaluation[0]
accuracy = evaluation[1]

# Print loss and accuracy
print("Loss:", loss)
print("Accuracy:", accuracy)
Output:
Loss: 0.6483516097068787
Accuracy: 0.7058823704719543
So, for this model, the loss value is 0.6483, which means the model made some errors. The accuracy is about 70%, meaning the model's predictions were correct more than half the time. Keep in mind that these figures were measured on the same data the model was trained on, so they say little about how it handles text it has never seen. Overall, this can be considered a "slightly good" model; however, please note that what counts as a "good" loss and accuracy depends on the type of model, the size of the dataset, and the purpose of the particular Machine Learning model.
And yes, we can and should improve these metrics through fine-tuning and a better sample dataset. But for the sake of this tutorial, let's stop here. If you'd like a part two of this tutorial, please let me know!
In this tutorial, we built a TensorFlow Machine Learning model that can predict the sentiment of a given text after studying a sample dataset.
The complete code and the sample CSV file can be downloaded and viewed in the GitHub repository: GitHub - Buzzpy/Tensorflow-ML-Model.