IBM Researchers Create a Tiny AI Model That Predicts the Future

Too Long; Didn't Read

Researchers have developed a fast, effective alternative to large AI models for time-series forecasting.

Authors:

(1) Vijay Ekambaram, IBM Research;

(2) Arindam Jati, IBM Research;

(3) Nam H. Nguyen, IBM Research;

(4) Pankaj Dayama, IBM Research;

(5) Chandra Reddy, IBM Research;

(6) Wesley M. Gifford, IBM Research;

(7) Jayant Kalagnanam, IBM Research.

Editor's note: this is part 1 of 5 of a study describing the development of a small, fast AI model that delivers excellent accuracy. Read the rest below.

Table of Links

Abstract

Large pre-trained models for zero/few-shot learning excel in the language and vision domains but face challenges in multivariate time series (TS) due to the diverse nature and scarcity of publicly available pre-training data. Consequently, there has been a recent surge in utilizing pre-trained large language models (LLMs) with token adaptations for TS forecasting. These approaches employ cross-domain transfer learning and yield surprisingly impressive results. However, these models are typically very slow and large (∼billion parameters) and do not consider cross-channel correlations. To address this, we present Tiny Time Mixers (TTM), a significantly smaller model based on the lightweight TSMixer architecture. TTM marks the first success in developing fast and tiny pre-trained models (≤1M parameters), trained exclusively on public TS datasets, with effective transfer-learning capabilities for forecasting. To tackle the complexity of pre-training on multiple datasets with varied temporal resolutions, we introduce several novel enhancements, such as adaptive patching, dataset augmentation via downsampling, and resolution prefix tuning. Moreover, we employ a multi-level modeling strategy to effectively model channel correlations and infuse exogenous signals during fine-tuning, a crucial capability lacking in existing benchmarks. TTM shows significant accuracy gains (12-38%) over popular benchmarks in few/zero-shot forecasting. It also drastically reduces compute requirements compared to LLM-TS methods, with a 14X cut in learnable parameters, 106X fewer total parameters, and substantial reductions in fine-tuning (65X) and inference time (54X). In fact, TTM's zero-shot results often surpass the few-shot results in many popular benchmarks, highlighting the efficacy of our approach. Models and source code are available at https://huggingface.co/ibm/TTM

1. Introduction

Multivariate time-series (TS) forecasting involves predicting the future values of related time series based on historical data. This field has advanced significantly, employing statistical and machine learning (ML) methods [Hyndman and Athanasopoulos, 2021] across domains such as weather, traffic, retail, and energy. In general, each time series represents a variable or channel[1]. In certain applications, non-forecast variables, categorized as controllable and uncontrollable external factors, influence the forecast. We refer to these non-forecast variables as exogenous, and the variables requiring forecasts as target variables.
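The target/exogenous split above can be illustrated with a small array sketch. Everything here (channel count, which channel plays which role, the variable names) is a made-up example for clarity, not an interface from the paper:

```python
import numpy as np

# Illustrative only: a multivariate TS with 3 channels over 96 time steps.
# Suppose channels 0-1 are targets (to be forecast) and channel 2 is a
# known exogenous driver (e.g., a holiday indicator).
T, C = 96, 3
series = np.random.randn(T, C)

target_idx = [0, 1]   # variables requiring forecasts
exog_idx = [2]        # non-forecast external variable

targets = series[:, target_idx]    # shape (96, 2)
exogenous = series[:, exog_idx]    # shape (96, 1)
```

A forecaster would predict future values of `targets` while conditioning on past (and possibly known-future) values of `exogenous`.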


Related Work: Recent advancements in multivariate forecasting have been marked by the advent of transformer-based [Vaswani et al., 2017] approaches, exemplified by models like PatchTST [Nie et al., 2023], Autoformer [Wu et al., 2021], Informer [Zhou et al., 2021], and FEDformer [Zhou et al., 2022]. These models have shown notable improvements over traditional statistical and ML methods. Moreover, architectures based on MLP-Mixer [Tolstikhin et al., 2021], such as TSMixer [Ekambaram et al., 2023], have emerged as efficient alternatives to transformers, boasting 2-3X reduced compute and memory requirements with no loss of accuracy compared to their transformer counterparts. However, none of these leading approaches has successfully demonstrated the ability to build pre-trained models that can transfer learning to an unseen target TS dataset, in the way commonly witnessed in NLP and vision tasks. This is very challenging in the TS domain due to the diverse nature of the data across applications and the limited public availability of TS pre-training data. Existing self-supervised TS pre-training approaches use mask modeling and contrastive learning techniques, such as SimMTM [Dong et al., 2023] and TF-C [Zhang et al., 2022], which offer transfer learning between two datasets when they are carefully selected based on dataset properties. However, they fail to provide universal transfer-learning capabilities across all datasets. Consequently, there has been a recent growing trend of using pre-trained large language models (LLMs) for TS forecasting, treating it as a cross-domain transfer-learning task.
These universal cross-transfer approaches, particularly recent works such as LLMTime [Gruver et al., 2023] and GPT4TS [Zhou et al., 2023], produce promising results in few-shot/zero-shot forecasting. These models are bootstrapped from GPT-2/3 or LLAMA-2 with suitable tokenization strategies to adapt them to time-series domains.


However, these LLM-based TS approaches do not explicitly capture channel correlations or exogenous support in the context of multivariate forecasting. Moreover, these models are large, with billions of parameters, and demand significant computational resources and runtime. Hence, in this paper, we focus on building pre-trained models from scratch using only TS data. Unlike language, which has terabytes of abundant public pre-training data, time-series data is scarce, highly diverse, and limited in public availability. Its scarcity leads to severe overfitting when "large" models are pre-trained solely on time-series data. This raises the question: can smaller models pre-trained exclusively on limited public TS datasets offer better zero/few-shot forecasting? Surprisingly, the answer is yes! Toward this, we propose Multi-level Tiny Time Mixers (TTM), a significantly smaller model (≤1M parameters) based on the lightweight TSMixer architecture, trained exclusively on diverse TS corpora for zero/few-shot multivariate TS forecasting via transfer learning.


Specifically, TTM is pre-trained using multiple public datasets (∼244M samples) from the Monash data repository [2] [Godahewa et al., 2021]. Note that the datasets exhibit great diversity in their characteristics, such as different domains, temporal resolutions[3] (ranging from seconds to daily), lengths, and numbers of channels. Pre-training on such heterogeneous datasets cannot be handled directly by TSMixer or any existing state-of-the-art (SOTA) model. Hence, TTM proposes the following enhancements to the TSMixer architecture: (i) Adaptive Patching across layers, accounting for the varied suitability of patch lengths for different datasets, (ii) Dataset Augmentation via Downsampling to increase coverage and samples across different resolutions, (iii) Resolution Prefix Tuning to explicitly condition the model on the resolution, particularly beneficial in scenarios with short history lengths. Furthermore, our approach employs a multi-level modeling strategy, wherein TTM is first pre-trained in a channel-independent way and then seamlessly integrates channel mixing during fine-tuning to model the target data's channel correlations and exogenous infusion.
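The dataset-augmentation-via-downsampling idea can be sketched in a few lines: from one high-resolution series, extra training series at coarser resolutions are derived. The averaging scheme, factors, and names below are illustrative assumptions, not TTM's exact recipe:

```python
import numpy as np

def downsample(series: np.ndarray, factor: int) -> np.ndarray:
    """Average non-overlapping windows of length `factor`
    (any trailing remainder is dropped)."""
    n = (len(series) // factor) * factor
    return series[:n].reshape(-1, factor).mean(axis=1)

hourly = np.arange(48, dtype=float)        # pretend 48 hourly observations
augmented = {
    "hourly": hourly,                      # original resolution, 48 points
    "4-hourly": downsample(hourly, 4),     # 12 points
    "daily": downsample(hourly, 24),       # 2 points
}
```

Each derived series then contributes additional pre-training samples at a resolution the original dataset did not cover.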


Below, we outline the paper's key contributions:


• Amid the prevalence of large pre-trained models demanding significant compute and training time (in weeks), our work is the first to demonstrate the efficacy of building Fast and Tiny pre-trained model architectures (≤1M parameters), trained exclusively on public datasets in just a few hours (4-8 hours, 6 A100 GPUs). TTM successfully demonstrates transfer learning to a variety of unseen target datasets for zero/few-shot forecasting, addressing the data-scarcity issues prevalent in time series.


• Pre-training on heterogeneous multi-resolution datasets cannot be handled effectively by TSMixer or other SOTA models. Hence, we propose multiple architectural and training enhancements, such as adaptive patching, data augmentation via downsampling, and (optional) resolution prefix tuning, for robust pre-training.


• TTM employs a multi-level modeling strategy to explicitly model channel correlations and incorporates exogenous signals, a crucial capability lacking in LLM-based TS approaches.


• Through extensive evaluations on 11 datasets, TTM demonstrates notable accuracy gains over popular benchmarks (12-38% in few/zero-shot forecasting). It also drastically reduces compute requirements compared to LLM-TS methods, with a 14X cut in learnable parameters, 106X fewer total parameters, and substantial reductions in fine-tuning (65X), inference time (54X), and memory usage (27X).

• TTM's zero-shot results often surpass the few-shot results of many SOTA approaches, highlighting the effectiveness of our method.
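The adaptive-patching enhancement listed above can be visualized with a small reshape sketch: each backbone level views the same history window at a different patch length, so datasets whose dynamics favor short or long patches are both accommodated. The window size, patch lengths, and per-level assignment here are illustrative assumptions, not TTM's exact scheme:

```python
import numpy as np

# One channel's history window (length chosen so every patch length divides it).
context = np.arange(512, dtype=float)

# One hypothetical patch length per backbone level, coarse to fine.
patch_lengths = [64, 32, 16, 8]
patched_views = {p: context.reshape(-1, p) for p in patch_lengths}

# Coarse levels see few long patches; fine levels see many short ones.
shapes = {p: v.shape for p, v in patched_views.items()}
# e.g., 8 patches of length 64 at the coarsest level,
#       64 patches of length 8 at the finest.
```

Varying the patch length across levels lets a single pre-trained backbone serve datasets whose optimal patch granularity differs.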


This paper is available on arxiv under the CC BY-NC-ND 4.0 DEED license.


[1] "Channel" refers to an individual time series in multivariate data (i.e., a multivariate TS is a multi-channel signal).


[2] Available here: https://forecastingdata.org/


[3] Resolution refers to the sampling rate of the input time series (e.g., hourly, 10 minutes, 15 minutes, etc.)


About Author

The FewShot Prompting Publication @fewshot
Spearheading research, publications, and advancements in few-shot learning, and redefining artificial intelligence.
