paint-brush
Ukusombulula ezona zibini zimfutshane zeNgxaki ngeFloyd-Warshall Algorithm kwiC # nge@olegkarasik
678 ukufunda
678 ukufunda

Ukusombulula ezona zibini zimfutshane zeNgxaki ngeFloyd-Warshall Algorithm kwiC #

nge Oleg Karasik28m2024/09/26
Read on Terminal Reader

Inde kakhulu; Ukufunda

Kule post ndibonisa indlela onokuthi uphumeze ngayo i-algorithm ye-Floyd-Warshall kwi-C # ukusombulula zonke-izibini eyona ngxaki imfutshane yendlela. Ngaphandle kokuphunyezwa kwesi sithuba sibandakanya ukuphuculwa kwe-algorithm eyahlukahlukeneyo kubandakanya i-vectorisation kunye ne-parallelism.
featured image - Ukusombulula ezona zibini zimfutshane zeNgxaki ngeFloyd-Warshall Algorithm kwiC #
Oleg Karasik HackerNoon profile picture

Sonke sisombulula ingxaki " yeyona ndlela imfutshane " amaxesha amaninzi ngemini. Ngokungeyomfuneko kunjalo. Siyicombulula xa sisemsebenzini okanye xa sijonga i-Intanethi okanye xa sicwangcisa izinto zethu edesikeni.


Ivakala kancinci… kakhulu? Makhe sifumanise.


Khawufane ucinge, uthathe isigqibo sokudibana nabahlobo, kulungile ... masithi kwi-cafe. Okokuqala, kufuneka ufumane indlela (okanye indlela) eya kwi-cafe, ngoko uqala ukukhangela izithuthi zikawonke-wonke ezikhoyo (ukuba uhamba ngeenyawo) okanye iindlela kunye nezitrato (ukuba uqhuba). Ukhangela indlela esuka kwindawo okuyo ukuya kwikhefi kodwa hayi "nayiphi na" indlela - eyona imfutshane okanye ikhawulezayo.


Nanku omnye umzekelo, ongacacanga njengowangaphambili. Ngexesha lomsebenzi uthatha isigqibo sokuthatha "indlela emfutshane" kwisitalato esisecaleni kuba ... "yindlela emfutshane" kwaye "ikhawuleza" ukuhamba ngale ndlela. Kodwa wazi njani ukuba le "short cut" iyakhawuleza? Ngokusekelwe kumava akho, usombulule ingxaki "yeyona ndlela imfutshane" kwaye ukhethe indlela, ehamba ngesitrato esisecaleni.


Kuyo yomibini imizekelo, indlela “emfutshane” imiselwa nokuba ngumgama okanye ixesha elifunekayo ukusuka kwenye indawo ukuya kwenye. Imizekelo ehambayo lusetyenziso lwendalo lwethiyori yegrafu kunye nengxaki "yeyona ndlela imfutshane" ngakumbi. Noko ke, kuthekani ngomzekelo wokulungelelanisa izinto edesikeni? Kule meko "emfutshane" inokumela inani okanye ukuntsonkotha kwezenzo ekufuneka uzenze ukufumana, umzekelo, iphepha: idesika evulekileyo, ifolda evulekileyo, thatha iphepha vs ukuthatha iphepha ekunene kwidesika. .


Yonke le mizekelo ingasentla imele ingxaki yokufumana eyona ndlela imfutshane phakathi kwee-vertex ezimbini kwigrafu (indlela okanye indlela phakathi kweendawo ezimbini, inani lezenzo okanye ubunzima bokufumana iphepha ukusuka kwindawo enye okanye kwenye). Olu didi lweengxaki zendlela ezimfutshane lubizwa ngokuba yi-SSSP (Umthombo omnye oMfutshane weNdlela) kunye nesiseko se-algorithm yokusombulula ezi ngxaki yi -algorithm ye-Dijkstra , ene- O(n^2) yobunzima bokubala.


Kodwa, ngamanye amaxesha kufuneka sifumane zonke iindlela ezimfutshane phakathi kwazo zonke ii-vertex. Qwalasela lo mzekelo ulandelayo: wenzela imephu yakho iintshukumo rhoqo phakathi kwekhaya , umsebenzi kunye nethiyetha . Kulo mzekelo uya kuphelela ngeendlela ezi-6: work ⭢ home , home ⭢ work , work ⭢ theatre , theatre ⭢ work , home ⭢ theatre kunye theatre ⭢ home (iindlela ezibuyela umva zinokwahluka ngenxa yeendlela zendlela enye umzekelo) .


Ukongeza indawo eyongezelelekileyo kwimephu kuya kukhokelela ekukhuleni okubalulekileyo kokudityaniswa - ngokungqinelana nefomula ye-n ye -combinatorics ingabalwa ngolu hlobo:

 A(k, n) = n! / (n - m)! // where n - is a number of elements, // k - is a number of elements in permutation (in our case k = 2)

Okusinika iindlela ezili-12 zeendawo ezi-4 kunye neendlela ezingama-90 zeendawo ezili-10 (ezichukumisayo). Qaphela… oku ngaphandle kokuthathela ingqalelo iindawo eziphakathi phakathi kweendawo okt, ukusuka ekhaya ukuya emsebenzini kufuneka uwele izitalato ezi-4, uhambe ngomlambo kwaye uwele ibhulorho. Ukuba siyaqikelela, ezinye iindlela zinokuba neendawo eziqhelekileyo eziphakathi… kuhle ... ngenxa yoko siya kuba negrafu enkulu kakhulu, eneentloko ezininzi, apho i-vertex nganye iyakumela nokuba yindawo okanye indawo ephakathi ebalulekileyo. Udidi lweengxaki, apho kufuneka sifumane zonke iindlela ezimfutshane phakathi kwazo zonke izibini ze-vertexes kwigrafu, ibizwa ngokuba yi -APSP (Zonke Izibini Ezifutshane) kunye nesiseko se-algorithm yokusombulula ezi ngxaki yi -algorithm ye-Floyd-Warshall , ene- O(n^3) ukuntsokotha kokubala.


Kwaye le yi-algorithm esiza kuyiphumeza namhlanje 🙂

Floyd-Warshall algorithm

I-algorithm ye-Floyd-Warshall ifumana zonke iindlela ezimfutshane phakathi kweperi nganye yeevertex kwigrafu. I-algorithms yapapashwa nguRobert Floyd kwi- [1] (jonga icandelo elithi "References" ukuze ufumane iinkcukacha ezingaphezulu). Kwangalo nyaka, uPeter Ingerman kwi- [2] uchaze ukuphunyezwa kwe-algorithm yangoku ngohlobo lwezintlu ezintathu ezibekwe for :

 algorithm FloydWarshall(W) do for k = 0 to N - 1 do for i = 0 to N - 1 do for j = 0 to N - 1 do W[i, j] = min(W[i, j], W[i, k] + W[k, j]) end for end for end for end algorithm // where W - is a weight matrix of N x N size, // min() - is a function which returns lesser of it's arguments

Ukuba awuzange ube notshintsho ekusebenzeni ngegrafu emelwe ngohlobo lwe-matrix ngoko kunokuba nzima ukuqonda ukuba i-algorithm ingentla yenza ntoni. Ke, ukuqinisekisa ukuba sikwiphepha elinye, makhe sijonge ukuba igrafu inokumelwa njani ngohlobo lwe-matrix kwaye kutheni umelo olunjalo luluncedo ukusombulula eyona ngxaki imfutshane yendlela.


Umfanekiso ongezantsi ubonisa igrafu eqondisiweyo, enobunzima bee-vertexes ezi-5. Ngasekhohlo, igrafu iboniswa ngendlela ebonwayo, eyenziwe ngezangqa kunye neentolo, apho isangqa ngasinye simele i-vertex kwaye utolo lumele isiphelo esinesalathiso. Inani elingaphakathi kwesangqa lihambelana nenombolo yevertex kunye nenani elingentla komphetho liyahambelana nobunzima bomphetho. Ngasekunene, igrafu efanayo inikezelwa ngendlela ye-matrix yobunzima. I-matrix yobunzima luhlobo lwe -matrix ekufutshane apho iseli nganye ye-matrix iqulethe "ubunzima" - umgama phakathi kwe-vertex i (umqolo) kunye ne-vertex j (ikholamu). I-matrix yobunzima ayibandakanyi ulwazi malunga “nendlela” phakathi kwee-vertex (uluhlu lwee-vertex ofumana ngalo ukusuka ku i ukuya ku j ) – nje ubunzima (umgama) phakathi kwezi vesi.


Umfanekiso 1. Ukumelwa kwegrafu eqondisiweyo, enobunzima bee-vertex ezi-5 kwifom ebonakalayo (ngasekhohlo) kunye nefom ye-matrix enobunzima (ngasekunene).


Kwi-matrix yobunzima sinokubona ukuba ixabiso leseli lilingana nobunzima phakathi kwee-vertex ezikufutshane . Yiyo loo nto, ukuba sihlola iindlela ukusuka kwi-vertex 0 (umqolo we 0 ), siya kubona ukuba ... kukho indlela enye kuphela - ukusuka ku 0 ukuya ku 1 . Kodwa, kumboniso obonakalayo sinokubona ngokucacileyo iindlela ukusuka kwi-vertex 0 ukuya kwi-vertexes yesi 2 kunye ne 3 (nge-vertex 1 ). Isizathu salokhu silula - kwimeko yokuqala, i-matrix yobunzima iqulethe umgama phakathi kwee-vertex ezikufutshane kuphela. Nangona kunjalo, olu lwazi lulodwa lwanele ukufumana ukuphumla.


Makhe sibone ukuba isebenza njani. Nika ingqalelo kwiseli W[0, 1] . Ixabiso libonisa, kukho indlela esuka kwivertex 0 ukuya kwivertex 1 enobunzima obulingana no 1 (ngokufutshane singabhala njenge: 0 ⭢ 1: 1 ). Ngolu lwazi, ngoku sinokuskena zonke iiseli zomqolo woku 1 (oqulethe zonke izisindo zazo zonke iindlela ukusuka kwi-vertex 1 ) kunye ne-port yangasemva le ngcaciso kumqolo 0 , ukwandisa ubunzima ngo 1 (ixabiso le- W[0, 1] ).


Umfanekiso 2. Umboniso wokufumana zonke iindlela ukusuka kwivetheksi ukuya kwivetheksi emelene nevetheksi yoku-1.


Ngokusebenzisa amanyathelo afanayo sinokufumana iindlela ukusuka kwi-vertex 0 ukuya kwezinye ii-vertex. Ngexesha lokukhangela, kunokwenzeka ukuba kukho iindlela ezingaphezu kwenye ezikhokelela kwi-vertex enye kwaye yintoni ebaluleke kakhulu ubunzima bezi ndlela zinokwahluka. Umzekelo wemeko enjalo ubonisiwe kumfanekiso ongezantsi, apho ukukhangela ukusuka kwi-vertex 0 ukuya kwi-vertex 2 kutyhile enye indlela eya kwi-vertex yesi 3 yobunzima obuncinane.


Umfanekiso wesi-3. Umzekeliso wemeko apho uphendlo olusuka kwi-vertex ukuya kwi-vertex yesi-2 luveze indlela eyongezelelweyo nemfutshane eya kwi-vertex yesi-3.


Sineendlela ezimbini: indlela yoqobo – 0 ⭢ 3: 4 kunye nendlela entsha esisandula ukuyifumanisa – 0 ⭢ 2 ⭢ 3: 3 (khumbula, ubunzima bematrix ayiqulathanga iindlela, ngoko asazi ukuba yeyiphi. iivertex zibandakanyiwe kwindlela yokuqala). Eli lixesha xa sithatha isigqibo sokugcina eyona imfutshane kwaye sibhale 3 kwiseli W[0, 3] .


Kubonakala ngathi sifumene eyona ndlela yethu yokuqala imfutshane!


Amanyathelo esisanda kuwabona angoyena ndoqo we-algorithm ye-Floyd-Warshall. Jonga i-algorithm's pseudocode ixesha elingakumbi:

 algorithm FloydWarshall(W) do for k = 0 to N - 1 do for i = 0 to N - 1 do for j = 0 to N - 1 do W[i, j] = min(W[i, j], W[i, k] + W[k, j]) end for end for end for end algorithm

Umjikelo ongaphandle for - k iiterates phezu kwazo zonke iivertex kwigrafu kwaye kuphindaphindo ngalunye, ulwahlulo u k umele ivertex esizingela iindlela kuyo . Umjikelo wangaphakathi for i iphinda iphindaphinde ngaphezu kwazo zonke ii-vertex kwigrafu kwaye kwi-iteration nganye, i i-vertex esikhangela iindlela ukusuka kuyo . Kwaye okokugqibela umjikelo ongaphakathi for j iterates phezu kwazo zonke iivertex kwigrafu nakuphinda-phindo ngalunye, j imele ivertex esikhangela indlela eya kuyo. Xa kudityanisiwe isinika oku kulandelayo: kwi-iteration nganye k sikhangela umendo kuzo zonke iivertex i ukuya kuzo zonke iivertex j ukugqitha kwivertex k . Ngaphakathi kumjikelo sithelekisa indlela i ⭢ j (emelwe ngu W[i, j] ) kunye nendlela i ⭢ k ⭢ j (emelwe sisimbuku se- W[I, k] kunye ne W[k, j] ) kunye nokubhala eyona imfutshane enye ibuyele kwi W[i, j] .


Ngoku, xa siqonda i-mechanics lixesha lokuphumeza i-algorithm.

Ukuphunyezwa okusisiseko

Ikhowudi yomthombo kunye neegrafu zokulinga ziyafumaneka kwindawo yokugcina kwi-GitHub. Iigrafu zovavanyo zinokufunyanwa kuvimba Data/Sparse-Graphs.zip . Zonke iibenchmarks kwesi sithuba ziphunyezwe kwifayile ye -APSP01.cs .

Ngaphambi kokuba singene ekuphunyezweni kufuneka sicacise amaxesha ambalwa obugcisa:

  1. Zonke izinto eziphunyeziweyo zisebenza nge-matrix yobunzima obumelwe ngokohlobo lomgca womgca.
  2. Lonke uzalisekiso lusebenzisa izibalo ezipheleleyo. Ukungabikho kwendlela phakathi kwee-vertex kubonakaliswa ubunzima obukhethekileyo obuqhubekayo: NO_EDGE = (int.MaxValue / 2) - 1 .


Ngoku, makhe sibone ukuba kutheni.


Malunga ne-#1. Xa sithetha ngee-matrixes sichaza iiseli ngokwe “miqolo” kunye “nezikholamu”. Ngenxa yoko kubonakala kungokwemvelo ukucinga i-matrix ngendlela "yesikwere" okanye "uxande" (Umfanekiso 4a).


Umfanekiso 4: Iimboniso ezininzi zematrix. a) umfanekiso wentelekelelo “osisikwere”; b) uluhlu lwamagama amelweyo; c) Umelo olulandelelanayo.


Nangona kunjalo, oku akuyomfuneko kuthetha ukuba kufuneka simele imatrix ngohlobo loluhlu lwee-arrays (Umfanekiso 4b) ukunamathela kwintelekelelo yethu. Endaweni yoku, sinokumela imatrix ngendlela yoluhlu lomgca (Umfanekiso 4c) apho isalathiso seseli nganye sibalwa kusetyenziswa le fomula ilandelayo:

 i = row * row_size + col; // where row - cell row index, // col - cell column index, // row_size - number of cells in a row.

Uluhlu lomgca wobunzima be-matrix ngumqathango wangaphambili wokuphunyezwa okusebenzayo kwe-algorithm ye-Floyd-Warshall. Isizathu soko asilula kwaye ingcaciso eneenkcukacha ifanelwe yiposti eyahlukileyo… okanye izithuba ezimbalwa. Nangona kunjalo, okwangoku, kubalulekile ukukhankanya ukuba ukumelwa okunjalo kuphucula kakhulu indawo yedatha , eneneni enempembelelo enkulu ekusebenzeni kwe-algorithm.

Apha, ndicela ukuba undikholelwe kwaye nje olu lwazi engqondweni njengemeko yangaphambili, nangona kunjalo, kwangaxeshanye ndincoma ukuba uchithe ixesha kwaye ufunde lo mbuzo, kwaye ngendlela - ungakholelwa abantu kwi-Intanethi. .


- Inqaku lombhali

Ngokuphathelele #2. Ukuba ujonga ngokusondeleyo kwi-algorithm yepseudocode, awuzukufumana naziphi iitshekhi eziyelelene nobukho bendlela phakathi kweeverteksi ezimbini. Endaweni yoko, pseudocode sebenzisa nje min() umsebenzi. Isizathu silula - ekuqaleni, ukuba akukho mendo phakathi kweeverteksi, ixabiso leseli limiselwe kwi infinity kwaye kuzo zonke iilwimi, ngaphandle kokuba iJavaScript, onke amaxabiso angaphantsi kune infinity . Kwimeko yeenombolo ezipheleleyo kunokuhenda ukusebenzisa int.MaxValue njengexabiso elithi "akukho ndlela". Nangona kunjalo oku kuyakukhokelela ekuphuphumeni okupheleleyo kwiimeko xa amaxabiso azo zombini i ⭢ k no k ⭢ j iindlela ziyakulingana ne int.MaxValue . Yiyo loo nto sisebenzisa ixabiso elingaphantsi kwesiqingatha se- int.MaxValue .

Hayi! Kodwa kutheni singenakujonga nje ukuba indlela ikhona na phambi kokuba senze naziphi na izibalo. Umzekelo ngokuthelekisa iindlela zombini ukuya ku-0 (ukuba sithatha u-zero njengexabiso elithi "akukho ndlela").


- Umfundi ocingayo

Inokwenzeka ngokwenene kodwa ngelishwa iya kukhokelela kwisohlwayo esibalulekileyo sokusebenza. Ngokufutshane, i-CPU igcina iinkcukacha-manani zeziphumo zovavanyo lwesebe. xa if zengxelo zivavanya ukuba true okanye false . Isebenzisa olu balo-manani ukwenza ikhowudi "yesebe eliqikelelweyo ngokweenkcukacha-manani" ngaphambili phambi kokuba if ivavanywe (oku kubizwa ngokuba yintelekelelo ekwenziweni ) kwaye ke ngoko iphumeze ikhowudi ngokufanelekileyo ngakumbi. Nangona kunjalo, xa ukubikezelwa kwe-CPU kungachanekanga, kubangela ilahleko enkulu yokusebenza xa kuthelekiswa nokuqikelelwa okuchanekileyo kunye nokuphunyezwa okungenamiqathango kuba i-CPU kufuneka ime kwaye ibale isebe elichanekileyo.


Kuba kwi-iteration k nganye sihlaziya inxalenye ebalulekileyo ye-matrix yobunzima, izibalo ze-CPU zesebe ziba lilize kuba akukho pateni yekhowudi, onke amasebe asekelwe ngokukodwa kwidatha. Ngoko ke oko kuhlolwa kuya kukhokelela kwisixa esibalulekileyo sokungacingelwa kakuhle kwesebe .

Apha kwakhona ndicela ukuba undikholelwe (okwangoku) kwaye emva koko uchithe ixesha lokufunda isihloko.


Uhh, kujongeka ngathi sigqibile ngethiyori inxalenye - masiphumeze i-algorithm (sichaza oku kuphunyezwa Baseline ):

 public void Baseline(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { for (var i = 0; i < sz; ++i) { for (var j = 0; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } } } }

Le khowudi ingentla iphantse ifane ikopi yepseudocode ekhankanywe ngaphambili ngaphandle kokunye - endaweni Math.Min() sisebenzisa if sihlaziya iseli kuphela xa kuyimfuneko.

Hayi! Yima umzuzu, yayingenguwe owabhala amagama amaninzi malunga nokuba kutheni ukuba akulunganga apha kunye nemigca embalwa kamva thina ngokwethu sazisa ukuba?


- Umfundi ocingayo

Isizathu silula. Ngomzuzu wokubhala i-JIT ikhupha ikhowudi ephantse ilingane kuzo zombini if kunye nokuphunyezwa Math.Min . Ungayijonga kwiinkcukacha kwi-shartlab.io kodwa nantsi i-snippets yemizimba ephambili ye-loop:

 // // == Assembly code of implementation of innermost loop for of j using if. // 53: movsxd r14, r14d // // compare matrix[i * sz + j] and distance (Condition) // 56: cmp [rdx+r14*4+0x10], ebx 5b: jle short 64 // // if matrix[i * sz + j] greater than distance write distance to matrix // 5d: movsxd rbp, ebp 60: mov [rdx+rbp*4+0x10], ebx 64: // end of loop // // == Assembly code of implementation of innermost loop for of j using Math.Min. // 4f: movsxd rbp, ebp 52: mov r14d, [rdx+rbp*4+0x10] // // compare matrix[i * sz + j] and distance (Condition) // 57: cmp r14d, ebx. // // if matrix[i * sz + j] less than distance write value to matrix // 5a: jle short 5e // // otherwise write distance to matrix // 5c: jmp short 61 5e: mov ebx, r14d 61: mov [rdx+rbp*4+0x10], ebx 65: // end of loop

Ungabona, nokuba sisebenzisa if okanye Math.Min kusekho ujongo olunemiqathango. Nangona kunjalo, kwimeko if akukho bhala ngokungeyomfuneko.


Ngoku xa sigqibile ngokuphunyezwa, lixesha lokuqhuba ikhowudi kwaye sibone ukuba ikhawuleza kangakanani?!

Ungaqinisekisa ukuchaneka kwekhowudi ngokwakho ngokuqhuba iimvavanyo kwindawo yokugcina .

Ndisebenzisa iBenchmark.NET (uguqulelo 0.12.1) kwikhowudi yebenchmark. Zonke iigrafu ezisetyenziswe kwiibenchmarks ziqondiswe , iigrafu ze-acyclic ze-300, 600, 1200, 2400 kunye ne-4800 vertexes. Inani lemiphetho kwiigrafu lijikeleze i-80% yobuninzi obunokwenzeka apho ulwalathiso, iigrafu ze-acyclic zingabalwa ngolu hlobo:

 var max = v * (v - 1)) / 2; // where v - is a number of vertexes in a graph.

Masishukume!


Nazi iziphumo zebenchmarks eziqhutywa kwiPC yam (Windows 10.0.19042, Intel Core i7-7700 CPU 3.60Ghz (Kaby Lake) / 8 iiprosesa ezinengqiqo / 4 cores):


Indlela

Ubungakanani

Ithetha

Impazamo

StdDev

Isiseko

300

27.525 ms

0.1937 ms

0.1617 ms

Isiseko

600

217.897 ms

1.6415 ms

1.5355 ms

Isiseko

1200

1,763.335 ms

7.4561 ms

6.2262 ms

Isiseko

2400

14,533.335 ms

63.3518 ms

52.9016 ms

Isiseko

4800

119,768.219 ms

181.5514 ms

160.9406 ms


Kwiziphumo esizibonayo, ixesha lokubala likhula ngokumangalisayo xa lithelekiswa nobukhulu begrafu - kwigrafu ye-300 vertexes ithathe i-27 milliseconds, igrafu ye-2400 vertex - imizuzwana ye-14.5 kunye negrafu ye-4800 - 119 imizuzwana phantse 2 imizuzu !


Ukujonga ikhowudi ye-algorithm kunokuba nzima ukuyicinga, kukho into esinokuyenza ukukhawulezisa izibalo… kuba kuhle… kukho iilophu AMATHATHU, iilophu AMATHATHU kuphela.


Nangona kunjalo, njengoko kuqhele ukwenzeka - amathuba afihliweyo kwiinkcukacha 🙂

Yazi wena idatha - iigrafu ezimbalwa

I-algorithm ye-Floyd-Warshall yi-algorithm yesiseko sokusombulula zonke-izibini eyona ngxaki imfutshane yendlela, ngakumbi xa kufikwa kwiigrafu ezixineneyo okanye ezipheleleyo (kuba i-algorithm yokukhangela iindlela phakathi kwazo zonke izibini ze-vertexes).


Nangona kunjalo, kwiimvavanyo zethu sisebenzisa i-directed, iigrafu ze-acyclic , ezinepropati emangalisayo - ukuba kukho indlela esuka kwi-vertex 1 ukuya kwi- 2 , ngoko akukho ndlela esuka kwi-vertex 2 ukuya kwi- 1 . Kithina, kuthetha ukuba, zininzi ii-vertex ezingadibananga esinokutsiba ukuba akukho ndlela esuka ku i ukuya ku k (sichaza oku kuphunyezwa njenge SpartialOptimisation ).

 public void SpartialOptimisation(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { for (var i = 0; i < sz; ++i) { if (matrix[i * sz + k] == NO_EDGE) { continue; } for (var j = 0; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } } } }

Nazi iziphumo zangaphambili ( Baseline ) kunye nangoku ( SpartialOptimisation ) yangoku kwiseti yeegrafu (iziphumo ezikhawulezayo ziphawulwe ngokungqindilili ):


Indlela

Ubungakanani

Ithetha

Impazamo

StdDev

Umlinganiselo

Isiseko

300

27.525 ms

0.1937 ms

0.1617 ms

1.00

SpartialOptimisation

300

12.399 ms

0.0943 ms

0.0882 ms

0.45

Isiseko

600

217.897 ms

1.6415 ms

1.5355 ms

1.00

SpartialOptimisation

600

99.122 ms

0.8230 ms

0.7698 ms

0.45

Isiseko

1200

1,763.335 ms

7.4561 ms

6.2262 ms

1.00

SpartialOptimisation

1200

766.675 ms

6.1147 ms

5.7197 ms

0.43

Isiseko

2400

14,533.335 ms

63.3518 ms

52.9016 ms

1.00

SpartialOptimisation

2400

6,507.878 ms

28.2317 ms

26.4079 ms

0.45

Isiseko

4800

119,768.219 ms

181.5514 ms

160.9406 ms

1.00

SpartialOptimisation

4800

55,590.374 ms

414.6051 ms

387.8218 ms

0.46


Iyancomeka kakhulu!


Sinciphise ixesha lokubala ngesiqingatha! Ngokuqinisekileyo, i-denser igrafu, isantya esincinci siya kuba. Nangona kunjalo, olu lolunye lolungiselelo oluza luncedo ukuba uyazi kwangaphambili ukuba loluphi udidi lwedatha ojonge ukusebenza nalo.


Ngaba sinokwenza okungakumbi? 🙂

Yazi i-hardware yakho - ukuhambelana


I-PC yam ixhotyiswe nge Inter Core i7-7700 CPU 3.60GHz (Kaby Lake) iprosesa ene-8 logical processors ( HW ) okanye ii-cores ze-4 ezinobuchwepheshe be-Hyper-Threading . Ukuba nondoqo ongaphezulu kwesinye kufana nokuba “nezandla ezizezinye” esinokuzisebenzisa. Ke, masibone ukuba leliphi icandelo lomsebenzi elinokwahlulwa ngokufanelekileyo phakathi kwabasebenzi abaninzi kwaye emva koko, lingqamanise.


Iiluphu zisoloko zingoyena mviwa ucacileyo wokunxulunyaniswa kuba kwiimeko ezininzi uphindaphindo luzimele kwaye lunokwenziwa ngaxeshanye. Kwi-algorithm, sinee-loops ezintathu ekufuneka sizihlalutye kwaye sibone ukuba kukho naziphi na izinto ezixhomekeke ekusithinteleni ukuba zihambelane nazo.


Masiqale for of k loop. Kuluphu ngalunye lokuphindaphinda ubala iindlela ukusuka kwivertex nganye ukuya kwivertex nganye kwivertex k . Iindlela ezintsha kunye nezihlaziyiweyo zigcinwa kwi-matrix yobunzima. Ukujonga uphindaphindo onokuthi uqaphele - banokubulawa ngalo naluphi na ulandelelwano: 0, 1, 2, 3 okanye 3, 1, 2, 0 ngaphandle kokuchaphazela isiphumo. Nangona kunjalo, kusafuneka ziphunyezwe ngokulandelelana, kungenjalo ezinye iiterations aziyi kukwazi ukusebenzisa iindlela ezintsha okanye ezihlaziyiweyo kuba aziyi kubhalwa kwi-matrix yobunzima, okwangoku. Ukungahambelani okunjalo ngokuqinisekileyo kuya kutyumza iziphumo. Ngoko kufuneka siqhubeke sikhangela.


Umgqatswa olandelayo for of i loop. Kuluphu ngalunye lokuphinda lufundeka indlela esuka kwivertex i ukuya kwivertex k (iseli: W[i, k] ), indlela esuka kwivertex k ukuya kwivertex j (iseli: W[k, j ]) kwaye emva koko ijonga ukuba ngaba indlela eyaziwayo evela i ukuya ku j (iseli: W[i, j] ) mfutshane kuno i ⭢ k ⭢ j indlela (isimbuku se: W[i, k] + W[k, j] ). Ukuba ujonga ngokusondeleyo kwipateni yokufikelela, unokuqaphela - kwi-iteration nganye i loop ifunda idatha ukusuka k kumqolo kwaye ihlaziywe i rowu ethetha ngokusisiseko - uphindaphindo luzimele kwaye lunokuphunyezwa kungekuphela kulo naluphi na ulandelelwano kodwa nangokuhambelana!


Oku kubonakala kuthembisa, ngoko ke masikuphumeze (sichaza olu phumezo njenge SpartialParallelOptimisations ).

for of j loop nayo inokudityaniswa. Nangona kunjalo, ukuhambelana komjikelo ongaphakathi kulo mzekelo akuphumelelanga kakhulu. Unokuziqinisekisa ngokwenza utshintsho olulula kwikhowudi yemvelaphi .


- Inqaku lombhali

 public void SpartialParallelOptimisations(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { Parallel.For(0, sz, i => { if (matrix[i * sz + k] == NO_EDGE) { return; } for (var j = 0; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } }); } }

Nazi iziphumo ze Baseline , SpartialOptimisation kunye SpartialParallelOptimisations uzalisekiso kwiseti enye yeegrafu (ukulinganisa kwenziwa kusetyenziswa i-Parallel class):


Indlela

Ubungakanani

Ithetha

Impazamo

StdDev

Umlinganiselo

Isiseko

300

27.525 ms

0.1937 ms

0.1617 ms

1.00

SpartialOptimisation

300

12.399 ms

0.0943 ms

0.0882 ms

0.45

SpartialParallelOptimisations

300

6.252 ms

0.0211 ms

0.0187 ms

0.23

Isiseko

600

217.897 ms

1.6415 ms

1.5355 ms

1.00

SpartialOptimisation

600

99.122 ms

0.8230 ms

0.7698 ms

0.45

SpartialParallelOptimisations

600

35.825 ms

0.1003 ms

0.0837 ms

0.16

Isiseko

1200

1,763.335 ms

7.4561 ms

6.2262 ms

1.00

SpartialOptimisation

1200

766.675 ms

6.1147 ms

5.7197 ms

0.43

SpartialParallelOptimisations

1200

248.801 ms

0.6040 ms

0.5043 ms

0.14

Isiseko

2400

14,533.335 ms

63.3518 ms

52.9016 ms

1.00

SpartialOptimisation

2400

6,507.878 ms

28.2317 ms

26.4079 ms

0.45

SpartialParallelOptimisations

2400

2,076.403 ms

20.8320 ms

19.4863 ms

0.14

Isiseko

4800

119,768.219 ms

181.5514 ms

160.9406 ms

1.00

SpartialOptimisation

4800

55,590.374 ms

414.6051 ms

387.8218 ms

0.46

SpartialParallelOptimisations

4800

15,614.506 ms

115.6996 ms

102.5647 ms

0.13


Kwiziphumo sinokubona ukuba ukunxulunyaniswa kwe for of i loop kunciphise ixesha lokubala ngama-2-4 amaxesha xa kuthelekiswa nokuphunyezwa kwangaphambili ( SpartialOptimisation )! Oku kuchukumisa kakhulu, nangona kunjalo, kubalulekile ukukhumbula, ukuhambelana kobalo olusulungekileyo kudla zonke izixhobo zekhompyutha ezikhoyo ezinokukhokelela ekulambeni kobutyebi kwezinye izicelo kwinkqubo. Ukulinganisa kufuneka kusetyenziswe ngononophelo.


Njengoko unokuthekelela - esi ayisosiphelo 🙂

Yazi iqonga lakho - vectorization

IVectorisation lutshintsho lwekhowudi esebenza kwinto enye ibe yikhowudi esebenza kwizinto ezininzi ngaxeshanye.

Oku kunokuvakala njengomcimbi onzima, ngoko ke makhe sibone ukuba isebenza njani kumzekelo olula:

 var a = new [] { 5, 7, 8, 1 }; var b = new [] { 4, 2, 2, 6 }; var c = new [] { 0, 0, 0, 0 }; for (var i = 0; i < 4; ++i) c[i] = a[i] + b[i];

Ngendlela eyenziwe lula kakhulu , ukuphunyezwa kokuphinda-phinda oku- 0 koku kungasentla for kwinqanaba le-CPU kunokucaciswa ngolu hlobo lulandelayo:


Umfanekiso wesi-5. Umzekeliso owenziwe lula ngokugqithisileyo wesikali sokwenziwa kwe-loop iteration kwinqanaba le-CPU.


Nantsi into eyenzekayo. I-CPU ilayisha amaxabiso oluhlu a kunye no b ukusuka kwimemori ukuya kwiirejista ze-CPU (irejista enye inokubamba ngqo ixabiso elinye ). Emva koko i-CPU yenza umsebenzi wokongeza kwi-scalar kwezi rejista kwaye ibhale isiphumo sibuyele kwimemori ephambili - ngqo kuluhlu c . Le nkqubo iphinda iphindwe kane ngaphambi kokuba i-loop iphele.


UVectorisation kuthetha ukusetyenziswa kweerejista ezikhethekileyo ze-CPU - iVector okanye iirejista ze-SIMD (umyalelo omnye wedatha ezininzi), kunye nemiyalelo ye-CPU ehambelanayo yokwenza imisebenzi kumaxabiso amaninzi ngaxeshanye:


Umfanekiso 6. Umboniso owenziwe lula ngokugqithisileyo wevektha yokwenziwa kweluphu kwinqanaba le-CPU.


Nantsi into eyenzekayo. I-CPU ilayisha amaxabiso oluhlu a kunye b ukusuka kwimemori ukuya kwiirejista ze-CPU (nangona ngeli xesha, irejista yevektha enye inokubamba ixabiso loluhlu olubini ). Emva koko i-CPU yenza umsebenzi wokongeza i-vector kwezi rejista kwaye ibhale isiphumo sibuyele kwimemori ephambili - ngqo kuluhlu c . Ngenxa yokuba sisebenza kumaxabiso amabini kanye, le nkqubo iphindwa kabini endaweni yesine.


Ukuphumeza i-vectorisation kwi-.NET sinokusebenzisa - iVector kunye neVector<T> iintlobo (nathi sinokusebenzisa iindidi ezivela kwi -System.Runtime.Intrinsics namespace, nangona kunjalo zihamba phambili kuphunyezo lwangoku, ngoko ke andiyi kuzisebenzisa, kodwa ndizive simahla ukuba uzizame ngokwakho):

 public void SpartialVectorOptimisations(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { for (var i = 0; i < sz; ++i) { if (matrix[i * sz + k] == NO_EDGE) { continue; } var ik_vec = new Vector<int>(matrix[i * sz + k]); var j = 0; for (; j < sz - Vector<int>.Count; j += Vector<int>.Count) { var ij_vec = new Vector<int>(matrix, i * sz + j); var ikj_vec = new Vector<int>(matrix, k * sz + j) + ik_vec; var lt_vec = Vector.LessThan(ij_vec, ikj_vec); if (lt_vec == new Vector<int>(-1)) { continue; } var r_vec = Vector.ConditionalSelect(lt_vec, ij_vec, ikj_vec); r_vec.CopyTo(matrix, i * sz + j); } for (; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } } } }

Ikhowudi yeVectorised inokujongeka ingaqhelekanga, ke masihambe ngayo inyathelo ngenyathelo.


Senza i-vector ye- i ⭢ k indlela: var ik_vec = new Vector<int>(matrix[i * sz + k]) . Njengomphumo, ukuba i-vector inokubamba amaxabiso amane ohlobo int kunye nobunzima be i ⭢ k indlela ilingana no-5, siya kudala i-vector yesine 5's - [5, 5, 5, 5]. Okulandelayo, kuphindaphindo ngalunye, ngaxeshanye sibala iindlela ukusuka kwi-vertex i ukuya kwi-vertexes j, j + 1, ..., j + Vector<int>.Count .

Vector<int>.Count lubuyisela inani leezakhi zodidi int olungena kwiirejista ze-vector. Ubungakanani beerejista zeVektha buxhomekeke kwimodeli ye-CPU, ke le propati inokubuyisela amaxabiso awohlukeneyo kwii-CPU's.


- Inqaku lombhali

Yonke inkqubo yokubala inokohlulwa ibe ngamanyathelo amathathu:

  1. Layisha ulwazi kwi-matrix yobunzima kwii-vectors: ij_vec kunye ikj_vec .
  2. Thelekisa ij_vec kunye ikj_vec iivektha kwaye ukhethe amaxabiso amancinci kwivektha entsha r_vec .
  3. Bhala r_vec umva ubunzima matrix.


Ngelixa i-#1 kunye ne #3 zithe ngqo, #2 ifuna ingcaciso. Njengoko bekutshiwo ngaphambili, ngee-vectors sisebenzisa amaxabiso amaninzi ngexesha elinye. Ngoko ke, isiphumo semisebenzi ethile ayinakuba sisinye, oko kukuthi, umsebenzi wothelekiso Vector.LessThan(ij_vec, ikj_vec) ayinakubuya true okanye false kuba ithelekisa amaxabiso amaninzi. Ngoko endaweni yoko ibuyisela i-vector entsha equlathe iziphumo zothelekiso phakathi kwamaxabiso ahambelanayo asuka kwivektha ij_vec kunye ne ikj_vec ( -1 , ukuba ixabiso elisuka kwi ij_vec lingaphantsi kwexabiso kwi ikj_vec kunye no 0 ukuba kungenjalo). IVektha ebuyiselweyo (ngokwayo) ayiloncedo ngenene, nangona kunjalo, sinokusebenzisa Vector.ConditionalSelect(lt_vec, ij_vec, ikj_vec) ukusebenza kwevektha ukukhupha amaxabiso afunekayo kwi ij_vec kunye ne ikj_vec vector kwi-vector entsha – r_vec . Lo msebenzi ubuyisela ivektha entsha apho amaxabiso alingana namancinane amabini ahambelanayo amaxabiso eevektha zongeniso okt, ukuba ixabiso le vector lt_vec lilingana no -1 , ngoko umsebenzi ukhetha ixabiso kwi ij_vec , kungenjalo ikhetha ixabiso kwi ikj_vec .


Ngaphandle kwe- #2 , kukho enye inxalenye, ifuna inkcazo-yesibini, i-loop ye-non-vectorised. Kufuneka xa ubungakanani bobunzima be-matrix bungeyomveliso Vector<int>.Count ixabiso. Kwimeko apho i-loop engundoqo ayikwazi ukucubungula onke amaxabiso (kuba awukwazi ukulayisha inxalenye ye-vector) kwaye okwesibini, i-non-vectored, i-loop iya kubala ukuphindaphinda okuseleyo.


Nazi iziphumo zoku kunye nakho konke ukuphunyezwa kwangaphambili:


Indlela

sz

Ithetha

Impazamo

StdDev

Umlinganiselo

Isiseko

300

27.525 ms

0.1937 ms

0.1617 ms

1.00

SpartialOptimisation

300

12.399 ms

0.0943 ms

0.0882 ms

0.45

SpartialParallelOptimisations

300

6.252 ms

0.0211 ms

0.0187 ms

0.23

SpartialVectorOptimisations

300

3.056 ms

0.0301 ms

0.0281 ms

0.11

Isiseko

600

217.897 ms

1.6415 ms

1.5355 ms

1.00

SpartialOptimisation

600

99.122 ms

0.8230 ms

0.7698 ms

0.45

SpartialParallelOptimisations

600

35.825 ms

0.1003 ms

0.0837 ms

0.16

SpartialVectorOptimisations

600

24.378 ms

0.4320 ms

0.4041 ms

0.11

Isiseko

1200

1,763.335 ms

7.4561 ms

6.2262 ms

1.00

SpartialOptimisation

1200

766.675 ms

6.1147 ms

5.7197 ms

0.43

SpartialParallelOptimisations

1200

248.801 ms

0.6040 ms

0.5043 ms

0.14

SpartialVectorOptimisations

1200

185.628 ms

2.1240 ms

1.9868 ms

0.11

Isiseko

2400

14,533.335 ms

63.3518 ms

52.9016 ms

1.00

SpartialOptimisation

2400

6,507.878 ms

28.2317 ms

26.4079 ms

0.45

SpartialParallelOptimisations

2400

2,076.403 ms

20.8320 ms

19.4863 ms

0.14

SpartialVectorOptimisations

2400

2,568.676 ms

31.7359 ms

29.6858 ms

0.18

Isiseko

4800

119,768.219 ms

181.5514 ms

160.9406 ms

1.00

SpartialOptimisation

4800

55,590.374 ms

414.6051 ms

387.8218 ms

0.46

SpartialParallelOptimisations

4800

15,614.506 ms

115.6996 ms

102.5647 ms

0.13

SpartialVectorOptimisations

4800

18,257.991 ms

84.5978 ms

79.1329 ms

0.15


Ukususela kwisiphumo kuyacaca, i-vectorisation iye yanciphisa kakhulu ixesha lokubala - ukusuka kwi-3 ukuya kwi-4 amaxesha xa kuthelekiswa ne-non-parallelised version ( SpartialOptimisation ). Umzuzu onomdla apha, kukuba i-vectorised version iphinda idlulele kwi-parallel version ( SpartialParallelOptimisations ) kwiigrafu ezincinci (ngaphantsi kweeverteksi ze-2400).


Kwaye okokugqibela kodwa okungakuncinananga - masidibanise i-vectorisation kunye ne-parallelism!

Ukuba unomdla kwisicelo esisebenzayo se-vectorisation, ndincoma ukuba ufunde uchungechunge lwezithuba zikaDan Shechter apho wafaka khona Array.Sort (ezi ziphumo kamva zifumene uhlaziyo lwe-Garbage Collector kwi-.NET 5 ).

Yazi iqonga lakho kunye ne-hardware-i-vectorisation kunye ne-parallelism!

Uzalisekiso lokugqibela ludibanisa iinzame zongqamaniso kunye novetho kunye ... ukunyaniseka alunamntu 🙂 Kuba… ke, sisanda kutshintsha umzimba Parallel.For kwi SpartialParallelOptimisations kunye ne-vectorised loop evela kwi SpartialVectorOptimisations :

 public void SpartialParallelVectorOptimisations(int[] matrix, int sz) { for (var k = 0; k < sz; ++k) { Parallel.For(0, sz, i => { if (matrix[i * sz + k] == NO_EDGE) { return; } var ik_vec = new Vector<int>(matrix[i * sz + k]); var j = 0; for (; j < sz - Vector<int>.Count; j += Vector<int>.Count) { var ij_vec = new Vector<int>(matrix, i * sz + j); var ikj_vec = new Vector<int>(matrix, k * sz + j) + ik_vec; var lt_vec = Vector.LessThan(ij_vec, ikj_vec); if (lt_vec == new Vector<int>(-1)) { continue; } var r_vec = Vector.ConditionalSelect(lt_vec, ij_vec, ikj_vec); r_vec.CopyTo(matrix, i * sz + j); } for (; j < sz; ++j) { var distance = matrix[i * sz + k] + matrix[k * sz + j]; if (matrix[i * sz + j] > distance) { matrix[i * sz + j] = distance; } } }); } }

Zonke iziphumo zesi sithuba zithiwe thaca apha ngezantsi. Njengoko kulindelekile, ukusetyenziswa ngaxeshanye ukuhambelana kunye ne-vectorisation kubonise iziphumo ezilungileyo, ukunciphisa ixesha lokubala ukuya kumaxesha angama-25 (kwiigrafu ze-1200 vertexes) xa kuthelekiswa nokuphunyezwa kokuqala.


Indlela

sz

Ithetha

Impazamo

StdDev

Umlinganiselo

Isiseko

300

27.525 ms

0.1937 ms

0.1617 ms

1.00

SpartialOptimisation

300

12.399 ms

0.0943 ms

0.0882 ms

0.45

SpartialParallelOptimisations

300

6.252 ms

0.0211 ms

0.0187 ms

0.23

SpartialVectorOptimisations

300

3.056 ms

0.0301 ms

0.0281 ms

0.11

SpartialParallelVectorOptimisations

300

3.008 ms

0.0075 ms

0.0066 ms

0.11

Isiseko

600

217.897 ms

1.6415 ms

1.5355 ms

1.00

SpartialOptimisation

600

99.122 ms

0.8230 ms

0.7698 ms

0.45

SpartialParallelOptimisations

600

35.825 ms

0.1003 ms

0.0837 ms

0.16

SpartialVectorOptimisations

600

24.378 ms

0.4320 ms

0.4041 ms

0.11

SpartialParallelVectorOptimisations

600

13.425 ms

0.0319 ms

0.0283 ms

0.06

Isiseko

1200

1,763.335 ms

7.4561 ms

6.2262 ms

1.00

SpartialOptimisation

1200

766.675 ms

6.1147 ms

5.7197 ms

0.43

SpartialParallelOptimisations

1200

248.801 ms

0.6040 ms

0.5043 ms

0.14

SpartialVectorOptimisations

1200

185.628 ms

2.1240 ms

1.9868 ms

0.11

SpartialParallelVectorOptimisations

1200

76.770 ms

0.3021 ms

0.2522 ms

0.04

Isiseko

2400

14,533.335 ms

63.3518 ms

52.9016 ms

1.00

SpartialOptimisation

2400

6,507.878 ms

28.2317 ms

26.4079 ms

0.45

SpartialParallelOptimisations

2400

2,076.403 ms

20.8320 ms

19.4863 ms

0.14

SpartialVectorOptimisations

2400

2,568.676 ms

31.7359 ms

29.6858 ms

0.18

SpartialParallelVectorOptimisations

2400

1,281.877 ms

25.1127 ms

64.8239 ms

0.09

Isiseko

4800

119,768.219 ms

181.5514 ms

160.9406 ms

1.00

SpartialOptimisation

4800

55,590.374 ms

414.6051 ms

387.8218 ms

0.46

SpartialParallelOptimisations

4800

15,614.506 ms

115.6996 ms

102.5647 ms

0.13

SpartialVectorOptimisations

4800

18,257.991 ms

84.5978 ms

79.1329 ms

0.15

SpartialParallelVectorOptimisations

4800

12,785.824 ms

98.6947 ms

87.4903 ms

0.11

Ukuqukumbela

Kule post singene kancinci kweyona ngxaki imfutshane yezibini kwaye siphumeze i-algorithm ye-Floyd-Warshall kwi-C # ukuyisombulula. Siphinde sahlaziya ukuphunyezwa kwethu ukuhlonipha idatha kunye nokusebenzisa umgangatho ophantsi we-.NET kunye ne-hardware.


Kule post sidlale "zonke ngaphakathi". Nangona kunjalo, kwizicelo zobomi benene kubalulekile ukukhumbula - asodwa. I-Hardcore parallelization inokuthoba kakhulu ukusebenza kwenkqubo kwaye ibangele ingozi enkulu kunokulunga. I-Vectorisation kwelinye icala ikhuselekile ngakumbi ukuba yenziwa ngendlela ezimeleyo yeqonga. Khumbula ukuba eminye imiyalelo yeVector inokufumaneka kuphela kwii-CPU ezithile.


Ndiyathemba ukuba ukonwabele ukufunda kwaye wonwabe “ekucudiseni” amasuntswana okusebenza kwi-algorithm enemijikelo EMITHATHU nje 🙂

Iimbekiselo


  1. Floyd, RW Algorithm 97: Indlela emfutshane / RW Floyd // Unxibelelwano lwe-ACM. – 1962. – Vol. 5, №. 6. – P. 345-.
  2. Ingerman, PZ Algorithm 141: Indlela ye-matrix / PZ Ingerman // Unxibelelwano lwe-ACM. – 1962. – Vol. 5, №. 11. – P. 556.