Before AI Predicts Your Social Life, It Needs to Clean Its Data

by Andrei (@andrei9735) · 12 min read · 2025/02/12

In this post we continue working on link prediction with the Twitch dataset.

We already have the graph data exported from Neptune using the neptune-export utility and the 'neptune_ml' profile. The previous steps are described in parts 1 and 2 of this guide.

Read part 1 here and part 2 here.

The data is currently stored in S3 and looks like this:


Vertices CSV (nodes/user.consolidated.csv):

    ~id,~label,days,mature,views,partner
    "6980","user",771,true,2935,false
    "547","user",2602,true,18099,false
    "2173","user",1973,false,3939,false
    ...

Edges CSV (edges/user-follows-user.consolidated.csv):

    ~id,~label,~from,~to,~fromLabels,~toLabels
    "3","follows","6194","2507","user","user"
    "19","follows","3","3739","user","user"
    "35","follows","6","2126","user","user"
    ...

The export utility also generated this configuration file for us:

training-data-configuration.json:

    {
      "version" : "v2.0",
      "query_engine" : "gremlin",
      "graph" : {
        "nodes" : [
          {
            "file_name" : "nodes/user.consolidated.csv",
            "separator" : ",",
            "node" : [ "~id", "user" ],
            "features" : [
              { "feature" : [ "days", "days", "numerical" ], "norm" : "min-max", "imputer" : "median" },
              { "feature" : [ "mature", "mature", "auto" ] },
              { "feature" : [ "views", "views", "numerical" ], "norm" : "min-max", "imputer" : "median" },
              { "feature" : [ "partner", "partner", "auto" ] }
            ]
          }
        ],
        "edges" : [
          {
            "file_name" : "edges/%28user%29-follows-%28user%29.consolidated.csv",
            "separator" : ",",
            "source" : [ "~from", "user" ],
            "relation" : [ "", "follows" ],
            "dest" : [ "~to", "user" ],
            "features" : [ ]
          }
        ]
      },
      "warnings" : [ ]
    }

Our current goal is to perform data processing, which means converting the data we have into a format that the Deep Graph Library (DGL) framework can use for model training. (For an overview of link prediction with just DGL, see this post.) That includes normalizing numerical features, encoding categorical features, creating lists of node pairs with existing and non-existing links to enable supervised learning for our link prediction task, and splitting the data into training, validation, and test sets.
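To make the idea of "existing and non-existing links" more concrete, here is a minimal sketch in plain Python. It is not what Neptune ML runs internally: the helper name `sample_negative_edges` and the uniform random sampling strategy are assumptions made purely for illustration.

```python
import random

def sample_negative_edges(nodes, positive_edges, num_samples, seed=0):
    """Sample node pairs that are NOT connected in the graph.

    Hypothetical helper: uniform random sampling is an assumption,
    not necessarily the strategy used by Neptune ML or DGL.
    """
    rng = random.Random(seed)
    nodes = sorted(set(nodes))  # de-duplicate so the bound below holds
    existing = set(positive_edges)
    # Conservative bound on the number of distinct non-self pairs without an edge.
    max_possible = len(nodes) * (len(nodes) - 1) - len(existing)
    if num_samples > max_possible:
        raise ValueError("not enough unconnected node pairs to sample from")
    negatives = set()
    while len(negatives) < num_samples:
        src, dst = rng.choice(nodes), rng.choice(nodes)
        if src != dst and (src, dst) not in existing:
            negatives.add((src, dst))
    return sorted(negatives)

# Node IDs and edges taken from the sample rows of the edges CSV.
nodes = ["6194", "2507", "3", "3739", "6", "2126"]
positive = [("6194", "2507"), ("3", "3739"), ("6", "2126")]
negative = sample_negative_edges(nodes, positive, num_samples=3)

# Label 1 for existing links, 0 for sampled non-links.
labeled_pairs = [(s, d, 1) for s, d in positive] + [(s, d, 0) for s, d in negative]
```

A model trained on such labeled pairs learns to score real links higher than the sampled non-links.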


As you can see in the training-data-configuration.json file, the node features 'days' (account age) and 'views' were recognized as numerical, and min-max normalization was suggested. Min-max normalization scales arbitrary values to the range [0; 1] like this: x_normalized = (x - x_min) / (x_max - x_min). imputer = median means that missing values will be filled with the median value.
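The two steps can be sketched in a few lines of plain Python. This is only an illustration of the formula above, not the Neptune ML implementation; the function names are made up, and the missing value (None) in the sample column is an assumption added to show imputation.

```python
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Scale values to [0, 1]: (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # All values are equal; map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]

# 'days' values from the sample rows, plus one assumed missing value.
days = [771, 2602, None, 1973]
days_filled = impute_median(days)            # None is replaced by the median, 1973
days_scaled = min_max_normalize(days_filled)  # smallest value -> 0.0, largest -> 1.0
```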


The node features 'mature' and 'partner' are labeled as 'auto', and since those columns contain only boolean values, we expect them to be recognized as categorical features and encoded during the data processing stage. The train-validation-test split is not included in this auto-generated file, and the default split for the link prediction task is 0.9, 0.05, 0.05.
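As a rough illustration of what a 0.9 / 0.05 / 0.05 split means for the edge list, consider the sketch below. The function `split_edges`, the shuffling step, and the choice to give the rounding remainder to the test set are assumptions for this example; Neptune ML performs the split internally.

```python
import random

def split_edges(edges, rates=(0.9, 0.05, 0.05), seed=0):
    """Shuffle edges and split them into train/validation/test lists."""
    if abs(sum(rates) - 1.0) > 1e-9:
        raise ValueError("split rates must sum to 1")
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * rates[0])
    n_valid = int(len(shuffled) * rates[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains goes to the test set
    return train, valid, test

# 100 made-up edges, just to show the resulting sizes.
edges = [(i, i + 1) for i in range(100)]
train, valid, test = split_edges(edges)
```

With 100 edges this yields 90 training, 5 validation, and 5 test edges, and every edge lands in exactly one of the three sets.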


You can adjust the normalization and encoding settings, and you can choose a custom train-validation-test split. If you choose to do so, just replace the original training-data-configuration.json file in S3 with the updated version. The full list of supported fields in that JSON is available here. In this post, we'll leave this file unchanged.

IAM ROLES NEEDED FOR DATA PROCESSING

Just like in the data loading stage (described in Part 1 of this guide), we need to create IAM roles that allow access to the services we'll be using, and we also need to add those roles to our Neptune cluster. We need two roles for the data processing stage. The first one is a Neptune role that gives Neptune access to SageMaker and S3. The second one is a SageMaker execution role that SageMaker uses while running the data processing task and that allows access to S3.


These roles must have trust policies that allow the Neptune and SageMaker services to assume them:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": { "Service": "sagemaker.amazonaws.com" },
          "Action": "sts:AssumeRole"
        },
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": { "Service": "rds.amazonaws.com" },
          "Action": "sts:AssumeRole"
        }
      ]
    }

After creating the roles and updating their trust policies, we add them to the Neptune cluster (Neptune -> Databases -> YOUR_NEPTUNE_CLUSTER_ID -> Connectivity & Security -> IAM Roles -> Add role).

DATA PROCESSING WITH THE NEPTUNE ML HTTP API

Now that the training-data-configuration.json file is in place and the IAM roles are added to the Neptune cluster, we're ready to start the data processing job. To do that, we need to send a request to the Neptune cluster's HTTP API from inside the VPC where the cluster is located. We'll use an EC2 instance for that.

We'll use curl to start the data processing job:

    curl -XPOST https://YOUR_NEPTUNE_ENDPOINT:8182/ml/dataprocessing \
      -H 'Content-Type: application/json' \
      -d '{
        "inputDataS3Location" : "s3://SOURCE_BUCKET/neptune-export/...",
        "processedDataS3Location" : "s3://OUTPUT_BUCKET/neptune-export-processed/...",
        "neptuneIamRoleArn": "arn:aws:iam::123456789012:role/NeptuneMLDataProcessingNeptuneRole",
        "sagemakerIamRoleArn": "arn:aws:iam::123456789012:role/NeptuneMLDataProcessingSagemakerRole"
      }'

Only these 4 parameters are required: the input data S3 location, the processed data S3 location, the Neptune role, and the SageMaker role. There are many optional parameters: for example, we can manually select the EC2 instance type that will be created for our data processing task with processingInstanceType and set its storage volume size with processingInstanceVolumeSizeInGB. The full list of parameters can be found here.
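If you prefer to script the request rather than type the curl command, the same call can be sketched with Python's standard library. The helper name `build_dataprocessing_request` is made up for this example, the volume size of 50 GB is an arbitrary illustrative value, and the elided S3 paths are left exactly as they appear above. The sketch only builds the request; sending it is shown in the trailing comment and has to happen from inside the VPC.

```python
import json
import urllib.request

def build_dataprocessing_request(endpoint, input_s3, output_s3,
                                 neptune_role_arn, sagemaker_role_arn,
                                 **optional_params):
    """Build (but do not send) the POST request that starts a data processing job."""
    payload = {
        "inputDataS3Location": input_s3,
        "processedDataS3Location": output_s3,
        "neptuneIamRoleArn": neptune_role_arn,
        "sagemakerIamRoleArn": sagemaker_role_arn,
    }
    # Optional parameters (e.g. processingInstanceVolumeSizeInGB) are passed through as-is.
    payload.update(optional_params)
    return urllib.request.Request(
        url=f"https://{endpoint}:8182/ml/dataprocessing",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_dataprocessing_request(
    endpoint="YOUR_NEPTUNE_ENDPOINT",
    input_s3="s3://SOURCE_BUCKET/neptune-export/...",
    output_s3="s3://OUTPUT_BUCKET/neptune-export-processed/...",
    neptune_role_arn="arn:aws:iam::123456789012:role/NeptuneMLDataProcessingNeptuneRole",
    sagemaker_role_arn="arn:aws:iam::123456789012:role/NeptuneMLDataProcessingSagemakerRole",
    processingInstanceVolumeSizeInGB=50,  # arbitrary example value
)

# To actually submit the job:
# with urllib.request.urlopen(request) as response:
#     job_id = json.load(response)["id"]
```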

The cluster responds with a JSON that contains the ID of the data processing job we just created:

 {"id":"d584f5bc-d90e-4957-be01-523e07a7562e"}

We can use it to get the job status with this command (use the same neptuneIamRoleArn as in the previous request):

    curl 'https://YOUR_NEPTUNE_CLUSTER_ENDPOINT:8182/ml/dataprocessing/YOUR_JOB_ID?neptuneIamRoleArn=arn:aws:iam::123456789012:role/NeptuneMLDataProcessingNeptuneRole'
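Since the job takes a while, it can be convenient to poll the status endpoint in a loop instead of re-running the command by hand. The sketch below uses only the Python standard library. The function names, the polling interval, and the list of failure statuses ("Failed", "Stopped") are assumptions for illustration; only the "Completed" status is confirmed by this walkthrough.

```python
import json
import time
import urllib.parse
import urllib.request

def status_url(endpoint, job_id, neptune_role_arn):
    """Build the status URL for a data processing job."""
    query = urllib.parse.urlencode({"neptuneIamRoleArn": neptune_role_arn})
    return f"https://{endpoint}:8182/ml/dataprocessing/{job_id}?{query}"

def fetch_status(url):
    """GET the status document and return its 'status' field."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)["status"]

def wait_for_job(url, fetch=fetch_status, poll_seconds=30, max_polls=120,
                 failed_statuses=("Failed", "Stopped")):
    """Poll until the job reports 'Completed'.

    failed_statuses is an assumption; check which statuses your cluster returns.
    """
    for _ in range(max_polls):
        status = fetch(url)
        if status == "Completed":
            return status
        if status in failed_statuses:
            raise RuntimeError(f"data processing job ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("data processing job did not complete in time")

url = status_url(
    "YOUR_NEPTUNE_CLUSTER_ENDPOINT",
    "YOUR_JOB_ID",
    "arn:aws:iam::123456789012:role/NeptuneMLDataProcessingNeptuneRole",
)
# From inside the VPC: wait_for_job(url)
```

The `fetch` parameter is injectable so the loop can be exercised without network access, for example with a stub that returns canned statuses.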

Once it responds with something like this:

 { "processingJob": {...}, "id":"d584f5bc-d90e-4957-be01-523e07a7562e", "status":"Completed" }

we can check the output. These files were created in the destination S3 bucket:


[Image: the files created in the destination S3 bucket]

The features.json file contains the lists of node and edge features:

 { "nodeProperties": { "user": [ "days", "mature", "views", "partner" ] }, "edgeProperties": {} }

Details on how the data was processed and how the features were encoded can be found in the updated_training_config.json file:

    {
      "graph": {
        "nodes": [
          {
            "file_name": "nodes/user.consolidated.csv",
            "separator": ",",
            "node": [ "~id", "user" ],
            "features": [
              { "feature": [ "days", "days", "numerical" ], "norm": "min-max", "imputer": "median" },
              { "feature": [ "mature", "mature", "category" ] },
              { "feature": [ "views", "views", "numerical" ], "norm": "min-max", "imputer": "median" },
              { "feature": [ "partner", "partner", "category" ] }
            ]
          }
        ],
        "edges": [
          {
            "file_name": "edges/%28user%29-follows-%28user%29.consolidated.csv",
            "separator": ",",
            "source": [ "~from", "user" ],
            "relation": [ "", "follows" ],
            "dest": [ "~to", "user" ]
          }
        ]
      }
    }

We can see that the columns 'mature' and 'partner', which hold boolean values and were initially labeled as 'auto' in the training-data-configuration.json file, were encoded as 'category' features.
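A common way to turn a categorical column into numbers is one-hot encoding. The sketch below shows the idea on the 'mature' values from the sample rows. It is an assumption that the processed output looks exactly like this; the function name `one_hot_encode` is made up for the example.

```python
def one_hot_encode(values):
    """Map each distinct category to a one-hot vector (categories sorted for stability)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded

# 'mature' column from the three sample rows of the vertices CSV.
mature = ["true", "true", "false"]
categories, encoded = one_hot_encode(mature)
# categories is ["false", "true"], so "true" maps to [0, 1] and "false" to [1, 0].
```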


The 'train_instance_recommendation.json' file contains the SageMaker instance type and storage size recommended for model training:

 { "instance": "ml.g4dn.2xlarge", "cpu_instance": "ml.m5.2xlarge", "disk_size": 14126462, "mem_size": 4349122131.111111 }

The model-hpo-configuration.json file contains the model type, the metrics used for its evaluation, the evaluation frequency, and the hyperparameters.


This concludes the data processing stage, and we are now ready to start training the ML model. That will be discussed in the next part of this guide.