Stop Scrolling, Start Building: Make Your Own AI Movie Recommender

by Superlinked (@superlinked) | 11m | 2025/03/06

Too Long; Didn't Read

Learn how to build a personalized, AI-powered recommendation system that accurately predicts the movies you'll love. In this tutorial, we walk you through creating a movie recommendation system using vector databases. You'll learn how modern AI recommendation engines work and get hands-on experience building your own system with Superlinked.

“I said I wanted a B-Movie, dammit!”

The End of Endless Scrolling (and arguments over what to watch...)

Tired of endlessly scrolling Netflix, unsure what to watch next? What if you could build your own AI-powered recommendation system that accurately predicts the movies you'll love?


In this tutorial, we'll walk you through creating a movie recommendation system using vector databases (VectorDBs). You'll learn how modern AI recommendation engines work and get hands-on experience building your own system with Superlinked.


(Want to jump straight to the code? Check out our repo on GitHub here. Ready to try recommender systems for your own use case? Get a demo here.)

Let's Get Recommending!

We'll follow this notebook throughout the article. You can also run the code straight from your browser via Colab.


Netflix's recommendation algorithm does a pretty good job of suggesting relevant content, given the sheer volume of options (~16k movies and TV shows in 2023) and how quickly it has to suggest shows to users. How does Netflix do it? In a word, semantic search.


Semantic search comprehends the meaning and context (both attributes and consumption patterns) behind user queries and movie/TV show descriptions, and can therefore personalize its queries and recommendations better than traditional keyword-based approaches.
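To make the contrast with keyword matching concrete, here is a minimal sketch. The two movie descriptions and their 3-dimensional "embeddings" are hand-made assumptions for illustration only; a real system would get its vectors from an embedding model.

```python
import math


def keyword_score(query: str, text: str) -> int:
    """Count the query words that literally appear in the text."""
    text_words = set(text.lower().split())
    return sum(word in text_words for word in query.lower().split())


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical vectors, axes = (romance, humor, violence). Invented by hand.
MOVIES = {
    "A shy baker falls for her rival": [0.9, 0.6, 0.0],
    "Two hitmen botch a simple job": [0.0, 0.5, 0.9],
}
QUERY_TEXT = "heartfelt romantic comedy"
QUERY_VECTOR = [0.8, 0.7, 0.0]  # hypothetical embedding of QUERY_TEXT

# Keyword matching finds no shared words, so it cannot rank the two movies.
keyword_scores = {text: keyword_score(QUERY_TEXT, text) for text in MOVIES}

# Vector similarity still separates them, because it compares meaning.
semantic_scores = {
    text: cosine_similarity(QUERY_VECTOR, vector) for text, vector in MOVIES.items()
}
best_semantic = max(semantic_scores, key=semantic_scores.get)
```

Neither description shares a word with the query, so the keyword scores tie at zero, while the vector comparison still prefers the romance.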


But semantic search poses certain challenges. Foremost among them are 1) ensuring accurate search results, 2) interpretability, and 3) scalability. Any successful content recommendation strategy has to address these challenges. Using Superlinked's library, you can overcome them.

In this article, we'll show you how to use the Superlinked library to set up a semantic search of your own and generate a list of relevant movies based on your preferences.

Semantic Search - Challenges

Semantic search adds a lot of value to vector search, but it poses three significant vector embedding challenges for developers:

  • Quality and relevance: Ensuring that your embeddings accurately capture the semantic meaning of your data requires careful selection of embedding techniques, training data, and hyperparameters. Poor-quality embeddings can lead to inaccurate search results and irrelevant recommendations.


  • Interpretability: High-dimensional vector spaces are too complex to be understood easily. To understand the relationships and similarities encoded in them, data scientists have to develop methods to visualize and analyze them.


  • Scalability: Managing and processing high-dimensional embeddings, especially in large datasets, can strain computational resources and increase latency. Efficient methods for indexing, retrieval, and similarity computation are essential to ensure scalability and real-time performance in production environments.


The Superlinked library lets you address these challenges. Below, we'll build a content recommender (specifically for movies). We start with the information we have about a given movie, embed this information as a multimodal vector, build a searchable vector index for all our movies, and then use query weights to tweak our results and arrive at good movie recommendations. Let's get into it.

Creating a Fast and Reliable Search Experience with Superlinked

Below, you'll perform a semantic search on the Netflix movie dataset using the following elements of the Superlinked library:

  • Recency space - to understand the freshness (currency and relevance) of your data, identifying newer movies.
  • TextSimilarity space - to interpret the various pieces of metadata you have about the movie, such as description, title, and genre.
  • Query time weights - to let you choose what matters most in your data when you run the query, so you can optimize without re-embedding the whole dataset, doing postprocessing, or employing a custom reranking model (i.e., reducing latency).
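Why can weights change at query time without re-embedding anything? One way to picture it is sketched below. This is a conceptual illustration, not Superlinked's internal implementation: the per-space vectors are hand-made assumptions, and `weighted_score` is a hypothetical helper. The stored movie vector is the concatenation of its per-space vectors, and only the query side is scaled by the weights.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical unit vectors per space for one stored movie and one query.
movie_parts = {"description": [1.0, 0.0], "genre": [0.6, 0.8]}
query_parts = {"description": [0.0, 1.0], "genre": [0.6, 0.8]}


def weighted_score(weights: dict[str, float]) -> float:
    """Scale each query segment by its weight, concatenate, take one dot product."""
    movie_vector: list[float] = []
    query_vector: list[float] = []
    for space in movie_parts:
        # The stored movie segment is used as-is; it is never recomputed.
        movie_vector += movie_parts[space]
        # Only the query segment changes when the weights change.
        query_vector += [weights[space] * value for value in query_parts[space]]
    return dot(movie_vector, query_vector)


# The same stored vector answers differently weighted queries.
description_heavy = weighted_score({"description": 1.0, "genre": 0.0})
genre_heavy = weighted_score({"description": 0.0, "genre": 1.0})
```

Here the movie matches the query on genre but not on description, so shifting weight from description to genre raises its score without touching the stored data.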

The Netflix Dataset, and What We'll Do with It

Recommending movies successfully is difficult, mostly because there are so many options (>9000 titles in 2023), and users want recommendations on demand, immediately. Let's take a data-driven approach to finding something we want to watch. In our dataset of movies, we know the:

  • description
  • genre
  • title
  • release_year


We can embed these inputs and build a vector index on top of our embeddings, creating a space we can search semantically.

Once we have our indexed vector space, we will:

  • first, browse the movies, filtered by an idea (heartfelt romantic comedy)
  • next, tweak the results, giving more importance to matches in certain input fields (i.e., weighting)
  • then, search in description, genre, and title with different search terms for each
  • and, after finding a movie that's a close but not exact match, search around using that movie as the reference

Installation and Dataset Preparation

Your first step is to install the library and import the required classes.

(Note: Below, change alt.renderers.enable("mimetype") to alt.renderers.enable('colab') if you're running this in Google Colab. Keep "mimetype" if you're executing in GitHub.)


    %pip install superlinked==5.3.0

    from datetime import timedelta, datetime

    import altair as alt
    import os
    import pandas as pd
    from superlinked.evaluation.charts.recency_plotter import RecencyPlotter
    from superlinked.framework.common.dag.context import CONTEXT_COMMON, CONTEXT_COMMON_NOW
    from superlinked.framework.common.dag.period_time import PeriodTime
    from superlinked.framework.common.schema.schema import schema
    from superlinked.framework.common.schema.schema_object import String, Timestamp
    from superlinked.framework.common.schema.id_schema_object import IdField
    from superlinked.framework.common.parser.dataframe_parser import DataFrameParser
    from superlinked.framework.dsl.executor.in_memory.in_memory_executor import (
        InMemoryExecutor,
        InMemoryApp,
    )
    from superlinked.framework.dsl.index.index import Index
    from superlinked.framework.dsl.query.param import Param
    from superlinked.framework.dsl.query.query import Query
    from superlinked.framework.dsl.query.result import Result
    from superlinked.framework.dsl.source.in_memory_source import InMemorySource
    from superlinked.framework.dsl.space.text_similarity_space import TextSimilaritySpace
    from superlinked.framework.dsl.space.recency_space import RecencySpace

    alt.renderers.enable("mimetype")  # NOTE: to render altair plots in colab, change 'mimetype' to 'colab'
    alt.data_transformers.disable_max_rows()
    pd.set_option("display.max_colwidth", 190)


We also need to prep the dataset: define time constants, set the URL location of the data, create a data store dictionary, read the CSV into a pandas DataFrame, clean the dataframe and data so it can be searched properly, and do a quick verification and overview. (See cells 3 and 4 for details.)
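The notebook cells are not reproduced in this article, so the following is only a rough, dependency-free sketch of what such preparation could look like. The constant values, the two-row sample, the column names, and the helper `prepare_rows` are all assumptions for illustration; the notebook itself uses pandas and the real CSV file. The names YEAR_IN_DAYS and TOP_N are the ones the later code snippets rely on.

```python
import csv
import io
from datetime import datetime, timezone

YEAR_IN_DAYS = 365  # assumed value; used later to size the recency periods
TOP_N = 10          # assumed value; number of results shown per query

# Hypothetical two-row sample standing in for the real CSV file.
RAW_CSV = """id,title,description,genres,release_year
tm1,Example Romance,A warm love story.,"['romance', 'comedy']",2021
tm2,,Row without a title.,"['drama']",1999
"""


def prepare_rows(raw_csv: str) -> list[dict]:
    """Drop incomplete rows, flatten genres, and encode the year as a timestamp."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        if not row["title"] or not row["description"]:
            continue  # rows with missing text cannot be embedded meaningfully
        # Turn "['romance', 'comedy']" into "romance comedy" for text embedding.
        genre_items = row["genres"].strip("[]").split(",")
        row["genres"] = " ".join(item.strip(" '\"") for item in genre_items)
        # Encode the release year as a Unix timestamp (1 January, UTC).
        year = int(row.pop("release_year"))
        first_day = datetime(year, 1, 1, tzinfo=timezone.utc)
        row["release_timestamp"] = int(first_day.timestamp())
        rows.append(row)
    return rows


movies = prepare_rows(RAW_CSV)
```

Converting the release year into a timestamp is what lets a recency-aware space reason about how old each title is.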


Now that the dataset is prepared, you can optimize your retrieval using the Superlinked library.

Building an Index for Vector Search

Superlinked's library contains a set of core building blocks that we use to construct the index and manage retrieval. You can read about these building blocks in more detail here.

First, you need to define your Schema to tell the system about your data.

    # accommodate our inputs in a typed schema
    @schema
    class MovieSchema:
        description: String
        title: String
        release_timestamp: Timestamp
        genres: String
        id: IdField


    movie = MovieSchema()


Next, you use Spaces to say how you want to treat each part of the data when embedding. Which Spaces you use depends on your data type. Each Space is optimized to embed its data so that retrieval returns the highest possible quality of results.

In the Space definitions, we describe how the inputs should be embedded to reflect the semantic relationships in our data.


    # textual fields are embedded using a sentence-transformers model
    description_space = TextSimilaritySpace(
        text=movie.description, model="sentence-transformers/paraphrase-MiniLM-L3-v2"
    )
    title_space = TextSimilaritySpace(
        text=movie.title, model="sentence-transformers/paraphrase-MiniLM-L3-v2"
    )
    genre_space = TextSimilaritySpace(
        text=movie.genres, model="sentence-transformers/paraphrase-MiniLM-L3-v2"
    )
    # release dates are encoded using our recency space
    # period times aim to reflect notable breaks in our scores
    recency_space = RecencySpace(
        timestamp=movie.release_timestamp,
        period_time_list=[
            PeriodTime(timedelta(days=4 * YEAR_IN_DAYS)),
            PeriodTime(timedelta(days=10 * YEAR_IN_DAYS)),
            PeriodTime(timedelta(days=40 * YEAR_IN_DAYS)),
        ],
        negative_filter=-0.25,
    )
    movie_index = Index(spaces=[description_space, title_space, genre_space, recency_space])


Once you've set up your spaces and created your index, you use the source and executor parts of the library to set up your queries. See cells 10-13 in the notebook.

Now that the queries are prepared, let's move on to running queries and optimizing retrieval by adjusting weights.

Understanding Recency, and How to Use It in Superlinked

The Recency space lets you alter the results of your query by preferentially pulling older or newer releases from your dataset. We use 4, 10, and 40 years as our period times so that we can give the years with more titles more focus (see cell 5).


Notice the breaks in score at 4, 10, and 40 years. Titles older than 40 years get a negative_filter score.

Recency scores by period
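To build intuition for the breaks described above, here is a toy scoring function. It is a simplified stand-in and not the formula the RecencySpace uses: the piecewise-linear ramps and the helper name `recency_score` are assumptions made purely for illustration. Only the period lengths (4, 10, and 40 years) and the negative_filter value (-0.25) come from the code in this article.

```python
PERIODS_IN_YEARS = [4, 10, 40]  # the period times used in this article
NEGATIVE_FILTER = -0.25         # score for titles older than the longest period


def recency_score(age_in_years: float) -> float:
    """Toy piecewise-linear recency score; NOT Superlinked's actual formula."""
    if age_in_years > max(PERIODS_IN_YEARS):
        return NEGATIVE_FILTER
    # Each period contributes a ramp falling from 1 (brand new) to 0 (period end).
    ramps = [max(0.0, 1.0 - age_in_years / period) for period in PERIODS_IN_YEARS]
    # Averaging the ramps gives a curve whose slope changes at 4, 10, and 40 years.
    return sum(ramps) / len(ramps)


scores = {age: round(recency_score(age), 3) for age in (0, 4, 10, 40, 41)}
```

In this toy version the score falls fastest during the first 4 years, more slowly until 10, slower still until 40, and then drops to the negative filter value, which mirrors the shape described in the text.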

Reviewing and Fine-Tuning Search Results Using Different Query Time Weights

Let's define a quick util function to present our results in the notebook.


    def present_result(
        result: Result,
        cols_to_keep: list[str] = ["description", "title", "genres", "release_year", "id"],
    ) -> pd.DataFrame:
        # parse result to dataframe
        df: pd.DataFrame = result.to_pandas()
        # transform timestamp back to release year
        df["release_year"] = [
            datetime.fromtimestamp(timestamp).year for timestamp in df["release_timestamp"]
        ]
        return df[cols_to_keep]


Simple and Advanced Queries

The Superlinked library lets you perform various kinds of queries; here we define two. Both of our query types (simple and advanced) let me weight individual spaces (description, title, genre, and of course recency) according to my preferences. The difference between them is that with a simple query, I set one query text and then surface similar results in the description, title, and genre spaces.

With an advanced query, I have more fine-grained control. If I want, I can enter a different query text for each of the description, title, and genre spaces. Here's the query code:


    query_text_param = Param("query_text")

    simple_query = (
        Query(
            movie_index,
            weights={
                description_space: Param("description_weight"),
                title_space: Param("title_weight"),
                genre_space: Param("genre_weight"),
                recency_space: Param("recency_weight"),
            },
        )
        .find(movie)
        .similar(description_space.text, query_text_param)
        .similar(title_space.text, query_text_param)
        .similar(genre_space.text, query_text_param)
        .limit(Param("limit"))
    )

    advanced_query = (
        Query(
            movie_index,
            weights={
                description_space: Param("description_weight"),
                title_space: Param("title_weight"),
                genre_space: Param("genre_weight"),
                recency_space: Param("recency_weight"),
            },
        )
        .find(movie)
        .similar(description_space.text, Param("description_query_text"))
        .similar(title_space.text, Param("title_query_text"))
        .similar(genre_space.text, Param("genre_query_text"))
        .limit(Param("limit"))
    )


Simple Query

With simple queries, I set my query text and apply different weights depending on how important each space is to me.


    result: Result = app.query(
        simple_query,
        query_text="Heartfelt romantic comedy",
        description_weight=1,
        title_weight=1,
        genre_weight=1,
        recency_weight=0,
        limit=TOP_N,
    )
    present_result(result)


Simple query 1 results

Our results contain some titles that I've already seen. I can deal with this by weighting recency to bias my results towards recent titles. Weights are normalized to have unit sum (i.e., all weights are adjusted so that they always sum to a total of 1), so you don't have to worry about the scale you set them on.


    result: Result = app.query(
        simple_query,
        query_text="Heartfelt romantic comedy",
        description_weight=1,
        title_weight=1,
        genre_weight=1,
        recency_weight=3,
        limit=TOP_N,
    )
    present_result(result)


Simple query 2 results

My results (above) are now all post-2021.
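The unit-sum normalization mentioned above is easy to check by hand. The sketch below only illustrates the arithmetic as the text describes it; `normalize_weights` is a hypothetical helper, not a Superlinked function.

```python
def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return {name: value / total for name, value in weights.items()}


# The weights from the recency-biased query above.
shares = normalize_weights({"description": 1, "title": 1, "genre": 1, "recency": 3})

# Doubling every weight leaves the normalized shares unchanged.
same_shares = normalize_weights({"description": 2, "title": 2, "genre": 2, "recency": 6})
```

With weights of 1, 1, 1, and 3, recency ends up with half of the total influence and each text space with one sixth. Only the ratios between weights matter, not their absolute size.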


Using the simple query, I can weight any specific space (description, title, genre, or recency) to make it count more when returning results. Let's experiment with this. Below, we'll give more weight to genre and down-weight title, since my query text is basically just a genre with some additional context. I keep a recency weight because I'd still like my results to be biased towards recent movies.


    result = app.query(
        simple_query,
        query_text="Heartfelt romantic comedy",
        description_weight=1,
        title_weight=0.1,
        genre_weight=2,
        recency_weight=1,
        limit=TOP_N,
    )
    present_result(result)


This query pushes the release year back a little to give me more genre-weighted results (below).

Simple query 3 results

Advanced Query

The advanced query gives me even more fine-grained control. I retain control over recency, but I can also specify the search text for description, title, and genre, and assign each a specific weight according to my preferences, as shown below (and in cells 19-21).

    result = app.query(
        advanced_query,
        description_query_text="Heartfelt lovely romantic comedy for a cold autumn evening.",
        title_query_text="love",
        genre_query_text="drama comedy romantic",
        description_weight=0.2,
        title_weight=3,
        genre_weight=1,
        recency_weight=5,
        limit=TOP_N,
    )
    present_result(result)


Search with a Specific Movie

Say that in the results of my last query, I found a movie I've already seen and would like to see something similar. Let's assume I like White Christmas, a 1954 romantic comedy (id = tm16479) about singer-dancers coming together for a stage show to attract guests to a struggling Vermont inn. By adding an extra with_vector clause (with a movie_id parameter) to advanced_query, with_movie_query lets me search using this movie (or any movie I like), and gives me full fine-grained control over the query text and weighting of each separate subsearch.

First, we add our movie_id parameter:

    with_movie_query = advanced_query.with_vector(movie, Param("movie_id"))
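Conceptually, searching "around" a reference item means the stored vector of that item becomes part of the query. The sketch below illustrates the idea with hand-made 2-dimensional vectors. It is an assumption-laden toy, not how with_vector is implemented: the catalog entries, the vector axes, and the helper `search_around` are all hypothetical. Only the id tm16479 comes from the article.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical vectors, axes = ("stage musical", "family"). Invented by hand.
CATALOG = {
    "tm16479": [1.0, 0.2],   # the reference movie
    "show_a": [0.9, 0.1],    # made-up title close to the reference
    "family_b": [0.5, 0.9],  # made-up, more family-oriented title
}
FAMILY_TEXT_VECTOR = [0.0, 1.0]  # hypothetical embedding of the text "family"


def search_around(reference_id: str, text_vector: list[float], text_weight: float) -> list[str]:
    """Rank the other movies against reference vector + weighted text vector."""
    reference = CATALOG[reference_id]
    query = [r + text_weight * t for r, t in zip(reference, text_vector)]
    scored = [
        (cosine(query, vector), movie_id)
        for movie_id, vector in CATALOG.items()
        if movie_id != reference_id  # this sketch leaves the reference out
    ]
    return [movie_id for _, movie_id in sorted(scored, reverse=True)]


by_movie_only = search_around("tm16479", FAMILY_TEXT_VECTOR, text_weight=0.0)
nudged_to_family = search_around("tm16479", FAMILY_TEXT_VECTOR, text_weight=2.0)
```

With no text weight, the nearest neighbor of the reference wins. Adding a weighted "family" text vector pulls the query towards the family axis and reverses the ranking.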


And then I can set my other subsearch query texts either to empty or to whatever is most relevant, along with any weights that make sense. Let's say my first query returns results that reflect the stage performance/band aspect of White Christmas (see cell 24), but I want to watch a movie that's more family-oriented. I can enter a description_query_text to skew my results in the desired direction.

    result = app.query(
        with_movie_query,
        description_query_text="family",
        title_query_text="",
        genre_query_text="",
        description_weight=1,
        title_weight=0,
        genre_weight=0,
        recency_weight=0,
        description_query_weight=1,
        movie_id="tm16479",
        limit=TOP_N,
    )
    present_result(result)


Advanced query 1 results

But now that I see my results, I realize I'm actually more in the mood for something light-hearted and funny. Let's adjust my query accordingly:


    # assign to lowercase `result` so the Result class is not overwritten
    result = app.query(
        with_movie_query,
        description_query_text="",
        title_query_text="",
        genre_query_text="comedy",
        description_weight=1,
        title_weight=0,
        genre_weight=2,
        recency_weight=0,
        description_query_weight=1,
        movie_id="tm16479",
        limit=TOP_N,
    )
    present_result(result)


Advanced query 2 results

Okay, those results are better. I'll choose one of these. Put the popcorn on!

Conclusion

Superlinked makes it easy to test, iterate, and improve your retrieval quality. Above, we've walked you through how to use the Superlinked library to do a semantic search on a vector space, like Netflix does, and return accurate, relevant movie results. We've also seen how to fine-tune our results, tweaking weights and search terms until we get to just the right outcome.

Now, go try out the notebook yourself, and see what you can achieve!

Try It Yourself: Get the Code & Demo!

  • 💾 Grab the Code: Check out the full implementation in our GitHub repo here. Fork it, tweak it, and make it your own!


  • 🚀 See It in Action: Want to see this working in a real-world setup? Book a quick demo, and explore how Superlinked can supercharge your recommendations. Get a demo now!


Recommendation engines are shaping the way we discover content. Whether it's movies, music, or products, vector search is the future, and now you have the tools to build your own.


Author: Mór Kapronczay
