A Guide on How to Build an AI-Powered Automated Product Enrichment Pipeline for Shopify

Written by konsole | Published 2025/04/27
Tech Story Tags: shopify | artificial-intelligence | automation | shopify-apps | data-pipeline | github-actions-workflow | github-actions | hackernoon-top-story

TLDRWe’ll build a pipeline using GitHub Actions to export the latest products from the Shopify store, perform some actions using LLM, and update the products.via the TL;DR App

Engineering a fully automated workflow for a Shopify store

Maintaining a successful e-commerce store comes with its fair share of challenges. It demands constant attention to ever-changing details across inventory, customer experience, and platform updates. With so many moving parts, manual oversight can quickly become overwhelming, error-prone, and time-consuming.

That’s where automation steps in — not just as a convenience but as a necessity to keep your store running efficiently and at scale. While Shopify offers a rich ecosystem of apps and drag-and-drop interfaces, it often requires you to trade transparency and control for convenience.

Taking Back Control

Let the robots worry about the boring stuff!

Sooner or later, you will hit the limits with off-the-shelf apps and manual workflows and start looking for alternatives. One such alternative is to shift away from GUI-centric tools toward programmable pipelines that offer complete flexibility and control. What you want is:

  • Full ownership of your data
  • Enhancements tailored to your brand and products
  • Shareable Workflows: multiple stores could use the same workflow with little to no tweaks
  • Confidence in every step of the process

Now, let’s explore how we can build an automated CI pipeline to help mitigate the issues mentioned above. As a proof-of-concept, we’ll create a pipeline to streamline our product-content workflow. The pipeline will use LLM to review the latest products on our store, optimize the title, add SEO title and description, and generate a summary for the team to review.

The Stack

Here’s what powers the workflow:

  • Shopify — where our products live
  • GitHub Actions — for orchestration and automation
  • ShopCTL — A command line utility for Shopify store management
  • OpenAI API — to revise product titles, generate SEO content, and provide suggestions
  • Python and some Bash scripts — for the enrichment logic and updates

First Things First — Setting Up the Stack

Let’s start by setting up a GitHub Actions workflow. We’ll store pipeline configs in the .github/workflows/ directory. Create a file named enrich-products.yml inside the workflows directory. This file will define jobs for our product-content workflow.

# .github/workflows/enrich-products.yml

name: Shopify Product Enrichment

on:
  workflow_dispatch:

The workflow_dispatch event in GitHub Actions allows you to manually trigger a workflow from the GitHub interface or via the API, or you can schedule it to run automatically at a specific time.

API Keys

We’d need a few API keys to complete our configuration: OPENAI_API_KEY for AI operations and SHOPIFY_ACCESS_TOKEN to communicate with our store.

Get the OpenAI API key from your OpenAI account, and set it as a secret in GitHub. Getting a Shopify access token is tricky since you need to create a dummy app to do so. Follow this official guide to get one.

ShopCTL

We’ll use a command-line tool to export and update our products. Let’s create a custom action that we can reuse to reference in our pipeline. Create a file called setup-shopctl.yml inside the actions directory and add the following config.

# .github/workflows/actions/setup-shopctl.yml

name: Setup ShopCTL
description: Installs Go and ShopCTL CLI
runs:
  using: "composite"
  steps:
    - name: Set up Go
      uses: actions/setup-go@v5
      with:
        go-version: "1.24"

    - name: Install ShopCTL
      shell: bash
      run: |
        sudo apt-get update
        sudo apt-get install -y libx11-dev
        go install github.com/ankitpokhrel/shopctl/cmd/shopctl@main
        echo "$HOME/go/bin" >> "$GITHUB_PATH"

Apart from custom actions, we need to add a configuration for the store we’re operating. Create a folder called shopctl on the repo’s root and add the following config in a file called .shopconfig.yml. Replace all occurrences of store1 with your store alias.

# shopctl/.shopcofig.yml

ver: v0
contexts:
    - alias: store1
      store: store1.myshopify.com
currentContext: store1

Finalizing the Pipeline

Full source for the pipeline can be found here.

Our pipeline has four stages, viz: Export -> Enrich -> Update -> Notify

Stage 1: Export Products

The first step in our pipeline is to export the latest products from our store. Add a job called export-products in the enrich-products.yml file we created earlier.

jobs:
  export-products:
    runs-on: ubuntu-latest
    env:
      SHOPIFY_ACCESS_TOKEN: ${{ secrets.SHOPIFY_ACCESS_TOKEN }} # The secret we set earlier
      SHOPIFY_CONFIG_HOME: ${{ github.workspace }} # This will tell shopctl to use current dir to look for .shopconfig
    outputs:
      has-data: ${{ steps.check.outputs.has_data }}

    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Setup ShopCTL
        uses: ./.github/workflows/actions/setup-shopctl

      - name: Export products
        run: |
          mkdir -p data

          # Export latest data (last 7 days) using the shopctl tool as latest_products.tar.gz
          shopctl export -r product="created_at:>=$(date -v -7d +%Y-%m-%d)" -o data/ -n latest_products -vvv

      - name: Check if export has data
        id: check
        run: |
          if [ -s data/latest_products.tar.gz ]; then
            echo "has_data=true" >> "$GITHUB_OUTPUT"
          else
            echo "has_data=false" >> "$GITHUB_OUTPUT"
            echo "No products found to process"
          fi

      - name: Upload exported products
        if: steps.check.outputs.has_data == 'true'
        uses: actions/upload-artifact@v4
        with:
          name: exported-products
          path: data/latest_products.tar.gz

The job above will set up ShopCTL using the custom action we created earlier. It will export all products created in the last 7 days and upload them as artifacts if any new products exist.

Stage 2a: Review Catalog

The next thing we want to do is to review our catalog. We’ll use the OpenAI API to review product data samples and identify the following:

  • Issues or inconsistencies in tags, product types, or variants
  • Missing or inconsistent inventory information
  • Gaps in product configuration or variant structure
  • Duplicate or overly similar products
  • General recommendations to improve catalog quality and its completeness

review-catalog:
    needs: export-products
    runs-on: ubuntu-latest
    env:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Download product export
        uses: actions/download-artifact@v4
        with:
          name: exported-products
          path: data/

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Install dependencies
        run: pip install openai

      - name: Run catalog review script
        run: |
          # Assuming your script is saved in scripts/review_catalog.py
          python scripts/review_catalog.py \
            data/latest_products.tar.gz \
            data/review_summary.md

      - name: Upload catalog summary
        uses: actions/upload-artifact@v4
        with:
          name: catalog-review-summary
          path: data/review_summary.md

      - name: Final summary
        run: echo "✅ Shopify product catalog review completed!"

Notice the needs section. We want to run it after products are exported and made available as artifacts. We also need to set up Python, as our review script is written in Python. You can use any language of your choice here. The script generates review_summary.md, which is uploaded as an artifact in the next step (example output below).

## Identified Issues

### 1. Missing or Inconsistent Information:
- Some products have missing or inconsistent `productType` (e.g. `"gid://shopify/Product/8790718087392"`, `"gid://shopify/Product/879071795632

The sample script and the prompt can be found here.

Stage 2b: Enrich Products

Similar to the review-catalog job, add an enrich-products job that will run the script to review the product title and generate an SEO title and description for the product using OpenAI. This job runs in parallel with the review catalog job and generates a CSV with details on metadata to update.


The sample script and the prompt can befound here.

Stage 3: Update Products

Once the metadata is generated in stage 2b, we can update products using ShopCTL. We’ll use a bash script instead of Python at this stage.

Add a job called update-products, as shown below.

update-products:
    needs: enrich-products
    runs-on: ubuntu-latest
    env:
      SHOPIFY_ACCESS_TOKEN: ${{ secrets.SHOPIFY_ACCESS_TOKEN }}
      SHOPIFY_CONFIG_HOME: ${{ github.workspace }}

    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Setup ShopCTL
        uses: ./.github/workflows/actions/setup-shopctl

      - name: Download enriched products
        uses: actions/download-artifact@v4
        with:
          name: enriched-products
          path: data/

      - name: Apply updates using shopctl
        run: |
          mkdir -p logs
          touch logs/audit.txt

          while IFS=, read -r pid new_title seo_title seo_desc; do
            # Strip leading/trailing quotes
            seo_desc="${seo_desc%\"}"
            seo_desc="${seo_desc#\"}"

            # Use shopctl to update product details
            if output=$(shopctl product update "$pid" \
                --title "$new_title" \
                --seo-title "$seo_title" \
                --seo-desc "$seo_desc" 2>&1); then
                echo "$pid,success" >> logs/audit.txt
            else
              sanitized_error=$(echo "$output" | tr '\n' ' ' | sed 's/,/ /g')
              echo "$pid,failure,$sanitized_error" >> logs/audit.txt
            fi
          done < <(tail -n +2 data/enriched_products.csv)

        - name: Upload audit log
          uses: actions/upload-artifact@v4
          with:
            name: product-audit-log
            path: logs/audit.txt
  
        - name: Final summary
          run: echo "✅ Shopify product enrichment and updates completed!"

The job is relatively simple; it uses a bash script to read from the CSV file generated in the previous step, update the product using ShopCTL, and create a log file.

Stage 4: Notify

Now, the only thing remaining is to notify interested parties that the job has been completed (or failed) and what has changed. You can either send a Slack notification or email the details. We will simply fetch and print the logs for the tutorial’s sake.

notify:
    needs: [review-catalog, update-products]
    runs-on: ubuntu-latest

    steps:
      - name: Download audit log
        uses: actions/download-artifact@v4
        with:
          name: product-audit-log
          path: logs/

      - name: Download catalog review
        uses: actions/download-artifact@v4
        with:
          name: catalog-review-summary
          path: data/

      - name: Print audit summary
        run: |
          ls -lah logs/
          ls -lah data/
          echo "🧾 Shopify Product Update Audit"
          echo "-------------------------------"

          total=$(wc -l < logs/audit.txt)
          updated=$(grep -c ',success' logs/audit.txt || true)
          failed=$(grep -c ',failure' logs/audit.txt || true)

          echo "✅ Success: $updated"
          echo "❌ Failed: $failed"
          echo "📦 Total Processed: $total"
          echo ""
          echo "📋 Detailed Audit:"
          cat logs/audit.txt

      - name: Print catalog review summary
        run: |
          echo ""
          echo "🧠 Catalog Review Summary"
          echo "-------------------------"
          cat data/review_summary.md

Putting It All Together

The example above showcases how you can leverage available tools to create something unique and powerful, tailored to your use case, without handing over sensitive store data to external apps.

While our proof-of-concept skips over a few production-grade essentials — like using a staging store for manual approvals and proper error handling — it gives you a general idea of how to get started.

Takeaway

This level of flexibility and control opens up limitless possibilities — from automated A/B testing on product copies, multi-language enrichment workflows, dynamic pricing experiments, and automated inventory cleanup to personalized recommendations and beyond.

With every step in your control, you can experiment with new ideas, adapt quickly to market shifts, and scale operations effortlessly as your business grows.


Written by konsole | I've no idea what I do
Published by HackerNoon on 2025/04/27