ENTERPRISE AI PLATFORM BUILT ON REAL-WORLD RETAIL DATA

Retail AI Platform Powered by Foundation Models, Image Recognition & Demand Intelligence

Trained on 1B+ shelf images, 1M+ SKUs, and 10M+ planograms, Vision Group's retail foundation models power a unified AI engine that understands the shelf, predicts true demand, and drives agentic execution across every store.

The OpenAI analogy for retail.

OpenAI trained a language foundation model on text from the internet. Vision Group trained retail foundation models on data from the shop floor. The analogy is close, and the moat is the same: the dataset took 11 years to build, and it cannot be bought.

Gartner predicts that by 2027 more than 50% of the generative AI models enterprises use will be domain-specific. Vision Group's retail foundation models already fit that description, and they keep compounding.


AI Platform Layer

  • 3D digital twin generation
  • Automated AI training pipeline
  • Real-time agentic rules engine
  • Multi-model pipeline orchestration


Multimodal AI Models

  • Retail image recognition engine (1B+ training images)
  • 3D scene reconstruction
  • Vision-language model (price tags, menus)
  • Shelf sales volume & behavior model


Strategy AI Models

  • Natural language data intelligence
  • AI assortment optimization (Assortment.AI)
  • AI demand transfer model
  • AI consumer decision trees
  • Retail demand foundation model (Demand.AI)

The data flywheel — why models improve continuously

  • Every shelf image improves the image recognition foundation model for every customer, not just the one whose shelf was photographed
  • Every transaction improves the consumer decision tree models, building a more precise map of substitution behavior across every market
  • Every assortment decision improves the demand forecasting models, so the gap between forecast and actual narrows with every cycle

THE AI ENGINE

Three integrated sub-systems. One retail AI engine.

The AI Engine is not a single model. It is three purpose-built sub-systems — AI Platform, Multimodal AI, and Strategy AI — working together across all five intelligence layers.

SUB-SYSTEM 1

AI Platform Layer

The foundation of the system — combining 3D digital twin generation, automated AI training pipelines, a real-time agentic rules engine, and multi-model orchestration to continuously train, manage, and deploy intelligence at scale.

SUB-SYSTEM 2

Multimodal AI Models

At the core is a retail image recognition engine trained on 1B+ real-world images, enhanced by 3D scene reconstruction, vision-language models that interpret price tags and menus, and behavioural models that connect what happens on shelf to what sells.

SUB-SYSTEM 3

Strategy AI Models

On top sits a layer of decision intelligence — natural language data analysis, AI-driven assortment optimisation, demand transfer modelling, and consumer decision trees — culminating in a retail demand foundation model that predicts true demand, not just observed sales.

SUB-SYSTEM 1

AI Platform Layer

3D Digital Twin Generation

Generates 3D models and complete digital twins of retail products from six-sided images or volumetric captures — creating the product content foundation that powers planogram generation, shelf recognition, and space planning across all five intelligence layers.

POWERS:
Product.AI Space.AI
Live

Automated AI Training Pipeline

Automates data cleaning, annotation, and synthetic dataset generation to continuously train and improve computer vision models — removing manual data labelling and enabling AI accuracy to improve at scale without proportional human effort.

POWERS:
All Five Layers
Live (Internal)

Real-Time Agentic Rules Engine

A real-time AI rules engine that continuously checks store data against Perfect Store targets and planogram standards, instantly triggering corrective actions without waiting for human review. Gartner predicts that 33% of enterprise software applications will include agentic AI by 2028. Vision Group has it live today.

POWERS:
Execution.AI
Live
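
As a rough illustration of the check-and-trigger loop described above, here is a minimal Python sketch. The rule names, metric keys, target values, and action labels are all hypothetical and chosen for illustration only; this is not Vision Group's implementation or API.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str      # hypothetical rule identifier
    metric: str    # key to look up in the store observation
    target: float  # minimum acceptable value (assumed threshold)
    action: str    # hypothetical corrective action label


def evaluate(rules, observation):
    """Return (rule name, action) for every metric that falls below its target."""
    actions = []
    for rule in rules:
        value = observation.get(rule.metric)
        if value is not None and value < rule.target:
            actions.append((rule.name, rule.action))
    return actions


# Illustrative targets, not real Perfect Store standards.
RULES = [
    Rule("on_shelf_availability", "osa", 0.95, "create_restock_task"),
    Rule("planogram_compliance", "compliance", 0.90, "create_reset_task"),
]

store_observation = {"osa": 0.91, "compliance": 0.93}
triggered = evaluate(RULES, store_observation)
print(triggered)
```

In this example only the availability rule fires, because 0.91 is below its 0.95 target while compliance at 0.93 clears its 0.90 target.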

Multi-Model AI Pipeline Orchestration

Manages complex, multi-model AI pipelines — coordinating different models simultaneously and adapting to category, regional, and retailer nuances for scalable execution across 340+ customers and 75+ countries, without custom engineering for each deployment.

POWERS:
All Five Layers
Live (Internal)
SUB-SYSTEM 2

Multimodal AI Models

Retail Image Recognition Engine

Converts unstructured shelf image and video data into structured retail intelligence: identifying SKUs, counting facings, detecting gaps, measuring share of shelf, and flagging compliance failures in real time. Trained on 1B+ retail images collected over 11+ years, a dataset built specifically for retail shelves.

POWERS:
Execution.AI Product.AI
Live
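
To make "measuring share of shelf" concrete, the sketch below assumes the recognition step has already produced one brand label per detected facing. The input format and brand names are assumptions for illustration, not the engine's actual output schema.

```python
from collections import Counter


def share_of_shelf(detections):
    """Compute each brand's fraction of detected facings.

    detections: list of brand labels, one entry per detected facing
    (a hypothetical simplification of a recognition result).
    """
    counts = Counter(detections)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {brand: n / total for brand, n in counts.items()}


facings = ["BrandA", "BrandA", "BrandB", "BrandA",
           "BrandC", "BrandB", "BrandA", "BrandC"]
shares = share_of_shelf(facings)
print(shares)
```

With eight facings, BrandA holds four of them (0.5) and BrandB and BrandC hold two each (0.25).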

3D Scene Reconstruction Model

Performs 3D reconstruction from multi-view shelf and store images — enabling spatial localisation, product deduplication, and precise positioning within a retail fixture. Generates planogram-accurate 3D representations of real shelves from standard field team photography, without specialist equipment.

POWERS:
Space.AI Product.AI
Live

Shelf Sales Volume & Behavior Model

Analyses shelf images at different time points to measure sales volume from shelf change — comparing before and after states to infer sell-through rates and identify which planogram layouts drive highest removal rates. Also processes shopping behaviour video to understand consumer movement, dwell time, and product interaction at the fixture level.

POWERS:
Demand.AI Assortment.AI
Live
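
The before-and-after comparison can be sketched as simple arithmetic on per-SKU unit counts. The sketch assumes the counts come from two recognised shelf images and that any restocking between them is known; the SKU names and the `restocked` parameter are hypothetical, and a real model would also have to handle units hidden behind the front row.

```python
def units_removed(before, after, restocked=None):
    """Estimate units removed per SKU between two shelf snapshots.

    before / after: dicts of SKU -> visible unit count.
    restocked: optional dict of SKU -> units added between snapshots.
    """
    restocked = restocked or {}
    removed = {}
    for sku, count_before in before.items():
        delta = count_before + restocked.get(sku, 0) - after.get(sku, 0)
        removed[sku] = max(delta, 0)  # never report negative removals
    return removed


def sell_through_rate(removed, hours):
    """Convert removed units into units per hour."""
    return {sku: units / hours for sku, units in removed.items()}


before = {"cola_330ml": 12, "lemonade_1l": 8}
after = {"cola_330ml": 7, "lemonade_1l": 8}

removed = units_removed(before, after)
rates = sell_through_rate(removed, hours=4)
print(removed, rates)
```

Here five units of cola left the shelf over four hours (1.25 units per hour), while lemonade did not move.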

SUB-SYSTEM 3

Strategy AI Models

Natural Language Data Intelligence

An LLM-based auto data analysis system that answers natural-language questions by querying all platform data in real time. Category managers can ask "which SKUs are driving the most category exits in the South East?" and receive a data-backed answer in seconds — without a data analyst or SQL query.

POWERS:
All Five Layers
Live

AI Assortment Optimization Model

Recommends the optimal product assortment for each store cluster, rather than relying on banner-wide averages. Inputs include demand signals, consumer decision trees, store demographics, product attributes, and category strategy constraints. The output is a ranked assortment recommendation with the expected revenue lift for each change.

POWERS:
Assortment.AI
Live

AI Demand Transfer Model

Predicts where and how demand shifts when products are added or removed from the range — quantifying transfer to adjacent SKUs, competitor brands, and category exit at store-cluster level. Built from POS and loyalty data across 340+ customers. Commercially live — drives Assortment.AI simulations today.

POWERS:
Assortment.AI Demand.AI
Commercially Live
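
One simple way to picture demand transfer is as a set of shares that split a delisted SKU's demand between the remaining SKUs and category exit. The sketch below assumes those shares are already known; the SKU names and share values are hypothetical, and the real model estimates them from POS and loyalty data rather than taking them as inputs.

```python
def transfer_demand(baseline, removed_sku, transfer_shares):
    """Reallocate a removed SKU's demand to remaining SKUs.

    baseline: dict of SKU -> weekly units.
    transfer_shares: dict of SKU -> fraction of the removed SKU's demand
    it absorbs (assumed inputs). Whatever is not absorbed is treated as
    category exit.
    """
    lost = baseline[removed_sku]
    allocated = sum(transfer_shares.values())
    if allocated > 1.0 + 1e-9:
        raise ValueError("transfer shares must not exceed 1.0 in total")

    new_demand = {sku: units for sku, units in baseline.items()
                  if sku != removed_sku}
    for sku, share in transfer_shares.items():
        new_demand[sku] = new_demand.get(sku, 0.0) + lost * share

    category_exit = lost * (1.0 - allocated)
    return new_demand, category_exit


baseline = {"sku_a": 100.0, "sku_b": 60.0, "sku_c": 40.0}
shares = {"sku_b": 0.5, "sku_c": 0.25}

new_demand, exit_units = transfer_demand(baseline, "sku_a", shares)
print(new_demand, exit_units)
```

Removing sku_a moves 50 units to sku_b and 25 to sku_c, and the remaining 25 units leave the category.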

AI Consumer Decision Tree Model

Maps the hierarchy of purchase decisions consumers make at the fixture — brand first, category first, price tier first, or occasion first — segmented by store type, channel, and shopper cohort. Built from real observed POS and loyalty data. Drives assortment decisions that reflect how shoppers actually think, not how planners assume they think.

POWERS:
Assortment.AI Demand.AI
Commercially Live

Retail Demand Foundation Model

Predicts sales volume based on in-store display conditions, planogram compliance, promotional execution, and external factors including seasonality, events, and weather. Unlike standard forecasting models that use only historical POS data, Vision Group's model incorporates real shelf execution state — knowing whether a product is actually on shelf, correctly placed, and correctly priced — producing forecasts that reflect true demand rather than observed sales constrained by execution failures.

WHAT MAKES IT DIFFERENT

→ Standard models forecast from observed sales — constrained by stockouts and execution failures

→ This model incorporates real shelf execution state from Execution.AI before forecasting

→ Result: forecasts reflect true demand — not what sold, but what consumers wanted to buy

POWERS:
Assortment.AI Demand.AI
Phase 2
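
The difference between observed sales and true demand can be illustrated with a deliberately simple correction: scale each period's sales by the fraction of time the product was actually on shelf. This proportional adjustment is a stand-in chosen for illustration, not Vision Group's model, and the `min_availability` cut-off is an assumed parameter.

```python
def unconstrained_demand(observed_sales, availability, min_availability=0.2):
    """Estimate demand per period as if the product had always been on shelf.

    observed_sales: units sold per period.
    availability: fraction of each period the product was on shelf (0 to 1).
    Periods with too little on-shelf time return None rather than an
    unreliable extrapolation.
    """
    estimates = []
    for sold, avail in zip(observed_sales, availability):
        if avail < min_availability:
            estimates.append(None)
        else:
            estimates.append(sold / avail)
    return estimates


observed = [40, 30, 5]
on_shelf = [1.0, 0.5, 0.1]

estimates = unconstrained_demand(observed, on_shelf)
print(estimates)
```

In the second period the product sold 30 units while on shelf only half the time, so the estimated demand is 60; the third period has too little on-shelf time to extrapolate from.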
WHAT THIS MEANS COMPETITIVELY

No retail AI vendor has a model stack of this depth and breadth.

The Gartner predictions about domain-specific, agentic, and multimodal AI describe what Vision Group has already built and has been running in production for years.

  • RELEX: forecasting models only
  • Trax: image recognition only
  • NielsenIQ: analytics models only
  • Vision Group: all three sub-systems, unified

See the AI engine at work.

Demo includes a live walkthrough of the foundation models and how they power each intelligence layer.

Schedule a Demo