Data Sources
Ops Center — external APIs, content channels, and data infrastructure
EKIS Inferentia2 via vLLM
Self-hosted DeepSeek-R1-Distill-Llama-8B on AWS Inferentia2 (inf2.xlarge). Normalizes Reddit posts, fixes automotive slang, redacts PII. Endpoint: vllm.ekis.internal:8000
Google Cloud Platform
YouTube Data API
APIVideo channel discovery and metadata ingestion
Owner Manual Channels
ContentOEM owner manual scraping and indexing pipelines
Brand Brochure Channels
ContentBrand marketing brochure discovery and ingestion
OEM Brand Images
ContentManufacturer press photos and vehicle imagery from OEM newsrooms
OEM Vehicle Colors
ContentOEM brand color names, codes, and color swatch imagery per make/model/year/trim
Dealership Groups
ContentDealer group hierarchies, rooftop profiles, and franchise data
Approved Makes (Brands)
RegistryCanonical make/brand registry driving all corpus ingestion
Reddit Channels
APISubreddit sources for real-owner experience data
NHTSA Content
APIRecalls, complaints, safety ratings, and car seat stations
Bright Data
ProxyWeb data collection infrastructure and proxy network
Decodo
APIWeb scraping API for Reddit and structured content extraction
Firecrawl
APIWeb crawling and content extraction service
Pinecone
Vector DBVector indexes for brochures, manuals, Reddit, and multimodal data
Google Gemini
AISearch grounding, content analysis, and VLM processing
Fortellis (CDK)
APIDMS integration for vehicle lookup and repair order sync
Vertex AI
AIGoogle Cloud AI services for image and content processing
Neo4j Graph
Graph DBKnowledge graph for features, vehicles, and relationships
DataOne VIN Decode
APIVIN decoding and vehicle specification data