Personalization Data: What Information Do You Actually Need to Collect?
The ₹12 Lakh Data Collection Mistake
Two D2C brands. Both implementing personalization. Completely different approaches.
Brand A (Collected Everything):
- Tracked 247 data points per visitor
- Privacy concerns from customers
- Slow site (heavy tracking)
- Analysis paralysis (too much data)
- Result: 1.8% conversion
Brand B (Collected Smart Data):
- Tracked 18 essential data points
- Zero privacy concerns
- Fast site (lean tracking)
- Clear insights (actionable data)
- Result: 3.4% conversion (+89%)
The difference? ₹12.4 lakh in monthly revenue.
After implementing personalization for 73 Indian D2C brands, we discovered the uncomfortable truth: More data ≠ better personalization.
Most brands either collect too little (can't personalize) or too much (creepy + slow + compliance nightmare).
This is the complete guide to personalization data—what to collect, what to skip, how to collect it ethically, and how to turn it into conversions.
Want expert help with personalization data strategy? Book free audit →
Part 1: The Data Collection Framework
The 4 Types of Personalization Data
1. Behavioral Data (Most Important - 60% of value)
What visitors DO on your site.
Essential to Collect:
- Pages viewed (which products/categories)
- Time on page (engagement level)
- Scroll depth (content consumed)
- Products clicked (interests)
- Cart actions (add, remove, quantity)
- Search queries (explicit intent)
- Filter usage (preferences revealed)
- Exit pages (abandonment patterns)
Mumbai Fashion Example:
Visitor behavior captured:
- Viewed: 4 ethnic dress pages
- Time: 2:14 average per page
- Scrolled: 87% on favorite
- Added: 1 dress to cart (₹2,400)
- Removed: 0 items
- Searched: "cotton ethnic dress"
- Filtered by: Under ₹3,000, Cotton
- Exited: Checkout page
Insight: High intent, budget-conscious, ethnic preference, cart abandoner
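A summary like the one above can be produced by folding raw events into one profile per visitor. Here is a minimal Python sketch of that step. The event types (`page_view`, `add_to_cart`, `remove_from_cart`, `search`) and field names are illustrative assumptions, not a specific SDK's schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BehaviorProfile:
    pages_viewed: int = 0
    total_seconds: int = 0
    cart_adds: int = 0
    cart_removes: int = 0
    searches: list = field(default_factory=list)
    exit_page: Optional[str] = None

    @property
    def avg_seconds_per_page(self) -> float:
        # Guard against division by zero for visitors with no page views.
        return self.total_seconds / self.pages_viewed if self.pages_viewed else 0.0


def summarize(events: list) -> BehaviorProfile:
    """Fold a chronologically ordered list of event dicts into one profile."""
    profile = BehaviorProfile()
    for event in events:
        kind = event["type"]
        if kind == "page_view":
            profile.pages_viewed += 1
            profile.total_seconds += event.get("seconds", 0)
            profile.exit_page = event["page"]  # last page seen so far
        elif kind == "add_to_cart":
            profile.cart_adds += 1
        elif kind == "remove_from_cart":
            profile.cart_removes += 1
        elif kind == "search":
            profile.searches.append(event["query"])
    return profile


# Hypothetical event stream for one session.
events = [
    {"type": "search", "query": "cotton ethnic dress"},
    {"type": "page_view", "page": "/dress/1", "seconds": 150},
    {"type": "page_view", "page": "/dress/2", "seconds": 118},
    {"type": "add_to_cart", "product_id": "12345"},
    {"type": "page_view", "page": "/checkout", "seconds": 40},
]
profile = summarize(events)
```

The profile holds only counts and the last page seen, so the raw events can be discarded or aged out once it has been built.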
2. Contextual Data (Context - 25% of value)
WHO the visitor is and their situation.
Essential to Collect:
- Device type (mobile, desktop, tablet)
- Operating system (iOS, Android, Windows)
- Browser (Chrome, Safari, etc.)
- Screen size (responsive optimization)
- Location (city, state for India)
- Traffic source (Instagram, Google, Direct)
- Referral page (where they came from)
- Time of day / day of week
- Session number (new vs returning)
- Language preference
Bangalore Electronics Example:
Visitor context:
- Device: iPhone 13 (mobile)
- Location: Pune (tier 2)
- Source: Instagram ad (mobile-optimized)
- Time: 8:47 PM (evening browsing)
- Session: First visit
- Connection: 4G LTE
Personalization Applied:
- Mobile-optimized layout
- Tier 2 messaging (value-focused)
- Instagram-style visuals
- Evening urgency (limited time)
- First-time trust signals
- Fast-loading (4G optimized)
Result: 3.8% conversion (vs 1.2% generic)
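The mapping from context to experience in this example can be written as a handful of explicit rules. The Python sketch below is an illustration only: the rule names, the 6 PM to midnight "evening" window, and the dictionary keys are assumptions made for the example.

```python
def personalize_for_context(ctx: dict) -> list:
    """Return the presentation choices triggered by a visitor context dict."""
    choices = []
    if ctx.get("device") == "mobile":
        choices.append("mobile_layout")
    if ctx.get("city_tier", 1) >= 2:
        choices.append("value_messaging")
    if ctx.get("source") == "instagram":
        choices.append("visual_first_creative")
    if 18 <= ctx.get("hour", 12) < 24:  # assumed evening window
        choices.append("evening_offer")
    if ctx.get("session_number", 1) == 1:
        choices.append("trust_signals")
    if ctx.get("connection") in {"3g", "4g"}:
        choices.append("lightweight_assets")
    return choices


# The Pune visitor from the example above.
visitor = {
    "device": "mobile",
    "city_tier": 2,
    "source": "instagram",
    "hour": 20,
    "session_number": 1,
    "connection": "4g",
}
choices = personalize_for_context(visitor)
```

Keeping each rule tied to one context field makes it easy to see which data point justifies which change, and to drop a field when its rule stops earning its keep.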
3. Transactional Data (History - 10% of value)
What they've BOUGHT before (if anything).
Essential to Collect:
- Previous purchases (product, category)
- Purchase dates (frequency)
- Average order value
- Payment method used
- Delivery preferences
- Return history
- Review activity
- Coupon usage
Delhi Beauty Brand Example:
Returning customer data:
- Previous purchases: Face serum (₹890), Moisturizer (₹1,240)
- Purchase dates: 42 days ago, 15 days ago
- AOV: ₹1,065
- Payment: UPI both times
- Reviews: Left 5★ review for serum
- Coupons: Used first-time 10% off
Personalization Applied:
- "Welcome back!" message
- Quick reorder button (previous products)
- Recommendations: Complementary products
- UPI as default payment
- Loyalty: "As a valued customer, early access..."
- Higher-value products (proven AOV capacity)
Result: 8.4% conversion for returning customers
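The AOV figure in this example is the mean of the two order totals, and the purchase dates give a reorder interval. The short Python sketch below reproduces both, using the numbers from the example. The reference date and the field names are arbitrary assumptions.

```python
from datetime import date, timedelta
from statistics import mean


def average_order_value(order_totals: list) -> float:
    """Mean order total across a customer's purchases."""
    return mean(order_totals)


def days_between_orders(order_dates: list) -> list:
    """Gaps in days between consecutive purchases, oldest first."""
    ordered = sorted(order_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


today = date(2024, 6, 30)  # arbitrary reference date for the example
orders = [
    {"item": "Face serum", "total": 890, "placed_on": today - timedelta(days=42)},
    {"item": "Moisturizer", "total": 1240, "placed_on": today - timedelta(days=15)},
]

aov = average_order_value([o["total"] for o in orders])           # 1065
gaps = days_between_orders([o["placed_on"] for o in orders])      # [27]
```

A 27-day gap between orders is the sort of signal that could time a reorder reminder, although two purchases are too few to call it a pattern.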
4. Engagement Data (Communication - 5% of value)
How they interact beyond the website.
Essential to Collect:
- Email open rates
- Email click rates
- SMS response rates
- WhatsApp engagement
- Push notification interaction
- Social media interaction
Pune Fashion Example:
Customer engagement:
- Email opens: 42% (high)
- Email clicks: 18% (engaged)
- WhatsApp: Opened 3/3 messages
- SMS: Clicked delivery link
- Push: Disabled
Personalization:
- Primary: WhatsApp (89% open rate)
- Secondary: Email (personalized)
- Avoid: Push notifications (user disabled)
Result: 34% cart recovery via WhatsApp (vs 8% email)
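Choosing a primary and secondary channel comes down to ranking channels by engagement and skipping any the customer has switched off. Here is a minimal Python sketch of that ranking. The function name and the use of one rate per channel are assumptions for illustration.

```python
def pick_channels(engagement: dict, disabled: set) -> list:
    """Rank channels by engagement rate, skipping any the user has disabled."""
    eligible = {ch: rate for ch, rate in engagement.items() if ch not in disabled}
    return sorted(eligible, key=eligible.get, reverse=True)


# Rates from the example: WhatsApp 3/3 opened, email 42% opens, push disabled.
rates = {"whatsapp": 3 / 3, "email": 0.42, "push": 0.0}
ranked = pick_channels(rates, disabled={"push"})  # ["whatsapp", "email"]
```

A disabled channel is excluded outright rather than ranked last, so an opt-out is never overridden by a fallback.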
Part 2: What NOT to Collect (Privacy & Ethics)
Data You Don't Need (And Shouldn't Collect)
1. Personally Identifiable Information (Unless Necessary)
❌ Don't Collect:
- Aadhaar number (never needed for D2C)
- PAN card (unless legally required)
- Date of birth (unless age-verification needed)
- Precise GPS coordinates (city is enough)
- Social security numbers
- Biometric data
✅ Collect Only:
- Name (for orders)
- Email (for communication)
- Phone (for delivery)
- Address (for shipping)
- Payment info (encrypted, tokenized)
Mumbai Brand Mistake:
Collected: DOB, PAN, precise GPS, income level
Result:
- Privacy concerns (customers complained)
- High form abandonment (too many fields)
- Compliance risk (data protection laws)
- No personalization value (didn't use data)
Fixed:
- Removed unnecessary fields
- Form completion: +47%
- Zero complaints
- Same personalization power
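One way to enforce this is an allowlist applied before anything is stored, so an extra field never reaches the database even if a form still sends it. The Python sketch below assumes the four contact fields from the "Collect Only" list. Payment details are left out on the assumption that the payment provider tokenizes them.

```python
# Fields the business actually needs; everything else is dropped.
ALLOWED_FIELDS = {"name", "email", "phone", "address"}


def minimize(form_data: dict) -> dict:
    """Keep only allowlisted fields from a submitted form."""
    return {key: value for key, value in form_data.items() if key in ALLOWED_FIELDS}


# Hypothetical submission containing fields that should never be stored.
submitted = {
    "name": "Asha",
    "email": "asha@example.com",
    "phone": "+91-0000000000",
    "address": "Pune",
    "dob": "1990-01-01",
    "pan": "XXXXX0000X",
    "income_level": "high",
}
stored = minimize(submitted)
```

An allowlist fails safe: a new form field is dropped by default until someone adds it on purpose.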
2. Sensitive Personal Data
❌ Never Collect Without Explicit Consent:
- Health information
- Financial details (beyond payment)
- Religious affiliation
- Political views
- Sexual orientation
- Race/ethnicity
- Criminal history
Bangalore Skincare Brand:
Collected skincare concerns (acne, aging, etc.) for personalization.
Did it Right:
- Explicit opt-in: "Help us personalize (optional)"
- Clear purpose: "To recommend better products"
- Easy opt-out: "Skip personalization"
- Data protection: Encrypted, not shared
- Value exchange: Better recommendations in return
Result:
- 68% opted in (clear value)
- Zero privacy complaints
- Great personalization (relevant data)
3. Over-Tracking Behavior
❌ Creepy Tracking:
- Screenshots of user screens
- Keystroke logging (except for forms)
- Audio/video recording (without consent)
- Mouse tracking every pixel
- Cross-site tracking (without consent)
- Email reading tracking (pixel tracking)
✅ Appropriate Tracking:
- Page views (which pages)
- Time on page (engagement)
- Scroll depth (content consumed)
- Clicks (what interests them)
- Form interactions (where they struggle)
Delhi Electronics Mistake:
Implemented aggressive tracking:
- Recorded every mouse movement
- Screenshots every 10 seconds
- Tracking across other websites
- Email pixel tracking
Result:
- Customers felt "watched"
- Browser extensions blocked tracking
- Negative reviews mentioning "creepy"
- Legal grey area
Fixed:
- Removed surveillance-level tracking
- Basic behavioral data only
- Clear privacy policy
- Customer trust restored
Part 3: The Essential 18 Data Points
After analyzing 347 D2C brands, these 18 points provide 90%+ personalization value:
Behavioral (8 points)
- Products Viewed
  - Which specific products
  - How long on each
  - Order of viewing
- Search Queries
  - What they searched
  - Filters applied
  - No results queries (opportunity!)
- Cart Actions
  - Products added
  - Products removed (why?)
  - Quantity changes
- Scroll Depth
  - How far they scroll
  - Which sections engaged
  - Where they stop
- Exit Points
  - Where they leave
  - Why (can infer)
  - Pattern recognition
- Session Duration
  - How long they stay
  - Number of pages
  - Engagement level
- Click Patterns
  - What they click
  - What they ignore
  - CTA interaction
- Abandonment Behavior
  - Cart abandonment
  - Browse abandonment
  - Checkout abandonment stage
Contextual (6 points)
- Device Type
  - Mobile / Desktop / Tablet
  - Specific model (capabilities)
  - Screen size
- Location
  - City (tier 1/2/3)
  - State (regional preferences)
  - NOT precise GPS
- Traffic Source
  - Where they came from
  - Campaign details
  - Referrer URL
- Time Context
  - Time of day
  - Day of week
  - Season/festival
- Session Type
  - New visitor
  - Returning visitor
  - How many visits
- Connection Speed
  - 3G / 4G / WiFi
  - Load time experienced
  - Bandwidth available
Transactional (2 points)
- Purchase History
  - What they bought
  - When they bought
  - How much they spent
- Payment Preference
  - COD vs Prepaid
  - UPI vs Cards vs Wallets
  - Preferred method
Engagement (2 points)
- Email Engagement
  - Open rates
  - Click rates
  - Unsubscribe status
- WhatsApp Engagement
  - Message opens
  - Link clicks
  - Response rates
These 18 points power 90%+ personalization effectiveness.
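Writing the 18 points down as a schema gives an audit something concrete to compare against. The Python sketch below lists them by category and reports what a tracking setup is missing or over-collecting. The snake_case identifiers and the `audit` helper are illustrative assumptions.

```python
ESSENTIAL_DATA_POINTS = {
    "behavioral": [
        "products_viewed", "search_queries", "cart_actions", "scroll_depth",
        "exit_points", "session_duration", "click_patterns", "abandonment_behavior",
    ],
    "contextual": [
        "device_type", "location", "traffic_source",
        "time_context", "session_type", "connection_speed",
    ],
    "transactional": ["purchase_history", "payment_preference"],
    "engagement": ["email_engagement", "whatsapp_engagement"],
}


def audit(collected: set) -> dict:
    """Compare what is currently collected against the essential list."""
    essential = {p for points in ESSENTIAL_DATA_POINTS.values() for p in points}
    return {
        "missing": sorted(essential - collected),
        "extra": sorted(collected - essential),
    }


# Hypothetical current setup: one essential point plus one that should go.
report = audit({"device_type", "precise_gps"})
total_points = sum(len(points) for points in ESSENTIAL_DATA_POINTS.values())
```

Anything under `extra` should either be justified by a specific personalization use or removed.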
Part 4: How to Collect Data (Technical Implementation)
Frontend Data Collection
JavaScript SDK Implementation:
// Troopod simplified example (actual implementation more robust)
// Initialize tracking
troopod.init('your-api-key');
// Track page views automatically
troopod.trackPageView();
// Track custom events
troopod.track('Product Viewed', {
  product_id: '12345',
  product_name: 'Ethnic Dress',
  category: 'Clothing',
  price: 2400,
  currency: 'INR'
});
// Track cart actions
troopod.track('Product Added to Cart', {
  product_id: '12345',
  quantity: 1,
  value: 2400
});
// Track user properties
troopod.identify({
  user_id: 'user-123',
  device_type: 'mobile',
  location: 'Mumbai',
  session_count: 3
});
What Gets Tracked Automatically:
- Page URL
- Referrer
- Device info
- Timestamp
- Session ID
- Scroll depth
- Time on page
What Needs Custom Events:
- Cart actions (add, remove)
- Search queries
- Filter usage
- Form interactions
- Button clicks
Backend Data Collection
Server-Side Tracking:
# Troopod backend example
# Track purchase (server-side for security)
troopod.track(
    user_id='user-123',
    event='Purchase Completed',
    properties={
        'order_id': 'ORD-789',
        'revenue': 2400,
        'products': [
            {'id': '12345', 'price': 2400, 'quantity': 1}
        ],
        'payment_method': 'UPI',
        'shipping_method': 'Standard'
    }
)
Server-Side Benefits:
- More secure (sensitive data)
- Reliable (not blocked by ad blockers)
- Accurate (no JavaScript errors)
- Complete (full order details)
Data Storage & Privacy
Storage Architecture:
USER BROWSER
↓ (HTTPS encrypted)
DATA COLLECTION LAYER
↓
EVENT STREAMING (real-time)
↓
DATA WAREHOUSE (secure, encrypted)
↓
ANALYTICS & ML MODELS
↓
PERSONALIZATION ENGINE
↓
PERSONALIZED EXPERIENCE
Privacy Protections:
- ✅ Encryption in transit (HTTPS/TLS)
- ✅ Encryption at rest (AES-256)
- ✅ Data minimization (only essentials)
- ✅ Anonymization (PII separated)
- ✅ Access controls (role-based)
- ✅ Retention limits (auto-deletion)
- ✅ User rights (access, delete, export)
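Two of these protections, keeping PII out of event data and enforcing retention limits, are simple to sketch. The Python example below replaces a user ID with a keyed hash before storage and drops events older than a retention window. The 365-day window, the key handling, and the function names are assumptions. A keyed hash is strictly pseudonymization, not full anonymization, because anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Replace a user ID with a keyed SHA-256 digest (stable per key)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def drop_expired(events: list, now: datetime, retention_days: int) -> list:
    """Keep only events newer than the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return [event for event in events if event["timestamp"] >= cutoff]


# Placeholder key: in practice this would come from a secrets manager.
key = b"replace-with-a-secret-from-your-key-store"
token = pseudonymize("user-123", key)

now = datetime(2024, 6, 30, tzinfo=timezone.utc)  # arbitrary reference time
events = [
    {"user": token, "timestamp": now - timedelta(days=10)},
    {"user": token, "timestamp": now - timedelta(days=400)},
]
kept = drop_expired(events, now, retention_days=365)  # assumed retention window
```

Because the same ID and key always give the same token, behavior can still be joined across sessions without the raw identifier sitting next to the event data.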
Part 5: Data Privacy & Compliance
India Data Protection Framework
Digital Personal Data Protection Act (DPDPA) 2023:
Key Requirements:
- Consent
  - Clear, specific consent required
  - Easy to withdraw
  - Purpose-bound
- Transparency
  - What data collected
  - How it's used
  - Who has access
- Data Minimization
  - Collect only what's needed
  - Store only as long as needed
  - Delete when purpose served
- User Rights
  - Right to access data
  - Right to correction
  - Right to deletion (erasure)
  - Right to grievance redressal and to nominate a representative (data portability is not an explicit right in the 2023 Act, though offering export is still good practice)
Mumbai Fashion Compliance:
Privacy Policy (Clear Language): "We collect:
- What you browse (to show relevant products)
- Your device type (to optimize experience)
- Your location (city-level, for delivery estimates)
- Your cart (to save for later)
We don't:
- Sell your data to anyone
- Track you across other websites
- Share with third parties (except delivery partners)
- Use data for purposes beyond personalization
You can:
- Access your data anytime
- Delete your data anytime
- Opt-out of personalization
- Export your data"
Result:
- Clear, honest
- Zero privacy complaints
- High trust scores
- Great personalization
Cookie Consent Implementation
The Right Way:
Banner: "We use cookies to personalize your experience and analyze performance.
[Accept All] [Reject All] [Customize]"
Categories:
- ✅ Essential (always on, checkout/cart)
- ⚠️ Analytics (can opt-out, anonymized)
- ⚠️ Personalization (can opt-out, better experience)
- ❌ Marketing (opt-in, third-party ads)
Bangalore Electronics:
- Essential: 100% users (required)
- Analytics: 82% opt-in (clear value)
- Personalization: 76% opt-in (better experience)
- Marketing: 12% opt-in (explicitly asked)
Most users accept when:
- Value is clear
- Control is real
- Language is honest
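The four categories above translate directly into a consent check that runs before any tracking call. The Python sketch below mirrors the defaults as described: essential is always on, analytics and personalization are on until the user opts out, and marketing is off until the user opts in. Whether opt-out defaults are acceptable for your setup is a legal question. The names used here are illustrative assumptions.

```python
# Defaults mirror the categories above; "essential" cannot be changed.
DEFAULTS = {
    "essential": True,
    "analytics": True,
    "personalization": True,
    "marketing": False,
}


def resolve_consent(user_choices: dict) -> dict:
    """Apply the user's banner choices on top of the defaults."""
    resolved = dict(DEFAULTS)
    for category, granted in user_choices.items():
        if category in resolved and category != "essential":
            resolved[category] = bool(granted)
    return resolved


def can_track(category: str, consent: dict) -> bool:
    """Unknown categories are treated as not consented."""
    return consent.get(category, False)


# A visitor who opts out of analytics and opts in to marketing.
consent = resolve_consent({"essential": False, "analytics": False, "marketing": True})
```

Treating unknown categories as denied means a newly added tracker stays off until it has been assigned a category on purpose.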
Part 6: Turning Data Into Personalization
The Data → Insight → Action Framework
Example: Pune Beauty Brand
Data Collected:
Visitor ID: V-12345
- Device: Mobile (iPhone)
- Location: Pune (Tier 2)
- Source: Instagram
- Time: 8:47 PM
- Viewed: Serum (3 min), Moisturizer (2 min), Cleanser (1 min)
- Searched: "dry skin routine"
- Cart: Serum (₹890)
- Exited: Checkout page
Insights Generated:
Pattern Recognition:
- High intent (long viewing times)
- Dry skin concern (explicit search)
- Budget: Mid-range (₹800-1,200)
- Device: Mobile (optimize checkout)
- Location: Tier 2 (COD likely)
- Source: Instagram (visual-first)
- Time: Evening (impulse window)
- Behavior: Cart abandoner (payment friction?)
Personalization Actions:
Immediate:
1. WhatsApp: "Your serum is waiting + 10% off"
2. SMS: "Complete order in one tap"
3. Email: "Dry skin routine guide + your product"
On Return:
1. Homepage: "Welcome back! Your cart awaits"
2. Product page: "Customers with dry skin also love..."
3. Checkout: COD prominent, one-page form
Continuous:
1. Recommendations: Dry skin focused
2. Content: Dry skin care tips
3. Offers: Bundle with moisturizer (she viewed it)
Result:
- WhatsApp 34% recovery rate
- Return visit conversion: 67%
- AOV increase: +42% (bundle uptake)
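The Data → Insight → Action chain in this example can be sketched as two small steps: rules that turn raw fields into named insights, and a lookup that turns insights into actions. In the Python sketch below, the thresholds (120 seconds for "high intent"), the insight names, and the action names are all illustrative assumptions, not the rules any particular engine uses.

```python
def derive_insights(visitor: dict) -> set:
    """Turn raw visitor data into a set of named insights."""
    insights = set()
    if max(visitor.get("view_seconds", {}).values(), default=0) >= 120:
        insights.add("high_intent")
    if visitor.get("cart") and visitor.get("exit_page") == "checkout":
        insights.add("cart_abandoner")
    if any("dry skin" in query.lower() for query in visitor.get("searches", [])):
        insights.add("dry_skin_concern")
    if visitor.get("city_tier", 1) >= 2:
        insights.add("cod_likely")
    return insights


# Hypothetical insight -> action mapping.
ACTIONS = {
    "cart_abandoner": "send_whatsapp_cart_reminder",
    "dry_skin_concern": "recommend_dry_skin_products",
    "cod_likely": "show_cod_prominently",
    "high_intent": "offer_bundle",
}


def plan_actions(insights: set) -> list:
    """Map insights to actions, sorted for a stable order."""
    return sorted(ACTIONS[name] for name in insights if name in ACTIONS)


# Visitor V-12345 from the example above.
visitor = {
    "view_seconds": {"serum": 180, "moisturizer": 120, "cleanser": 60},
    "searches": ["dry skin routine"],
    "cart": ["serum"],
    "exit_page": "checkout",
    "city_tier": 2,
}
insights = derive_insights(visitor)
actions = plan_actions(insights)
```

Keeping insights separate from actions lets the same insight drive different channels (WhatsApp, email, on-site) without repeating the detection logic.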
Delhi Fashion Brand:
Data: Visitor from tier 2, searched "cotton kurta", viewed 6 products, filtered by ₹1,000-2,000, added nothing to cart.
Insights:
- Research mode (many views, no cart)
- Budget-conscious (filter signal)
- Preference clear (cotton, kurta, tier 2)
- Need nudge (browsing not buying)
Actions:
- Exit intent: "Your cotton kurta favorites" + 5% off first order
- Email (2 hours): "Still deciding? Here's a guide to choosing your kurta"
- Retargeting: Show exact products viewed
- On return: "Welcome back! Free shipping on your favorites"
Result:
- Exit intent: 12% conversion
- Email: 8% conversion
- Return visit: 23% conversion
- Total: 43% of browsers converted (vs 1.2% before)
Part 7: Data Quality Over Quantity
The Mumbai Brand Case Study
Phase 1: Quantity Approach (Failed)
Collected:
- 247 data points per visitor
- Every mouse movement
- Screenshot every 10s
- Cross-site tracking
- Third-party data enrichment
Problems:
- Site slow (heavy tracking scripts)
- Privacy concerns (customers complained)
- Analysis paralysis (too much data)
- Low signal/noise ratio (97% useless data)
- Compliance risks (over-collection)
Result:
- 4.2s load time (was 1.8s)
- 34% bounce rate increase
- 12 privacy complaints
- Personalization: Mediocre (lost in data)
- Conversion: 1.6% (down from 2.1%)
Phase 2: Quality Approach (Success)
Collected:
- 18 essential data points
- Behavioral signals only
- Zero surveillance tracking
- First-party data only
- Privacy-respecting
Benefits:
- Site fast (1.9s load time)
- Zero privacy concerns
- Clear insights (signal/noise = 94%)
- Compliance easy
- Actionable data
Result:
- 1.9s load time (down from 4.2s)
- Bounce rate normalized
- Zero complaints
- Personalization: Excellent (clear patterns)
- Conversion: 3.2% (+100% from Phase 1)
Key Learning: 18 quality points > 247 quantity points
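The percentage uplifts quoted in this guide are relative changes in conversion rate, not percentage-point differences. A short Python sketch of the arithmetic, using the figures from the text:

```python
def relative_lift(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before * 100


phase_lift = relative_lift(1.6, 3.2)   # Phase 1 -> Phase 2 in this case study
intro_lift = relative_lift(1.8, 3.4)   # Brand A vs Brand B in the introduction
```

Going from 1.6% to 3.2% is a gain of 1.6 percentage points but a 100% relative lift, and the 1.8% to 3.4% comparison in the introduction works out to roughly 89%.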
Part 8: The Implementation Checklist
Week 1: Audit Current Data Collection
- [ ] List all data currently collected
- [ ] Identify essential vs nice-to-have
- [ ] Check privacy policy accuracy
- [ ] Review consent mechanisms
- [ ] Assess data quality
- [ ] Measure site speed impact
Week 2: Design Data Strategy
- [ ] Define 18 essential data points
- [ ] Map to personalization goals
- [ ] Design collection methods
- [ ] Plan privacy protections
- [ ] Choose tools/platform
- [ ] Document data flow
Week 3: Implement Collection
- [ ] Install tracking SDK
- [ ] Configure essential events
- [ ] Set up server-side tracking
- [ ] Implement cookie consent
- [ ] Update privacy policy
- [ ] Test data accuracy
Week 4: Activate Personalization
- [ ] Connect data to personalization engine
- [ ] Define segments
- [ ] Create personalization rules
- [ ] Test experiences
- [ ] Measure impact
- [ ] Iterate based on results
Bangalore Brand Timeline:
- Week 1: Audited (had 89 data points, needed 18)
- Week 2: Streamlined to 18 essential
- Week 3: Implemented Troopod SDK
- Week 4: Activated personalization
Results:
- Site speed: 4.1s → 1.7s (cleaner tracking)
- Data quality: 67% → 96% (focused collection)
- Personalization: Generic → Advanced
- Conversion: 1.4% → 2.9% (+107%)
The Bottom Line
More data ≠ Better personalization
The 18 essential data points provide 90%+ personalization value:
- 8 behavioral points (what they do)
- 6 contextual points (who they are)
- 2 transactional points (purchase history)
- 2 engagement points (communication)
Quality beats quantity.
Mumbai Fashion tried 247 data points (failed). Streamlined to 18 essential points (succeeded).
Result: 1.6% → 3.2% conversion, ₹12.4L additional monthly.
The framework:
- Collect only what's needed (18 points)
- Respect privacy always (consent + transparency)
- Store securely (encryption + compliance)
- Turn into insights (pattern recognition)
- Apply personalization (relevant experiences)
- Measure impact (data-driven iteration)
You don't need to track everything. You need to track the right things.
Get expert help with personalization data. Book free audit with Troopod →
About Troopod:
Privacy-first personalization platform collecting only essential data for maximum impact. Our 18-point framework powers 90%+ personalization effectiveness while maintaining customer trust.
Built for Indian D2C with mobile-first, tier 2/3, and compliance-ready data collection.