
Instagram Engagement Analytics

OLS regression and caption text feature analysis identifying engagement drivers across 5,975 posts from 47 top Instagram creators.


Overview

This project analyzes what drives engagement across 5,975 Instagram posts from 47 top creators, using OLS regression and caption text feature extraction. The goal was to isolate the impact of content format, caption characteristics, and creator identity on engagement outcomes.
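
As a minimal sketch of what caption text feature extraction can look like, the function below turns one caption into a few numeric features. The project does not list the exact features it used, so the feature names here (`char_count`, `hashtag_count`, and so on) and the example caption are illustrative assumptions rather than the project's actual feature set.

```python
import re

# Hypothetical feature set: these features are illustrative assumptions,
# not the ones confirmed to be used in the project.
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def caption_features(caption):
    """Return simple numeric features describing one caption."""
    text = caption or ""  # treat a missing caption as an empty string
    return {
        "char_count": len(text),
        "word_count": len(text.split()),
        "hashtag_count": len(HASHTAG_RE.findall(text)),
        "mention_count": len(MENTION_RE.findall(text)),
        "has_question": int("?" in text),
    }


# Made-up caption, used only to show the output shape.
example_caption = "New drop this Friday! Who's in? #launch #ootd @brand"
example = caption_features(example_caption)
print(example)
```

With Pandas, a function like this could be applied to a caption column (assumed here to be named `caption`) via `df["caption"].apply(caption_features)`, and the resulting features used as regressors.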

Key Insights

  • Album posts outperform single photos by 15.2% in average engagement rate, making format choice the single most impactful content decision a creator can make
  • Creator identity accounts for roughly 45% of the variation in engagement: adding creator fixed effects raises R² from 0.163 to 0.618, a gain of 0.455
  • 🏆 Milestone Achiever Award & Finalist — Fordham Gabelli Marketing Analytics Competition 2026
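
The comparison of R² with and without creator fixed effects can be sketched as follows. This is an illustration on synthetic data, not the project's code: the variable names, the number of creators, and every coefficient are assumptions, and NumPy least squares stands in for whatever OLS routine the project actually used. Fixed effects are modeled as one dummy variable per creator.

```python
import numpy as np


def r_squared(X, y):
    """Fit OLS by least squares and return the in-sample R²."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic data only: none of these numbers come from the project.
rng = np.random.default_rng(0)
n_creators, posts_per_creator = 10, 50
creator = np.repeat(np.arange(n_creators), posts_per_creator)
is_album = rng.integers(0, 2, size=creator.size)
creator_effect = rng.normal(0.0, 1.0, size=n_creators)
noise = rng.normal(0.0, 0.5, size=creator.size)
y = 0.15 * is_album + creator_effect[creator] + noise

# Baseline model: intercept plus a post-format indicator.
intercept = np.ones(creator.size)
X_base = np.column_stack([intercept, is_album])

# Creator fixed effects: one dummy per creator, dropping the first
# creator to avoid collinearity with the intercept.
dummies = np.eye(n_creators)[creator][:, 1:]
X_fe = np.column_stack([X_base, dummies])

r2_base = r_squared(X_base, y)
r2_fe = r_squared(X_fe, y)
print(f"R² without fixed effects: {r2_base:.3f}")
print(f"R² with fixed effects:    {r2_fe:.3f}")
print(f"Gain from creator dummies: {r2_fe - r2_base:.3f}")
```

Because the baseline model is nested inside the fixed-effects model, in-sample R² cannot decrease when the dummies are added; the size of the gain indicates how much variation is tied to creator identity.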

Tools Used

Python (Pandas), OLS Regression, Text Analysis

Business Takeaway

Carousel (album) format is the highest-leverage content decision available to creators, outperforming single photos by 15.2%. Adding creator fixed effects raises R² from 0.163 to 0.618, which suggests that creator-level factors such as brand equity and audience trust are the dominant drivers of engagement. Sustained audience-building therefore matters far more than any individual post tactic.

Project Screenshots

  • Post Format vs. Average Engagement Rate: bar chart showing average engagement rate by post format; albums lead single photos by 15.2%
  • Engagement Rate by Day of Week: line chart showing engagement rate by day of week
  • Genre × Format Interaction: grouped bar chart showing genre and post format interaction effects on engagement