📹 Watch the Complete Video Tutorial
📺 Title: Master Data Analysis with ChatGPT (in just 12 minutes)
⏱️ Duration: 714 seconds (about 12 minutes)
👤 Channel: Jeff Su
🎯 Topic: Master Data Analysis
💡 This comprehensive article is based on the tutorial above. Watch the video for visual demonstrations and detailed explanations.
In today’s data-driven world, everyone works with data—whether you’re in marketing, sales, product, or management. Yet, most professionals never received formal training in data analysis. The result? Hours wasted trying to make sense of spreadsheets, missed insights, and analysis that misses the mark.
But what if you could turn ChatGPT into your personal data analyst—with zero technical skills required?
In this comprehensive guide, we’ll walk you through the exact three-step DIG framework (Description, Introspection, Goal Setting) taught in a top-rated Coursera course on AI for data analysis. You’ll learn how to use ChatGPT to understand unfamiliar datasets in minutes, uncover hidden insights, merge datasets, and deliver business-relevant recommendations—even if you’ve never written a line of code.
Every tip, prompt, example, and insight below comes directly from a real-world demonstration using an Apple TV+ dataset featuring shows like Avatar: The Last Airbender, The Godfather, and Sherlock. Let’s dive in.
Why Traditional Data Analysis Fails Non-Experts
Most professionals—management consultants, account managers, product marketers—regularly work with data but lack structured training. This leads to:
- Spending hours trying to interpret raw spreadsheets
- Missing critical insights due to unfamiliarity with the data
- Creating technically correct but irrelevant analyses
The solution isn’t learning Python or SQL. It’s giving AI a proven framework to follow—so it can do the heavy lifting for you.
Introducing the DIG Framework for Data Analysis
The DIG framework—Description, Introspection, and Goal Setting—is a simplified version of the industry-standard Exploratory Data Analysis (EDA). The name “DIG” was chosen for its memorability (as taught in the Coursera course), but the principles align with professional data science practices.
By applying DIG through ChatGPT, you can:
- Understand any dataset in minutes—even with zero context
- Extract insights non-analysts would miss
Imagine receiving a spreadsheet from a colleague who “rage quit” (we’ll call him Tim Cookie). With no documentation or explanation, your understanding starts at 0%. But with each DIG prompt, your comprehension grows—until you uncover actionable insights that might have taken hours to find manually.
Step 1: Description – Understand What’s in Your Data
The goal of the Description phase is to rapidly grasp the structure, content, and quality of your dataset. This prevents wasted effort on flawed or incomplete data.
Prompt 1: List All Columns + Show One Sample per Column
Prompt: “List all the columns in the attached spreadsheet and show me a sample of data from each column.”
Why this works:
- Forces ChatGPT to scan every column
- Provides a human-readable snapshot instead of overwhelming you with raw rows
Real example output: For a movie titled Forrest Gump, ChatGPT returned all 8 columns. But it also revealed potential issues:
- Release year listed as “994.0” (likely a typo for 1994)
- Genres separated by commas (e.g., “Drama, War”)
- Unclear meaning of “IMDb ID”
These red flags signal possible data quality issues that need addressing before deeper analysis.
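For readers who want to cross-check ChatGPT's answer themselves, the same "columns plus one sample" view can be sketched in a few lines of Python. This is a minimal sketch, not the method used in the video: the CSV text, the column names (`title`, `releaseYear`, `imdbId`, and so on), and the ID values are illustrative assumptions standing in for the real spreadsheet.

```python
import csv
import io

# Hypothetical stand-in for the spreadsheet: column names and values are
# illustrative assumptions, not the real dataset.
SAMPLE_CSV = """title,type,genres,releaseYear,imdbId,availableCountries
Forrest Gump,movie,"Drama, Romance",994.0,tt-demo-1,
Sherlock,tv,"Crime, Drama, Mystery",2010.0,tt-demo-2,"GB, US"
"""


def first_sample_per_column(csv_text):
    """Return {column name: first non-empty value seen in that column}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = {name: None for name in reader.fieldnames}
    for row in reader:
        for name in samples:
            value = (row.get(name) or "").strip()
            if samples[name] is None and value:
                samples[name] = value
    return samples


samples = first_sample_per_column(SAMPLE_CSV)
for name, value in samples.items():
    print(f"{name}: {value!r}")
```

Skipping empty cells matters here: the first row has no value for `availableCountries`, so the sample for that column comes from the second row instead.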
Prompt 2: Take Five Random Samples per Column
Prompt: “Take five more random samples of the data for each column to make sure you understand the format and type of information in each column.”
Why this matters: A single sample might be an outlier. Multiple samples show how values vary within a column, for example:
- Some entries labeled “TV,” others “Movie” under the “Type” column
- Varying numbers of genres (1, 2, or even 3 per title)
- “Available Countries” ranging from one country to multiple
This builds a more accurate mental model of your data’s structure and variability.
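The random-sampling step can also be reproduced locally as a sanity check. The sketch below assumes the spreadsheet has already been loaded as a list of dictionaries; the rows, column names, and the `random_samples` helper are hypothetical.

```python
import random

# Hypothetical rows standing in for the spreadsheet; values are assumptions.
rows = [
    {
        "title": f"Title {i}",
        "type": "movie" if i % 3 else "tv",
        "genres": ["Drama", "Crime, Drama", "Comedy, Drama, Romance"][i % 3],
    }
    for i in range(20)
]


def random_samples(rows, column, k=5, seed=None):
    """Pick up to k random non-empty values from one column."""
    values = [row[column] for row in rows if row[column].strip()]
    rng = random.Random(seed)  # pass a seed to make the draw repeatable
    return rng.sample(values, min(k, len(values)))


for column in rows[0]:
    print(column, "->", random_samples(rows, column, seed=42))
```

Passing a fixed `seed` makes the draw repeatable, which helps when comparing your samples with a colleague's.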
Prompt 3: Run a Data Quality Check
Prompt: “Run a data quality check on each column. Specifically look for missing or empty values, unexpected formats or data types, outliers or suspicious values.”
Key findings from the Apple TV+ dataset:
| Column | Missing Values | Percentage Missing | Action Required |
|---|---|---|---|
| Title | 589 | 3.1% | Minor issue |
| Available Countries | — | 99.7% | Do NOT use for geographic analysis |
Checking the raw data confirmed that most rows had empty values for “Available Countries.” This insight alone saves hours of futile analysis.
Pro Insight: ChatGPT doesn’t do 100% of the work, but it dramatically reduces the cognitive load on the human analyst. Ask about any ambiguous field (e.g., “What is an IMDb ID?”); ChatGPT can confirm that it is a unique identifier for movies and shows.
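A rough version of this quality check, useful for confirming the percentages ChatGPT reports, might look like the following. It is a sketch under stated assumptions: the rows, the column names, and the plausible year range (1900 to 2030) are all hypothetical choices, not values from the video.

```python
# Hypothetical rows; column names and values are illustrative assumptions.
rows = [
    {"title": "Forrest Gump", "releaseYear": "994.0", "availableCountries": ""},
    {"title": "Sherlock", "releaseYear": "2010.0", "availableCountries": "GB, US"},
    {"title": "", "releaseYear": "", "availableCountries": ""},
    {"title": "Demo Show", "releaseYear": "n/a", "availableCountries": ""},
]


def quality_report(rows, year_column="releaseYear", min_year=1900, max_year=2030):
    """Count missing values per column and flag suspicious release years."""
    total = len(rows)
    report = {}
    for column in rows[0]:
        missing = sum(1 for row in rows if not (row[column] or "").strip())
        report[column] = {
            "missing": missing,
            "pct_missing": round(100 * missing / total, 1),
        }

    suspicious = []
    for row in rows:
        raw = (row[year_column] or "").strip()
        if not raw:
            continue  # already counted as missing
        try:
            year = float(raw)
        except ValueError:
            suspicious.append(raw)  # unexpected format, e.g. text
            continue
        if not min_year <= year <= max_year:
            suspicious.append(raw)  # outlier, e.g. 994.0
    report[year_column]["suspicious"] = suspicious
    return report


report = quality_report(rows)
for column, stats in report.items():
    print(column, stats)
```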
Step 2: Introspection – Brainstorm What Questions the Data Can Answer
Now that you understand your data, it’s time to explore its analytical potential. The Introspection phase tests whether ChatGPT truly “gets” your dataset—and often reveals insights you hadn’t considered.
Prompt 1: Generate 10 Interesting Questions
Prompt: “Tell me 10 interesting questions we could answer with this data set and explain why each would be valuable.”
High-quality questions indicate strong data understanding. Examples from the Apple TV+ dataset:
- How has Apple TV’s yearly output grown since launch? → Indicates market expansion and content strategy success.
- What share of releases are movies vs. series each year? → Reveals shifts in viewer preferences or platform focus.
- Which genres dominate the catalog and how have they shifted over time? → Critical for content investment decisions (e.g., double down on popular genres or avoid oversaturated ones).
If ChatGPT asks irrelevant or impossible questions, it signals a misunderstanding—go back and clarify your data.
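To illustrate how one of these questions maps onto the data, the movie versus series share per year can be computed with a simple count. This is a sketch only: the rows, the `type` and `releaseYear` column names, and the `type_share_by_year` helper are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rows; values are illustrative assumptions.
rows = [
    {"releaseYear": 2022, "type": "movie"},
    {"releaseYear": 2022, "type": "tv"},
    {"releaseYear": 2022, "type": "tv"},
    {"releaseYear": 2022, "type": "tv"},
    {"releaseYear": 2023, "type": "movie"},
    {"releaseYear": 2023, "type": "movie"},
]


def type_share_by_year(rows):
    """Return {year: {type: share of that year's releases}}, sorted by year."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["releaseYear"]][row["type"]] += 1
    return {
        year: {kind: n / sum(per_type.values()) for kind, n in per_type.items()}
        for year, per_type in sorted(counts.items())
    }


shares = type_share_by_year(rows)
for year, share in shares.items():
    print(year, share)
```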
Prompt 2: Validate Data Sufficiency for Top Questions
Prompt: “For the first three questions, tell me exactly which columns you need to use and whether the current data is sufficient to answer it.”
This forces ChatGPT to “show its work” and assess feasibility:
| Question | Required Columns | Data Sufficient? | Action Needed |
|---|---|---|---|
| Yearly output growth | Release Year, Type | Yes | Fix 0.3% non-numeric entries |
| Movies vs. series share | Type, Release Year | Yes | Light data cleanup |
| Genre dominance over time | Genre, Release Year | Yes | None |
All three questions were feasible—enabling immediate next-step analysis.
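The same feasibility check can be approximated in code: for each question, confirm the required columns exist and see what share of rows has all of them filled in. The question names below follow the table above, but the column names, the rows, and the `sufficiency` helper are hypothetical.

```python
# Hypothetical rows and column names, for illustration only.
rows = [
    {"type": "movie", "releaseYear": "1994", "genres": "Drama"},
    {"type": "tv", "releaseYear": "2010", "genres": "Crime, Drama"},
    {"type": "tv", "releaseYear": "", "genres": "Comedy"},
    {"type": "movie", "releaseYear": "2021", "genres": ""},
]

questions = {
    "Yearly output growth": ["releaseYear", "type"],
    "Genre dominance over time": ["genres", "releaseYear"],
    "Most watched genre": ["genres", "totalViewership"],  # column not in the data
}


def sufficiency(rows, required_columns):
    """Check that the required columns exist and how many rows are usable."""
    header = set(rows[0]) if rows else set()
    missing_columns = [c for c in required_columns if c not in header]
    if missing_columns:
        return {"sufficient": False, "missing_columns": missing_columns,
                "usable_share": 0.0}
    usable = sum(
        1 for row in rows
        if all((row[c] or "").strip() for c in required_columns)
    )
    return {"sufficient": usable > 0, "missing_columns": [],
            "usable_share": usable / len(rows)}


results = {name: sufficiency(rows, cols) for name, cols in questions.items()}
for name, result in results.items():
    print(name, "->", result)
```

A question whose required column is absent altogether (here, the made-up `totalViewership`) comes back as not answerable, which is the kind of gap the next prompt is designed to surface.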
Prompt 3: Identify Unanswerable Questions Due to Missing Data
Prompt: “What questions do you think someone would want to ask about this data but we can’t answer due to missing information?”
This surfaces critical data gaps and manages stakeholder expectations:
- “What’s the most watched genre?” → Missing viewership metrics
- “Which genres deliver the best cost per hour of content?” → Missing production budget, revenue, or cost fields
Knowing what you can’t answer is as important as knowing what you can.
Merging Datasets Using a Common Key (IMDb ID)
What if you could access viewership and cost data? In a hypothetical (and clearly fictional!) scenario, the speaker created a second CSV with:
- Column A: IMDb ID
- Column B: Total Viewership
- Column C: Total Production Cost
Prompt to merge datasets: “I just received this data set from a colleague. Your task is to explore and explain the relationships between this new data set with the original one and how they might be used to join the data together.”
ChatGPT’s response:
- Confirmed IMDb ID is the common key for joining
- Provided instructions to merge datasets
- Generated a sample merged row for Forrest Gump showing title, genre, release year, viewership, and cost in one row
- Offered a downloadable merged CSV
This unlocks advanced analyses like cost per viewer by genre—turning descriptive data into strategic insights.
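For readers curious what joining on a common key means mechanically, here is a minimal left join on the ID column in plain Python. It is a sketch, not ChatGPT's actual output: the rows, the column names (`imdbId`, `totalViewership`, `totalProductionCost`), and the placeholder IDs are all assumptions.

```python
# Hypothetical catalog and metrics tables; all values are made up.
catalog = [
    {"imdbId": "tt-demo-1", "title": "Forrest Gump", "genres": "Drama, Romance"},
    {"imdbId": "tt-demo-2", "title": "Sherlock", "genres": "Crime, Drama"},
    {"imdbId": "tt-demo-3", "title": "Demo Show", "genres": "Comedy"},
]
metrics = [
    {"imdbId": "tt-demo-1", "totalViewership": 1200, "totalProductionCost": 300},
    {"imdbId": "tt-demo-2", "totalViewership": 800, "totalProductionCost": 500},
]


def left_join(left_rows, right_rows, key):
    """Attach right-hand columns to every left-hand row that shares the key.

    Left rows without a match are kept, with None in the added columns,
    and their keys are returned separately so gaps stay visible.
    """
    lookup = {row[key]: row for row in right_rows}
    extra_columns = [c for c in right_rows[0] if c != key] if right_rows else []
    merged, unmatched = [], []
    for row in left_rows:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])
        extras = {c: (match[c] if match else None) for c in extra_columns}
        merged.append({**row, **extras})
    return merged, unmatched


merged, unmatched = left_join(catalog, metrics, "imdbId")
print(merged[0])
print("No metrics for:", unmatched)
```

Keeping unmatched rows rather than dropping them is a deliberate choice: it shows how many titles lack viewership or cost data before any averages are computed.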
Important Note: The speaker emphasized this second dataset was “made up” and joked, “Don’t report me to Apple.” Always ensure data privacy and compliance in real-world scenarios.
Step 3: Goal Setting – Align Analysis with Business Objectives
Without clear goals, even perfect analysis can be useless. Imagine spending days building 20 slides—only for your manager to say, “I just wanted to know if we should discontinue Product X.”
The Goal Setting phase ensures your analysis delivers what stakeholders actually need.
Use a Mission-Briefing Prompt
Prompt: “My goal is to understand what content Apple TV should invest in next. Given this goal, which aspects of the data should we focus on?”
ChatGPT responded with a tailored roadmap based on role-specific priorities:
| Team | Focus Areas | Recommended Actions |
|---|---|---|
| Content Team | Viewership, audience demand, content supply | Analyze genre popularity, trend velocity |
| Finance Team | Unit economics, ROI | Calculate cost per viewer, genre profitability |
Sample Step-by-Step Roadmap from ChatGPT
- Clean the data (handle missing values, format inconsistencies)
- Build a genre scorecard (rank genres by performance)
- Rank investment opportunities
- Layer in trend velocity (how fast is a genre growing?)
- Stress-test with outliers (ensure robustness)
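The "genre scorecard" step in the roadmap above could be sketched as follows, ranking genres by cost per viewer. The rows, column names, and the `genre_scorecard` helper are hypothetical, and the choice to count a multi-genre title fully in each of its genres is an assumption that a real analysis would need to decide deliberately.

```python
from collections import defaultdict

# Hypothetical merged rows; all numbers are made up for illustration.
rows = [
    {"genres": "Crime, Drama", "totalViewership": 1000, "totalProductionCost": 500},
    {"genres": "Drama", "totalViewership": 500, "totalProductionCost": 1000},
    {"genres": "Comedy", "totalViewership": 2000, "totalProductionCost": 400},
]


def genre_scorecard(rows):
    """Aggregate views and cost per genre, ranked by cost per viewer (lowest first).

    Assumption: a title listed under several genres counts fully in each one.
    """
    totals = defaultdict(lambda: {"titles": 0, "views": 0, "cost": 0})
    for row in rows:
        for genre in (g.strip() for g in row["genres"].split(",")):
            if not genre:
                continue
            totals[genre]["titles"] += 1
            totals[genre]["views"] += row["totalViewership"]
            totals[genre]["cost"] += row["totalProductionCost"]
    scorecard = [
        {"genre": genre, **t, "cost_per_viewer": t["cost"] / t["views"]}
        for genre, t in totals.items()
        if t["views"] > 0  # avoid dividing by zero
    ]
    return sorted(scorecard, key=lambda entry: entry["cost_per_viewer"])


scorecard = genre_scorecard(rows)
for entry in scorecard:
    print(entry)
```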
Example Insight Generated
Following this process, ChatGPT surfaced a powerful insight:
“True crime series deliver three times the median views of all series. They cost 18% less per finished hour and have climbed from 4% to 9% share of total watch time in the last 3 years.”
This single insight could justify a major content investment shift.
Pro Tip: Anticipate Stakeholder Questions Before Presenting
Before any presentation, ask ChatGPT:
“What are the key questions someone reading my analysis would ask, and how should we proactively address them?”
The speaker says this prompt has “single-handedly saved my ass multiple times” by surfacing tough questions ahead of time, such as:
- “What about seasonality effects?”
- “Is this trend statistically significant?”
- “How does this compare to competitors?”
It transforms you from a data reporter into a strategic advisor.
Why the DIG Framework Levels the Playing Field
The DIG framework + ChatGPT empowers non-technical professionals by providing:
- A repeatable, structured process for any dataset
- Rapid comprehension of unfamiliar data
- Actionable business insights without coding
- Confidence in data quality before investing analysis time
You no longer need to be a data scientist to deliver high-impact analysis.
Tools and Resources Mentioned
- ChatGPT (use the latest reasoning model for best results)
- Free Apple TV+ dataset (includes titles like Avatar: The Last Airbender, The Godfather, Sherlock)
- Coursera course: “AI for Data Analysis” (covers DIG framework, hallucination mitigation, debugging)
- The speaker’s newsletter (weekly productivity tips for Google Workspace users)
Advanced Considerations from the Full Coursera Course
While this guide covers the essentials, the full Coursera course dives deeper into:
- Mitigating AI hallucinations in data analysis
- Debugging weird data errors flagged by ChatGPT
- Best practices for prompt engineering in analytical contexts
- Ethical considerations when using AI with sensitive data
A 40% discount for 3 months of Coursera Plus was offered via a special link (mentioned in the video description).
Real-World Application: Don’t Wait—Branch Out Early
In practice, don’t rigidly follow all DIG steps before acting. As soon as ChatGPT mentions “genre popularity,” immediately ask:
“Analyze genre popularity over time and show the top 5 growing genres.”
Then, based on those results, ask follow-ups like:
“For the top growing genre, what’s the average production cost vs. viewership?”
This iterative digging accelerates insight discovery.
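One way to sanity-check a "top growing genres" answer is to count titles per genre in two years and compare. The sketch below defines growth as the change in title count between a start year and an end year, which is only one possible definition and an assumption here; the rows and column names are also hypothetical.

```python
from collections import Counter

# Hypothetical rows; values are illustrative assumptions.
rows = [
    {"releaseYear": 2020, "genres": "Drama"},
    {"releaseYear": 2020, "genres": "Crime"},
    {"releaseYear": 2021, "genres": "Comedy"},
    {"releaseYear": 2023, "genres": "Crime"},
    {"releaseYear": 2023, "genres": "Crime"},
    {"releaseYear": 2023, "genres": "Crime, Drama"},
    {"releaseYear": 2023, "genres": "Comedy"},
]


def genre_counts(rows, year):
    """Count how many titles released in `year` list each genre."""
    counts = Counter()
    for row in rows:
        if row["releaseYear"] == year:
            counts.update(g.strip() for g in row["genres"].split(",") if g.strip())
    return counts


def top_growing_genres(rows, start_year, end_year, n=5):
    """Rank genres by the change in title count between two years."""
    start = genre_counts(rows, start_year)
    end = genre_counts(rows, end_year)
    growth = {genre: end[genre] - start[genre] for genre in set(start) | set(end)}
    # Largest growth first; ties broken alphabetically for stable output.
    return sorted(growth.items(), key=lambda item: (-item[1], item[0]))[:n]


top = top_growing_genres(rows, 2020, 2023)
print(top)
```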
Common Pitfalls to Avoid
- Skipping data quality checks → Leads to flawed conclusions
- Not validating ChatGPT’s understanding → Results in irrelevant questions
- Ignoring missing data limitations → Wastes time on impossible analyses
- Analyzing without a goal → Produces technically correct but useless reports
Performance Metrics That Matter
The DIG framework helps you track progress through:
- % Understanding of dataset (from 0% to actionable insight)
- Time saved (minutes vs. hours for initial exploration)
- Insight relevance (aligned with business goals)
- Data gap awareness (managing expectations upfront)
Future-Proofing Your Data Skills
As AI evolves, the ability to frame problems, validate outputs, and connect insights to business outcomes will become more valuable than coding skills. The DIG framework builds these muscles.
Start small: apply it to your next spreadsheet. Over time, you’ll develop intuition for data storytelling, stakeholder management, and strategic recommendation.
Next Steps After Mastering DIG
Once comfortable with the DIG framework, explore:
- Advanced ChatGPT prompting for data (e.g., “Create a Python script to visualize this” if you want to automate)
- Connecting ChatGPT to live data sources (via APIs or Google Sheets plugins)
- The speaker’s “ChatGPT Pro Tips” video (recommended for next viewing)
Final Summary: Your Action Plan to Master Data Analysis
1. Description: Use 3 prompts to understand structure, samples, and quality.
2. Introspection: Generate questions, validate feasibility, identify gaps.
3. Goal Setting: Align analysis with business objectives using mission-briefing prompts.
4. Anticipate: Always ask, “What questions will stakeholders ask?”
5. Iterate: Branch out early—don’t wait to finish all steps.
The DIG framework isn’t just a method—it’s your shortcut to becoming a confident, strategic data-driven professional, regardless of your technical background.
Key Takeaways
- You don’t need coding skills to perform high-value data analysis
- ChatGPT + DIG = your personal data analyst
- Always start with data quality checks to avoid wasted effort
- Merge datasets using common keys (like IMDb ID) to unlock deeper insights
- Goal alignment is the difference between useful and useless analysis
Ready to transform your next spreadsheet from confusing to crystal clear? Apply the DIG framework today—and master data analysis on your own terms.

