4–6 hours · Intermediate

Analyze a Real Public Dataset

Maps to: Data Analyst · Quantitative Analyst, Marketing Analyst, Journalist, Researcher, Consultant

You're going to take a real question you care about, find a public dataset that can answer it, and publish a piece that argues a finding with real charts. The skill is analytical judgment: deciding what the data actually supports versus what you hoped to find, and not over-claiming when it's close. That call about what's true and what isn't is the real work of a data analyst, and doing one full analysis tells you fast whether digging for an honest answer is your kind of work.

The plan

  1. Pick a question you actually care about, then find a real, accessible dataset that can answer it. Confirm the data exists and you can get it before you fall in love with the question. The question is the whole project.

    Objective: A question + a confirmed real dataset that can answer it.

    1. Pick your question: something you care about, an 'is X actually true' claim, a local-data question, or a trend you suspect.

    2. Find a dataset that can actually answer it, and confirm it's real and downloadable.

      Tool: data.gov

    Your call

    Do this part yourself: choose the question you care about and find a dataset that can actually answer it.

    The question, and whether a dataset can actually answer it.

    What good looks like: You have a question you care about and you've confirmed a real dataset can actually answer it, before falling in love with the question.

    • Confirm the data exists FIRST. A great question with no dataset is a dead end.
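One way to make "confirm the data exists" concrete, once you have downloaded a file, is a quick script that checks whether the columns your question needs are present and filled in. The Python sketch below is only an illustration: the function name `check_dataset`, the `min_rows` threshold of 30, the column names, and the sample rows are all assumptions, not part of data.gov or any other tool named here.

```python
import csv
import io


def check_dataset(csv_text, required_columns, min_rows=30):
    """Return a list of problems; an empty list means the file passes this basic check."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in required_columns if name not in header]

    rows = list(reader)
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows, expected at least {min_rows}")

    # Count blank cells in each required column that does exist.
    for name in required_columns:
        if name in header:
            blanks = sum(1 for row in rows if not (row[name] or "").strip())
            if blanks:
                problems.append(f"{blanks} blank values in column: {name}")
    return problems


# Made-up placeholder rows, only to show the call; use your downloaded file instead.
sample = "year,county,rate\n2020,A,4.1\n2021,A,\n2022,A,3.8\n"
print(check_dataset(sample, ["year", "rate", "population"], min_rows=3))
```

For a real download you would pass the file's text in place of `sample`, for example by reading it with `pathlib.Path(...).read_text()`. A non-empty result is a signal to rethink the question or keep looking for data before investing more time.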

The bar to look back against

A published piece of 1,000 to 1,500 words with 2+ real charts that argues a verified finding from a real dataset. You can say what the data actually supports versus what you hoped, and what you got wrong. The judgment is the work: not 'I made charts,' but 'I asked a real question and didn't over-claim the answer.'
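One possible way to test whether a gap between two groups is large enough to claim is a bootstrap interval for the difference in their averages. The sketch below assumes Python 3.8 or later and uses only the standard library. The function names, the 95% level, the 2,000 resamples, and the sample numbers are assumptions chosen for illustration, and a different method may suit your data better.

```python
import random
import statistics


def bootstrap_diff_interval(group_a, group_b, n_resamples=2000, seed=0):
    """Approximate 95% percentile-bootstrap interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    low = diffs[n_resamples * 25 // 1000]
    high = diffs[n_resamples * 975 // 1000 - 1]
    return low, high


def supports_difference(group_a, group_b):
    """True only when the interval excludes zero; otherwise treat it as too close to call."""
    low, high = bootstrap_diff_interval(group_a, group_b)
    return low > 0 or high < 0


# Made-up placeholder numbers, not real measurements.
before = [10, 11, 12, 13, 14]
after = [1, 2, 3, 4, 5]
print(bootstrap_diff_interval(before, after))
print(supports_difference(before, after))
```

If the interval includes zero, the honest wording is that the data does not show a clear difference, even when the two averages look different on a chart.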

Tools you'll use

Step 1 · Pick a question + find a real dataset

data.gov

US government open datasets.

Best for: Finding a real, accessible dataset.

Huge library of public datasets.

Best for: Another source of real datasets.

FRED · Free

Federal Reserve economic data.

Best for: Economics/finance questions.

Curated global datasets.

Best for: Global/social questions.

Step 2 · AI writes the analysis + you verify

AI that writes + runs Python/SQL to clean and analyze data.

Best for: Doing the analysis (you verify every finding).
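Verifying a finding can be as simple as recomputing a headline number yourself from the raw rows and comparing it with what the AI reported. The sketch below shows one way to do that in Python. The function names, the 1% tolerance, the column name, and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io
import math


def recompute_mean(csv_text, column):
    """Recompute a column mean straight from the raw rows, skipping blank cells."""
    values = [
        float(row[column])
        for row in csv.DictReader(io.StringIO(csv_text))
        if (row[column] or "").strip()
    ]
    return sum(values) / len(values), len(values)


def matches_claim(claimed, recomputed, rel_tol=0.01):
    """True when the claimed figure is within rel_tol (1% by default) of the recomputed one."""
    return math.isclose(claimed, recomputed, rel_tol=rel_tol)


# Made-up placeholder rows with one blank cell, not real data.
sample = "year,rate\n2020,4.0\n2021,\n2022,5.0\n"
mean, count = recompute_mean(sample, "rate")
print(mean, count)
print(matches_claim(3.0, mean))  # 3.0 is what you would get if the blank were counted as zero
```

A mismatch like this one does not say which number is right. It tells you to look at how blanks, duplicates, or filters were handled before the figure goes into your piece.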

Steps 3–4 · Build charts + decide what the data actually supports

Publication-quality charts, free (attribution on free plan).

Best for: Building your 2–3 charts.

Notebook for interactive data viz.

Best for: Interactive charts if you want to go deeper.

How this shows up on a resume or college app

I analyzed [dataset] to answer [question] and published a piece with charts that argued [finding], while being honest about what the data did and didn't support. I learned that the hard part of data analysis is not running the code but asking the right question and not over-claiming the answer.