The Importance of Thinking
…and many examples along the road
What others know about the data
What we can find from the data
To compare and verify outcomes of previous sources
Which data preprocessing is needed?
Which modelling techniques will be suitable?
(number of observations, number of potential features, missing values (NAs), data types, relationships between the target and the variables or among the variables themselves…)
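The checks listed above can be collected in a single reproducible step. The following is a minimal sketch, assuming the data fits in a pandas DataFrame and the target is a numeric column. The function name `first_look`, the toy `loans` table, and its column names are hypothetical and only serve as illustration.

```python
import pandas as pd


def first_look(df: pd.DataFrame, target: str) -> dict:
    """Collect basic facts about a DataFrame in one reproducible step."""
    numeric = df.select_dtypes(include="number")
    summary = {
        "n_obs": len(df),
        "n_features": df.shape[1] - 1,  # every column except the target
        "na_counts": df.isna().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }
    # Linear correlation is only a first hint at relationships, and it
    # covers numeric columns only; categorical columns need other checks.
    if target in numeric.columns:
        summary["corr_with_target"] = (
            numeric.corr()[target].drop(target).to_dict()
        )
    return summary


# Hypothetical toy data, not taken from any real source.
loans = pd.DataFrame({
    "income": [30.0, 45.0, None, 80.0],
    "age": [25, 40, 31, 52],
    "region": ["N", "S", "S", None],
    "default": [1, 0, 0, 0],
})
report = first_look(loans, target="default")
print(report["n_obs"], report["n_features"], report["na_counts"])
```

Returning a dictionary instead of printing ad hoc keeps the result easy to store next to the notes, so the same summary can be regenerated whenever the data changes.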
Tip
Documenting your findings reproducibly while you explore is far better than generating random stats and plots and then trying to make sense of them retrospectively.
More in week 4
Warning
Be careful about security and GDPR.
Be skeptical: check outcomes and don’t rely on them blindly.
| time | state |
|---|---|
| 10:00:00 | scoring |
| 10:00:02 | manual check |
| 12:00:00 | scoring |
| time | state |
|---|---|
| 10:00:00 | scoring |
| 10:00:02 | waits for income verif docs |
| 10:15:00 | scoring |
| 10:15:02 | income verif failed |
| 10:15:03 | waits for manual income verif |
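One way to examine state logs like the two tables above is to measure how long a case stays in each state before the next event arrives. The following is a minimal sketch, assuming the events are already sorted, all fall on the same day, and use the `HH:MM:SS` format shown in the tables. The function name `time_in_state` is hypothetical.

```python
from datetime import datetime


def time_in_state(events):
    """Return (state, seconds until the next event) for all but the last event.

    Assumes events are sorted by time and fall on the same day.
    """
    parsed = [(datetime.strptime(t, "%H:%M:%S"), s) for t, s in events]
    return [
        (state, (next_ts - ts).total_seconds())
        for (ts, state), (next_ts, _) in zip(parsed, parsed[1:])
    ]


# Rows copied from the first table above.
log = [
    ("10:00:00", "scoring"),
    ("10:00:02", "manual check"),
    ("12:00:00", "scoring"),
]
durations = time_in_state(log)
for state, seconds in durations:
    print(f"{state}: {seconds:.0f} s")
```

For the first table this gives 2 seconds in `scoring` and 7198 seconds in `manual check`. The last event has no successor, so its duration stays unknown and is left out.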
Tip
From exploration perspective