{ "cells": [ { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "If you haven't done it already:\n", "\n", "* Install Python and pandas from [Anaconda](https://www.anaconda.com/products/individual)\n", "* Download the `titanic2.zip` data from the repository." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "pd.set_option(\"display.precision\", 2)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "The Challenge\n", "\n", "The sinking of the Titanic is one of the most infamous shipwrecks in history.\n", "\n", "On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg.\n", "Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.\n", "\n", "While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.\n", "\n", "In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (ie name, age, gender, socio-economic class, etc).\n", "\n", "(c) [Kaggle](https://www.kaggle.com/c/titanic/overview)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": [ "
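<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n",
"      <th></th>\n", "      <th>PassengerId</th>\n", "      <th>Survived</th>\n", "      <th>Pclass</th>\n", "      <th>Name</th>\n", "      <th>Sex</th>\n", "      <th>Age</th>\n", "      <th>SibSp</th>\n", "      <th>Parch</th>\n", "      <th>Ticket</th>\n", "      <th>Fare</th>\n", "      <th>Cabin</th>\n", "      <th>Embarked</th>\n",
"    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>0</th>\n", "      <td>1</td>\n", "      <td>0</td>\n", "      <td>3</td>\n", "      <td>Braund, Mr. Owen Harris</td>\n", "      <td>male</td>\n", "      <td>22.0</td>\n", "      <td>1</td>\n", "      <td>0</td>\n", "      <td>A/5 21171</td>\n", "      <td>7.25</td>\n", "      <td>NaN</td>\n", "      <td>S</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>2</td>\n", "      <td>1</td>\n", "      <td>1</td>\n", "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n", "      <td>female</td>\n", "      <td>38.0</td>\n", "      <td>1</td>\n", "      <td>0</td>\n", "      <td>PC 17599</td>\n", "      <td>71.28</td>\n", "      <td>C85</td>\n", "      <td>C</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>3</td>\n", "      <td>1</td>\n", "      <td>3</td>\n", "      <td>Heikkinen, Miss. Laina</td>\n", "      <td>female</td>\n", "      <td>26.0</td>\n", "      <td>0</td>\n", "      <td>0</td>\n", "      <td>STON/O2. 3101282</td>\n", "      <td>7.92</td>\n", "      <td>NaN</td>\n", "      <td>S</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>4</td>\n", "      <td>1</td>\n", "      <td>1</td>\n", "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n", "      <td>female</td>\n", "      <td>35.0</td>\n", "      <td>1</td>\n", "      <td>0</td>\n", "      <td>113803</td>\n", "      <td>53.10</td>\n", "      <td>C123</td>\n", "      <td>S</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>5</td>\n", "      <td>0</td>\n", "      <td>3</td>\n", "      <td>Allen, Mr. William Henry</td>\n", "      <td>male</td>\n", "      <td>35.0</td>\n", "      <td>0</td>\n", "      <td>0</td>\n", "      <td>373450</td>\n", "      <td>8.05</td>\n", "      <td>NaN</td>\n", "      <td>S</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>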
" ], "text/plain": [ " PassengerId Survived Pclass \\\n", "0 1 0 3 \n", "1 2 1 1 \n", "2 3 1 3 \n", "3 4 1 1 \n", "4 5 0 3 \n", "\n", " Name Sex Age SibSp \\\n", "0 Braund, Mr. Owen Harris male 22.0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n", "2 Heikkinen, Miss. Laina female 26.0 0 \n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n", "4 Allen, Mr. William Henry male 35.0 0 \n", "\n", " Parch Ticket Fare Cabin Embarked \n", "0 0 A/5 21171 7.25 NaN S \n", "1 0 PC 17599 71.28 C85 C \n", "2 0 STON/O2. 3101282 7.92 NaN S \n", "3 0 113803 53.10 C123 S \n", "4 0 373450 8.05 NaN S " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('titanic_train.csv')\n", "\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "1. How many rows has the dataframe?\n", "2. How many columns has the dataframe?\n", "3. What is the percentage of non-null values in the age column?\n", "4. How many text columns has the dataframe?\n", "5. How many men and women are in the dataset?\n", "6. What is the average age of men and women? On average who is younger?\n", "7. What is the percentage of passengers travelling in 3rd class cabins? (pclass variable)\n", "8. How much did the most expensive ticket cost? (fare variable)\n", "9. Calculate average age, proportion of females and average fare per pclass.\n", "10. Who is more likely to travel alone men or women? (consider a passenger as travelling alone if he/she has neither siblings/spouse not parents/children)\n", "11. What is the most popular lastname? firstname? 
(name column)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 1 }