10 minutes to pandas

 Pandas User Guide

This is a short introduction to pandas, geared mainly for new users. You can see more complex recipes in the Cookbook. Customarily, we import as follows: Object creation See the ...
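
The import statement referred to in the snippet did not survive extraction. A minimal sketch of the customary aliases, followed by two small objects, might look like this; the column names and values are illustrative assumptions, not taken from the guide.

```python
import numpy as np
import pandas as pd

# A Series built from a list; np.nan marks a missing value.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

# A DataFrame built from a dict of equal-length columns
# (column names "a" and "b" are made up for this sketch).
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

print(s)
print(df)
```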

Read more at Pandas User Guide

IO tools (text, CSV, HDF5, …)

 Pandas User Guide

The pandas I/O API is a set of top-level reader functions accessed like pandas.read_csv() that generally return a pandas object. The corresponding writer functions are ob...
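
As a rough illustration of the reader/writer pairing, the sketch below reads CSV text with pandas.read_csv() and writes it back with DataFrame.to_csv(). The in-memory buffer and the sample data are assumptions chosen so the example needs no files on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content held in memory instead of a file.
csv_text = "name,score\nada,90\ngrace,95\n"

# Reader: a top-level function that returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Writer: a method on the object itself.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Reading the written text back should reproduce the same frame.
round_tripped = pd.read_csv(io.StringIO(buffer.getvalue()))
print(round_tripped)
```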

Intro to data structures

 Pandas User Guide

We’ll start with a quick, non-comprehensive overview of the fundamental data structures in pandas to get you started. The fundamental behavior about data types, indexing, and ...

Essential basic functionality

 Pandas User Guide

Here we discuss a lot of the essential functionality common to the pandas data structures. To begin, let’s create some example objects like we did in the 10 minutes to pa...

Indexing and selecting data

 Pandas User Guide

The axis labeling information in pandas objects serves many purposes: Identifies data (i.e. provides metadata) using known indicators, important for analysis, visualizatio...

MultiIndex / advanced indexing

 Pandas User Guide

This section covers indexing with a MultiIndex and other advanced indexing features. See Indexing and Selecting Data for general indexing documentation. Warning: Whe...

Merge, join, concatenate and compare

 Pandas User Guide

pandas provides various facilities for easily combining Series or DataFrame objects with various kinds of set logic for the indexes and relational algebra functio...
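
A small sketch of two of these facilities, pandas.concat() for stacking and pandas.merge() for relational joins, is given below. The frames and the "key" column are hypothetical examples.

```python
import pandas as pd

# Two hypothetical frames that share a "key" column.
left = pd.DataFrame({"key": ["k0", "k1"], "a": [1, 2]})
right = pd.DataFrame({"key": ["k1", "k2"], "b": [3, 4]})

# Concatenate rows end to end, renumbering the index.
stacked = pd.concat([left, left], ignore_index=True)

# Inner join keeps only keys present in both frames.
inner = pd.merge(left, right, on="key", how="inner")

# Outer join keeps every key and fills the gaps with missing values.
outer = pd.merge(left, right, on="key", how="outer")

print(inner)
print(outer)
```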

Reshaping and pivot tables

 Pandas User Guide

Reshaping by pivoting DataFrame objects. Data is often stored in so-called “stacked” or “record” format: To select out everything for variable A we could do: But suppose we w...
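
The code that accompanied this passage is missing from the snippet. As a sketch under assumed data, a "stacked" frame can be filtered for one variable and then reshaped to a wide layout with DataFrame.pivot(); the column names and values here are invented for illustration.

```python
import pandas as pd

# Hypothetical data in "stacked" (record) format: one row per observation.
df = pd.DataFrame(
    {
        "date": ["d1", "d1", "d2", "d2"],
        "variable": ["A", "B", "A", "B"],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Select everything for variable A with a boolean mask.
only_a = df[df["variable"] == "A"]

# Pivot to one column per variable and one row per date.
wide = df.pivot(index="date", columns="variable", values="value")

print(only_a)
print(wide)
```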

Working with text data

 Pandas User Guide

Text data types (new in version 1.0.0): There are two ways to store text data in pandas: an object-dtype NumPy array, or the StringDtype extension type. We recommend using StringDtype to s...
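
To make the two storage options concrete, the sketch below builds the same values once with object dtype and once with the "string" alias for StringDtype. The sample values are assumptions.

```python
import pandas as pd

values = ["a", "b", None]

# Option 1: a generic object-dtype array that may hold any Python object.
s_obj = pd.Series(values, dtype="object")

# Option 2: the dedicated StringDtype extension type.
s_str = pd.Series(values, dtype="string")

# String methods are available through the .str accessor;
# the missing entry stays missing.
upper = s_str.str.upper()

print(s_obj.dtype, s_str.dtype)
print(upper)
```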

Frequently Asked Questions (FAQ)

 Pandas User Guide

DataFrame memory usage: The memory usage of a DataFrame (including the index) is shown when calling info(). A configuration option, display.memory_usage (see the l...
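
A brief sketch of inspecting memory usage follows. It uses DataFrame.info() for the summary and DataFrame.memory_usage() for per-column byte counts; the frame itself is a made-up example.

```python
import pandas as pd

# Hypothetical frame with one integer column and one string column.
df = pd.DataFrame({"n": range(1000), "label": ["x"] * 1000})

# Prints a summary that includes a memory usage line.
df.info()

# Per-column byte counts, including the index; deep=True also
# measures the Python string objects rather than just pointers.
per_column = df.memory_usage(deep=True)
total_bytes = per_column.sum()

print(per_column)
print(total_bytes)
```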

Cookbook

 Pandas User Guide

This is a repository for short and sweet examples and links for useful pandas recipes. We encourage users to add to this documentation. Adding interesting links and/or inline examples to this...

Working with missing data

 Pandas User Guide

In this section, we will discuss missing (also referred to as NA) values in pandas. Note: The choice of using NaN internally to denote missing data was largely for simplicity ...
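
As a minimal sketch of working with NaN as the missing-value marker, the example below detects, fills and drops missing entries. The values are illustrative only.

```python
import numpy as np
import pandas as pd

# A float Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

# Boolean mask marking the missing positions.
mask = s.isna()

# Replace missing values with a constant, or drop them entirely.
filled = s.fillna(0.0)
dropped = s.dropna()

print(mask)
print(filled)
print(dropped)
```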

Duplicate Labels

 Pandas User Guide

Index objects are not required to be unique; you can have duplicate row or column labels. This may be a bit confusing at first. If you’re familiar with SQL, you know that row labels a...
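
The following sketch shows one consequence of duplicate labels: selecting a repeated label returns several rows, while a unique label returns a scalar. The labels and values are assumptions.

```python
import pandas as pd

# The label "a" appears twice in the index.
s = pd.Series([0, 1, 2], index=["a", "a", "b"])

# Index.is_unique reports whether any label repeats.
unique = s.index.is_unique

# Index.duplicated() flags every repeat after the first occurrence.
dupes = s.index.duplicated()

# A duplicated label selects a Series; a unique label selects a scalar.
both_a = s.loc["a"]
only_b = s.loc["b"]

print(unique, dupes)
print(both_a)
print(only_b)
```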

Categorical data

 Pandas User Guide

This is an introduction to the pandas categorical data type, including a short comparison with R’s factor. Categoricals are a pandas data type corresponding to categorical variables in s...
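
A short sketch of the categorical dtype, using invented values: each distinct value becomes a category, and the data is stored as integer codes pointing into that category list.

```python
import pandas as pd

# Hypothetical values with two distinct levels.
s = pd.Series(["low", "high", "low"], dtype="category")

# The distinct categories, sorted lexically when inferred from the data.
categories = list(s.cat.categories)

# Integer codes giving each value's position in the category list.
codes = s.cat.codes.tolist()

print(categories)
print(codes)
```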

Nullable integer data type

 Pandas User Guide

Note: IntegerArray is currently experimental. Its API or implementation may change without warning. Changed in version 1.0.0: Now uses pandas.NA as the missing value rather t...
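
As an illustrative sketch, the example below contrasts the default behavior, where a missing value forces integers to float64, with the nullable "Int64" dtype, which keeps integers and marks the gap with pandas.NA. The data is made up.

```python
import pandas as pd

# Without an explicit dtype, a missing value turns the integers into floats.
as_float = pd.Series([1, 2, None])

# The nullable "Int64" dtype (capital I) keeps the values as integers.
arr = pd.array([1, 2, None], dtype="Int64")
s = pd.Series(arr)

# Reductions skip the missing value by default.
total = s.sum()

print(as_float.dtype, s.dtype)
print(s)
print(total)
```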

Nullable Boolean data type

 Pandas User Guide

Note: BooleanArray is currently experimental. Its API or implementation may change without warning. New in version 1.0.0. Indexing with NA values: pandas allows indexing with ...

Chart Visualization

 Pandas User Guide

This section demonstrates visualization through charting. For information on visualization of tabular data please see the section on Table Visualization. We use the standard conve...

Table Visualization

 Pandas User Guide

This section demonstrates visualization of tabular data using the Styler class. For information on visualization with charting please see Chart Visualization. This document is wri...

Group by: split-apply-combine

 Pandas User Guide

By “group by” we are referring to a process involving one or more of the following steps: Splitting the data into groups based on some criteria. Applying a function to ea...
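
The split and apply steps can be sketched as follows, with the results combined into one object at the end. The "team" and "points" columns are hypothetical.

```python
import pandas as pd

# Hypothetical scores for two teams.
df = pd.DataFrame(
    {"team": ["a", "b", "a", "b"], "points": [1, 2, 3, 4]}
)

# Split rows by team, apply sum to each group's points,
# and combine the per-group results into one Series.
totals = df.groupby("team")["points"].sum()

print(totals)
```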

Scaling to large datasets

 Pandas User Guide

pandas provides data structures for in-memory analytics, which makes using pandas to analyze datasets that are larger than memory somewhat tricky. Even datasets that...

Options and settings

 Pandas User Guide

Overview: pandas has an options API to configure and customize global behavior related to DataFrame display, data behavior and more. Options have a full “dotted-style”, case-insensiti...
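
As a small sketch of the options API, the example below reads a dotted-style option with pandas.get_option() and overrides it temporarily with pandas.option_context(). The choice of display.max_rows and the value 10 are arbitrary.

```python
import pandas as pd

# Read the current value of a dotted-style option.
default_rows = pd.get_option("display.max_rows")

# Override the option only inside the with-block.
with pd.option_context("display.max_rows", 10):
    inside = pd.get_option("display.max_rows")

# The previous value is restored on exit.
after = pd.get_option("display.max_rows")

print(default_rows, inside, after)
```

pandas.set_option() changes a value for the rest of the session instead, and pandas.reset_option() restores the default.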

Time series / date functionality

 Pandas User Guide

pandas contains extensive capabilities and features for working with time series data for all domains. Using the NumPy datetime64 and timedelta64 dtypes, pandas has co...

Windowing Operations

 Pandas User Guide

pandas contains a compact set of APIs for performing windowing operations: operations that perform an aggregation over a sliding partition of values. The API functions similar...
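
A minimal sketch of a sliding-window aggregation, assuming a window of two values: each output is the sum of the current and previous element, and the first position has no complete window.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Sum over a sliding window of two consecutive values.
# The first result is NaN because its window is incomplete.
rolling_sum = s.rolling(window=2).sum()

print(rolling_sum)
```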

Enhancing performance

 Pandas User Guide

In this part of the tutorial, we will investigate how to speed up certain functions operating on pandas DataFrames using three different techniques: Cython, Numba and pandas.eval...
