10 minutes to pandas
10 minutes to pandas This is a short introduction to pandas, geared mainly for new users. You can see more complex recipes in the Cookbook. Customarily, we import as follows: Object creation See the ...
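The import convention and object creation mentioned in this snippet can be sketched as follows. The `np` and `pd` aliases are the customary ones; the example values and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# A Series built from a list; pandas supplies a default integer index.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

# A DataFrame built from a dict of columns (hypothetical example data).
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": ["x", "y", "z"]})

print(s)
print(df)
```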
Read more at Pandas User Guide
IO tools (text, CSV, HDF5, …)
IO tools (text, CSV, HDF5, …) The pandas I/O API is a set of top level reader functions accessed like pandas.read_csv() that generally return a pandas object. The corresponding writer functions are ob...
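As a minimal sketch of the reader/writer pairing, the round trip below reads CSV text with `pandas.read_csv()` and writes it back with `DataFrame.to_csv()`. The in-memory buffer and the sample data are assumptions chosen so the example needs no files on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content held in memory instead of a file.
csv_text = "name,score\nada,90\nlin,85\n"

# Reader: a top-level function that returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Writer: a method on the DataFrame itself.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
round_trip = buffer.getvalue()

print(df)
```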
Read more at Pandas User Guide
Intro to data structures
Intro to data structures We’ll start with a quick, non-comprehensive overview of the fundamental data structures in pandas to get you started. The fundamental behavior about data types, indexing, and ...
Read more at Pandas User Guide
Essential basic functionality
Essential basic functionality Here we discuss a lot of the essential functionality common to the pandas data structures. To begin, let’s create some example objects like we did in the 10 minutes to pa...
Read more at Pandas User Guide
Indexing and selecting data
Indexing and selecting data The axis labeling information in pandas objects serves many purposes: Identifies data (i.e. provides metadata) using known indicators, important for analysis, visualizatio...
Read more at Pandas User Guide
MultiIndex / advanced indexing
MultiIndex / advanced indexing This section covers indexing with a MultiIndex and other advanced indexing features. See Indexing and Selecting Data for general indexing documentation. Warning Whe...
Read more at Pandas User Guide
Merge, join, concatenate and compare
Merge, join, concatenate and compare pandas provides various facilities for easily combining together Series or DataFrame with various kinds of set logic for the indexes and relational algebra functio...
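A small sketch of two of the combining facilities named here, `pd.merge()` for relational joins and `pd.concat()` for stacking objects. The frames and the `key` column are hypothetical.

```python
import pandas as pd

# Two hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": ["a", "b"], "x": [1, 2]})
right = pd.DataFrame({"key": ["b", "c"], "y": [3, 4]})

# Inner join (the default): only keys present in both frames survive.
inner = pd.merge(left, right, on="key")

# Concatenate row-wise and rebuild the index from 0.
stacked = pd.concat([left, left], ignore_index=True)

print(inner)
print(stacked)
```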
Read more at Pandas User Guide
Reshaping and pivot tables
Reshaping and pivot tables Reshaping by pivoting DataFrame objects Data is often stored in so-called “stacked” or “record” format: To select out everything for variable A we could do: But suppose we w...
Read more at Pandas User Guide
Working with text data
Working with text data Text data types New in version 1.0.0. There are two ways to store text data in pandas: an object-dtype NumPy array or the StringDtype extension type. We recommend using StringDtype to s...
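The two storage options can be compared with a short sketch. The sample strings are made up, and `"string"` is the dtype alias that requests `StringDtype`.

```python
import pandas as pd

# Same values stored two ways (hypothetical data).
s_obj = pd.Series(["a", "b", None], dtype="object")
s_str = pd.Series(["a", "b", None], dtype="string")

print(s_obj.dtype)
print(s_str.dtype)

# String methods are available through the .str accessor;
# the missing entry stays missing in the result.
upper = s_str.str.upper()
print(upper)
```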
Read more at Pandas User Guide
Frequently Asked Questions (FAQ)
Frequently Asked Questions (FAQ) DataFrame memory usage The memory usage of a DataFrame (including the index) is shown when calling info(). A configuration option, display.memory_usage (see the l...
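As an illustration of inspecting memory usage, the sketch below calls `DataFrame.info()` and `DataFrame.memory_usage()`. The frame contents are hypothetical, and the exact byte counts depend on platform and pandas version, so none are shown.

```python
import pandas as pd

# Hypothetical frame with one numeric and one text column.
df = pd.DataFrame({"n": range(1000), "label": ["x"] * 1000})

# Prints a summary that includes a memory usage line;
# memory_usage="deep" also measures the contents of object/string columns.
df.info(memory_usage="deep")

# Per-column usage in bytes, with the index reported under "Index".
usage = df.memory_usage(deep=True)
print(usage)
```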
Read more at Pandas User Guide
Cookbook
Cookbook This is a repository for short and sweet examples and links for useful pandas recipes. We encourage users to add to this documentation. Adding interesting links and/or inline examples to this...
Read more at Pandas User Guide
Working with missing data
Working with missing data In this section, we will discuss missing (also referred to as NA) values in pandas. Note The choice of using NaN internally to denote missing data was largely for simplicity ...
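A brief sketch of how NaN-marked missing values behave in common operations. The values are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical float Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

mask = s.isna()        # boolean mask marking the missing entry
filled = s.fillna(0.0) # replace missing values with a constant
kept = s.dropna()      # drop missing values instead
total = s.sum()        # reductions skip NaN by default

print(mask.tolist(), filled.tolist(), kept.tolist(), total)
```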
Read more at Pandas User Guide
Duplicate Labels
Duplicate Labels Index objects are not required to be unique; you can have duplicate row or column labels. This may be a bit confusing at first. If you’re familiar with SQL, you know that row labels a...
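One way to see why duplicate labels can be confusing is that the same lookup returns different kinds of results depending on whether the label repeats. A minimal sketch with made-up labels:

```python
import pandas as pd

# Hypothetical Series whose index repeats the label "a".
s = pd.Series([0, 1, 2], index=["a", "a", "b"])

unique = s.index.is_unique          # False: "a" appears twice
dup_mask = s.index.duplicated()     # marks the second "a"

rows_a = s["a"]   # duplicated label: returns a Series with two rows
value_b = s["b"]  # unique label: returns a scalar

print(unique, dup_mask.tolist())
print(rows_a)
print(value_b)
```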
Read more at Pandas User Guide
Categorical data
Categorical data This is an introduction to pandas categorical data type, including a short comparison with R’s factor. Categoricals are a pandas data type corresponding to categorical variables in s...
Read more at Pandas User Guide
Nullable integer data type
Nullable integer data type Note IntegerArray is currently experimental. Its API or implementation may change without warning. Changed in version 1.0.0: Now uses pandas.NA as the missing value rather t...
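A minimal sketch of the nullable integer type, requested through the `"Int64"` dtype alias (capital "I"). The values are hypothetical; the missing entry is represented by `pandas.NA`.

```python
import pandas as pd

# Integer data with a missing value, kept as integers rather than floats.
arr = pd.array([1, 2, None], dtype="Int64")
s = pd.Series(arr)

print(s.dtype)

# Arithmetic propagates the missing value and preserves the dtype.
plus_one = s + 1
print(plus_one)
```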
Read more at Pandas User Guide
Nullable Boolean data type
Nullable Boolean data type Note BooleanArray is currently experimental. Its API or implementation may change without warning. New in version 1.0.0. Indexing with NA values pandas allows indexing with ...
Read more at Pandas User Guide
Chart Visualization
Chart Visualization This section demonstrates visualization through charting. For information on visualization of tabular data please see the section on Table Visualization. We use the standard conve...
Read more at Pandas User Guide
Table Visualization
Table Visualization This section demonstrates visualization of tabular data using the Styler class. For information on visualization with charting please see Chart Visualization. This document is wri...
Read more at Pandas User Guide
Group by: split-apply-combine
Group by: split-apply-combine By “group by” we are referring to a process involving one or more of the following steps: Splitting the data into groups based on some criteria. Applying a function to ea...
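The split and apply steps named in this snippet can be sketched with a tiny frame; pandas then combines the per-group results into one object. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data with a grouping column and a value column.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})

# Split by "key", apply sum() to each group's "value";
# the results are combined into a Series indexed by key.
totals = df.groupby("key")["value"].sum()

print(totals)
```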
Read more at Pandas User Guide
Scaling to large datasets
Scaling to large datasets pandas provides data structures for in-memory analytics, which makes using pandas to analyze datasets that are larger than memory somewhat tricky. Even datasets that...
Read more at Pandas User Guide
Options and settings
Options and settings Overview pandas has an options API to configure and customize global behavior related to DataFrame display, data behavior and more. Options have a full “dotted-style”, case-insensiti...
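As a sketch of the options API, the example below reads a dotted-style option and overrides it temporarily. `display.max_rows` is a real option; the value 5 is an arbitrary choice for illustration.

```python
import pandas as pd

# Read the current value of a dotted-style option.
default_rows = pd.get_option("display.max_rows")

# Override it only inside the with-block; the old value is restored on exit.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")

after = pd.get_option("display.max_rows")
print(default_rows, inside, after)
```

For a change that should persist beyond a block, `pd.set_option()` and `pd.reset_option()` take the same dotted names.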
Read more at Pandas User Guide
Time series / date functionality
Time series / date functionality pandas contains extensive capabilities and features for working with time series data for all domains. Using the NumPy datetime64 and timedelta64 dtypes, pandas has co...
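A small sketch of the time series tooling: build a daily `DatetimeIndex` with `pd.date_range()` and aggregate it into two-day bins with `resample()`. The dates and values are hypothetical.

```python
import pandas as pd

# Four consecutive days starting at an arbitrary date.
idx = pd.date_range("2024-01-01", periods=4, freq="D")
ts = pd.Series([1, 2, 3, 4], index=idx)

# Downsample into two-day bins and sum the values within each bin.
two_day = ts.resample("2D").sum()

print(two_day)
```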
Read more at Pandas User Guide
Windowing Operations
Windowing Operations pandas contains a compact set of APIs for performing windowing operations - an operation that performs an aggregation over a sliding partition of values. The API functions similar...
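An aggregation over a sliding window can be sketched with `Series.rolling()`. The window size of 2 and the input values are arbitrary choices for illustration.

```python
import pandas as pd

s = pd.Series(range(5))  # 0, 1, 2, 3, 4

# Sum over a sliding window of two consecutive values.
# The first position has no full window, so its result is NaN.
rolled = s.rolling(window=2).sum()

print(rolled)
```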
Read more at Pandas User Guide
Enhancing performance
Enhancing performance In this part of the tutorial, we will investigate how to speed up certain functions operating on pandas DataFrames using three different techniques: Cython, Numba and pandas.eval...
Read more at Pandas User Guide