Posted under » Python Data Analysis on 12 June 2023
Pandas is a Python library used for working with data sets. The name refers to both "panel data" and "Python data analysis".
In its simplest form:
    import pandas

    mydataset = {
        'cats': ["Lion", "Tiger", "Puma"],
        'passmotion': [3, 7, 2]
    }

    myvar = pandas.DataFrame(mydataset)
    print(myvar)

Output:

        cats  passmotion
    0   Lion           3
    1  Tiger           7
    2   Puma           2
A Pandas Series is like a column in a table. It is a one-dimensional array holding data of any type.
If nothing else is specified, the values are labeled with their index number: the first value has index 0, the second has index 1, and so on. A label can be used to access a specific value, just like an array index.
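As a small illustration of the default labels, here is a minimal sketch. The names values, default_series and first_value are made up for this example and are not from the post.

```python
import pandas as pd

values = [1, 7, 2]

# No index argument is given, so the labels default to 0, 1, 2
default_series = pd.Series(values)

# Access the first value through its default label 0
first_value = default_series[0]

print(default_series.index.tolist())
print(first_value)
```

Here the label 0 happens to match the position of the value, which is why it reads like an ordinary array index.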
With the index argument, you can set your own labels:

    import pandas as pd

    a = [1, 7, 2]  # the values of the Series
    myvar = pd.Series(a, index=["x", "y", "z"])  # custom labels
    print(myvar)

Output:

    x    1
    y    7
    z    2
    dtype: int64
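Once a Series has custom labels, a value can be looked up by its label. The sketch below assumes the same values and labels as the example above; the names labeled, y_value and y_value_loc are made up for illustration.

```python
import pandas as pd

# Same values and labels as in the example above
labeled = pd.Series([1, 7, 2], index=["x", "y", "z"])

# Look up a value by its label
y_value = labeled["y"]

# .loc does the same lookup and makes it explicit that a label is meant
y_value_loc = labeled.loc["y"]

print(y_value)
print(y_value_loc)
```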
You can also use a dictionary when creating a Series. Dictionaries store data values in key:value pairs. A dictionary is a collection that is ordered (as of Python 3.7), changeable, and does not allow duplicate keys.
Dictionaries are written with curly brackets and have keys and values.
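The dictionary properties mentioned here can be checked with plain Python. This is only a sketch with made-up data; the name keys_in_order is hypothetical.

```python
# Made-up example data
calories = {"day1": 420, "day2": 380}

# Changeable: a new key can be added after creation
calories["day3"] = 390

# No duplicate keys: assigning to an existing key overwrites its value
calories["day1"] = 400

# Ordered: keys keep their insertion order (Python 3.7 and later)
keys_in_order = list(calories)

print(keys_in_order)
print(calories["day1"])
```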
The keys of the dictionary become the labels of the Series:

    import pandas as pd

    calories = {"day1": 420, "day2": 380, "day3": 390}

    myvar = pd.Series(calories)
    print(myvar)
To select only some of the items in the dictionary (calories), use the index argument and specify only the items you want to include in the Series.
    import pandas as pd

    calories = {"day1": 420, "day2": 380, "day3": 390}

    myvar = pd.Series(calories, index=["day1", "day2"])
    print(myvar)

Output:

    day1    420
    day2    380
    dtype: int64
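One related case is an index label that is not a key in the dictionary. In the sketch below, "day4" is a made-up label used only to show this; pandas fills the missing entry with NaN, and the Series becomes float64 as a result.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# "day4" is a hypothetical label that is not a key in the dictionary
partial = pd.Series(calories, index=["day1", "day4"])

# True where the label had no matching key
missing_flags = partial.isna().tolist()

print(partial)
print(missing_flags)
```

Checking isna() after selecting labels this way is a simple guard against typos in the index list.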