Get the name of a pandas DataFrame

How do I get the name of a DataFrame and print it as a string?

Example:

boston (var name assigned to a csv file)

import pandas as pd
boston = pd.read_csv('boston.csv')
print('The winner is team A based on the %s table.' % boston)

7 Answers

You can name the dataframe with the following, and then call the name wherever you like:

import numpy as np
import pandas as pd
df = pd.DataFrame(data=np.ones([4, 4]))
df.name = 'Ones'
print(df.name)
>>>
Ones

Hope that helps.


Sometimes df.name doesn't work.

You might get an error message:

'DataFrame' object has no attribute 'name'

In that case, try the function below, which looks up the first global variable that refers to the DataFrame:

def get_df_name(df):
    name = [x for x in globals() if globals()[x] is df][0]
    return name
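
One caveat with the lookup above: it raises an IndexError when no global refers to the object, and it returns only the first name when several variables point to the same DataFrame. A minimal sketch of a more defensive variant follows. The names find_names, Table, and namespace are hypothetical, and a plain class stands in for a DataFrame so the sketch runs without pandas.

```python
def find_names(obj, namespace):
    """Return every name in `namespace` bound to this exact object.

    Uses identity (`is`), not equality, and returns an empty list
    instead of raising when nothing matches.
    """
    return [name for name, value in namespace.items() if value is obj]


class Table:
    """Stand-in for a DataFrame (hypothetical, for illustration only)."""


boston = Table()
alias = boston  # second name for the same object

# An explicit namespace; globals() or locals() could be passed instead.
namespace = {"boston": boston, "alias": alias, "other": Table()}

names = find_names(boston, namespace)
missing = find_names(Table(), namespace)
print(names)
print(missing)
```

Passing the namespace explicitly keeps the function usable from other modules, where globals() would refer to the wrong scope.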

In many situations, a custom attribute attached to a pd.DataFrame object is not necessary. In addition, note that custom attributes on pandas objects may not serialize, so pickling can lose this data.

Instead, consider creating a dictionary with appropriately named keys and access the dataframe via dfs['some_label'].

df = pd.DataFrame()
dfs = {'some_label': df}
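
As a rough sketch of how the dictionary approach could be used for the message in the question: the label is simply the dictionary key, so no variable-name lookup is needed. The labels, column names, data, and the describe helper below are all hypothetical.

```python
import pandas as pd

# Hypothetical labels and data, for illustration only.
dfs = {
    "boston": pd.DataFrame({"price": [24.0, 21.6]}),
    "ones": pd.DataFrame({"a": [1, 1]}),
}


def describe(label, frames):
    """Build a report line; the label is the dictionary key."""
    df = frames[label]
    return "The %s table has %d rows." % (label, len(df))


message = describe("boston", dfs)
print(message)
```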

From what I understand from here, DataFrames are:

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.

And Series are:

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.).

Series have a name attribute which can be accessed like so:

In [27]: s = pd.Series(np.random.randn(5), name='something')

In [28]: s
Out[28]:
0    0.541
1   -1.175
2    0.129
3    0.043
4   -0.429
Name: something, dtype: float64

In [29]: s.name
Out[29]: 'something'

EDIT: Based on OP's comments, I think OP was looking for something like:

>>> df = pd.DataFrame(...)
>>> df.name = 'df'  # making a custom attribute that DataFrame doesn't intrinsically have
>>> print(df.name)
df

Here is a sample function. Note the line df.name = file, the sixth line of the code below:

def df_list():
    filename_list = current_stage_files(PATH)
    df_list = []
    for file in filename_list:
        df = pd.read_csv(PATH+file)
        df.name = file
        df_list.append(df)
    return df_list

I am working on a module for feature analysis and had the same need: I wanted to generate a report with the name of the pandas.DataFrame being analyzed. To solve this, I used the same solution presented by @scohe001 and @LeopardShark, originally posted elsewhere, implemented with the inspect library:

import inspect

def aux_retrieve_name(var):
    callers_local_vars = inspect.currentframe().f_back.f_back.f_locals.items()
    return [var_name for var_name, var_val in callers_local_vars if var_val is var]

Note the additional .f_back term since I intend to call it from another function:

def header_generator(df):
    print('--------- Feature Analyzer ----------')
    print('Dataframe name: "{}"'.format(aux_retrieve_name(df)))
    print('Memory usage: {:03.2f} MB'.format(df.memory_usage(deep=True).sum() / 1024 ** 2))
    return

Running this code with a given dataframe, I get the following output:

header_generator(trial_dataframe)

--------- Feature Analyzer ----------
Dataframe name: "trial_dataframe"
Memory usage: 63.08 MB

DataFrames don't have names, but you have an (experimental) attribute dictionary you can use. For example:

df.attrs['name'] = "My name" # Can be retrieved later

Values stored in attrs are retained through some operations.
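
A minimal sketch of the attrs approach applied to the message in the question. It assumes a pandas version that provides DataFrame.attrs; the data and the label are hypothetical. Since attrs is experimental, which operations carry it over may differ between versions. copy() is used here as one that does.

```python
import pandas as pd

# Hypothetical data and label, for illustration only.
df = pd.DataFrame({"price": [24.0, 21.6]})
df.attrs["name"] = "boston"

# copy() is one operation that carries attrs over to the result.
df_copy = df.copy()

message = "The winner is team A based on the %s table." % df_copy.attrs["name"]
print(message)
```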
